* [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree
From: Richard Henderson @ 2024-11-14 16:00 UTC (permalink / raw)
  To: qemu-devel

v1: 20241009150855.804605-1-richard.henderson@linaro.org

The initial idea was: how much can we do with an intelligent data
structure for the same cost as a linear search through an array?
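
For a rough sense of the cost parity (back-of-the-envelope numbers,
not measurements): the victim tlb is a small fixed array
(CPU_VTLB_SIZE is 8), so a fast-table miss costs up to 8 comparisons.
A balanced interval tree over n cached pages costs about
ceil(log2(n + 1)) comparisons per descent:

    n = 8     ->  depth ~ 4
    n = 256   ->  depth ~ 8..9
    n = 1024  ->  depth ~ 10..11

So even with hundreds of cached translations, a tree descent stays
in the same ballpark as the old linear scan, while also supporting
the range queries needed for flushes.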


r~


Richard Henderson (54):
  util/interval-tree: Introduce interval_tree_free_nodes
  accel/tcg: Split out tlbfast_flush_locked
  accel/tcg: Split out tlbfast_{index,entry}
  accel/tcg: Split out tlbfast_flush_range_locked
  accel/tcg: Fix flags usage in mmu_lookup1, atomic_mmu_lookup
  accel/tcg: Assert non-zero length in tlb_flush_range_by_mmuidx*
  accel/tcg: Assert bits in range in tlb_flush_range_by_mmuidx*
  accel/tcg: Flush entire tlb when a masked range wraps
  accel/tcg: Add IntervalTreeRoot to CPUTLBDesc
  accel/tcg: Populate IntervalTree in tlb_set_page_full
  accel/tcg: Remove IntervalTree entry in tlb_flush_page_locked
  accel/tcg: Remove IntervalTree entries in tlb_flush_range_locked
  accel/tcg: Process IntervalTree entries in tlb_reset_dirty
  accel/tcg: Process IntervalTree entries in tlb_set_dirty
  accel/tcg: Use tlb_hit_page in victim_tlb_hit
  accel/tcg: Pass full addr to victim_tlb_hit
  accel/tcg: Replace victim_tlb_hit with tlbtree_hit
  accel/tcg: Remove the victim tlb
  accel/tcg: Remove tlb_n_used_entries_inc
  include/exec/tlb-common: Move CPUTLBEntryFull from hw/core/cpu.h
  accel/tcg: Delay plugin adjustment in probe_access_internal
  accel/tcg: Call cpu_ld*_code_mmu from cpu_ld*_code
  accel/tcg: Check original prot bits for read in atomic_mmu_lookup
  accel/tcg: Preserve tlb flags in tlb_set_compare
  accel/tcg: Return CPUTLBEntryFull not pointer in probe_access_full_mmu
  accel/tcg: Return CPUTLBEntryFull not pointer in probe_access_full
  accel/tcg: Return CPUTLBEntryFull not pointer in probe_access_internal
  accel/tcg: Introduce tlb_lookup
  accel/tcg: Partially unify MMULookupPageData and TLBLookupOutput
  accel/tcg: Merge mmu_lookup1 into mmu_lookup
  accel/tcg: Always use IntervalTree for code lookups
  accel/tcg: Link CPUTLBEntry to CPUTLBEntryTree
  accel/tcg: Remove CPUTLBDesc.fulltlb
  target/alpha: Convert to TCGCPUOps.tlb_fill_align
  target/avr: Convert to TCGCPUOps.tlb_fill_align
  target/i386: Convert to TCGCPUOps.tlb_fill_align
  target/loongarch: Convert to TCGCPUOps.tlb_fill_align
  target/m68k: Convert to TCGCPUOps.tlb_fill_align
  target/m68k: Do not call tlb_set_page in helper_ptest
  target/microblaze: Convert to TCGCPUOps.tlb_fill_align
  target/mips: Convert to TCGCPUOps.tlb_fill_align
  target/openrisc: Convert to TCGCPUOps.tlb_fill_align
  target/ppc: Convert to TCGCPUOps.tlb_fill_align
  target/riscv: Convert to TCGCPUOps.tlb_fill_align
  target/rx: Convert to TCGCPUOps.tlb_fill_align
  target/s390x: Convert to TCGCPUOps.tlb_fill_align
  target/sh4: Convert to TCGCPUOps.tlb_fill_align
  target/sparc: Convert to TCGCPUOps.tlb_fill_align
  target/tricore: Convert to TCGCPUOps.tlb_fill_align
  target/xtensa: Convert to TCGCPUOps.tlb_fill_align
  accel/tcg: Drop TCGCPUOps.tlb_fill
  accel/tcg: Unexport tlb_set_page*
  accel/tcg: Merge tlb_fill_align into callers
  accel/tcg: Return CPUTLBEntryTree from tlb_set_page_full

 include/exec/cpu-all.h               |   3 +
 include/exec/exec-all.h              |  65 +-
 include/exec/tlb-common.h            |  68 +-
 include/hw/core/cpu.h                |  75 +-
 include/hw/core/tcg-cpu-ops.h        |  10 -
 include/qemu/interval-tree.h         |  11 +
 target/alpha/cpu.h                   |   6 +-
 target/avr/cpu.h                     |   7 +-
 target/i386/tcg/helper-tcg.h         |   6 +-
 target/loongarch/internals.h         |   7 +-
 target/m68k/cpu.h                    |   7 +-
 target/microblaze/cpu.h              |   7 +-
 target/mips/tcg/tcg-internal.h       |   6 +-
 target/openrisc/cpu.h                |   8 +-
 target/ppc/internal.h                |   7 +-
 target/riscv/cpu.h                   |   8 +-
 target/s390x/s390x-internal.h        |   7 +-
 target/sh4/cpu.h                     |   8 +-
 target/sparc/cpu.h                   |   8 +-
 target/tricore/cpu.h                 |   7 +-
 target/xtensa/cpu.h                  |   8 +-
 accel/tcg/cputlb.c                   | 994 +++++++++++++--------------
 target/alpha/cpu.c                   |   2 +-
 target/alpha/helper.c                |  23 +-
 target/arm/ptw.c                     |  10 +-
 target/arm/tcg/helper-a64.c          |   4 +-
 target/arm/tcg/mte_helper.c          |  15 +-
 target/arm/tcg/sve_helper.c          |   6 +-
 target/avr/cpu.c                     |   2 +-
 target/avr/helper.c                  |  19 +-
 target/i386/tcg/sysemu/excp_helper.c |  36 +-
 target/i386/tcg/tcg-cpu.c            |   2 +-
 target/loongarch/cpu.c               |   2 +-
 target/loongarch/tcg/tlb_helper.c    |  17 +-
 target/m68k/cpu.c                    |   2 +-
 target/m68k/helper.c                 |  32 +-
 target/microblaze/cpu.c              |   2 +-
 target/microblaze/helper.c           |  33 +-
 target/mips/cpu.c                    |   2 +-
 target/mips/tcg/sysemu/tlb_helper.c  |  29 +-
 target/openrisc/cpu.c                |   2 +-
 target/openrisc/mmu.c                |  39 +-
 target/ppc/cpu_init.c                |   2 +-
 target/ppc/mmu_helper.c              |  21 +-
 target/riscv/cpu_helper.c            |  22 +-
 target/riscv/tcg/tcg-cpu.c           |   2 +-
 target/rx/cpu.c                      |  19 +-
 target/s390x/cpu.c                   |   4 +-
 target/s390x/tcg/excp_helper.c       |  23 +-
 target/sh4/cpu.c                     |   2 +-
 target/sh4/helper.c                  |  24 +-
 target/sparc/cpu.c                   |   2 +-
 target/sparc/mmu_helper.c            |  44 +-
 target/tricore/cpu.c                 |   2 +-
 target/tricore/helper.c              |  19 +-
 target/xtensa/cpu.c                  |   2 +-
 target/xtensa/helper.c               |  28 +-
 util/interval-tree.c                 |  20 +
 util/selfmap.c                       |  13 +-
 59 files changed, 938 insertions(+), 923 deletions(-)

-- 
2.43.0




* [PATCH v2 01/54] util/interval-tree: Introduce interval_tree_free_nodes
From: Richard Henderson @ 2024-11-14 16:00 UTC (permalink / raw)
  To: qemu-devel

Provide a general-purpose release-all-nodes operation that allows
the IntervalTreeNode to be embedded within a larger structure.
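
As a usage sketch (MyNode and my_tree_clear are hypothetical; the
real conversion of MapInfo is in the selfmap.c hunk below):

    typedef struct MyNode {
        IntervalTreeNode itree;   /* embedded; need not be first */
        uint64_t payload;         /* must not own separate allocations */
    } MyNode;

    static void my_tree_clear(IntervalTreeRoot *root)
    {
        /*
         * Each node is released with a single g_free; @it_offset
         * tells the walker where itree sits within MyNode.
         */
        interval_tree_free_nodes(root, offsetof(MyNode, itree));
    }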

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/qemu/interval-tree.h | 11 +++++++++++
 util/interval-tree.c         | 20 ++++++++++++++++++++
 util/selfmap.c               | 13 +------------
 3 files changed, 32 insertions(+), 12 deletions(-)

diff --git a/include/qemu/interval-tree.h b/include/qemu/interval-tree.h
index 25006debe8..d90ea6d17f 100644
--- a/include/qemu/interval-tree.h
+++ b/include/qemu/interval-tree.h
@@ -96,4 +96,15 @@ IntervalTreeNode *interval_tree_iter_first(IntervalTreeRoot *root,
 IntervalTreeNode *interval_tree_iter_next(IntervalTreeNode *node,
                                           uint64_t start, uint64_t last);
 
+/**
+ * interval_tree_free_nodes:
+ * @root: root of the tree
+ * @it_offset: offset from outermost type to IntervalTreeNode
+ *
+ * Free, via g_free, all nodes under @root.  IntervalTreeNode may
+ * not be the true type of the nodes allocated; @it_offset gives
+ * the offset from the outermost type to the IntervalTreeNode member.
+ */
+void interval_tree_free_nodes(IntervalTreeRoot *root, size_t it_offset);
+
 #endif /* QEMU_INTERVAL_TREE_H */
diff --git a/util/interval-tree.c b/util/interval-tree.c
index 53465182e6..663d3ec222 100644
--- a/util/interval-tree.c
+++ b/util/interval-tree.c
@@ -639,6 +639,16 @@ static void rb_erase_augmented_cached(RBNode *node, RBRootLeftCached *root,
     rb_erase_augmented(node, &root->rb_root, augment);
 }
 
+static void rb_node_free(RBNode *rb, size_t rb_offset)
+{
+    if (rb->rb_left) {
+        rb_node_free(rb->rb_left, rb_offset);
+    }
+    if (rb->rb_right) {
+        rb_node_free(rb->rb_right, rb_offset);
+    }
+    g_free((void *)rb - rb_offset);
+}
 
 /*
  * Interval trees.
@@ -870,6 +880,16 @@ IntervalTreeNode *interval_tree_iter_next(IntervalTreeNode *node,
     }
 }
 
+void interval_tree_free_nodes(IntervalTreeRoot *root, size_t it_offset)
+{
+    if (root && root->rb_root.rb_node) {
+        rb_node_free(root->rb_root.rb_node,
+                     it_offset + offsetof(IntervalTreeNode, rb));
+        root->rb_root.rb_node = NULL;
+        root->rb_leftmost = NULL;
+    }
+}
+
 /* Occasionally useful for calling from within the debugger. */
 #if 0
 static void debug_interval_tree_int(IntervalTreeNode *node,
diff --git a/util/selfmap.c b/util/selfmap.c
index 483cb617e2..d2b86da301 100644
--- a/util/selfmap.c
+++ b/util/selfmap.c
@@ -87,23 +87,12 @@ IntervalTreeRoot *read_self_maps(void)
  * @root: an interval tree
  *
  * Free a tree of MapInfo structures.
- * Since we allocated each MapInfo in one chunk, we need not consider the
- * contents and can simply free each RBNode.
  */
 
-static void free_rbnode(RBNode *n)
-{
-    if (n) {
-        free_rbnode(n->rb_left);
-        free_rbnode(n->rb_right);
-        g_free(n);
-    }
-}
-
 void free_self_maps(IntervalTreeRoot *root)
 {
     if (root) {
-        free_rbnode(root->rb_root.rb_node);
+        interval_tree_free_nodes(root, offsetof(MapInfo, itree));
         g_free(root);
     }
 }
-- 
2.43.0




* [PATCH v2 02/54] accel/tcg: Split out tlbfast_flush_locked
From: Richard Henderson @ 2024-11-14 16:00 UTC (permalink / raw)
  To: qemu-devel

We will need to flush only the "fast" portion of the tlb,
allowing re-fill from the "full" portion.
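
As a sketch of the intended split (anticipating later patches in
this series, which make the IntervalTree the "full" side):

    /* Drop only the direct-mapped comparators: one memset. */
    tlbfast_flush_locked(desc, fast);
    /*
     * The descriptor still describes the pages (desc->fulltlb now,
     * desc->iroot later), so a subsequent miss can be re-filled
     * from there without repeating the guest page table walk.
     */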

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index b76a4eac4e..c1838412e8 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -284,13 +284,18 @@ static void tlb_mmu_resize_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast,
     }
 }
 
-static void tlb_mmu_flush_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast)
+static void tlbfast_flush_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast)
 {
     desc->n_used_entries = 0;
+    memset(fast->table, -1, sizeof_tlb(fast));
+}
+
+static void tlb_mmu_flush_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast)
+{
+    tlbfast_flush_locked(desc, fast);
     desc->large_page_addr = -1;
     desc->large_page_mask = -1;
     desc->vindex = 0;
-    memset(fast->table, -1, sizeof_tlb(fast));
     memset(desc->vtable, -1, sizeof(desc->vtable));
 }
 
-- 
2.43.0




* [PATCH v2 03/54] accel/tcg: Split out tlbfast_{index,entry}
From: Richard Henderson @ 2024-11-14 16:00 UTC (permalink / raw)
  To: qemu-devel

Often we already have the CPUTLBDescFast structure pointer.
This allows future code simplification.
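
A worked example of the index arithmetic, assuming 4 KiB pages
(TARGET_PAGE_BITS = 12) and a 256-entry table, in which case
fast->mask >> CPU_TLB_ENTRY_BITS == 0xff:

    addr           = 0x12345678
    addr >> 12     = 0x12345
    0x12345 & 0xff = 0x45        /* slot 69 of 256 */

Resizing the table only changes fast->mask; the two helpers hide
that from every caller.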

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index c1838412e8..e37af24525 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -131,20 +131,28 @@ static inline uint64_t tlb_addr_write(const CPUTLBEntry *entry)
     return tlb_read_idx(entry, MMU_DATA_STORE);
 }
 
+static inline uintptr_t tlbfast_index(CPUTLBDescFast *fast, vaddr addr)
+{
+    return (addr >> TARGET_PAGE_BITS) & (fast->mask >> CPU_TLB_ENTRY_BITS);
+}
+
+static inline CPUTLBEntry *tlbfast_entry(CPUTLBDescFast *fast, vaddr addr)
+{
+    return fast->table + tlbfast_index(fast, addr);
+}
+
 /* Find the TLB index corresponding to the mmu_idx + address pair.  */
 static inline uintptr_t tlb_index(CPUState *cpu, uintptr_t mmu_idx,
                                   vaddr addr)
 {
-    uintptr_t size_mask = cpu->neg.tlb.f[mmu_idx].mask >> CPU_TLB_ENTRY_BITS;
-
-    return (addr >> TARGET_PAGE_BITS) & size_mask;
+    return tlbfast_index(&cpu->neg.tlb.f[mmu_idx], addr);
 }
 
 /* Find the TLB entry corresponding to the mmu_idx + address pair.  */
 static inline CPUTLBEntry *tlb_entry(CPUState *cpu, uintptr_t mmu_idx,
                                      vaddr addr)
 {
-    return &cpu->neg.tlb.f[mmu_idx].table[tlb_index(cpu, mmu_idx, addr)];
+    return tlbfast_entry(&cpu->neg.tlb.f[mmu_idx], addr);
 }
 
 static void tlb_window_reset(CPUTLBDesc *desc, int64_t ns,
-- 
2.43.0




* [PATCH v2 04/54] accel/tcg: Split out tlbfast_flush_range_locked
From: Richard Henderson @ 2024-11-14 16:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: Pierrick Bouvier

While this may at present be overly complicated for
single-page flushes, do so with the expectation that it
will eventually allow simplification of large-page handling.
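
For intuition about the two full-flush conditions (paraphrasing the
comment carried into tlbfast_flush_range_locked): a flush-by-mask
matches entries while ignoring address bits above @bits (e.g.
AArch64 TBI tags), so when @mask constrains fewer bits than the
table index uses, a matching entry may sit in a slot other than the
one probed for its page, and only a full flush is safe.  The second
condition is pure economics, since the loop touches one slot per
page:

    len = 0x200000 (512 pages), 256-entry table
    ->  512 probes  vs  one memset over 256 entries

Beyond the table's span, the memset wins.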

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 66 +++++++++++++++++++++++-----------------------
 1 file changed, 33 insertions(+), 33 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index e37af24525..46fa0ae802 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -493,11 +493,6 @@ static bool tlb_flush_entry_mask_locked(CPUTLBEntry *tlb_entry,
     return false;
 }
 
-static inline bool tlb_flush_entry_locked(CPUTLBEntry *tlb_entry, vaddr page)
-{
-    return tlb_flush_entry_mask_locked(tlb_entry, page, -1);
-}
-
 /* Called with tlb_c.lock held */
 static void tlb_flush_vtlb_page_mask_locked(CPUState *cpu, int mmu_idx,
                                             vaddr page,
@@ -520,10 +515,37 @@ static inline void tlb_flush_vtlb_page_locked(CPUState *cpu, int mmu_idx,
     tlb_flush_vtlb_page_mask_locked(cpu, mmu_idx, page, -1);
 }
 
+static void tlbfast_flush_range_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast,
+                                       vaddr addr, vaddr len, vaddr mask)
+{
+    /*
+     * If @mask is smaller than the tlb size, there may be multiple entries
+     * within the TLB; for now, just flush the entire TLB.
+     * Otherwise all addresses that match under @mask hit the same TLB entry.
+     *
+     * If @len is larger than the tlb size, then it will take longer to
+     * test all of the entries in the TLB than it will to flush it all.
+     */
+    if (mask < fast->mask || len > fast->mask) {
+        tlbfast_flush_locked(desc, fast);
+        return;
+    }
+
+    for (vaddr i = 0; i < len; i += TARGET_PAGE_SIZE) {
+        vaddr page = addr + i;
+        CPUTLBEntry *entry = tlbfast_entry(fast, page);
+
+        if (tlb_flush_entry_mask_locked(entry, page, mask)) {
+            desc->n_used_entries--;
+        }
+    }
+}
+
 static void tlb_flush_page_locked(CPUState *cpu, int midx, vaddr page)
 {
-    vaddr lp_addr = cpu->neg.tlb.d[midx].large_page_addr;
-    vaddr lp_mask = cpu->neg.tlb.d[midx].large_page_mask;
+    CPUTLBDesc *desc = &cpu->neg.tlb.d[midx];
+    vaddr lp_addr = desc->large_page_addr;
+    vaddr lp_mask = desc->large_page_mask;
 
     /* Check if we need to flush due to large pages.  */
     if ((page & lp_mask) == lp_addr) {
@@ -532,9 +554,8 @@ static void tlb_flush_page_locked(CPUState *cpu, int midx, vaddr page)
                   midx, lp_addr, lp_mask);
         tlb_flush_one_mmuidx_locked(cpu, midx, get_clock_realtime());
     } else {
-        if (tlb_flush_entry_locked(tlb_entry(cpu, midx, page), page)) {
-            tlb_n_used_entries_dec(cpu, midx);
-        }
+        tlbfast_flush_range_locked(desc, &cpu->neg.tlb.f[midx],
+                                   page, TARGET_PAGE_SIZE, -1);
         tlb_flush_vtlb_page_locked(cpu, midx, page);
     }
 }
@@ -689,24 +710,6 @@ static void tlb_flush_range_locked(CPUState *cpu, int midx,
     CPUTLBDescFast *f = &cpu->neg.tlb.f[midx];
     vaddr mask = MAKE_64BIT_MASK(0, bits);
 
-    /*
-     * If @bits is smaller than the tlb size, there may be multiple entries
-     * within the TLB; otherwise all addresses that match under @mask hit
-     * the same TLB entry.
-     * TODO: Perhaps allow bits to be a few bits less than the size.
-     * For now, just flush the entire TLB.
-     *
-     * If @len is larger than the tlb size, then it will take longer to
-     * test all of the entries in the TLB than it will to flush it all.
-     */
-    if (mask < f->mask || len > f->mask) {
-        tlb_debug("forcing full flush midx %d ("
-                  "%016" VADDR_PRIx "/%016" VADDR_PRIx "+%016" VADDR_PRIx ")\n",
-                  midx, addr, mask, len);
-        tlb_flush_one_mmuidx_locked(cpu, midx, get_clock_realtime());
-        return;
-    }
-
     /*
      * Check if we need to flush due to large pages.
      * Because large_page_mask contains all 1's from the msb,
@@ -720,13 +723,10 @@ static void tlb_flush_range_locked(CPUState *cpu, int midx,
         return;
     }
 
+    tlbfast_flush_range_locked(d, f, addr, len, mask);
+
     for (vaddr i = 0; i < len; i += TARGET_PAGE_SIZE) {
         vaddr page = addr + i;
-        CPUTLBEntry *entry = tlb_entry(cpu, midx, page);
-
-        if (tlb_flush_entry_mask_locked(entry, page, mask)) {
-            tlb_n_used_entries_dec(cpu, midx);
-        }
         tlb_flush_vtlb_page_mask_locked(cpu, midx, page, mask);
     }
 }
-- 
2.43.0




* [PATCH v2 05/54] accel/tcg: Fix flags usage in mmu_lookup1, atomic_mmu_lookup
From: Richard Henderson @ 2024-11-14 16:00 UTC (permalink / raw)
  To: qemu-devel

The INVALID bit should only be auto-cleared when we have
just called tlb_fill, not along the victim_tlb_hit path.

In atomic_mmu_lookup, rename tlb_addr to flags, as that
is what we're actually carrying around.
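
Schematically, the pattern after this patch (condensed from the
hunks below; "miss" stands in for the real refill condition):

    int flags = TLB_FLAGS_MASK;
    if (miss) {
        tlb_fill_align(...);
        /* We just filled, so the entry is known valid. */
        flags &= ~TLB_INVALID_MASK;
    }
    /* Keep only the flag bits actually present in the entry. */
    flags &= tlbe->addr_read | tlb_addr_write(tlbe);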

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 33 ++++++++++++++++++++++-----------
 1 file changed, 22 insertions(+), 11 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 46fa0ae802..77b972fd93 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1652,7 +1652,7 @@ static bool mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
     uint64_t tlb_addr = tlb_read_idx(entry, access_type);
     bool maybe_resized = false;
     CPUTLBEntryFull *full;
-    int flags;
+    int flags = TLB_FLAGS_MASK & ~TLB_FORCE_SLOW;
 
     /* If the TLB entry is for a different page, reload and try again.  */
     if (!tlb_hit(tlb_addr, addr)) {
@@ -1663,8 +1663,14 @@ static bool mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
             maybe_resized = true;
             index = tlb_index(cpu, mmu_idx, addr);
             entry = tlb_entry(cpu, mmu_idx, addr);
+            /*
+             * With PAGE_WRITE_INV, we set TLB_INVALID_MASK immediately,
+             * to force the next access through tlb_fill.  We've just
+             * called tlb_fill, so we know that this entry *is* valid.
+             */
+            flags &= ~TLB_INVALID_MASK;
         }
-        tlb_addr = tlb_read_idx(entry, access_type) & ~TLB_INVALID_MASK;
+        tlb_addr = tlb_read_idx(entry, access_type);
     }
 
     full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
@@ -1814,10 +1820,10 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
     MemOp mop = get_memop(oi);
     uintptr_t index;
     CPUTLBEntry *tlbe;
-    vaddr tlb_addr;
     void *hostaddr;
     CPUTLBEntryFull *full;
     bool did_tlb_fill = false;
+    int flags;
 
     tcg_debug_assert(mmu_idx < NB_MMU_MODES);
 
@@ -1828,8 +1834,8 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
     tlbe = tlb_entry(cpu, mmu_idx, addr);
 
     /* Check TLB entry and enforce page permissions.  */
-    tlb_addr = tlb_addr_write(tlbe);
-    if (!tlb_hit(tlb_addr, addr)) {
+    flags = TLB_FLAGS_MASK;
+    if (!tlb_hit(tlb_addr_write(tlbe), addr)) {
         if (!victim_tlb_hit(cpu, mmu_idx, index, MMU_DATA_STORE,
                             addr & TARGET_PAGE_MASK)) {
             tlb_fill_align(cpu, addr, MMU_DATA_STORE, mmu_idx,
@@ -1837,8 +1843,13 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
             did_tlb_fill = true;
             index = tlb_index(cpu, mmu_idx, addr);
             tlbe = tlb_entry(cpu, mmu_idx, addr);
+            /*
+             * With PAGE_WRITE_INV, we set TLB_INVALID_MASK immediately,
+             * to force the next access through tlb_fill.  We've just
+             * called tlb_fill, so we know that this entry *is* valid.
+             */
+            flags &= ~TLB_INVALID_MASK;
         }
-        tlb_addr = tlb_addr_write(tlbe) & ~TLB_INVALID_MASK;
     }
 
     /*
@@ -1874,11 +1885,11 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
         goto stop_the_world;
     }
 
-    /* Collect tlb flags for read. */
-    tlb_addr |= tlbe->addr_read;
+    /* Collect tlb flags for read and write. */
+    flags &= tlbe->addr_read | tlb_addr_write(tlbe);
 
     /* Notice an IO access or a needs-MMU-lookup access */
-    if (unlikely(tlb_addr & (TLB_MMIO | TLB_DISCARD_WRITE))) {
+    if (unlikely(flags & (TLB_MMIO | TLB_DISCARD_WRITE))) {
         /* There's really nothing that can be done to
            support this apart from stop-the-world.  */
         goto stop_the_world;
@@ -1887,11 +1898,11 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
     hostaddr = (void *)((uintptr_t)addr + tlbe->addend);
     full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
 
-    if (unlikely(tlb_addr & TLB_NOTDIRTY)) {
+    if (unlikely(flags & TLB_NOTDIRTY)) {
         notdirty_write(cpu, addr, size, full, retaddr);
     }
 
-    if (unlikely(tlb_addr & TLB_FORCE_SLOW)) {
+    if (unlikely(flags & TLB_FORCE_SLOW)) {
         int wp_flags = 0;
 
         if (full->slow_flags[MMU_DATA_STORE] & TLB_WATCHPOINT) {
-- 
2.43.0




* [PATCH v2 06/54] accel/tcg: Assert non-zero length in tlb_flush_range_by_mmuidx*
From: Richard Henderson @ 2024-11-14 16:00 UTC (permalink / raw)
  To: qemu-devel

Subsequent patches will assume a non-zero length.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 77b972fd93..1346a26d90 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -791,6 +791,7 @@ void tlb_flush_range_by_mmuidx(CPUState *cpu, vaddr addr,
     TLBFlushRangeData d;
 
     assert_cpu_is_self(cpu);
+    assert(len != 0);
 
     /*
      * If all bits are significant, and len is small,
@@ -830,6 +831,8 @@ void tlb_flush_range_by_mmuidx_all_cpus_synced(CPUState *src_cpu,
     TLBFlushRangeData d, *p;
     CPUState *dst_cpu;
 
+    assert(len != 0);
+
     /*
      * If all bits are significant, and len is small,
      * this devolves to tlb_flush_page.
-- 
2.43.0




* [PATCH v2 07/54] accel/tcg: Assert bits in range in tlb_flush_range_by_mmuidx*
From: Richard Henderson @ 2024-11-14 16:00 UTC (permalink / raw)
  To: qemu-devel

The only target that does not use TARGET_LONG_BITS is Arm,
which reduces bits only for TBI (top-byte ignore).  There is
no point in handling odd combinations of parameters.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 16 ++++------------
 1 file changed, 4 insertions(+), 12 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 1346a26d90..5510f40333 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -792,20 +792,16 @@ void tlb_flush_range_by_mmuidx(CPUState *cpu, vaddr addr,
 
     assert_cpu_is_self(cpu);
     assert(len != 0);
+    assert(bits > TARGET_PAGE_BITS && bits <= TARGET_LONG_BITS);
 
     /*
      * If all bits are significant, and len is small,
      * this devolves to tlb_flush_page.
      */
-    if (bits >= TARGET_LONG_BITS && len <= TARGET_PAGE_SIZE) {
+    if (bits == TARGET_LONG_BITS && len <= TARGET_PAGE_SIZE) {
         tlb_flush_page_by_mmuidx(cpu, addr, idxmap);
         return;
     }
-    /* If no page bits are significant, this devolves to tlb_flush. */
-    if (bits < TARGET_PAGE_BITS) {
-        tlb_flush_by_mmuidx(cpu, idxmap);
-        return;
-    }
 
     /* This should already be page aligned */
     d.addr = addr & TARGET_PAGE_MASK;
@@ -832,20 +828,16 @@ void tlb_flush_range_by_mmuidx_all_cpus_synced(CPUState *src_cpu,
     CPUState *dst_cpu;
 
     assert(len != 0);
+    assert(bits > TARGET_PAGE_BITS && bits <= TARGET_LONG_BITS);
 
     /*
      * If all bits are significant, and len is small,
      * this devolves to tlb_flush_page.
      */
-    if (bits >= TARGET_LONG_BITS && len <= TARGET_PAGE_SIZE) {
+    if (bits == TARGET_LONG_BITS && len <= TARGET_PAGE_SIZE) {
         tlb_flush_page_by_mmuidx_all_cpus_synced(src_cpu, addr, idxmap);
         return;
     }
-    /* If no page bits are significant, this devolves to tlb_flush. */
-    if (bits < TARGET_PAGE_BITS) {
-        tlb_flush_by_mmuidx_all_cpus_synced(src_cpu, idxmap);
-        return;
-    }
 
     /* This should already be page aligned */
     d.addr = addr & TARGET_PAGE_MASK;
-- 
2.43.0




* [PATCH v2 08/54] accel/tcg: Flush entire tlb when a masked range wraps
From: Richard Henderson @ 2024-11-14 16:00 UTC (permalink / raw)
  To: qemu-devel

We expect masked address spaces to be quite large, e.g. 56 bits
for AArch64 top-byte-ignore mode.  We do not expect addr+len to
wrap around, but it is possible with AArch64 guest flush range
instructions.

Convert this unlikely case to a full tlb flush.  This can simplify
the subroutines actually performing the range flush.
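
A worked example of the wrap test below,
(addr ^ (addr + len - 1)) >> bits, using bits = 8 for brevity:

    addr = 0xf0, len = 0x20:  addr + len - 1 = 0x10f
      0x0f0 ^ 0x10f = 0x1ff;  0x1ff >> 8 = 1  ->  wraps: full flush
    addr = 0xd0, len = 0x20:  addr + len - 1 = 0x0ef
      0x0d0 ^ 0x0ef = 0x03f;  0x03f >> 8 = 0  ->  flush the range

The xor is non-zero at or above bit 8 exactly when the first and
last addresses of the range disagree outside the masked space.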

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 5510f40333..31c45a6213 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -802,6 +802,11 @@ void tlb_flush_range_by_mmuidx(CPUState *cpu, vaddr addr,
         tlb_flush_page_by_mmuidx(cpu, addr, idxmap);
         return;
     }
+    /* If addr+len wraps in len bits, fall back to full flush. */
+    if (bits < TARGET_LONG_BITS && ((addr ^ (addr + len - 1)) >> bits)) {
+        tlb_flush_by_mmuidx(cpu, idxmap);
+        return;
+    }
 
     /* This should already be page aligned */
     d.addr = addr & TARGET_PAGE_MASK;
@@ -838,6 +843,11 @@ void tlb_flush_range_by_mmuidx_all_cpus_synced(CPUState *src_cpu,
         tlb_flush_page_by_mmuidx_all_cpus_synced(src_cpu, addr, idxmap);
         return;
     }
+    /* If addr+len wraps in len bits, fall back to full flush. */
+    if (bits < TARGET_LONG_BITS && ((addr ^ (addr + len - 1)) >> bits)) {
+        tlb_flush_by_mmuidx_all_cpus_synced(src_cpu, idxmap);
+        return;
+    }
 
     /* This should already be page aligned */
     d.addr = addr & TARGET_PAGE_MASK;
-- 
2.43.0




* [PATCH v2 09/54] accel/tcg: Add IntervalTreeRoot to CPUTLBDesc
From: Richard Henderson @ 2024-11-14 16:00 UTC (permalink / raw)
  To: qemu-devel

Add the data structures for tracking softmmu pages via
a balanced interval tree.  So far, only initialize and
destroy the data structure.
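
As a preview of how the tree is used by later patches in this
series (shapes taken from patches 10 and 12):

    /* Insert a node covering one page. */
    CPUTLBEntryTree *node = g_new(CPUTLBEntryTree, 1);
    node->itree.start = addr_page;
    node->itree.last = addr_page + TARGET_PAGE_SIZE - 1;
    interval_tree_insert(&node->itree, &desc->iroot);

    /* Find the node, if any, covering @addr. */
    IntervalTreeNode *i = interval_tree_iter_first(&desc->iroot, addr, addr);
    CPUTLBEntryTree *hit = i ? container_of(i, CPUTLBEntryTree, itree) : NULL;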

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/hw/core/cpu.h |  3 +++
 accel/tcg/cputlb.c    | 11 +++++++++++
 2 files changed, 14 insertions(+)

diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index db8a6fbc6e..1ebc999a73 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -35,6 +35,7 @@
 #include "qemu/queue.h"
 #include "qemu/lockcnt.h"
 #include "qemu/thread.h"
+#include "qemu/interval-tree.h"
 #include "qom/object.h"
 
 typedef int (*WriteCoreDumpFunction)(const void *buf, size_t size,
@@ -290,6 +291,8 @@ typedef struct CPUTLBDesc {
     CPUTLBEntry vtable[CPU_VTLB_SIZE];
     CPUTLBEntryFull vfulltlb[CPU_VTLB_SIZE];
     CPUTLBEntryFull *fulltlb;
+    /* All active tlb entries for this address space. */
+    IntervalTreeRoot iroot;
 } CPUTLBDesc;
 
 /*
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 31c45a6213..aa51fc1d26 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -89,6 +89,13 @@ QEMU_BUILD_BUG_ON(sizeof(vaddr) > sizeof(run_on_cpu_data));
 QEMU_BUILD_BUG_ON(NB_MMU_MODES > 16);
 #define ALL_MMUIDX_BITS ((1 << NB_MMU_MODES) - 1)
 
+/* Extra data required to manage CPUTLBEntryFull within an interval tree. */
+typedef struct CPUTLBEntryTree {
+    IntervalTreeNode itree;
+    CPUTLBEntry copy;
+    CPUTLBEntryFull full;
+} CPUTLBEntryTree;
+
 static inline size_t tlb_n_entries(CPUTLBDescFast *fast)
 {
     return (fast->mask >> CPU_TLB_ENTRY_BITS) + 1;
@@ -305,6 +312,7 @@ static void tlb_mmu_flush_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast)
     desc->large_page_mask = -1;
     desc->vindex = 0;
     memset(desc->vtable, -1, sizeof(desc->vtable));
+    interval_tree_free_nodes(&desc->iroot, offsetof(CPUTLBEntryTree, itree));
 }
 
 static void tlb_flush_one_mmuidx_locked(CPUState *cpu, int mmu_idx,
@@ -326,6 +334,7 @@ static void tlb_mmu_init(CPUTLBDesc *desc, CPUTLBDescFast *fast, int64_t now)
     fast->mask = (n_entries - 1) << CPU_TLB_ENTRY_BITS;
     fast->table = g_new(CPUTLBEntry, n_entries);
     desc->fulltlb = g_new(CPUTLBEntryFull, n_entries);
+    memset(&desc->iroot, 0, sizeof(desc->iroot));
     tlb_mmu_flush_locked(desc, fast);
 }
 
@@ -365,6 +374,8 @@ void tlb_destroy(CPUState *cpu)
 
         g_free(fast->table);
         g_free(desc->fulltlb);
+        interval_tree_free_nodes(&cpu->neg.tlb.d[i].iroot,
+                                 offsetof(CPUTLBEntryTree, itree));
     }
 }
 
-- 
2.43.0




* [PATCH v2 10/54] accel/tcg: Populate IntervalTree in tlb_set_page_full
From: Richard Henderson @ 2024-11-14 16:00 UTC (permalink / raw)
  To: qemu-devel

Add or replace an entry in the IntervalTree for each
page installed into softmmu.  We do not yet use the
tree for anything else.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 34 ++++++++++++++++++++++++++++------
 1 file changed, 28 insertions(+), 6 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index aa51fc1d26..ea6a5177de 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -305,6 +305,17 @@ static void tlbfast_flush_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast)
     memset(fast->table, -1, sizeof_tlb(fast));
 }
 
+static CPUTLBEntryTree *tlbtree_lookup_range(CPUTLBDesc *desc, vaddr s, vaddr l)
+{
+    IntervalTreeNode *i = interval_tree_iter_first(&desc->iroot, s, l);
+    return i ? container_of(i, CPUTLBEntryTree, itree) : NULL;
+}
+
+static CPUTLBEntryTree *tlbtree_lookup_addr(CPUTLBDesc *desc, vaddr addr)
+{
+    return tlbtree_lookup_range(desc, addr, addr);
+}
+
 static void tlb_mmu_flush_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast)
 {
     tlbfast_flush_locked(desc, fast);
@@ -1072,7 +1083,8 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
     MemoryRegionSection *section;
     unsigned int index, read_flags, write_flags;
     uintptr_t addend;
-    CPUTLBEntry *te, tn;
+    CPUTLBEntry *te;
+    CPUTLBEntryTree *node;
     hwaddr iotlb, xlat, sz, paddr_page;
     vaddr addr_page;
     int asidx, wp_flags, prot;
@@ -1180,6 +1192,15 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
         tlb_n_used_entries_dec(cpu, mmu_idx);
     }
 
+    /* Replace an old IntervalTree entry, or create a new one. */
+    node = tlbtree_lookup_addr(desc, addr_page);
+    if (!node) {
+        node = g_new(CPUTLBEntryTree, 1);
+        node->itree.start = addr_page;
+        node->itree.last = addr_page + TARGET_PAGE_SIZE - 1;
+        interval_tree_insert(&node->itree, &desc->iroot);
+    }
+
     /* refill the tlb */
     /*
      * When memory region is ram, iotlb contains a TARGET_PAGE_BITS
@@ -1201,15 +1222,15 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
     full->phys_addr = paddr_page;
 
     /* Now calculate the new entry */
-    tn.addend = addend - addr_page;
+    node->copy.addend = addend - addr_page;
 
-    tlb_set_compare(full, &tn, addr_page, read_flags,
+    tlb_set_compare(full, &node->copy, addr_page, read_flags,
                     MMU_INST_FETCH, prot & PAGE_EXEC);
 
     if (wp_flags & BP_MEM_READ) {
         read_flags |= TLB_WATCHPOINT;
     }
-    tlb_set_compare(full, &tn, addr_page, read_flags,
+    tlb_set_compare(full, &node->copy, addr_page, read_flags,
                     MMU_DATA_LOAD, prot & PAGE_READ);
 
     if (prot & PAGE_WRITE_INV) {
@@ -1218,10 +1239,11 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
     if (wp_flags & BP_MEM_WRITE) {
         write_flags |= TLB_WATCHPOINT;
     }
-    tlb_set_compare(full, &tn, addr_page, write_flags,
+    tlb_set_compare(full, &node->copy, addr_page, write_flags,
                     MMU_DATA_STORE, prot & PAGE_WRITE);
 
-    copy_tlb_helper_locked(te, &tn);
+    node->full = *full;
+    copy_tlb_helper_locked(te, &node->copy);
     tlb_n_used_entries_inc(cpu, mmu_idx);
     qemu_spin_unlock(&tlb->c.lock);
 }
-- 
2.43.0




* [PATCH v2 11/54] accel/tcg: Remove IntervalTree entry in tlb_flush_page_locked
From: Richard Henderson @ 2024-11-14 16:00 UTC (permalink / raw)
  To: qemu-devel

Flush a page from the IntervalTree cache.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index ea6a5177de..d532d69083 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -568,6 +568,7 @@ static void tlb_flush_page_locked(CPUState *cpu, int midx, vaddr page)
     CPUTLBDesc *desc = &cpu->neg.tlb.d[midx];
     vaddr lp_addr = desc->large_page_addr;
     vaddr lp_mask = desc->large_page_mask;
+    CPUTLBEntryTree *node;
 
     /* Check if we need to flush due to large pages.  */
     if ((page & lp_mask) == lp_addr) {
@@ -575,10 +576,17 @@ static void tlb_flush_page_locked(CPUState *cpu, int midx, vaddr page)
                   VADDR_PRIx "/%016" VADDR_PRIx ")\n",
                   midx, lp_addr, lp_mask);
         tlb_flush_one_mmuidx_locked(cpu, midx, get_clock_realtime());
-    } else {
-        tlbfast_flush_range_locked(desc, &cpu->neg.tlb.f[midx],
-                                   page, TARGET_PAGE_SIZE, -1);
-        tlb_flush_vtlb_page_locked(cpu, midx, page);
+        return;
+    }
+
+    tlbfast_flush_range_locked(desc, &cpu->neg.tlb.f[midx],
+                               page, TARGET_PAGE_SIZE, -1);
+    tlb_flush_vtlb_page_locked(cpu, midx, page);
+
+    node = tlbtree_lookup_addr(desc, page);
+    if (node) {
+        interval_tree_remove(&node->itree, &desc->iroot);
+        g_free(node);
     }
 }
 
-- 
2.43.0




* [PATCH v2 12/54] accel/tcg: Remove IntervalTree entries in tlb_flush_range_locked
From: Richard Henderson @ 2024-11-14 16:00 UTC (permalink / raw)
  To: qemu-devel

Flush a masked range of pages from the IntervalTree cache.
When the mask is not used there is a redundant comparison,
but that is better than duplicating code at this point.
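
The loop below uses the usual delete-during-walk idiom: fetch the
successor before the current node may be removed, since iteration
continues from a node that is about to be freed.  Schematically
(should_remove is a placeholder for the masked-range test):

    for (node = tlbtree_lookup_range(d, lo, hi); node; node = next) {
        next = tlbtree_lookup_range_next(node, lo, hi);
        if (should_remove(node)) {
            interval_tree_remove(&node->itree, &d->iroot);
            g_free(node);
        }
    }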

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index d532d69083..e2c855f147 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -311,6 +311,13 @@ static CPUTLBEntryTree *tlbtree_lookup_range(CPUTLBDesc *desc, vaddr s, vaddr l)
     return i ? container_of(i, CPUTLBEntryTree, itree) : NULL;
 }
 
+static CPUTLBEntryTree *tlbtree_lookup_range_next(CPUTLBEntryTree *prev,
+                                                  vaddr s, vaddr l)
+{
+    IntervalTreeNode *i = interval_tree_iter_next(&prev->itree, s, l);
+    return i ? container_of(i, CPUTLBEntryTree, itree) : NULL;
+}
+
 static CPUTLBEntryTree *tlbtree_lookup_addr(CPUTLBDesc *desc, vaddr addr)
 {
     return tlbtree_lookup_range(desc, addr, addr);
@@ -739,6 +746,8 @@ static void tlb_flush_range_locked(CPUState *cpu, int midx,
     CPUTLBDesc *d = &cpu->neg.tlb.d[midx];
     CPUTLBDescFast *f = &cpu->neg.tlb.f[midx];
     vaddr mask = MAKE_64BIT_MASK(0, bits);
+    CPUTLBEntryTree *node;
+    vaddr addr_mask, last_mask, last_imask;
 
     /*
      * Check if we need to flush due to large pages.
@@ -759,6 +768,22 @@ static void tlb_flush_range_locked(CPUState *cpu, int midx,
         vaddr page = addr + i;
         tlb_flush_vtlb_page_mask_locked(cpu, midx, page, mask);
     }
+
+    addr_mask = addr & mask;
+    last_mask = addr_mask + len - 1;
+    last_imask = last_mask | ~mask;
+    node = tlbtree_lookup_range(d, addr_mask, last_imask);
+    while (node) {
+        CPUTLBEntryTree *next =
+            tlbtree_lookup_range_next(node, addr_mask, last_imask);
+        vaddr page_mask = node->itree.start & mask;
+
+        if (page_mask >= addr_mask && page_mask < last_mask) {
+            interval_tree_remove(&node->itree, &d->iroot);
+            g_free(node);
+        }
+        node = next;
+    }
 }
 
 typedef struct {
-- 
2.43.0




* [PATCH v2 13/54] accel/tcg: Process IntervalTree entries in tlb_reset_dirty
From: Richard Henderson @ 2024-11-14 16:00 UTC (permalink / raw)
  To: qemu-devel

Update the addr_write copy within each interval tree node.
Tidy the iteration within the other two loops as well.
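
(In the new third loop, the query range [0, -1] relies on -1 being
the all-ones vaddr, so tlbtree_lookup_range/_next visit every node
in the tree.)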

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index e2c855f147..0c9f834cbe 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1010,17 +1010,20 @@ void tlb_reset_dirty(CPUState *cpu, ram_addr_t start1, ram_addr_t length)
 
     qemu_spin_lock(&cpu->neg.tlb.c.lock);
     for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
-        unsigned int i;
-        unsigned int n = tlb_n_entries(&cpu->neg.tlb.f[mmu_idx]);
+        CPUTLBDesc *desc = &cpu->neg.tlb.d[mmu_idx];
+        CPUTLBDescFast *fast = &cpu->neg.tlb.f[mmu_idx];
 
-        for (i = 0; i < n; i++) {
-            tlb_reset_dirty_range_locked(&cpu->neg.tlb.f[mmu_idx].table[i],
-                                         start1, length);
+        for (size_t i = 0, n = tlb_n_entries(fast); i < n; i++) {
+            tlb_reset_dirty_range_locked(&fast->table[i], start1, length);
         }
 
-        for (i = 0; i < CPU_VTLB_SIZE; i++) {
-            tlb_reset_dirty_range_locked(&cpu->neg.tlb.d[mmu_idx].vtable[i],
-                                         start1, length);
+        for (size_t i = 0; i < CPU_VTLB_SIZE; i++) {
+            tlb_reset_dirty_range_locked(&desc->vtable[i], start1, length);
+        }
+
+        for (CPUTLBEntryTree *t = tlbtree_lookup_range(desc, 0, -1); t;
+             t = tlbtree_lookup_range_next(t, 0, -1)) {
+            tlb_reset_dirty_range_locked(&t->copy, start1, length);
         }
     }
     qemu_spin_unlock(&cpu->neg.tlb.c.lock);
-- 
2.43.0




* [PATCH v2 14/54] accel/tcg: Process IntervalTree entries in tlb_set_dirty
From: Richard Henderson @ 2024-11-14 16:00 UTC (permalink / raw)
  To: qemu-devel

Update the addr_write copy within an interval tree node.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 0c9f834cbe..eb85e96ee2 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1049,13 +1049,18 @@ static void tlb_set_dirty(CPUState *cpu, vaddr addr)
     addr &= TARGET_PAGE_MASK;
     qemu_spin_lock(&cpu->neg.tlb.c.lock);
     for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
-        tlb_set_dirty1_locked(tlb_entry(cpu, mmu_idx, addr), addr);
-    }
+        CPUTLBDesc *desc = &cpu->neg.tlb.d[mmu_idx];
+        CPUTLBEntryTree *node;
 
-    for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
-        int k;
-        for (k = 0; k < CPU_VTLB_SIZE; k++) {
-            tlb_set_dirty1_locked(&cpu->neg.tlb.d[mmu_idx].vtable[k], addr);
+        tlb_set_dirty1_locked(tlb_entry(cpu, mmu_idx, addr), addr);
+
+        for (int k = 0; k < CPU_VTLB_SIZE; k++) {
+            tlb_set_dirty1_locked(&desc->vtable[k], addr);
+        }
+
+        node = tlbtree_lookup_addr(desc, addr);
+        if (node) {
+            tlb_set_dirty1_locked(&node->copy, addr);
         }
     }
     qemu_spin_unlock(&cpu->neg.tlb.c.lock);
-- 
2.43.0




* [PATCH v2 15/54] accel/tcg: Use tlb_hit_page in victim_tlb_hit
From: Richard Henderson @ 2024-11-14 16:00 UTC (permalink / raw)
  To: qemu-devel

This is clearer than directly comparing the
page address and the comparator.
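
For reference, the two comparators, as defined in QEMU's
include/exec/cpu-all.h at the time of this series (paraphrased):

    /* @addr must be page aligned. */
    static inline bool tlb_hit_page(uint64_t tlb_addr, vaddr addr)
    {
        return addr == (tlb_addr & (TARGET_PAGE_MASK | TLB_INVALID_MASK));
    }

    static inline bool tlb_hit(uint64_t tlb_addr, vaddr addr)
    {
        return tlb_hit_page(tlb_addr, addr & TARGET_PAGE_MASK);
    }

Note that the masking keeps TLB_INVALID_MASK but strips the other
low flag bits from the comparator before the equality test.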

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index eb85e96ee2..7ecd327297 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1394,9 +1394,8 @@ static bool victim_tlb_hit(CPUState *cpu, size_t mmu_idx, size_t index,
     assert_cpu_is_self(cpu);
     for (vidx = 0; vidx < CPU_VTLB_SIZE; ++vidx) {
         CPUTLBEntry *vtlb = &cpu->neg.tlb.d[mmu_idx].vtable[vidx];
-        uint64_t cmp = tlb_read_idx(vtlb, access_type);
 
-        if (cmp == page) {
+        if (tlb_hit_page(tlb_read_idx(vtlb, access_type), page)) {
             /* Found entry in victim tlb, swap tlb and iotlb.  */
             CPUTLBEntry tmptlb, *tlb = &cpu->neg.tlb.f[mmu_idx].table[index];
 
-- 
2.43.0




* [PATCH v2 16/54] accel/tcg: Pass full addr to victim_tlb_hit
From: Richard Henderson @ 2024-11-14 16:00 UTC (permalink / raw)
  To: qemu-devel

Do not mask the address to the page in these calls.
It is easy enough to use a different helper instead.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 7ecd327297..3aab72ea82 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1387,7 +1387,7 @@ static void io_failed(CPUState *cpu, CPUTLBEntryFull *full, vaddr addr,
 /* Return true if ADDR is present in the victim tlb, and has been copied
    back to the main tlb.  */
 static bool victim_tlb_hit(CPUState *cpu, size_t mmu_idx, size_t index,
-                           MMUAccessType access_type, vaddr page)
+                           MMUAccessType access_type, vaddr addr)
 {
     size_t vidx;
 
@@ -1395,7 +1395,7 @@ static bool victim_tlb_hit(CPUState *cpu, size_t mmu_idx, size_t index,
     for (vidx = 0; vidx < CPU_VTLB_SIZE; ++vidx) {
         CPUTLBEntry *vtlb = &cpu->neg.tlb.d[mmu_idx].vtable[vidx];
 
-        if (tlb_hit_page(tlb_read_idx(vtlb, access_type), page)) {
+        if (tlb_hit(tlb_read_idx(vtlb, access_type), addr)) {
             /* Found entry in victim tlb, swap tlb and iotlb.  */
             CPUTLBEntry tmptlb, *tlb = &cpu->neg.tlb.f[mmu_idx].table[index];
 
@@ -1448,13 +1448,12 @@ static int probe_access_internal(CPUState *cpu, vaddr addr,
     uintptr_t index = tlb_index(cpu, mmu_idx, addr);
     CPUTLBEntry *entry = tlb_entry(cpu, mmu_idx, addr);
     uint64_t tlb_addr = tlb_read_idx(entry, access_type);
-    vaddr page_addr = addr & TARGET_PAGE_MASK;
     int flags = TLB_FLAGS_MASK & ~TLB_FORCE_SLOW;
     bool force_mmio = check_mem_cbs && cpu_plugin_mem_cbs_enabled(cpu);
     CPUTLBEntryFull *full;
 
-    if (!tlb_hit_page(tlb_addr, page_addr)) {
-        if (!victim_tlb_hit(cpu, mmu_idx, index, access_type, page_addr)) {
+    if (!tlb_hit(tlb_addr, addr)) {
+        if (!victim_tlb_hit(cpu, mmu_idx, index, access_type, addr)) {
             if (!tlb_fill_align(cpu, addr, access_type, mmu_idx,
                                 0, fault_size, nonfault, retaddr)) {
                 /* Non-faulting page table read failed.  */
@@ -1734,8 +1733,7 @@ static bool mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
 
     /* If the TLB entry is for a different page, reload and try again.  */
     if (!tlb_hit(tlb_addr, addr)) {
-        if (!victim_tlb_hit(cpu, mmu_idx, index, access_type,
-                            addr & TARGET_PAGE_MASK)) {
+        if (!victim_tlb_hit(cpu, mmu_idx, index, access_type, addr)) {
             tlb_fill_align(cpu, addr, access_type, mmu_idx,
                            memop, data->size, false, ra);
             maybe_resized = true;
@@ -1914,8 +1912,7 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
     /* Check TLB entry and enforce page permissions.  */
     flags = TLB_FLAGS_MASK;
     if (!tlb_hit(tlb_addr_write(tlbe), addr)) {
-        if (!victim_tlb_hit(cpu, mmu_idx, index, MMU_DATA_STORE,
-                            addr & TARGET_PAGE_MASK)) {
+        if (!victim_tlb_hit(cpu, mmu_idx, index, MMU_DATA_STORE, addr)) {
             tlb_fill_align(cpu, addr, MMU_DATA_STORE, mmu_idx,
                            mop, size, false, retaddr);
             did_tlb_fill = true;
-- 
2.43.0




* [PATCH v2 17/54] accel/tcg: Replace victim_tlb_hit with tlbtree_hit
From: Richard Henderson @ 2024-11-14 16:00 UTC (permalink / raw)
  To: qemu-devel

Change from a linear search on the victim tlb
to a balanced binary tree search on the interval tree.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 59 ++++++++++++++++++++++++----------------------
 1 file changed, 31 insertions(+), 28 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 3aab72ea82..ea4b78866b 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1384,35 +1384,38 @@ static void io_failed(CPUState *cpu, CPUTLBEntryFull *full, vaddr addr,
     }
 }
 
-/* Return true if ADDR is present in the victim tlb, and has been copied
-   back to the main tlb.  */
-static bool victim_tlb_hit(CPUState *cpu, size_t mmu_idx, size_t index,
-                           MMUAccessType access_type, vaddr addr)
+/*
+ * Return true if ADDR is present in the interval tree,
+ * and has been copied back to the main tlb.
+ */
+static bool tlbtree_hit(CPUState *cpu, int mmu_idx,
+                        MMUAccessType access_type, vaddr addr)
 {
-    size_t vidx;
+    CPUTLBDesc *desc = &cpu->neg.tlb.d[mmu_idx];
+    CPUTLBDescFast *fast = &cpu->neg.tlb.f[mmu_idx];
+    CPUTLBEntryTree *node;
+    size_t index;
 
     assert_cpu_is_self(cpu);
-    for (vidx = 0; vidx < CPU_VTLB_SIZE; ++vidx) {
-        CPUTLBEntry *vtlb = &cpu->neg.tlb.d[mmu_idx].vtable[vidx];
-
-        if (tlb_hit(tlb_read_idx(vtlb, access_type), addr)) {
-            /* Found entry in victim tlb, swap tlb and iotlb.  */
-            CPUTLBEntry tmptlb, *tlb = &cpu->neg.tlb.f[mmu_idx].table[index];
-
-            qemu_spin_lock(&cpu->neg.tlb.c.lock);
-            copy_tlb_helper_locked(&tmptlb, tlb);
-            copy_tlb_helper_locked(tlb, vtlb);
-            copy_tlb_helper_locked(vtlb, &tmptlb);
-            qemu_spin_unlock(&cpu->neg.tlb.c.lock);
-
-            CPUTLBEntryFull *f1 = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
-            CPUTLBEntryFull *f2 = &cpu->neg.tlb.d[mmu_idx].vfulltlb[vidx];
-            CPUTLBEntryFull tmpf;
-            tmpf = *f1; *f1 = *f2; *f2 = tmpf;
-            return true;
-        }
+    node = tlbtree_lookup_addr(desc, addr);
+    if (!node) {
+        /* There is no cached mapping for this page. */
+        return false;
     }
-    return false;
+
+    if (!tlb_hit(tlb_read_idx(&node->copy, access_type), addr)) {
+        /* This access is not permitted. */
+        return false;
+    }
+
+    /* Install the cached entry. */
+    index = tlbfast_index(fast, addr);
+    qemu_spin_lock(&cpu->neg.tlb.c.lock);
+    copy_tlb_helper_locked(&fast->table[index], &node->copy);
+    qemu_spin_unlock(&cpu->neg.tlb.c.lock);
+
+    desc->fulltlb[index] = node->full;
+    return true;
 }
 
 static void notdirty_write(CPUState *cpu, vaddr mem_vaddr, unsigned size,
@@ -1453,7 +1456,7 @@ static int probe_access_internal(CPUState *cpu, vaddr addr,
     CPUTLBEntryFull *full;
 
     if (!tlb_hit(tlb_addr, addr)) {
-        if (!victim_tlb_hit(cpu, mmu_idx, index, access_type, addr)) {
+        if (!tlbtree_hit(cpu, mmu_idx, access_type, addr)) {
             if (!tlb_fill_align(cpu, addr, access_type, mmu_idx,
                                 0, fault_size, nonfault, retaddr)) {
                 /* Non-faulting page table read failed.  */
@@ -1733,7 +1736,7 @@ static bool mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
 
     /* If the TLB entry is for a different page, reload and try again.  */
     if (!tlb_hit(tlb_addr, addr)) {
-        if (!victim_tlb_hit(cpu, mmu_idx, index, access_type, addr)) {
+        if (!tlbtree_hit(cpu, mmu_idx, access_type, addr)) {
             tlb_fill_align(cpu, addr, access_type, mmu_idx,
                            memop, data->size, false, ra);
             maybe_resized = true;
@@ -1912,7 +1915,7 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
     /* Check TLB entry and enforce page permissions.  */
     flags = TLB_FLAGS_MASK;
     if (!tlb_hit(tlb_addr_write(tlbe), addr)) {
-        if (!victim_tlb_hit(cpu, mmu_idx, index, MMU_DATA_STORE, addr)) {
+        if (!tlbtree_hit(cpu, mmu_idx, MMU_DATA_STORE, addr)) {
             tlb_fill_align(cpu, addr, MMU_DATA_STORE, mmu_idx,
                            mop, size, false, retaddr);
             did_tlb_fill = true;
-- 
2.43.0




* [PATCH v2 18/54] accel/tcg: Remove the victim tlb
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (16 preceding siblings ...)
  2024-11-14 16:00 ` [PATCH v2 17/54] accel/tcg: Replace victim_tlb_hit with tlbtree_hit Richard Henderson
@ 2024-11-14 16:00 ` Richard Henderson
  2024-11-14 18:07   ` Pierrick Bouvier
  2024-11-14 16:00 ` [PATCH v2 19/54] accel/tcg: Remove tlb_n_used_entries_inc Richard Henderson
                   ` (36 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:00 UTC (permalink / raw)
  To: qemu-devel

This has been functionally replaced by the IntervalTree.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/hw/core/cpu.h |  8 -----
 accel/tcg/cputlb.c    | 74 -------------------------------------------
 2 files changed, 82 deletions(-)

diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 1ebc999a73..8eda0574b2 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -201,9 +201,6 @@ struct CPUClass {
  */
 #define NB_MMU_MODES 16
 
-/* Use a fully associative victim tlb of 8 entries. */
-#define CPU_VTLB_SIZE 8
-
 /*
  * The full TLB entry, which is not accessed by generated TCG code,
  * so the layout is not as critical as that of CPUTLBEntry. This is
@@ -285,11 +282,6 @@ typedef struct CPUTLBDesc {
     /* maximum number of entries observed in the window */
     size_t window_max_entries;
     size_t n_used_entries;
-    /* The next index to use in the tlb victim table.  */
-    size_t vindex;
-    /* The tlb victim table, in two parts.  */
-    CPUTLBEntry vtable[CPU_VTLB_SIZE];
-    CPUTLBEntryFull vfulltlb[CPU_VTLB_SIZE];
     CPUTLBEntryFull *fulltlb;
     /* All active tlb entries for this address space. */
     IntervalTreeRoot iroot;
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index ea4b78866b..8caa8c0f1d 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -328,8 +328,6 @@ static void tlb_mmu_flush_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast)
     tlbfast_flush_locked(desc, fast);
     desc->large_page_addr = -1;
     desc->large_page_mask = -1;
-    desc->vindex = 0;
-    memset(desc->vtable, -1, sizeof(desc->vtable));
     interval_tree_free_nodes(&desc->iroot, offsetof(CPUTLBEntryTree, itree));
 }
 
@@ -361,11 +359,6 @@ static inline void tlb_n_used_entries_inc(CPUState *cpu, uintptr_t mmu_idx)
     cpu->neg.tlb.d[mmu_idx].n_used_entries++;
 }
 
-static inline void tlb_n_used_entries_dec(CPUState *cpu, uintptr_t mmu_idx)
-{
-    cpu->neg.tlb.d[mmu_idx].n_used_entries--;
-}
-
 void tlb_init(CPUState *cpu)
 {
     int64_t now = get_clock_realtime();
@@ -496,20 +489,6 @@ static bool tlb_hit_page_mask_anyprot(CPUTLBEntry *tlb_entry,
             page == (tlb_entry->addr_code & mask));
 }
 
-static inline bool tlb_hit_page_anyprot(CPUTLBEntry *tlb_entry, vaddr page)
-{
-    return tlb_hit_page_mask_anyprot(tlb_entry, page, -1);
-}
-
-/**
- * tlb_entry_is_empty - return true if the entry is not in use
- * @te: pointer to CPUTLBEntry
- */
-static inline bool tlb_entry_is_empty(const CPUTLBEntry *te)
-{
-    return te->addr_read == -1 && te->addr_write == -1 && te->addr_code == -1;
-}
-
 /* Called with tlb_c.lock held */
 static bool tlb_flush_entry_mask_locked(CPUTLBEntry *tlb_entry,
                                         vaddr page,
@@ -522,28 +501,6 @@ static bool tlb_flush_entry_mask_locked(CPUTLBEntry *tlb_entry,
     return false;
 }
 
-/* Called with tlb_c.lock held */
-static void tlb_flush_vtlb_page_mask_locked(CPUState *cpu, int mmu_idx,
-                                            vaddr page,
-                                            vaddr mask)
-{
-    CPUTLBDesc *d = &cpu->neg.tlb.d[mmu_idx];
-    int k;
-
-    assert_cpu_is_self(cpu);
-    for (k = 0; k < CPU_VTLB_SIZE; k++) {
-        if (tlb_flush_entry_mask_locked(&d->vtable[k], page, mask)) {
-            tlb_n_used_entries_dec(cpu, mmu_idx);
-        }
-    }
-}
-
-static inline void tlb_flush_vtlb_page_locked(CPUState *cpu, int mmu_idx,
-                                              vaddr page)
-{
-    tlb_flush_vtlb_page_mask_locked(cpu, mmu_idx, page, -1);
-}
-
 static void tlbfast_flush_range_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast,
                                        vaddr addr, vaddr len, vaddr mask)
 {
@@ -588,7 +545,6 @@ static void tlb_flush_page_locked(CPUState *cpu, int midx, vaddr page)
 
     tlbfast_flush_range_locked(desc, &cpu->neg.tlb.f[midx],
                                page, TARGET_PAGE_SIZE, -1);
-    tlb_flush_vtlb_page_locked(cpu, midx, page);
 
     node = tlbtree_lookup_addr(desc, page);
     if (node) {
@@ -764,11 +720,6 @@ static void tlb_flush_range_locked(CPUState *cpu, int midx,
 
     tlbfast_flush_range_locked(d, f, addr, len, mask);
 
-    for (vaddr i = 0; i < len; i += TARGET_PAGE_SIZE) {
-        vaddr page = addr + i;
-        tlb_flush_vtlb_page_mask_locked(cpu, midx, page, mask);
-    }
-
     addr_mask = addr & mask;
     last_mask = addr_mask + len - 1;
     last_imask = last_mask | ~mask;
@@ -1017,10 +968,6 @@ void tlb_reset_dirty(CPUState *cpu, ram_addr_t start1, ram_addr_t length)
             tlb_reset_dirty_range_locked(&fast->table[i], start1, length);
         }
 
-        for (size_t i = 0; i < CPU_VTLB_SIZE; i++) {
-            tlb_reset_dirty_range_locked(&desc->vtable[i], start1, length);
-        }
-
         for (CPUTLBEntryTree *t = tlbtree_lookup_range(desc, 0, -1); t;
              t = tlbtree_lookup_range_next(t, 0, -1)) {
             tlb_reset_dirty_range_locked(&t->copy, start1, length);
@@ -1054,10 +1001,6 @@ static void tlb_set_dirty(CPUState *cpu, vaddr addr)
 
         tlb_set_dirty1_locked(tlb_entry(cpu, mmu_idx, addr), addr);
 
-        for (int k = 0; k < CPU_VTLB_SIZE; k++) {
-            tlb_set_dirty1_locked(&desc->vtable[k], addr);
-        }
-
         node = tlbtree_lookup_addr(desc, addr);
         if (node) {
             tlb_set_dirty1_locked(&node->copy, addr);
@@ -1216,23 +1159,6 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
     /* Note that the tlb is no longer clean.  */
     tlb->c.dirty |= 1 << mmu_idx;
 
-    /* Make sure there's no cached translation for the new page.  */
-    tlb_flush_vtlb_page_locked(cpu, mmu_idx, addr_page);
-
-    /*
-     * Only evict the old entry to the victim tlb if it's for a
-     * different page; otherwise just overwrite the stale data.
-     */
-    if (!tlb_hit_page_anyprot(te, addr_page) && !tlb_entry_is_empty(te)) {
-        unsigned vidx = desc->vindex++ % CPU_VTLB_SIZE;
-        CPUTLBEntry *tv = &desc->vtable[vidx];
-
-        /* Evict the old entry into the victim tlb.  */
-        copy_tlb_helper_locked(tv, te);
-        desc->vfulltlb[vidx] = desc->fulltlb[index];
-        tlb_n_used_entries_dec(cpu, mmu_idx);
-    }
-
     /* Replace an old IntervalTree entry, or create a new one. */
     node = tlbtree_lookup_addr(desc, addr_page);
     if (!node) {
-- 
2.43.0




* [PATCH v2 19/54] accel/tcg: Remove tlb_n_used_entries_inc
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (17 preceding siblings ...)
  2024-11-14 16:00 ` [PATCH v2 18/54] accel/tcg: Remove the victim tlb Richard Henderson
@ 2024-11-14 16:00 ` Richard Henderson
  2024-11-14 18:07   ` Pierrick Bouvier
  2024-11-14 16:00 ` [PATCH v2 20/54] include/exec/tlb-common: Move CPUTLBEntryFull from hw/core/cpu.h Richard Henderson
                   ` (35 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:00 UTC (permalink / raw)
  To: qemu-devel

Expand the function into its only caller, using the
existing CPUTLBDesc local pointer.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 8caa8c0f1d..3e24529f4f 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -354,11 +354,6 @@ static void tlb_mmu_init(CPUTLBDesc *desc, CPUTLBDescFast *fast, int64_t now)
     tlb_mmu_flush_locked(desc, fast);
 }
 
-static inline void tlb_n_used_entries_inc(CPUState *cpu, uintptr_t mmu_idx)
-{
-    cpu->neg.tlb.d[mmu_idx].n_used_entries++;
-}
-
 void tlb_init(CPUState *cpu)
 {
     int64_t now = get_clock_realtime();
@@ -1211,7 +1206,7 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
 
     node->full = *full;
     copy_tlb_helper_locked(te, &node->copy);
-    tlb_n_used_entries_inc(cpu, mmu_idx);
+    desc->n_used_entries++;
     qemu_spin_unlock(&tlb->c.lock);
 }
 
-- 
2.43.0




* [PATCH v2 20/54] include/exec/tlb-common: Move CPUTLBEntryFull from hw/core/cpu.h
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (18 preceding siblings ...)
  2024-11-14 16:00 ` [PATCH v2 19/54] accel/tcg: Remove tlb_n_used_entries_inc Richard Henderson
@ 2024-11-14 16:00 ` Richard Henderson
  2024-11-14 18:08   ` Pierrick Bouvier
  2024-11-14 16:00 ` [PATCH v2 21/54] accel/tcg: Delay plugin adjustment in probe_access_internal Richard Henderson
                   ` (34 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:00 UTC (permalink / raw)
  To: qemu-devel

CPUTLBEntryFull structures are no longer directly included within
the CPUState structure.  Move the structure definition out of cpu.h
to reduce visibility.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/tlb-common.h | 63 +++++++++++++++++++++++++++++++++++++++
 include/hw/core/cpu.h     | 63 ---------------------------------------
 2 files changed, 63 insertions(+), 63 deletions(-)

diff --git a/include/exec/tlb-common.h b/include/exec/tlb-common.h
index dc5a5faa0b..300f9fae67 100644
--- a/include/exec/tlb-common.h
+++ b/include/exec/tlb-common.h
@@ -53,4 +53,67 @@ typedef struct CPUTLBDescFast {
     CPUTLBEntry *table;
 } CPUTLBDescFast QEMU_ALIGNED(2 * sizeof(void *));
 
+/*
+ * The full TLB entry, which is not accessed by generated TCG code,
+ * so the layout is not as critical as that of CPUTLBEntry. This is
+ * also why we don't want to combine the two structs.
+ */
+struct CPUTLBEntryFull {
+    /*
+     * @xlat_section contains:
+     *  - in the lower TARGET_PAGE_BITS, a physical section number
+     *  - with the lower TARGET_PAGE_BITS masked off, an offset which
+     *    must be added to the virtual address to obtain:
+     *     + the ram_addr_t of the target RAM (if the physical section
+     *       number is PHYS_SECTION_NOTDIRTY or PHYS_SECTION_ROM)
+     *     + the offset within the target MemoryRegion (otherwise)
+     */
+    hwaddr xlat_section;
+
+    /*
+     * @phys_addr contains the physical address in the address space
+     * given by cpu_asidx_from_attrs(cpu, @attrs).
+     */
+    hwaddr phys_addr;
+
+    /* @attrs contains the memory transaction attributes for the page. */
+    MemTxAttrs attrs;
+
+    /* @prot contains the complete protections for the page. */
+    uint8_t prot;
+
+    /* @lg_page_size contains the log2 of the page size. */
+    uint8_t lg_page_size;
+
+    /* Additional tlb flags requested by tlb_fill. */
+    uint8_t tlb_fill_flags;
+
+    /*
+     * Additional tlb flags for use by the slow path. If non-zero,
+     * the corresponding CPUTLBEntry comparator must have TLB_FORCE_SLOW.
+     */
+    uint8_t slow_flags[MMU_ACCESS_COUNT];
+
+    /*
+     * Allow target-specific additions to this structure.
+     * This may be used to cache items from the guest cpu
+     * page tables for later use by the implementation.
+     */
+    union {
+        /*
+         * Cache the attrs and shareability fields from the page table entry.
+         *
+         * For ARMMMUIdx_Stage2*, pte_attrs is the S2 descriptor bits [5:2].
+         * Otherwise, pte_attrs is the same as the MAIR_EL1 8-bit format.
+         * For shareability and guarded, as in the SH and GP fields respectively
+         * of the VMSAv8-64 PTEs.
+         */
+        struct {
+            uint8_t pte_attrs;
+            uint8_t shareability;
+            bool guarded;
+        } arm;
+    } extra;
+};
+
 #endif /* EXEC_TLB_COMMON_H */
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 8eda0574b2..4364ddb1db 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -201,69 +201,6 @@ struct CPUClass {
  */
 #define NB_MMU_MODES 16
 
-/*
- * The full TLB entry, which is not accessed by generated TCG code,
- * so the layout is not as critical as that of CPUTLBEntry. This is
- * also why we don't want to combine the two structs.
- */
-struct CPUTLBEntryFull {
-    /*
-     * @xlat_section contains:
-     *  - in the lower TARGET_PAGE_BITS, a physical section number
-     *  - with the lower TARGET_PAGE_BITS masked off, an offset which
-     *    must be added to the virtual address to obtain:
-     *     + the ram_addr_t of the target RAM (if the physical section
-     *       number is PHYS_SECTION_NOTDIRTY or PHYS_SECTION_ROM)
-     *     + the offset within the target MemoryRegion (otherwise)
-     */
-    hwaddr xlat_section;
-
-    /*
-     * @phys_addr contains the physical address in the address space
-     * given by cpu_asidx_from_attrs(cpu, @attrs).
-     */
-    hwaddr phys_addr;
-
-    /* @attrs contains the memory transaction attributes for the page. */
-    MemTxAttrs attrs;
-
-    /* @prot contains the complete protections for the page. */
-    uint8_t prot;
-
-    /* @lg_page_size contains the log2 of the page size. */
-    uint8_t lg_page_size;
-
-    /* Additional tlb flags requested by tlb_fill. */
-    uint8_t tlb_fill_flags;
-
-    /*
-     * Additional tlb flags for use by the slow path. If non-zero,
-     * the corresponding CPUTLBEntry comparator must have TLB_FORCE_SLOW.
-     */
-    uint8_t slow_flags[MMU_ACCESS_COUNT];
-
-    /*
-     * Allow target-specific additions to this structure.
-     * This may be used to cache items from the guest cpu
-     * page tables for later use by the implementation.
-     */
-    union {
-        /*
-         * Cache the attrs and shareability fields from the page table entry.
-         *
-         * For ARMMMUIdx_Stage2*, pte_attrs is the S2 descriptor bits [5:2].
-         * Otherwise, pte_attrs is the same as the MAIR_EL1 8-bit format.
-         * For shareability and guarded, as in the SH and GP fields respectively
-         * of the VMSAv8-64 PTEs.
-         */
-        struct {
-            uint8_t pte_attrs;
-            uint8_t shareability;
-            bool guarded;
-        } arm;
-    } extra;
-};
-
 /*
  * Data elements that are per MMU mode, minus the bits accessed by
  * the TCG fast path.
-- 
2.43.0




* [PATCH v2 21/54] accel/tcg: Delay plugin adjustment in probe_access_internal
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (19 preceding siblings ...)
  2024-11-14 16:00 ` [PATCH v2 20/54] include/exec/tlb-common: Move CPUTLBEntryFull from hw/core/cpu.h Richard Henderson
@ 2024-11-14 16:00 ` Richard Henderson
  2024-11-14 18:09   ` Pierrick Bouvier
  2024-11-14 16:00 ` [PATCH v2 22/54] accel/tcg: Call cpu_ld*_code_mmu from cpu_ld*_code Richard Henderson
                   ` (33 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:00 UTC (permalink / raw)
  To: qemu-devel

Remove the force_mmio local and fold its expression into the if
condition, behind the short-circuit terms that can skip its
computation entirely.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)
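
The short-circuit point in miniature, with a hypothetical
expensive_check standing in for cpu_plugin_mem_cbs_enabled: computed
up front into a local, the call runs on every lookup; placed last in
the condition, it runs only when the earlier terms do not already
decide the result.

#include <stdbool.h>

bool expensive_check(void);    /* stand-in for cpu_plugin_mem_cbs_enabled */

/* Before: the call executes even when `flags` alone decides. */
static bool needs_slow_path_before(int flags, bool check_cbs)
{
    bool force = check_cbs && expensive_check();
    return (flags != 0) || force;
}

/* After: the call executes only when flags == 0 and check_cbs is true. */
static bool needs_slow_path_after(int flags, bool check_cbs)
{
    return (flags != 0) || (check_cbs && expensive_check());
}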

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 3e24529f4f..a4c69bcbf1 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1373,7 +1373,6 @@ static int probe_access_internal(CPUState *cpu, vaddr addr,
     CPUTLBEntry *entry = tlb_entry(cpu, mmu_idx, addr);
     uint64_t tlb_addr = tlb_read_idx(entry, access_type);
     int flags = TLB_FLAGS_MASK & ~TLB_FORCE_SLOW;
-    bool force_mmio = check_mem_cbs && cpu_plugin_mem_cbs_enabled(cpu);
     CPUTLBEntryFull *full;
 
     if (!tlb_hit(tlb_addr, addr)) {
@@ -1404,9 +1403,14 @@ static int probe_access_internal(CPUState *cpu, vaddr addr,
     *pfull = full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
     flags |= full->slow_flags[access_type];
 
-    /* Fold all "mmio-like" bits into TLB_MMIO.  This is not RAM.  */
-    if (unlikely(flags & ~(TLB_WATCHPOINT | TLB_NOTDIRTY | TLB_CHECK_ALIGNED))
-        || (access_type != MMU_INST_FETCH && force_mmio)) {
+    /*
+     * Fold all "mmio-like" bits, and required plugin callbacks, to TLB_MMIO.
+     * These cannot be treated as RAM.
+     */
+    if ((flags & ~(TLB_WATCHPOINT | TLB_NOTDIRTY | TLB_CHECK_ALIGNED))
+        || (access_type != MMU_INST_FETCH
+            && check_mem_cbs
+            && cpu_plugin_mem_cbs_enabled(cpu))) {
         *phost = NULL;
         return TLB_MMIO;
     }
-- 
2.43.0




* [PATCH v2 22/54] accel/tcg: Call cpu_ld*_code_mmu from cpu_ld*_code
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (20 preceding siblings ...)
  2024-11-14 16:00 ` [PATCH v2 21/54] accel/tcg: Delay plugin adjustment in probe_access_internal Richard Henderson
@ 2024-11-14 16:00 ` Richard Henderson
  2024-11-14 18:09   ` Pierrick Bouvier
  2024-11-14 16:00 ` [PATCH v2 23/54] accel/tcg: Check original prot bits for read in atomic_mmu_lookup Richard Henderson
                   ` (32 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:00 UTC (permalink / raw)
  To: qemu-devel

Ensure a common entry point for all code lookups.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index a4c69bcbf1..c975dd2322 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -2924,28 +2924,28 @@ uint32_t cpu_ldub_code(CPUArchState *env, abi_ptr addr)
 {
     CPUState *cs = env_cpu(env);
     MemOpIdx oi = make_memop_idx(MO_UB, cpu_mmu_index(cs, true));
-    return do_ld1_mmu(cs, addr, oi, 0, MMU_INST_FETCH);
+    return cpu_ldb_code_mmu(env, addr, oi, 0);
 }
 
 uint32_t cpu_lduw_code(CPUArchState *env, abi_ptr addr)
 {
     CPUState *cs = env_cpu(env);
     MemOpIdx oi = make_memop_idx(MO_TEUW, cpu_mmu_index(cs, true));
-    return do_ld2_mmu(cs, addr, oi, 0, MMU_INST_FETCH);
+    return cpu_ldw_code_mmu(env, addr, oi, 0);
 }
 
 uint32_t cpu_ldl_code(CPUArchState *env, abi_ptr addr)
 {
     CPUState *cs = env_cpu(env);
     MemOpIdx oi = make_memop_idx(MO_TEUL, cpu_mmu_index(cs, true));
-    return do_ld4_mmu(cs, addr, oi, 0, MMU_INST_FETCH);
+    return cpu_ldl_code_mmu(env, addr, oi, 0);
 }
 
 uint64_t cpu_ldq_code(CPUArchState *env, abi_ptr addr)
 {
     CPUState *cs = env_cpu(env);
     MemOpIdx oi = make_memop_idx(MO_TEUQ, cpu_mmu_index(cs, true));
-    return do_ld8_mmu(cs, addr, oi, 0, MMU_INST_FETCH);
+    return cpu_ldq_code_mmu(env, addr, oi, 0);
 }
 
 uint8_t cpu_ldb_code_mmu(CPUArchState *env, abi_ptr addr,
-- 
2.43.0




* [PATCH v2 23/54] accel/tcg: Check original prot bits for read in atomic_mmu_lookup
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (21 preceding siblings ...)
  2024-11-14 16:00 ` [PATCH v2 22/54] accel/tcg: Call cpu_ld*_code_mmu from cpu_ld*_code Richard Henderson
@ 2024-11-14 16:00 ` Richard Henderson
  2024-11-14 18:09   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 24/54] accel/tcg: Preserve tlb flags in tlb_set_compare Richard Henderson
                   ` (31 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:00 UTC (permalink / raw)
  To: qemu-devel

In the mists of time before CPUTLBEntryFull existed, we had to be
clever to detect write-only pages.  Now we can directly test the
saved prot bits, which is clearer.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)
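
The two tests side by side, pulled out of context (PAGE_READ as in
QEMU's page protection flags; everything else simplified for
illustration):

#include <stdint.h>
#include <stdbool.h>

#define PAGE_READ 0x0001   /* matches QEMU's page protection bit */

/* Old: infer a write-only page from the read comparator sentinel,
 * relying on addr_read being -1 iff PAGE_READ was unset. */
static bool rmw_needs_read_fill_old(uint64_t addr_read)
{
    return addr_read == (uint64_t)-1;
}

/* New: ask the protection bits saved in CPUTLBEntryFull directly. */
static bool rmw_needs_read_fill_new(uint8_t prot)
{
    return !(prot & PAGE_READ);
}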

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index c975dd2322..ae3a99eb47 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1854,14 +1854,13 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
             flags &= ~TLB_INVALID_MASK;
         }
     }
+    full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
 
     /*
      * Let the guest notice RMW on a write-only page.
      * We have just verified that the page is writable.
-     * Subpage lookups may have left TLB_INVALID_MASK set,
-     * but addr_read will only be -1 if PAGE_READ was unset.
      */
-    if (unlikely(tlbe->addr_read == -1)) {
+    if (unlikely(!(full->prot & PAGE_READ))) {
         tlb_fill_align(cpu, addr, MMU_DATA_LOAD, mmu_idx,
                        0, size, false, retaddr);
         /*
@@ -1899,7 +1898,6 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
     }
 
     hostaddr = (void *)((uintptr_t)addr + tlbe->addend);
-    full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
 
     if (unlikely(flags & TLB_NOTDIRTY)) {
         notdirty_write(cpu, addr, size, full, retaddr);
-- 
2.43.0




* [PATCH v2 24/54] accel/tcg: Preserve tlb flags in tlb_set_compare
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (22 preceding siblings ...)
  2024-11-14 16:00 ` [PATCH v2 23/54] accel/tcg: Check original prot bits for read in atomic_mmu_lookup Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:11   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 25/54] accel/tcg: Return CPUTLBEntryFull not pointer in probe_access_full_mmu Richard Henderson
                   ` (30 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

Before, if !enable, we squashed the entire address comparator to -1.
That seemed natural, because the tlb is cleared with a memset of 0xff,
and it worked because -1 includes TLB_INVALID_MASK.

With this patch, we retain all of the other TLB_* bits even when the
page is not enabled; this still works because TLB_INVALID_MASK is set.
A subsequent patch will rely on this: the addr_read comparator will
carry the flags for pages that are executable but not readable.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)
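
How the comparator is now assembled, as a standalone sketch (the flag
bit assignments below are made up for illustration; the real masks
live in the TLB headers):

#include <stdint.h>
#include <stdbool.h>

/* Illustrative bit assignments only. */
#define TLB_INVALID_MASK     (1u << 0)
#define TLB_FORCE_SLOW       (1u << 1)
#define TLB_FLAGS_MASK       0x003fu
#define TLB_SLOW_FLAGS_MASK  0x0fc0u

static uint64_t encode_compare(uint64_t address, unsigned flags, bool enable)
{
    if (!enable) {
        /* No address bits: TLB_INVALID_MASK guarantees tlb_hit() fails,
         * while the flag bits merged below remain readable. */
        address = TLB_INVALID_MASK;
    }
    address |= flags & TLB_FLAGS_MASK;      /* fast-path flags */
    if (flags & TLB_SLOW_FLAGS_MASK) {
        address |= TLB_FORCE_SLOW;          /* slow flags stored aside */
    }
    return address;
}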

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index ae3a99eb47..585f4171cc 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1032,15 +1032,13 @@ static inline void tlb_set_compare(CPUTLBEntryFull *full, CPUTLBEntry *ent,
                                    vaddr address, int flags,
                                    MMUAccessType access_type, bool enable)
 {
-    if (enable) {
-        address |= flags & TLB_FLAGS_MASK;
-        flags &= TLB_SLOW_FLAGS_MASK;
-        if (flags) {
-            address |= TLB_FORCE_SLOW;
-        }
-    } else {
-        address = -1;
-        flags = 0;
+    if (!enable) {
+        address = TLB_INVALID_MASK;
+    }
+    address |= flags & TLB_FLAGS_MASK;
+    flags &= TLB_SLOW_FLAGS_MASK;
+    if (flags) {
+        address |= TLB_FORCE_SLOW;
     }
     ent->addr_idx[access_type] = address;
     full->slow_flags[access_type] = flags;
-- 
2.43.0




* [PATCH v2 25/54] accel/tcg: Return CPUTLBEntryFull not pointer in probe_access_full_mmu
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (23 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 24/54] accel/tcg: Preserve tlb flags in tlb_set_compare Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:11   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 26/54] accel/tcg: Return CPUTLBEntryFull not pointer in probe_access_full Richard Henderson
                   ` (29 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

Return a copy of the structure, not a pointer.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/exec-all.h              |  2 +-
 accel/tcg/cputlb.c                   | 13 ++++++++-----
 target/arm/ptw.c                     | 10 +++++-----
 target/i386/tcg/sysemu/excp_helper.c |  8 ++++----
 4 files changed, 18 insertions(+), 15 deletions(-)
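
The hazard being removed, in a reduced form with hypothetical names:
a pointer into a table that a later lookup may reallocate can dangle,
while a by-value copy stays valid.

#include <stddef.h>

typedef struct Full { unsigned prot; } Full;

static Full *table;    /* may be realloc'ed by a tlb resize */

/* Risky: the returned pointer is invalidated if `table` moves. */
static Full *lookup_ref(size_t idx)
{
    return &table[idx];
}

/* Safe: the caller owns an independent copy. */
static Full lookup_copy(size_t idx)
{
    return table[idx];
}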

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 2e4c4cc4b4..df7d0b5ad0 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -393,7 +393,7 @@ int probe_access_full(CPUArchState *env, vaddr addr, int size,
  */
 int probe_access_full_mmu(CPUArchState *env, vaddr addr, int size,
                           MMUAccessType access_type, int mmu_idx,
-                          void **phost, CPUTLBEntryFull **pfull);
+                          void **phost, CPUTLBEntryFull *pfull);
 
 #endif /* !CONFIG_USER_ONLY */
 #endif /* CONFIG_TCG */
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 585f4171cc..81135524eb 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1439,25 +1439,28 @@ int probe_access_full(CPUArchState *env, vaddr addr, int size,
 
 int probe_access_full_mmu(CPUArchState *env, vaddr addr, int size,
                           MMUAccessType access_type, int mmu_idx,
-                          void **phost, CPUTLBEntryFull **pfull)
+                          void **phost, CPUTLBEntryFull *pfull)
 {
     void *discard_phost;
-    CPUTLBEntryFull *discard_tlb;
+    CPUTLBEntryFull *full;
 
     /* privately handle users that don't need full results */
     phost = phost ? phost : &discard_phost;
-    pfull = pfull ? pfull : &discard_tlb;
 
     int flags = probe_access_internal(env_cpu(env), addr, size, access_type,
-                                      mmu_idx, true, phost, pfull, 0, false);
+                                      mmu_idx, true, phost, &full, 0, false);
 
     /* Handle clean RAM pages.  */
     if (unlikely(flags & TLB_NOTDIRTY)) {
         int dirtysize = size == 0 ? 1 : size;
-        notdirty_write(env_cpu(env), addr, dirtysize, *pfull, 0);
+        notdirty_write(env_cpu(env), addr, dirtysize, full, 0);
         flags &= ~TLB_NOTDIRTY;
     }
 
+    if (pfull) {
+        *pfull = *full;
+    }
+
     return flags;
 }
 
diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index 9849949508..3ae5f524de 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -592,7 +592,7 @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
         ptw->out_space = s2.f.attrs.space;
     } else {
 #ifdef CONFIG_TCG
-        CPUTLBEntryFull *full;
+        CPUTLBEntryFull full;
         int flags;
 
         env->tlb_fi = fi;
@@ -604,10 +604,10 @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
         if (unlikely(flags & TLB_INVALID_MASK)) {
             goto fail;
         }
-        ptw->out_phys = full->phys_addr | (addr & ~TARGET_PAGE_MASK);
-        ptw->out_rw = full->prot & PAGE_WRITE;
-        pte_attrs = full->extra.arm.pte_attrs;
-        ptw->out_space = full->attrs.space;
+        ptw->out_phys = full.phys_addr | (addr & ~TARGET_PAGE_MASK);
+        ptw->out_rw = full.prot & PAGE_WRITE;
+        pte_attrs = full.extra.arm.pte_attrs;
+        ptw->out_space = full.attrs.space;
 #else
         g_assert_not_reached();
 #endif
diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c
index 02d3486421..168ff8e5f3 100644
--- a/target/i386/tcg/sysemu/excp_helper.c
+++ b/target/i386/tcg/sysemu/excp_helper.c
@@ -436,7 +436,7 @@ do_check_protect_pse36:
      * addresses) using the address with the A20 bit set.
      */
     if (in->ptw_idx == MMU_NESTED_IDX) {
-        CPUTLBEntryFull *full;
+        CPUTLBEntryFull full;
         int flags, nested_page_size;
 
         flags = probe_access_full_mmu(env, paddr, 0, access_type,
@@ -451,7 +451,7 @@ do_check_protect_pse36:
         }
 
         /* Merge stage1 & stage2 protection bits. */
-        prot &= full->prot;
+        prot &= full.prot;
 
         /* Re-verify resulting protection. */
         if ((prot & (1 << access_type)) == 0) {
@@ -459,8 +459,8 @@ do_check_protect_pse36:
         }
 
         /* Merge stage1 & stage2 addresses to final physical address. */
-        nested_page_size = 1 << full->lg_page_size;
-        paddr = (full->phys_addr & ~(nested_page_size - 1))
+        nested_page_size = 1 << full.lg_page_size;
+        paddr = (full.phys_addr & ~(nested_page_size - 1))
               | (paddr & (nested_page_size - 1));
 
         /*
-- 
2.43.0




* [PATCH v2 26/54] accel/tcg: Return CPUTLBEntryFull not pointer in probe_access_full
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (24 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 25/54] accel/tcg: Return CPUTLBEntryFull not pointer in probe_access_full_mmu Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:12   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 27/54] accel/tcg: Return CPUTLBEntryFull not pointer in probe_access_internal Richard Henderson
                   ` (28 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

Return a copy of the structure, not a pointer.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/exec-all.h     |  6 +-----
 accel/tcg/cputlb.c          |  8 +++++---
 target/arm/tcg/helper-a64.c |  4 ++--
 target/arm/tcg/mte_helper.c | 15 ++++++---------
 target/arm/tcg/sve_helper.c |  6 +++---
 5 files changed, 17 insertions(+), 22 deletions(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index df7d0b5ad0..69bdb77584 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -365,10 +365,6 @@ int probe_access_flags(CPUArchState *env, vaddr addr, int size,
  * probe_access_full:
  * Like probe_access_flags, except also return into @pfull.
  *
- * The CPUTLBEntryFull structure returned via @pfull is transient
- * and must be consumed or copied immediately, before any further
- * access or changes to TLB @mmu_idx.
- *
  * This function will not fault if @nonfault is set, but will
  * return TLB_INVALID_MASK if the page is not mapped, or is not
  * accessible with @access_type.
@@ -379,7 +375,7 @@ int probe_access_flags(CPUArchState *env, vaddr addr, int size,
 int probe_access_full(CPUArchState *env, vaddr addr, int size,
                       MMUAccessType access_type, int mmu_idx,
                       bool nonfault, void **phost,
-                      CPUTLBEntryFull **pfull, uintptr_t retaddr);
+                      CPUTLBEntryFull *pfull, uintptr_t retaddr);
 
 /**
  * probe_access_full_mmu:
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 81135524eb..84e7e633e3 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1420,20 +1420,22 @@ static int probe_access_internal(CPUState *cpu, vaddr addr,
 
 int probe_access_full(CPUArchState *env, vaddr addr, int size,
                       MMUAccessType access_type, int mmu_idx,
-                      bool nonfault, void **phost, CPUTLBEntryFull **pfull,
+                      bool nonfault, void **phost, CPUTLBEntryFull *pfull,
                       uintptr_t retaddr)
 {
+    CPUTLBEntryFull *full;
     int flags = probe_access_internal(env_cpu(env), addr, size, access_type,
-                                      mmu_idx, nonfault, phost, pfull, retaddr,
+                                      mmu_idx, nonfault, phost, &full, retaddr,
                                       true);
 
     /* Handle clean RAM pages.  */
     if (unlikely(flags & TLB_NOTDIRTY)) {
         int dirtysize = size == 0 ? 1 : size;
-        notdirty_write(env_cpu(env), addr, dirtysize, *pfull, retaddr);
+        notdirty_write(env_cpu(env), addr, dirtysize, full, retaddr);
         flags &= ~TLB_NOTDIRTY;
     }
 
+    *pfull = *full;
     return flags;
 }
 
diff --git a/target/arm/tcg/helper-a64.c b/target/arm/tcg/helper-a64.c
index 8f42a28d07..783864d6db 100644
--- a/target/arm/tcg/helper-a64.c
+++ b/target/arm/tcg/helper-a64.c
@@ -1883,14 +1883,14 @@ static bool is_guarded_page(CPUARMState *env, target_ulong addr, uintptr_t ra)
 #ifdef CONFIG_USER_ONLY
     return page_get_flags(addr) & PAGE_BTI;
 #else
-    CPUTLBEntryFull *full;
+    CPUTLBEntryFull full;
     void *host;
     int mmu_idx = cpu_mmu_index(env_cpu(env), true);
     int flags = probe_access_full(env, addr, 0, MMU_INST_FETCH, mmu_idx,
                                   false, &host, &full, ra);
 
     assert(!(flags & TLB_INVALID_MASK));
-    return full->extra.arm.guarded;
+    return full.extra.arm.guarded;
 #endif
 }
 
diff --git a/target/arm/tcg/mte_helper.c b/target/arm/tcg/mte_helper.c
index 9d2ba287ee..870b2875af 100644
--- a/target/arm/tcg/mte_helper.c
+++ b/target/arm/tcg/mte_helper.c
@@ -83,8 +83,7 @@ uint8_t *allocation_tag_mem_probe(CPUARMState *env, int ptr_mmu_idx,
                       TARGET_PAGE_BITS - LOG2_TAG_GRANULE - 1);
     return tags + index;
 #else
-    CPUTLBEntryFull *full;
-    MemTxAttrs attrs;
+    CPUTLBEntryFull full;
     int in_page, flags;
     hwaddr ptr_paddr, tag_paddr, xlat;
     MemoryRegion *mr;
@@ -110,7 +109,7 @@ uint8_t *allocation_tag_mem_probe(CPUARMState *env, int ptr_mmu_idx,
     assert(!(flags & TLB_INVALID_MASK));
 
     /* If the virtual page MemAttr != Tagged, access unchecked. */
-    if (full->extra.arm.pte_attrs != 0xf0) {
+    if (full.extra.arm.pte_attrs != 0xf0) {
         return NULL;
     }
 
@@ -129,9 +128,7 @@ uint8_t *allocation_tag_mem_probe(CPUARMState *env, int ptr_mmu_idx,
      * Remember these values across the second lookup below,
      * which may invalidate this pointer via tlb resize.
      */
-    ptr_paddr = full->phys_addr | (ptr & ~TARGET_PAGE_MASK);
-    attrs = full->attrs;
-    full = NULL;
+    ptr_paddr = full.phys_addr | (ptr & ~TARGET_PAGE_MASK);
 
     /*
      * The Normal memory access can extend to the next page.  E.g. a single
@@ -150,17 +147,17 @@ uint8_t *allocation_tag_mem_probe(CPUARMState *env, int ptr_mmu_idx,
     if (!probe && unlikely(flags & TLB_WATCHPOINT)) {
         int wp = ptr_access == MMU_DATA_LOAD ? BP_MEM_READ : BP_MEM_WRITE;
         assert(ra != 0);
-        cpu_check_watchpoint(env_cpu(env), ptr, ptr_size, attrs, wp, ra);
+        cpu_check_watchpoint(env_cpu(env), ptr, ptr_size, full.attrs, wp, ra);
     }
 
     /* Convert to the physical address in tag space.  */
     tag_paddr = ptr_paddr >> (LOG2_TAG_GRANULE + 1);
 
     /* Look up the address in tag space. */
-    tag_asi = attrs.secure ? ARMASIdx_TagS : ARMASIdx_TagNS;
+    tag_asi = full.attrs.secure ? ARMASIdx_TagS : ARMASIdx_TagNS;
     tag_as = cpu_get_address_space(env_cpu(env), tag_asi);
     mr = address_space_translate(tag_as, tag_paddr, &xlat, NULL,
-                                 tag_access == MMU_DATA_STORE, attrs);
+                                 tag_access == MMU_DATA_STORE, full.attrs);
 
     /*
      * Note that @mr will never be NULL.  If there is nothing in the address
diff --git a/target/arm/tcg/sve_helper.c b/target/arm/tcg/sve_helper.c
index f1ee0e060f..dad0d5e518 100644
--- a/target/arm/tcg/sve_helper.c
+++ b/target/arm/tcg/sve_helper.c
@@ -5357,7 +5357,7 @@ bool sve_probe_page(SVEHostPage *info, bool nofault, CPUARMState *env,
     flags = probe_access_flags(env, addr, 0, access_type, mmu_idx, nofault,
                                &info->host, retaddr);
 #else
-    CPUTLBEntryFull *full;
+    CPUTLBEntryFull full;
     flags = probe_access_full(env, addr, 0, access_type, mmu_idx, nofault,
                               &info->host, &full, retaddr);
 #endif
@@ -5373,8 +5373,8 @@ bool sve_probe_page(SVEHostPage *info, bool nofault, CPUARMState *env,
     /* Require both ANON and MTE; see allocation_tag_mem(). */
     info->tagged = (flags & PAGE_ANON) && (flags & PAGE_MTE);
 #else
-    info->attrs = full->attrs;
-    info->tagged = full->extra.arm.pte_attrs == 0xf0;
+    info->attrs = full.attrs;
+    info->tagged = full.extra.arm.pte_attrs == 0xf0;
 #endif
 
     /* Ensure that info->host[] is relative to addr, not addr + mem_off. */
-- 
2.43.0




* [PATCH v2 27/54] accel/tcg: Return CPUTLBEntryFull not pointer in probe_access_internal
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (25 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 26/54] accel/tcg: Return CPUTLBEntryFull not pointer in probe_access_full Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:13   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 28/54] accel/tcg: Introduce tlb_lookup Richard Henderson
                   ` (27 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

Return a copy of the structure, not a pointer.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 40 ++++++++++++++++++----------------------
 1 file changed, 18 insertions(+), 22 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 84e7e633e3..41b2f76cc9 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1364,7 +1364,7 @@ static void notdirty_write(CPUState *cpu, vaddr mem_vaddr, unsigned size,
 static int probe_access_internal(CPUState *cpu, vaddr addr,
                                  int fault_size, MMUAccessType access_type,
                                  int mmu_idx, bool nonfault,
-                                 void **phost, CPUTLBEntryFull **pfull,
+                                 void **phost, CPUTLBEntryFull *pfull,
                                  uintptr_t retaddr, bool check_mem_cbs)
 {
     uintptr_t index = tlb_index(cpu, mmu_idx, addr);
@@ -1379,7 +1379,7 @@ static int probe_access_internal(CPUState *cpu, vaddr addr,
                                 0, fault_size, nonfault, retaddr)) {
                 /* Non-faulting page table read failed.  */
                 *phost = NULL;
-                *pfull = NULL;
+                memset(pfull, 0, sizeof(*pfull));
                 return TLB_INVALID_MASK;
             }
 
@@ -1398,8 +1398,9 @@ static int probe_access_internal(CPUState *cpu, vaddr addr,
     }
     flags &= tlb_addr;
 
-    *pfull = full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
+    full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
     flags |= full->slow_flags[access_type];
+    *pfull = *full;
 
     /*
      * Fold all "mmio-like" bits, and required plugin callbacks, to TLB_MMIO.
@@ -1423,19 +1424,17 @@ int probe_access_full(CPUArchState *env, vaddr addr, int size,
                       bool nonfault, void **phost, CPUTLBEntryFull *pfull,
                       uintptr_t retaddr)
 {
-    CPUTLBEntryFull *full;
     int flags = probe_access_internal(env_cpu(env), addr, size, access_type,
-                                      mmu_idx, nonfault, phost, &full, retaddr,
+                                      mmu_idx, nonfault, phost, pfull, retaddr,
                                       true);
 
     /* Handle clean RAM pages.  */
     if (unlikely(flags & TLB_NOTDIRTY)) {
         int dirtysize = size == 0 ? 1 : size;
-        notdirty_write(env_cpu(env), addr, dirtysize, full, retaddr);
+        notdirty_write(env_cpu(env), addr, dirtysize, pfull, retaddr);
         flags &= ~TLB_NOTDIRTY;
     }
 
-    *pfull = *full;
     return flags;
 }
 
@@ -1444,25 +1443,22 @@ int probe_access_full_mmu(CPUArchState *env, vaddr addr, int size,
                           void **phost, CPUTLBEntryFull *pfull)
 {
     void *discard_phost;
-    CPUTLBEntryFull *full;
+    CPUTLBEntryFull discard_full;
 
     /* privately handle users that don't need full results */
     phost = phost ? phost : &discard_phost;
+    pfull = pfull ? pfull : &discard_full;
 
     int flags = probe_access_internal(env_cpu(env), addr, size, access_type,
-                                      mmu_idx, true, phost, &full, 0, false);
+                                      mmu_idx, true, phost, pfull, 0, false);
 
     /* Handle clean RAM pages.  */
     if (unlikely(flags & TLB_NOTDIRTY)) {
         int dirtysize = size == 0 ? 1 : size;
-        notdirty_write(env_cpu(env), addr, dirtysize, full, 0);
+        notdirty_write(env_cpu(env), addr, dirtysize, pfull, 0);
         flags &= ~TLB_NOTDIRTY;
     }
 
-    if (pfull) {
-        *pfull = *full;
-    }
-
     return flags;
 }
 
@@ -1470,7 +1466,7 @@ int probe_access_flags(CPUArchState *env, vaddr addr, int size,
                        MMUAccessType access_type, int mmu_idx,
                        bool nonfault, void **phost, uintptr_t retaddr)
 {
-    CPUTLBEntryFull *full;
+    CPUTLBEntryFull full;
     int flags;
 
     g_assert(-(addr | TARGET_PAGE_MASK) >= size);
@@ -1482,7 +1478,7 @@ int probe_access_flags(CPUArchState *env, vaddr addr, int size,
     /* Handle clean RAM pages. */
     if (unlikely(flags & TLB_NOTDIRTY)) {
         int dirtysize = size == 0 ? 1 : size;
-        notdirty_write(env_cpu(env), addr, dirtysize, full, retaddr);
+        notdirty_write(env_cpu(env), addr, dirtysize, &full, retaddr);
         flags &= ~TLB_NOTDIRTY;
     }
 
@@ -1492,7 +1488,7 @@ int probe_access_flags(CPUArchState *env, vaddr addr, int size,
 void *probe_access(CPUArchState *env, vaddr addr, int size,
                    MMUAccessType access_type, int mmu_idx, uintptr_t retaddr)
 {
-    CPUTLBEntryFull *full;
+    CPUTLBEntryFull full;
     void *host;
     int flags;
 
@@ -1513,12 +1509,12 @@ void *probe_access(CPUArchState *env, vaddr addr, int size,
             int wp_access = (access_type == MMU_DATA_STORE
                              ? BP_MEM_WRITE : BP_MEM_READ);
             cpu_check_watchpoint(env_cpu(env), addr, size,
-                                 full->attrs, wp_access, retaddr);
+                                 full.attrs, wp_access, retaddr);
         }
 
         /* Handle clean RAM pages.  */
         if (flags & TLB_NOTDIRTY) {
-            notdirty_write(env_cpu(env), addr, size, full, retaddr);
+            notdirty_write(env_cpu(env), addr, size, &full, retaddr);
         }
     }
 
@@ -1528,7 +1524,7 @@ void *probe_access(CPUArchState *env, vaddr addr, int size,
 void *tlb_vaddr_to_host(CPUArchState *env, abi_ptr addr,
                         MMUAccessType access_type, int mmu_idx)
 {
-    CPUTLBEntryFull *full;
+    CPUTLBEntryFull full;
     void *host;
     int flags;
 
@@ -1552,7 +1548,7 @@ void *tlb_vaddr_to_host(CPUArchState *env, abi_ptr addr,
 tb_page_addr_t get_page_addr_code_hostp(CPUArchState *env, vaddr addr,
                                         void **hostp)
 {
-    CPUTLBEntryFull *full;
+    CPUTLBEntryFull full;
     void *p;
 
     (void)probe_access_internal(env_cpu(env), addr, 1, MMU_INST_FETCH,
@@ -1562,7 +1558,7 @@ tb_page_addr_t get_page_addr_code_hostp(CPUArchState *env, vaddr addr,
         return -1;
     }
 
-    if (full->lg_page_size < TARGET_PAGE_BITS) {
+    if (full.lg_page_size < TARGET_PAGE_BITS) {
         return -1;
     }
 
-- 
2.43.0




* [PATCH v2 28/54] accel/tcg: Introduce tlb_lookup
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (26 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 27/54] accel/tcg: Return CPUTLBEntryFull not pointer in probe_access_internal Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:29   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 29/54] accel/tcg: Partially unify MMULookupPageData and TLBLookupOutput Richard Henderson
                   ` (26 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

Unify the three copies of the tlb lookup sequence (tlb_hit, then
tlbtree_hit, then tlb_fill_align) into a single tlb_lookup function.
Use input and output structures to avoid passing too many arguments.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 369 ++++++++++++++++++++++-----------------------
 1 file changed, 178 insertions(+), 191 deletions(-)
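
A caller's-eye sketch of the new interface, abridged from the patch
itself (only the fields shown here are used; error handling reduced
to the non-faulting probe case):

/* Abridged use of the structures introduced below. */
static int probe_flags_sketch(CPUState *cpu, vaddr addr, int size,
                              MMUAccessType type, int mmu_idx, uintptr_t ra)
{
    TLBLookupInput in = {
        .addr = addr,
        .ra = ra,
        .access_type = type,
        .size = size,
        .memop_probe = -1,     /* a negative value encodes a probe */
        .mmu_idx = mmu_idx,
    };
    TLBLookupOutput out;

    if (!tlb_lookup(cpu, &out, &in)) {
        /* Only reachable for probes: the page table read failed. */
        return TLB_INVALID_MASK;
    }
    /* out.flags, out.haddr and out.full are now valid; out.did_tlb_fill
     * records whether tlb_fill_align ran. */
    return out.flags;
}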

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 41b2f76cc9..a33bebf55a 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1271,6 +1271,118 @@ static inline void cpu_unaligned_access(CPUState *cpu, vaddr addr,
                                           mmu_idx, retaddr);
 }
 
+typedef struct TLBLookupInput {
+    vaddr addr;
+    uintptr_t ra;
+    int memop_probe           : 16;
+    unsigned int size         : 8;
+    MMUAccessType access_type : 4;
+    unsigned int mmu_idx      : 4;
+} TLBLookupInput;
+
+typedef struct TLBLookupOutput {
+    CPUTLBEntryFull full;
+    void *haddr;
+    int flags;
+    bool did_tlb_fill;
+} TLBLookupOutput;
+
+static bool tlb_lookup(CPUState *cpu, TLBLookupOutput *o,
+                       const TLBLookupInput *i)
+{
+    CPUTLBDesc *desc = &cpu->neg.tlb.d[i->mmu_idx];
+    CPUTLBDescFast *fast = &cpu->neg.tlb.f[i->mmu_idx];
+    vaddr addr = i->addr;
+    MMUAccessType access_type = i->access_type;
+    CPUTLBEntryFull *full;
+    CPUTLBEntryTree *node;
+    CPUTLBEntry *entry;
+    uint64_t cmp;
+    bool probe = i->memop_probe < 0;
+    MemOp memop = probe ? 0 : i->memop_probe;
+    int flags = TLB_FLAGS_MASK & ~TLB_FORCE_SLOW;
+
+    assert_cpu_is_self(cpu);
+    o->did_tlb_fill = false;
+
+    /* Primary lookup in the fast tlb. */
+    entry = tlbfast_entry(fast, addr);
+    full = &desc->fulltlb[tlbfast_index(fast, addr)];
+    cmp = tlb_read_idx(entry, access_type);
+    if (tlb_hit(cmp, addr)) {
+        goto found;
+    }
+
+    /* Secondary lookup in the IntervalTree. */
+    node = tlbtree_lookup_addr(desc, addr);
+    if (node) {
+        cmp = tlb_read_idx(&node->copy, access_type);
+        if (tlb_hit(cmp, addr)) {
+            /* Install the cached entry. */
+            qemu_spin_lock(&cpu->neg.tlb.c.lock);
+            copy_tlb_helper_locked(entry, &node->copy);
+            qemu_spin_unlock(&cpu->neg.tlb.c.lock);
+            *full = node->full;
+            goto found;
+        }
+    }
+
+    /* Finally, query the target hook. */
+    if (!tlb_fill_align(cpu, addr, access_type, i->mmu_idx,
+                        memop, i->size, probe, i->ra)) {
+        tcg_debug_assert(probe);
+        return false;
+    }
+
+    o->did_tlb_fill = true;
+
+    entry = tlbfast_entry(fast, addr);
+    full = &desc->fulltlb[tlbfast_index(fast, addr)];
+    cmp = tlb_read_idx(entry, access_type);
+    /*
+     * With PAGE_WRITE_INV, we set TLB_INVALID_MASK immediately,
+     * to force the next access through tlb_fill_align.  We've just
+     * called tlb_fill_align, so we know that this entry *is* valid.
+     */
+    flags &= ~TLB_INVALID_MASK;
+    goto done;
+
+ found:
+    /* Alignment has not been checked by tlb_fill_align. */
+    {
+        int a_bits = memop_alignment_bits(memop);
+
+        /*
+         * The TLB_CHECK_ALIGNED check differs from the normal alignment
+         * check, in that this is based on the atomicity of the operation.
+         * The intended use case is the ARM memory type field of each PTE,
+         * where access to pages with Device memory type require alignment.
+         */
+        if (unlikely(flags & TLB_CHECK_ALIGNED)) {
+            int at_bits = memop_atomicity_bits(memop);
+            a_bits = MAX(a_bits, at_bits);
+        }
+        if (unlikely(addr & ((1 << a_bits) - 1))) {
+            cpu_unaligned_access(cpu, addr, access_type, i->mmu_idx, i->ra);
+        }
+    }
+
+ done:
+    flags &= cmp;
+    flags |= full->slow_flags[access_type];
+    o->flags = flags;
+    o->full = *full;
+    o->haddr = (void *)((uintptr_t)addr + entry->addend);
+    return true;
+}
+
+static void tlb_lookup_nofail(CPUState *cpu, TLBLookupOutput *o,
+                              const TLBLookupInput *i)
+{
+    bool ok = tlb_lookup(cpu, o, i);
+    tcg_debug_assert(ok);
+}
+
 static MemoryRegionSection *
 io_prepare(hwaddr *out_offset, CPUState *cpu, hwaddr xlat,
            MemTxAttrs attrs, vaddr addr, uintptr_t retaddr)
@@ -1303,40 +1415,6 @@ static void io_failed(CPUState *cpu, CPUTLBEntryFull *full, vaddr addr,
     }
 }
 
-/*
- * Return true if ADDR is present in the interval tree,
- * and has been copied back to the main tlb.
- */
-static bool tlbtree_hit(CPUState *cpu, int mmu_idx,
-                        MMUAccessType access_type, vaddr addr)
-{
-    CPUTLBDesc *desc = &cpu->neg.tlb.d[mmu_idx];
-    CPUTLBDescFast *fast = &cpu->neg.tlb.f[mmu_idx];
-    CPUTLBEntryTree *node;
-    size_t index;
-
-    assert_cpu_is_self(cpu);
-    node = tlbtree_lookup_addr(desc, addr);
-    if (!node) {
-        /* There is no cached mapping for this page. */
-        return false;
-    }
-
-    if (!tlb_hit(tlb_read_idx(&node->copy, access_type), addr)) {
-        /* This access is not permitted. */
-        return false;
-    }
-
-    /* Install the cached entry. */
-    index = tlbfast_index(fast, addr);
-    qemu_spin_lock(&cpu->neg.tlb.c.lock);
-    copy_tlb_helper_locked(&fast->table[index], &node->copy);
-    qemu_spin_unlock(&cpu->neg.tlb.c.lock);
-
-    desc->fulltlb[index] = node->full;
-    return true;
-}
-
 static void notdirty_write(CPUState *cpu, vaddr mem_vaddr, unsigned size,
                            CPUTLBEntryFull *full, uintptr_t retaddr)
 {
@@ -1367,40 +1445,26 @@ static int probe_access_internal(CPUState *cpu, vaddr addr,
                                  void **phost, CPUTLBEntryFull *pfull,
                                  uintptr_t retaddr, bool check_mem_cbs)
 {
-    uintptr_t index = tlb_index(cpu, mmu_idx, addr);
-    CPUTLBEntry *entry = tlb_entry(cpu, mmu_idx, addr);
-    uint64_t tlb_addr = tlb_read_idx(entry, access_type);
-    int flags = TLB_FLAGS_MASK & ~TLB_FORCE_SLOW;
-    CPUTLBEntryFull *full;
+    TLBLookupInput i = {
+        .addr = addr,
+        .ra = retaddr,
+        .access_type = access_type,
+        .size = fault_size,
+        .memop_probe = nonfault ? -1 : 0,
+        .mmu_idx = mmu_idx,
+    };
+    TLBLookupOutput o;
+    int flags;
 
-    if (!tlb_hit(tlb_addr, addr)) {
-        if (!tlbtree_hit(cpu, mmu_idx, access_type, addr)) {
-            if (!tlb_fill_align(cpu, addr, access_type, mmu_idx,
-                                0, fault_size, nonfault, retaddr)) {
-                /* Non-faulting page table read failed.  */
-                *phost = NULL;
-                memset(pfull, 0, sizeof(*pfull));
-                return TLB_INVALID_MASK;
-            }
-
-            /* TLB resize via tlb_fill_align may have moved the entry.  */
-            index = tlb_index(cpu, mmu_idx, addr);
-            entry = tlb_entry(cpu, mmu_idx, addr);
-
-            /*
-             * With PAGE_WRITE_INV, we set TLB_INVALID_MASK immediately,
-             * to force the next access through tlb_fill_align.  We've just
-             * called tlb_fill_align, so we know that this entry *is* valid.
-             */
-            flags &= ~TLB_INVALID_MASK;
-        }
-        tlb_addr = tlb_read_idx(entry, access_type);
+    if (!tlb_lookup(cpu, &o, &i)) {
+        /* Non-faulting page table read failed.  */
+        *phost = NULL;
+        memset(pfull, 0, sizeof(*pfull));
+        return TLB_INVALID_MASK;
     }
-    flags &= tlb_addr;
 
-    full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
-    flags |= full->slow_flags[access_type];
-    *pfull = *full;
+    *pfull = o.full;
+    flags = o.flags;
 
     /*
      * Fold all "mmio-like" bits, and required plugin callbacks, to TLB_MMIO.
@@ -1415,7 +1479,7 @@ static int probe_access_internal(CPUState *cpu, vaddr addr,
     }
 
     /* Everything else is RAM. */
-    *phost = (void *)((uintptr_t)addr + entry->addend);
+    *phost = o.haddr;
     return flags;
 }
 
@@ -1625,6 +1689,7 @@ typedef struct MMULookupPageData {
     vaddr addr;
     int flags;
     int size;
+    TLBLookupOutput o;
 } MMULookupPageData;
 
 typedef struct MMULookupLocals {
@@ -1644,67 +1709,25 @@ typedef struct MMULookupLocals {
  *
  * Resolve the translation for the one page at @data.addr, filling in
  * the rest of @data with the results.  If the translation fails,
- * tlb_fill_align will longjmp out.  Return true if the softmmu tlb for
- * @mmu_idx may have resized.
+ * tlb_fill_align will longjmp out.
  */
-static bool mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
+static void mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
                         int mmu_idx, MMUAccessType access_type, uintptr_t ra)
 {
-    vaddr addr = data->addr;
-    uintptr_t index = tlb_index(cpu, mmu_idx, addr);
-    CPUTLBEntry *entry = tlb_entry(cpu, mmu_idx, addr);
-    uint64_t tlb_addr = tlb_read_idx(entry, access_type);
-    bool maybe_resized = false;
-    CPUTLBEntryFull *full;
-    int flags = TLB_FLAGS_MASK & ~TLB_FORCE_SLOW;
+    TLBLookupInput i = {
+        .addr = data->addr,
+        .ra = ra,
+        .access_type = access_type,
+        .memop_probe = memop,
+        .size = data->size,
+        .mmu_idx = mmu_idx,
+    };
 
-    /* If the TLB entry is for a different page, reload and try again.  */
-    if (!tlb_hit(tlb_addr, addr)) {
-        if (!tlbtree_hit(cpu, mmu_idx, access_type, addr)) {
-            tlb_fill_align(cpu, addr, access_type, mmu_idx,
-                           memop, data->size, false, ra);
-            maybe_resized = true;
-            index = tlb_index(cpu, mmu_idx, addr);
-            entry = tlb_entry(cpu, mmu_idx, addr);
-            /*
-             * With PAGE_WRITE_INV, we set TLB_INVALID_MASK immediately,
-             * to force the next access through tlb_fill.  We've just
-             * called tlb_fill, so we know that this entry *is* valid.
-             */
-            flags &= ~TLB_INVALID_MASK;
-        }
-        tlb_addr = tlb_read_idx(entry, access_type);
-    }
+    tlb_lookup_nofail(cpu, &data->o, &i);
 
-    full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
-    flags = tlb_addr & (TLB_FLAGS_MASK & ~TLB_FORCE_SLOW);
-    flags |= full->slow_flags[access_type];
-
-    if (likely(!maybe_resized)) {
-        /* Alignment has not been checked by tlb_fill_align. */
-        int a_bits = memop_alignment_bits(memop);
-
-        /*
-         * This alignment check differs from the one above, in that this is
-         * based on the atomicity of the operation. The intended use case is
-         * the ARM memory type field of each PTE, where access to pages with
-         * Device memory type require alignment.
-         */
-        if (unlikely(flags & TLB_CHECK_ALIGNED)) {
-            int at_bits = memop_atomicity_bits(memop);
-            a_bits = MAX(a_bits, at_bits);
-        }
-        if (unlikely(addr & ((1 << a_bits) - 1))) {
-            cpu_unaligned_access(cpu, addr, access_type, mmu_idx, ra);
-        }
-    }
-
-    data->full = full;
-    data->flags = flags;
-    /* Compute haddr speculatively; depending on flags it might be invalid. */
-    data->haddr = (void *)((uintptr_t)addr + entry->addend);
-
-    return maybe_resized;
+    data->full = &data->o.full;
+    data->flags = data->o.flags;
+    data->haddr = data->o.haddr;
 }
 
 /**
@@ -1785,15 +1808,9 @@ static bool mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
         l->page[1].size = l->page[0].size - size0;
         l->page[0].size = size0;
 
-        /*
-         * Lookup both pages, recognizing exceptions from either.  If the
-         * second lookup potentially resized, refresh first CPUTLBEntryFull.
-         */
+        /* Lookup both pages, recognizing exceptions from either. */
         mmu_lookup1(cpu, &l->page[0], l->memop, l->mmu_idx, type, ra);
-        if (mmu_lookup1(cpu, &l->page[1], 0, l->mmu_idx, type, ra)) {
-            uintptr_t index = tlb_index(cpu, l->mmu_idx, addr);
-            l->page[0].full = &cpu->neg.tlb.d[l->mmu_idx].fulltlb[index];
-        }
+        mmu_lookup1(cpu, &l->page[1], 0, l->mmu_idx, type, ra);
 
         flags = l->page[0].flags | l->page[1].flags;
         if (unlikely(flags & (TLB_WATCHPOINT | TLB_NOTDIRTY))) {
@@ -1819,49 +1836,26 @@ static bool mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
 static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
                                int size, uintptr_t retaddr)
 {
-    uintptr_t mmu_idx = get_mmuidx(oi);
-    MemOp mop = get_memop(oi);
-    uintptr_t index;
-    CPUTLBEntry *tlbe;
-    void *hostaddr;
-    CPUTLBEntryFull *full;
-    bool did_tlb_fill = false;
-    int flags;
+    TLBLookupInput i = {
+        .addr = addr,
+        .ra = retaddr - GETPC_ADJ,
+        .access_type = MMU_DATA_STORE,
+        .memop_probe = get_memop(oi),
+        .mmu_idx = get_mmuidx(oi),
+    };
+    TLBLookupOutput o;
+    int flags, wp_flags;
 
-    tcg_debug_assert(mmu_idx < NB_MMU_MODES);
-
-    /* Adjust the given return address.  */
-    retaddr -= GETPC_ADJ;
-
-    index = tlb_index(cpu, mmu_idx, addr);
-    tlbe = tlb_entry(cpu, mmu_idx, addr);
-
-    /* Check TLB entry and enforce page permissions.  */
-    flags = TLB_FLAGS_MASK;
-    if (!tlb_hit(tlb_addr_write(tlbe), addr)) {
-        if (!tlbtree_hit(cpu, mmu_idx, MMU_DATA_STORE, addr)) {
-            tlb_fill_align(cpu, addr, MMU_DATA_STORE, mmu_idx,
-                           mop, size, false, retaddr);
-            did_tlb_fill = true;
-            index = tlb_index(cpu, mmu_idx, addr);
-            tlbe = tlb_entry(cpu, mmu_idx, addr);
-            /*
-             * With PAGE_WRITE_INV, we set TLB_INVALID_MASK immediately,
-             * to force the next access through tlb_fill.  We've just
-             * called tlb_fill, so we know that this entry *is* valid.
-             */
-            flags &= ~TLB_INVALID_MASK;
-        }
-    }
-    full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
+    i.size = memop_size(i.memop_probe);
+    tlb_lookup_nofail(cpu, &o, &i);
 
     /*
      * Let the guest notice RMW on a write-only page.
      * We have just verified that the page is writable.
      */
-    if (unlikely(!(full->prot & PAGE_READ))) {
-        tlb_fill_align(cpu, addr, MMU_DATA_LOAD, mmu_idx,
-                       0, size, false, retaddr);
+    if (unlikely(!(o.full.prot & PAGE_READ))) {
+        tlb_fill_align(cpu, addr, MMU_DATA_LOAD, i.mmu_idx,
+                       0, i.size, false, i.ra);
         /*
          * Since we don't support reads and writes to different
          * addresses, and we do have the proper page loaded for
@@ -1871,12 +1865,13 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
     }
 
     /* Enforce guest required alignment, if not handled by tlb_fill_align. */
-    if (!did_tlb_fill && (addr & ((1 << memop_alignment_bits(mop)) - 1))) {
-        cpu_unaligned_access(cpu, addr, MMU_DATA_STORE, mmu_idx, retaddr);
+    if (!o.did_tlb_fill
+        && (addr & ((1 << memop_alignment_bits(i.memop_probe)) - 1))) {
+        cpu_unaligned_access(cpu, addr, MMU_DATA_STORE, i.mmu_idx, i.ra);
     }
 
     /* Enforce qemu required alignment.  */
-    if (unlikely(addr & (size - 1))) {
+    if (unlikely(addr & (i.size - 1))) {
         /*
          * We get here if guest alignment was not requested, or was not
          * enforced by cpu_unaligned_access or tlb_fill_align above.
@@ -1886,41 +1881,33 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
         goto stop_the_world;
     }
 
-    /* Collect tlb flags for read and write. */
-    flags &= tlbe->addr_read | tlb_addr_write(tlbe);
-
     /* Notice an IO access or a needs-MMU-lookup access */
+    flags = o.flags;
     if (unlikely(flags & (TLB_MMIO | TLB_DISCARD_WRITE))) {
         /* There's really nothing that can be done to
            support this apart from stop-the-world.  */
         goto stop_the_world;
     }
 
-    hostaddr = (void *)((uintptr_t)addr + tlbe->addend);
-
     if (unlikely(flags & TLB_NOTDIRTY)) {
-        notdirty_write(cpu, addr, size, full, retaddr);
+        notdirty_write(cpu, addr, i.size, &o.full, i.ra);
     }
 
-    if (unlikely(flags & TLB_FORCE_SLOW)) {
-        int wp_flags = 0;
-
-        if (full->slow_flags[MMU_DATA_STORE] & TLB_WATCHPOINT) {
-            wp_flags |= BP_MEM_WRITE;
-        }
-        if (full->slow_flags[MMU_DATA_LOAD] & TLB_WATCHPOINT) {
-            wp_flags |= BP_MEM_READ;
-        }
-        if (wp_flags) {
-            cpu_check_watchpoint(cpu, addr, size,
-                                 full->attrs, wp_flags, retaddr);
-        }
+    wp_flags = 0;
+    if (flags & TLB_WATCHPOINT) {
+        wp_flags |= BP_MEM_WRITE;
+    }
+    if (o.full.slow_flags[MMU_DATA_LOAD] & TLB_WATCHPOINT) {
+        wp_flags |= BP_MEM_READ;
+    }
+    if (unlikely(wp_flags)) {
+        cpu_check_watchpoint(cpu, addr, i.size, o.full.attrs, wp_flags, i.ra);
     }
 
-    return hostaddr;
+    return o.haddr;
 
  stop_the_world:
-    cpu_loop_exit_atomic(cpu, retaddr);
+    cpu_loop_exit_atomic(cpu, i.ra);
 }
 
 /*
-- 
2.43.0




* [PATCH v2 29/54] accel/tcg: Partially unify MMULookupPageData and TLBLookupOutput
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (27 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 28/54] accel/tcg: Introduce tlb_lookup Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:29   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 30/54] accel/tcg: Merge mmu_lookup1 into mmu_lookup Richard Henderson
                   ` (25 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 151 ++++++++++++++++++++++-----------------------
 1 file changed, 74 insertions(+), 77 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index a33bebf55a..8f459be5a8 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1684,10 +1684,7 @@ bool tlb_plugin_lookup(CPUState *cpu, vaddr addr, int mmu_idx,
  */
 
 typedef struct MMULookupPageData {
-    CPUTLBEntryFull *full;
-    void *haddr;
     vaddr addr;
-    int flags;
     int size;
     TLBLookupOutput o;
 } MMULookupPageData;
@@ -1724,10 +1721,6 @@ static void mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
     };
 
     tlb_lookup_nofail(cpu, &data->o, &i);
-
-    data->full = &data->o.full;
-    data->flags = data->o.flags;
-    data->haddr = data->o.haddr;
 }
 
 /**
@@ -1743,24 +1736,22 @@ static void mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
 static void mmu_watch_or_dirty(CPUState *cpu, MMULookupPageData *data,
                                MMUAccessType access_type, uintptr_t ra)
 {
-    CPUTLBEntryFull *full = data->full;
-    vaddr addr = data->addr;
-    int flags = data->flags;
-    int size = data->size;
+    int flags = data->o.flags;
 
     /* On watchpoint hit, this will longjmp out.  */
     if (flags & TLB_WATCHPOINT) {
         int wp = access_type == MMU_DATA_STORE ? BP_MEM_WRITE : BP_MEM_READ;
-        cpu_check_watchpoint(cpu, addr, size, full->attrs, wp, ra);
+        cpu_check_watchpoint(cpu, data->addr, data->size,
+                             data->o.full.attrs, wp, ra);
         flags &= ~TLB_WATCHPOINT;
     }
 
     /* Note that notdirty is only set for writes. */
     if (flags & TLB_NOTDIRTY) {
-        notdirty_write(cpu, addr, size, full, ra);
+        notdirty_write(cpu, data->addr, data->size, &data->o.full, ra);
         flags &= ~TLB_NOTDIRTY;
     }
-    data->flags = flags;
+    data->o.flags = flags;
 }
 
 /**
@@ -1795,7 +1786,7 @@ static bool mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
     if (likely(!crosspage)) {
         mmu_lookup1(cpu, &l->page[0], l->memop, l->mmu_idx, type, ra);
 
-        flags = l->page[0].flags;
+        flags = l->page[0].o.flags;
         if (unlikely(flags & (TLB_WATCHPOINT | TLB_NOTDIRTY))) {
             mmu_watch_or_dirty(cpu, &l->page[0], type, ra);
         }
@@ -1812,7 +1803,7 @@ static bool mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
         mmu_lookup1(cpu, &l->page[0], l->memop, l->mmu_idx, type, ra);
         mmu_lookup1(cpu, &l->page[1], 0, l->mmu_idx, type, ra);
 
-        flags = l->page[0].flags | l->page[1].flags;
+        flags = l->page[0].o.flags | l->page[1].o.flags;
         if (unlikely(flags & (TLB_WATCHPOINT | TLB_NOTDIRTY))) {
             mmu_watch_or_dirty(cpu, &l->page[0], type, ra);
             mmu_watch_or_dirty(cpu, &l->page[1], type, ra);
@@ -2029,7 +2020,7 @@ static Int128 do_ld16_mmio_beN(CPUState *cpu, CPUTLBEntryFull *full,
  */
 static uint64_t do_ld_bytes_beN(MMULookupPageData *p, uint64_t ret_be)
 {
-    uint8_t *haddr = p->haddr;
+    uint8_t *haddr = p->o.haddr;
     int i, size = p->size;
 
     for (i = 0; i < size; i++) {
@@ -2047,7 +2038,7 @@ static uint64_t do_ld_bytes_beN(MMULookupPageData *p, uint64_t ret_be)
  */
 static uint64_t do_ld_parts_beN(MMULookupPageData *p, uint64_t ret_be)
 {
-    void *haddr = p->haddr;
+    void *haddr = p->o.haddr;
     int size = p->size;
 
     do {
@@ -2097,7 +2088,7 @@ static uint64_t do_ld_parts_beN(MMULookupPageData *p, uint64_t ret_be)
 static uint64_t do_ld_whole_be4(MMULookupPageData *p, uint64_t ret_be)
 {
     int o = p->addr & 3;
-    uint32_t x = load_atomic4(p->haddr - o);
+    uint32_t x = load_atomic4(p->o.haddr - o);
 
     x = cpu_to_be32(x);
     x <<= o * 8;
@@ -2117,7 +2108,7 @@ static uint64_t do_ld_whole_be8(CPUState *cpu, uintptr_t ra,
                                 MMULookupPageData *p, uint64_t ret_be)
 {
     int o = p->addr & 7;
-    uint64_t x = load_atomic8_or_exit(cpu, ra, p->haddr - o);
+    uint64_t x = load_atomic8_or_exit(cpu, ra, p->o.haddr - o);
 
     x = cpu_to_be64(x);
     x <<= o * 8;
@@ -2137,7 +2128,7 @@ static Int128 do_ld_whole_be16(CPUState *cpu, uintptr_t ra,
                                MMULookupPageData *p, uint64_t ret_be)
 {
     int o = p->addr & 15;
-    Int128 x, y = load_atomic16_or_exit(cpu, ra, p->haddr - o);
+    Int128 x, y = load_atomic16_or_exit(cpu, ra, p->o.haddr - o);
     int size = p->size;
 
     if (!HOST_BIG_ENDIAN) {
@@ -2160,8 +2151,8 @@ static uint64_t do_ld_beN(CPUState *cpu, MMULookupPageData *p,
     MemOp atom;
     unsigned tmp, half_size;
 
-    if (unlikely(p->flags & TLB_MMIO)) {
-        return do_ld_mmio_beN(cpu, p->full, ret_be, p->addr, p->size,
+    if (unlikely(p->o.flags & TLB_MMIO)) {
+        return do_ld_mmio_beN(cpu, &p->o.full, ret_be, p->addr, p->size,
                               mmu_idx, type, ra);
     }
 
@@ -2210,8 +2201,9 @@ static Int128 do_ld16_beN(CPUState *cpu, MMULookupPageData *p,
     uint64_t b;
     MemOp atom;
 
-    if (unlikely(p->flags & TLB_MMIO)) {
-        return do_ld16_mmio_beN(cpu, p->full, a, p->addr, size, mmu_idx, ra);
+    if (unlikely(p->o.flags & TLB_MMIO)) {
+        return do_ld16_mmio_beN(cpu, &p->o.full, a, p->addr,
+                                size, mmu_idx, ra);
     }
 
     /*
@@ -2223,7 +2215,7 @@ static Int128 do_ld16_beN(CPUState *cpu, MMULookupPageData *p,
     case MO_ATOM_SUBALIGN:
         p->size = size - 8;
         a = do_ld_parts_beN(p, a);
-        p->haddr += size - 8;
+        p->o.haddr += size - 8;
         p->size = 8;
         b = do_ld_parts_beN(p, 0);
         break;
@@ -2242,7 +2234,7 @@ static Int128 do_ld16_beN(CPUState *cpu, MMULookupPageData *p,
     case MO_ATOM_NONE:
         p->size = size - 8;
         a = do_ld_bytes_beN(p, a);
-        b = ldq_be_p(p->haddr + size - 8);
+        b = ldq_be_p(p->o.haddr + size - 8);
         break;
 
     default:
@@ -2255,10 +2247,11 @@ static Int128 do_ld16_beN(CPUState *cpu, MMULookupPageData *p,
 static uint8_t do_ld_1(CPUState *cpu, MMULookupPageData *p, int mmu_idx,
                        MMUAccessType type, uintptr_t ra)
 {
-    if (unlikely(p->flags & TLB_MMIO)) {
-        return do_ld_mmio_beN(cpu, p->full, 0, p->addr, 1, mmu_idx, type, ra);
+    if (unlikely(p->o.flags & TLB_MMIO)) {
+        return do_ld_mmio_beN(cpu, &p->o.full, 0, p->addr, 1,
+                              mmu_idx, type, ra);
     } else {
-        return *(uint8_t *)p->haddr;
+        return *(uint8_t *)p->o.haddr;
     }
 }
 
@@ -2267,14 +2260,15 @@ static uint16_t do_ld_2(CPUState *cpu, MMULookupPageData *p, int mmu_idx,
 {
     uint16_t ret;
 
-    if (unlikely(p->flags & TLB_MMIO)) {
-        ret = do_ld_mmio_beN(cpu, p->full, 0, p->addr, 2, mmu_idx, type, ra);
+    if (unlikely(p->o.flags & TLB_MMIO)) {
+        ret = do_ld_mmio_beN(cpu, &p->o.full, 0, p->addr, 2,
+                             mmu_idx, type, ra);
         if ((memop & MO_BSWAP) == MO_LE) {
             ret = bswap16(ret);
         }
     } else {
         /* Perform the load host endian, then swap if necessary. */
-        ret = load_atom_2(cpu, ra, p->haddr, memop);
+        ret = load_atom_2(cpu, ra, p->o.haddr, memop);
         if (memop & MO_BSWAP) {
             ret = bswap16(ret);
         }
@@ -2287,14 +2281,15 @@ static uint32_t do_ld_4(CPUState *cpu, MMULookupPageData *p, int mmu_idx,
 {
     uint32_t ret;
 
-    if (unlikely(p->flags & TLB_MMIO)) {
-        ret = do_ld_mmio_beN(cpu, p->full, 0, p->addr, 4, mmu_idx, type, ra);
+    if (unlikely(p->o.flags & TLB_MMIO)) {
+        ret = do_ld_mmio_beN(cpu, &p->o.full, 0, p->addr, 4,
+                             mmu_idx, type, ra);
         if ((memop & MO_BSWAP) == MO_LE) {
             ret = bswap32(ret);
         }
     } else {
         /* Perform the load host endian. */
-        ret = load_atom_4(cpu, ra, p->haddr, memop);
+        ret = load_atom_4(cpu, ra, p->o.haddr, memop);
         if (memop & MO_BSWAP) {
             ret = bswap32(ret);
         }
@@ -2307,14 +2302,15 @@ static uint64_t do_ld_8(CPUState *cpu, MMULookupPageData *p, int mmu_idx,
 {
     uint64_t ret;
 
-    if (unlikely(p->flags & TLB_MMIO)) {
-        ret = do_ld_mmio_beN(cpu, p->full, 0, p->addr, 8, mmu_idx, type, ra);
+    if (unlikely(p->o.flags & TLB_MMIO)) {
+        ret = do_ld_mmio_beN(cpu, &p->o.full, 0, p->addr, 8,
+                             mmu_idx, type, ra);
         if ((memop & MO_BSWAP) == MO_LE) {
             ret = bswap64(ret);
         }
     } else {
         /* Perform the load host endian. */
-        ret = load_atom_8(cpu, ra, p->haddr, memop);
+        ret = load_atom_8(cpu, ra, p->o.haddr, memop);
         if (memop & MO_BSWAP) {
             ret = bswap64(ret);
         }
@@ -2414,15 +2410,15 @@ static Int128 do_ld16_mmu(CPUState *cpu, vaddr addr,
     cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
     crosspage = mmu_lookup(cpu, addr, oi, ra, MMU_DATA_LOAD, &l);
     if (likely(!crosspage)) {
-        if (unlikely(l.page[0].flags & TLB_MMIO)) {
-            ret = do_ld16_mmio_beN(cpu, l.page[0].full, 0, addr, 16,
+        if (unlikely(l.page[0].o.flags & TLB_MMIO)) {
+            ret = do_ld16_mmio_beN(cpu, &l.page[0].o.full, 0, addr, 16,
                                    l.mmu_idx, ra);
             if ((l.memop & MO_BSWAP) == MO_LE) {
                 ret = bswap128(ret);
             }
         } else {
             /* Perform the load host endian. */
-            ret = load_atom_16(cpu, ra, l.page[0].haddr, l.memop);
+            ret = load_atom_16(cpu, ra, l.page[0].o.haddr, l.memop);
             if (l.memop & MO_BSWAP) {
                 ret = bswap128(ret);
             }
@@ -2568,10 +2564,10 @@ static uint64_t do_st_leN(CPUState *cpu, MMULookupPageData *p,
     MemOp atom;
     unsigned tmp, half_size;
 
-    if (unlikely(p->flags & TLB_MMIO)) {
-        return do_st_mmio_leN(cpu, p->full, val_le, p->addr,
+    if (unlikely(p->o.flags & TLB_MMIO)) {
+        return do_st_mmio_leN(cpu, &p->o.full, val_le, p->addr,
                               p->size, mmu_idx, ra);
-    } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) {
+    } else if (unlikely(p->o.flags & TLB_DISCARD_WRITE)) {
         return val_le >> (p->size * 8);
     }
 
@@ -2582,7 +2578,7 @@ static uint64_t do_st_leN(CPUState *cpu, MMULookupPageData *p,
     atom = mop & MO_ATOM_MASK;
     switch (atom) {
     case MO_ATOM_SUBALIGN:
-        return store_parts_leN(p->haddr, p->size, val_le);
+        return store_parts_leN(p->o.haddr, p->size, val_le);
 
     case MO_ATOM_IFALIGN_PAIR:
     case MO_ATOM_WITHIN16_PAIR:
@@ -2593,9 +2589,9 @@ static uint64_t do_st_leN(CPUState *cpu, MMULookupPageData *p,
             ? p->size == half_size
             : p->size >= half_size) {
             if (!HAVE_al8_fast && p->size <= 4) {
-                return store_whole_le4(p->haddr, p->size, val_le);
+                return store_whole_le4(p->o.haddr, p->size, val_le);
             } else if (HAVE_al8) {
-                return store_whole_le8(p->haddr, p->size, val_le);
+                return store_whole_le8(p->o.haddr, p->size, val_le);
             } else {
                 cpu_loop_exit_atomic(cpu, ra);
             }
@@ -2605,7 +2601,7 @@ static uint64_t do_st_leN(CPUState *cpu, MMULookupPageData *p,
     case MO_ATOM_IFALIGN:
     case MO_ATOM_WITHIN16:
     case MO_ATOM_NONE:
-        return store_bytes_leN(p->haddr, p->size, val_le);
+        return store_bytes_leN(p->o.haddr, p->size, val_le);
 
     default:
         g_assert_not_reached();
@@ -2622,10 +2618,10 @@ static uint64_t do_st16_leN(CPUState *cpu, MMULookupPageData *p,
     int size = p->size;
     MemOp atom;
 
-    if (unlikely(p->flags & TLB_MMIO)) {
-        return do_st16_mmio_leN(cpu, p->full, val_le, p->addr,
+    if (unlikely(p->o.flags & TLB_MMIO)) {
+        return do_st16_mmio_leN(cpu, &p->o.full, val_le, p->addr,
                                 size, mmu_idx, ra);
-    } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) {
+    } else if (unlikely(p->o.flags & TLB_DISCARD_WRITE)) {
         return int128_gethi(val_le) >> ((size - 8) * 8);
     }
 
@@ -2636,8 +2632,8 @@ static uint64_t do_st16_leN(CPUState *cpu, MMULookupPageData *p,
     atom = mop & MO_ATOM_MASK;
     switch (atom) {
     case MO_ATOM_SUBALIGN:
-        store_parts_leN(p->haddr, 8, int128_getlo(val_le));
-        return store_parts_leN(p->haddr + 8, p->size - 8,
+        store_parts_leN(p->o.haddr, 8, int128_getlo(val_le));
+        return store_parts_leN(p->o.haddr + 8, p->size - 8,
                                int128_gethi(val_le));
 
     case MO_ATOM_WITHIN16_PAIR:
@@ -2645,7 +2641,7 @@ static uint64_t do_st16_leN(CPUState *cpu, MMULookupPageData *p,
         if (!HAVE_CMPXCHG128) {
             cpu_loop_exit_atomic(cpu, ra);
         }
-        return store_whole_le16(p->haddr, p->size, val_le);
+        return store_whole_le16(p->o.haddr, p->size, val_le);
 
     case MO_ATOM_IFALIGN_PAIR:
         /*
@@ -2655,8 +2651,8 @@ static uint64_t do_st16_leN(CPUState *cpu, MMULookupPageData *p,
     case MO_ATOM_IFALIGN:
     case MO_ATOM_WITHIN16:
     case MO_ATOM_NONE:
-        stq_le_p(p->haddr, int128_getlo(val_le));
-        return store_bytes_leN(p->haddr + 8, p->size - 8,
+        stq_le_p(p->o.haddr, int128_getlo(val_le));
+        return store_bytes_leN(p->o.haddr + 8, p->size - 8,
                                int128_gethi(val_le));
 
     default:
@@ -2667,69 +2663,69 @@ static uint64_t do_st16_leN(CPUState *cpu, MMULookupPageData *p,
 static void do_st_1(CPUState *cpu, MMULookupPageData *p, uint8_t val,
                     int mmu_idx, uintptr_t ra)
 {
-    if (unlikely(p->flags & TLB_MMIO)) {
-        do_st_mmio_leN(cpu, p->full, val, p->addr, 1, mmu_idx, ra);
-    } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) {
+    if (unlikely(p->o.flags & TLB_MMIO)) {
+        do_st_mmio_leN(cpu, &p->o.full, val, p->addr, 1, mmu_idx, ra);
+    } else if (unlikely(p->o.flags & TLB_DISCARD_WRITE)) {
         /* nothing */
     } else {
-        *(uint8_t *)p->haddr = val;
+        *(uint8_t *)p->o.haddr = val;
     }
 }
 
 static void do_st_2(CPUState *cpu, MMULookupPageData *p, uint16_t val,
                     int mmu_idx, MemOp memop, uintptr_t ra)
 {
-    if (unlikely(p->flags & TLB_MMIO)) {
+    if (unlikely(p->o.flags & TLB_MMIO)) {
         if ((memop & MO_BSWAP) != MO_LE) {
             val = bswap16(val);
         }
-        do_st_mmio_leN(cpu, p->full, val, p->addr, 2, mmu_idx, ra);
-    } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) {
+        do_st_mmio_leN(cpu, &p->o.full, val, p->addr, 2, mmu_idx, ra);
+    } else if (unlikely(p->o.flags & TLB_DISCARD_WRITE)) {
         /* nothing */
     } else {
         /* Swap to host endian if necessary, then store. */
         if (memop & MO_BSWAP) {
             val = bswap16(val);
         }
-        store_atom_2(cpu, ra, p->haddr, memop, val);
+        store_atom_2(cpu, ra, p->o.haddr, memop, val);
     }
 }
 
 static void do_st_4(CPUState *cpu, MMULookupPageData *p, uint32_t val,
                     int mmu_idx, MemOp memop, uintptr_t ra)
 {
-    if (unlikely(p->flags & TLB_MMIO)) {
+    if (unlikely(p->o.flags & TLB_MMIO)) {
         if ((memop & MO_BSWAP) != MO_LE) {
             val = bswap32(val);
         }
-        do_st_mmio_leN(cpu, p->full, val, p->addr, 4, mmu_idx, ra);
-    } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) {
+        do_st_mmio_leN(cpu, &p->o.full, val, p->addr, 4, mmu_idx, ra);
+    } else if (unlikely(p->o.flags & TLB_DISCARD_WRITE)) {
         /* nothing */
     } else {
         /* Swap to host endian if necessary, then store. */
         if (memop & MO_BSWAP) {
             val = bswap32(val);
         }
-        store_atom_4(cpu, ra, p->haddr, memop, val);
+        store_atom_4(cpu, ra, p->o.haddr, memop, val);
     }
 }
 
 static void do_st_8(CPUState *cpu, MMULookupPageData *p, uint64_t val,
                     int mmu_idx, MemOp memop, uintptr_t ra)
 {
-    if (unlikely(p->flags & TLB_MMIO)) {
+    if (unlikely(p->o.flags & TLB_MMIO)) {
         if ((memop & MO_BSWAP) != MO_LE) {
             val = bswap64(val);
         }
-        do_st_mmio_leN(cpu, p->full, val, p->addr, 8, mmu_idx, ra);
-    } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) {
+        do_st_mmio_leN(cpu, &p->o.full, val, p->addr, 8, mmu_idx, ra);
+    } else if (unlikely(p->o.flags & TLB_DISCARD_WRITE)) {
         /* nothing */
     } else {
         /* Swap to host endian if necessary, then store. */
         if (memop & MO_BSWAP) {
             val = bswap64(val);
         }
-        store_atom_8(cpu, ra, p->haddr, memop, val);
+        store_atom_8(cpu, ra, p->o.haddr, memop, val);
     }
 }
 
@@ -2822,19 +2818,20 @@ static void do_st16_mmu(CPUState *cpu, vaddr addr, Int128 val,
     cpu_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST);
     crosspage = mmu_lookup(cpu, addr, oi, ra, MMU_DATA_STORE, &l);
     if (likely(!crosspage)) {
-        if (unlikely(l.page[0].flags & TLB_MMIO)) {
+        if (unlikely(l.page[0].o.flags & TLB_MMIO)) {
             if ((l.memop & MO_BSWAP) != MO_LE) {
                 val = bswap128(val);
             }
-            do_st16_mmio_leN(cpu, l.page[0].full, val, addr, 16, l.mmu_idx, ra);
-        } else if (unlikely(l.page[0].flags & TLB_DISCARD_WRITE)) {
+            do_st16_mmio_leN(cpu, &l.page[0].o.full, val, addr,
+                             16, l.mmu_idx, ra);
+        } else if (unlikely(l.page[0].o.flags & TLB_DISCARD_WRITE)) {
             /* nothing */
         } else {
             /* Swap to host endian if necessary, then store. */
             if (l.memop & MO_BSWAP) {
                 val = bswap128(val);
             }
-            store_atom_16(cpu, ra, l.page[0].haddr, l.memop, val);
+            store_atom_16(cpu, ra, l.page[0].o.haddr, l.memop, val);
         }
         return;
     }
-- 
2.43.0




* [PATCH v2 30/54] accel/tcg: Merge mmu_lookup1 into mmu_lookup
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (28 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 29/54] accel/tcg: Partially unify MMULookupPageData and TLBLookupOutput Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:31   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 31/54] accel/tcg: Always use IntervalTree for code lookups Richard Henderson
                   ` (24 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

Reuse most of TLBLookupInput between calls to tlb_lookup.
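
The shape of the change, as a condensed sketch using the names from the
diff below (not a complete function): fill TLBLookupInput once, then
update only the fields that differ for the second half of a cross-page
access.

    TLBLookupInput i = {
        .addr = addr,
        .ra = ra,
        .access_type = type,
        .memop_probe = memop,
        .size = size0,                  /* bytes on the first page */
        .mmu_idx = mmu_idx,
    };

    tlb_lookup_nofail(cpu, &l->page[0].o, &i);

    /* Only the address and size change for the second page. */
    i.addr = l->page[1].addr;
    i.size = l->page[1].size;
    tlb_lookup_nofail(cpu, &l->page[1].o, &i);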

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 65 ++++++++++++++++++----------------------------
 1 file changed, 25 insertions(+), 40 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 8f459be5a8..981098a6f2 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1695,34 +1695,6 @@ typedef struct MMULookupLocals {
     int mmu_idx;
 } MMULookupLocals;
 
-/**
- * mmu_lookup1: translate one page
- * @cpu: generic cpu state
- * @data: lookup parameters
- * @memop: memory operation for the access, or 0
- * @mmu_idx: virtual address context
- * @access_type: load/store/code
- * @ra: return address into tcg generated code, or 0
- *
- * Resolve the translation for the one page at @data.addr, filling in
- * the rest of @data with the results.  If the translation fails,
- * tlb_fill_align will longjmp out.
- */
-static void mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
-                        int mmu_idx, MMUAccessType access_type, uintptr_t ra)
-{
-    TLBLookupInput i = {
-        .addr = data->addr,
-        .ra = ra,
-        .access_type = access_type,
-        .memop_probe = memop,
-        .size = data->size,
-        .mmu_idx = mmu_idx,
-    };
-
-    tlb_lookup_nofail(cpu, &data->o, &i);
-}
-
 /**
  * mmu_watch_or_dirty
  * @cpu: generic cpu state
@@ -1769,26 +1741,36 @@ static void mmu_watch_or_dirty(CPUState *cpu, MMULookupPageData *data,
 static bool mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
                        uintptr_t ra, MMUAccessType type, MMULookupLocals *l)
 {
+    MemOp memop = get_memop(oi);
+    int mmu_idx = get_mmuidx(oi);
+    TLBLookupInput i = {
+        .addr = addr,
+        .ra = ra,
+        .access_type = type,
+        .memop_probe = memop,
+        .size = memop_size(memop),
+        .mmu_idx = mmu_idx,
+    };
     bool crosspage;
     int flags;
 
-    l->memop = get_memop(oi);
-    l->mmu_idx = get_mmuidx(oi);
+    l->memop = memop;
+    l->mmu_idx = mmu_idx;
 
-    tcg_debug_assert(l->mmu_idx < NB_MMU_MODES);
+    tcg_debug_assert(mmu_idx < NB_MMU_MODES);
 
     l->page[0].addr = addr;
-    l->page[0].size = memop_size(l->memop);
-    l->page[1].addr = (addr + l->page[0].size - 1) & TARGET_PAGE_MASK;
+    l->page[0].size = i.size;
+    l->page[1].addr = (addr + i.size - 1) & TARGET_PAGE_MASK;
     l->page[1].size = 0;
     crosspage = (addr ^ l->page[1].addr) & TARGET_PAGE_MASK;
 
     if (likely(!crosspage)) {
-        mmu_lookup1(cpu, &l->page[0], l->memop, l->mmu_idx, type, ra);
+        tlb_lookup_nofail(cpu, &l->page[0].o, &i);
 
         flags = l->page[0].o.flags;
         if (unlikely(flags & (TLB_WATCHPOINT | TLB_NOTDIRTY))) {
-            mmu_watch_or_dirty(cpu, &l->page[0], type, ra);
+            mmu_watch_or_dirty(cpu, &l->page[0], i.access_type, i.ra);
         }
         if (unlikely(flags & TLB_BSWAP)) {
             l->memop ^= MO_BSWAP;
@@ -1796,17 +1778,20 @@ static bool mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
     } else {
         /* Finish compute of page crossing. */
         int size0 = l->page[1].addr - addr;
-        l->page[1].size = l->page[0].size - size0;
+        l->page[1].size = i.size - size0;
         l->page[0].size = size0;
 
         /* Lookup both pages, recognizing exceptions from either. */
-        mmu_lookup1(cpu, &l->page[0], l->memop, l->mmu_idx, type, ra);
-        mmu_lookup1(cpu, &l->page[1], 0, l->mmu_idx, type, ra);
+        i.size = size0;
+        tlb_lookup_nofail(cpu, &l->page[0].o, &i);
+        i.addr = l->page[1].addr;
+        i.size = l->page[1].size;
+        tlb_lookup_nofail(cpu, &l->page[1].o, &i);
 
         flags = l->page[0].o.flags | l->page[1].o.flags;
         if (unlikely(flags & (TLB_WATCHPOINT | TLB_NOTDIRTY))) {
-            mmu_watch_or_dirty(cpu, &l->page[0], type, ra);
-            mmu_watch_or_dirty(cpu, &l->page[1], type, ra);
+            mmu_watch_or_dirty(cpu, &l->page[0], i.access_type, i.ra);
+            mmu_watch_or_dirty(cpu, &l->page[1], i.access_type, i.ra);
         }
 
         /*
-- 
2.43.0




* [PATCH v2 31/54] accel/tcg: Always use IntervalTree for code lookups
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (29 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 30/54] accel/tcg: Merge mmu_lookup1 into mmu_lookup Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:32   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 32/54] accel/tcg: Link CPUTLBEntry to CPUTLBEntryTree Richard Henderson
                   ` (23 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

Because translation is special, we don't need the speed
of the direct-mapped softmmu tlb.  We cache lookups in
DisasContextBase within the translator loop anyway.

Drop the addr_code comparator from CPUTLBEntry.
Go directly to the IntervalTree for MMU_INST_FETCH.
Derive exec flags from read flags.
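
A condensed sketch of the resulting fetch path inside tlb_lookup, with
names as in the diff below (miss handling elided; a miss falls through
to tlb_fill_align and retries the tree lookup):

    if (access_type == MMU_INST_FETCH) {
        /* No addr_code comparator any more: skip the direct-mapped
         * table and consult the IntervalTree directly.
         */
        CPUTLBEntryTree *node = tlbtree_lookup_addr(desc, addr);

        if (node && (node->full.prot & PAGE_EXEC)) {
            /* Exec flags are derived from the read comparator. */
            o->flags = node->copy.addr_read & TLB_EXEC_FLAGS_MASK;
            o->full = node->full;
            o->haddr = (void *)((uintptr_t)addr + node->copy.addend);
            return true;
        }
    }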

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/cpu-all.h    |  3 ++
 include/exec/tlb-common.h |  5 ++-
 accel/tcg/cputlb.c        | 76 ++++++++++++++++++++++++---------------
 3 files changed, 52 insertions(+), 32 deletions(-)

diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index 45e6676938..ad160c328a 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -339,6 +339,9 @@ static inline int cpu_mmu_index(CPUState *cs, bool ifetch)
     (TLB_INVALID_MASK | TLB_NOTDIRTY | TLB_MMIO \
     | TLB_FORCE_SLOW | TLB_DISCARD_WRITE)
 
+/* Filter read flags to exec flags. */
+#define TLB_EXEC_FLAGS_MASK  (TLB_MMIO)
+
 /*
  * Flags stored in CPUTLBEntryFull.slow_flags[x].
  * TLB_FORCE_SLOW must be set in CPUTLBEntry.addr_idx[x].
diff --git a/include/exec/tlb-common.h b/include/exec/tlb-common.h
index 300f9fae67..feaa471299 100644
--- a/include/exec/tlb-common.h
+++ b/include/exec/tlb-common.h
@@ -26,7 +26,6 @@ typedef union CPUTLBEntry {
     struct {
         uint64_t addr_read;
         uint64_t addr_write;
-        uint64_t addr_code;
         /*
          * Addend to virtual address to get host address.  IO accesses
          * use the corresponding iotlb value.
@@ -35,7 +34,7 @@ typedef union CPUTLBEntry {
     };
     /*
      * Padding to get a power of two size, as well as index
-     * access to addr_{read,write,code}.
+     * access to addr_{read,write}.
      */
     uint64_t addr_idx[(1 << CPU_TLB_ENTRY_BITS) / sizeof(uint64_t)];
 } CPUTLBEntry;
@@ -92,7 +91,7 @@ struct CPUTLBEntryFull {
      * Additional tlb flags for use by the slow path. If non-zero,
      * the corresponding CPUTLBEntry comparator must have TLB_FORCE_SLOW.
      */
-    uint8_t slow_flags[MMU_ACCESS_COUNT];
+    uint8_t slow_flags[2];
 
     /*
      * Allow target-specific additions to this structure.
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 981098a6f2..be2ea1bc70 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -114,8 +114,9 @@ static inline uint64_t tlb_read_idx(const CPUTLBEntry *entry,
                       MMU_DATA_LOAD * sizeof(uint64_t));
     QEMU_BUILD_BUG_ON(offsetof(CPUTLBEntry, addr_write) !=
                       MMU_DATA_STORE * sizeof(uint64_t));
-    QEMU_BUILD_BUG_ON(offsetof(CPUTLBEntry, addr_code) !=
-                      MMU_INST_FETCH * sizeof(uint64_t));
+
+    tcg_debug_assert(access_type == MMU_DATA_LOAD ||
+                     access_type == MMU_DATA_STORE);
 
 #if TARGET_LONG_BITS == 32
     /* Use qatomic_read, in case of addr_write; only care about low bits. */
@@ -480,8 +481,7 @@ static bool tlb_hit_page_mask_anyprot(CPUTLBEntry *tlb_entry,
     mask &= TARGET_PAGE_MASK | TLB_INVALID_MASK;
 
     return (page == (tlb_entry->addr_read & mask) ||
-            page == (tlb_addr_write(tlb_entry) & mask) ||
-            page == (tlb_entry->addr_code & mask));
+            page == (tlb_addr_write(tlb_entry) & mask));
 }
 
 /* Called with tlb_c.lock held */
@@ -1184,9 +1184,6 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
     /* Now calculate the new entry */
     node->copy.addend = addend - addr_page;
 
-    tlb_set_compare(full, &node->copy, addr_page, read_flags,
-                    MMU_INST_FETCH, prot & PAGE_EXEC);
-
     if (wp_flags & BP_MEM_READ) {
         read_flags |= TLB_WATCHPOINT;
     }
@@ -1308,22 +1305,30 @@ static bool tlb_lookup(CPUState *cpu, TLBLookupOutput *o,
     /* Primary lookup in the fast tlb. */
     entry = tlbfast_entry(fast, addr);
     full = &desc->fulltlb[tlbfast_index(fast, addr)];
-    cmp = tlb_read_idx(entry, access_type);
-    if (tlb_hit(cmp, addr)) {
-        goto found;
+    if (access_type != MMU_INST_FETCH) {
+        cmp = tlb_read_idx(entry, access_type);
+        if (tlb_hit(cmp, addr)) {
+            goto found_data;
+        }
     }
 
     /* Secondary lookup in the IntervalTree. */
     node = tlbtree_lookup_addr(desc, addr);
     if (node) {
-        cmp = tlb_read_idx(&node->copy, access_type);
-        if (tlb_hit(cmp, addr)) {
-            /* Install the cached entry. */
-            qemu_spin_lock(&cpu->neg.tlb.c.lock);
-            copy_tlb_helper_locked(entry, &node->copy);
-            qemu_spin_unlock(&cpu->neg.tlb.c.lock);
-            *full = node->full;
-            goto found;
+        if (access_type == MMU_INST_FETCH) {
+            if (node->full.prot & PAGE_EXEC) {
+                goto found_code;
+            }
+        } else {
+            cmp = tlb_read_idx(&node->copy, access_type);
+            if (tlb_hit(cmp, addr)) {
+                /* Install the cached entry. */
+                qemu_spin_lock(&cpu->neg.tlb.c.lock);
+                copy_tlb_helper_locked(entry, &node->copy);
+                qemu_spin_unlock(&cpu->neg.tlb.c.lock);
+                *full = node->full;
+                goto found_data;
+            }
         }
     }
 
@@ -1333,9 +1338,14 @@ static bool tlb_lookup(CPUState *cpu, TLBLookupOutput *o,
         tcg_debug_assert(probe);
         return false;
     }
-
     o->did_tlb_fill = true;
 
+    if (access_type == MMU_INST_FETCH) {
+        node = tlbtree_lookup_addr(desc, addr);
+        tcg_debug_assert(node);
+        goto found_code;
+    }
+
     entry = tlbfast_entry(fast, addr);
     full = &desc->fulltlb[tlbfast_index(fast, addr)];
     cmp = tlb_read_idx(entry, access_type);
@@ -1345,14 +1355,29 @@ static bool tlb_lookup(CPUState *cpu, TLBLookupOutput *o,
      * called tlb_fill_align, so we know that this entry *is* valid.
      */
     flags &= ~TLB_INVALID_MASK;
+    goto found_data;
+
+ found_data:
+    flags &= cmp;
+    flags |= full->slow_flags[access_type];
+    o->flags = flags;
+    o->full = *full;
+    o->haddr = (void *)((uintptr_t)addr + entry->addend);
     goto done;
 
- found:
-    /* Alignment has not been checked by tlb_fill_align. */
-    {
+ found_code:
+    o->flags = node->copy.addr_read & TLB_EXEC_FLAGS_MASK;
+    o->full = node->full;
+    o->haddr = (void *)((uintptr_t)addr + node->copy.addend);
+    goto done;
+
+ done:
+    if (!o->did_tlb_fill) {
         int a_bits = memop_alignment_bits(memop);
 
         /*
+         * Alignment has not been checked by tlb_fill_align.
+         *
          * The TLB_CHECK_ALIGNED check differs from the normal alignment
          * check, in that this is based on the atomicity of the operation.
          * The intended use case is the ARM memory type field of each PTE,
@@ -1366,13 +1391,6 @@ static bool tlb_lookup(CPUState *cpu, TLBLookupOutput *o,
             cpu_unaligned_access(cpu, addr, access_type, i->mmu_idx, i->ra);
         }
     }
-
- done:
-    flags &= cmp;
-    flags |= full->slow_flags[access_type];
-    o->flags = flags;
-    o->full = *full;
-    o->haddr = (void *)((uintptr_t)addr + entry->addend);
     return true;
 }
 
-- 
2.43.0




* [PATCH v2 32/54] accel/tcg: Link CPUTLBEntry to CPUTLBEntryTree
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (30 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 31/54] accel/tcg: Always use IntervalTree for code lookups Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:39   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 33/54] accel/tcg: Remove CPUTLBDesc.fulltlb Richard Henderson
                   ` (22 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

Link from the fast tlb entry to the interval tree node.
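
The pointer occupies what was padding in the CPUTLBEntry union, so a
hit in the fast table now reaches its CPUTLBEntryFull in one
dereference instead of indexing a parallel array; sketched from
tlb_plugin_lookup as converted below:

    CPUTLBEntry *tlbe = tlb_entry(cpu, mmu_idx, addr);
    CPUTLBEntryFull *full;

    if (!tlb_hit(tlb_read_idx(tlbe, access_type), addr)) {
        return false;
    }
    full = &tlbe->tree->full;   /* tree is non-NULL for any valid entry */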

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/tlb-common.h |  2 ++
 accel/tcg/cputlb.c        | 26 +++++++++++++-------------
 2 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/include/exec/tlb-common.h b/include/exec/tlb-common.h
index feaa471299..3b57d61112 100644
--- a/include/exec/tlb-common.h
+++ b/include/exec/tlb-common.h
@@ -31,6 +31,8 @@ typedef union CPUTLBEntry {
          * use the corresponding iotlb value.
          */
         uintptr_t addend;
+        /* The defining IntervalTree entry. */
+        struct CPUTLBEntryTree *tree;
     };
     /*
      * Padding to get a power of two size, as well as index
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index be2ea1bc70..3282436752 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -490,7 +490,10 @@ static bool tlb_flush_entry_mask_locked(CPUTLBEntry *tlb_entry,
                                         vaddr mask)
 {
     if (tlb_hit_page_mask_anyprot(tlb_entry, page, mask)) {
-        memset(tlb_entry, -1, sizeof(*tlb_entry));
+        tlb_entry->addr_read = -1;
+        tlb_entry->addr_write = -1;
+        tlb_entry->addend = 0;
+        tlb_entry->tree = NULL;
         return true;
     }
     return false;
@@ -1183,6 +1186,7 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
 
     /* Now calculate the new entry */
     node->copy.addend = addend - addr_page;
+    node->copy.tree = node;
 
     if (wp_flags & BP_MEM_READ) {
         read_flags |= TLB_WATCHPOINT;
@@ -1291,7 +1295,6 @@ static bool tlb_lookup(CPUState *cpu, TLBLookupOutput *o,
     CPUTLBDescFast *fast = &cpu->neg.tlb.f[i->mmu_idx];
     vaddr addr = i->addr;
     MMUAccessType access_type = i->access_type;
-    CPUTLBEntryFull *full;
     CPUTLBEntryTree *node;
     CPUTLBEntry *entry;
     uint64_t cmp;
@@ -1304,9 +1307,9 @@ static bool tlb_lookup(CPUState *cpu, TLBLookupOutput *o,
 
     /* Primary lookup in the fast tlb. */
     entry = tlbfast_entry(fast, addr);
-    full = &desc->fulltlb[tlbfast_index(fast, addr)];
     if (access_type != MMU_INST_FETCH) {
         cmp = tlb_read_idx(entry, access_type);
+        node = entry->tree;
         if (tlb_hit(cmp, addr)) {
             goto found_data;
         }
@@ -1326,7 +1329,6 @@ static bool tlb_lookup(CPUState *cpu, TLBLookupOutput *o,
                 qemu_spin_lock(&cpu->neg.tlb.c.lock);
                 copy_tlb_helper_locked(entry, &node->copy);
                 qemu_spin_unlock(&cpu->neg.tlb.c.lock);
-                *full = node->full;
                 goto found_data;
             }
         }
@@ -1347,8 +1349,8 @@ static bool tlb_lookup(CPUState *cpu, TLBLookupOutput *o,
     }
 
     entry = tlbfast_entry(fast, addr);
-    full = &desc->fulltlb[tlbfast_index(fast, addr)];
     cmp = tlb_read_idx(entry, access_type);
+    node = entry->tree;
     /*
      * With PAGE_WRITE_INV, we set TLB_INVALID_MASK immediately,
      * to force the next access through tlb_fill_align.  We've just
@@ -1359,19 +1361,18 @@ static bool tlb_lookup(CPUState *cpu, TLBLookupOutput *o,
 
  found_data:
     flags &= cmp;
-    flags |= full->slow_flags[access_type];
+    flags |= node->full.slow_flags[access_type];
     o->flags = flags;
-    o->full = *full;
-    o->haddr = (void *)((uintptr_t)addr + entry->addend);
-    goto done;
+    goto found_common;
 
  found_code:
     o->flags = node->copy.addr_read & TLB_EXEC_FLAGS_MASK;
+    goto found_common;
+
+ found_common:
     o->full = node->full;
     o->haddr = (void *)((uintptr_t)addr + node->copy.addend);
-    goto done;
 
- done:
     if (!o->did_tlb_fill) {
         int a_bits = memop_alignment_bits(memop);
 
@@ -1669,7 +1670,6 @@ bool tlb_plugin_lookup(CPUState *cpu, vaddr addr, int mmu_idx,
                        bool is_store, struct qemu_plugin_hwaddr *data)
 {
     CPUTLBEntry *tlbe = tlb_entry(cpu, mmu_idx, addr);
-    uintptr_t index = tlb_index(cpu, mmu_idx, addr);
     MMUAccessType access_type = is_store ? MMU_DATA_STORE : MMU_DATA_LOAD;
     uint64_t tlb_addr = tlb_read_idx(tlbe, access_type);
     CPUTLBEntryFull *full;
@@ -1678,7 +1678,7 @@ bool tlb_plugin_lookup(CPUState *cpu, vaddr addr, int mmu_idx,
         return false;
     }
 
-    full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
+    full = &tlbe->tree->full;
     data->phys_addr = full->phys_addr | (addr & ~TARGET_PAGE_MASK);
 
     /* We must have an iotlb entry for MMIO */
-- 
2.43.0




* [PATCH v2 33/54] accel/tcg: Remove CPUTLBDesc.fulltlb
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (31 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 32/54] accel/tcg: Link CPUTLBEntry to CPUTLBEntryTree Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:49   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 34/54] target/alpha: Convert to TCGCPUOps.tlb_fill_align Richard Henderson
                   ` (21 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

This array is now write-only, and may be removed.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/hw/core/cpu.h |  1 -
 accel/tcg/cputlb.c    | 34 +++++++---------------------------
 2 files changed, 7 insertions(+), 28 deletions(-)

diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 4364ddb1db..5c069f2a00 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -219,7 +219,6 @@ typedef struct CPUTLBDesc {
     /* maximum number of entries observed in the window */
     size_t window_max_entries;
     size_t n_used_entries;
-    CPUTLBEntryFull *fulltlb;
     /* All active tlb entries for this address space. */
     IntervalTreeRoot iroot;
 } CPUTLBDesc;
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 3282436752..7f63dc3fd8 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -149,13 +149,6 @@ static inline CPUTLBEntry *tlbfast_entry(CPUTLBDescFast *fast, vaddr addr)
     return fast->table + tlbfast_index(fast, addr);
 }
 
-/* Find the TLB index corresponding to the mmu_idx + address pair.  */
-static inline uintptr_t tlb_index(CPUState *cpu, uintptr_t mmu_idx,
-                                  vaddr addr)
-{
-    return tlbfast_index(&cpu->neg.tlb.f[mmu_idx], addr);
-}
-
 /* Find the TLB entry corresponding to the mmu_idx + address pair.  */
 static inline CPUTLBEntry *tlb_entry(CPUState *cpu, uintptr_t mmu_idx,
                                      vaddr addr)
@@ -270,22 +263,20 @@ static void tlb_mmu_resize_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast,
     }
 
     g_free(fast->table);
-    g_free(desc->fulltlb);
 
     tlb_window_reset(desc, now, 0);
     /* desc->n_used_entries is cleared by the caller */
     fast->mask = (new_size - 1) << CPU_TLB_ENTRY_BITS;
     fast->table = g_try_new(CPUTLBEntry, new_size);
-    desc->fulltlb = g_try_new(CPUTLBEntryFull, new_size);
 
     /*
-     * If the allocations fail, try smaller sizes. We just freed some
+     * If the allocation fails, try smaller sizes. We just freed some
      * memory, so going back to half of new_size has a good chance of working.
      * Increased memory pressure elsewhere in the system might cause the
      * allocations to fail though, so we progressively reduce the allocation
      * size, aborting if we cannot even allocate the smallest TLB we support.
      */
-    while (fast->table == NULL || desc->fulltlb == NULL) {
+    while (fast->table == NULL) {
         if (new_size == (1 << CPU_TLB_DYN_MIN_BITS)) {
             error_report("%s: %s", __func__, strerror(errno));
             abort();
@@ -294,9 +285,7 @@ static void tlb_mmu_resize_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast,
         fast->mask = (new_size - 1) << CPU_TLB_ENTRY_BITS;
 
         g_free(fast->table);
-        g_free(desc->fulltlb);
         fast->table = g_try_new(CPUTLBEntry, new_size);
-        desc->fulltlb = g_try_new(CPUTLBEntryFull, new_size);
     }
 }
 
@@ -350,7 +339,6 @@ static void tlb_mmu_init(CPUTLBDesc *desc, CPUTLBDescFast *fast, int64_t now)
     desc->n_used_entries = 0;
     fast->mask = (n_entries - 1) << CPU_TLB_ENTRY_BITS;
     fast->table = g_new(CPUTLBEntry, n_entries);
-    desc->fulltlb = g_new(CPUTLBEntryFull, n_entries);
     memset(&desc->iroot, 0, sizeof(desc->iroot));
     tlb_mmu_flush_locked(desc, fast);
 }
@@ -372,15 +360,9 @@ void tlb_init(CPUState *cpu)
 
 void tlb_destroy(CPUState *cpu)
 {
-    int i;
-
     qemu_spin_destroy(&cpu->neg.tlb.c.lock);
-    for (i = 0; i < NB_MMU_MODES; i++) {
-        CPUTLBDesc *desc = &cpu->neg.tlb.d[i];
-        CPUTLBDescFast *fast = &cpu->neg.tlb.f[i];
-
-        g_free(fast->table);
-        g_free(desc->fulltlb);
+    for (int i = 0; i < NB_MMU_MODES; i++) {
+        g_free(cpu->neg.tlb.f[i].table);
         interval_tree_free_nodes(&cpu->neg.tlb.d[i].iroot,
                                  offsetof(CPUTLBEntryTree, itree));
     }
@@ -1061,7 +1043,7 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
     CPUTLB *tlb = &cpu->neg.tlb;
     CPUTLBDesc *desc = &tlb->d[mmu_idx];
     MemoryRegionSection *section;
-    unsigned int index, read_flags, write_flags;
+    unsigned int read_flags, write_flags;
     uintptr_t addend;
     CPUTLBEntry *te;
     CPUTLBEntryTree *node;
@@ -1140,7 +1122,6 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
     wp_flags = cpu_watchpoint_address_matches(cpu, addr_page,
                                               TARGET_PAGE_SIZE);
 
-    index = tlb_index(cpu, mmu_idx, addr_page);
     te = tlb_entry(cpu, mmu_idx, addr_page);
 
     /*
@@ -1179,8 +1160,8 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
      * subtract here is that of the page base, and not the same as the
      * vaddr we add back in io_prepare()/get_page_addr_code().
      */
-    desc->fulltlb[index] = *full;
-    full = &desc->fulltlb[index];
+    node->full = *full;
+    full = &node->full;
     full->xlat_section = iotlb - addr_page;
     full->phys_addr = paddr_page;
 
@@ -1203,7 +1184,6 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
     tlb_set_compare(full, &node->copy, addr_page, write_flags,
                     MMU_DATA_STORE, prot & PAGE_WRITE);
 
-    node->full = *full;
     copy_tlb_helper_locked(te, &node->copy);
     desc->n_used_entries++;
     qemu_spin_unlock(&tlb->c.lock);
-- 
2.43.0




* [PATCH v2 34/54] target/alpha: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (32 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 33/54] accel/tcg: Remove CPUTLBDesc.fulltlb Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:53   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 35/54] target/avr: " Richard Henderson
                   ` (20 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/alpha/cpu.h    |  6 +++---
 target/alpha/cpu.c    |  2 +-
 target/alpha/helper.c | 23 +++++++++++++++++------
 3 files changed, 21 insertions(+), 10 deletions(-)
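
The conversion shape shared by this and the following target patches,
as a sketch with placeholder names (xxx stands for the target prefix;
phys and prot for the target's own translation results):

    bool xxx_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out, vaddr addr,
                                MMUAccessType access_type, int mmu_idx,
                                MemOp memop, int size, bool probe, uintptr_t ra)
    {
        hwaddr phys;
        int prot;

        /* The hook now owns the guest alignment check. */
        if (addr & ((1 << memop_alignment_bits(memop)) - 1)) {
            if (probe) {
                return false;
            }
            xxx_cpu_do_unaligned_access(cs, addr, access_type, mmu_idx, ra);
        }

        /* ... target-specific translation fills phys/prot, raising a
         * fault (or returning false when probing) on failure ...
         */

        /* Fill the caller-provided entry instead of calling tlb_set_page. */
        memset(out, 0, sizeof(*out));
        out->phys_addr = phys;
        out->prot = prot;
        out->attrs = MEMTXATTRS_UNSPECIFIED;
        out->lg_page_size = TARGET_PAGE_BITS;
        return true;
    }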

diff --git a/target/alpha/cpu.h b/target/alpha/cpu.h
index 3556d3227f..70331c0b83 100644
--- a/target/alpha/cpu.h
+++ b/target/alpha/cpu.h
@@ -449,9 +449,9 @@ void alpha_cpu_record_sigsegv(CPUState *cs, vaddr address,
 void alpha_cpu_record_sigbus(CPUState *cs, vaddr address,
                              MMUAccessType access_type, uintptr_t retaddr);
 #else
-bool alpha_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
-                        MMUAccessType access_type, int mmu_idx,
-                        bool probe, uintptr_t retaddr);
+bool alpha_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out, vaddr addr,
+                              MMUAccessType access_type, int mmu_idx,
+                              MemOp memop, int size, bool probe, uintptr_t ra);
 G_NORETURN void alpha_cpu_do_unaligned_access(CPUState *cpu, vaddr addr,
                                               MMUAccessType access_type, int mmu_idx,
                                               uintptr_t retaddr);
diff --git a/target/alpha/cpu.c b/target/alpha/cpu.c
index 5d75c941f7..7bcc48420d 100644
--- a/target/alpha/cpu.c
+++ b/target/alpha/cpu.c
@@ -228,7 +228,7 @@ static const TCGCPUOps alpha_tcg_ops = {
     .record_sigsegv = alpha_cpu_record_sigsegv,
     .record_sigbus = alpha_cpu_record_sigbus,
 #else
-    .tlb_fill = alpha_cpu_tlb_fill,
+    .tlb_fill_align = alpha_cpu_tlb_fill_align,
     .cpu_exec_interrupt = alpha_cpu_exec_interrupt,
     .cpu_exec_halt = alpha_cpu_has_work,
     .do_interrupt = alpha_cpu_do_interrupt,
diff --git a/target/alpha/helper.c b/target/alpha/helper.c
index 2f1000c99f..26eadfe3ca 100644
--- a/target/alpha/helper.c
+++ b/target/alpha/helper.c
@@ -294,14 +294,21 @@ hwaddr alpha_cpu_get_phys_page_debug(CPUState *cs, vaddr addr)
     return (fail >= 0 ? -1 : phys);
 }
 
-bool alpha_cpu_tlb_fill(CPUState *cs, vaddr addr, int size,
-                        MMUAccessType access_type, int mmu_idx,
-                        bool probe, uintptr_t retaddr)
+bool alpha_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out, vaddr addr,
+                              MMUAccessType access_type, int mmu_idx,
+                              MemOp memop, int size, bool probe, uintptr_t ra)
 {
     CPUAlphaState *env = cpu_env(cs);
     target_ulong phys;
     int prot, fail;
 
+    if (addr & ((1 << memop_alignment_bits(memop)) - 1)) {
+        if (probe) {
+            return false;
+        }
+        alpha_cpu_do_unaligned_access(cs, addr, access_type, mmu_idx, ra);
+    }
+
     fail = get_physical_address(env, addr, 1 << access_type,
                                 mmu_idx, &phys, &prot);
     if (unlikely(fail >= 0)) {
@@ -314,11 +321,15 @@ bool alpha_cpu_tlb_fill(CPUState *cs, vaddr addr, int size,
         env->trap_arg2 = (access_type == MMU_DATA_LOAD ? 0ull :
                           access_type == MMU_DATA_STORE ? 1ull :
                           /* access_type == MMU_INST_FETCH */ -1ull);
-        cpu_loop_exit_restore(cs, retaddr);
+        cpu_loop_exit_restore(cs, ra);
     }
 
-    tlb_set_page(cs, addr & TARGET_PAGE_MASK, phys & TARGET_PAGE_MASK,
-                 prot, mmu_idx, TARGET_PAGE_SIZE);
+    memset(out, 0, sizeof(*out));
+    out->phys_addr = phys;
+    out->prot = prot;
+    out->attrs = MEMTXATTRS_UNSPECIFIED;
+    out->lg_page_size = TARGET_PAGE_BITS;
+
     return true;
 }
 
-- 
2.43.0




* [PATCH v2 35/54] target/avr: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (33 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 34/54] target/alpha: Convert to TCGCPUOps.tlb_fill_align Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:53   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 36/54] target/i386: " Richard Henderson
                   ` (19 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/avr/cpu.h    |  7 ++++---
 target/avr/cpu.c    |  2 +-
 target/avr/helper.c | 19 ++++++++++++-------
 3 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/target/avr/cpu.h b/target/avr/cpu.h
index 4725535102..cdd3bcd418 100644
--- a/target/avr/cpu.h
+++ b/target/avr/cpu.h
@@ -23,6 +23,7 @@
 
 #include "cpu-qom.h"
 #include "exec/cpu-defs.h"
+#include "exec/memop.h"
 
 #ifdef CONFIG_USER_ONLY
 #error "AVR 8-bit does not support user mode"
@@ -238,9 +239,9 @@ static inline void cpu_set_sreg(CPUAVRState *env, uint8_t sreg)
     env->sregI = (sreg >> 7) & 0x01;
 }
 
-bool avr_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
-                      MMUAccessType access_type, int mmu_idx,
-                      bool probe, uintptr_t retaddr);
+bool avr_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out, vaddr addr,
+                            MMUAccessType access_type, int mmu_idx,
+                            MemOp memop, int size, bool probe, uintptr_t ra);
 
 #include "exec/cpu-all.h"
 
diff --git a/target/avr/cpu.c b/target/avr/cpu.c
index 3132842d56..a7fe869396 100644
--- a/target/avr/cpu.c
+++ b/target/avr/cpu.c
@@ -211,7 +211,7 @@ static const TCGCPUOps avr_tcg_ops = {
     .restore_state_to_opc = avr_restore_state_to_opc,
     .cpu_exec_interrupt = avr_cpu_exec_interrupt,
     .cpu_exec_halt = avr_cpu_has_work,
-    .tlb_fill = avr_cpu_tlb_fill,
+    .tlb_fill_align = avr_cpu_tlb_fill_align,
     .do_interrupt = avr_cpu_do_interrupt,
 };
 
diff --git a/target/avr/helper.c b/target/avr/helper.c
index 345708a1b3..a18f11aa9f 100644
--- a/target/avr/helper.c
+++ b/target/avr/helper.c
@@ -104,11 +104,11 @@ hwaddr avr_cpu_get_phys_page_debug(CPUState *cs, vaddr addr)
     return addr; /* I assume 1:1 address correspondence */
 }
 
-bool avr_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
-                      MMUAccessType access_type, int mmu_idx,
-                      bool probe, uintptr_t retaddr)
+bool avr_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out, vaddr address,
+                            MMUAccessType access_type, int mmu_idx,
+                            MemOp memop, int size, bool probe, uintptr_t ra)
 {
-    int prot, page_size = TARGET_PAGE_SIZE;
+    int prot, lg_page_size = TARGET_PAGE_BITS;
     uint32_t paddr;
 
     address &= TARGET_PAGE_MASK;
@@ -141,15 +141,20 @@ bool avr_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
              * to force tlb_fill to be called for the next access.
              */
             if (probe) {
-                page_size = 1;
+                lg_page_size = 0;
             } else {
                 cpu_env(cs)->fullacc = 1;
-                cpu_loop_exit_restore(cs, retaddr);
+                cpu_loop_exit_restore(cs, ra);
             }
         }
     }
 
-    tlb_set_page(cs, address, paddr, prot, mmu_idx, page_size);
+    memset(out, 0, sizeof(*out));
+    out->phys_addr = paddr;
+    out->prot = prot;
+    out->attrs = MEMTXATTRS_UNSPECIFIED;
+    out->lg_page_size = lg_page_size;
+
     return true;
 }
 
-- 
2.43.0




* [PATCH v2 36/54] target/i386: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (34 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 35/54] target/avr: " Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:53   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 37/54] target/loongarch: " Richard Henderson
                   ` (18 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/i386/tcg/helper-tcg.h         |  6 +++---
 target/i386/tcg/sysemu/excp_helper.c | 28 ++++++++++++++++------------
 target/i386/tcg/tcg-cpu.c            |  2 +-
 3 files changed, 20 insertions(+), 16 deletions(-)

diff --git a/target/i386/tcg/helper-tcg.h b/target/i386/tcg/helper-tcg.h
index 696d6ef016..b2164f41e6 100644
--- a/target/i386/tcg/helper-tcg.h
+++ b/target/i386/tcg/helper-tcg.h
@@ -79,9 +79,9 @@ void x86_cpu_record_sigsegv(CPUState *cs, vaddr addr,
 void x86_cpu_record_sigbus(CPUState *cs, vaddr addr,
                            MMUAccessType access_type, uintptr_t ra);
 #else
-bool x86_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
-                      MMUAccessType access_type, int mmu_idx,
-                      bool probe, uintptr_t retaddr);
+bool x86_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out, vaddr addr,
+                            MMUAccessType access_type, int mmu_idx,
+                            MemOp memop, int size, bool probe, uintptr_t ra);
 G_NORETURN void x86_cpu_do_unaligned_access(CPUState *cs, vaddr vaddr,
                                             MMUAccessType access_type,
                                             int mmu_idx, uintptr_t retaddr);
diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c
index 168ff8e5f3..d23d28fef5 100644
--- a/target/i386/tcg/sysemu/excp_helper.c
+++ b/target/i386/tcg/sysemu/excp_helper.c
@@ -601,25 +601,29 @@ static bool get_physical_address(CPUX86State *env, vaddr addr,
     return true;
 }
 
-bool x86_cpu_tlb_fill(CPUState *cs, vaddr addr, int size,
-                      MMUAccessType access_type, int mmu_idx,
-                      bool probe, uintptr_t retaddr)
+bool x86_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *full, vaddr addr,
+                            MMUAccessType access_type, int mmu_idx,
+                            MemOp memop, int size, bool probe,
+                            uintptr_t retaddr)
 {
     CPUX86State *env = cpu_env(cs);
     TranslateResult out;
     TranslateFault err;
 
+    if (addr & ((1 << memop_alignment_bits(memop)) - 1)) {
+        if (probe) {
+            return false;
+        }
+        x86_cpu_do_unaligned_access(cs, addr, access_type, mmu_idx, retaddr);
+    }
+
     if (get_physical_address(env, addr, access_type, mmu_idx, &out, &err,
                              retaddr)) {
-        /*
-         * Even if 4MB pages, we map only one 4KB page in the cache to
-         * avoid filling it too fast.
-         */
-        assert(out.prot & (1 << access_type));
-        tlb_set_page_with_attrs(cs, addr & TARGET_PAGE_MASK,
-                                out.paddr & TARGET_PAGE_MASK,
-                                cpu_get_mem_attrs(env),
-                                out.prot, mmu_idx, out.page_size);
+        memset(full, 0, sizeof(*full));
+        full->phys_addr = out.paddr;
+        full->prot = out.prot;
+        full->lg_page_size = ctz32(out.page_size);
+        full->attrs = cpu_get_mem_attrs(env);
         return true;
     }
 
diff --git a/target/i386/tcg/tcg-cpu.c b/target/i386/tcg/tcg-cpu.c
index cca19cd40e..6fce6227c7 100644
--- a/target/i386/tcg/tcg-cpu.c
+++ b/target/i386/tcg/tcg-cpu.c
@@ -117,7 +117,7 @@ static const TCGCPUOps x86_tcg_ops = {
     .record_sigsegv = x86_cpu_record_sigsegv,
     .record_sigbus = x86_cpu_record_sigbus,
 #else
-    .tlb_fill = x86_cpu_tlb_fill,
+    .tlb_fill_align = x86_cpu_tlb_fill_align,
     .do_interrupt = x86_cpu_do_interrupt,
     .cpu_exec_halt = x86_cpu_exec_halt,
     .cpu_exec_interrupt = x86_cpu_exec_interrupt,
-- 
2.43.0




* [PATCH v2 37/54] target/loongarch: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (35 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 36/54] target/i386: " Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:53   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 38/54] target/m68k: " Richard Henderson
                   ` (17 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/loongarch/internals.h      |  7 ++++---
 target/loongarch/cpu.c            |  2 +-
 target/loongarch/tcg/tlb_helper.c | 17 +++++++++++------
 3 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/target/loongarch/internals.h b/target/loongarch/internals.h
index 1a02427627..a9f73f27b2 100644
--- a/target/loongarch/internals.h
+++ b/target/loongarch/internals.h
@@ -60,9 +60,10 @@ int get_physical_address(CPULoongArchState *env, hwaddr *physical,
 hwaddr loongarch_cpu_get_phys_page_debug(CPUState *cpu, vaddr addr);
 
 #ifdef CONFIG_TCG
-bool loongarch_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
-                            MMUAccessType access_type, int mmu_idx,
-                            bool probe, uintptr_t retaddr);
+bool loongarch_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
+                                  vaddr addr, MMUAccessType access_type,
+                                  int mmu_idx, MemOp memop, int size,
+                                  bool probe, uintptr_t ra);
 #endif
 #endif /* !CONFIG_USER_ONLY */
 
diff --git a/target/loongarch/cpu.c b/target/loongarch/cpu.c
index 57cc4f314b..47d69f1788 100644
--- a/target/loongarch/cpu.c
+++ b/target/loongarch/cpu.c
@@ -798,7 +798,7 @@ static const TCGCPUOps loongarch_tcg_ops = {
     .restore_state_to_opc = loongarch_restore_state_to_opc,
 
 #ifndef CONFIG_USER_ONLY
-    .tlb_fill = loongarch_cpu_tlb_fill,
+    .tlb_fill_align = loongarch_cpu_tlb_fill_align,
     .cpu_exec_interrupt = loongarch_cpu_exec_interrupt,
     .cpu_exec_halt = loongarch_cpu_has_work,
     .do_interrupt = loongarch_cpu_do_interrupt,
diff --git a/target/loongarch/tcg/tlb_helper.c b/target/loongarch/tcg/tlb_helper.c
index 97f38fc391..94d5df08a4 100644
--- a/target/loongarch/tcg/tlb_helper.c
+++ b/target/loongarch/tcg/tlb_helper.c
@@ -474,9 +474,10 @@ void helper_invtlb_page_asid_or_g(CPULoongArchState *env,
     tlb_flush(env_cpu(env));
 }
 
-bool loongarch_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
-                            MMUAccessType access_type, int mmu_idx,
-                            bool probe, uintptr_t retaddr)
+bool loongarch_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
+                                  vaddr address, MMUAccessType access_type,
+                                  int mmu_idx, MemOp memop, int size,
+                                  bool probe, uintptr_t retaddr)
 {
     CPULoongArchState *env = cpu_env(cs);
     hwaddr physical;
@@ -488,12 +489,16 @@ bool loongarch_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
                                access_type, mmu_idx);
 
     if (ret == TLBRET_MATCH) {
-        tlb_set_page(cs, address & TARGET_PAGE_MASK,
-                     physical & TARGET_PAGE_MASK, prot,
-                     mmu_idx, TARGET_PAGE_SIZE);
         qemu_log_mask(CPU_LOG_MMU,
                       "%s address=%" VADDR_PRIx " physical " HWADDR_FMT_plx
                       " prot %d\n", __func__, address, physical, prot);
+
+        memset(out, 0, sizeof(*out));
+        out->phys_addr = physical;
+        out->prot = prot;
+        out->attrs = MEMTXATTRS_UNSPECIFIED;
+        out->lg_page_size = TARGET_PAGE_BITS;
+
         return true;
     } else {
         qemu_log_mask(CPU_LOG_MMU,
-- 
2.43.0




* [PATCH v2 38/54] target/m68k: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (36 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 37/54] target/loongarch: " Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:53   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 39/54] target/m68k: Do not call tlb_set_page in helper_ptest Richard Henderson
                   ` (16 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/m68k/cpu.h    |  7 ++++---
 target/m68k/cpu.c    |  2 +-
 target/m68k/helper.c | 22 +++++++++++++---------
 3 files changed, 18 insertions(+), 13 deletions(-)

diff --git a/target/m68k/cpu.h b/target/m68k/cpu.h
index b5bbeedb7a..4401426a0b 100644
--- a/target/m68k/cpu.h
+++ b/target/m68k/cpu.h
@@ -22,6 +22,7 @@
 #define M68K_CPU_H
 
 #include "exec/cpu-defs.h"
+#include "exec/memop.h"
 #include "qemu/cpu-float.h"
 #include "cpu-qom.h"
 
@@ -582,10 +583,10 @@ enum {
 #define MMU_KERNEL_IDX 0
 #define MMU_USER_IDX 1
 
-bool m68k_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
-                       MMUAccessType access_type, int mmu_idx,
-                       bool probe, uintptr_t retaddr);
 #ifndef CONFIG_USER_ONLY
+bool m68k_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out, vaddr addr,
+                             MMUAccessType access_type, int mmu_idx,
+                             MemOp memop, int size, bool probe, uintptr_t ra);
 void m68k_cpu_transaction_failed(CPUState *cs, hwaddr physaddr, vaddr addr,
                                  unsigned size, MMUAccessType access_type,
                                  int mmu_idx, MemTxAttrs attrs,
diff --git a/target/m68k/cpu.c b/target/m68k/cpu.c
index 5fe335558a..5316cf8922 100644
--- a/target/m68k/cpu.c
+++ b/target/m68k/cpu.c
@@ -550,7 +550,7 @@ static const TCGCPUOps m68k_tcg_ops = {
     .restore_state_to_opc = m68k_restore_state_to_opc,
 
 #ifndef CONFIG_USER_ONLY
-    .tlb_fill = m68k_cpu_tlb_fill,
+    .tlb_fill_align = m68k_cpu_tlb_fill_align,
     .cpu_exec_interrupt = m68k_cpu_exec_interrupt,
     .cpu_exec_halt = m68k_cpu_has_work,
     .do_interrupt = m68k_cpu_do_interrupt,
diff --git a/target/m68k/helper.c b/target/m68k/helper.c
index 9bfc6ae97c..1decb6f39c 100644
--- a/target/m68k/helper.c
+++ b/target/m68k/helper.c
@@ -950,9 +950,10 @@ void m68k_set_irq_level(M68kCPU *cpu, int level, uint8_t vector)
     }
 }
 
-bool m68k_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
-                       MMUAccessType qemu_access_type, int mmu_idx,
-                       bool probe, uintptr_t retaddr)
+bool m68k_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
+                             vaddr address, MMUAccessType qemu_access_type,
+                             int mmu_idx, MemOp memop, int size,
+                             bool probe, uintptr_t retaddr)
 {
     CPUM68KState *env = cpu_env(cs);
     hwaddr physical;
@@ -961,12 +962,14 @@ bool m68k_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
     int ret;
     target_ulong page_size;
 
+    memset(out, 0, sizeof(*out));
+    out->attrs = MEMTXATTRS_UNSPECIFIED;
+
     if ((env->mmu.tcr & M68K_TCR_ENABLED) == 0) {
         /* MMU disabled */
-        tlb_set_page(cs, address & TARGET_PAGE_MASK,
-                     address & TARGET_PAGE_MASK,
-                     PAGE_READ | PAGE_WRITE | PAGE_EXEC,
-                     mmu_idx, TARGET_PAGE_SIZE);
+        out->phys_addr = address;
+        out->prot = PAGE_READ | PAGE_WRITE | PAGE_EXEC;
+        out->lg_page_size = TARGET_PAGE_BITS;
         return true;
     }
 
@@ -985,8 +988,9 @@ bool m68k_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
     ret = get_physical_address(env, &physical, &prot,
                                address, access_type, &page_size);
     if (likely(ret == 0)) {
-        tlb_set_page(cs, address & TARGET_PAGE_MASK,
-                     physical & TARGET_PAGE_MASK, prot, mmu_idx, page_size);
+        out->phys_addr = physical;
+        out->prot = prot;
+        out->lg_page_size = ctz32(page_size);
         return true;
     }
 
-- 
2.43.0




* [PATCH v2 39/54] target/m68k: Do not call tlb_set_page in helper_ptest
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (37 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 38/54] target/m68k: " Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:53   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 40/54] target/microblaze: Convert to TCGCPUOps.tlb_fill_align Richard Henderson
                   ` (15 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

The entire operation of ptest is performed within
get_physical_address as part of ACCESS_PTEST.
There is no need to install the page into softmmu.
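
(Concretely: the ACCESS_PTEST walk inside get_physical_address records its
result in env->mmu.mmusr, which is the architecturally visible effect of
ptest; the TLB entry previously installed was never part of that result.)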

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/m68k/helper.c | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/target/m68k/helper.c b/target/m68k/helper.c
index 1decb6f39c..0a54eca9bb 100644
--- a/target/m68k/helper.c
+++ b/target/m68k/helper.c
@@ -1460,7 +1460,6 @@ void HELPER(ptest)(CPUM68KState *env, uint32_t addr, uint32_t is_read)
     hwaddr physical;
     int access_type;
     int prot;
-    int ret;
     target_ulong page_size;
 
     access_type = ACCESS_PTEST;
@@ -1476,14 +1475,7 @@ void HELPER(ptest)(CPUM68KState *env, uint32_t addr, uint32_t is_read)
 
     env->mmu.mmusr = 0;
     env->mmu.ssw = 0;
-    ret = get_physical_address(env, &physical, &prot, addr,
-                               access_type, &page_size);
-    if (ret == 0) {
-        tlb_set_page(env_cpu(env), addr & TARGET_PAGE_MASK,
-                     physical & TARGET_PAGE_MASK,
-                     prot, access_type & ACCESS_SUPER ?
-                     MMU_KERNEL_IDX : MMU_USER_IDX, page_size);
-    }
+    get_physical_address(env, &physical, &prot, addr, access_type, &page_size);
 }
 
 void HELPER(pflush)(CPUM68KState *env, uint32_t addr, uint32_t opmode)
-- 
2.43.0




* [PATCH v2 40/54] target/microblaze: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (38 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 39/54] target/m68k: Do not call tlb_set_page in helper_ptest Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:53   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 41/54] target/mips: " Richard Henderson
                   ` (14 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/microblaze/cpu.h    |  7 +++----
 target/microblaze/cpu.c    |  2 +-
 target/microblaze/helper.c | 33 ++++++++++++++++++++-------------
 3 files changed, 24 insertions(+), 18 deletions(-)

diff --git a/target/microblaze/cpu.h b/target/microblaze/cpu.h
index 3e5a3e5c60..b0eadfd9b1 100644
--- a/target/microblaze/cpu.h
+++ b/target/microblaze/cpu.h
@@ -421,10 +421,9 @@ static inline void cpu_get_tb_cpu_state(CPUMBState *env, vaddr *pc,
 }
 
 #if !defined(CONFIG_USER_ONLY)
-bool mb_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
-                     MMUAccessType access_type, int mmu_idx,
-                     bool probe, uintptr_t retaddr);
-
+bool mb_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out, vaddr address,
+                           MMUAccessType access_type, int mmu_idx,
+                           MemOp memop, int size, bool probe, uintptr_t ra);
 void mb_cpu_transaction_failed(CPUState *cs, hwaddr physaddr, vaddr addr,
                                unsigned size, MMUAccessType access_type,
                                int mmu_idx, MemTxAttrs attrs,
diff --git a/target/microblaze/cpu.c b/target/microblaze/cpu.c
index 710eb1146c..212cad2143 100644
--- a/target/microblaze/cpu.c
+++ b/target/microblaze/cpu.c
@@ -425,7 +425,7 @@ static const TCGCPUOps mb_tcg_ops = {
     .restore_state_to_opc = mb_restore_state_to_opc,
 
 #ifndef CONFIG_USER_ONLY
-    .tlb_fill = mb_cpu_tlb_fill,
+    .tlb_fill_align = mb_cpu_tlb_fill_align,
     .cpu_exec_interrupt = mb_cpu_exec_interrupt,
     .cpu_exec_halt = mb_cpu_has_work,
     .do_interrupt = mb_cpu_do_interrupt,
diff --git a/target/microblaze/helper.c b/target/microblaze/helper.c
index 5d3259ce31..b6375564b4 100644
--- a/target/microblaze/helper.c
+++ b/target/microblaze/helper.c
@@ -36,37 +36,44 @@ static bool mb_cpu_access_is_secure(MicroBlazeCPU *cpu,
     }
 }
 
-bool mb_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
-                     MMUAccessType access_type, int mmu_idx,
-                     bool probe, uintptr_t retaddr)
+bool mb_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out, vaddr address,
+                           MMUAccessType access_type, int mmu_idx,
+                           MemOp memop, int size,
+                           bool probe, uintptr_t retaddr)
 {
     MicroBlazeCPU *cpu = MICROBLAZE_CPU(cs);
     CPUMBState *env = &cpu->env;
     MicroBlazeMMULookup lu;
     unsigned int hit;
-    int prot;
-    MemTxAttrs attrs = {};
 
-    attrs.secure = mb_cpu_access_is_secure(cpu, access_type);
+    if (address & ((1 << memop_alignment_bits(memop)) - 1)) {
+        if (probe) {
+            return false;
+        }
+        mb_cpu_do_unaligned_access(cs, address, access_type, mmu_idx, retaddr);
+    }
+
+    memset(out, 0, sizeof(*out));
+    out->attrs.secure = mb_cpu_access_is_secure(cpu, access_type);
+    out->lg_page_size = TARGET_PAGE_BITS;
 
     if (mmu_idx == MMU_NOMMU_IDX) {
         /* MMU disabled or not available.  */
-        address &= TARGET_PAGE_MASK;
-        prot = PAGE_RWX;
-        tlb_set_page_with_attrs(cs, address, address, attrs, prot, mmu_idx,
-                                TARGET_PAGE_SIZE);
+        out->phys_addr = address;
+        out->prot = PAGE_RWX;
         return true;
     }
 
     hit = mmu_translate(cpu, &lu, address, access_type, mmu_idx);
     if (likely(hit)) {
-        uint32_t vaddr = address & TARGET_PAGE_MASK;
+        uint32_t vaddr = address;
         uint32_t paddr = lu.paddr + vaddr - lu.vaddr;
 
         qemu_log_mask(CPU_LOG_MMU, "MMU map mmu=%d v=%x p=%x prot=%x\n",
                       mmu_idx, vaddr, paddr, lu.prot);
-        tlb_set_page_with_attrs(cs, vaddr, paddr, attrs, lu.prot, mmu_idx,
-                                TARGET_PAGE_SIZE);
+
+        out->phys_addr = paddr;
+        out->prot = lu.prot;
         return true;
     }
 
-- 
2.43.0




* [PATCH v2 41/54] target/mips: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (39 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 40/54] target/microblaze: Convert to TCGCPUOps.tlb_fill_align Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:53   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 42/54] target/openrisc: " Richard Henderson
                   ` (13 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/mips/tcg/tcg-internal.h      |  6 +++---
 target/mips/cpu.c                   |  2 +-
 target/mips/tcg/sysemu/tlb_helper.c | 29 ++++++++++++++++++++---------
 3 files changed, 24 insertions(+), 13 deletions(-)

diff --git a/target/mips/tcg/tcg-internal.h b/target/mips/tcg/tcg-internal.h
index aef032c48d..f4b00354af 100644
--- a/target/mips/tcg/tcg-internal.h
+++ b/target/mips/tcg/tcg-internal.h
@@ -61,9 +61,9 @@ void mips_cpu_do_transaction_failed(CPUState *cs, hwaddr physaddr,
                                     MemTxResult response, uintptr_t retaddr);
 void cpu_mips_tlb_flush(CPUMIPSState *env);
 
-bool mips_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
-                       MMUAccessType access_type, int mmu_idx,
-                       bool probe, uintptr_t retaddr);
+bool mips_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out, vaddr address,
+                             MMUAccessType access_type, int mmu_idx,
+                             MemOp memop, int size, bool probe, uintptr_t ra);
 
 void mips_semihosting(CPUMIPSState *env);
 
diff --git a/target/mips/cpu.c b/target/mips/cpu.c
index d0a43b6d5c..3a453c9285 100644
--- a/target/mips/cpu.c
+++ b/target/mips/cpu.c
@@ -556,7 +556,7 @@ static const TCGCPUOps mips_tcg_ops = {
     .restore_state_to_opc = mips_restore_state_to_opc,
 
 #if !defined(CONFIG_USER_ONLY)
-    .tlb_fill = mips_cpu_tlb_fill,
+    .tlb_fill_align = mips_cpu_tlb_fill_align,
     .cpu_exec_interrupt = mips_cpu_exec_interrupt,
     .cpu_exec_halt = mips_cpu_has_work,
     .do_interrupt = mips_cpu_do_interrupt,
diff --git a/target/mips/tcg/sysemu/tlb_helper.c b/target/mips/tcg/sysemu/tlb_helper.c
index e98bb95951..ac76396525 100644
--- a/target/mips/tcg/sysemu/tlb_helper.c
+++ b/target/mips/tcg/sysemu/tlb_helper.c
@@ -904,15 +904,28 @@ refill:
 }
 #endif
 
-bool mips_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
-                       MMUAccessType access_type, int mmu_idx,
-                       bool probe, uintptr_t retaddr)
+bool mips_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out, vaddr address,
+                             MMUAccessType access_type, int mmu_idx,
+                             MemOp memop, int size,
+                             bool probe, uintptr_t retaddr)
 {
     CPUMIPSState *env = cpu_env(cs);
     hwaddr physical;
     int prot;
     int ret = TLBRET_BADADDR;
 
+    if (address & ((1 << memop_alignment_bits(memop)) - 1)) {
+        if (probe) {
+            return false;
+        }
+        mips_cpu_do_unaligned_access(cs, address, access_type,
+                                     mmu_idx, retaddr);
+    }
+
+    memset(out, 0, sizeof(*out));
+    out->attrs = MEMTXATTRS_UNSPECIFIED;
+    out->lg_page_size = TARGET_PAGE_BITS;
+
     /* data access */
     /* XXX: put correct access by using cpu_restore_state() correctly */
     ret = get_physical_address(env, &physical, &prot, address,
@@ -930,9 +943,8 @@ bool mips_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
         break;
     }
     if (ret == TLBRET_MATCH) {
-        tlb_set_page(cs, address & TARGET_PAGE_MASK,
-                     physical & TARGET_PAGE_MASK, prot,
-                     mmu_idx, TARGET_PAGE_SIZE);
+        out->phys_addr = physical;
+        out->prot = prot;
         return true;
     }
 #if !defined(TARGET_MIPS64)
@@ -948,9 +960,8 @@ bool mips_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
             ret = get_physical_address(env, &physical, &prot, address,
                                        access_type, mmu_idx);
             if (ret == TLBRET_MATCH) {
-                tlb_set_page(cs, address & TARGET_PAGE_MASK,
-                             physical & TARGET_PAGE_MASK, prot,
-                             mmu_idx, TARGET_PAGE_SIZE);
+                out->phys_addr = physical;
+                out->prot = prot;
                 return true;
             }
         }
-- 
2.43.0




* [PATCH v2 42/54] target/openrisc: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (40 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 41/54] target/mips: " Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:53   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 43/54] target/ppc: " Richard Henderson
                   ` (12 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/openrisc/cpu.h |  8 +++++---
 target/openrisc/cpu.c |  2 +-
 target/openrisc/mmu.c | 39 +++++++++++++++++++++------------------
 3 files changed, 27 insertions(+), 22 deletions(-)

diff --git a/target/openrisc/cpu.h b/target/openrisc/cpu.h
index c9fe9ae12d..e177ad8b84 100644
--- a/target/openrisc/cpu.h
+++ b/target/openrisc/cpu.h
@@ -22,6 +22,7 @@
 
 #include "cpu-qom.h"
 #include "exec/cpu-defs.h"
+#include "exec/memop.h"
 #include "fpu/softfloat-types.h"
 
 /**
@@ -306,9 +307,10 @@ int print_insn_or1k(bfd_vma addr, disassemble_info *info);
 #ifndef CONFIG_USER_ONLY
 hwaddr openrisc_cpu_get_phys_page_debug(CPUState *cpu, vaddr addr);
 
-bool openrisc_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
-                           MMUAccessType access_type, int mmu_idx,
-                           bool probe, uintptr_t retaddr);
+bool openrisc_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
+                                 vaddr addr, MMUAccessType access_type,
+                                 int mmu_idx, MemOp memop, int size,
+                                 bool probe, uintptr_t ra);
 
 extern const VMStateDescription vmstate_openrisc_cpu;
 
diff --git a/target/openrisc/cpu.c b/target/openrisc/cpu.c
index b96561d1f2..6aa04ff7d3 100644
--- a/target/openrisc/cpu.c
+++ b/target/openrisc/cpu.c
@@ -237,7 +237,7 @@ static const TCGCPUOps openrisc_tcg_ops = {
     .restore_state_to_opc = openrisc_restore_state_to_opc,
 
 #ifndef CONFIG_USER_ONLY
-    .tlb_fill = openrisc_cpu_tlb_fill,
+    .tlb_fill_align = openrisc_cpu_tlb_fill_align,
     .cpu_exec_interrupt = openrisc_cpu_exec_interrupt,
     .cpu_exec_halt = openrisc_cpu_has_work,
     .do_interrupt = openrisc_cpu_do_interrupt,
diff --git a/target/openrisc/mmu.c b/target/openrisc/mmu.c
index c632d5230b..eafab356a6 100644
--- a/target/openrisc/mmu.c
+++ b/target/openrisc/mmu.c
@@ -104,39 +104,42 @@ static void raise_mmu_exception(OpenRISCCPU *cpu, target_ulong address,
     cpu->env.lock_addr = -1;
 }
 
-bool openrisc_cpu_tlb_fill(CPUState *cs, vaddr addr, int size,
-                           MMUAccessType access_type, int mmu_idx,
-                           bool probe, uintptr_t retaddr)
+bool openrisc_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
+                                 vaddr addr, MMUAccessType access_type,
+                                 int mmu_idx, MemOp memop, int size,
+                                 bool probe, uintptr_t retaddr)
 {
     OpenRISCCPU *cpu = OPENRISC_CPU(cs);
-    int excp = EXCP_DPF;
     int prot;
     hwaddr phys_addr;
 
+    /* TODO: alignment faults not currently handled. */
+
     if (mmu_idx == MMU_NOMMU_IDX) {
         /* The mmu is disabled; lookups never fail.  */
         get_phys_nommu(&phys_addr, &prot, addr);
-        excp = 0;
     } else {
         bool super = mmu_idx == MMU_SUPERVISOR_IDX;
         int need = (access_type == MMU_INST_FETCH ? PAGE_EXEC
                     : access_type == MMU_DATA_STORE ? PAGE_WRITE
                     : PAGE_READ);
-        excp = get_phys_mmu(cpu, &phys_addr, &prot, addr, need, super);
+        int excp = get_phys_mmu(cpu, &phys_addr, &prot, addr, need, super);
+
+        if (unlikely(excp)) {
+            if (probe) {
+                return false;
+            }
+            raise_mmu_exception(cpu, addr, excp);
+            cpu_loop_exit_restore(cs, retaddr);
+        }
     }
 
-    if (likely(excp == 0)) {
-        tlb_set_page(cs, addr & TARGET_PAGE_MASK,
-                     phys_addr & TARGET_PAGE_MASK, prot,
-                     mmu_idx, TARGET_PAGE_SIZE);
-        return true;
-    }
-    if (probe) {
-        return false;
-    }
-
-    raise_mmu_exception(cpu, addr, excp);
-    cpu_loop_exit_restore(cs, retaddr);
+    memset(out, 0, sizeof(*out));
+    out->phys_addr = phys_addr;
+    out->prot = prot;
+    out->lg_page_size = TARGET_PAGE_BITS;
+    out->attrs = MEMTXATTRS_UNSPECIFIED;
+    return true;
 }
 
 hwaddr openrisc_cpu_get_phys_page_debug(CPUState *cs, vaddr addr)
-- 
2.43.0




* [PATCH v2 43/54] target/ppc: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (41 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 42/54] target/openrisc: " Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:53   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 44/54] target/riscv: " Richard Henderson
                   ` (11 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/ppc/internal.h   |  7 ++++---
 target/ppc/cpu_init.c   |  2 +-
 target/ppc/mmu_helper.c | 21 ++++++++++++++++-----
 3 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/target/ppc/internal.h b/target/ppc/internal.h
index 20fb2ec593..9d132d35a1 100644
--- a/target/ppc/internal.h
+++ b/target/ppc/internal.h
@@ -273,9 +273,10 @@ void ppc_cpu_record_sigsegv(CPUState *cs, vaddr addr,
                             MMUAccessType access_type,
                             bool maperr, uintptr_t ra);
 #else
-bool ppc_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
-                      MMUAccessType access_type, int mmu_idx,
-                      bool probe, uintptr_t retaddr);
+bool ppc_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
+                            vaddr addr, MMUAccessType access_type,
+                            int mmu_idx, MemOp memop, int size,
+                            bool probe, uintptr_t ra);
 G_NORETURN void ppc_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
                                             MMUAccessType access_type, int mmu_idx,
                                             uintptr_t retaddr);
diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
index efcb80d1c2..387c7ff2da 100644
--- a/target/ppc/cpu_init.c
+++ b/target/ppc/cpu_init.c
@@ -7422,7 +7422,7 @@ static const TCGCPUOps ppc_tcg_ops = {
 #ifdef CONFIG_USER_ONLY
   .record_sigsegv = ppc_cpu_record_sigsegv,
 #else
-  .tlb_fill = ppc_cpu_tlb_fill,
+  .tlb_fill_align = ppc_cpu_tlb_fill_align,
   .cpu_exec_interrupt = ppc_cpu_exec_interrupt,
   .cpu_exec_halt = ppc_cpu_has_work,
   .do_interrupt = ppc_cpu_do_interrupt,
diff --git a/target/ppc/mmu_helper.c b/target/ppc/mmu_helper.c
index b167b37e0a..bf98e0efb0 100644
--- a/target/ppc/mmu_helper.c
+++ b/target/ppc/mmu_helper.c
@@ -1357,18 +1357,29 @@ void helper_check_tlb_flush_global(CPUPPCState *env)
 }
 
 
-bool ppc_cpu_tlb_fill(CPUState *cs, vaddr eaddr, int size,
-                      MMUAccessType access_type, int mmu_idx,
-                      bool probe, uintptr_t retaddr)
+bool ppc_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
+                            vaddr eaddr, MMUAccessType access_type,
+                            int mmu_idx, MemOp memop, int size,
+                            bool probe, uintptr_t retaddr)
 {
     PowerPCCPU *cpu = POWERPC_CPU(cs);
     hwaddr raddr;
     int page_size, prot;
 
+    if (eaddr & ((1 << memop_alignment_bits(memop)) - 1)) {
+        if (probe) {
+            return false;
+        }
+        ppc_cpu_do_unaligned_access(cs, eaddr, access_type, mmu_idx, retaddr);
+    }
+
     if (ppc_xlate(cpu, eaddr, access_type, &raddr,
                   &page_size, &prot, mmu_idx, !probe)) {
-        tlb_set_page(cs, eaddr & TARGET_PAGE_MASK, raddr & TARGET_PAGE_MASK,
-                     prot, mmu_idx, 1UL << page_size);
+        memset(out, 0, sizeof(*out));
+        out->phys_addr = raddr;
+        out->prot = prot;
+        out->lg_page_size = page_size;
+        out->attrs = MEMTXATTRS_UNSPECIFIED;
         return true;
     }
     if (probe) {
-- 
2.43.0




* [PATCH v2 44/54] target/riscv: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (42 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 43/54] target/ppc: " Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:54   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 45/54] target/rx: " Richard Henderson
                   ` (10 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/riscv/cpu.h         |  8 +++++---
 target/riscv/cpu_helper.c  | 22 +++++++++++++++++-----
 target/riscv/tcg/tcg-cpu.c |  2 +-
 3 files changed, 23 insertions(+), 9 deletions(-)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 284b112821..f97c4f3410 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -25,6 +25,7 @@
 #include "hw/qdev-properties.h"
 #include "exec/cpu-defs.h"
 #include "exec/gdbstub.h"
+#include "exec/memop.h"
 #include "qemu/cpu-float.h"
 #include "qom/object.h"
 #include "qemu/int128.h"
@@ -563,9 +564,10 @@ bool cpu_get_bcfien(CPURISCVState *env);
 G_NORETURN void  riscv_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
                                                MMUAccessType access_type,
                                                int mmu_idx, uintptr_t retaddr);
-bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
-                        MMUAccessType access_type, int mmu_idx,
-                        bool probe, uintptr_t retaddr);
+bool riscv_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
+                              vaddr addr, MMUAccessType access_type,
+                              int mmu_idx, MemOp memop, int size,
+                              bool probe, uintptr_t ra);
 char *riscv_isa_string(RISCVCPU *cpu);
 int riscv_cpu_max_xlen(RISCVCPUClass *mcc);
 bool riscv_cpu_option_set(const char *optname);
diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index 0a3ead69ea..edb2edfc55 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -1429,9 +1429,10 @@ static void pmu_tlb_fill_incr_ctr(RISCVCPU *cpu, MMUAccessType access_type)
     riscv_pmu_incr_ctr(cpu, pmu_event_type);
 }
 
-bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
-                        MMUAccessType access_type, int mmu_idx,
-                        bool probe, uintptr_t retaddr)
+bool riscv_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
+                              vaddr address, MMUAccessType access_type,
+                              int mmu_idx, MemOp memop, int size,
+                              bool probe, uintptr_t retaddr)
 {
     RISCVCPU *cpu = RISCV_CPU(cs);
     CPURISCVState *env = &cpu->env;
@@ -1452,6 +1453,14 @@ bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
     qemu_log_mask(CPU_LOG_MMU, "%s ad %" VADDR_PRIx " rw %d mmu_idx %d\n",
                   __func__, address, access_type, mmu_idx);
 
+    if (address & ((1 << memop_alignment_bits(memop)) - 1)) {
+        if (probe) {
+            return false;
+        }
+        riscv_cpu_do_unaligned_access(cs, address, access_type,
+                                      mmu_idx, retaddr);
+    }
+
     pmu_tlb_fill_incr_ctr(cpu, access_type);
     if (two_stage_lookup) {
         /* Two stage lookup */
@@ -1544,8 +1553,11 @@ bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
     }
 
     if (ret == TRANSLATE_SUCCESS) {
-        tlb_set_page(cs, address & ~(tlb_size - 1), pa & ~(tlb_size - 1),
-                     prot, mmu_idx, tlb_size);
+        memset(out, 0, sizeof(*out));
+        out->phys_addr = pa;
+        out->prot = prot;
+        out->lg_page_size = ctz64(tlb_size);
+        out->attrs = MEMTXATTRS_UNSPECIFIED;
         return true;
     } else if (probe) {
         return false;
diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
index c62c221696..f3b436bb86 100644
--- a/target/riscv/tcg/tcg-cpu.c
+++ b/target/riscv/tcg/tcg-cpu.c
@@ -138,7 +138,7 @@ static const TCGCPUOps riscv_tcg_ops = {
     .restore_state_to_opc = riscv_restore_state_to_opc,
 
 #ifndef CONFIG_USER_ONLY
-    .tlb_fill = riscv_cpu_tlb_fill,
+    .tlb_fill_align = riscv_cpu_tlb_fill_align,
     .cpu_exec_interrupt = riscv_cpu_exec_interrupt,
     .cpu_exec_halt = riscv_cpu_has_work,
     .do_interrupt = riscv_cpu_do_interrupt,
-- 
2.43.0




* [PATCH v2 45/54] target/rx: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (43 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 44/54] target/riscv: " Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:54   ` Pierrick Bouvier
  2024-11-14 18:54   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 46/54] target/s390x: " Richard Henderson
                   ` (9 subsequent siblings)
  54 siblings, 2 replies; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/rx/cpu.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/target/rx/cpu.c b/target/rx/cpu.c
index 65a74ce720..c83a582141 100644
--- a/target/rx/cpu.c
+++ b/target/rx/cpu.c
@@ -161,16 +161,19 @@ static void rx_cpu_disas_set_info(CPUState *cpu, disassemble_info *info)
     info->print_insn = print_insn_rx;
 }
 
-static bool rx_cpu_tlb_fill(CPUState *cs, vaddr addr, int size,
-                            MMUAccessType access_type, int mmu_idx,
-                            bool probe, uintptr_t retaddr)
+static bool rx_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
+                                  vaddr addr, MMUAccessType access_type,
+                                  int mmu_idx, MemOp memop, int size,
+                                  bool probe, uintptr_t retaddr)
 {
-    uint32_t address, physical, prot;
+    /* TODO: alignment faults not currently handled. */
 
     /* Linear mapping */
-    address = physical = addr & TARGET_PAGE_MASK;
-    prot = PAGE_READ | PAGE_WRITE | PAGE_EXEC;
-    tlb_set_page(cs, address, physical, prot, mmu_idx, TARGET_PAGE_SIZE);
+    memset(out, 0, sizeof(*out));
+    out->phys_addr = addr;
+    out->prot = PAGE_READ | PAGE_WRITE | PAGE_EXEC;
+    out->lg_page_size = TARGET_PAGE_BITS;
+    out->attrs = MEMTXATTRS_UNSPECIFIED;
     return true;
 }
 
@@ -195,7 +198,7 @@ static const TCGCPUOps rx_tcg_ops = {
     .initialize = rx_translate_init,
     .synchronize_from_tb = rx_cpu_synchronize_from_tb,
     .restore_state_to_opc = rx_restore_state_to_opc,
-    .tlb_fill = rx_cpu_tlb_fill,
+    .tlb_fill_align = rx_cpu_tlb_fill_align,
 
 #ifndef CONFIG_USER_ONLY
     .cpu_exec_interrupt = rx_cpu_exec_interrupt,
-- 
2.43.0




* [PATCH v2 46/54] target/s390x: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (44 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 45/54] target/rx: " Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:54   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 47/54] target/sh4: " Richard Henderson
                   ` (8 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/s390x/s390x-internal.h  |  7 ++++---
 target/s390x/cpu.c             |  4 ++--
 target/s390x/tcg/excp_helper.c | 23 ++++++++++++++++++-----
 3 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/target/s390x/s390x-internal.h b/target/s390x/s390x-internal.h
index 825252d728..eb6fe24c9a 100644
--- a/target/s390x/s390x-internal.h
+++ b/target/s390x/s390x-internal.h
@@ -278,9 +278,10 @@ void s390_cpu_record_sigsegv(CPUState *cs, vaddr address,
 void s390_cpu_record_sigbus(CPUState *cs, vaddr address,
                             MMUAccessType access_type, uintptr_t retaddr);
 #else
-bool s390_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
-                       MMUAccessType access_type, int mmu_idx,
-                       bool probe, uintptr_t retaddr);
+bool s390x_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
+                              vaddr addr, MMUAccessType access_type,
+                              int mmu_idx, MemOp memop, int size,
+                              bool probe, uintptr_t retaddr);
 G_NORETURN void s390x_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
                                               MMUAccessType access_type, int mmu_idx,
                                               uintptr_t retaddr);
diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
index 514c70f301..4d0eb129e3 100644
--- a/target/s390x/cpu.c
+++ b/target/s390x/cpu.c
@@ -330,7 +330,7 @@ void cpu_get_tb_cpu_state(CPUS390XState *env, vaddr *pc,
          * Instructions must be at even addresses.
          * This needs to be checked before address translation.
          */
-        env->int_pgm_ilen = 2; /* see s390_cpu_tlb_fill() */
+        env->int_pgm_ilen = 2; /* see s390x_cpu_tlb_fill_align() */
         tcg_s390_program_interrupt(env, PGM_SPECIFICATION, 0);
     }
 
@@ -364,7 +364,7 @@ static const TCGCPUOps s390_tcg_ops = {
     .record_sigsegv = s390_cpu_record_sigsegv,
     .record_sigbus = s390_cpu_record_sigbus,
 #else
-    .tlb_fill = s390_cpu_tlb_fill,
+    .tlb_fill_align = s390x_cpu_tlb_fill_align,
     .cpu_exec_interrupt = s390_cpu_exec_interrupt,
     .cpu_exec_halt = s390_cpu_has_work,
     .do_interrupt = s390_cpu_do_interrupt,
diff --git a/target/s390x/tcg/excp_helper.c b/target/s390x/tcg/excp_helper.c
index 4c0b692c9e..6d61032a4a 100644
--- a/target/s390x/tcg/excp_helper.c
+++ b/target/s390x/tcg/excp_helper.c
@@ -139,9 +139,10 @@ static inline uint64_t cpu_mmu_idx_to_asc(int mmu_idx)
     }
 }
 
-bool s390_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
-                       MMUAccessType access_type, int mmu_idx,
-                       bool probe, uintptr_t retaddr)
+bool s390x_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
+                              vaddr address, MMUAccessType access_type,
+                              int mmu_idx, MemOp memop, int size,
+                              bool probe, uintptr_t retaddr)
 {
     CPUS390XState *env = cpu_env(cs);
     target_ulong vaddr, raddr;
@@ -151,6 +152,14 @@ bool s390_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
     qemu_log_mask(CPU_LOG_MMU, "%s: addr 0x%" VADDR_PRIx " rw %d mmu_idx %d\n",
                   __func__, address, access_type, mmu_idx);
 
+    if (address & ((1 << memop_alignment_bits(memop)) - 1)) {
+        if (probe) {
+            return false;
+        }
+        s390x_cpu_do_unaligned_access(cs, address, access_type,
+                                      mmu_idx, retaddr);
+    }
+
     vaddr = address;
 
     if (mmu_idx < MMU_REAL_IDX) {
@@ -177,8 +186,12 @@ bool s390_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
         qemu_log_mask(CPU_LOG_MMU,
                       "%s: set tlb %" PRIx64 " -> %" PRIx64 " (%x)\n",
                       __func__, (uint64_t)vaddr, (uint64_t)raddr, prot);
-        tlb_set_page(cs, address & TARGET_PAGE_MASK, raddr, prot,
-                     mmu_idx, TARGET_PAGE_SIZE);
+
+        memset(out, 0, sizeof(*out));
+        out->phys_addr = raddr;
+        out->prot = prot;
+        out->lg_page_size = TARGET_PAGE_BITS;
+        out->attrs = MEMTXATTRS_UNSPECIFIED;
         return true;
     }
     if (probe) {
-- 
2.43.0




* [PATCH v2 47/54] target/sh4: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (45 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 46/54] target/s390x: " Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:54   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 48/54] target/sparc: " Richard Henderson
                   ` (7 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/sh4/cpu.h    |  8 +++++---
 target/sh4/cpu.c    |  2 +-
 target/sh4/helper.c | 24 +++++++++++++++++-------
 3 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/target/sh4/cpu.h b/target/sh4/cpu.h
index d928bcf006..161efdefcf 100644
--- a/target/sh4/cpu.h
+++ b/target/sh4/cpu.h
@@ -22,6 +22,7 @@
 
 #include "cpu-qom.h"
 #include "exec/cpu-defs.h"
+#include "exec/memop.h"
 #include "qemu/cpu-float.h"
 
 /* CPU Subtypes */
@@ -251,9 +252,10 @@ void sh4_translate_init(void);
 
 #if !defined(CONFIG_USER_ONLY)
 hwaddr superh_cpu_get_phys_page_debug(CPUState *cpu, vaddr addr);
-bool superh_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
-                         MMUAccessType access_type, int mmu_idx,
-                         bool probe, uintptr_t retaddr);
+bool superh_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
+                               vaddr addr, MMUAccessType access_type,
+                               int mmu_idx, MemOp memop, int size,
+                               bool probe, uintptr_t retaddr);
 void superh_cpu_do_interrupt(CPUState *cpu);
 bool superh_cpu_exec_interrupt(CPUState *cpu, int int_req);
 void cpu_sh4_invalidate_tlb(CPUSH4State *s);
diff --git a/target/sh4/cpu.c b/target/sh4/cpu.c
index 8f07261dcf..8ca8b90e3c 100644
--- a/target/sh4/cpu.c
+++ b/target/sh4/cpu.c
@@ -252,7 +252,7 @@ static const TCGCPUOps superh_tcg_ops = {
     .restore_state_to_opc = superh_restore_state_to_opc,
 
 #ifndef CONFIG_USER_ONLY
-    .tlb_fill = superh_cpu_tlb_fill,
+    .tlb_fill_align = superh_cpu_tlb_fill_align,
     .cpu_exec_interrupt = superh_cpu_exec_interrupt,
     .cpu_exec_halt = superh_cpu_has_work,
     .do_interrupt = superh_cpu_do_interrupt,
diff --git a/target/sh4/helper.c b/target/sh4/helper.c
index 9659c69550..543ac1b843 100644
--- a/target/sh4/helper.c
+++ b/target/sh4/helper.c
@@ -792,22 +792,32 @@ bool superh_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
     return false;
 }
 
-bool superh_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
-                         MMUAccessType access_type, int mmu_idx,
-                         bool probe, uintptr_t retaddr)
+bool superh_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
+                               vaddr address, MMUAccessType access_type,
+                               int mmu_idx, MemOp memop, int size,
+                               bool probe, uintptr_t retaddr)
 {
     CPUSH4State *env = cpu_env(cs);
     int ret;
-
     target_ulong physical;
     int prot;
 
+    if (address & ((1 << memop_alignment_bits(memop)) - 1)) {
+        if (probe) {
+            return false;
+        }
+        superh_cpu_do_unaligned_access(cs, address, access_type,
+                                       mmu_idx, retaddr);
+    }
+
     ret = get_physical_address(env, &physical, &prot, address, access_type);
 
     if (ret == MMU_OK) {
-        address &= TARGET_PAGE_MASK;
-        physical &= TARGET_PAGE_MASK;
-        tlb_set_page(cs, address, physical, prot, mmu_idx, TARGET_PAGE_SIZE);
+        memset(out, 0, sizeof(*out));
+        out->phys_addr = physical;
+        out->prot = prot;
+        out->lg_page_size = TARGET_PAGE_BITS;
+        out->attrs = MEMTXATTRS_UNSPECIFIED;
         return true;
     }
     if (probe) {
-- 
2.43.0




* [PATCH v2 48/54] target/sparc: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (46 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 47/54] target/sh4: " Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:54   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 49/54] target/tricore: " Richard Henderson
                   ` (6 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/sparc/cpu.h        |  8 ++++---
 target/sparc/cpu.c        |  2 +-
 target/sparc/mmu_helper.c | 44 +++++++++++++++++++++++++--------------
 3 files changed, 34 insertions(+), 20 deletions(-)

diff --git a/target/sparc/cpu.h b/target/sparc/cpu.h
index f517e5a383..4c8927e9fa 100644
--- a/target/sparc/cpu.h
+++ b/target/sparc/cpu.h
@@ -4,6 +4,7 @@
 #include "qemu/bswap.h"
 #include "cpu-qom.h"
 #include "exec/cpu-defs.h"
+#include "exec/memop.h"
 #include "qemu/cpu-float.h"
 
 #if !defined(TARGET_SPARC64)
@@ -596,9 +597,10 @@ G_NORETURN void cpu_raise_exception_ra(CPUSPARCState *, int, uintptr_t);
 void cpu_sparc_set_id(CPUSPARCState *env, unsigned int cpu);
 void sparc_cpu_list(void);
 /* mmu_helper.c */
-bool sparc_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
-                        MMUAccessType access_type, int mmu_idx,
-                        bool probe, uintptr_t retaddr);
+bool sparc_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
+                              vaddr addr, MMUAccessType access_type,
+                              int mmu_idx, MemOp memop, int size,
+                              bool probe, uintptr_t retaddr);
 target_ulong mmu_probe(CPUSPARCState *env, target_ulong address, int mmulev);
 void dump_mmu(CPUSPARCState *env);
 
diff --git a/target/sparc/cpu.c b/target/sparc/cpu.c
index dd7af86de7..57ae53bd71 100644
--- a/target/sparc/cpu.c
+++ b/target/sparc/cpu.c
@@ -932,7 +932,7 @@ static const TCGCPUOps sparc_tcg_ops = {
     .restore_state_to_opc = sparc_restore_state_to_opc,
 
 #ifndef CONFIG_USER_ONLY
-    .tlb_fill = sparc_cpu_tlb_fill,
+    .tlb_fill_align = sparc_cpu_tlb_fill_align,
     .cpu_exec_interrupt = sparc_cpu_exec_interrupt,
     .cpu_exec_halt = sparc_cpu_has_work,
     .do_interrupt = sparc_cpu_do_interrupt,
diff --git a/target/sparc/mmu_helper.c b/target/sparc/mmu_helper.c
index 9ff06026b8..32766a37d6 100644
--- a/target/sparc/mmu_helper.c
+++ b/target/sparc/mmu_helper.c
@@ -203,12 +203,12 @@ static int get_physical_address(CPUSPARCState *env, CPUTLBEntryFull *full,
 }
 
 /* Perform address translation */
-bool sparc_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
-                        MMUAccessType access_type, int mmu_idx,
-                        bool probe, uintptr_t retaddr)
+bool sparc_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
+                              vaddr address, MMUAccessType access_type,
+                              int mmu_idx, MemOp memop, int size,
+                              bool probe, uintptr_t retaddr)
 {
     CPUSPARCState *env = cpu_env(cs);
-    CPUTLBEntryFull full = {};
     target_ulong vaddr;
     int error_code = 0, access_index;
 
@@ -220,16 +220,21 @@ bool sparc_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
      */
     assert(!probe);
 
+    if (address & ((1 << memop_alignment_bits(memop)) - 1)) {
+        sparc_cpu_do_unaligned_access(cs, address, access_type,
+                                      mmu_idx, retaddr);
+    }
+
+    memset(out, 0, sizeof(*out));
     address &= TARGET_PAGE_MASK;
-    error_code = get_physical_address(env, &full, &access_index,
+    error_code = get_physical_address(env, out, &access_index,
                                       address, access_type, mmu_idx);
     vaddr = address;
     if (likely(error_code == 0)) {
         qemu_log_mask(CPU_LOG_MMU,
                       "Translate at %" VADDR_PRIx " -> "
                       HWADDR_FMT_plx ", vaddr " TARGET_FMT_lx "\n",
-                      address, full.phys_addr, vaddr);
-        tlb_set_page_full(cs, mmu_idx, vaddr, &full);
+                      address, out->phys_addr, vaddr);
         return true;
     }
 
@@ -244,8 +249,7 @@ bool sparc_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
            permissions. If no mapping is available, redirect accesses to
            neverland. Fake/overridden mappings will be flushed when
            switching to normal mode. */
-        full.prot = PAGE_READ | PAGE_WRITE | PAGE_EXEC;
-        tlb_set_page_full(cs, mmu_idx, vaddr, &full);
+        out->prot = PAGE_READ | PAGE_WRITE | PAGE_EXEC;
         return true;
     } else {
         if (access_type == MMU_INST_FETCH) {
@@ -754,22 +758,30 @@ static int get_physical_address(CPUSPARCState *env, CPUTLBEntryFull *full,
 }
 
 /* Perform address translation */
-bool sparc_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
-                        MMUAccessType access_type, int mmu_idx,
-                        bool probe, uintptr_t retaddr)
+bool sparc_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
+                              vaddr address, MMUAccessType access_type,
+                              int mmu_idx, MemOp memop, int size,
+                              bool probe, uintptr_t retaddr)
 {
     CPUSPARCState *env = cpu_env(cs);
-    CPUTLBEntryFull full = {};
     int error_code = 0, access_index;
 
+    if (address & ((1 << memop_alignment_bits(memop)) - 1)) {
+        if (probe) {
+            return false;
+        }
+        sparc_cpu_do_unaligned_access(cs, address, access_type,
+                                      mmu_idx, retaddr);
+    }
+
+    memset(out, 0, sizeof(*out));
     address &= TARGET_PAGE_MASK;
-    error_code = get_physical_address(env, &full, &access_index,
+    error_code = get_physical_address(env, out, &access_index,
                                       address, access_type, mmu_idx);
     if (likely(error_code == 0)) {
-        trace_mmu_helper_mmu_fault(address, full.phys_addr, mmu_idx, env->tl,
+        trace_mmu_helper_mmu_fault(address, out->phys_addr, mmu_idx, env->tl,
                                    env->dmmu.mmu_primary_context,
                                    env->dmmu.mmu_secondary_context);
-        tlb_set_page_full(cs, mmu_idx, address, &full);
         return true;
     }
     if (probe) {
-- 
2.43.0
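
The alignment check added above is the pattern repeated across the target conversions in this series: memop_alignment_bits() extracts the required alignment (in bits) from the MemOp, and any set low bit in the address means the access is misaligned. For reference, the helper in include/exec/memop.h is roughly:

    static inline unsigned memop_alignment_bits(MemOp memop)
    {
        unsigned a = memop & MO_AMASK;

        if (a == MO_UNALN) {
            /* No alignment required. */
            a = 0;
        } else if (a == MO_ALIGN) {
            /* A natural alignment requirement. */
            a = memop & MO_SIZE;
        } else {
            /* A specific alignment requirement. */
            a = a >> MO_ASHIFT;
        }
        return a;
    }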




* [PATCH v2 49/54] target/tricore: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (47 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 48/54] target/sparc: " Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:54   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 50/54] target/xtensa: " Richard Henderson
                   ` (5 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/tricore/cpu.h    |  7 ++++---
 target/tricore/cpu.c    |  2 +-
 target/tricore/helper.c | 19 ++++++++++++-------
 3 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/target/tricore/cpu.h b/target/tricore/cpu.h
index 220af69fc2..5f141ce8f3 100644
--- a/target/tricore/cpu.h
+++ b/target/tricore/cpu.h
@@ -268,8 +268,9 @@ static inline void cpu_get_tb_cpu_state(CPUTriCoreState *env, vaddr *pc,
 #define CPU_RESOLVING_TYPE TYPE_TRICORE_CPU
 
 /* helpers.c */
-bool tricore_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
-                          MMUAccessType access_type, int mmu_idx,
-                          bool probe, uintptr_t retaddr);
+bool tricore_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
+                                vaddr addr, MMUAccessType access_type,
+                                int mmu_idx, MemOp memop, int size,
+                                bool probe, uintptr_t retaddr);
 
 #endif /* TRICORE_CPU_H */
diff --git a/target/tricore/cpu.c b/target/tricore/cpu.c
index 1a26171590..29e0b5d129 100644
--- a/target/tricore/cpu.c
+++ b/target/tricore/cpu.c
@@ -173,7 +173,7 @@ static const TCGCPUOps tricore_tcg_ops = {
     .initialize = tricore_tcg_init,
     .synchronize_from_tb = tricore_cpu_synchronize_from_tb,
     .restore_state_to_opc = tricore_restore_state_to_opc,
-    .tlb_fill = tricore_cpu_tlb_fill,
+    .tlb_fill_align = tricore_cpu_tlb_fill_align,
     .cpu_exec_interrupt = tricore_cpu_exec_interrupt,
     .cpu_exec_halt = tricore_cpu_has_work,
 };
diff --git a/target/tricore/helper.c b/target/tricore/helper.c
index 7014255f77..8c6bf63298 100644
--- a/target/tricore/helper.c
+++ b/target/tricore/helper.c
@@ -64,16 +64,19 @@ static void raise_mmu_exception(CPUTriCoreState *env, target_ulong address,
 {
 }
 
-bool tricore_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
-                          MMUAccessType rw, int mmu_idx,
-                          bool probe, uintptr_t retaddr)
+bool tricore_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
+                                vaddr address, MMUAccessType access_type,
+                                int mmu_idx, MemOp memop, int size,
+                                bool probe, uintptr_t retaddr)
 {
     CPUTriCoreState *env = cpu_env(cs);
     hwaddr physical;
     int prot;
     int ret = 0;
+    int rw = access_type & 1;
+
+    /* TODO: alignment faults not currently handled. */
 
-    rw &= 1;
     ret = get_physical_address(env, &physical, &prot,
                                address, rw, mmu_idx);
 
@@ -82,9 +85,11 @@ bool tricore_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
                   __func__, address, ret, physical, prot);
 
     if (ret == TLBRET_MATCH) {
-        tlb_set_page(cs, address & TARGET_PAGE_MASK,
-                     physical & TARGET_PAGE_MASK, prot | PAGE_EXEC,
-                     mmu_idx, TARGET_PAGE_SIZE);
+        memset(out, 0, sizeof(*out));
+        out->phys_addr = physical;
+        out->prot = prot | PAGE_EXEC;
+        out->lg_page_size = TARGET_PAGE_BITS;
+        out->attrs = MEMTXATTRS_UNSPECIFIED;
         return true;
     } else {
         assert(ret < 0);
-- 
2.43.0




* [PATCH v2 50/54] target/xtensa: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (48 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 49/54] target/tricore: " Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:54   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 51/54] accel/tcg: Drop TCGCPUOps.tlb_fill Richard Henderson
                   ` (4 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/xtensa/cpu.h    |  8 +++++---
 target/xtensa/cpu.c    |  2 +-
 target/xtensa/helper.c | 28 ++++++++++++++++++++--------
 3 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/target/xtensa/cpu.h b/target/xtensa/cpu.h
index 77e48eef19..68c3d90d41 100644
--- a/target/xtensa/cpu.h
+++ b/target/xtensa/cpu.h
@@ -31,6 +31,7 @@
 #include "cpu-qom.h"
 #include "qemu/cpu-float.h"
 #include "exec/cpu-defs.h"
+#include "exec/memop.h"
 #include "hw/clock.h"
 #include "xtensa-isa.h"
 
@@ -580,9 +581,10 @@ struct XtensaCPUClass {
 };
 
 #ifndef CONFIG_USER_ONLY
-bool xtensa_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
-                         MMUAccessType access_type, int mmu_idx,
-                         bool probe, uintptr_t retaddr);
+bool xtensa_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
+                               vaddr addr, MMUAccessType access_type,
+                               int mmu_idx, MemOp memop, int size,
+                               bool probe, uintptr_t retaddr);
 void xtensa_cpu_do_interrupt(CPUState *cpu);
 bool xtensa_cpu_exec_interrupt(CPUState *cpu, int interrupt_request);
 void xtensa_cpu_do_transaction_failed(CPUState *cs, hwaddr physaddr, vaddr addr,
diff --git a/target/xtensa/cpu.c b/target/xtensa/cpu.c
index 6f9039abae..3e4ec97e0e 100644
--- a/target/xtensa/cpu.c
+++ b/target/xtensa/cpu.c
@@ -232,7 +232,7 @@ static const TCGCPUOps xtensa_tcg_ops = {
     .restore_state_to_opc = xtensa_restore_state_to_opc,
 
 #ifndef CONFIG_USER_ONLY
-    .tlb_fill = xtensa_cpu_tlb_fill,
+    .tlb_fill_align = xtensa_cpu_tlb_fill_align,
     .cpu_exec_interrupt = xtensa_cpu_exec_interrupt,
     .cpu_exec_halt = xtensa_cpu_has_work,
     .do_interrupt = xtensa_cpu_do_interrupt,
diff --git a/target/xtensa/helper.c b/target/xtensa/helper.c
index ca214b948a..69b0e661c8 100644
--- a/target/xtensa/helper.c
+++ b/target/xtensa/helper.c
@@ -261,15 +261,26 @@ void xtensa_cpu_do_unaligned_access(CPUState *cs,
                                   addr);
 }
 
-bool xtensa_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
-                         MMUAccessType access_type, int mmu_idx,
-                         bool probe, uintptr_t retaddr)
+bool xtensa_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
+                               vaddr address, MMUAccessType access_type,
+                               int mmu_idx, MemOp memop, int size,
+                               bool probe, uintptr_t retaddr)
 {
     CPUXtensaState *env = cpu_env(cs);
     uint32_t paddr;
     uint32_t page_size;
     unsigned access;
-    int ret = xtensa_get_physical_addr(env, true, address, access_type,
+    int ret;
+
+    if (address & ((1 << memop_alignment_bits(memop)) - 1)) {
+        if (probe) {
+            return false;
+        }
+        xtensa_cpu_do_unaligned_access(cs, address, access_type,
+                                       mmu_idx, retaddr);
+    }
+
+    ret = xtensa_get_physical_addr(env, true, address, access_type,
                                        mmu_idx, &paddr, &page_size, &access);
 
     qemu_log_mask(CPU_LOG_MMU, "%s(%08" VADDR_PRIx
@@ -277,10 +288,11 @@ bool xtensa_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
                   __func__, address, access_type, mmu_idx, paddr, ret);
 
     if (ret == 0) {
-        tlb_set_page(cs,
-                     address & TARGET_PAGE_MASK,
-                     paddr & TARGET_PAGE_MASK,
-                     access, mmu_idx, page_size);
+        memset(out, 0, sizeof(*out));
+        out->phys_addr = paddr;
+        out->prot = access;
+        out->lg_page_size = ctz32(page_size);
+        out->attrs = MEMTXATTRS_UNSPECIFIED;
         return true;
     } else if (probe) {
         return false;
-- 
2.43.0
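
Note the open-coded conversion from byte size to log2 above: for a 4 KiB page, page_size is 0x1000 and ctz32(0x1000) = 12. This matches what the removed tlb_set_page() path computed internally via ctz64(size), as seen in patch 52's deletion of tlb_set_page_with_attrs.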




* [PATCH v2 51/54] accel/tcg: Drop TCGCPUOps.tlb_fill
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (49 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 50/54] target/xtensa: " Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:55   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 52/54] accel/tcg: Unexport tlb_set_page* Richard Henderson
                   ` (3 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

Now that all targets have been converted to tlb_fill_align,
remove the tlb_fill hook.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/hw/core/tcg-cpu-ops.h | 10 ----------
 accel/tcg/cputlb.c            | 19 ++++---------------
 2 files changed, 4 insertions(+), 25 deletions(-)

diff --git a/include/hw/core/tcg-cpu-ops.h b/include/hw/core/tcg-cpu-ops.h
index 663efb9133..70cafcc6cd 100644
--- a/include/hw/core/tcg-cpu-ops.h
+++ b/include/hw/core/tcg-cpu-ops.h
@@ -157,16 +157,6 @@ struct TCGCPUOps {
     bool (*tlb_fill_align)(CPUState *cpu, CPUTLBEntryFull *out, vaddr addr,
                            MMUAccessType access_type, int mmu_idx,
                            MemOp memop, int size, bool probe, uintptr_t ra);
-    /**
-     * @tlb_fill: Handle a softmmu tlb miss
-     *
-     * If the access is valid, call tlb_set_page and return true;
-     * if the access is invalid and probe is true, return false;
-     * otherwise raise an exception and do not return.
-     */
-    bool (*tlb_fill)(CPUState *cpu, vaddr address, int size,
-                     MMUAccessType access_type, int mmu_idx,
-                     bool probe, uintptr_t retaddr);
     /**
      * @do_transaction_failed: Callback for handling failed memory transactions
      * (ie bus faults or external aborts; not MMU faults)
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 7f63dc3fd8..ec597ed6f5 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1222,23 +1222,12 @@ static bool tlb_fill_align(CPUState *cpu, vaddr addr, MMUAccessType type,
                            int mmu_idx, MemOp memop, int size,
                            bool probe, uintptr_t ra)
 {
-    const TCGCPUOps *ops = cpu->cc->tcg_ops;
     CPUTLBEntryFull full;
 
-    if (ops->tlb_fill_align) {
-        if (ops->tlb_fill_align(cpu, &full, addr, type, mmu_idx,
-                                memop, size, probe, ra)) {
-            tlb_set_page_full(cpu, mmu_idx, addr, &full);
-            return true;
-        }
-    } else {
-        /* Legacy behaviour is alignment before paging. */
-        if (addr & ((1u << memop_alignment_bits(memop)) - 1)) {
-            ops->do_unaligned_access(cpu, addr, type, mmu_idx, ra);
-        }
-        if (ops->tlb_fill(cpu, addr, size, type, mmu_idx, probe, ra)) {
-            return true;
-        }
+    if (cpu->cc->tcg_ops->tlb_fill_align(cpu, &full, addr, type, mmu_idx,
+                                         memop, size, probe, ra)) {
+        tlb_set_page_full(cpu, mmu_idx, addr, &full);
+        return true;
     }
     assert(probe);
     return false;
-- 
2.43.0
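
The legacy alignment-before-paging behaviour deleted here is not lost: it is exactly what each converted target now open-codes at the top of its tlb_fill_align hook, as in the sparc and xtensa patches earlier in the series.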




* [PATCH v2 52/54] accel/tcg: Unexport tlb_set_page*
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (50 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 51/54] accel/tcg: Drop TCGCPUOps.tlb_fill Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:56   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 53/54] accel/tcg: Merge tlb_fill_align into callers Richard Henderson
                   ` (2 subsequent siblings)
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

The new tlb_fill_align hook returns page data via a structure
rather than by function call, so we can make tlb_set_page_full
be local to cputlb.c.  There are no users of tlb_set_page
or tlb_set_page_with_attrs, so those can be eliminated.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/exec-all.h | 57 -----------------------------------------
 accel/tcg/cputlb.c      | 27 ++-----------------
 2 files changed, 2 insertions(+), 82 deletions(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 69bdb77584..b65fc547bd 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -184,63 +184,6 @@ void tlb_flush_range_by_mmuidx_all_cpus_synced(CPUState *cpu,
                                                vaddr len,
                                                uint16_t idxmap,
                                                unsigned bits);
-
-/**
- * tlb_set_page_full:
- * @cpu: CPU context
- * @mmu_idx: mmu index of the tlb to modify
- * @addr: virtual address of the entry to add
- * @full: the details of the tlb entry
- *
- * Add an entry to @cpu tlb index @mmu_idx.  All of the fields of
- * @full must be filled, except for xlat_section, and constitute
- * the complete description of the translated page.
- *
- * This is generally called by the target tlb_fill function after
- * having performed a successful page table walk to find the physical
- * address and attributes for the translation.
- *
- * At most one entry for a given virtual address is permitted. Only a
- * single TARGET_PAGE_SIZE region is mapped; @full->lg_page_size is only
- * used by tlb_flush_page.
- */
-void tlb_set_page_full(CPUState *cpu, int mmu_idx, vaddr addr,
-                       CPUTLBEntryFull *full);
-
-/**
- * tlb_set_page_with_attrs:
- * @cpu: CPU to add this TLB entry for
- * @addr: virtual address of page to add entry for
- * @paddr: physical address of the page
- * @attrs: memory transaction attributes
- * @prot: access permissions (PAGE_READ/PAGE_WRITE/PAGE_EXEC bits)
- * @mmu_idx: MMU index to insert TLB entry for
- * @size: size of the page in bytes
- *
- * Add an entry to this CPU's TLB (a mapping from virtual address
- * @addr to physical address @paddr) with the specified memory
- * transaction attributes. This is generally called by the target CPU
- * specific code after it has been called through the tlb_fill()
- * entry point and performed a successful page table walk to find
- * the physical address and attributes for the virtual address
- * which provoked the TLB miss.
- *
- * At most one entry for a given virtual address is permitted. Only a
- * single TARGET_PAGE_SIZE region is mapped; the supplied @size is only
- * used by tlb_flush_page.
- */
-void tlb_set_page_with_attrs(CPUState *cpu, vaddr addr,
-                             hwaddr paddr, MemTxAttrs attrs,
-                             int prot, int mmu_idx, vaddr size);
-/* tlb_set_page:
- *
- * This function is equivalent to calling tlb_set_page_with_attrs()
- * with an @attrs argument of MEMTXATTRS_UNSPECIFIED. It's provided
- * as a convenience for CPUs which don't use memory transaction attributes.
- */
-void tlb_set_page(CPUState *cpu, vaddr addr,
-                  hwaddr paddr, int prot,
-                  int mmu_idx, vaddr size);
 #else
 static inline void tlb_init(CPUState *cpu)
 {
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index ec597ed6f5..3d731b8f3d 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1037,8 +1037,8 @@ static inline void tlb_set_compare(CPUTLBEntryFull *full, CPUTLBEntry *ent,
  * Called from TCG-generated code, which is under an RCU read-side
  * critical section.
  */
-void tlb_set_page_full(CPUState *cpu, int mmu_idx,
-                       vaddr addr, CPUTLBEntryFull *full)
+static void tlb_set_page_full(CPUState *cpu, int mmu_idx,
+                              vaddr addr, CPUTLBEntryFull *full)
 {
     CPUTLB *tlb = &cpu->neg.tlb;
     CPUTLBDesc *desc = &tlb->d[mmu_idx];
@@ -1189,29 +1189,6 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
     qemu_spin_unlock(&tlb->c.lock);
 }
 
-void tlb_set_page_with_attrs(CPUState *cpu, vaddr addr,
-                             hwaddr paddr, MemTxAttrs attrs, int prot,
-                             int mmu_idx, uint64_t size)
-{
-    CPUTLBEntryFull full = {
-        .phys_addr = paddr,
-        .attrs = attrs,
-        .prot = prot,
-        .lg_page_size = ctz64(size)
-    };
-
-    assert(is_power_of_2(size));
-    tlb_set_page_full(cpu, mmu_idx, addr, &full);
-}
-
-void tlb_set_page(CPUState *cpu, vaddr addr,
-                  hwaddr paddr, int prot,
-                  int mmu_idx, uint64_t size)
-{
-    tlb_set_page_with_attrs(cpu, addr, paddr, MEMTXATTRS_UNSPECIFIED,
-                            prot, mmu_idx, size);
-}
-
 /*
  * Note: tlb_fill_align() can trigger a resize of the TLB.
  * This means that all of the caller's prior references to the TLB table
-- 
2.43.0
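
With the exports gone, the shape of a converted target hook is fixed by this series: check alignment, translate, and fill in *out on success. A minimal sketch follows; all mycpu_* names are hypothetical placeholders for a target's existing helpers, with mycpu_translate() assumed to return true on success:

    bool mycpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
                              vaddr addr, MMUAccessType access_type,
                              int mmu_idx, MemOp memop, int size,
                              bool probe, uintptr_t retaddr)
    {
        hwaddr phys;
        int prot;

        /* Alignment before paging, matching the legacy behaviour. */
        if (addr & ((1 << memop_alignment_bits(memop)) - 1)) {
            if (probe) {
                return false;
            }
            mycpu_do_unaligned_access(cs, addr, access_type, mmu_idx, retaddr);
        }

        if (!mycpu_translate(cs, addr, access_type, mmu_idx, &phys, &prot)) {
            if (probe) {
                return false;
            }
            /* Raises an exception; does not return. */
            mycpu_raise_mmu_fault(cs, addr, access_type, mmu_idx, retaddr);
        }

        /* Fill in the entry; the core calls tlb_set_page_full itself. */
        memset(out, 0, sizeof(*out));
        out->phys_addr = phys;
        out->prot = prot;
        out->lg_page_size = TARGET_PAGE_BITS;
        out->attrs = MEMTXATTRS_UNSPECIFIED;
        return true;
    }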




* [PATCH v2 53/54] accel/tcg: Merge tlb_fill_align into callers
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (51 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 52/54] accel/tcg: Unexport tlb_set_page* Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:57   ` Pierrick Bouvier
  2024-11-14 16:01 ` [PATCH v2 54/54] accel/tcg: Return CPUTLBEntryTree from tlb_set_page_full Richard Henderson
  2024-11-14 19:56 ` [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Pierrick Bouvier
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

In tlb_lookup, we still call tlb_set_page_full after the hook
succeeds.  In atomic_mmu_lookup, we expect the hook not to return
on failure, so its result is unused.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 31 ++++++-------------------------
 1 file changed, 6 insertions(+), 25 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 3d731b8f3d..20af48c6c5 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1189,27 +1189,6 @@ static void tlb_set_page_full(CPUState *cpu, int mmu_idx,
     qemu_spin_unlock(&tlb->c.lock);
 }
 
-/*
- * Note: tlb_fill_align() can trigger a resize of the TLB.
- * This means that all of the caller's prior references to the TLB table
- * (e.g. CPUTLBEntry pointers) must be discarded and looked up again
- * (e.g. via tlb_entry()).
- */
-static bool tlb_fill_align(CPUState *cpu, vaddr addr, MMUAccessType type,
-                           int mmu_idx, MemOp memop, int size,
-                           bool probe, uintptr_t ra)
-{
-    CPUTLBEntryFull full;
-
-    if (cpu->cc->tcg_ops->tlb_fill_align(cpu, &full, addr, type, mmu_idx,
-                                         memop, size, probe, ra)) {
-        tlb_set_page_full(cpu, mmu_idx, addr, &full);
-        return true;
-    }
-    assert(probe);
-    return false;
-}
-
 static inline void cpu_unaligned_access(CPUState *cpu, vaddr addr,
                                         MMUAccessType access_type,
                                         int mmu_idx, uintptr_t retaddr)
@@ -1281,11 +1260,13 @@ static bool tlb_lookup(CPUState *cpu, TLBLookupOutput *o,
     }
 
     /* Finally, query the target hook. */
-    if (!tlb_fill_align(cpu, addr, access_type, i->mmu_idx,
-                        memop, i->size, probe, i->ra)) {
+    if (!cpu->cc->tcg_ops->tlb_fill_align(cpu, &o->full, addr, access_type,
+                                          i->mmu_idx, memop, i->size,
+                                          probe, i->ra)) {
         tcg_debug_assert(probe);
         return false;
     }
+    tlb_set_page_full(cpu, i->mmu_idx, addr, &o->full);
     o->did_tlb_fill = true;
 
     if (access_type == MMU_INST_FETCH) {
@@ -1794,8 +1775,8 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
      * We have just verified that the page is writable.
      */
     if (unlikely(!(o.full.prot & PAGE_READ))) {
-        tlb_fill_align(cpu, addr, MMU_DATA_LOAD, i.mmu_idx,
-                       0, i.size, false, i.ra);
+        cpu->cc->tcg_ops->tlb_fill_align(cpu, &o.full, addr, MMU_DATA_LOAD,
+                                         i.mmu_idx, 0, i.size, false, i.ra);
         /*
          * Since we don't support reads and writes to different
          * addresses, and we do have the proper page loaded for
-- 
2.43.0
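
Note the asymmetry between the two surviving call sites: tlb_lookup installs the returned entry via tlb_set_page_full, whereas the atomic_mmu_lookup call exists only to enforce read permission on a write-only page; probe is false there, so on failure the hook raises an exception and does not return.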




* [PATCH v2 54/54] accel/tcg: Return CPUTLBEntryTree from tlb_set_page_full
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (52 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 53/54] accel/tcg: Merge tlb_fill_align into callers Richard Henderson
@ 2024-11-14 16:01 ` Richard Henderson
  2024-11-14 18:59   ` Pierrick Bouvier
  2024-11-14 19:56 ` [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Pierrick Bouvier
  54 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 16:01 UTC (permalink / raw)
  To: qemu-devel

Avoid a lookup to find the node that we have just inserted.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 20af48c6c5..6d316e8767 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1037,8 +1037,8 @@ static inline void tlb_set_compare(CPUTLBEntryFull *full, CPUTLBEntry *ent,
  * Called from TCG-generated code, which is under an RCU read-side
  * critical section.
  */
-static void tlb_set_page_full(CPUState *cpu, int mmu_idx,
-                              vaddr addr, CPUTLBEntryFull *full)
+static CPUTLBEntryTree *tlb_set_page_full(CPUState *cpu, int mmu_idx,
+                                          vaddr addr, CPUTLBEntryFull *full)
 {
     CPUTLB *tlb = &cpu->neg.tlb;
     CPUTLBDesc *desc = &tlb->d[mmu_idx];
@@ -1187,6 +1187,8 @@ static void tlb_set_page_full(CPUState *cpu, int mmu_idx,
     copy_tlb_helper_locked(te, &node->copy);
     desc->n_used_entries++;
     qemu_spin_unlock(&tlb->c.lock);
+
+    return node;
 }
 
 static inline void cpu_unaligned_access(CPUState *cpu, vaddr addr,
@@ -1266,18 +1268,14 @@ static bool tlb_lookup(CPUState *cpu, TLBLookupOutput *o,
         tcg_debug_assert(probe);
         return false;
     }
-    tlb_set_page_full(cpu, i->mmu_idx, addr, &o->full);
+    node = tlb_set_page_full(cpu, i->mmu_idx, addr, &o->full);
     o->did_tlb_fill = true;
 
     if (access_type == MMU_INST_FETCH) {
-        node = tlbtree_lookup_addr(desc, addr);
-        tcg_debug_assert(node);
         goto found_code;
     }
 
-    entry = tlbfast_entry(fast, addr);
-    cmp = tlb_read_idx(entry, access_type);
-    node = entry->tree;
+    cmp = tlb_read_idx(&node->copy, access_type);
     /*
      * With PAGE_WRITE_INV, we set TLB_INVALID_MASK immediately,
      * to force the next access through tlb_fill_align.  We've just
-- 
2.43.0




* Re: [PATCH v2 01/54] util/interval-tree: Introduce interval_tree_free_nodes
  2024-11-14 16:00 ` [PATCH v2 01/54] util/interval-tree: Introduce interval_tree_free_nodes Richard Henderson
@ 2024-11-14 17:51   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 17:51 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:00, Richard Henderson wrote:
> Provide a general-purpose release-all-nodes operation that allows
> the IntervalTreeNode to be embedded within a larger structure.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   include/qemu/interval-tree.h | 11 +++++++++++
>   util/interval-tree.c         | 20 ++++++++++++++++++++
>   util/selfmap.c               | 13 +------------
>   3 files changed, 32 insertions(+), 12 deletions(-)
> 
> diff --git a/include/qemu/interval-tree.h b/include/qemu/interval-tree.h
> index 25006debe8..d90ea6d17f 100644
> --- a/include/qemu/interval-tree.h
> +++ b/include/qemu/interval-tree.h
> @@ -96,4 +96,15 @@ IntervalTreeNode *interval_tree_iter_first(IntervalTreeRoot *root,
>   IntervalTreeNode *interval_tree_iter_next(IntervalTreeNode *node,
>                                             uint64_t start, uint64_t last);
>   
> +/**
> + * interval_tree_free_nodes:
> + * @root: root of the tree
> + * @it_offset: offset from outermost type to IntervalTreeNode
> + *
> + * Free, via g_free, all nodes under @root.  IntervalTreeNode may
> + * not be the true type of the nodes allocated; @it_offset gives
> + * the offset from the outermost type to the IntervalTreeNode member.
> + */
> +void interval_tree_free_nodes(IntervalTreeRoot *root, size_t it_offset);
> +
>   #endif /* QEMU_INTERVAL_TREE_H */
> diff --git a/util/interval-tree.c b/util/interval-tree.c
> index 53465182e6..663d3ec222 100644
> --- a/util/interval-tree.c
> +++ b/util/interval-tree.c
> @@ -639,6 +639,16 @@ static void rb_erase_augmented_cached(RBNode *node, RBRootLeftCached *root,
>       rb_erase_augmented(node, &root->rb_root, augment);
>   }
>   
> +static void rb_node_free(RBNode *rb, size_t rb_offset)
> +{
> +    if (rb->rb_left) {
> +        rb_node_free(rb->rb_left, rb_offset);
> +    }
> +    if (rb->rb_right) {
> +        rb_node_free(rb->rb_right, rb_offset);
> +    }
> +    g_free((void *)rb - rb_offset);
> +}
>   
>   /*
>    * Interval trees.
> @@ -870,6 +880,16 @@ IntervalTreeNode *interval_tree_iter_next(IntervalTreeNode *node,
>       }
>   }
>   
> +void interval_tree_free_nodes(IntervalTreeRoot *root, size_t it_offset)
> +{
> +    if (root && root->rb_root.rb_node) {
> +        rb_node_free(root->rb_root.rb_node,
> +                     it_offset + offsetof(IntervalTreeNode, rb));
> +        root->rb_root.rb_node = NULL;
> +        root->rb_leftmost = NULL;
> +    }
> +}
> +
>   /* Occasionally useful for calling from within the debugger. */
>   #if 0
>   static void debug_interval_tree_int(IntervalTreeNode *node,
> diff --git a/util/selfmap.c b/util/selfmap.c
> index 483cb617e2..d2b86da301 100644
> --- a/util/selfmap.c
> +++ b/util/selfmap.c
> @@ -87,23 +87,12 @@ IntervalTreeRoot *read_self_maps(void)
>    * @root: an interval tree
>    *
>    * Free a tree of MapInfo structures.
> - * Since we allocated each MapInfo in one chunk, we need not consider the
> - * contents and can simply free each RBNode.
>    */
>   
> -static void free_rbnode(RBNode *n)
> -{
> -    if (n) {
> -        free_rbnode(n->rb_left);
> -        free_rbnode(n->rb_right);
> -        g_free(n);
> -    }
> -}
> -
>   void free_self_maps(IntervalTreeRoot *root)
>   {
>       if (root) {
> -        free_rbnode(root->rb_root.rb_node);
> +        interval_tree_free_nodes(root, offsetof(MapInfo, itree));
>           g_free(root);
>       }
>   }

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
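
The it_offset parameter is what later patches rely on when the IntervalTreeNode is buried inside CPUTLBEntryTree. A sketch of the usage pattern, with MyNode as a hypothetical container type:

    typedef struct MyNode {
        int payload;              /* arbitrary per-node data */
        IntervalTreeNode itree;   /* embedded; need not be the first member */
    } MyNode;

    IntervalTreeRoot root = {};
    /* ... insert nodes with interval_tree_insert(&node->itree, &root) ... */

    /* Free every MyNode whose itree member is linked under root. */
    interval_tree_free_nodes(&root, offsetof(MyNode, itree));

Internally, rb_node_free() subtracts it_offset + offsetof(IntervalTreeNode, rb) from each RBNode pointer to recover the start of the containing allocation before calling g_free().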




* Re: [PATCH v2 02/54] accel/tcg: Split out tlbfast_flush_locked
  2024-11-14 16:00 ` [PATCH v2 02/54] accel/tcg: Split out tlbfast_flush_locked Richard Henderson
@ 2024-11-14 17:52   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 17:52 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel



On 11/14/24 08:00, Richard Henderson wrote:
> We will need to flush only the "fast" portion
> of the tlb, allowing refill from the "full" portion.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   accel/tcg/cputlb.c | 9 +++++++--
>   1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index b76a4eac4e..c1838412e8 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -284,13 +284,18 @@ static void tlb_mmu_resize_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast,
>       }
>   }
>   
> -static void tlb_mmu_flush_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast)
> +static void tlbfast_flush_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast)
>   {
>       desc->n_used_entries = 0;
> +    memset(fast->table, -1, sizeof_tlb(fast));
> +}
> +
> +static void tlb_mmu_flush_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast)
> +{
> +    tlbfast_flush_locked(desc, fast);
>       desc->large_page_addr = -1;
>       desc->large_page_mask = -1;
>       desc->vindex = 0;
> -    memset(fast->table, -1, sizeof_tlb(fast));
>       memset(desc->vtable, -1, sizeof(desc->vtable));
>   }
>   

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>




* Re: [PATCH v2 03/54] accel/tcg: Split out tlbfast_{index,entry}
  2024-11-14 16:00 ` [PATCH v2 03/54] accel/tcg: Split out tlbfast_{index,entry} Richard Henderson
@ 2024-11-14 17:52   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 17:52 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel



On 11/14/24 08:00, Richard Henderson wrote:
> Often we already have the CPUTLBDescFast structure pointer.
> This allows future code simplification.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   accel/tcg/cputlb.c | 16 ++++++++++++----
>   1 file changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index c1838412e8..e37af24525 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -131,20 +131,28 @@ static inline uint64_t tlb_addr_write(const CPUTLBEntry *entry)
>       return tlb_read_idx(entry, MMU_DATA_STORE);
>   }
>   
> +static inline uintptr_t tlbfast_index(CPUTLBDescFast *fast, vaddr addr)
> +{
> +    return (addr >> TARGET_PAGE_BITS) & (fast->mask >> CPU_TLB_ENTRY_BITS);
> +}
> +
> +static inline CPUTLBEntry *tlbfast_entry(CPUTLBDescFast *fast, vaddr addr)
> +{
> +    return fast->table + tlbfast_index(fast, addr);
> +}
> +
>   /* Find the TLB index corresponding to the mmu_idx + address pair.  */
>   static inline uintptr_t tlb_index(CPUState *cpu, uintptr_t mmu_idx,
>                                     vaddr addr)
>   {
> -    uintptr_t size_mask = cpu->neg.tlb.f[mmu_idx].mask >> CPU_TLB_ENTRY_BITS;
> -
> -    return (addr >> TARGET_PAGE_BITS) & size_mask;
> +    return tlbfast_index(&cpu->neg.tlb.f[mmu_idx], addr);
>   }
>   
>   /* Find the TLB entry corresponding to the mmu_idx + address pair.  */
>   static inline CPUTLBEntry *tlb_entry(CPUState *cpu, uintptr_t mmu_idx,
>                                        vaddr addr)
>   {
> -    return &cpu->neg.tlb.f[mmu_idx].table[tlb_index(cpu, mmu_idx, addr)];
> +    return tlbfast_entry(&cpu->neg.tlb.f[mmu_idx], addr);
>   }
>   
>   static void tlb_window_reset(CPUTLBDesc *desc, int64_t ns,

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
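
Note that fast->mask is stored pre-shifted by CPU_TLB_ENTRY_BITS (it is set to (n_entries - 1) << CPU_TLB_ENTRY_BITS in tlb_mmu_init), so shifting it back down recovers the plain index mask. As a worked example with 256 entries and 4 KiB pages, tlbfast_index() maps address 0x12345678 to (0x12345678 >> 12) & 0xff = 0x45.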




* Re: [PATCH v2 05/54] accel/tcg: Fix flags usage in mmu_lookup1, atomic_mmu_lookup
  2024-11-14 16:00 ` [PATCH v2 05/54] accel/tcg: Fix flags usage in mmu_lookup1, atomic_mmu_lookup Richard Henderson
@ 2024-11-14 17:54   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 17:54 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel



On 11/14/24 08:00, Richard Henderson wrote:
> The INVALID bit should only be auto-cleared when we have
> just called tlb_fill, not along the victim_tlb_hit path.
> 
> In atomic_mmu_lookup, rename tlb_addr to flags, as that
> is what we're actually carrying around.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   accel/tcg/cputlb.c | 33 ++++++++++++++++++++++-----------
>   1 file changed, 22 insertions(+), 11 deletions(-)
> 
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 46fa0ae802..77b972fd93 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -1652,7 +1652,7 @@ static bool mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
>       uint64_t tlb_addr = tlb_read_idx(entry, access_type);
>       bool maybe_resized = false;
>       CPUTLBEntryFull *full;
> -    int flags;
> +    int flags = TLB_FLAGS_MASK & ~TLB_FORCE_SLOW;
>   
>       /* If the TLB entry is for a different page, reload and try again.  */
>       if (!tlb_hit(tlb_addr, addr)) {
> @@ -1663,8 +1663,14 @@ static bool mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
>               maybe_resized = true;
>               index = tlb_index(cpu, mmu_idx, addr);
>               entry = tlb_entry(cpu, mmu_idx, addr);
> +            /*
> +             * With PAGE_WRITE_INV, we set TLB_INVALID_MASK immediately,
> +             * to force the next access through tlb_fill.  We've just
> +             * called tlb_fill, so we know that this entry *is* valid.
> +             */
> +            flags &= ~TLB_INVALID_MASK;
>           }
> -        tlb_addr = tlb_read_idx(entry, access_type) & ~TLB_INVALID_MASK;
> +        tlb_addr = tlb_read_idx(entry, access_type);
>       }
>   
>       full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
> @@ -1814,10 +1820,10 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
>       MemOp mop = get_memop(oi);
>       uintptr_t index;
>       CPUTLBEntry *tlbe;
> -    vaddr tlb_addr;
>       void *hostaddr;
>       CPUTLBEntryFull *full;
>       bool did_tlb_fill = false;
> +    int flags;
>   
>       tcg_debug_assert(mmu_idx < NB_MMU_MODES);
>   
> @@ -1828,8 +1834,8 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
>       tlbe = tlb_entry(cpu, mmu_idx, addr);
>   
>       /* Check TLB entry and enforce page permissions.  */
> -    tlb_addr = tlb_addr_write(tlbe);
> -    if (!tlb_hit(tlb_addr, addr)) {
> +    flags = TLB_FLAGS_MASK;
> +    if (!tlb_hit(tlb_addr_write(tlbe), addr)) {
>           if (!victim_tlb_hit(cpu, mmu_idx, index, MMU_DATA_STORE,
>                               addr & TARGET_PAGE_MASK)) {
>               tlb_fill_align(cpu, addr, MMU_DATA_STORE, mmu_idx,
> @@ -1837,8 +1843,13 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
>               did_tlb_fill = true;
>               index = tlb_index(cpu, mmu_idx, addr);
>               tlbe = tlb_entry(cpu, mmu_idx, addr);
> +            /*
> +             * With PAGE_WRITE_INV, we set TLB_INVALID_MASK immediately,
> +             * to force the next access through tlb_fill.  We've just
> +             * called tlb_fill, so we know that this entry *is* valid.
> +             */
> +            flags &= ~TLB_INVALID_MASK;
>           }
> -        tlb_addr = tlb_addr_write(tlbe) & ~TLB_INVALID_MASK;
>       }
>   
>       /*
> @@ -1874,11 +1885,11 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
>           goto stop_the_world;
>       }
>   
> -    /* Collect tlb flags for read. */
> -    tlb_addr |= tlbe->addr_read;
> +    /* Collect tlb flags for read and write. */
> +    flags &= tlbe->addr_read | tlb_addr_write(tlbe);
>   
>       /* Notice an IO access or a needs-MMU-lookup access */
> -    if (unlikely(tlb_addr & (TLB_MMIO | TLB_DISCARD_WRITE))) {
> +    if (unlikely(flags & (TLB_MMIO | TLB_DISCARD_WRITE))) {
>           /* There's really nothing that can be done to
>              support this apart from stop-the-world.  */
>           goto stop_the_world;
> @@ -1887,11 +1898,11 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
>       hostaddr = (void *)((uintptr_t)addr + tlbe->addend);
>       full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
>   
> -    if (unlikely(tlb_addr & TLB_NOTDIRTY)) {
> +    if (unlikely(flags & TLB_NOTDIRTY)) {
>           notdirty_write(cpu, addr, size, full, retaddr);
>       }
>   
> -    if (unlikely(tlb_addr & TLB_FORCE_SLOW)) {
> +    if (unlikely(flags & TLB_FORCE_SLOW)) {
>           int wp_flags = 0;
>   
>           if (full->slow_flags[MMU_DATA_STORE] & TLB_WATCHPOINT) {

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
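
The inverted direction of the flags computation is the point of the patch: flags starts as the full set of possible flag bits and is only ever narrowed by AND. That is what lets the fill path clear TLB_INVALID_MASK up front and have the clearing survive the later intersection with the comparators, instead of unconditionally stripping the bit from tlb_addr even when the entry came from the victim tlb.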




* Re: [PATCH v2 06/54] accel/tcg: Assert non-zero length in tlb_flush_range_by_mmuidx*
  2024-11-14 16:00 ` [PATCH v2 06/54] accel/tcg: Assert non-zero length in tlb_flush_range_by_mmuidx* Richard Henderson
@ 2024-11-14 17:56   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 17:56 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel



On 11/14/24 08:00, Richard Henderson wrote:
> Subsequent patches will assume a non-zero length.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   accel/tcg/cputlb.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 77b972fd93..1346a26d90 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -791,6 +791,7 @@ void tlb_flush_range_by_mmuidx(CPUState *cpu, vaddr addr,
>       TLBFlushRangeData d;
>   
>       assert_cpu_is_self(cpu);
> +    assert(len != 0);
>   
>       /*
>        * If all bits are significant, and len is small,
> @@ -830,6 +831,8 @@ void tlb_flush_range_by_mmuidx_all_cpus_synced(CPUState *src_cpu,
>       TLBFlushRangeData d, *p;
>       CPUState *dst_cpu;
>   
> +    assert(len != 0);
> +
>       /*
>        * If all bits are significant, and len is small,
>        * this devolves to tlb_flush_page.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>




* Re: [PATCH v2 07/54] accel/tcg: Assert bits in range in tlb_flush_range_by_mmuidx*
  2024-11-14 16:00 ` [PATCH v2 07/54] accel/tcg: Assert bits in range " Richard Henderson
@ 2024-11-14 17:56   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 17:56 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel



On 11/14/24 08:00, Richard Henderson wrote:
> The only target that does not use TARGET_LONG_BITS is Arm, which
> only reduces bits based on TBI.  There is no point in handling
> odd combinations of parameters.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   accel/tcg/cputlb.c | 16 ++++------------
>   1 file changed, 4 insertions(+), 12 deletions(-)
> 
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 1346a26d90..5510f40333 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -792,20 +792,16 @@ void tlb_flush_range_by_mmuidx(CPUState *cpu, vaddr addr,
>   
>       assert_cpu_is_self(cpu);
>       assert(len != 0);
> +    assert(bits > TARGET_PAGE_BITS && bits <= TARGET_LONG_BITS);
>   
>       /*
>        * If all bits are significant, and len is small,
>        * this devolves to tlb_flush_page.
>        */
> -    if (bits >= TARGET_LONG_BITS && len <= TARGET_PAGE_SIZE) {
> +    if (bits == TARGET_LONG_BITS && len <= TARGET_PAGE_SIZE) {
>           tlb_flush_page_by_mmuidx(cpu, addr, idxmap);
>           return;
>       }
> -    /* If no page bits are significant, this devolves to tlb_flush. */
> -    if (bits < TARGET_PAGE_BITS) {
> -        tlb_flush_by_mmuidx(cpu, idxmap);
> -        return;
> -    }
>   
>       /* This should already be page aligned */
>       d.addr = addr & TARGET_PAGE_MASK;
> @@ -832,20 +828,16 @@ void tlb_flush_range_by_mmuidx_all_cpus_synced(CPUState *src_cpu,
>       CPUState *dst_cpu;
>   
>       assert(len != 0);
> +    assert(bits > TARGET_PAGE_BITS && bits <= TARGET_LONG_BITS);
>   
>       /*
>        * If all bits are significant, and len is small,
>        * this devolves to tlb_flush_page.
>        */
> -    if (bits >= TARGET_LONG_BITS && len <= TARGET_PAGE_SIZE) {
> +    if (bits == TARGET_LONG_BITS && len <= TARGET_PAGE_SIZE) {
>           tlb_flush_page_by_mmuidx_all_cpus_synced(src_cpu, addr, idxmap);
>           return;
>       }
> -    /* If no page bits are significant, this devolves to tlb_flush. */
> -    if (bits < TARGET_PAGE_BITS) {
> -        tlb_flush_by_mmuidx_all_cpus_synced(src_cpu, idxmap);
> -        return;
> -    }
>   
>       /* This should already be page aligned */
>       d.addr = addr & TARGET_PAGE_MASK;

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>




* Re: [PATCH v2 08/54] accel/tcg: Flush entire tlb when a masked range wraps
  2024-11-14 16:00 ` [PATCH v2 08/54] accel/tcg: Flush entire tlb when a masked range wraps Richard Henderson
@ 2024-11-14 17:58   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 17:58 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel



On 11/14/24 08:00, Richard Henderson wrote:
> We expect masked address spaces to be quite large, e.g. 56 bits
> for AArch64 top-byte-ignore mode.  We do not expect addr+len to
> wrap around, but it is possible with AArch64 guest flush range
> instructions.
> 
> Convert this unlikely case to a full tlb flush.  This can simplify
> the subroutines actually performing the range flush.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   accel/tcg/cputlb.c | 10 ++++++++++
>   1 file changed, 10 insertions(+)
> 
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 5510f40333..31c45a6213 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -802,6 +802,11 @@ void tlb_flush_range_by_mmuidx(CPUState *cpu, vaddr addr,
>           tlb_flush_page_by_mmuidx(cpu, addr, idxmap);
>           return;
>       }
> +    /* If addr+len wraps in len bits, fall back to full flush. */
> +    if (bits < TARGET_LONG_BITS && ((addr ^ (addr + len - 1)) >> bits)) {
> +        tlb_flush_by_mmuidx(cpu, idxmap);
> +        return;
> +    }
>   
>       /* This should already be page aligned */
>       d.addr = addr & TARGET_PAGE_MASK;
> @@ -838,6 +843,11 @@ void tlb_flush_range_by_mmuidx_all_cpus_synced(CPUState *src_cpu,
>           tlb_flush_page_by_mmuidx_all_cpus_synced(src_cpu, addr, idxmap);
>           return;
>       }
> +    /* If addr+len wraps in len bits, fall back to full flush. */
> +    if (bits < TARGET_LONG_BITS && ((addr ^ (addr + len - 1)) >> bits)) {
> +        tlb_flush_by_mmuidx_all_cpus_synced(src_cpu, idxmap);
> +        return;
> +    }
>   
>       /* This should already be page aligned */
>       d.addr = addr & TARGET_PAGE_MASK;

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
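
The XOR test is a compact way to ask whether addr and addr + len - 1 agree in all bits at or above bit `bits`. As a worked example with bits = 56: addr = 0xff_ffff_ffff_f000 and len = 0x2000 give addr + len - 1 = 0x100_0000_0000_0fff, so (addr ^ (addr + len - 1)) >> 56 = 1 and the range has wrapped out of the 56-bit space, forcing the full flush.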




* Re: [PATCH v2 09/54] accel/tcg: Add IntervalTreeRoot to CPUTLBDesc
  2024-11-14 16:00 ` [PATCH v2 09/54] accel/tcg: Add IntervalTreeRoot to CPUTLBDesc Richard Henderson
@ 2024-11-14 17:59   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 17:59 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel



On 11/14/24 08:00, Richard Henderson wrote:
> Add the data structures for tracking softmmu pages via
> a balanced interval tree.  So far, only initialize and
> destroy the data structure.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   include/hw/core/cpu.h |  3 +++
>   accel/tcg/cputlb.c    | 11 +++++++++++
>   2 files changed, 14 insertions(+)
> 
> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> index db8a6fbc6e..1ebc999a73 100644
> --- a/include/hw/core/cpu.h
> +++ b/include/hw/core/cpu.h
> @@ -35,6 +35,7 @@
>   #include "qemu/queue.h"
>   #include "qemu/lockcnt.h"
>   #include "qemu/thread.h"
> +#include "qemu/interval-tree.h"
>   #include "qom/object.h"
>   
>   typedef int (*WriteCoreDumpFunction)(const void *buf, size_t size,
> @@ -290,6 +291,8 @@ typedef struct CPUTLBDesc {
>       CPUTLBEntry vtable[CPU_VTLB_SIZE];
>       CPUTLBEntryFull vfulltlb[CPU_VTLB_SIZE];
>       CPUTLBEntryFull *fulltlb;
> +    /* All active tlb entries for this address space. */
> +    IntervalTreeRoot iroot;
>   } CPUTLBDesc;
>   
>   /*
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 31c45a6213..aa51fc1d26 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -89,6 +89,13 @@ QEMU_BUILD_BUG_ON(sizeof(vaddr) > sizeof(run_on_cpu_data));
>   QEMU_BUILD_BUG_ON(NB_MMU_MODES > 16);
>   #define ALL_MMUIDX_BITS ((1 << NB_MMU_MODES) - 1)
>   
> +/* Extra data required to manage CPUTLBEntryFull within an interval tree. */
> +typedef struct CPUTLBEntryTree {
> +    IntervalTreeNode itree;
> +    CPUTLBEntry copy;
> +    CPUTLBEntryFull full;
> +} CPUTLBEntryTree;
> +
>   static inline size_t tlb_n_entries(CPUTLBDescFast *fast)
>   {
>       return (fast->mask >> CPU_TLB_ENTRY_BITS) + 1;
> @@ -305,6 +312,7 @@ static void tlb_mmu_flush_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast)
>       desc->large_page_mask = -1;
>       desc->vindex = 0;
>       memset(desc->vtable, -1, sizeof(desc->vtable));
> +    interval_tree_free_nodes(&desc->iroot, offsetof(CPUTLBEntryTree, itree));
>   }
>   
>   static void tlb_flush_one_mmuidx_locked(CPUState *cpu, int mmu_idx,
> @@ -326,6 +334,7 @@ static void tlb_mmu_init(CPUTLBDesc *desc, CPUTLBDescFast *fast, int64_t now)
>       fast->mask = (n_entries - 1) << CPU_TLB_ENTRY_BITS;
>       fast->table = g_new(CPUTLBEntry, n_entries);
>       desc->fulltlb = g_new(CPUTLBEntryFull, n_entries);
> +    memset(&desc->iroot, 0, sizeof(desc->iroot));
>       tlb_mmu_flush_locked(desc, fast);
>   }
>   
> @@ -365,6 +374,8 @@ void tlb_destroy(CPUState *cpu)
>   
>           g_free(fast->table);
>           g_free(desc->fulltlb);
> +        interval_tree_free_nodes(&cpu->neg.tlb.d[i].iroot,
> +                                 offsetof(CPUTLBEntryTree, itree));
>       }
>   }
>   

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
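
The three members of CPUTLBEntryTree mirror where a translation's data lives: itree keys the node by its virtual page range, copy holds the comparators and addend in the same layout as the fast table, and full carries the out-of-line data. This is what eventually allows the tree to act as the single backing store, once later patches in the series refill the fast table from it.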




* Re: [PATCH v2 10/54] accel/tcg: Populate IntervalTree in tlb_set_page_full
  2024-11-14 16:00 ` [PATCH v2 10/54] accel/tcg: Populate IntervalTree in tlb_set_page_full Richard Henderson
@ 2024-11-14 18:00   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:00 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel



On 11/14/24 08:00, Richard Henderson wrote:
> Add or replace an entry in the IntervalTree for each
> page installed into softmmu.  We do not yet use the
> tree for anything else.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   accel/tcg/cputlb.c | 34 ++++++++++++++++++++++++++++------
>   1 file changed, 28 insertions(+), 6 deletions(-)
> 
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index aa51fc1d26..ea6a5177de 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -305,6 +305,17 @@ static void tlbfast_flush_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast)
>       memset(fast->table, -1, sizeof_tlb(fast));
>   }
>   
> +static CPUTLBEntryTree *tlbtree_lookup_range(CPUTLBDesc *desc, vaddr s, vaddr l)
> +{
> +    IntervalTreeNode *i = interval_tree_iter_first(&desc->iroot, s, l);
> +    return i ? container_of(i, CPUTLBEntryTree, itree) : NULL;
> +}
> +
> +static CPUTLBEntryTree *tlbtree_lookup_addr(CPUTLBDesc *desc, vaddr addr)
> +{
> +    return tlbtree_lookup_range(desc, addr, addr);
> +}
> +
>   static void tlb_mmu_flush_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast)
>   {
>       tlbfast_flush_locked(desc, fast);
> @@ -1072,7 +1083,8 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
>       MemoryRegionSection *section;
>       unsigned int index, read_flags, write_flags;
>       uintptr_t addend;
> -    CPUTLBEntry *te, tn;
> +    CPUTLBEntry *te;
> +    CPUTLBEntryTree *node;
>       hwaddr iotlb, xlat, sz, paddr_page;
>       vaddr addr_page;
>       int asidx, wp_flags, prot;
> @@ -1180,6 +1192,15 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
>           tlb_n_used_entries_dec(cpu, mmu_idx);
>       }
>   
> +    /* Replace an old IntervalTree entry, or create a new one. */
> +    node = tlbtree_lookup_addr(desc, addr_page);
> +    if (!node) {
> +        node = g_new(CPUTLBEntryTree, 1);
> +        node->itree.start = addr_page;
> +        node->itree.last = addr_page + TARGET_PAGE_SIZE - 1;
> +        interval_tree_insert(&node->itree, &desc->iroot);
> +    }
> +
>       /* refill the tlb */
>       /*
>        * When memory region is ram, iotlb contains a TARGET_PAGE_BITS
> @@ -1201,15 +1222,15 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
>       full->phys_addr = paddr_page;
>   
>       /* Now calculate the new entry */
> -    tn.addend = addend - addr_page;
> +    node->copy.addend = addend - addr_page;
>   
> -    tlb_set_compare(full, &tn, addr_page, read_flags,
> +    tlb_set_compare(full, &node->copy, addr_page, read_flags,
>                       MMU_INST_FETCH, prot & PAGE_EXEC);
>   
>       if (wp_flags & BP_MEM_READ) {
>           read_flags |= TLB_WATCHPOINT;
>       }
> -    tlb_set_compare(full, &tn, addr_page, read_flags,
> +    tlb_set_compare(full, &node->copy, addr_page, read_flags,
>                       MMU_DATA_LOAD, prot & PAGE_READ);
>   
>       if (prot & PAGE_WRITE_INV) {
> @@ -1218,10 +1239,11 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
>       if (wp_flags & BP_MEM_WRITE) {
>           write_flags |= TLB_WATCHPOINT;
>       }
> -    tlb_set_compare(full, &tn, addr_page, write_flags,
> +    tlb_set_compare(full, &node->copy, addr_page, write_flags,
>                       MMU_DATA_STORE, prot & PAGE_WRITE);
>   
> -    copy_tlb_helper_locked(te, &tn);
> +    node->full = *full;
> +    copy_tlb_helper_locked(te, &node->copy);
>       tlb_n_used_entries_inc(cpu, mmu_idx);
>       qemu_spin_unlock(&tlb->c.lock);
>   }

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>




* Re: [PATCH v2 11/54] accel/tcg: Remove IntervalTree entry in tlb_flush_page_locked
  2024-11-14 16:00 ` [PATCH v2 11/54] accel/tcg: Remove IntervalTree entry in tlb_flush_page_locked Richard Henderson
@ 2024-11-14 18:01   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:01 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:00, Richard Henderson wrote:
> Flush a page from the IntervalTree cache.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   accel/tcg/cputlb.c | 16 ++++++++++++----
>   1 file changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index ea6a5177de..d532d69083 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -568,6 +568,7 @@ static void tlb_flush_page_locked(CPUState *cpu, int midx, vaddr page)
>       CPUTLBDesc *desc = &cpu->neg.tlb.d[midx];
>       vaddr lp_addr = desc->large_page_addr;
>       vaddr lp_mask = desc->large_page_mask;
> +    CPUTLBEntryTree *node;
>   
>       /* Check if we need to flush due to large pages.  */
>       if ((page & lp_mask) == lp_addr) {
> @@ -575,10 +576,17 @@ static void tlb_flush_page_locked(CPUState *cpu, int midx, vaddr page)
>                     VADDR_PRIx "/%016" VADDR_PRIx ")\n",
>                     midx, lp_addr, lp_mask);
>           tlb_flush_one_mmuidx_locked(cpu, midx, get_clock_realtime());
> -    } else {
> -        tlbfast_flush_range_locked(desc, &cpu->neg.tlb.f[midx],
> -                                   page, TARGET_PAGE_SIZE, -1);
> -        tlb_flush_vtlb_page_locked(cpu, midx, page);
> +        return;
> +    }
> +
> +    tlbfast_flush_range_locked(desc, &cpu->neg.tlb.f[midx],
> +                               page, TARGET_PAGE_SIZE, -1);
> +    tlb_flush_vtlb_page_locked(cpu, midx, page);
> +
> +    node = tlbtree_lookup_addr(desc, page);
> +    if (node) {
> +        interval_tree_remove(&node->itree, &desc->iroot);
> +        g_free(node);
>       }
>   }
>   

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>




* Re: [PATCH v2 12/54] accel/tcg: Remove IntervalTree entries in tlb_flush_range_locked
  2024-11-14 16:00 ` [PATCH v2 12/54] accel/tcg: Remove IntervalTree entries in tlb_flush_range_locked Richard Henderson
@ 2024-11-14 18:01   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:01 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel



On 11/14/24 08:00, Richard Henderson wrote:
> Flush a masked range of pages from the IntervalTree cache.
> When the mask is not used, there is a redundant comparison,
> but that is better than duplicating code at this point.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   accel/tcg/cputlb.c | 25 +++++++++++++++++++++++++
>   1 file changed, 25 insertions(+)
> 
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index d532d69083..e2c855f147 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -311,6 +311,13 @@ static CPUTLBEntryTree *tlbtree_lookup_range(CPUTLBDesc *desc, vaddr s, vaddr l)
>       return i ? container_of(i, CPUTLBEntryTree, itree) : NULL;
>   }
>   
> +static CPUTLBEntryTree *tlbtree_lookup_range_next(CPUTLBEntryTree *prev,
> +                                                  vaddr s, vaddr l)
> +{
> +    IntervalTreeNode *i = interval_tree_iter_next(&prev->itree, s, l);
> +    return i ? container_of(i, CPUTLBEntryTree, itree) : NULL;
> +}
> +
>   static CPUTLBEntryTree *tlbtree_lookup_addr(CPUTLBDesc *desc, vaddr addr)
>   {
>       return tlbtree_lookup_range(desc, addr, addr);
> @@ -739,6 +746,8 @@ static void tlb_flush_range_locked(CPUState *cpu, int midx,
>       CPUTLBDesc *d = &cpu->neg.tlb.d[midx];
>       CPUTLBDescFast *f = &cpu->neg.tlb.f[midx];
>       vaddr mask = MAKE_64BIT_MASK(0, bits);
> +    CPUTLBEntryTree *node;
> +    vaddr addr_mask, last_mask, last_imask;
>   
>       /*
>        * Check if we need to flush due to large pages.
> @@ -759,6 +768,22 @@ static void tlb_flush_range_locked(CPUState *cpu, int midx,
>           vaddr page = addr + i;
>           tlb_flush_vtlb_page_mask_locked(cpu, midx, page, mask);
>       }
> +
> +    addr_mask = addr & mask;
> +    last_mask = addr_mask + len - 1;
> +    last_imask = last_mask | ~mask;
> +    node = tlbtree_lookup_range(d, addr_mask, last_imask);
> +    while (node) {
> +        CPUTLBEntryTree *next =
> +            tlbtree_lookup_range_next(node, addr_mask, last_imask);
> +        vaddr page_mask = node->itree.start & mask;
> +
> +        if (page_mask >= addr_mask && page_mask < last_mask) {
> +            interval_tree_remove(&node->itree, &d->iroot);
> +            g_free(node);
> +        }
> +        node = next;
> +    }
>   }
>   
>   typedef struct {

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
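
The addr_mask/last_mask/last_imask arithmetic is compact enough that a
worked example helps. The following is a standalone sketch (plain C, no
QEMU dependencies; the flush parameters and itree.start values are
invented) of which node starts the loop removes, including aliases that
differ only in the bits above the mask:

    #include <stdio.h>
    #include <stdint.h>
    #include <inttypes.h>

    /* Same shape as QEMU's MAKE_64BIT_MASK(0, len). */
    #define MASK_LOW(len)  ((~0ULL) >> (64 - (len)))

    int main(void)
    {
        /* Hypothetical flush: addr 0x5000, len 0x2000, bits 16. */
        uint64_t mask = MASK_LOW(16);
        uint64_t addr_mask = 0x5000 & mask;            /* 0x5000 */
        uint64_t last_mask = addr_mask + 0x2000 - 1;   /* 0x6fff */
        uint64_t last_imask = last_mask | ~mask;       /* tree query bound */
        uint64_t starts[] = { 0x4000, 0x5000, 0x15000, 0x16000, 0x17000 };

        for (int i = 0; i < 5; i++) {
            uint64_t page_mask = starts[i] & mask;
            printf("start=0x%-6" PRIx64 " -> %s\n", starts[i],
                   (page_mask >= addr_mask && page_mask < last_mask)
                   ? "flush" : "keep");
        }
        (void)last_imask;  /* only used to widen the interval-tree query */
        return 0;
    }

The widened [addr_mask, last_imask] query is what lets the lookup reach
nodes such as 0x15000 and 0x16000, which alias the flushed range once
the bits above the mask are ignored; 0x4000 and 0x17000 survive the
page_mask test and are kept.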




* Re: [PATCH v2 13/54] accel/tcg: Process IntervalTree entries in tlb_reset_dirty
  2024-11-14 16:00 ` [PATCH v2 13/54] accel/tcg: Process IntervalTree entries in tlb_reset_dirty Richard Henderson
@ 2024-11-14 18:02   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:02 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel



On 11/14/24 08:00, Richard Henderson wrote:
> Update the addr_write copy within each interval tree node.
> Tidy the iteration within the other two loops as well.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   accel/tcg/cputlb.c | 19 +++++++++++--------
>   1 file changed, 11 insertions(+), 8 deletions(-)
> 
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index e2c855f147..0c9f834cbe 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -1010,17 +1010,20 @@ void tlb_reset_dirty(CPUState *cpu, ram_addr_t start1, ram_addr_t length)
>   
>       qemu_spin_lock(&cpu->neg.tlb.c.lock);
>       for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
> -        unsigned int i;
> -        unsigned int n = tlb_n_entries(&cpu->neg.tlb.f[mmu_idx]);
> +        CPUTLBDesc *desc = &cpu->neg.tlb.d[mmu_idx];
> +        CPUTLBDescFast *fast = &cpu->neg.tlb.f[mmu_idx];
>   
> -        for (i = 0; i < n; i++) {
> -            tlb_reset_dirty_range_locked(&cpu->neg.tlb.f[mmu_idx].table[i],
> -                                         start1, length);
> +        for (size_t i = 0, n = tlb_n_entries(fast); i < n; i++) {
> +            tlb_reset_dirty_range_locked(&fast->table[i], start1, length);
>           }
>   
> -        for (i = 0; i < CPU_VTLB_SIZE; i++) {
> -            tlb_reset_dirty_range_locked(&cpu->neg.tlb.d[mmu_idx].vtable[i],
> -                                         start1, length);
> +        for (size_t i = 0; i < CPU_VTLB_SIZE; i++) {
> +            tlb_reset_dirty_range_locked(&desc->vtable[i], start1, length);
> +        }
> +
> +        for (CPUTLBEntryTree *t = tlbtree_lookup_range(desc, 0, -1); t;
> +             t = tlbtree_lookup_range_next(t, 0, -1)) {
> +            tlb_reset_dirty_range_locked(&t->copy, start1, length);
>           }
>       }
>       qemu_spin_unlock(&cpu->neg.tlb.c.lock);

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
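
The [0, -1] bounds turn the range lookup into a walk over every node in
the tree. Reduced to QEMU's generic interval-tree API, the pattern looks
like the sketch below (MyEntry and its payload are hypothetical; this
only compiles inside the QEMU tree):

    #include "qemu/osdep.h"          /* for container_of */
    #include "qemu/interval-tree.h"

    typedef struct MyEntry {
        IntervalTreeNode itree;
        uint64_t payload;
    } MyEntry;

    static void visit_all(IntervalTreeRoot *root)
    {
        /* start = 0, last = -1 covers the whole address space. */
        for (IntervalTreeNode *i = interval_tree_iter_first(root, 0, -1);
             i; i = interval_tree_iter_next(i, 0, -1)) {
            MyEntry *e = container_of(i, MyEntry, itree);
            (void)e;  /* ...process e->payload here... */
        }
    }

tlbtree_lookup_range() and tlbtree_lookup_range_next() are exactly this
iteration with the container_of() folded in.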




* Re: [PATCH v2 14/54] accel/tcg: Process IntervalTree entries in tlb_set_dirty
  2024-11-14 16:00 ` [PATCH v2 14/54] accel/tcg: Process IntervalTree entries in tlb_set_dirty Richard Henderson
@ 2024-11-14 18:02   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:02 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel



On 11/14/24 08:00, Richard Henderson wrote:
> Update the addr_write copy within an interval tree node.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   accel/tcg/cputlb.c | 17 +++++++++++------
>   1 file changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 0c9f834cbe..eb85e96ee2 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -1049,13 +1049,18 @@ static void tlb_set_dirty(CPUState *cpu, vaddr addr)
>       addr &= TARGET_PAGE_MASK;
>       qemu_spin_lock(&cpu->neg.tlb.c.lock);
>       for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
> -        tlb_set_dirty1_locked(tlb_entry(cpu, mmu_idx, addr), addr);
> -    }
> +        CPUTLBDesc *desc = &cpu->neg.tlb.d[mmu_idx];
> +        CPUTLBEntryTree *node;
>   
> -    for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
> -        int k;
> -        for (k = 0; k < CPU_VTLB_SIZE; k++) {
> -            tlb_set_dirty1_locked(&cpu->neg.tlb.d[mmu_idx].vtable[k], addr);
> +        tlb_set_dirty1_locked(tlb_entry(cpu, mmu_idx, addr), addr);
> +
> +        for (int k = 0; k < CPU_VTLB_SIZE; k++) {
> +            tlb_set_dirty1_locked(&desc->vtable[k], addr);
> +        }
> +
> +        node = tlbtree_lookup_addr(desc, addr);
> +        if (node) {
> +            tlb_set_dirty1_locked(&node->copy, addr);
>           }
>       }
>       qemu_spin_unlock(&cpu->neg.tlb.c.lock);

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>




* Re: [PATCH v2 15/54] accel/tcg: Use tlb_hit_page in victim_tlb_hit
  2024-11-14 16:00 ` [PATCH v2 15/54] accel/tcg: Use tlb_hit_page in victim_tlb_hit Richard Henderson
@ 2024-11-14 18:03   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:03 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel



On 11/14/24 08:00, Richard Henderson wrote:
> This is clearer than directly comparing the
> page address and the comparator.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   accel/tcg/cputlb.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index eb85e96ee2..7ecd327297 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -1394,9 +1394,8 @@ static bool victim_tlb_hit(CPUState *cpu, size_t mmu_idx, size_t index,
>       assert_cpu_is_self(cpu);
>       for (vidx = 0; vidx < CPU_VTLB_SIZE; ++vidx) {
>           CPUTLBEntry *vtlb = &cpu->neg.tlb.d[mmu_idx].vtable[vidx];
> -        uint64_t cmp = tlb_read_idx(vtlb, access_type);
>   
> -        if (cmp == page) {
> +        if (tlb_hit_page(tlb_read_idx(vtlb, access_type), page)) {
>               /* Found entry in victim tlb, swap tlb and iotlb.  */
>               CPUTLBEntry tmptlb, *tlb = &cpu->neg.tlb.f[mmu_idx].table[index];
>   

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
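
The distinction this patch and the next one lean on: tlb_hit_page()
expects an already page-aligned address, while tlb_hit() masks the
address itself. A self-contained restatement with stand-in constants
(the real helpers and flag values live in the exec headers):

    #include <stdbool.h>
    #include <stdint.h>

    #define PAGE_MASK_    (~(uint64_t)0xfff)  /* stand-in: TARGET_PAGE_MASK */
    #define INVALID_MASK  ((uint64_t)1)       /* stand-in: TLB_INVALID_MASK */

    /* Hit only if 'page' is page-aligned and the entry is valid. */
    static bool tlb_hit_page(uint64_t tlb_addr, uint64_t page)
    {
        return page == (tlb_addr & (PAGE_MASK_ | INVALID_MASK));
    }

    /* Same test for an arbitrary address within the page. */
    static bool tlb_hit(uint64_t tlb_addr, uint64_t addr)
    {
        return tlb_hit_page(tlb_addr, addr & PAGE_MASK_);
    }

The masking means flag bits in the comparator, other than the invalid
bit, do not defeat the match.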




* Re: [PATCH v2 16/54] accel/tcg: Pass full addr to victim_tlb_hit
  2024-11-14 16:00 ` [PATCH v2 16/54] accel/tcg: Pass full addr to victim_tlb_hit Richard Henderson
@ 2024-11-14 18:04   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:04 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel



On 11/14/24 08:00, Richard Henderson wrote:
> Do not mask the address to the page in these calls.
> It is easy enough to use a different helper instead.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   accel/tcg/cputlb.c | 15 ++++++---------
>   1 file changed, 6 insertions(+), 9 deletions(-)
> 
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 7ecd327297..3aab72ea82 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -1387,7 +1387,7 @@ static void io_failed(CPUState *cpu, CPUTLBEntryFull *full, vaddr addr,
>   /* Return true if ADDR is present in the victim tlb, and has been copied
>      back to the main tlb.  */
>   static bool victim_tlb_hit(CPUState *cpu, size_t mmu_idx, size_t index,
> -                           MMUAccessType access_type, vaddr page)
> +                           MMUAccessType access_type, vaddr addr)
>   {
>       size_t vidx;
>   
> @@ -1395,7 +1395,7 @@ static bool victim_tlb_hit(CPUState *cpu, size_t mmu_idx, size_t index,
>       for (vidx = 0; vidx < CPU_VTLB_SIZE; ++vidx) {
>           CPUTLBEntry *vtlb = &cpu->neg.tlb.d[mmu_idx].vtable[vidx];
>   
> -        if (tlb_hit_page(tlb_read_idx(vtlb, access_type), page)) {
> +        if (tlb_hit(tlb_read_idx(vtlb, access_type), addr)) {
>               /* Found entry in victim tlb, swap tlb and iotlb.  */
>               CPUTLBEntry tmptlb, *tlb = &cpu->neg.tlb.f[mmu_idx].table[index];
>   
> @@ -1448,13 +1448,12 @@ static int probe_access_internal(CPUState *cpu, vaddr addr,
>       uintptr_t index = tlb_index(cpu, mmu_idx, addr);
>       CPUTLBEntry *entry = tlb_entry(cpu, mmu_idx, addr);
>       uint64_t tlb_addr = tlb_read_idx(entry, access_type);
> -    vaddr page_addr = addr & TARGET_PAGE_MASK;
>       int flags = TLB_FLAGS_MASK & ~TLB_FORCE_SLOW;
>       bool force_mmio = check_mem_cbs && cpu_plugin_mem_cbs_enabled(cpu);
>       CPUTLBEntryFull *full;
>   
> -    if (!tlb_hit_page(tlb_addr, page_addr)) {
> -        if (!victim_tlb_hit(cpu, mmu_idx, index, access_type, page_addr)) {
> +    if (!tlb_hit(tlb_addr, addr)) {
> +        if (!victim_tlb_hit(cpu, mmu_idx, index, access_type, addr)) {
>               if (!tlb_fill_align(cpu, addr, access_type, mmu_idx,
>                                   0, fault_size, nonfault, retaddr)) {
>                   /* Non-faulting page table read failed.  */
> @@ -1734,8 +1733,7 @@ static bool mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
>   
>       /* If the TLB entry is for a different page, reload and try again.  */
>       if (!tlb_hit(tlb_addr, addr)) {
> -        if (!victim_tlb_hit(cpu, mmu_idx, index, access_type,
> -                            addr & TARGET_PAGE_MASK)) {
> +        if (!victim_tlb_hit(cpu, mmu_idx, index, access_type, addr)) {
>               tlb_fill_align(cpu, addr, access_type, mmu_idx,
>                              memop, data->size, false, ra);
>               maybe_resized = true;
> @@ -1914,8 +1912,7 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
>       /* Check TLB entry and enforce page permissions.  */
>       flags = TLB_FLAGS_MASK;
>       if (!tlb_hit(tlb_addr_write(tlbe), addr)) {
> -        if (!victim_tlb_hit(cpu, mmu_idx, index, MMU_DATA_STORE,
> -                            addr & TARGET_PAGE_MASK)) {
> +        if (!victim_tlb_hit(cpu, mmu_idx, index, MMU_DATA_STORE, addr)) {
>               tlb_fill_align(cpu, addr, MMU_DATA_STORE, mmu_idx,
>                              mop, size, false, retaddr);
>               did_tlb_fill = true;

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>




* Re: [PATCH v2 17/54] accel/tcg: Replace victim_tlb_hit with tlbtree_hit
  2024-11-14 16:00 ` [PATCH v2 17/54] accel/tcg: Replace victim_tlb_hit with tlbtree_hit Richard Henderson
@ 2024-11-14 18:06   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:06 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:00, Richard Henderson wrote:
> Change from a linear search on the victim tlb
> to a balanced binary tree search on the interval tree.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   accel/tcg/cputlb.c | 59 ++++++++++++++++++++++++----------------------
>   1 file changed, 31 insertions(+), 28 deletions(-)
> 
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 3aab72ea82..ea4b78866b 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -1384,35 +1384,38 @@ static void io_failed(CPUState *cpu, CPUTLBEntryFull *full, vaddr addr,
>       }
>   }
>   
> -/* Return true if ADDR is present in the victim tlb, and has been copied
> -   back to the main tlb.  */
> -static bool victim_tlb_hit(CPUState *cpu, size_t mmu_idx, size_t index,
> -                           MMUAccessType access_type, vaddr addr)
> +/*
> + * Return true if ADDR is present in the interval tree,
> + * and has been copied back to the main tlb.
> + */
> +static bool tlbtree_hit(CPUState *cpu, int mmu_idx,
> +                        MMUAccessType access_type, vaddr addr)
>   {
> -    size_t vidx;
> +    CPUTLBDesc *desc = &cpu->neg.tlb.d[mmu_idx];
> +    CPUTLBDescFast *fast = &cpu->neg.tlb.f[mmu_idx];
> +    CPUTLBEntryTree *node;
> +    size_t index;
>   
>       assert_cpu_is_self(cpu);
> -    for (vidx = 0; vidx < CPU_VTLB_SIZE; ++vidx) {
> -        CPUTLBEntry *vtlb = &cpu->neg.tlb.d[mmu_idx].vtable[vidx];
> -
> -        if (tlb_hit(tlb_read_idx(vtlb, access_type), addr)) {
> -            /* Found entry in victim tlb, swap tlb and iotlb.  */
> -            CPUTLBEntry tmptlb, *tlb = &cpu->neg.tlb.f[mmu_idx].table[index];
> -
> -            qemu_spin_lock(&cpu->neg.tlb.c.lock);
> -            copy_tlb_helper_locked(&tmptlb, tlb);
> -            copy_tlb_helper_locked(tlb, vtlb);
> -            copy_tlb_helper_locked(vtlb, &tmptlb);
> -            qemu_spin_unlock(&cpu->neg.tlb.c.lock);
> -
> -            CPUTLBEntryFull *f1 = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
> -            CPUTLBEntryFull *f2 = &cpu->neg.tlb.d[mmu_idx].vfulltlb[vidx];
> -            CPUTLBEntryFull tmpf;
> -            tmpf = *f1; *f1 = *f2; *f2 = tmpf;
> -            return true;
> -        }
> +    node = tlbtree_lookup_addr(desc, addr);
> +    if (!node) {
> +        /* There is no cached mapping for this page. */
> +        return false;
>       }
> -    return false;
> +
> +    if (!tlb_hit(tlb_read_idx(&node->copy, access_type), addr)) {
> +        /* This access is not permitted. */
> +        return false;
> +    }
> +
> +    /* Install the cached entry. */
> +    index = tlbfast_index(fast, addr);
> +    qemu_spin_lock(&cpu->neg.tlb.c.lock);
> +    copy_tlb_helper_locked(&fast->table[index], &node->copy);
> +    qemu_spin_unlock(&cpu->neg.tlb.c.lock);
> +
> +    desc->fulltlb[index] = node->full;
> +    return true;
>   }
>   
>   static void notdirty_write(CPUState *cpu, vaddr mem_vaddr, unsigned size,
> @@ -1453,7 +1456,7 @@ static int probe_access_internal(CPUState *cpu, vaddr addr,
>       CPUTLBEntryFull *full;
>   
>       if (!tlb_hit(tlb_addr, addr)) {
> -        if (!victim_tlb_hit(cpu, mmu_idx, index, access_type, addr)) {
> +        if (!tlbtree_hit(cpu, mmu_idx, access_type, addr)) {
>               if (!tlb_fill_align(cpu, addr, access_type, mmu_idx,
>                                   0, fault_size, nonfault, retaddr)) {
>                   /* Non-faulting page table read failed.  */
> @@ -1733,7 +1736,7 @@ static bool mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
>   
>       /* If the TLB entry is for a different page, reload and try again.  */
>       if (!tlb_hit(tlb_addr, addr)) {
> -        if (!victim_tlb_hit(cpu, mmu_idx, index, access_type, addr)) {
> +        if (!tlbtree_hit(cpu, mmu_idx, access_type, addr)) {
>               tlb_fill_align(cpu, addr, access_type, mmu_idx,
>                              memop, data->size, false, ra);
>               maybe_resized = true;
> @@ -1912,7 +1915,7 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
>       /* Check TLB entry and enforce page permissions.  */
>       flags = TLB_FLAGS_MASK;
>       if (!tlb_hit(tlb_addr_write(tlbe), addr)) {
> -        if (!victim_tlb_hit(cpu, mmu_idx, index, MMU_DATA_STORE, addr)) {
> +        if (!tlbtree_hit(cpu, mmu_idx, MMU_DATA_STORE, addr)) {
>               tlb_fill_align(cpu, addr, MMU_DATA_STORE, mmu_idx,
>                              mop, size, false, retaddr);
>               did_tlb_fill = true;

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>




* Re: [PATCH v2 18/54] accel/tcg: Remove the victim tlb
  2024-11-14 16:00 ` [PATCH v2 18/54] accel/tcg: Remove the victim tlb Richard Henderson
@ 2024-11-14 18:07   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:07 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel



On 11/14/24 08:00, Richard Henderson wrote:
> This has been functionally replaced by the IntervalTree.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   include/hw/core/cpu.h |  8 -----
>   accel/tcg/cputlb.c    | 74 -------------------------------------------
>   2 files changed, 82 deletions(-)
> 
> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> index 1ebc999a73..8eda0574b2 100644
> --- a/include/hw/core/cpu.h
> +++ b/include/hw/core/cpu.h
> @@ -201,9 +201,6 @@ struct CPUClass {
>    */
>   #define NB_MMU_MODES 16
>   
> -/* Use a fully associative victim tlb of 8 entries. */
> -#define CPU_VTLB_SIZE 8
> -
>   /*
>    * The full TLB entry, which is not accessed by generated TCG code,
>    * so the layout is not as critical as that of CPUTLBEntry. This is
> @@ -285,11 +282,6 @@ typedef struct CPUTLBDesc {
>       /* maximum number of entries observed in the window */
>       size_t window_max_entries;
>       size_t n_used_entries;
> -    /* The next index to use in the tlb victim table.  */
> -    size_t vindex;
> -    /* The tlb victim table, in two parts.  */
> -    CPUTLBEntry vtable[CPU_VTLB_SIZE];
> -    CPUTLBEntryFull vfulltlb[CPU_VTLB_SIZE];
>       CPUTLBEntryFull *fulltlb;
>       /* All active tlb entries for this address space. */
>       IntervalTreeRoot iroot;
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index ea4b78866b..8caa8c0f1d 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -328,8 +328,6 @@ static void tlb_mmu_flush_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast)
>       tlbfast_flush_locked(desc, fast);
>       desc->large_page_addr = -1;
>       desc->large_page_mask = -1;
> -    desc->vindex = 0;
> -    memset(desc->vtable, -1, sizeof(desc->vtable));
>       interval_tree_free_nodes(&desc->iroot, offsetof(CPUTLBEntryTree, itree));
>   }
>   
> @@ -361,11 +359,6 @@ static inline void tlb_n_used_entries_inc(CPUState *cpu, uintptr_t mmu_idx)
>       cpu->neg.tlb.d[mmu_idx].n_used_entries++;
>   }
>   
> -static inline void tlb_n_used_entries_dec(CPUState *cpu, uintptr_t mmu_idx)
> -{
> -    cpu->neg.tlb.d[mmu_idx].n_used_entries--;
> -}
> -
>   void tlb_init(CPUState *cpu)
>   {
>       int64_t now = get_clock_realtime();
> @@ -496,20 +489,6 @@ static bool tlb_hit_page_mask_anyprot(CPUTLBEntry *tlb_entry,
>               page == (tlb_entry->addr_code & mask));
>   }
>   
> -static inline bool tlb_hit_page_anyprot(CPUTLBEntry *tlb_entry, vaddr page)
> -{
> -    return tlb_hit_page_mask_anyprot(tlb_entry, page, -1);
> -}
> -
> -/**
> - * tlb_entry_is_empty - return true if the entry is not in use
> - * @te: pointer to CPUTLBEntry
> - */
> -static inline bool tlb_entry_is_empty(const CPUTLBEntry *te)
> -{
> -    return te->addr_read == -1 && te->addr_write == -1 && te->addr_code == -1;
> -}
> -
>   /* Called with tlb_c.lock held */
>   static bool tlb_flush_entry_mask_locked(CPUTLBEntry *tlb_entry,
>                                           vaddr page,
> @@ -522,28 +501,6 @@ static bool tlb_flush_entry_mask_locked(CPUTLBEntry *tlb_entry,
>       return false;
>   }
>   
> -/* Called with tlb_c.lock held */
> -static void tlb_flush_vtlb_page_mask_locked(CPUState *cpu, int mmu_idx,
> -                                            vaddr page,
> -                                            vaddr mask)
> -{
> -    CPUTLBDesc *d = &cpu->neg.tlb.d[mmu_idx];
> -    int k;
> -
> -    assert_cpu_is_self(cpu);
> -    for (k = 0; k < CPU_VTLB_SIZE; k++) {
> -        if (tlb_flush_entry_mask_locked(&d->vtable[k], page, mask)) {
> -            tlb_n_used_entries_dec(cpu, mmu_idx);
> -        }
> -    }
> -}
> -
> -static inline void tlb_flush_vtlb_page_locked(CPUState *cpu, int mmu_idx,
> -                                              vaddr page)
> -{
> -    tlb_flush_vtlb_page_mask_locked(cpu, mmu_idx, page, -1);
> -}
> -
>   static void tlbfast_flush_range_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast,
>                                          vaddr addr, vaddr len, vaddr mask)
>   {
> @@ -588,7 +545,6 @@ static void tlb_flush_page_locked(CPUState *cpu, int midx, vaddr page)
>   
>       tlbfast_flush_range_locked(desc, &cpu->neg.tlb.f[midx],
>                                  page, TARGET_PAGE_SIZE, -1);
> -    tlb_flush_vtlb_page_locked(cpu, midx, page);
>   
>       node = tlbtree_lookup_addr(desc, page);
>       if (node) {
> @@ -764,11 +720,6 @@ static void tlb_flush_range_locked(CPUState *cpu, int midx,
>   
>       tlbfast_flush_range_locked(d, f, addr, len, mask);
>   
> -    for (vaddr i = 0; i < len; i += TARGET_PAGE_SIZE) {
> -        vaddr page = addr + i;
> -        tlb_flush_vtlb_page_mask_locked(cpu, midx, page, mask);
> -    }
> -
>       addr_mask = addr & mask;
>       last_mask = addr_mask + len - 1;
>       last_imask = last_mask | ~mask;
> @@ -1017,10 +968,6 @@ void tlb_reset_dirty(CPUState *cpu, ram_addr_t start1, ram_addr_t length)
>               tlb_reset_dirty_range_locked(&fast->table[i], start1, length);
>           }
>   
> -        for (size_t i = 0; i < CPU_VTLB_SIZE; i++) {
> -            tlb_reset_dirty_range_locked(&desc->vtable[i], start1, length);
> -        }
> -
>           for (CPUTLBEntryTree *t = tlbtree_lookup_range(desc, 0, -1); t;
>                t = tlbtree_lookup_range_next(t, 0, -1)) {
>               tlb_reset_dirty_range_locked(&t->copy, start1, length);
> @@ -1054,10 +1001,6 @@ static void tlb_set_dirty(CPUState *cpu, vaddr addr)
>   
>           tlb_set_dirty1_locked(tlb_entry(cpu, mmu_idx, addr), addr);
>   
> -        for (int k = 0; k < CPU_VTLB_SIZE; k++) {
> -            tlb_set_dirty1_locked(&desc->vtable[k], addr);
> -        }
> -
>           node = tlbtree_lookup_addr(desc, addr);
>           if (node) {
>               tlb_set_dirty1_locked(&node->copy, addr);
> @@ -1216,23 +1159,6 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
>       /* Note that the tlb is no longer clean.  */
>       tlb->c.dirty |= 1 << mmu_idx;
>   
> -    /* Make sure there's no cached translation for the new page.  */
> -    tlb_flush_vtlb_page_locked(cpu, mmu_idx, addr_page);
> -
> -    /*
> -     * Only evict the old entry to the victim tlb if it's for a
> -     * different page; otherwise just overwrite the stale data.
> -     */
> -    if (!tlb_hit_page_anyprot(te, addr_page) && !tlb_entry_is_empty(te)) {
> -        unsigned vidx = desc->vindex++ % CPU_VTLB_SIZE;
> -        CPUTLBEntry *tv = &desc->vtable[vidx];
> -
> -        /* Evict the old entry into the victim tlb.  */
> -        copy_tlb_helper_locked(tv, te);
> -        desc->vfulltlb[vidx] = desc->fulltlb[index];
> -        tlb_n_used_entries_dec(cpu, mmu_idx);
> -    }
> -
>       /* Replace an old IntervalTree entry, or create a new one. */
>       node = tlbtree_lookup_addr(desc, addr_page);
>       if (!node) {

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>




* Re: [PATCH v2 19/54] accel/tcg: Remove tlb_n_used_entries_inc
  2024-11-14 16:00 ` [PATCH v2 19/54] accel/tcg: Remove tlb_n_used_entries_inc Richard Henderson
@ 2024-11-14 18:07   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:07 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:00, Richard Henderson wrote:
> Expand the function into its only caller, using the
> existing CPUTLBDesc local pointer.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   accel/tcg/cputlb.c | 7 +------
>   1 file changed, 1 insertion(+), 6 deletions(-)
> 
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 8caa8c0f1d..3e24529f4f 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -354,11 +354,6 @@ static void tlb_mmu_init(CPUTLBDesc *desc, CPUTLBDescFast *fast, int64_t now)
>       tlb_mmu_flush_locked(desc, fast);
>   }
>   
> -static inline void tlb_n_used_entries_inc(CPUState *cpu, uintptr_t mmu_idx)
> -{
> -    cpu->neg.tlb.d[mmu_idx].n_used_entries++;
> -}
> -
>   void tlb_init(CPUState *cpu)
>   {
>       int64_t now = get_clock_realtime();
> @@ -1211,7 +1206,7 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
>   
>       node->full = *full;
>       copy_tlb_helper_locked(te, &node->copy);
> -    tlb_n_used_entries_inc(cpu, mmu_idx);
> +    desc->n_used_entries++;
>       qemu_spin_unlock(&tlb->c.lock);
>   }
>   

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>




* Re: [PATCH v2 20/54] include/exec/tlb-common: Move CPUTLBEntryFull from hw/core/cpu.h
  2024-11-14 16:00 ` [PATCH v2 20/54] include/exec/tlb-common: Move CPUTLBEntryFull from hw/core/cpu.h Richard Henderson
@ 2024-11-14 18:08   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:08 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:00, Richard Henderson wrote:
> CPUTLBEntryFull structures are no longer directly included within
> the CPUState structure.  Move the structure definition out of cpu.h
> to reduce visibility.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   include/exec/tlb-common.h | 63 +++++++++++++++++++++++++++++++++++++++
>   include/hw/core/cpu.h     | 63 ---------------------------------------
>   2 files changed, 63 insertions(+), 63 deletions(-)
> 
> diff --git a/include/exec/tlb-common.h b/include/exec/tlb-common.h
> index dc5a5faa0b..300f9fae67 100644
> --- a/include/exec/tlb-common.h
> +++ b/include/exec/tlb-common.h
> @@ -53,4 +53,67 @@ typedef struct CPUTLBDescFast {
>       CPUTLBEntry *table;
>   } CPUTLBDescFast QEMU_ALIGNED(2 * sizeof(void *));
>   
> +/*
> + * The full TLB entry, which is not accessed by generated TCG code,
> + * so the layout is not as critical as that of CPUTLBEntry. This is
> + * also why we don't want to combine the two structs.
> + */
> +struct CPUTLBEntryFull {
> +    /*
> +     * @xlat_section contains:
> +     *  - in the lower TARGET_PAGE_BITS, a physical section number
> +     *  - with the lower TARGET_PAGE_BITS masked off, an offset which
> +     *    must be added to the virtual address to obtain:
> +     *     + the ram_addr_t of the target RAM (if the physical section
> +     *       number is PHYS_SECTION_NOTDIRTY or PHYS_SECTION_ROM)
> +     *     + the offset within the target MemoryRegion (otherwise)
> +     */
> +    hwaddr xlat_section;
> +
> +    /*
> +     * @phys_addr contains the physical address in the address space
> +     * given by cpu_asidx_from_attrs(cpu, @attrs).
> +     */
> +    hwaddr phys_addr;
> +
> +    /* @attrs contains the memory transaction attributes for the page. */
> +    MemTxAttrs attrs;
> +
> +    /* @prot contains the complete protections for the page. */
> +    uint8_t prot;
> +
> +    /* @lg_page_size contains the log2 of the page size. */
> +    uint8_t lg_page_size;
> +
> +    /* Additional tlb flags requested by tlb_fill. */
> +    uint8_t tlb_fill_flags;
> +
> +    /*
> +     * Additional tlb flags for use by the slow path. If non-zero,
> +     * the corresponding CPUTLBEntry comparator must have TLB_FORCE_SLOW.
> +     */
> +    uint8_t slow_flags[MMU_ACCESS_COUNT];
> +
> +    /*
> +     * Allow target-specific additions to this structure.
> +     * This may be used to cache items from the guest cpu
> +     * page tables for later use by the implementation.
> +     */
> +    union {
> +        /*
> +         * Cache the attrs and shareability fields from the page table entry.
> +         *
> +         * For ARMMMUIdx_Stage2*, pte_attrs is the S2 descriptor bits [5:2].
> +         * Otherwise, pte_attrs is the same as the MAIR_EL1 8-bit format.
> +         * For shareability and guarded, as in the SH and GP fields respectively
> +         * of the VMSAv8-64 PTEs.
> +         */
> +        struct {
> +            uint8_t pte_attrs;
> +            uint8_t shareability;
> +            bool guarded;
> +        } arm;
> +    } extra;
> +};
> +
>   #endif /* EXEC_TLB_COMMON_H */
> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> index 8eda0574b2..4364ddb1db 100644
> --- a/include/hw/core/cpu.h
> +++ b/include/hw/core/cpu.h
> @@ -201,69 +201,6 @@ struct CPUClass {
>    */
>   #define NB_MMU_MODES 16
>   
> -/*
> - * The full TLB entry, which is not accessed by generated TCG code,
> - * so the layout is not as critical as that of CPUTLBEntry. This is
> - * also why we don't want to combine the two structs.
> - */
> -struct CPUTLBEntryFull {
> -    /*
> -     * @xlat_section contains:
> -     *  - in the lower TARGET_PAGE_BITS, a physical section number
> -     *  - with the lower TARGET_PAGE_BITS masked off, an offset which
> -     *    must be added to the virtual address to obtain:
> -     *     + the ram_addr_t of the target RAM (if the physical section
> -     *       number is PHYS_SECTION_NOTDIRTY or PHYS_SECTION_ROM)
> -     *     + the offset within the target MemoryRegion (otherwise)
> -     */
> -    hwaddr xlat_section;
> -
> -    /*
> -     * @phys_addr contains the physical address in the address space
> -     * given by cpu_asidx_from_attrs(cpu, @attrs).
> -     */
> -    hwaddr phys_addr;
> -
> -    /* @attrs contains the memory transaction attributes for the page. */
> -    MemTxAttrs attrs;
> -
> -    /* @prot contains the complete protections for the page. */
> -    uint8_t prot;
> -
> -    /* @lg_page_size contains the log2 of the page size. */
> -    uint8_t lg_page_size;
> -
> -    /* Additional tlb flags requested by tlb_fill. */
> -    uint8_t tlb_fill_flags;
> -
> -    /*
> -     * Additional tlb flags for use by the slow path. If non-zero,
> -     * the corresponding CPUTLBEntry comparator must have TLB_FORCE_SLOW.
> -     */
> -    uint8_t slow_flags[MMU_ACCESS_COUNT];
> -
> -    /*
> -     * Allow target-specific additions to this structure.
> -     * This may be used to cache items from the guest cpu
> -     * page tables for later use by the implementation.
> -     */
> -    union {
> -        /*
> -         * Cache the attrs and shareability fields from the page table entry.
> -         *
> -         * For ARMMMUIdx_Stage2*, pte_attrs is the S2 descriptor bits [5:2].
> -         * Otherwise, pte_attrs is the same as the MAIR_EL1 8-bit format.
> -         * For shareability and guarded, as in the SH and GP fields respectively
> -         * of the VMSAv8-64 PTEs.
> -         */
> -        struct {
> -            uint8_t pte_attrs;
> -            uint8_t shareability;
> -            bool guarded;
> -        } arm;
> -    } extra;
> -};
> -
>   /*
>    * Data elements that are per MMU mode, minus the bits accessed by
>    * the TCG fast path.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>




* Re: [PATCH v2 21/54] accel/tcg: Delay plugin adjustment in probe_access_internal
  2024-11-14 16:00 ` [PATCH v2 21/54] accel/tcg: Delay plugin adjustment in probe_access_internal Richard Henderson
@ 2024-11-14 18:09   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:09 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:00, Richard Henderson wrote:
> Remove force_mmio and fold its expression directly into the IF
> condition, behind the short-circuit tests that may allow its
> computation to be skipped.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   accel/tcg/cputlb.c | 12 ++++++++----
>   1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 3e24529f4f..a4c69bcbf1 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -1373,7 +1373,6 @@ static int probe_access_internal(CPUState *cpu, vaddr addr,
>       CPUTLBEntry *entry = tlb_entry(cpu, mmu_idx, addr);
>       uint64_t tlb_addr = tlb_read_idx(entry, access_type);
>       int flags = TLB_FLAGS_MASK & ~TLB_FORCE_SLOW;
> -    bool force_mmio = check_mem_cbs && cpu_plugin_mem_cbs_enabled(cpu);
>       CPUTLBEntryFull *full;
>   
>       if (!tlb_hit(tlb_addr, addr)) {
> @@ -1404,9 +1403,14 @@ static int probe_access_internal(CPUState *cpu, vaddr addr,
>       *pfull = full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
>       flags |= full->slow_flags[access_type];
>   
> -    /* Fold all "mmio-like" bits into TLB_MMIO.  This is not RAM.  */
> -    if (unlikely(flags & ~(TLB_WATCHPOINT | TLB_NOTDIRTY | TLB_CHECK_ALIGNED))
> -        || (access_type != MMU_INST_FETCH && force_mmio)) {
> +    /*
> +     * Fold all "mmio-like" bits, and required plugin callbacks, to TLB_MMIO.
> +     * These cannot be treated as RAM.
> +     */
> +    if ((flags & ~(TLB_WATCHPOINT | TLB_NOTDIRTY | TLB_CHECK_ALIGNED))
> +        || (access_type != MMU_INST_FETCH
> +            && check_mem_cbs
> +            && cpu_plugin_mem_cbs_enabled(cpu))) {
>           *phost = NULL;
>           return TLB_MMIO;
>       }

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>




* Re: [PATCH v2 22/54] accel/tcg: Call cpu_ld*_code_mmu from cpu_ld*_code
  2024-11-14 16:00 ` [PATCH v2 22/54] accel/tcg: Call cpu_ld*_code_mmu from cpu_ld*_code Richard Henderson
@ 2024-11-14 18:09   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:09 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:00, Richard Henderson wrote:
> Ensure a common entry point for all code lookups.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   accel/tcg/cputlb.c | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index a4c69bcbf1..c975dd2322 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -2924,28 +2924,28 @@ uint32_t cpu_ldub_code(CPUArchState *env, abi_ptr addr)
>   {
>       CPUState *cs = env_cpu(env);
>       MemOpIdx oi = make_memop_idx(MO_UB, cpu_mmu_index(cs, true));
> -    return do_ld1_mmu(cs, addr, oi, 0, MMU_INST_FETCH);
> +    return cpu_ldb_code_mmu(env, addr, oi, 0);
>   }
>   
>   uint32_t cpu_lduw_code(CPUArchState *env, abi_ptr addr)
>   {
>       CPUState *cs = env_cpu(env);
>       MemOpIdx oi = make_memop_idx(MO_TEUW, cpu_mmu_index(cs, true));
> -    return do_ld2_mmu(cs, addr, oi, 0, MMU_INST_FETCH);
> +    return cpu_ldw_code_mmu(env, addr, oi, 0);
>   }
>   
>   uint32_t cpu_ldl_code(CPUArchState *env, abi_ptr addr)
>   {
>       CPUState *cs = env_cpu(env);
>       MemOpIdx oi = make_memop_idx(MO_TEUL, cpu_mmu_index(cs, true));
> -    return do_ld4_mmu(cs, addr, oi, 0, MMU_INST_FETCH);
> +    return cpu_ldl_code_mmu(env, addr, oi, 0);
>   }
>   
>   uint64_t cpu_ldq_code(CPUArchState *env, abi_ptr addr)
>   {
>       CPUState *cs = env_cpu(env);
>       MemOpIdx oi = make_memop_idx(MO_TEUQ, cpu_mmu_index(cs, true));
> -    return do_ld8_mmu(cs, addr, oi, 0, MMU_INST_FETCH);
> +    return cpu_ldq_code_mmu(env, addr, oi, 0);
>   }
>   
>   uint8_t cpu_ldb_code_mmu(CPUArchState *env, abi_ptr addr,

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>




* Re: [PATCH v2 23/54] accel/tcg: Check original prot bits for read in atomic_mmu_lookup
  2024-11-14 16:00 ` [PATCH v2 23/54] accel/tcg: Check original prot bits for read in atomic_mmu_lookup Richard Henderson
@ 2024-11-14 18:09   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:09 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:00, Richard Henderson wrote:
> In the mists of time before CPUTLBEntryFull existed, we had to be
> clever to detect write-only pages.  Now we can directly
> test the saved prot bits, which is clearer.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   accel/tcg/cputlb.c | 6 ++----
>   1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index c975dd2322..ae3a99eb47 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -1854,14 +1854,13 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
>               flags &= ~TLB_INVALID_MASK;
>           }
>       }
> +    full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
>   
>       /*
>        * Let the guest notice RMW on a write-only page.
>        * We have just verified that the page is writable.
> -     * Subpage lookups may have left TLB_INVALID_MASK set,
> -     * but addr_read will only be -1 if PAGE_READ was unset.
>        */
> -    if (unlikely(tlbe->addr_read == -1)) {
> +    if (unlikely(!(full->prot & PAGE_READ))) {
>           tlb_fill_align(cpu, addr, MMU_DATA_LOAD, mmu_idx,
>                          0, size, false, retaddr);
>           /*
> @@ -1899,7 +1898,6 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
>       }
>   
>       hostaddr = (void *)((uintptr_t)addr + tlbe->addend);
> -    full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
>   
>       if (unlikely(flags & TLB_NOTDIRTY)) {
>           notdirty_write(cpu, addr, size, full, retaddr);

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>




* Re: [PATCH v2 24/54] accel/tcg: Preserve tlb flags in tlb_set_compare
  2024-11-14 16:01 ` [PATCH v2 24/54] accel/tcg: Preserve tlb flags in tlb_set_compare Richard Henderson
@ 2024-11-14 18:11   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:11 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> Before, if !enable, we squashed the entire address comparator to -1.
> This works because TLB_INVALID_MASK is set.  It seemed natural, because
> the tlb is cleared with memset of 0xff.
> 
> With this patch, we retain all of the other TLB_* bits even when
> the page is not enabled.  This works because TLB_INVALID_MASK is set.
> This will be used in a subsequent patch; the addr_read comparator
> contains the flags for pages that are executable but not readable.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   accel/tcg/cputlb.c | 16 +++++++---------
>   1 file changed, 7 insertions(+), 9 deletions(-)
> 
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index ae3a99eb47..585f4171cc 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -1032,15 +1032,13 @@ static inline void tlb_set_compare(CPUTLBEntryFull *full, CPUTLBEntry *ent,
>                                      vaddr address, int flags,
>                                      MMUAccessType access_type, bool enable)
>   {
> -    if (enable) {
> -        address |= flags & TLB_FLAGS_MASK;
> -        flags &= TLB_SLOW_FLAGS_MASK;
> -        if (flags) {
> -            address |= TLB_FORCE_SLOW;
> -        }
> -    } else {
> -        address = -1;
> -        flags = 0;
> +    if (!enable) {
> +        address = TLB_INVALID_MASK;
> +    }
> +    address |= flags & TLB_FLAGS_MASK;
> +    flags &= TLB_SLOW_FLAGS_MASK;
> +    if (flags) {
> +        address |= TLB_FORCE_SLOW;
>       }
>       ent->addr_idx[access_type] = address;
>       full->slow_flags[access_type] = flags;

Good that you extracted that from the original patch; it's much clearer now.
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
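
A standalone sketch of the folding logic with invented flag bits (the
store of slow flags into CPUTLBEntryFull is omitted here) makes the
before/after difference concrete:

    #include <stdio.h>
    #include <stdint.h>

    #define F_INVALID    (1u << 0)  /* hypothetical bit positions */
    #define F_MMIO       (1u << 1)
    #define F_WATCHPOINT (1u << 2)  /* a "slow" flag */
    #define F_FORCE_SLOW (1u << 3)

    #define FAST_FLAGS (F_INVALID | F_MMIO | F_FORCE_SLOW)
    #define SLOW_FLAGS (F_WATCHPOINT)

    static uint32_t set_compare(uint32_t page, uint32_t flags, int enable)
    {
        uint32_t address = enable ? page : F_INVALID;
        address |= flags & FAST_FLAGS;
        flags &= SLOW_FLAGS;
        if (flags) {
            address |= F_FORCE_SLOW;
        }
        return address;
    }

    int main(void)
    {
        /* Disabled: comparator still carries MMIO and FORCE_SLOW, but
         * F_INVALID guarantees the fast path never matches it. */
        printf("%#x\n", set_compare(0x1000, F_MMIO | F_WATCHPOINT, 0)); /* 0xb */
        printf("%#x\n", set_compare(0x1000, F_MMIO | F_WATCHPOINT, 1)); /* 0x100a */
        return 0;
    }

Under the old behavior the disabled case would have produced 0xffffffff
with no flags preserved; keeping the flags in the comparator is what the
executable-but-not-readable case in the subsequent patch builds on.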




* Re: [PATCH v2 25/54] accel/tcg: Return CPUTLBEntryFull not pointer in probe_access_full_mmu
  2024-11-14 16:01 ` [PATCH v2 25/54] accel/tcg: Return CPUTLBEntryFull not pointer in probe_access_full_mmu Richard Henderson
@ 2024-11-14 18:11   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:11 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> Return a copy of the structure, not a pointer.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   include/exec/exec-all.h              |  2 +-
>   accel/tcg/cputlb.c                   | 13 ++++++++-----
>   target/arm/ptw.c                     | 10 +++++-----
>   target/i386/tcg/sysemu/excp_helper.c |  8 ++++----
>   4 files changed, 18 insertions(+), 15 deletions(-)
> 
> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index 2e4c4cc4b4..df7d0b5ad0 100644
> --- a/include/exec/exec-all.h
> +++ b/include/exec/exec-all.h
> @@ -393,7 +393,7 @@ int probe_access_full(CPUArchState *env, vaddr addr, int size,
>    */
>   int probe_access_full_mmu(CPUArchState *env, vaddr addr, int size,
>                             MMUAccessType access_type, int mmu_idx,
> -                          void **phost, CPUTLBEntryFull **pfull);
> +                          void **phost, CPUTLBEntryFull *pfull);
>   
>   #endif /* !CONFIG_USER_ONLY */
>   #endif /* CONFIG_TCG */
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 585f4171cc..81135524eb 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -1439,25 +1439,28 @@ int probe_access_full(CPUArchState *env, vaddr addr, int size,
>   
>   int probe_access_full_mmu(CPUArchState *env, vaddr addr, int size,
>                             MMUAccessType access_type, int mmu_idx,
> -                          void **phost, CPUTLBEntryFull **pfull)
> +                          void **phost, CPUTLBEntryFull *pfull)
>   {
>       void *discard_phost;
> -    CPUTLBEntryFull *discard_tlb;
> +    CPUTLBEntryFull *full;
>   
>       /* privately handle users that don't need full results */
>       phost = phost ? phost : &discard_phost;
> -    pfull = pfull ? pfull : &discard_tlb;
>   
>       int flags = probe_access_internal(env_cpu(env), addr, size, access_type,
> -                                      mmu_idx, true, phost, pfull, 0, false);
> +                                      mmu_idx, true, phost, &full, 0, false);
>   
>       /* Handle clean RAM pages.  */
>       if (unlikely(flags & TLB_NOTDIRTY)) {
>           int dirtysize = size == 0 ? 1 : size;
> -        notdirty_write(env_cpu(env), addr, dirtysize, *pfull, 0);
> +        notdirty_write(env_cpu(env), addr, dirtysize, full, 0);
>           flags &= ~TLB_NOTDIRTY;
>       }
>   
> +    if (pfull) {
> +        *pfull = *full;
> +    }
> +
>       return flags;
>   }
>   
> diff --git a/target/arm/ptw.c b/target/arm/ptw.c
> index 9849949508..3ae5f524de 100644
> --- a/target/arm/ptw.c
> +++ b/target/arm/ptw.c
> @@ -592,7 +592,7 @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
>           ptw->out_space = s2.f.attrs.space;
>       } else {
>   #ifdef CONFIG_TCG
> -        CPUTLBEntryFull *full;
> +        CPUTLBEntryFull full;
>           int flags;
>   
>           env->tlb_fi = fi;
> @@ -604,10 +604,10 @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
>           if (unlikely(flags & TLB_INVALID_MASK)) {
>               goto fail;
>           }
> -        ptw->out_phys = full->phys_addr | (addr & ~TARGET_PAGE_MASK);
> -        ptw->out_rw = full->prot & PAGE_WRITE;
> -        pte_attrs = full->extra.arm.pte_attrs;
> -        ptw->out_space = full->attrs.space;
> +        ptw->out_phys = full.phys_addr | (addr & ~TARGET_PAGE_MASK);
> +        ptw->out_rw = full.prot & PAGE_WRITE;
> +        pte_attrs = full.extra.arm.pte_attrs;
> +        ptw->out_space = full.attrs.space;
>   #else
>           g_assert_not_reached();
>   #endif
> diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c
> index 02d3486421..168ff8e5f3 100644
> --- a/target/i386/tcg/sysemu/excp_helper.c
> +++ b/target/i386/tcg/sysemu/excp_helper.c
> @@ -436,7 +436,7 @@ do_check_protect_pse36:
>        * addresses) using the address with the A20 bit set.
>        */
>       if (in->ptw_idx == MMU_NESTED_IDX) {
> -        CPUTLBEntryFull *full;
> +        CPUTLBEntryFull full;
>           int flags, nested_page_size;
>   
>           flags = probe_access_full_mmu(env, paddr, 0, access_type,
> @@ -451,7 +451,7 @@ do_check_protect_pse36:
>           }
>   
>           /* Merge stage1 & stage2 protection bits. */
> -        prot &= full->prot;
> +        prot &= full.prot;
>   
>           /* Re-verify resulting protection. */
>           if ((prot & (1 << access_type)) == 0) {
> @@ -459,8 +459,8 @@ do_check_protect_pse36:
>           }
>   
>           /* Merge stage1 & stage2 addresses to final physical address. */
> -        nested_page_size = 1 << full->lg_page_size;
> -        paddr = (full->phys_addr & ~(nested_page_size - 1))
> +        nested_page_size = 1 << full.lg_page_size;
> +        paddr = (full.phys_addr & ~(nested_page_size - 1))
>                 | (paddr & (nested_page_size - 1));
>   
>           /*

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>




* Re: [PATCH v2 26/54] accel/tcg: Return CPUTLBEntryFull not pointer in probe_access_full
  2024-11-14 16:01 ` [PATCH v2 26/54] accel/tcg: Return CPUTLBEntryFull not pointer in probe_access_full Richard Henderson
@ 2024-11-14 18:12   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:12 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> Return a copy of the structure, not a pointer.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   include/exec/exec-all.h     |  6 +-----
>   accel/tcg/cputlb.c          |  8 +++++---
>   target/arm/tcg/helper-a64.c |  4 ++--
>   target/arm/tcg/mte_helper.c | 15 ++++++---------
>   target/arm/tcg/sve_helper.c |  6 +++---
>   5 files changed, 17 insertions(+), 22 deletions(-)
> 
> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index df7d0b5ad0..69bdb77584 100644
> --- a/include/exec/exec-all.h
> +++ b/include/exec/exec-all.h
> @@ -365,10 +365,6 @@ int probe_access_flags(CPUArchState *env, vaddr addr, int size,
>    * probe_access_full:
>    * Like probe_access_flags, except also return into @pfull.
>    *
> - * The CPUTLBEntryFull structure returned via @pfull is transient
> - * and must be consumed or copied immediately, before any further
> - * access or changes to TLB @mmu_idx.
> - *
>    * This function will not fault if @nonfault is set, but will
>    * return TLB_INVALID_MASK if the page is not mapped, or is not
>    * accessible with @access_type.
> @@ -379,7 +375,7 @@ int probe_access_flags(CPUArchState *env, vaddr addr, int size,
>   int probe_access_full(CPUArchState *env, vaddr addr, int size,
>                         MMUAccessType access_type, int mmu_idx,
>                         bool nonfault, void **phost,
> -                      CPUTLBEntryFull **pfull, uintptr_t retaddr);
> +                      CPUTLBEntryFull *pfull, uintptr_t retaddr);
>   
>   /**
>    * probe_access_full_mmu:
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 81135524eb..84e7e633e3 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -1420,20 +1420,22 @@ static int probe_access_internal(CPUState *cpu, vaddr addr,
>   
>   int probe_access_full(CPUArchState *env, vaddr addr, int size,
>                         MMUAccessType access_type, int mmu_idx,
> -                      bool nonfault, void **phost, CPUTLBEntryFull **pfull,
> +                      bool nonfault, void **phost, CPUTLBEntryFull *pfull,
>                         uintptr_t retaddr)
>   {
> +    CPUTLBEntryFull *full;
>       int flags = probe_access_internal(env_cpu(env), addr, size, access_type,
> -                                      mmu_idx, nonfault, phost, pfull, retaddr,
> +                                      mmu_idx, nonfault, phost, &full, retaddr,
>                                         true);
>   
>       /* Handle clean RAM pages.  */
>       if (unlikely(flags & TLB_NOTDIRTY)) {
>           int dirtysize = size == 0 ? 1 : size;
> -        notdirty_write(env_cpu(env), addr, dirtysize, *pfull, retaddr);
> +        notdirty_write(env_cpu(env), addr, dirtysize, full, retaddr);
>           flags &= ~TLB_NOTDIRTY;
>       }
>   
> +    *pfull = *full;
>       return flags;
>   }
>   
> diff --git a/target/arm/tcg/helper-a64.c b/target/arm/tcg/helper-a64.c
> index 8f42a28d07..783864d6db 100644
> --- a/target/arm/tcg/helper-a64.c
> +++ b/target/arm/tcg/helper-a64.c
> @@ -1883,14 +1883,14 @@ static bool is_guarded_page(CPUARMState *env, target_ulong addr, uintptr_t ra)
>   #ifdef CONFIG_USER_ONLY
>       return page_get_flags(addr) & PAGE_BTI;
>   #else
> -    CPUTLBEntryFull *full;
> +    CPUTLBEntryFull full;
>       void *host;
>       int mmu_idx = cpu_mmu_index(env_cpu(env), true);
>       int flags = probe_access_full(env, addr, 0, MMU_INST_FETCH, mmu_idx,
>                                     false, &host, &full, ra);
>   
>       assert(!(flags & TLB_INVALID_MASK));
> -    return full->extra.arm.guarded;
> +    return full.extra.arm.guarded;
>   #endif
>   }
>   
> diff --git a/target/arm/tcg/mte_helper.c b/target/arm/tcg/mte_helper.c
> index 9d2ba287ee..870b2875af 100644
> --- a/target/arm/tcg/mte_helper.c
> +++ b/target/arm/tcg/mte_helper.c
> @@ -83,8 +83,7 @@ uint8_t *allocation_tag_mem_probe(CPUARMState *env, int ptr_mmu_idx,
>                         TARGET_PAGE_BITS - LOG2_TAG_GRANULE - 1);
>       return tags + index;
>   #else
> -    CPUTLBEntryFull *full;
> -    MemTxAttrs attrs;
> +    CPUTLBEntryFull full;
>       int in_page, flags;
>       hwaddr ptr_paddr, tag_paddr, xlat;
>       MemoryRegion *mr;
> @@ -110,7 +109,7 @@ uint8_t *allocation_tag_mem_probe(CPUARMState *env, int ptr_mmu_idx,
>       assert(!(flags & TLB_INVALID_MASK));
>   
>       /* If the virtual page MemAttr != Tagged, access unchecked. */
> -    if (full->extra.arm.pte_attrs != 0xf0) {
> +    if (full.extra.arm.pte_attrs != 0xf0) {
>           return NULL;
>       }
>   
> @@ -129,9 +128,7 @@ uint8_t *allocation_tag_mem_probe(CPUARMState *env, int ptr_mmu_idx,
>        * Remember these values across the second lookup below,
>        * which may invalidate this pointer via tlb resize.
>        */
> -    ptr_paddr = full->phys_addr | (ptr & ~TARGET_PAGE_MASK);
> -    attrs = full->attrs;
> -    full = NULL;
> +    ptr_paddr = full.phys_addr | (ptr & ~TARGET_PAGE_MASK);
>   
>       /*
>        * The Normal memory access can extend to the next page.  E.g. a single
> @@ -150,17 +147,17 @@ uint8_t *allocation_tag_mem_probe(CPUARMState *env, int ptr_mmu_idx,
>       if (!probe && unlikely(flags & TLB_WATCHPOINT)) {
>           int wp = ptr_access == MMU_DATA_LOAD ? BP_MEM_READ : BP_MEM_WRITE;
>           assert(ra != 0);
> -        cpu_check_watchpoint(env_cpu(env), ptr, ptr_size, attrs, wp, ra);
> +        cpu_check_watchpoint(env_cpu(env), ptr, ptr_size, full.attrs, wp, ra);
>       }
>   
>       /* Convert to the physical address in tag space.  */
>       tag_paddr = ptr_paddr >> (LOG2_TAG_GRANULE + 1);
>   
>       /* Look up the address in tag space. */
> -    tag_asi = attrs.secure ? ARMASIdx_TagS : ARMASIdx_TagNS;
> +    tag_asi = full.attrs.secure ? ARMASIdx_TagS : ARMASIdx_TagNS;
>       tag_as = cpu_get_address_space(env_cpu(env), tag_asi);
>       mr = address_space_translate(tag_as, tag_paddr, &xlat, NULL,
> -                                 tag_access == MMU_DATA_STORE, attrs);
> +                                 tag_access == MMU_DATA_STORE, full.attrs);
>   
>       /*
>        * Note that @mr will never be NULL.  If there is nothing in the address
> diff --git a/target/arm/tcg/sve_helper.c b/target/arm/tcg/sve_helper.c
> index f1ee0e060f..dad0d5e518 100644
> --- a/target/arm/tcg/sve_helper.c
> +++ b/target/arm/tcg/sve_helper.c
> @@ -5357,7 +5357,7 @@ bool sve_probe_page(SVEHostPage *info, bool nofault, CPUARMState *env,
>       flags = probe_access_flags(env, addr, 0, access_type, mmu_idx, nofault,
>                                  &info->host, retaddr);
>   #else
> -    CPUTLBEntryFull *full;
> +    CPUTLBEntryFull full;
>       flags = probe_access_full(env, addr, 0, access_type, mmu_idx, nofault,
>                                 &info->host, &full, retaddr);
>   #endif
> @@ -5373,8 +5373,8 @@ bool sve_probe_page(SVEHostPage *info, bool nofault, CPUARMState *env,
>       /* Require both ANON and MTE; see allocation_tag_mem(). */
>       info->tagged = (flags & PAGE_ANON) && (flags & PAGE_MTE);
>   #else
> -    info->attrs = full->attrs;
> -    info->tagged = full->extra.arm.pte_attrs == 0xf0;
> +    info->attrs = full.attrs;
> +    info->tagged = full.extra.arm.pte_attrs == 0xf0;
>   #endif
>   
>       /* Ensure that info->host[] is relative to addr, not addr + mem_off. */

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
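
The doc comment deleted above carried the whole motivation: the returned
pointer was only valid until the next operation on that TLB. A minimal
sketch (hypothetical types, not QEMU code) of why returning the
structure by value removes the hazard:

    #include <stdlib.h>

    typedef struct Full { int prot; } Full;
    typedef struct Table { Full *entries; size_t n; } Table;

    /* Pointer into the table: dangles if the table is reallocated. */
    static Full *lookup_ref(Table *t, size_t i) { return &t->entries[i]; }

    /* Copy of the entry: stays valid no matter what the table does. */
    static Full lookup_copy(Table *t, size_t i) { return t->entries[i]; }

    static void resize(Table *t, size_t n)
    {
        t->entries = realloc(t->entries, n * sizeof(Full));
        t->n = n;
    }

    int main(void)
    {
        Table t = { calloc(8, sizeof(Full)), 8 };
        Full *ref = lookup_ref(&t, 3);
        Full copy = lookup_copy(&t, 3);
        resize(&t, 1024);   /* a tlb resize may move the allocation... */
        /* ...so 'ref' may dangle here, while 'copy' is still usable. */
        (void)ref; (void)copy;
        free(t.entries);
        return 0;
    }

The copy costs a few dozen bytes of stack; in exchange, the class of
dangling-pointer bugs the old comment could only warn about is gone.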




* Re: [PATCH v2 27/54] accel/tcg: Return CPUTLBEntryFull not pointer in probe_access_internal
  2024-11-14 16:01 ` [PATCH v2 27/54] accel/tcg: Return CPUTLBEntryFull not pointer in probe_access_internal Richard Henderson
@ 2024-11-14 18:13   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:13 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> Return a copy of the structure, not a pointer.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   accel/tcg/cputlb.c | 40 ++++++++++++++++++----------------------
>   1 file changed, 18 insertions(+), 22 deletions(-)
> 
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 84e7e633e3..41b2f76cc9 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -1364,7 +1364,7 @@ static void notdirty_write(CPUState *cpu, vaddr mem_vaddr, unsigned size,
>   static int probe_access_internal(CPUState *cpu, vaddr addr,
>                                    int fault_size, MMUAccessType access_type,
>                                    int mmu_idx, bool nonfault,
> -                                 void **phost, CPUTLBEntryFull **pfull,
> +                                 void **phost, CPUTLBEntryFull *pfull,
>                                    uintptr_t retaddr, bool check_mem_cbs)
>   {
>       uintptr_t index = tlb_index(cpu, mmu_idx, addr);
> @@ -1379,7 +1379,7 @@ static int probe_access_internal(CPUState *cpu, vaddr addr,
>                                   0, fault_size, nonfault, retaddr)) {
>                   /* Non-faulting page table read failed.  */
>                   *phost = NULL;
> -                *pfull = NULL;
> +                memset(pfull, 0, sizeof(*pfull));
>                   return TLB_INVALID_MASK;
>               }
>   
> @@ -1398,8 +1398,9 @@ static int probe_access_internal(CPUState *cpu, vaddr addr,
>       }
>       flags &= tlb_addr;
>   
> -    *pfull = full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
> +    full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
>       flags |= full->slow_flags[access_type];
> +    *pfull = *full;
>   
>       /*
>        * Fold all "mmio-like" bits, and required plugin callbacks, to TLB_MMIO.
> @@ -1423,19 +1424,17 @@ int probe_access_full(CPUArchState *env, vaddr addr, int size,
>                         bool nonfault, void **phost, CPUTLBEntryFull *pfull,
>                         uintptr_t retaddr)
>   {
> -    CPUTLBEntryFull *full;
>       int flags = probe_access_internal(env_cpu(env), addr, size, access_type,
> -                                      mmu_idx, nonfault, phost, &full, retaddr,
> +                                      mmu_idx, nonfault, phost, pfull, retaddr,
>                                         true);
>   
>       /* Handle clean RAM pages.  */
>       if (unlikely(flags & TLB_NOTDIRTY)) {
>           int dirtysize = size == 0 ? 1 : size;
> -        notdirty_write(env_cpu(env), addr, dirtysize, full, retaddr);
> +        notdirty_write(env_cpu(env), addr, dirtysize, pfull, retaddr);
>           flags &= ~TLB_NOTDIRTY;
>       }
>   
> -    *pfull = *full;
>       return flags;
>   }
>   
> @@ -1444,25 +1443,22 @@ int probe_access_full_mmu(CPUArchState *env, vaddr addr, int size,
>                             void **phost, CPUTLBEntryFull *pfull)
>   {
>       void *discard_phost;
> -    CPUTLBEntryFull *full;
> +    CPUTLBEntryFull discard_full;
>   
>       /* privately handle users that don't need full results */
>       phost = phost ? phost : &discard_phost;
> +    pfull = pfull ? pfull : &discard_full;
>   
>       int flags = probe_access_internal(env_cpu(env), addr, size, access_type,
> -                                      mmu_idx, true, phost, &full, 0, false);
> +                                      mmu_idx, true, phost, pfull, 0, false);
>   
>       /* Handle clean RAM pages.  */
>       if (unlikely(flags & TLB_NOTDIRTY)) {
>           int dirtysize = size == 0 ? 1 : size;
> -        notdirty_write(env_cpu(env), addr, dirtysize, full, 0);
> +        notdirty_write(env_cpu(env), addr, dirtysize, pfull, 0);
>           flags &= ~TLB_NOTDIRTY;
>       }
>   
> -    if (pfull) {
> -        *pfull = *full;
> -    }
> -
>       return flags;
>   }
>   
> @@ -1470,7 +1466,7 @@ int probe_access_flags(CPUArchState *env, vaddr addr, int size,
>                          MMUAccessType access_type, int mmu_idx,
>                          bool nonfault, void **phost, uintptr_t retaddr)
>   {
> -    CPUTLBEntryFull *full;
> +    CPUTLBEntryFull full;
>       int flags;
>   
>       g_assert(-(addr | TARGET_PAGE_MASK) >= size);
> @@ -1482,7 +1478,7 @@ int probe_access_flags(CPUArchState *env, vaddr addr, int size,
>       /* Handle clean RAM pages. */
>       if (unlikely(flags & TLB_NOTDIRTY)) {
>           int dirtysize = size == 0 ? 1 : size;
> -        notdirty_write(env_cpu(env), addr, dirtysize, full, retaddr);
> +        notdirty_write(env_cpu(env), addr, dirtysize, &full, retaddr);
>           flags &= ~TLB_NOTDIRTY;
>       }
>   
> @@ -1492,7 +1488,7 @@ int probe_access_flags(CPUArchState *env, vaddr addr, int size,
>   void *probe_access(CPUArchState *env, vaddr addr, int size,
>                      MMUAccessType access_type, int mmu_idx, uintptr_t retaddr)
>   {
> -    CPUTLBEntryFull *full;
> +    CPUTLBEntryFull full;
>       void *host;
>       int flags;
>   
> @@ -1513,12 +1509,12 @@ void *probe_access(CPUArchState *env, vaddr addr, int size,
>               int wp_access = (access_type == MMU_DATA_STORE
>                                ? BP_MEM_WRITE : BP_MEM_READ);
>               cpu_check_watchpoint(env_cpu(env), addr, size,
> -                                 full->attrs, wp_access, retaddr);
> +                                 full.attrs, wp_access, retaddr);
>           }
>   
>           /* Handle clean RAM pages.  */
>           if (flags & TLB_NOTDIRTY) {
> -            notdirty_write(env_cpu(env), addr, size, full, retaddr);
> +            notdirty_write(env_cpu(env), addr, size, &full, retaddr);
>           }
>       }
>   
> @@ -1528,7 +1524,7 @@ void *probe_access(CPUArchState *env, vaddr addr, int size,
>   void *tlb_vaddr_to_host(CPUArchState *env, abi_ptr addr,
>                           MMUAccessType access_type, int mmu_idx)
>   {
> -    CPUTLBEntryFull *full;
> +    CPUTLBEntryFull full;
>       void *host;
>       int flags;
>   
> @@ -1552,7 +1548,7 @@ void *tlb_vaddr_to_host(CPUArchState *env, abi_ptr addr,
>   tb_page_addr_t get_page_addr_code_hostp(CPUArchState *env, vaddr addr,
>                                           void **hostp)
>   {
> -    CPUTLBEntryFull *full;
> +    CPUTLBEntryFull full;
>       void *p;
>   
>       (void)probe_access_internal(env_cpu(env), addr, 1, MMU_INST_FETCH,
> @@ -1562,7 +1558,7 @@ tb_page_addr_t get_page_addr_code_hostp(CPUArchState *env, vaddr addr,
>           return -1;
>       }
>   
> -    if (full->lg_page_size < TARGET_PAGE_BITS) {
> +    if (full.lg_page_size < TARGET_PAGE_BITS) {
>           return -1;
>       }
>   
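
To illustrate why the copy is nicer for callers -- a minimal sketch,
variable names mine and the argument list taken from the hunk above:

    CPUTLBEntryFull full;
    void *host;
    int flags = probe_access_full(env, addr, size, access_type, mmu_idx,
                                  true, &host, &full, retaddr);
    /*
     * 'full' is now a snapshot owned by the caller: it stays valid even
     * if a later lookup evicts or rewrites the TLB slot, which the old
     * pointer into fulltlb[] did not guarantee.
     */

so the lifetime question simply disappears.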

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>



^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 28/54] accel/tcg: Introduce tlb_lookup
  2024-11-14 16:01 ` [PATCH v2 28/54] accel/tcg: Introduce tlb_lookup Richard Henderson
@ 2024-11-14 18:29   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:29 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> Unify 3 instances of tlb lookup, through tlb_hit, tlbtree_hit,
> and tlb_fill_align.  Use structures to avoid too many arguments.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---

This is a great simplification.

> +static void tlb_lookup_nofail(CPUState *cpu, TLBLookupOutput *o,
> +                              const TLBLookupInput *i)
> +{
> +    bool ok = tlb_lookup(cpu, o, i);
> +    tcg_debug_assert(ok);

From the function name, would it be safe to use a normal assert
instead? In case a weird bug creeps in later, it would be exposed more
easily.
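
For context, my understanding of the difference, assuming the usual
definitions in tcg/debug-assert.h:

    #ifdef CONFIG_DEBUG_TCG
    # define tcg_debug_assert(X) do { assert(X); } while (0)
    #else
    /* No runtime check in normal builds; X is assumed to hold. */
    # define tcg_debug_assert(X) \
        do { if (!(X)) { __builtin_unreachable(); } } while (0)
    #endif

A plain assert()/g_assert() would still fire in production builds, at
the cost of one extra branch per lookup.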

For the rest,
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>



^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 29/54] accel/tcg: Partially unify MMULookupPageData and TLBLookupOutput
  2024-11-14 16:01 ` [PATCH v2 29/54] accel/tcg: Partially unify MMULookupPageData and TLBLookupOutput Richard Henderson
@ 2024-11-14 18:29   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:29 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   accel/tcg/cputlb.c | 151 ++++++++++++++++++++++-----------------------
>   1 file changed, 74 insertions(+), 77 deletions(-)
> 
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index a33bebf55a..8f459be5a8 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -1684,10 +1684,7 @@ bool tlb_plugin_lookup(CPUState *cpu, vaddr addr, int mmu_idx,
>    */
>   
>   typedef struct MMULookupPageData {
> -    CPUTLBEntryFull *full;
> -    void *haddr;
>       vaddr addr;
> -    int flags;
>       int size;
>       TLBLookupOutput o;
>   } MMULookupPageData;
> @@ -1724,10 +1721,6 @@ static void mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
>       };
>   
>       tlb_lookup_nofail(cpu, &data->o, &i);
> -
> -    data->full = &data->o.full;
> -    data->flags = data->o.flags;
> -    data->haddr = data->o.haddr;
>   }
>   
>   /**
> @@ -1743,24 +1736,22 @@ static void mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
>   static void mmu_watch_or_dirty(CPUState *cpu, MMULookupPageData *data,
>                                  MMUAccessType access_type, uintptr_t ra)
>   {
> -    CPUTLBEntryFull *full = data->full;
> -    vaddr addr = data->addr;
> -    int flags = data->flags;
> -    int size = data->size;
> +    int flags = data->o.flags;
>   
>       /* On watchpoint hit, this will longjmp out.  */
>       if (flags & TLB_WATCHPOINT) {
>           int wp = access_type == MMU_DATA_STORE ? BP_MEM_WRITE : BP_MEM_READ;
> -        cpu_check_watchpoint(cpu, addr, size, full->attrs, wp, ra);
> +        cpu_check_watchpoint(cpu, data->addr, data->size,
> +                             data->o.full.attrs, wp, ra);
>           flags &= ~TLB_WATCHPOINT;
>       }
>   
>       /* Note that notdirty is only set for writes. */
>       if (flags & TLB_NOTDIRTY) {
> -        notdirty_write(cpu, addr, size, full, ra);
> +        notdirty_write(cpu, data->addr, data->size, &data->o.full, ra);
>           flags &= ~TLB_NOTDIRTY;
>       }
> -    data->flags = flags;
> +    data->o.flags = flags;
>   }
>   
>   /**
> @@ -1795,7 +1786,7 @@ static bool mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
>       if (likely(!crosspage)) {
>           mmu_lookup1(cpu, &l->page[0], l->memop, l->mmu_idx, type, ra);
>   
> -        flags = l->page[0].flags;
> +        flags = l->page[0].o.flags;
>           if (unlikely(flags & (TLB_WATCHPOINT | TLB_NOTDIRTY))) {
>               mmu_watch_or_dirty(cpu, &l->page[0], type, ra);
>           }
> @@ -1812,7 +1803,7 @@ static bool mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
>           mmu_lookup1(cpu, &l->page[0], l->memop, l->mmu_idx, type, ra);
>           mmu_lookup1(cpu, &l->page[1], 0, l->mmu_idx, type, ra);
>   
> -        flags = l->page[0].flags | l->page[1].flags;
> +        flags = l->page[0].o.flags | l->page[1].o.flags;
>           if (unlikely(flags & (TLB_WATCHPOINT | TLB_NOTDIRTY))) {
>               mmu_watch_or_dirty(cpu, &l->page[0], type, ra);
>               mmu_watch_or_dirty(cpu, &l->page[1], type, ra);
> @@ -2029,7 +2020,7 @@ static Int128 do_ld16_mmio_beN(CPUState *cpu, CPUTLBEntryFull *full,
>    */
>   static uint64_t do_ld_bytes_beN(MMULookupPageData *p, uint64_t ret_be)
>   {
> -    uint8_t *haddr = p->haddr;
> +    uint8_t *haddr = p->o.haddr;
>       int i, size = p->size;
>   
>       for (i = 0; i < size; i++) {
> @@ -2047,7 +2038,7 @@ static uint64_t do_ld_bytes_beN(MMULookupPageData *p, uint64_t ret_be)
>    */
>   static uint64_t do_ld_parts_beN(MMULookupPageData *p, uint64_t ret_be)
>   {
> -    void *haddr = p->haddr;
> +    void *haddr = p->o.haddr;
>       int size = p->size;
>   
>       do {
> @@ -2097,7 +2088,7 @@ static uint64_t do_ld_parts_beN(MMULookupPageData *p, uint64_t ret_be)
>   static uint64_t do_ld_whole_be4(MMULookupPageData *p, uint64_t ret_be)
>   {
>       int o = p->addr & 3;
> -    uint32_t x = load_atomic4(p->haddr - o);
> +    uint32_t x = load_atomic4(p->o.haddr - o);
>   
>       x = cpu_to_be32(x);
>       x <<= o * 8;
> @@ -2117,7 +2108,7 @@ static uint64_t do_ld_whole_be8(CPUState *cpu, uintptr_t ra,
>                                   MMULookupPageData *p, uint64_t ret_be)
>   {
>       int o = p->addr & 7;
> -    uint64_t x = load_atomic8_or_exit(cpu, ra, p->haddr - o);
> +    uint64_t x = load_atomic8_or_exit(cpu, ra, p->o.haddr - o);
>   
>       x = cpu_to_be64(x);
>       x <<= o * 8;
> @@ -2137,7 +2128,7 @@ static Int128 do_ld_whole_be16(CPUState *cpu, uintptr_t ra,
>                                  MMULookupPageData *p, uint64_t ret_be)
>   {
>       int o = p->addr & 15;
> -    Int128 x, y = load_atomic16_or_exit(cpu, ra, p->haddr - o);
> +    Int128 x, y = load_atomic16_or_exit(cpu, ra, p->o.haddr - o);
>       int size = p->size;
>   
>       if (!HOST_BIG_ENDIAN) {
> @@ -2160,8 +2151,8 @@ static uint64_t do_ld_beN(CPUState *cpu, MMULookupPageData *p,
>       MemOp atom;
>       unsigned tmp, half_size;
>   
> -    if (unlikely(p->flags & TLB_MMIO)) {
> -        return do_ld_mmio_beN(cpu, p->full, ret_be, p->addr, p->size,
> +    if (unlikely(p->o.flags & TLB_MMIO)) {
> +        return do_ld_mmio_beN(cpu, &p->o.full, ret_be, p->addr, p->size,
>                                 mmu_idx, type, ra);
>       }
>   
> @@ -2210,8 +2201,9 @@ static Int128 do_ld16_beN(CPUState *cpu, MMULookupPageData *p,
>       uint64_t b;
>       MemOp atom;
>   
> -    if (unlikely(p->flags & TLB_MMIO)) {
> -        return do_ld16_mmio_beN(cpu, p->full, a, p->addr, size, mmu_idx, ra);
> +    if (unlikely(p->o.flags & TLB_MMIO)) {
> +        return do_ld16_mmio_beN(cpu, &p->o.full, a, p->addr,
> +                                size, mmu_idx, ra);
>       }
>   
>       /*
> @@ -2223,7 +2215,7 @@ static Int128 do_ld16_beN(CPUState *cpu, MMULookupPageData *p,
>       case MO_ATOM_SUBALIGN:
>           p->size = size - 8;
>           a = do_ld_parts_beN(p, a);
> -        p->haddr += size - 8;
> +        p->o.haddr += size - 8;
>           p->size = 8;
>           b = do_ld_parts_beN(p, 0);
>           break;
> @@ -2242,7 +2234,7 @@ static Int128 do_ld16_beN(CPUState *cpu, MMULookupPageData *p,
>       case MO_ATOM_NONE:
>           p->size = size - 8;
>           a = do_ld_bytes_beN(p, a);
> -        b = ldq_be_p(p->haddr + size - 8);
> +        b = ldq_be_p(p->o.haddr + size - 8);
>           break;
>   
>       default:
> @@ -2255,10 +2247,11 @@ static Int128 do_ld16_beN(CPUState *cpu, MMULookupPageData *p,
>   static uint8_t do_ld_1(CPUState *cpu, MMULookupPageData *p, int mmu_idx,
>                          MMUAccessType type, uintptr_t ra)
>   {
> -    if (unlikely(p->flags & TLB_MMIO)) {
> -        return do_ld_mmio_beN(cpu, p->full, 0, p->addr, 1, mmu_idx, type, ra);
> +    if (unlikely(p->o.flags & TLB_MMIO)) {
> +        return do_ld_mmio_beN(cpu, &p->o.full, 0, p->addr, 1,
> +                              mmu_idx, type, ra);
>       } else {
> -        return *(uint8_t *)p->haddr;
> +        return *(uint8_t *)p->o.haddr;
>       }
>   }
>   
> @@ -2267,14 +2260,15 @@ static uint16_t do_ld_2(CPUState *cpu, MMULookupPageData *p, int mmu_idx,
>   {
>       uint16_t ret;
>   
> -    if (unlikely(p->flags & TLB_MMIO)) {
> -        ret = do_ld_mmio_beN(cpu, p->full, 0, p->addr, 2, mmu_idx, type, ra);
> +    if (unlikely(p->o.flags & TLB_MMIO)) {
> +        ret = do_ld_mmio_beN(cpu, &p->o.full, 0, p->addr, 2,
> +                             mmu_idx, type, ra);
>           if ((memop & MO_BSWAP) == MO_LE) {
>               ret = bswap16(ret);
>           }
>       } else {
>           /* Perform the load host endian, then swap if necessary. */
> -        ret = load_atom_2(cpu, ra, p->haddr, memop);
> +        ret = load_atom_2(cpu, ra, p->o.haddr, memop);
>           if (memop & MO_BSWAP) {
>               ret = bswap16(ret);
>           }
> @@ -2287,14 +2281,15 @@ static uint32_t do_ld_4(CPUState *cpu, MMULookupPageData *p, int mmu_idx,
>   {
>       uint32_t ret;
>   
> -    if (unlikely(p->flags & TLB_MMIO)) {
> -        ret = do_ld_mmio_beN(cpu, p->full, 0, p->addr, 4, mmu_idx, type, ra);
> +    if (unlikely(p->o.flags & TLB_MMIO)) {
> +        ret = do_ld_mmio_beN(cpu, &p->o.full, 0, p->addr, 4,
> +                             mmu_idx, type, ra);
>           if ((memop & MO_BSWAP) == MO_LE) {
>               ret = bswap32(ret);
>           }
>       } else {
>           /* Perform the load host endian. */
> -        ret = load_atom_4(cpu, ra, p->haddr, memop);
> +        ret = load_atom_4(cpu, ra, p->o.haddr, memop);
>           if (memop & MO_BSWAP) {
>               ret = bswap32(ret);
>           }
> @@ -2307,14 +2302,15 @@ static uint64_t do_ld_8(CPUState *cpu, MMULookupPageData *p, int mmu_idx,
>   {
>       uint64_t ret;
>   
> -    if (unlikely(p->flags & TLB_MMIO)) {
> -        ret = do_ld_mmio_beN(cpu, p->full, 0, p->addr, 8, mmu_idx, type, ra);
> +    if (unlikely(p->o.flags & TLB_MMIO)) {
> +        ret = do_ld_mmio_beN(cpu, &p->o.full, 0, p->addr, 8,
> +                             mmu_idx, type, ra);
>           if ((memop & MO_BSWAP) == MO_LE) {
>               ret = bswap64(ret);
>           }
>       } else {
>           /* Perform the load host endian. */
> -        ret = load_atom_8(cpu, ra, p->haddr, memop);
> +        ret = load_atom_8(cpu, ra, p->o.haddr, memop);
>           if (memop & MO_BSWAP) {
>               ret = bswap64(ret);
>           }
> @@ -2414,15 +2410,15 @@ static Int128 do_ld16_mmu(CPUState *cpu, vaddr addr,
>       cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
>       crosspage = mmu_lookup(cpu, addr, oi, ra, MMU_DATA_LOAD, &l);
>       if (likely(!crosspage)) {
> -        if (unlikely(l.page[0].flags & TLB_MMIO)) {
> -            ret = do_ld16_mmio_beN(cpu, l.page[0].full, 0, addr, 16,
> +        if (unlikely(l.page[0].o.flags & TLB_MMIO)) {
> +            ret = do_ld16_mmio_beN(cpu, &l.page[0].o.full, 0, addr, 16,
>                                      l.mmu_idx, ra);
>               if ((l.memop & MO_BSWAP) == MO_LE) {
>                   ret = bswap128(ret);
>               }
>           } else {
>               /* Perform the load host endian. */
> -            ret = load_atom_16(cpu, ra, l.page[0].haddr, l.memop);
> +            ret = load_atom_16(cpu, ra, l.page[0].o.haddr, l.memop);
>               if (l.memop & MO_BSWAP) {
>                   ret = bswap128(ret);
>               }
> @@ -2568,10 +2564,10 @@ static uint64_t do_st_leN(CPUState *cpu, MMULookupPageData *p,
>       MemOp atom;
>       unsigned tmp, half_size;
>   
> -    if (unlikely(p->flags & TLB_MMIO)) {
> -        return do_st_mmio_leN(cpu, p->full, val_le, p->addr,
> +    if (unlikely(p->o.flags & TLB_MMIO)) {
> +        return do_st_mmio_leN(cpu, &p->o.full, val_le, p->addr,
>                                 p->size, mmu_idx, ra);
> -    } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) {
> +    } else if (unlikely(p->o.flags & TLB_DISCARD_WRITE)) {
>           return val_le >> (p->size * 8);
>       }
>   
> @@ -2582,7 +2578,7 @@ static uint64_t do_st_leN(CPUState *cpu, MMULookupPageData *p,
>       atom = mop & MO_ATOM_MASK;
>       switch (atom) {
>       case MO_ATOM_SUBALIGN:
> -        return store_parts_leN(p->haddr, p->size, val_le);
> +        return store_parts_leN(p->o.haddr, p->size, val_le);
>   
>       case MO_ATOM_IFALIGN_PAIR:
>       case MO_ATOM_WITHIN16_PAIR:
> @@ -2593,9 +2589,9 @@ static uint64_t do_st_leN(CPUState *cpu, MMULookupPageData *p,
>               ? p->size == half_size
>               : p->size >= half_size) {
>               if (!HAVE_al8_fast && p->size <= 4) {
> -                return store_whole_le4(p->haddr, p->size, val_le);
> +                return store_whole_le4(p->o.haddr, p->size, val_le);
>               } else if (HAVE_al8) {
> -                return store_whole_le8(p->haddr, p->size, val_le);
> +                return store_whole_le8(p->o.haddr, p->size, val_le);
>               } else {
>                   cpu_loop_exit_atomic(cpu, ra);
>               }
> @@ -2605,7 +2601,7 @@ static uint64_t do_st_leN(CPUState *cpu, MMULookupPageData *p,
>       case MO_ATOM_IFALIGN:
>       case MO_ATOM_WITHIN16:
>       case MO_ATOM_NONE:
> -        return store_bytes_leN(p->haddr, p->size, val_le);
> +        return store_bytes_leN(p->o.haddr, p->size, val_le);
>   
>       default:
>           g_assert_not_reached();
> @@ -2622,10 +2618,10 @@ static uint64_t do_st16_leN(CPUState *cpu, MMULookupPageData *p,
>       int size = p->size;
>       MemOp atom;
>   
> -    if (unlikely(p->flags & TLB_MMIO)) {
> -        return do_st16_mmio_leN(cpu, p->full, val_le, p->addr,
> +    if (unlikely(p->o.flags & TLB_MMIO)) {
> +        return do_st16_mmio_leN(cpu, &p->o.full, val_le, p->addr,
>                                   size, mmu_idx, ra);
> -    } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) {
> +    } else if (unlikely(p->o.flags & TLB_DISCARD_WRITE)) {
>           return int128_gethi(val_le) >> ((size - 8) * 8);
>       }
>   
> @@ -2636,8 +2632,8 @@ static uint64_t do_st16_leN(CPUState *cpu, MMULookupPageData *p,
>       atom = mop & MO_ATOM_MASK;
>       switch (atom) {
>       case MO_ATOM_SUBALIGN:
> -        store_parts_leN(p->haddr, 8, int128_getlo(val_le));
> -        return store_parts_leN(p->haddr + 8, p->size - 8,
> +        store_parts_leN(p->o.haddr, 8, int128_getlo(val_le));
> +        return store_parts_leN(p->o.haddr + 8, p->size - 8,
>                                  int128_gethi(val_le));
>   
>       case MO_ATOM_WITHIN16_PAIR:
> @@ -2645,7 +2641,7 @@ static uint64_t do_st16_leN(CPUState *cpu, MMULookupPageData *p,
>           if (!HAVE_CMPXCHG128) {
>               cpu_loop_exit_atomic(cpu, ra);
>           }
> -        return store_whole_le16(p->haddr, p->size, val_le);
> +        return store_whole_le16(p->o.haddr, p->size, val_le);
>   
>       case MO_ATOM_IFALIGN_PAIR:
>           /*
> @@ -2655,8 +2651,8 @@ static uint64_t do_st16_leN(CPUState *cpu, MMULookupPageData *p,
>       case MO_ATOM_IFALIGN:
>       case MO_ATOM_WITHIN16:
>       case MO_ATOM_NONE:
> -        stq_le_p(p->haddr, int128_getlo(val_le));
> -        return store_bytes_leN(p->haddr + 8, p->size - 8,
> +        stq_le_p(p->o.haddr, int128_getlo(val_le));
> +        return store_bytes_leN(p->o.haddr + 8, p->size - 8,
>                                  int128_gethi(val_le));
>   
>       default:
> @@ -2667,69 +2663,69 @@ static uint64_t do_st16_leN(CPUState *cpu, MMULookupPageData *p,
>   static void do_st_1(CPUState *cpu, MMULookupPageData *p, uint8_t val,
>                       int mmu_idx, uintptr_t ra)
>   {
> -    if (unlikely(p->flags & TLB_MMIO)) {
> -        do_st_mmio_leN(cpu, p->full, val, p->addr, 1, mmu_idx, ra);
> -    } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) {
> +    if (unlikely(p->o.flags & TLB_MMIO)) {
> +        do_st_mmio_leN(cpu, &p->o.full, val, p->addr, 1, mmu_idx, ra);
> +    } else if (unlikely(p->o.flags & TLB_DISCARD_WRITE)) {
>           /* nothing */
>       } else {
> -        *(uint8_t *)p->haddr = val;
> +        *(uint8_t *)p->o.haddr = val;
>       }
>   }
>   
>   static void do_st_2(CPUState *cpu, MMULookupPageData *p, uint16_t val,
>                       int mmu_idx, MemOp memop, uintptr_t ra)
>   {
> -    if (unlikely(p->flags & TLB_MMIO)) {
> +    if (unlikely(p->o.flags & TLB_MMIO)) {
>           if ((memop & MO_BSWAP) != MO_LE) {
>               val = bswap16(val);
>           }
> -        do_st_mmio_leN(cpu, p->full, val, p->addr, 2, mmu_idx, ra);
> -    } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) {
> +        do_st_mmio_leN(cpu, &p->o.full, val, p->addr, 2, mmu_idx, ra);
> +    } else if (unlikely(p->o.flags & TLB_DISCARD_WRITE)) {
>           /* nothing */
>       } else {
>           /* Swap to host endian if necessary, then store. */
>           if (memop & MO_BSWAP) {
>               val = bswap16(val);
>           }
> -        store_atom_2(cpu, ra, p->haddr, memop, val);
> +        store_atom_2(cpu, ra, p->o.haddr, memop, val);
>       }
>   }
>   
>   static void do_st_4(CPUState *cpu, MMULookupPageData *p, uint32_t val,
>                       int mmu_idx, MemOp memop, uintptr_t ra)
>   {
> -    if (unlikely(p->flags & TLB_MMIO)) {
> +    if (unlikely(p->o.flags & TLB_MMIO)) {
>           if ((memop & MO_BSWAP) != MO_LE) {
>               val = bswap32(val);
>           }
> -        do_st_mmio_leN(cpu, p->full, val, p->addr, 4, mmu_idx, ra);
> -    } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) {
> +        do_st_mmio_leN(cpu, &p->o.full, val, p->addr, 4, mmu_idx, ra);
> +    } else if (unlikely(p->o.flags & TLB_DISCARD_WRITE)) {
>           /* nothing */
>       } else {
>           /* Swap to host endian if necessary, then store. */
>           if (memop & MO_BSWAP) {
>               val = bswap32(val);
>           }
> -        store_atom_4(cpu, ra, p->haddr, memop, val);
> +        store_atom_4(cpu, ra, p->o.haddr, memop, val);
>       }
>   }
>   
>   static void do_st_8(CPUState *cpu, MMULookupPageData *p, uint64_t val,
>                       int mmu_idx, MemOp memop, uintptr_t ra)
>   {
> -    if (unlikely(p->flags & TLB_MMIO)) {
> +    if (unlikely(p->o.flags & TLB_MMIO)) {
>           if ((memop & MO_BSWAP) != MO_LE) {
>               val = bswap64(val);
>           }
> -        do_st_mmio_leN(cpu, p->full, val, p->addr, 8, mmu_idx, ra);
> -    } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) {
> +        do_st_mmio_leN(cpu, &p->o.full, val, p->addr, 8, mmu_idx, ra);
> +    } else if (unlikely(p->o.flags & TLB_DISCARD_WRITE)) {
>           /* nothing */
>       } else {
>           /* Swap to host endian if necessary, then store. */
>           if (memop & MO_BSWAP) {
>               val = bswap64(val);
>           }
> -        store_atom_8(cpu, ra, p->haddr, memop, val);
> +        store_atom_8(cpu, ra, p->o.haddr, memop, val);
>       }
>   }
>   
> @@ -2822,19 +2818,20 @@ static void do_st16_mmu(CPUState *cpu, vaddr addr, Int128 val,
>       cpu_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST);
>       crosspage = mmu_lookup(cpu, addr, oi, ra, MMU_DATA_STORE, &l);
>       if (likely(!crosspage)) {
> -        if (unlikely(l.page[0].flags & TLB_MMIO)) {
> +        if (unlikely(l.page[0].o.flags & TLB_MMIO)) {
>               if ((l.memop & MO_BSWAP) != MO_LE) {
>                   val = bswap128(val);
>               }
> -            do_st16_mmio_leN(cpu, l.page[0].full, val, addr, 16, l.mmu_idx, ra);
> -        } else if (unlikely(l.page[0].flags & TLB_DISCARD_WRITE)) {
> +            do_st16_mmio_leN(cpu, &l.page[0].o.full, val, addr,
> +                             16, l.mmu_idx, ra);
> +        } else if (unlikely(l.page[0].o.flags & TLB_DISCARD_WRITE)) {
>               /* nothing */
>           } else {
>               /* Swap to host endian if necessary, then store. */
>               if (l.memop & MO_BSWAP) {
>                   val = bswap128(val);
>               }
> -            store_atom_16(cpu, ra, l.page[0].haddr, l.memop, val);
> +            store_atom_16(cpu, ra, l.page[0].o.haddr, l.memop, val);
>           }
>           return;
>       }
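
For future readers: after this patch the per-page struct is just

    typedef struct MMULookupPageData {
        vaddr addr;
        int size;
        TLBLookupOutput o;   /* full, flags and haddr now live in here */
    } MMULookupPageData;

so every former data->flags / data->haddr / data->full access becomes
data->o.flags / data->o.haddr / &data->o.full, as in the hunks above.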

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>



^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 30/54] accel/tcg: Merge mmu_lookup1 into mmu_lookup
  2024-11-14 16:01 ` [PATCH v2 30/54] accel/tcg: Merge mmu_lookup1 into mmu_lookup Richard Henderson
@ 2024-11-14 18:31   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:31 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> Reuse most of TLBLookupInput between calls to tlb_lookup.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   accel/tcg/cputlb.c | 65 ++++++++++++++++++----------------------------
>   1 file changed, 25 insertions(+), 40 deletions(-)
> 
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 8f459be5a8..981098a6f2 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -1695,34 +1695,6 @@ typedef struct MMULookupLocals {
>       int mmu_idx;
>   } MMULookupLocals;
>   
> -/**
> - * mmu_lookup1: translate one page
> - * @cpu: generic cpu state
> - * @data: lookup parameters
> - * @memop: memory operation for the access, or 0
> - * @mmu_idx: virtual address context
> - * @access_type: load/store/code
> - * @ra: return address into tcg generated code, or 0
> - *
> - * Resolve the translation for the one page at @data.addr, filling in
> - * the rest of @data with the results.  If the translation fails,
> - * tlb_fill_align will longjmp out.
> - */
> -static void mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
> -                        int mmu_idx, MMUAccessType access_type, uintptr_t ra)
> -{
> -    TLBLookupInput i = {
> -        .addr = data->addr,
> -        .ra = ra,
> -        .access_type = access_type,
> -        .memop_probe = memop,
> -        .size = data->size,
> -        .mmu_idx = mmu_idx,
> -    };
> -
> -    tlb_lookup_nofail(cpu, &data->o, &i);
> -}
> -
>   /**
>    * mmu_watch_or_dirty
>    * @cpu: generic cpu state
> @@ -1769,26 +1741,36 @@ static void mmu_watch_or_dirty(CPUState *cpu, MMULookupPageData *data,
>   static bool mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
>                          uintptr_t ra, MMUAccessType type, MMULookupLocals *l)
>   {
> +    MemOp memop = get_memop(oi);
> +    int mmu_idx = get_mmuidx(oi);
> +    TLBLookupInput i = {
> +        .addr = addr,
> +        .ra = ra,
> +        .access_type = type,
> +        .memop_probe = memop,
> +        .size = memop_size(memop),
> +        .mmu_idx = mmu_idx,
> +    };
>       bool crosspage;
>       int flags;
>   
> -    l->memop = get_memop(oi);
> -    l->mmu_idx = get_mmuidx(oi);
> +    l->memop = memop;
> +    l->mmu_idx = mmu_idx;
>   
> -    tcg_debug_assert(l->mmu_idx < NB_MMU_MODES);
> +    tcg_debug_assert(mmu_idx < NB_MMU_MODES);
>   
>       l->page[0].addr = addr;
> -    l->page[0].size = memop_size(l->memop);
> -    l->page[1].addr = (addr + l->page[0].size - 1) & TARGET_PAGE_MASK;
> +    l->page[0].size = i.size;
> +    l->page[1].addr = (addr + i.size - 1) & TARGET_PAGE_MASK;
>       l->page[1].size = 0;
>       crosspage = (addr ^ l->page[1].addr) & TARGET_PAGE_MASK;
>   
>       if (likely(!crosspage)) {
> -        mmu_lookup1(cpu, &l->page[0], l->memop, l->mmu_idx, type, ra);
> +        tlb_lookup_nofail(cpu, &l->page[0].o, &i);
>   
>           flags = l->page[0].o.flags;
>           if (unlikely(flags & (TLB_WATCHPOINT | TLB_NOTDIRTY))) {
> -            mmu_watch_or_dirty(cpu, &l->page[0], type, ra);
> +            mmu_watch_or_dirty(cpu, &l->page[0], i.access_type, i.ra);
>           }
>           if (unlikely(flags & TLB_BSWAP)) {
>               l->memop ^= MO_BSWAP;
> @@ -1796,17 +1778,20 @@ static bool mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
>       } else {
>           /* Finish compute of page crossing. */
>           int size0 = l->page[1].addr - addr;
> -        l->page[1].size = l->page[0].size - size0;
> +        l->page[1].size = i.size - size0;
>           l->page[0].size = size0;
>   
>           /* Lookup both pages, recognizing exceptions from either. */
> -        mmu_lookup1(cpu, &l->page[0], l->memop, l->mmu_idx, type, ra);
> -        mmu_lookup1(cpu, &l->page[1], 0, l->mmu_idx, type, ra);
> +        i.size = size0;
> +        tlb_lookup_nofail(cpu, &l->page[0].o, &i);
> +        i.addr = l->page[1].addr;
> +        i.size = l->page[1].size;
> +        tlb_lookup_nofail(cpu, &l->page[1].o, &i);
>   
>           flags = l->page[0].o.flags | l->page[1].o.flags;
>           if (unlikely(flags & (TLB_WATCHPOINT | TLB_NOTDIRTY))) {
> -            mmu_watch_or_dirty(cpu, &l->page[0], type, ra);
> -            mmu_watch_or_dirty(cpu, &l->page[1], type, ra);
> +            mmu_watch_or_dirty(cpu, &l->page[0], i.access_type, i.ra);
> +            mmu_watch_or_dirty(cpu, &l->page[1], i.access_type, i.ra);
>           }
>   
>           /*
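
The crosspage path reads nicely now -- only the fields that differ are
repatched between the two lookups, per the hunk above:

    i.size = size0;              /* page 0: bytes up to the boundary */
    tlb_lookup_nofail(cpu, &l->page[0].o, &i);
    i.addr = l->page[1].addr;    /* page 1: new addr and size; mmu_idx, */
    i.size = l->page[1].size;    /* access_type and ra are all reused   */
    tlb_lookup_nofail(cpu, &l->page[1].o, &i);

which is what makes the shared TLBLookupInput pay off.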

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>



^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 31/54] accel/tcg: Always use IntervalTree for code lookups
  2024-11-14 16:01 ` [PATCH v2 31/54] accel/tcg: Always use IntervalTree for code lookups Richard Henderson
@ 2024-11-14 18:32   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:32 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> Because translation is special, we don't need the speed
> of the direct-mapped softmmu tlb.  We cache lookups in
> DisasContextBase within the translator loop anyway.
> 
> Drop the addr_code comparator from CPUTLBEntry.
> Go directly to the IntervalTree for MMU_INST_FETCH.
> Derive exec flags from read flags.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   include/exec/cpu-all.h    |  3 ++
>   include/exec/tlb-common.h |  5 ++-
>   accel/tcg/cputlb.c        | 76 ++++++++++++++++++++++++---------------
>   3 files changed, 52 insertions(+), 32 deletions(-)
> 
> diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
> index 45e6676938..ad160c328a 100644
> --- a/include/exec/cpu-all.h
> +++ b/include/exec/cpu-all.h
> @@ -339,6 +339,9 @@ static inline int cpu_mmu_index(CPUState *cs, bool ifetch)
>       (TLB_INVALID_MASK | TLB_NOTDIRTY | TLB_MMIO \
>       | TLB_FORCE_SLOW | TLB_DISCARD_WRITE)
>   
> +/* Filter read flags to exec flags. */
> +#define TLB_EXEC_FLAGS_MASK  (TLB_MMIO)
> +
>   /*
>    * Flags stored in CPUTLBEntryFull.slow_flags[x].
>    * TLB_FORCE_SLOW must be set in CPUTLBEntry.addr_idx[x].
> diff --git a/include/exec/tlb-common.h b/include/exec/tlb-common.h
> index 300f9fae67..feaa471299 100644
> --- a/include/exec/tlb-common.h
> +++ b/include/exec/tlb-common.h
> @@ -26,7 +26,6 @@ typedef union CPUTLBEntry {
>       struct {
>           uint64_t addr_read;
>           uint64_t addr_write;
> -        uint64_t addr_code;
>           /*
>            * Addend to virtual address to get host address.  IO accesses
>            * use the corresponding iotlb value.
> @@ -35,7 +34,7 @@ typedef union CPUTLBEntry {
>       };
>       /*
>        * Padding to get a power of two size, as well as index
> -     * access to addr_{read,write,code}.
> +     * access to addr_{read,write}.
>        */
>       uint64_t addr_idx[(1 << CPU_TLB_ENTRY_BITS) / sizeof(uint64_t)];
>   } CPUTLBEntry;
> @@ -92,7 +91,7 @@ struct CPUTLBEntryFull {
>        * Additional tlb flags for use by the slow path. If non-zero,
>        * the corresponding CPUTLBEntry comparator must have TLB_FORCE_SLOW.
>        */
> -    uint8_t slow_flags[MMU_ACCESS_COUNT];
> +    uint8_t slow_flags[2];
>   
>       /*
>        * Allow target-specific additions to this structure.
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 981098a6f2..be2ea1bc70 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -114,8 +114,9 @@ static inline uint64_t tlb_read_idx(const CPUTLBEntry *entry,
>                         MMU_DATA_LOAD * sizeof(uint64_t));
>       QEMU_BUILD_BUG_ON(offsetof(CPUTLBEntry, addr_write) !=
>                         MMU_DATA_STORE * sizeof(uint64_t));
> -    QEMU_BUILD_BUG_ON(offsetof(CPUTLBEntry, addr_code) !=
> -                      MMU_INST_FETCH * sizeof(uint64_t));
> +
> +    tcg_debug_assert(access_type == MMU_DATA_LOAD ||
> +                     access_type == MMU_DATA_STORE);
>   
>   #if TARGET_LONG_BITS == 32
>       /* Use qatomic_read, in case of addr_write; only care about low bits. */
> @@ -480,8 +481,7 @@ static bool tlb_hit_page_mask_anyprot(CPUTLBEntry *tlb_entry,
>       mask &= TARGET_PAGE_MASK | TLB_INVALID_MASK;
>   
>       return (page == (tlb_entry->addr_read & mask) ||
> -            page == (tlb_addr_write(tlb_entry) & mask) ||
> -            page == (tlb_entry->addr_code & mask));
> +            page == (tlb_addr_write(tlb_entry) & mask));
>   }
>   
>   /* Called with tlb_c.lock held */
> @@ -1184,9 +1184,6 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
>       /* Now calculate the new entry */
>       node->copy.addend = addend - addr_page;
>   
> -    tlb_set_compare(full, &node->copy, addr_page, read_flags,
> -                    MMU_INST_FETCH, prot & PAGE_EXEC);
> -
>       if (wp_flags & BP_MEM_READ) {
>           read_flags |= TLB_WATCHPOINT;
>       }
> @@ -1308,22 +1305,30 @@ static bool tlb_lookup(CPUState *cpu, TLBLookupOutput *o,
>       /* Primary lookup in the fast tlb. */
>       entry = tlbfast_entry(fast, addr);
>       full = &desc->fulltlb[tlbfast_index(fast, addr)];
> -    cmp = tlb_read_idx(entry, access_type);
> -    if (tlb_hit(cmp, addr)) {
> -        goto found;
> +    if (access_type != MMU_INST_FETCH) {
> +        cmp = tlb_read_idx(entry, access_type);
> +        if (tlb_hit(cmp, addr)) {
> +            goto found_data;
> +        }
>       }
>   
>       /* Secondary lookup in the IntervalTree. */
>       node = tlbtree_lookup_addr(desc, addr);
>       if (node) {
> -        cmp = tlb_read_idx(&node->copy, access_type);
> -        if (tlb_hit(cmp, addr)) {
> -            /* Install the cached entry. */
> -            qemu_spin_lock(&cpu->neg.tlb.c.lock);
> -            copy_tlb_helper_locked(entry, &node->copy);
> -            qemu_spin_unlock(&cpu->neg.tlb.c.lock);
> -            *full = node->full;
> -            goto found;
> +        if (access_type == MMU_INST_FETCH) {
> +            if (node->full.prot & PAGE_EXEC) {
> +                goto found_code;
> +            }
> +        } else {
> +            cmp = tlb_read_idx(&node->copy, access_type);
> +            if (tlb_hit(cmp, addr)) {
> +                /* Install the cached entry. */
> +                qemu_spin_lock(&cpu->neg.tlb.c.lock);
> +                copy_tlb_helper_locked(entry, &node->copy);
> +                qemu_spin_unlock(&cpu->neg.tlb.c.lock);
> +                *full = node->full;
> +                goto found_data;
> +            }
>           }
>       }
>   
> @@ -1333,9 +1338,14 @@ static bool tlb_lookup(CPUState *cpu, TLBLookupOutput *o,
>           tcg_debug_assert(probe);
>           return false;
>       }
> -
>       o->did_tlb_fill = true;
>   
> +    if (access_type == MMU_INST_FETCH) {
> +        node = tlbtree_lookup_addr(desc, addr);
> +        tcg_debug_assert(node);
> +        goto found_code;
> +    }
> +
>       entry = tlbfast_entry(fast, addr);
>       full = &desc->fulltlb[tlbfast_index(fast, addr)];
>       cmp = tlb_read_idx(entry, access_type);
> @@ -1345,14 +1355,29 @@ static bool tlb_lookup(CPUState *cpu, TLBLookupOutput *o,
>        * called tlb_fill_align, so we know that this entry *is* valid.
>        */
>       flags &= ~TLB_INVALID_MASK;
> +    goto found_data;
> +
> + found_data:
> +    flags &= cmp;
> +    flags |= full->slow_flags[access_type];
> +    o->flags = flags;
> +    o->full = *full;
> +    o->haddr = (void *)((uintptr_t)addr + entry->addend);
>       goto done;
>   
> - found:
> -    /* Alignment has not been checked by tlb_fill_align. */
> -    {
> + found_code:
> +    o->flags = node->copy.addr_read & TLB_EXEC_FLAGS_MASK;
> +    o->full = node->full;
> +    o->haddr = (void *)((uintptr_t)addr + node->copy.addend);
> +    goto done;
> +
> + done:
> +    if (!o->did_tlb_fill) {
>           int a_bits = memop_alignment_bits(memop);
>   
>           /*
> +         * Alignment has not been checked by tlb_fill_align.
> +         *
>            * The TLB_CHECK_ALIGNED check differs from the normal alignment
>            * check, in that this is based on the atomicity of the operation.
>            * The intended use case is the ARM memory type field of each PTE,
> @@ -1366,13 +1391,6 @@ static bool tlb_lookup(CPUState *cpu, TLBLookupOutput *o,
>               cpu_unaligned_access(cpu, addr, access_type, i->mmu_idx, i->ra);
>           }
>       }
> -
> - done:
> -    flags &= cmp;
> -    flags |= full->slow_flags[access_type];
> -    o->flags = flags;
> -    o->full = *full;
> -    o->haddr = (void *)((uintptr_t)addr + entry->addend);
>       return true;
>   }
>   
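
Condensing the new fetch path for my own understanding -- a paraphrase
of the control flow above, not a literal quote:

    if (access_type == MMU_INST_FETCH) {
        /* Skip the direct-mapped fast TLB entirely. */
        node = tlbtree_lookup_addr(desc, addr);
        if (node && (node->full.prot & PAGE_EXEC)) {
            goto found_code;   /* exec flags derived from addr_read */
        }
        /*
         * Otherwise tlb_fill_align runs, after which the tree is
         * guaranteed to contain the executable page.
         */
    }

which matches the commit message: instruction fetch is rare enough,
and cached in DisasContextBase anyway, that the tree walk is an
acceptable price for dropping addr_code.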

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>



^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 32/54] accel/tcg: Link CPUTLBEntry to CPUTLBEntryTree
  2024-11-14 16:01 ` [PATCH v2 32/54] accel/tcg: Link CPUTLBEntry to CPUTLBEntryTree Richard Henderson
@ 2024-11-14 18:39   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:39 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> Link from the fast tlb entry to the interval tree node.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   include/exec/tlb-common.h |  2 ++
>   accel/tcg/cputlb.c        | 26 +++++++++++++-------------
>   2 files changed, 15 insertions(+), 13 deletions(-)
> 
> diff --git a/include/exec/tlb-common.h b/include/exec/tlb-common.h
> index feaa471299..3b57d61112 100644
> --- a/include/exec/tlb-common.h
> +++ b/include/exec/tlb-common.h
> @@ -31,6 +31,8 @@ typedef union CPUTLBEntry {
>            * use the corresponding iotlb value.
>            */
>           uintptr_t addend;
> +        /* The defining IntervalTree entry. */
> +        struct CPUTLBEntryTree *tree;
>       };
>       /*
>        * Padding to get a power of two size, as well as index
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index be2ea1bc70..3282436752 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -490,7 +490,10 @@ static bool tlb_flush_entry_mask_locked(CPUTLBEntry *tlb_entry,
>                                           vaddr mask)
>   {
>       if (tlb_hit_page_mask_anyprot(tlb_entry, page, mask)) {
> -        memset(tlb_entry, -1, sizeof(*tlb_entry));
> +        tlb_entry->addr_read = -1;
> +        tlb_entry->addr_write = -1;
> +        tlb_entry->addend = 0;
> +        tlb_entry->tree = NULL;
>           return true;
>       }
>       return false;
> @@ -1183,6 +1186,7 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
>   
>       /* Now calculate the new entry */
>       node->copy.addend = addend - addr_page;
> +    node->copy.tree = node;
>   
>       if (wp_flags & BP_MEM_READ) {
>           read_flags |= TLB_WATCHPOINT;
> @@ -1291,7 +1295,6 @@ static bool tlb_lookup(CPUState *cpu, TLBLookupOutput *o,
>       CPUTLBDescFast *fast = &cpu->neg.tlb.f[i->mmu_idx];
>       vaddr addr = i->addr;
>       MMUAccessType access_type = i->access_type;
> -    CPUTLBEntryFull *full;
>       CPUTLBEntryTree *node;
>       CPUTLBEntry *entry;
>       uint64_t cmp;
> @@ -1304,9 +1307,9 @@ static bool tlb_lookup(CPUState *cpu, TLBLookupOutput *o,
>   
>       /* Primary lookup in the fast tlb. */
>       entry = tlbfast_entry(fast, addr);
> -    full = &desc->fulltlb[tlbfast_index(fast, addr)];
>       if (access_type != MMU_INST_FETCH) {
>           cmp = tlb_read_idx(entry, access_type);
> +        node = entry->tree;
>           if (tlb_hit(cmp, addr)) {
>               goto found_data;
>           }
> @@ -1326,7 +1329,6 @@ static bool tlb_lookup(CPUState *cpu, TLBLookupOutput *o,
>                   qemu_spin_lock(&cpu->neg.tlb.c.lock);
>                   copy_tlb_helper_locked(entry, &node->copy);
>                   qemu_spin_unlock(&cpu->neg.tlb.c.lock);
> -                *full = node->full;
>                   goto found_data;
>               }
>           }
> @@ -1347,8 +1349,8 @@ static bool tlb_lookup(CPUState *cpu, TLBLookupOutput *o,
>       }
>   
>       entry = tlbfast_entry(fast, addr);
> -    full = &desc->fulltlb[tlbfast_index(fast, addr)];
>       cmp = tlb_read_idx(entry, access_type);
> +    node = entry->tree;
>       /*
>        * With PAGE_WRITE_INV, we set TLB_INVALID_MASK immediately,
>        * to force the next access through tlb_fill_align.  We've just
> @@ -1359,19 +1361,18 @@ static bool tlb_lookup(CPUState *cpu, TLBLookupOutput *o,
>   
>    found_data:
>       flags &= cmp;
> -    flags |= full->slow_flags[access_type];
> +    flags |= node->full.slow_flags[access_type];
>       o->flags = flags;
> -    o->full = *full;
> -    o->haddr = (void *)((uintptr_t)addr + entry->addend);
> -    goto done;
> +    goto found_common;
>   
>    found_code:
>       o->flags = node->copy.addr_read & TLB_EXEC_FLAGS_MASK;
> +    goto found_common;
> +
> + found_common:
>       o->full = node->full;
>       o->haddr = (void *)((uintptr_t)addr + node->copy.addend);
> -    goto done;
>   
> - done:
>       if (!o->did_tlb_fill) {
>           int a_bits = memop_alignment_bits(memop);
>   
> @@ -1669,7 +1670,6 @@ bool tlb_plugin_lookup(CPUState *cpu, vaddr addr, int mmu_idx,
>                          bool is_store, struct qemu_plugin_hwaddr *data)
>   {
>       CPUTLBEntry *tlbe = tlb_entry(cpu, mmu_idx, addr);
> -    uintptr_t index = tlb_index(cpu, mmu_idx, addr);
>       MMUAccessType access_type = is_store ? MMU_DATA_STORE : MMU_DATA_LOAD;
>       uint64_t tlb_addr = tlb_read_idx(tlbe, access_type);
>       CPUTLBEntryFull *full;
> @@ -1678,7 +1678,7 @@ bool tlb_plugin_lookup(CPUState *cpu, vaddr addr, int mmu_idx,
>           return false;
>       }
>   
> -    full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
> +    full = &tlbe->tree->full;
>       data->phys_addr = full->phys_addr | (addr & ~TARGET_PAGE_MASK);
>   
>       /* We must have an iotlb entry for MMIO */
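
Worth noting: the new back-link does not grow the entry, since the
previous patch freed the addr_code slot and the union stays padded to
a power of two -- roughly, on a 64-bit host:

    typedef union CPUTLBEntry {
        struct {
            uint64_t addr_read;
            uint64_t addr_write;
            uintptr_t addend;
            struct CPUTLBEntryTree *tree;   /* new back-link */
        };
        /* Index access and power-of-two sizing preserved. */
        uint64_t addr_idx[(1 << CPU_TLB_ENTRY_BITS) / sizeof(uint64_t)];
    } CPUTLBEntry;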

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>



^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 33/54] accel/tcg: Remove CPUTLBDesc.fulltlb
  2024-11-14 16:01 ` [PATCH v2 33/54] accel/tcg: Remove CPUTLBDesc.fulltlb Richard Henderson
@ 2024-11-14 18:49   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:49 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> This array is now write-only, and may be removed.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   include/hw/core/cpu.h |  1 -
>   accel/tcg/cputlb.c    | 34 +++++++---------------------------
>   2 files changed, 7 insertions(+), 28 deletions(-)
> 
> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> index 4364ddb1db..5c069f2a00 100644
> --- a/include/hw/core/cpu.h
> +++ b/include/hw/core/cpu.h
> @@ -219,7 +219,6 @@ typedef struct CPUTLBDesc {
>       /* maximum number of entries observed in the window */
>       size_t window_max_entries;
>       size_t n_used_entries;
> -    CPUTLBEntryFull *fulltlb;
>       /* All active tlb entries for this address space. */
>       IntervalTreeRoot iroot;
>   } CPUTLBDesc;
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 3282436752..7f63dc3fd8 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -149,13 +149,6 @@ static inline CPUTLBEntry *tlbfast_entry(CPUTLBDescFast *fast, vaddr addr)
>       return fast->table + tlbfast_index(fast, addr);
>   }
>   
> -/* Find the TLB index corresponding to the mmu_idx + address pair.  */
> -static inline uintptr_t tlb_index(CPUState *cpu, uintptr_t mmu_idx,
> -                                  vaddr addr)
> -{
> -    return tlbfast_index(&cpu->neg.tlb.f[mmu_idx], addr);
> -}
> -
>   /* Find the TLB entry corresponding to the mmu_idx + address pair.  */
>   static inline CPUTLBEntry *tlb_entry(CPUState *cpu, uintptr_t mmu_idx,
>                                        vaddr addr)
> @@ -270,22 +263,20 @@ static void tlb_mmu_resize_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast,
>       }
>   
>       g_free(fast->table);
> -    g_free(desc->fulltlb);
>   
>       tlb_window_reset(desc, now, 0);
>       /* desc->n_used_entries is cleared by the caller */
>       fast->mask = (new_size - 1) << CPU_TLB_ENTRY_BITS;
>       fast->table = g_try_new(CPUTLBEntry, new_size);
> -    desc->fulltlb = g_try_new(CPUTLBEntryFull, new_size);
>   
>       /*
> -     * If the allocations fail, try smaller sizes. We just freed some
> +     * If the allocation fails, try smaller sizes. We just freed some
>        * memory, so going back to half of new_size has a good chance of working.
>        * Increased memory pressure elsewhere in the system might cause the
>        * allocations to fail though, so we progressively reduce the allocation
>        * size, aborting if we cannot even allocate the smallest TLB we support.
>        */
> -    while (fast->table == NULL || desc->fulltlb == NULL) {
> +    while (fast->table == NULL) {
>           if (new_size == (1 << CPU_TLB_DYN_MIN_BITS)) {
>               error_report("%s: %s", __func__, strerror(errno));
>               abort();
> @@ -294,9 +285,7 @@ static void tlb_mmu_resize_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast,
>           fast->mask = (new_size - 1) << CPU_TLB_ENTRY_BITS;
>   
>           g_free(fast->table);
> -        g_free(desc->fulltlb);
>           fast->table = g_try_new(CPUTLBEntry, new_size);
> -        desc->fulltlb = g_try_new(CPUTLBEntryFull, new_size);
>       }
>   }
>   
> @@ -350,7 +339,6 @@ static void tlb_mmu_init(CPUTLBDesc *desc, CPUTLBDescFast *fast, int64_t now)
>       desc->n_used_entries = 0;
>       fast->mask = (n_entries - 1) << CPU_TLB_ENTRY_BITS;
>       fast->table = g_new(CPUTLBEntry, n_entries);
> -    desc->fulltlb = g_new(CPUTLBEntryFull, n_entries);
>       memset(&desc->iroot, 0, sizeof(desc->iroot));
>       tlb_mmu_flush_locked(desc, fast);
>   }
> @@ -372,15 +360,9 @@ void tlb_init(CPUState *cpu)
>   
>   void tlb_destroy(CPUState *cpu)
>   {
> -    int i;
> -
>       qemu_spin_destroy(&cpu->neg.tlb.c.lock);
> -    for (i = 0; i < NB_MMU_MODES; i++) {
> -        CPUTLBDesc *desc = &cpu->neg.tlb.d[i];
> -        CPUTLBDescFast *fast = &cpu->neg.tlb.f[i];
> -
> -        g_free(fast->table);
> -        g_free(desc->fulltlb);
> +    for (int i = 0; i < NB_MMU_MODES; i++) {
> +        g_free(cpu->neg.tlb.f[i].table);
>           interval_tree_free_nodes(&cpu->neg.tlb.d[i].iroot,
>                                    offsetof(CPUTLBEntryTree, itree));
>       }
> @@ -1061,7 +1043,7 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
>       CPUTLB *tlb = &cpu->neg.tlb;
>       CPUTLBDesc *desc = &tlb->d[mmu_idx];
>       MemoryRegionSection *section;
> -    unsigned int index, read_flags, write_flags;
> +    unsigned int read_flags, write_flags;
>       uintptr_t addend;
>       CPUTLBEntry *te;
>       CPUTLBEntryTree *node;
> @@ -1140,7 +1122,6 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
>       wp_flags = cpu_watchpoint_address_matches(cpu, addr_page,
>                                                 TARGET_PAGE_SIZE);
>   
> -    index = tlb_index(cpu, mmu_idx, addr_page);
>       te = tlb_entry(cpu, mmu_idx, addr_page);
>   
>       /*
> @@ -1179,8 +1160,8 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
>        * subtract here is that of the page base, and not the same as the
>        * vaddr we add back in io_prepare()/get_page_addr_code().
>        */
> -    desc->fulltlb[index] = *full;
> -    full = &desc->fulltlb[index];
> +    node->full = *full;
> +    full = &node->full;
>       full->xlat_section = iotlb - addr_page;
>       full->phys_addr = paddr_page;
>   
> @@ -1203,7 +1184,6 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
>       tlb_set_compare(full, &node->copy, addr_page, write_flags,
>                       MMU_DATA_STORE, prot & PAGE_WRITE);
>   
> -    node->full = *full;
>       copy_tlb_helper_locked(te, &node->copy);
>       desc->n_used_entries++;
>       qemu_spin_unlock(&tlb->c.lock);

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>



^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 34/54] target/alpha: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:01 ` [PATCH v2 34/54] target/alpha: Convert to TCGCPUOps.tlb_fill_align Richard Henderson
@ 2024-11-14 18:53   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:53 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   target/alpha/cpu.h    |  6 +++---
>   target/alpha/cpu.c    |  2 +-
>   target/alpha/helper.c | 23 +++++++++++++++++------
>   3 files changed, 21 insertions(+), 10 deletions(-)
> 
> diff --git a/target/alpha/cpu.h b/target/alpha/cpu.h
> index 3556d3227f..70331c0b83 100644
> --- a/target/alpha/cpu.h
> +++ b/target/alpha/cpu.h
> @@ -449,9 +449,9 @@ void alpha_cpu_record_sigsegv(CPUState *cs, vaddr address,
>   void alpha_cpu_record_sigbus(CPUState *cs, vaddr address,
>                                MMUAccessType access_type, uintptr_t retaddr);
>   #else
> -bool alpha_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
> -                        MMUAccessType access_type, int mmu_idx,
> -                        bool probe, uintptr_t retaddr);
> +bool alpha_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out, vaddr addr,
> +                              MMUAccessType access_type, int mmu_idx,
> +                              MemOp memop, int size, bool probe, uintptr_t ra);
>   G_NORETURN void alpha_cpu_do_unaligned_access(CPUState *cpu, vaddr addr,
>                                                 MMUAccessType access_type, int mmu_idx,
>                                                 uintptr_t retaddr);
> diff --git a/target/alpha/cpu.c b/target/alpha/cpu.c
> index 5d75c941f7..7bcc48420d 100644
> --- a/target/alpha/cpu.c
> +++ b/target/alpha/cpu.c
> @@ -228,7 +228,7 @@ static const TCGCPUOps alpha_tcg_ops = {
>       .record_sigsegv = alpha_cpu_record_sigsegv,
>       .record_sigbus = alpha_cpu_record_sigbus,
>   #else
> -    .tlb_fill = alpha_cpu_tlb_fill,
> +    .tlb_fill_align = alpha_cpu_tlb_fill_align,
>       .cpu_exec_interrupt = alpha_cpu_exec_interrupt,
>       .cpu_exec_halt = alpha_cpu_has_work,
>       .do_interrupt = alpha_cpu_do_interrupt,
> diff --git a/target/alpha/helper.c b/target/alpha/helper.c
> index 2f1000c99f..26eadfe3ca 100644
> --- a/target/alpha/helper.c
> +++ b/target/alpha/helper.c
> @@ -294,14 +294,21 @@ hwaddr alpha_cpu_get_phys_page_debug(CPUState *cs, vaddr addr)
>       return (fail >= 0 ? -1 : phys);
>   }
>   
> -bool alpha_cpu_tlb_fill(CPUState *cs, vaddr addr, int size,
> -                        MMUAccessType access_type, int mmu_idx,
> -                        bool probe, uintptr_t retaddr)
> +bool alpha_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out, vaddr addr,
> +                              MMUAccessType access_type, int mmu_idx,
> +                              MemOp memop, int size, bool probe, uintptr_t ra)
>   {
>       CPUAlphaState *env = cpu_env(cs);
>       target_ulong phys;
>       int prot, fail;
>   
> +    if (addr & ((1 << memop_alignment_bits(memop)) - 1)) {
> +        if (probe) {
> +            return false;
> +        }
> +        alpha_cpu_do_unaligned_access(cs, addr, access_type, mmu_idx, ra);
> +    }
> +
>       fail = get_physical_address(env, addr, 1 << access_type,
>                                   mmu_idx, &phys, &prot);
>       if (unlikely(fail >= 0)) {
> @@ -314,11 +321,15 @@ bool alpha_cpu_tlb_fill(CPUState *cs, vaddr addr, int size,
>           env->trap_arg2 = (access_type == MMU_DATA_LOAD ? 0ull :
>                             access_type == MMU_DATA_STORE ? 1ull :
>                             /* access_type == MMU_INST_FETCH */ -1ull);
> -        cpu_loop_exit_restore(cs, retaddr);
> +        cpu_loop_exit_restore(cs, ra);
>       }
>   
> -    tlb_set_page(cs, addr & TARGET_PAGE_MASK, phys & TARGET_PAGE_MASK,
> -                 prot, mmu_idx, TARGET_PAGE_SIZE);
> +    memset(out, 0, sizeof(*out));
> +    out->phys_addr = phys;
> +    out->prot = prot;
> +    out->attrs = MEMTXATTRS_UNSPECIFIED;
> +    out->lg_page_size = TARGET_PAGE_BITS;
> +
>       return true;
>   }
>   
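
The mask idiom is compact; expanded for clarity, with the helper's
semantics assumed from include/exec/memop.h:

    /*
     * memop_alignment_bits() returns log2 of the required alignment,
     * e.g. 2 for a naturally aligned 4-byte access, giving mask 0x3.
     */
    unsigned a_bits = memop_alignment_bits(memop);
    if (addr & ((1u << a_bits) - 1)) {
        if (probe) {
            return false;   /* probes report failure instead of trapping */
        }
        alpha_cpu_do_unaligned_access(cs, addr, access_type, mmu_idx, ra);
    }

i.e. probe callers now see misalignment as a clean failure rather than
an exception.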

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>



^ permalink raw reply	[flat|nested] 114+ messages in thread

* Re: [PATCH v2 35/54] target/avr: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:01 ` [PATCH v2 35/54] target/avr: " Richard Henderson
@ 2024-11-14 18:53   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:53 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   target/avr/cpu.h    |  7 ++++---
>   target/avr/cpu.c    |  2 +-
>   target/avr/helper.c | 19 ++++++++++++-------
>   3 files changed, 17 insertions(+), 11 deletions(-)
> 
> diff --git a/target/avr/cpu.h b/target/avr/cpu.h
> index 4725535102..cdd3bcd418 100644
> --- a/target/avr/cpu.h
> +++ b/target/avr/cpu.h
> @@ -23,6 +23,7 @@
>   
>   #include "cpu-qom.h"
>   #include "exec/cpu-defs.h"
> +#include "exec/memop.h"
>   
>   #ifdef CONFIG_USER_ONLY
>   #error "AVR 8-bit does not support user mode"
> @@ -238,9 +239,9 @@ static inline void cpu_set_sreg(CPUAVRState *env, uint8_t sreg)
>       env->sregI = (sreg >> 7) & 0x01;
>   }
>   
> -bool avr_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
> -                      MMUAccessType access_type, int mmu_idx,
> -                      bool probe, uintptr_t retaddr);
> +bool avr_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out, vaddr addr,
> +                            MMUAccessType access_type, int mmu_idx,
> +                            MemOp memop, int size, bool probe, uintptr_t ra);
>   
>   #include "exec/cpu-all.h"
>   
> diff --git a/target/avr/cpu.c b/target/avr/cpu.c
> index 3132842d56..a7fe869396 100644
> --- a/target/avr/cpu.c
> +++ b/target/avr/cpu.c
> @@ -211,7 +211,7 @@ static const TCGCPUOps avr_tcg_ops = {
>       .restore_state_to_opc = avr_restore_state_to_opc,
>       .cpu_exec_interrupt = avr_cpu_exec_interrupt,
>       .cpu_exec_halt = avr_cpu_has_work,
> -    .tlb_fill = avr_cpu_tlb_fill,
> +    .tlb_fill_align = avr_cpu_tlb_fill_align,
>       .do_interrupt = avr_cpu_do_interrupt,
>   };
>   
> diff --git a/target/avr/helper.c b/target/avr/helper.c
> index 345708a1b3..a18f11aa9f 100644
> --- a/target/avr/helper.c
> +++ b/target/avr/helper.c
> @@ -104,11 +104,11 @@ hwaddr avr_cpu_get_phys_page_debug(CPUState *cs, vaddr addr)
>       return addr; /* I assume 1:1 address correspondence */
>   }
>   
> -bool avr_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
> -                      MMUAccessType access_type, int mmu_idx,
> -                      bool probe, uintptr_t retaddr)
> +bool avr_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out, vaddr address,
> +                            MMUAccessType access_type, int mmu_idx,
> +                            MemOp memop, int size, bool probe, uintptr_t ra)
>   {
> -    int prot, page_size = TARGET_PAGE_SIZE;
> +    int prot, lg_page_size = TARGET_PAGE_BITS;
>       uint32_t paddr;
>   
>       address &= TARGET_PAGE_MASK;
> @@ -141,15 +141,20 @@ bool avr_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
>                * to force tlb_fill to be called for the next access.
>                */
>               if (probe) {
> -                page_size = 1;
> +                lg_page_size = 0;
>               } else {
>                   cpu_env(cs)->fullacc = 1;
> -                cpu_loop_exit_restore(cs, retaddr);
> +                cpu_loop_exit_restore(cs, ra);
>               }
>           }
>       }
>   
> -    tlb_set_page(cs, address, paddr, prot, mmu_idx, page_size);
> +    memset(out, 0, sizeof(*out));
> +    out->phys_addr = paddr;
> +    out->prot = prot;
> +    out->attrs = MEMTXATTRS_UNSPECIFIED;
> +    out->lg_page_size = lg_page_size;
> +
>       return true;
>   }
>   

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
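
This conversion, like the rest of the series, follows one shape: rather
than calling tlb_set_page* itself, the hook checks alignment where the
target enforces it, translates, and fills in the caller-provided
CPUTLBEntryFull, leaving the core to install the page.  A minimal
sketch of that contract, using only the fields visible in the diffs;
the xxx_* names are placeholders, and xxx_translate /
xxx_raise_mmu_fault are hypothetical stand-ins for a target's
translation and fault helpers:

    bool xxx_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out, vaddr addr,
                                MMUAccessType access_type, int mmu_idx,
                                MemOp memop, int size, bool probe, uintptr_t ra)
    {
        hwaddr phys;
        int prot;

        /* Fault early if the access is misaligned for this MemOp. */
        if (addr & ((1 << memop_alignment_bits(memop)) - 1)) {
            if (probe) {
                return false;
            }
            xxx_cpu_do_unaligned_access(cs, addr, access_type, mmu_idx, ra);
        }

        if (!xxx_translate(cs, addr, access_type, mmu_idx, &phys, &prot)) {
            if (probe) {
                return false;
            }
            xxx_raise_mmu_fault(cs, addr, access_type, mmu_idx, ra); /* noreturn */
        }

        /* On success, describe the page; unset fields must be zeroed. */
        memset(out, 0, sizeof(*out));
        out->phys_addr = phys;
        out->prot = prot;
        out->lg_page_size = TARGET_PAGE_BITS;
        out->attrs = MEMTXATTRS_UNSPECIFIED;
        return true;
    }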

* Re: [PATCH v2 36/54] target/i386: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:01 ` [PATCH v2 36/54] target/i386: " Richard Henderson
@ 2024-11-14 18:53   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:53 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   target/i386/tcg/helper-tcg.h         |  6 +++---
>   target/i386/tcg/sysemu/excp_helper.c | 28 ++++++++++++++++------------
>   target/i386/tcg/tcg-cpu.c            |  2 +-
>   3 files changed, 20 insertions(+), 16 deletions(-)
> 
> diff --git a/target/i386/tcg/helper-tcg.h b/target/i386/tcg/helper-tcg.h
> index 696d6ef016..b2164f41e6 100644
> --- a/target/i386/tcg/helper-tcg.h
> +++ b/target/i386/tcg/helper-tcg.h
> @@ -79,9 +79,9 @@ void x86_cpu_record_sigsegv(CPUState *cs, vaddr addr,
>   void x86_cpu_record_sigbus(CPUState *cs, vaddr addr,
>                              MMUAccessType access_type, uintptr_t ra);
>   #else
> -bool x86_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
> -                      MMUAccessType access_type, int mmu_idx,
> -                      bool probe, uintptr_t retaddr);
> +bool x86_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out, vaddr addr,
> +                            MMUAccessType access_type, int mmu_idx,
> +                            MemOp memop, int size, bool probe, uintptr_t ra);
>   G_NORETURN void x86_cpu_do_unaligned_access(CPUState *cs, vaddr vaddr,
>                                               MMUAccessType access_type,
>                                               int mmu_idx, uintptr_t retaddr);
> diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c
> index 168ff8e5f3..d23d28fef5 100644
> --- a/target/i386/tcg/sysemu/excp_helper.c
> +++ b/target/i386/tcg/sysemu/excp_helper.c
> @@ -601,25 +601,29 @@ static bool get_physical_address(CPUX86State *env, vaddr addr,
>       return true;
>   }
>   
> -bool x86_cpu_tlb_fill(CPUState *cs, vaddr addr, int size,
> -                      MMUAccessType access_type, int mmu_idx,
> -                      bool probe, uintptr_t retaddr)
> +bool x86_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *full, vaddr addr,
> +                            MMUAccessType access_type, int mmu_idx,
> +                            MemOp memop, int size, bool probe,
> +                            uintptr_t retaddr)
>   {
>       CPUX86State *env = cpu_env(cs);
>       TranslateResult out;
>       TranslateFault err;
>   
> +    if (addr & ((1 << memop_alignment_bits(memop)) - 1)) {
> +        if (probe) {
> +            return false;
> +        }
> +        x86_cpu_do_unaligned_access(cs, addr, access_type, mmu_idx, retaddr);
> +    }
> +
>       if (get_physical_address(env, addr, access_type, mmu_idx, &out, &err,
>                                retaddr)) {
> -        /*
> -         * Even if 4MB pages, we map only one 4KB page in the cache to
> -         * avoid filling it too fast.
> -         */
> -        assert(out.prot & (1 << access_type));
> -        tlb_set_page_with_attrs(cs, addr & TARGET_PAGE_MASK,
> -                                out.paddr & TARGET_PAGE_MASK,
> -                                cpu_get_mem_attrs(env),
> -                                out.prot, mmu_idx, out.page_size);
> +        memset(full, 0, sizeof(*full));
> +        full->phys_addr = out.paddr;
> +        full->prot = out.prot;
> +        full->lg_page_size = ctz32(out.page_size);
> +        full->attrs = cpu_get_mem_attrs(env);
>           return true;
>       }
>   
> diff --git a/target/i386/tcg/tcg-cpu.c b/target/i386/tcg/tcg-cpu.c
> index cca19cd40e..6fce6227c7 100644
> --- a/target/i386/tcg/tcg-cpu.c
> +++ b/target/i386/tcg/tcg-cpu.c
> @@ -117,7 +117,7 @@ static const TCGCPUOps x86_tcg_ops = {
>       .record_sigsegv = x86_cpu_record_sigsegv,
>       .record_sigbus = x86_cpu_record_sigbus,
>   #else
> -    .tlb_fill = x86_cpu_tlb_fill,
> +    .tlb_fill_align = x86_cpu_tlb_fill_align,
>       .do_interrupt = x86_cpu_do_interrupt,
>       .cpu_exec_halt = x86_cpu_exec_halt,
>       .cpu_exec_interrupt = x86_cpu_exec_interrupt,

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
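
A note on the alignment test that first appears in this patch and
recurs throughout the series: memop_alignment_bits() yields the log2 of
the alignment the MemOp demands, so the low-bit mask flags any
misaligned address.  Assuming a naturally aligned 4-byte access
(MO_32 | MO_ALIGN), it returns 2 and the test reduces to

    addr & ((1 << 2) - 1)    /* == addr & 3, nonzero iff misaligned */

so misaligned accesses are diverted to x86_cpu_do_unaligned_access (or
return false for probes) before any translation is attempted.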

* Re: [PATCH v2 37/54] target/loongarch: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:01 ` [PATCH v2 37/54] target/loongarch: " Richard Henderson
@ 2024-11-14 18:53   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:53 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   target/loongarch/internals.h      |  7 ++++---
>   target/loongarch/cpu.c            |  2 +-
>   target/loongarch/tcg/tlb_helper.c | 17 +++++++++++------
>   3 files changed, 16 insertions(+), 10 deletions(-)
> 
> diff --git a/target/loongarch/internals.h b/target/loongarch/internals.h
> index 1a02427627..a9f73f27b2 100644
> --- a/target/loongarch/internals.h
> +++ b/target/loongarch/internals.h
> @@ -60,9 +60,10 @@ int get_physical_address(CPULoongArchState *env, hwaddr *physical,
>   hwaddr loongarch_cpu_get_phys_page_debug(CPUState *cpu, vaddr addr);
>   
>   #ifdef CONFIG_TCG
> -bool loongarch_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
> -                            MMUAccessType access_type, int mmu_idx,
> -                            bool probe, uintptr_t retaddr);
> +bool loongarch_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
> +                                  vaddr addr, MMUAccessType access_type,
> +                                  int mmu_idx, MemOp memop, int size,
> +                                  bool probe, uintptr_t ra);
>   #endif
>   #endif /* !CONFIG_USER_ONLY */
>   
> diff --git a/target/loongarch/cpu.c b/target/loongarch/cpu.c
> index 57cc4f314b..47d69f1788 100644
> --- a/target/loongarch/cpu.c
> +++ b/target/loongarch/cpu.c
> @@ -798,7 +798,7 @@ static const TCGCPUOps loongarch_tcg_ops = {
>       .restore_state_to_opc = loongarch_restore_state_to_opc,
>   
>   #ifndef CONFIG_USER_ONLY
> -    .tlb_fill = loongarch_cpu_tlb_fill,
> +    .tlb_fill_align = loongarch_cpu_tlb_fill_align,
>       .cpu_exec_interrupt = loongarch_cpu_exec_interrupt,
>       .cpu_exec_halt = loongarch_cpu_has_work,
>       .do_interrupt = loongarch_cpu_do_interrupt,
> diff --git a/target/loongarch/tcg/tlb_helper.c b/target/loongarch/tcg/tlb_helper.c
> index 97f38fc391..94d5df08a4 100644
> --- a/target/loongarch/tcg/tlb_helper.c
> +++ b/target/loongarch/tcg/tlb_helper.c
> @@ -474,9 +474,10 @@ void helper_invtlb_page_asid_or_g(CPULoongArchState *env,
>       tlb_flush(env_cpu(env));
>   }
>   
> -bool loongarch_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
> -                            MMUAccessType access_type, int mmu_idx,
> -                            bool probe, uintptr_t retaddr)
> +bool loongarch_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
> +                                  vaddr address, MMUAccessType access_type,
> +                                  int mmu_idx, MemOp memop, int size,
> +                                  bool probe, uintptr_t retaddr)
>   {
>       CPULoongArchState *env = cpu_env(cs);
>       hwaddr physical;
> @@ -488,12 +489,16 @@ bool loongarch_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
>                                  access_type, mmu_idx);
>   
>       if (ret == TLBRET_MATCH) {
> -        tlb_set_page(cs, address & TARGET_PAGE_MASK,
> -                     physical & TARGET_PAGE_MASK, prot,
> -                     mmu_idx, TARGET_PAGE_SIZE);
>           qemu_log_mask(CPU_LOG_MMU,
>                         "%s address=%" VADDR_PRIx " physical " HWADDR_FMT_plx
>                         " prot %d\n", __func__, address, physical, prot);
> +
> +        memset(out, 0, sizeof(*out));
> +        out->phys_addr = physical;
> +        out->prot = prot;
> +        out->attrs = MEMTXATTRS_UNSPECIFIED;
> +        out->lg_page_size = TARGET_PAGE_BITS;
> +
>           return true;
>       } else {
>           qemu_log_mask(CPU_LOG_MMU,

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>

* Re: [PATCH v2 38/54] target/m68k: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:01 ` [PATCH v2 38/54] target/m68k: " Richard Henderson
@ 2024-11-14 18:53   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:53 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   target/m68k/cpu.h    |  7 ++++---
>   target/m68k/cpu.c    |  2 +-
>   target/m68k/helper.c | 22 +++++++++++++---------
>   3 files changed, 18 insertions(+), 13 deletions(-)
> 
> diff --git a/target/m68k/cpu.h b/target/m68k/cpu.h
> index b5bbeedb7a..4401426a0b 100644
> --- a/target/m68k/cpu.h
> +++ b/target/m68k/cpu.h
> @@ -22,6 +22,7 @@
>   #define M68K_CPU_H
>   
>   #include "exec/cpu-defs.h"
> +#include "exec/memop.h"
>   #include "qemu/cpu-float.h"
>   #include "cpu-qom.h"
>   
> @@ -582,10 +583,10 @@ enum {
>   #define MMU_KERNEL_IDX 0
>   #define MMU_USER_IDX 1
>   
> -bool m68k_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
> -                       MMUAccessType access_type, int mmu_idx,
> -                       bool probe, uintptr_t retaddr);
>   #ifndef CONFIG_USER_ONLY
> +bool m68k_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out, vaddr addr,
> +                             MMUAccessType access_type, int mmu_idx,
> +                             MemOp memop, int size, bool probe, uintptr_t ra);
>   void m68k_cpu_transaction_failed(CPUState *cs, hwaddr physaddr, vaddr addr,
>                                    unsigned size, MMUAccessType access_type,
>                                    int mmu_idx, MemTxAttrs attrs,
> diff --git a/target/m68k/cpu.c b/target/m68k/cpu.c
> index 5fe335558a..5316cf8922 100644
> --- a/target/m68k/cpu.c
> +++ b/target/m68k/cpu.c
> @@ -550,7 +550,7 @@ static const TCGCPUOps m68k_tcg_ops = {
>       .restore_state_to_opc = m68k_restore_state_to_opc,
>   
>   #ifndef CONFIG_USER_ONLY
> -    .tlb_fill = m68k_cpu_tlb_fill,
> +    .tlb_fill_align = m68k_cpu_tlb_fill_align,
>       .cpu_exec_interrupt = m68k_cpu_exec_interrupt,
>       .cpu_exec_halt = m68k_cpu_has_work,
>       .do_interrupt = m68k_cpu_do_interrupt,
> diff --git a/target/m68k/helper.c b/target/m68k/helper.c
> index 9bfc6ae97c..1decb6f39c 100644
> --- a/target/m68k/helper.c
> +++ b/target/m68k/helper.c
> @@ -950,9 +950,10 @@ void m68k_set_irq_level(M68kCPU *cpu, int level, uint8_t vector)
>       }
>   }
>   
> -bool m68k_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
> -                       MMUAccessType qemu_access_type, int mmu_idx,
> -                       bool probe, uintptr_t retaddr)
> +bool m68k_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
> +                             vaddr address, MMUAccessType qemu_access_type,
> +                             int mmu_idx, MemOp memop, int size,
> +                             bool probe, uintptr_t retaddr)
>   {
>       CPUM68KState *env = cpu_env(cs);
>       hwaddr physical;
> @@ -961,12 +962,14 @@ bool m68k_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
>       int ret;
>       target_ulong page_size;
>   
> +    memset(out, 0, sizeof(*out));
> +    out->attrs = MEMTXATTRS_UNSPECIFIED;
> +
>       if ((env->mmu.tcr & M68K_TCR_ENABLED) == 0) {
>           /* MMU disabled */
> -        tlb_set_page(cs, address & TARGET_PAGE_MASK,
> -                     address & TARGET_PAGE_MASK,
> -                     PAGE_READ | PAGE_WRITE | PAGE_EXEC,
> -                     mmu_idx, TARGET_PAGE_SIZE);
> +        out->phys_addr = address;
> +        out->prot = PAGE_READ | PAGE_WRITE | PAGE_EXEC;
> +        out->lg_page_size = TARGET_PAGE_BITS;
>           return true;
>       }
>   
> @@ -985,8 +988,9 @@ bool m68k_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
>       ret = get_physical_address(env, &physical, &prot,
>                                  address, access_type, &page_size);
>       if (likely(ret == 0)) {
> -        tlb_set_page(cs, address & TARGET_PAGE_MASK,
> -                     physical & TARGET_PAGE_MASK, prot, mmu_idx, page_size);
> +        out->phys_addr = physical;
> +        out->prot = prot;
> +        out->lg_page_size = ctz32(page_size);
>           return true;
>       }
>   

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>

* Re: [PATCH v2 39/54] target/m68k: Do not call tlb_set_page in helper_ptest
  2024-11-14 16:01 ` [PATCH v2 39/54] target/m68k: Do not call tlb_set_page in helper_ptest Richard Henderson
@ 2024-11-14 18:53   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:53 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> The entire operation of ptest is performed within
> get_physical_address as part of ACCESS_PTEST.
> There is no need to install the page into softmmu.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   target/m68k/helper.c | 10 +---------
>   1 file changed, 1 insertion(+), 9 deletions(-)
> 
> diff --git a/target/m68k/helper.c b/target/m68k/helper.c
> index 1decb6f39c..0a54eca9bb 100644
> --- a/target/m68k/helper.c
> +++ b/target/m68k/helper.c
> @@ -1460,7 +1460,6 @@ void HELPER(ptest)(CPUM68KState *env, uint32_t addr, uint32_t is_read)
>       hwaddr physical;
>       int access_type;
>       int prot;
> -    int ret;
>       target_ulong page_size;
>   
>       access_type = ACCESS_PTEST;
> @@ -1476,14 +1475,7 @@ void HELPER(ptest)(CPUM68KState *env, uint32_t addr, uint32_t is_read)
>   
>       env->mmu.mmusr = 0;
>       env->mmu.ssw = 0;
> -    ret = get_physical_address(env, &physical, &prot, addr,
> -                               access_type, &page_size);
> -    if (ret == 0) {
> -        tlb_set_page(env_cpu(env), addr & TARGET_PAGE_MASK,
> -                     physical & TARGET_PAGE_MASK,
> -                     prot, access_type & ACCESS_SUPER ?
> -                     MMU_KERNEL_IDX : MMU_USER_IDX, page_size);
> -    }
> +    get_physical_address(env, &physical, &prot, addr, access_type, &page_size);
>   }
>   
>   void HELPER(pflush)(CPUM68KState *env, uint32_t addr, uint32_t opmode)

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>

* Re: [PATCH v2 40/54] target/microblaze: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:01 ` [PATCH v2 40/54] target/microblaze: Convert to TCGCPUOps.tlb_fill_align Richard Henderson
@ 2024-11-14 18:53   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:53 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   target/microblaze/cpu.h    |  7 +++----
>   target/microblaze/cpu.c    |  2 +-
>   target/microblaze/helper.c | 33 ++++++++++++++++++++-------------
>   3 files changed, 24 insertions(+), 18 deletions(-)
> 
> diff --git a/target/microblaze/cpu.h b/target/microblaze/cpu.h
> index 3e5a3e5c60..b0eadfd9b1 100644
> --- a/target/microblaze/cpu.h
> +++ b/target/microblaze/cpu.h
> @@ -421,10 +421,9 @@ static inline void cpu_get_tb_cpu_state(CPUMBState *env, vaddr *pc,
>   }
>   
>   #if !defined(CONFIG_USER_ONLY)
> -bool mb_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
> -                     MMUAccessType access_type, int mmu_idx,
> -                     bool probe, uintptr_t retaddr);
> -
> +bool mb_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out, vaddr address,
> +                           MMUAccessType access_type, int mmu_idx,
> +                           MemOp memop, int size, bool probe, uintptr_t ra);
>   void mb_cpu_transaction_failed(CPUState *cs, hwaddr physaddr, vaddr addr,
>                                  unsigned size, MMUAccessType access_type,
>                                  int mmu_idx, MemTxAttrs attrs,
> diff --git a/target/microblaze/cpu.c b/target/microblaze/cpu.c
> index 710eb1146c..212cad2143 100644
> --- a/target/microblaze/cpu.c
> +++ b/target/microblaze/cpu.c
> @@ -425,7 +425,7 @@ static const TCGCPUOps mb_tcg_ops = {
>       .restore_state_to_opc = mb_restore_state_to_opc,
>   
>   #ifndef CONFIG_USER_ONLY
> -    .tlb_fill = mb_cpu_tlb_fill,
> +    .tlb_fill_align = mb_cpu_tlb_fill_align,
>       .cpu_exec_interrupt = mb_cpu_exec_interrupt,
>       .cpu_exec_halt = mb_cpu_has_work,
>       .do_interrupt = mb_cpu_do_interrupt,
> diff --git a/target/microblaze/helper.c b/target/microblaze/helper.c
> index 5d3259ce31..b6375564b4 100644
> --- a/target/microblaze/helper.c
> +++ b/target/microblaze/helper.c
> @@ -36,37 +36,44 @@ static bool mb_cpu_access_is_secure(MicroBlazeCPU *cpu,
>       }
>   }
>   
> -bool mb_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
> -                     MMUAccessType access_type, int mmu_idx,
> -                     bool probe, uintptr_t retaddr)
> +bool mb_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out, vaddr address,
> +                           MMUAccessType access_type, int mmu_idx,
> +                           MemOp memop, int size,
> +                           bool probe, uintptr_t retaddr)
>   {
>       MicroBlazeCPU *cpu = MICROBLAZE_CPU(cs);
>       CPUMBState *env = &cpu->env;
>       MicroBlazeMMULookup lu;
>       unsigned int hit;
> -    int prot;
> -    MemTxAttrs attrs = {};
>   
> -    attrs.secure = mb_cpu_access_is_secure(cpu, access_type);
> +    if (address & ((1 << memop_alignment_bits(memop)) - 1)) {
> +        if (probe) {
> +            return false;
> +        }
> +        mb_cpu_do_unaligned_access(cs, address, access_type, mmu_idx, retaddr);
> +    }
> +
> +    memset(out, 0, sizeof(*out));
> +    out->attrs.secure = mb_cpu_access_is_secure(cpu, access_type);
> +    out->lg_page_size = TARGET_PAGE_BITS;
>   
>       if (mmu_idx == MMU_NOMMU_IDX) {
>           /* MMU disabled or not available.  */
> -        address &= TARGET_PAGE_MASK;
> -        prot = PAGE_RWX;
> -        tlb_set_page_with_attrs(cs, address, address, attrs, prot, mmu_idx,
> -                                TARGET_PAGE_SIZE);
> +        out->phys_addr = address;
> +        out->prot = PAGE_RWX;
>           return true;
>       }
>   
>       hit = mmu_translate(cpu, &lu, address, access_type, mmu_idx);
>       if (likely(hit)) {
> -        uint32_t vaddr = address & TARGET_PAGE_MASK;
> +        uint32_t vaddr = address;
>           uint32_t paddr = lu.paddr + vaddr - lu.vaddr;
>   
>           qemu_log_mask(CPU_LOG_MMU, "MMU map mmu=%d v=%x p=%x prot=%x\n",
>                         mmu_idx, vaddr, paddr, lu.prot);
> -        tlb_set_page_with_attrs(cs, vaddr, paddr, attrs, lu.prot, mmu_idx,
> -                                TARGET_PAGE_SIZE);
> +
> +        out->phys_addr = paddr;
> +        out->prot = lu.prot;
>           return true;
>       }
>   

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>

* Re: [PATCH v2 41/54] target/mips: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:01 ` [PATCH v2 41/54] target/mips: " Richard Henderson
@ 2024-11-14 18:53   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:53 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   target/mips/tcg/tcg-internal.h      |  6 +++---
>   target/mips/cpu.c                   |  2 +-
>   target/mips/tcg/sysemu/tlb_helper.c | 29 ++++++++++++++++++++---------
>   3 files changed, 24 insertions(+), 13 deletions(-)
> 
> diff --git a/target/mips/tcg/tcg-internal.h b/target/mips/tcg/tcg-internal.h
> index aef032c48d..f4b00354af 100644
> --- a/target/mips/tcg/tcg-internal.h
> +++ b/target/mips/tcg/tcg-internal.h
> @@ -61,9 +61,9 @@ void mips_cpu_do_transaction_failed(CPUState *cs, hwaddr physaddr,
>                                       MemTxResult response, uintptr_t retaddr);
>   void cpu_mips_tlb_flush(CPUMIPSState *env);
>   
> -bool mips_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
> -                       MMUAccessType access_type, int mmu_idx,
> -                       bool probe, uintptr_t retaddr);
> +bool mips_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out, vaddr address,
> +                             MMUAccessType access_type, int mmu_idx,
> +                             MemOp memop, int size, bool probe, uintptr_t ra);
>   
>   void mips_semihosting(CPUMIPSState *env);
>   
> diff --git a/target/mips/cpu.c b/target/mips/cpu.c
> index d0a43b6d5c..3a453c9285 100644
> --- a/target/mips/cpu.c
> +++ b/target/mips/cpu.c
> @@ -556,7 +556,7 @@ static const TCGCPUOps mips_tcg_ops = {
>       .restore_state_to_opc = mips_restore_state_to_opc,
>   
>   #if !defined(CONFIG_USER_ONLY)
> -    .tlb_fill = mips_cpu_tlb_fill,
> +    .tlb_fill_align = mips_cpu_tlb_fill_align,
>       .cpu_exec_interrupt = mips_cpu_exec_interrupt,
>       .cpu_exec_halt = mips_cpu_has_work,
>       .do_interrupt = mips_cpu_do_interrupt,
> diff --git a/target/mips/tcg/sysemu/tlb_helper.c b/target/mips/tcg/sysemu/tlb_helper.c
> index e98bb95951..ac76396525 100644
> --- a/target/mips/tcg/sysemu/tlb_helper.c
> +++ b/target/mips/tcg/sysemu/tlb_helper.c
> @@ -904,15 +904,28 @@ refill:
>   }
>   #endif
>   
> -bool mips_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
> -                       MMUAccessType access_type, int mmu_idx,
> -                       bool probe, uintptr_t retaddr)
> +bool mips_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out, vaddr address,
> +                             MMUAccessType access_type, int mmu_idx,
> +                             MemOp memop, int size,
> +                             bool probe, uintptr_t retaddr)
>   {
>       CPUMIPSState *env = cpu_env(cs);
>       hwaddr physical;
>       int prot;
>       int ret = TLBRET_BADADDR;
>   
> +    if (address & ((1 << memop_alignment_bits(memop)) - 1)) {
> +        if (probe) {
> +            return false;
> +        }
> +        mips_cpu_do_unaligned_access(cs, address, access_type,
> +                                     mmu_idx, retaddr);
> +    }
> +
> +    memset(out, 0, sizeof(*out));
> +    out->attrs = MEMTXATTRS_UNSPECIFIED;
> +    out->lg_page_size = TARGET_PAGE_BITS;
> +
>       /* data access */
>       /* XXX: put correct access by using cpu_restore_state() correctly */
>       ret = get_physical_address(env, &physical, &prot, address,
> @@ -930,9 +943,8 @@ bool mips_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
>           break;
>       }
>       if (ret == TLBRET_MATCH) {
> -        tlb_set_page(cs, address & TARGET_PAGE_MASK,
> -                     physical & TARGET_PAGE_MASK, prot,
> -                     mmu_idx, TARGET_PAGE_SIZE);
> +        out->phys_addr = physical;
> +        out->prot = prot;
>           return true;
>       }
>   #if !defined(TARGET_MIPS64)
> @@ -948,9 +960,8 @@ bool mips_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
>               ret = get_physical_address(env, &physical, &prot, address,
>                                          access_type, mmu_idx);
>               if (ret == TLBRET_MATCH) {
> -                tlb_set_page(cs, address & TARGET_PAGE_MASK,
> -                             physical & TARGET_PAGE_MASK, prot,
> -                             mmu_idx, TARGET_PAGE_SIZE);
> +                out->phys_addr = physical;
> +                out->prot = prot;
>                   return true;
>               }
>           }

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>

* Re: [PATCH v2 42/54] target/openrisc: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:01 ` [PATCH v2 42/54] target/openrisc: " Richard Henderson
@ 2024-11-14 18:53   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:53 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   target/openrisc/cpu.h |  8 +++++---
>   target/openrisc/cpu.c |  2 +-
>   target/openrisc/mmu.c | 39 +++++++++++++++++++++------------------
>   3 files changed, 27 insertions(+), 22 deletions(-)
> 
> diff --git a/target/openrisc/cpu.h b/target/openrisc/cpu.h
> index c9fe9ae12d..e177ad8b84 100644
> --- a/target/openrisc/cpu.h
> +++ b/target/openrisc/cpu.h
> @@ -22,6 +22,7 @@
>   
>   #include "cpu-qom.h"
>   #include "exec/cpu-defs.h"
> +#include "exec/memop.h"
>   #include "fpu/softfloat-types.h"
>   
>   /**
> @@ -306,9 +307,10 @@ int print_insn_or1k(bfd_vma addr, disassemble_info *info);
>   #ifndef CONFIG_USER_ONLY
>   hwaddr openrisc_cpu_get_phys_page_debug(CPUState *cpu, vaddr addr);
>   
> -bool openrisc_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
> -                           MMUAccessType access_type, int mmu_idx,
> -                           bool probe, uintptr_t retaddr);
> +bool openrisc_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
> +                                 vaddr addr, MMUAccessType access_type,
> +                                 int mmu_idx, MemOp memop, int size,
> +                                 bool probe, uintptr_t ra);
>   
>   extern const VMStateDescription vmstate_openrisc_cpu;
>   
> diff --git a/target/openrisc/cpu.c b/target/openrisc/cpu.c
> index b96561d1f2..6aa04ff7d3 100644
> --- a/target/openrisc/cpu.c
> +++ b/target/openrisc/cpu.c
> @@ -237,7 +237,7 @@ static const TCGCPUOps openrisc_tcg_ops = {
>       .restore_state_to_opc = openrisc_restore_state_to_opc,
>   
>   #ifndef CONFIG_USER_ONLY
> -    .tlb_fill = openrisc_cpu_tlb_fill,
> +    .tlb_fill_align = openrisc_cpu_tlb_fill_align,
>       .cpu_exec_interrupt = openrisc_cpu_exec_interrupt,
>       .cpu_exec_halt = openrisc_cpu_has_work,
>       .do_interrupt = openrisc_cpu_do_interrupt,
> diff --git a/target/openrisc/mmu.c b/target/openrisc/mmu.c
> index c632d5230b..eafab356a6 100644
> --- a/target/openrisc/mmu.c
> +++ b/target/openrisc/mmu.c
> @@ -104,39 +104,42 @@ static void raise_mmu_exception(OpenRISCCPU *cpu, target_ulong address,
>       cpu->env.lock_addr = -1;
>   }
>   
> -bool openrisc_cpu_tlb_fill(CPUState *cs, vaddr addr, int size,
> -                           MMUAccessType access_type, int mmu_idx,
> -                           bool probe, uintptr_t retaddr)
> +bool openrisc_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
> +                                 vaddr addr, MMUAccessType access_type,
> +                                 int mmu_idx, MemOp memop, int size,
> +                                 bool probe, uintptr_t retaddr)
>   {
>       OpenRISCCPU *cpu = OPENRISC_CPU(cs);
> -    int excp = EXCP_DPF;
>       int prot;
>       hwaddr phys_addr;
>   
> +    /* TODO: alignment faults not currently handled. */
> +
>       if (mmu_idx == MMU_NOMMU_IDX) {
>           /* The mmu is disabled; lookups never fail.  */
>           get_phys_nommu(&phys_addr, &prot, addr);
> -        excp = 0;
>       } else {
>           bool super = mmu_idx == MMU_SUPERVISOR_IDX;
>           int need = (access_type == MMU_INST_FETCH ? PAGE_EXEC
>                       : access_type == MMU_DATA_STORE ? PAGE_WRITE
>                       : PAGE_READ);
> -        excp = get_phys_mmu(cpu, &phys_addr, &prot, addr, need, super);
> +        int excp = get_phys_mmu(cpu, &phys_addr, &prot, addr, need, super);
> +
> +        if (unlikely(excp)) {
> +            if (probe) {
> +                return false;
> +            }
> +            raise_mmu_exception(cpu, addr, excp);
> +            cpu_loop_exit_restore(cs, retaddr);
> +        }
>       }
>   
> -    if (likely(excp == 0)) {
> -        tlb_set_page(cs, addr & TARGET_PAGE_MASK,
> -                     phys_addr & TARGET_PAGE_MASK, prot,
> -                     mmu_idx, TARGET_PAGE_SIZE);
> -        return true;
> -    }
> -    if (probe) {
> -        return false;
> -    }
> -
> -    raise_mmu_exception(cpu, addr, excp);
> -    cpu_loop_exit_restore(cs, retaddr);
> +    memset(out, 0, sizeof(*out));
> +    out->phys_addr = phys_addr;
> +    out->prot = prot;
> +    out->lg_page_size = TARGET_PAGE_BITS;
> +    out->attrs = MEMTXATTRS_UNSPECIFIED;
> +    return true;
>   }
>   
>   hwaddr openrisc_cpu_get_phys_page_debug(CPUState *cs, vaddr addr)

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>

* Re: [PATCH v2 43/54] target/ppc: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:01 ` [PATCH v2 43/54] target/ppc: " Richard Henderson
@ 2024-11-14 18:53   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:53 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   target/ppc/internal.h   |  7 ++++---
>   target/ppc/cpu_init.c   |  2 +-
>   target/ppc/mmu_helper.c | 21 ++++++++++++++++-----
>   3 files changed, 21 insertions(+), 9 deletions(-)
> 
> diff --git a/target/ppc/internal.h b/target/ppc/internal.h
> index 20fb2ec593..9d132d35a1 100644
> --- a/target/ppc/internal.h
> +++ b/target/ppc/internal.h
> @@ -273,9 +273,10 @@ void ppc_cpu_record_sigsegv(CPUState *cs, vaddr addr,
>                               MMUAccessType access_type,
>                               bool maperr, uintptr_t ra);
>   #else
> -bool ppc_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
> -                      MMUAccessType access_type, int mmu_idx,
> -                      bool probe, uintptr_t retaddr);
> +bool ppc_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
> +                            vaddr addr, MMUAccessType access_type,
> +                            int mmu_idx, MemOp memop, int size,
> +                            bool probe, uintptr_t ra);
>   G_NORETURN void ppc_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
>                                               MMUAccessType access_type, int mmu_idx,
>                                               uintptr_t retaddr);
> diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
> index efcb80d1c2..387c7ff2da 100644
> --- a/target/ppc/cpu_init.c
> +++ b/target/ppc/cpu_init.c
> @@ -7422,7 +7422,7 @@ static const TCGCPUOps ppc_tcg_ops = {
>   #ifdef CONFIG_USER_ONLY
>     .record_sigsegv = ppc_cpu_record_sigsegv,
>   #else
> -  .tlb_fill = ppc_cpu_tlb_fill,
> +  .tlb_fill_align = ppc_cpu_tlb_fill_align,
>     .cpu_exec_interrupt = ppc_cpu_exec_interrupt,
>     .cpu_exec_halt = ppc_cpu_has_work,
>     .do_interrupt = ppc_cpu_do_interrupt,
> diff --git a/target/ppc/mmu_helper.c b/target/ppc/mmu_helper.c
> index b167b37e0a..bf98e0efb0 100644
> --- a/target/ppc/mmu_helper.c
> +++ b/target/ppc/mmu_helper.c
> @@ -1357,18 +1357,29 @@ void helper_check_tlb_flush_global(CPUPPCState *env)
>   }
>   
>   
> -bool ppc_cpu_tlb_fill(CPUState *cs, vaddr eaddr, int size,
> -                      MMUAccessType access_type, int mmu_idx,
> -                      bool probe, uintptr_t retaddr)
> +bool ppc_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
> +                            vaddr eaddr, MMUAccessType access_type,
> +                            int mmu_idx, MemOp memop, int size,
> +                            bool probe, uintptr_t retaddr)
>   {
>       PowerPCCPU *cpu = POWERPC_CPU(cs);
>       hwaddr raddr;
>       int page_size, prot;
>   
> +    if (eaddr & ((1 << memop_alignment_bits(memop)) - 1)) {
> +        if (probe) {
> +            return false;
> +        }
> +        ppc_cpu_do_unaligned_access(cs, eaddr, access_type, mmu_idx, retaddr);
> +    }
> +
>       if (ppc_xlate(cpu, eaddr, access_type, &raddr,
>                     &page_size, &prot, mmu_idx, !probe)) {
> -        tlb_set_page(cs, eaddr & TARGET_PAGE_MASK, raddr & TARGET_PAGE_MASK,
> -                     prot, mmu_idx, 1UL << page_size);
> +        memset(out, 0, sizeof(*out));
> +        out->phys_addr = raddr;
> +        out->prot = prot;
> +        out->lg_page_size = page_size;
> +        out->attrs = MEMTXATTRS_UNSPECIFIED;
>           return true;
>       }
>       if (probe) {

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>

* Re: [PATCH v2 44/54] target/riscv: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:01 ` [PATCH v2 44/54] target/riscv: " Richard Henderson
@ 2024-11-14 18:54   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:54 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   target/riscv/cpu.h         |  8 +++++---
>   target/riscv/cpu_helper.c  | 22 +++++++++++++++++-----
>   target/riscv/tcg/tcg-cpu.c |  2 +-
>   3 files changed, 23 insertions(+), 9 deletions(-)
> 
> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> index 284b112821..f97c4f3410 100644
> --- a/target/riscv/cpu.h
> +++ b/target/riscv/cpu.h
> @@ -25,6 +25,7 @@
>   #include "hw/qdev-properties.h"
>   #include "exec/cpu-defs.h"
>   #include "exec/gdbstub.h"
> +#include "exec/memop.h"
>   #include "qemu/cpu-float.h"
>   #include "qom/object.h"
>   #include "qemu/int128.h"
> @@ -563,9 +564,10 @@ bool cpu_get_bcfien(CPURISCVState *env);
>   G_NORETURN void  riscv_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
>                                                  MMUAccessType access_type,
>                                                  int mmu_idx, uintptr_t retaddr);
> -bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
> -                        MMUAccessType access_type, int mmu_idx,
> -                        bool probe, uintptr_t retaddr);
> +bool riscv_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
> +                              vaddr addr, MMUAccessType access_type,
> +                              int mmu_idx, MemOp memop, int size,
> +                              bool probe, uintptr_t ra);
>   char *riscv_isa_string(RISCVCPU *cpu);
>   int riscv_cpu_max_xlen(RISCVCPUClass *mcc);
>   bool riscv_cpu_option_set(const char *optname);
> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> index 0a3ead69ea..edb2edfc55 100644
> --- a/target/riscv/cpu_helper.c
> +++ b/target/riscv/cpu_helper.c
> @@ -1429,9 +1429,10 @@ static void pmu_tlb_fill_incr_ctr(RISCVCPU *cpu, MMUAccessType access_type)
>       riscv_pmu_incr_ctr(cpu, pmu_event_type);
>   }
>   
> -bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
> -                        MMUAccessType access_type, int mmu_idx,
> -                        bool probe, uintptr_t retaddr)
> +bool riscv_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
> +                              vaddr address, MMUAccessType access_type,
> +                              int mmu_idx, MemOp memop, int size,
> +                              bool probe, uintptr_t retaddr)
>   {
>       RISCVCPU *cpu = RISCV_CPU(cs);
>       CPURISCVState *env = &cpu->env;
> @@ -1452,6 +1453,14 @@ bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
>       qemu_log_mask(CPU_LOG_MMU, "%s ad %" VADDR_PRIx " rw %d mmu_idx %d\n",
>                     __func__, address, access_type, mmu_idx);
>   
> +    if (address & ((1 << memop_alignment_bits(memop)) - 1)) {
> +        if (probe) {
> +            return false;
> +        }
> +        riscv_cpu_do_unaligned_access(cs, address, access_type,
> +                                      mmu_idx, retaddr);
> +    }
> +
>       pmu_tlb_fill_incr_ctr(cpu, access_type);
>       if (two_stage_lookup) {
>           /* Two stage lookup */
> @@ -1544,8 +1553,11 @@ bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
>       }
>   
>       if (ret == TRANSLATE_SUCCESS) {
> -        tlb_set_page(cs, address & ~(tlb_size - 1), pa & ~(tlb_size - 1),
> -                     prot, mmu_idx, tlb_size);
> +        memset(out, 0, sizeof(*out));
> +        out->phys_addr = pa;
> +        out->prot = prot;
> +        out->lg_page_size = ctz64(tlb_size);
> +        out->attrs = MEMTXATTRS_UNSPECIFIED;
>           return true;
>       } else if (probe) {
>           return false;
> diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
> index c62c221696..f3b436bb86 100644
> --- a/target/riscv/tcg/tcg-cpu.c
> +++ b/target/riscv/tcg/tcg-cpu.c
> @@ -138,7 +138,7 @@ static const TCGCPUOps riscv_tcg_ops = {
>       .restore_state_to_opc = riscv_restore_state_to_opc,
>   
>   #ifndef CONFIG_USER_ONLY
> -    .tlb_fill = riscv_cpu_tlb_fill,
> +    .tlb_fill_align = riscv_cpu_tlb_fill_align,
>       .cpu_exec_interrupt = riscv_cpu_exec_interrupt,
>       .cpu_exec_halt = riscv_cpu_has_work,
>       .do_interrupt = riscv_cpu_do_interrupt,

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>

* Re: [PATCH v2 45/54] target/rx: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:01 ` [PATCH v2 45/54] target/rx: " Richard Henderson
@ 2024-11-14 18:54   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:54 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   target/rx/cpu.c | 19 +++++++++++--------
>   1 file changed, 11 insertions(+), 8 deletions(-)
> 
> diff --git a/target/rx/cpu.c b/target/rx/cpu.c
> index 65a74ce720..c83a582141 100644
> --- a/target/rx/cpu.c
> +++ b/target/rx/cpu.c
> @@ -161,16 +161,19 @@ static void rx_cpu_disas_set_info(CPUState *cpu, disassemble_info *info)
>       info->print_insn = print_insn_rx;
>   }
>   
> -static bool rx_cpu_tlb_fill(CPUState *cs, vaddr addr, int size,
> -                            MMUAccessType access_type, int mmu_idx,
> -                            bool probe, uintptr_t retaddr)
> +static bool rx_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
> +                                  vaddr addr, MMUAccessType access_type,
> +                                  int mmu_idx, MemOp memop, int size,
> +                                  bool probe, uintptr_t retaddr)
>   {
> -    uint32_t address, physical, prot;
> +    /* TODO: alignment faults not currently handled. */
>   
>       /* Linear mapping */
> -    address = physical = addr & TARGET_PAGE_MASK;
> -    prot = PAGE_READ | PAGE_WRITE | PAGE_EXEC;
> -    tlb_set_page(cs, address, physical, prot, mmu_idx, TARGET_PAGE_SIZE);
> +    memset(out, 0, sizeof(*out));
> +    out->phys_addr = addr;
> +    out->prot = PAGE_READ | PAGE_WRITE | PAGE_EXEC;
> +    out->lg_page_size = TARGET_PAGE_BITS;
> +    out->attrs = MEMTXATTRS_UNSPECIFIED;
>       return true;
>   }
>   
> @@ -195,7 +198,7 @@ static const TCGCPUOps rx_tcg_ops = {
>       .initialize = rx_translate_init,
>       .synchronize_from_tb = rx_cpu_synchronize_from_tb,
>       .restore_state_to_opc = rx_restore_state_to_opc,
> -    .tlb_fill = rx_cpu_tlb_fill,
> +    .tlb_fill_align = rx_cpu_tlb_fill_align,
>   
>   #ifndef CONFIG_USER_ONLY
>       .cpu_exec_interrupt = rx_cpu_exec_interrupt,

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>

* Re: [PATCH v2 46/54] target/s390x: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:01 ` [PATCH v2 46/54] target/s390x: " Richard Henderson
@ 2024-11-14 18:54   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:54 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   target/s390x/s390x-internal.h  |  7 ++++---
>   target/s390x/cpu.c             |  4 ++--
>   target/s390x/tcg/excp_helper.c | 23 ++++++++++++++++++-----
>   3 files changed, 24 insertions(+), 10 deletions(-)
> 
> diff --git a/target/s390x/s390x-internal.h b/target/s390x/s390x-internal.h
> index 825252d728..eb6fe24c9a 100644
> --- a/target/s390x/s390x-internal.h
> +++ b/target/s390x/s390x-internal.h
> @@ -278,9 +278,10 @@ void s390_cpu_record_sigsegv(CPUState *cs, vaddr address,
>   void s390_cpu_record_sigbus(CPUState *cs, vaddr address,
>                               MMUAccessType access_type, uintptr_t retaddr);
>   #else
> -bool s390_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
> -                       MMUAccessType access_type, int mmu_idx,
> -                       bool probe, uintptr_t retaddr);
> +bool s390x_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
> +                              vaddr addr, MMUAccessType access_type,
> +                              int mmu_idx, MemOp memop, int size,
> +                              bool probe, uintptr_t retaddr);
>   G_NORETURN void s390x_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
>                                                 MMUAccessType access_type, int mmu_idx,
>                                                 uintptr_t retaddr);
> diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
> index 514c70f301..4d0eb129e3 100644
> --- a/target/s390x/cpu.c
> +++ b/target/s390x/cpu.c
> @@ -330,7 +330,7 @@ void cpu_get_tb_cpu_state(CPUS390XState *env, vaddr *pc,
>            * Instructions must be at even addresses.
>            * This needs to be checked before address translation.
>            */
> -        env->int_pgm_ilen = 2; /* see s390_cpu_tlb_fill() */
> +        env->int_pgm_ilen = 2; /* see s390x_cpu_tlb_fill_align() */
>           tcg_s390_program_interrupt(env, PGM_SPECIFICATION, 0);
>       }
>   
> @@ -364,7 +364,7 @@ static const TCGCPUOps s390_tcg_ops = {
>       .record_sigsegv = s390_cpu_record_sigsegv,
>       .record_sigbus = s390_cpu_record_sigbus,
>   #else
> -    .tlb_fill = s390_cpu_tlb_fill,
> +    .tlb_fill_align = s390x_cpu_tlb_fill_align,
>       .cpu_exec_interrupt = s390_cpu_exec_interrupt,
>       .cpu_exec_halt = s390_cpu_has_work,
>       .do_interrupt = s390_cpu_do_interrupt,
> diff --git a/target/s390x/tcg/excp_helper.c b/target/s390x/tcg/excp_helper.c
> index 4c0b692c9e..6d61032a4a 100644
> --- a/target/s390x/tcg/excp_helper.c
> +++ b/target/s390x/tcg/excp_helper.c
> @@ -139,9 +139,10 @@ static inline uint64_t cpu_mmu_idx_to_asc(int mmu_idx)
>       }
>   }
>   
> -bool s390_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
> -                       MMUAccessType access_type, int mmu_idx,
> -                       bool probe, uintptr_t retaddr)
> +bool s390x_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
> +                              vaddr address, MMUAccessType access_type,
> +                              int mmu_idx, MemOp memop, int size,
> +                              bool probe, uintptr_t retaddr)
>   {
>       CPUS390XState *env = cpu_env(cs);
>       target_ulong vaddr, raddr;
> @@ -151,6 +152,14 @@ bool s390_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
>       qemu_log_mask(CPU_LOG_MMU, "%s: addr 0x%" VADDR_PRIx " rw %d mmu_idx %d\n",
>                     __func__, address, access_type, mmu_idx);
>   
> +    if (address & ((1 << memop_alignment_bits(memop)) - 1)) {
> +        if (probe) {
> +            return false;
> +        }
> +        s390x_cpu_do_unaligned_access(cs, address, access_type,
> +                                      mmu_idx, retaddr);
> +    }
> +
>       vaddr = address;
>   
>       if (mmu_idx < MMU_REAL_IDX) {
> @@ -177,8 +186,12 @@ bool s390_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
>           qemu_log_mask(CPU_LOG_MMU,
>                         "%s: set tlb %" PRIx64 " -> %" PRIx64 " (%x)\n",
>                         __func__, (uint64_t)vaddr, (uint64_t)raddr, prot);
> -        tlb_set_page(cs, address & TARGET_PAGE_MASK, raddr, prot,
> -                     mmu_idx, TARGET_PAGE_SIZE);
> +
> +        memset(out, 0, sizeof(*out));
> +        out->phys_addr = raddr;
> +        out->prot = prot;
> +        out->lg_page_size = TARGET_PAGE_BITS;
> +        out->attrs = MEMTXATTRS_UNSPECIFIED;
>           return true;
>       }
>       if (probe) {

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>

* Re: [PATCH v2 47/54] target/sh4: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:01 ` [PATCH v2 47/54] target/sh4: " Richard Henderson
@ 2024-11-14 18:54   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:54 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   target/sh4/cpu.h    |  8 +++++---
>   target/sh4/cpu.c    |  2 +-
>   target/sh4/helper.c | 24 +++++++++++++++++-------
>   3 files changed, 23 insertions(+), 11 deletions(-)
> 
> diff --git a/target/sh4/cpu.h b/target/sh4/cpu.h
> index d928bcf006..161efdefcf 100644
> --- a/target/sh4/cpu.h
> +++ b/target/sh4/cpu.h
> @@ -22,6 +22,7 @@
>   
>   #include "cpu-qom.h"
>   #include "exec/cpu-defs.h"
> +#include "exec/memop.h"
>   #include "qemu/cpu-float.h"
>   
>   /* CPU Subtypes */
> @@ -251,9 +252,10 @@ void sh4_translate_init(void);
>   
>   #if !defined(CONFIG_USER_ONLY)
>   hwaddr superh_cpu_get_phys_page_debug(CPUState *cpu, vaddr addr);
> -bool superh_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
> -                         MMUAccessType access_type, int mmu_idx,
> -                         bool probe, uintptr_t retaddr);
> +bool superh_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
> +                               vaddr addr, MMUAccessType access_type,
> +                               int mmu_idx, MemOp memop, int size,
> +                               bool probe, uintptr_t retaddr);
>   void superh_cpu_do_interrupt(CPUState *cpu);
>   bool superh_cpu_exec_interrupt(CPUState *cpu, int int_req);
>   void cpu_sh4_invalidate_tlb(CPUSH4State *s);
> diff --git a/target/sh4/cpu.c b/target/sh4/cpu.c
> index 8f07261dcf..8ca8b90e3c 100644
> --- a/target/sh4/cpu.c
> +++ b/target/sh4/cpu.c
> @@ -252,7 +252,7 @@ static const TCGCPUOps superh_tcg_ops = {
>       .restore_state_to_opc = superh_restore_state_to_opc,
>   
>   #ifndef CONFIG_USER_ONLY
> -    .tlb_fill = superh_cpu_tlb_fill,
> +    .tlb_fill_align = superh_cpu_tlb_fill_align,
>       .cpu_exec_interrupt = superh_cpu_exec_interrupt,
>       .cpu_exec_halt = superh_cpu_has_work,
>       .do_interrupt = superh_cpu_do_interrupt,
> diff --git a/target/sh4/helper.c b/target/sh4/helper.c
> index 9659c69550..543ac1b843 100644
> --- a/target/sh4/helper.c
> +++ b/target/sh4/helper.c
> @@ -792,22 +792,32 @@ bool superh_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
>       return false;
>   }
>   
> -bool superh_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
> -                         MMUAccessType access_type, int mmu_idx,
> -                         bool probe, uintptr_t retaddr)
> +bool superh_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
> +                               vaddr address, MMUAccessType access_type,
> +                               int mmu_idx, MemOp memop, int size,
> +                               bool probe, uintptr_t retaddr)
>   {
>       CPUSH4State *env = cpu_env(cs);
>       int ret;
> -
>       target_ulong physical;
>       int prot;
>   
> +    if (address & ((1 << memop_alignment_bits(memop)) - 1)) {
> +        if (probe) {
> +            return false;
> +        }
> +        superh_cpu_do_unaligned_access(cs, address, access_type,
> +                                       mmu_idx, retaddr);
> +    }
> +
>       ret = get_physical_address(env, &physical, &prot, address, access_type);
>   
>       if (ret == MMU_OK) {
> -        address &= TARGET_PAGE_MASK;
> -        physical &= TARGET_PAGE_MASK;
> -        tlb_set_page(cs, address, physical, prot, mmu_idx, TARGET_PAGE_SIZE);
> +        memset(out, 0, sizeof(*out));
> +        out->phys_addr = physical;
> +        out->prot = prot;
> +        out->lg_page_size = TARGET_PAGE_BITS;
> +        out->attrs = MEMTXATTRS_UNSPECIFIED;
>           return true;
>       }
>       if (probe) {

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>

* Re: [PATCH v2 48/54] target/sparc: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:01 ` [PATCH v2 48/54] target/sparc: " Richard Henderson
@ 2024-11-14 18:54   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:54 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   target/sparc/cpu.h        |  8 ++++---
>   target/sparc/cpu.c        |  2 +-
>   target/sparc/mmu_helper.c | 44 +++++++++++++++++++++++++--------------
>   3 files changed, 34 insertions(+), 20 deletions(-)
> 
> diff --git a/target/sparc/cpu.h b/target/sparc/cpu.h
> index f517e5a383..4c8927e9fa 100644
> --- a/target/sparc/cpu.h
> +++ b/target/sparc/cpu.h
> @@ -4,6 +4,7 @@
>   #include "qemu/bswap.h"
>   #include "cpu-qom.h"
>   #include "exec/cpu-defs.h"
> +#include "exec/memop.h"
>   #include "qemu/cpu-float.h"
>   
>   #if !defined(TARGET_SPARC64)
> @@ -596,9 +597,10 @@ G_NORETURN void cpu_raise_exception_ra(CPUSPARCState *, int, uintptr_t);
>   void cpu_sparc_set_id(CPUSPARCState *env, unsigned int cpu);
>   void sparc_cpu_list(void);
>   /* mmu_helper.c */
> -bool sparc_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
> -                        MMUAccessType access_type, int mmu_idx,
> -                        bool probe, uintptr_t retaddr);
> +bool sparc_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
> +                              vaddr addr, MMUAccessType access_type,
> +                              int mmu_idx, MemOp memop, int size,
> +                              bool probe, uintptr_t retaddr);
>   target_ulong mmu_probe(CPUSPARCState *env, target_ulong address, int mmulev);
>   void dump_mmu(CPUSPARCState *env);
>   
> diff --git a/target/sparc/cpu.c b/target/sparc/cpu.c
> index dd7af86de7..57ae53bd71 100644
> --- a/target/sparc/cpu.c
> +++ b/target/sparc/cpu.c
> @@ -932,7 +932,7 @@ static const TCGCPUOps sparc_tcg_ops = {
>       .restore_state_to_opc = sparc_restore_state_to_opc,
>   
>   #ifndef CONFIG_USER_ONLY
> -    .tlb_fill = sparc_cpu_tlb_fill,
> +    .tlb_fill_align = sparc_cpu_tlb_fill_align,
>       .cpu_exec_interrupt = sparc_cpu_exec_interrupt,
>       .cpu_exec_halt = sparc_cpu_has_work,
>       .do_interrupt = sparc_cpu_do_interrupt,
> diff --git a/target/sparc/mmu_helper.c b/target/sparc/mmu_helper.c
> index 9ff06026b8..32766a37d6 100644
> --- a/target/sparc/mmu_helper.c
> +++ b/target/sparc/mmu_helper.c
> @@ -203,12 +203,12 @@ static int get_physical_address(CPUSPARCState *env, CPUTLBEntryFull *full,
>   }
>   
>   /* Perform address translation */
> -bool sparc_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
> -                        MMUAccessType access_type, int mmu_idx,
> -                        bool probe, uintptr_t retaddr)
> +bool sparc_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
> +                              vaddr address, MMUAccessType access_type,
> +                              int mmu_idx, MemOp memop, int size,
> +                              bool probe, uintptr_t retaddr)
>   {
>       CPUSPARCState *env = cpu_env(cs);
> -    CPUTLBEntryFull full = {};
>       target_ulong vaddr;
>       int error_code = 0, access_index;
>   
> @@ -220,16 +220,21 @@ bool sparc_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
>        */
>       assert(!probe);
>   
> +    if (address & ((1 << memop_alignment_bits(memop)) - 1)) {
> +        sparc_cpu_do_unaligned_access(cs, address, access_type,
> +                                      mmu_idx, retaddr);
> +    }
> +
> +    memset(out, 0, sizeof(*out));
>       address &= TARGET_PAGE_MASK;
> -    error_code = get_physical_address(env, &full, &access_index,
> +    error_code = get_physical_address(env, out, &access_index,
>                                         address, access_type, mmu_idx);
>       vaddr = address;
>       if (likely(error_code == 0)) {
>           qemu_log_mask(CPU_LOG_MMU,
>                         "Translate at %" VADDR_PRIx " -> "
>                         HWADDR_FMT_plx ", vaddr " TARGET_FMT_lx "\n",
> -                      address, full.phys_addr, vaddr);
> -        tlb_set_page_full(cs, mmu_idx, vaddr, &full);
> +                      address, out->phys_addr, vaddr);
>           return true;
>       }
>   
> @@ -244,8 +249,7 @@ bool sparc_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
>              permissions. If no mapping is available, redirect accesses to
>              neverland. Fake/overridden mappings will be flushed when
>              switching to normal mode. */
> -        full.prot = PAGE_READ | PAGE_WRITE | PAGE_EXEC;
> -        tlb_set_page_full(cs, mmu_idx, vaddr, &full);
> +        out->prot = PAGE_READ | PAGE_WRITE | PAGE_EXEC;
>           return true;
>       } else {
>           if (access_type == MMU_INST_FETCH) {
> @@ -754,22 +758,30 @@ static int get_physical_address(CPUSPARCState *env, CPUTLBEntryFull *full,
>   }
>   
>   /* Perform address translation */
> -bool sparc_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
> -                        MMUAccessType access_type, int mmu_idx,
> -                        bool probe, uintptr_t retaddr)
> +bool sparc_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
> +                              vaddr address, MMUAccessType access_type,
> +                              int mmu_idx, MemOp memop, int size,
> +                              bool probe, uintptr_t retaddr)
>   {
>       CPUSPARCState *env = cpu_env(cs);
> -    CPUTLBEntryFull full = {};
>       int error_code = 0, access_index;
>   
> +    if (address & ((1 << memop_alignment_bits(memop)) - 1)) {
> +        if (probe) {
> +            return false;
> +        }
> +        sparc_cpu_do_unaligned_access(cs, address, access_type,
> +                                      mmu_idx, retaddr);
> +    }
> +
> +    memset(out, 0, sizeof(*out));
>       address &= TARGET_PAGE_MASK;
> -    error_code = get_physical_address(env, &full, &access_index,
> +    error_code = get_physical_address(env, out, &access_index,
>                                         address, access_type, mmu_idx);
>       if (likely(error_code == 0)) {
> -        trace_mmu_helper_mmu_fault(address, full.phys_addr, mmu_idx, env->tl,
> +        trace_mmu_helper_mmu_fault(address, out->phys_addr, mmu_idx, env->tl,
>                                      env->dmmu.mmu_primary_context,
>                                      env->dmmu.mmu_secondary_context);
> -        tlb_set_page_full(cs, mmu_idx, address, &full);
>           return true;
>       }
>       if (probe) {

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>




* Re: [PATCH v2 49/54] target/tricore: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:01 ` [PATCH v2 49/54] target/tricore: " Richard Henderson
@ 2024-11-14 18:54   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:54 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   target/tricore/cpu.h    |  7 ++++---
>   target/tricore/cpu.c    |  2 +-
>   target/tricore/helper.c | 19 ++++++++++++-------
>   3 files changed, 17 insertions(+), 11 deletions(-)
> 
> diff --git a/target/tricore/cpu.h b/target/tricore/cpu.h
> index 220af69fc2..5f141ce8f3 100644
> --- a/target/tricore/cpu.h
> +++ b/target/tricore/cpu.h
> @@ -268,8 +268,9 @@ static inline void cpu_get_tb_cpu_state(CPUTriCoreState *env, vaddr *pc,
>   #define CPU_RESOLVING_TYPE TYPE_TRICORE_CPU
>   
>   /* helpers.c */
> -bool tricore_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
> -                          MMUAccessType access_type, int mmu_idx,
> -                          bool probe, uintptr_t retaddr);
> +bool tricore_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
> +                                vaddr addr, MMUAccessType access_type,
> +                                int mmu_idx, MemOp memop, int size,
> +                                bool probe, uintptr_t retaddr);
>   
>   #endif /* TRICORE_CPU_H */
> diff --git a/target/tricore/cpu.c b/target/tricore/cpu.c
> index 1a26171590..29e0b5d129 100644
> --- a/target/tricore/cpu.c
> +++ b/target/tricore/cpu.c
> @@ -173,7 +173,7 @@ static const TCGCPUOps tricore_tcg_ops = {
>       .initialize = tricore_tcg_init,
>       .synchronize_from_tb = tricore_cpu_synchronize_from_tb,
>       .restore_state_to_opc = tricore_restore_state_to_opc,
> -    .tlb_fill = tricore_cpu_tlb_fill,
> +    .tlb_fill_align = tricore_cpu_tlb_fill_align,
>       .cpu_exec_interrupt = tricore_cpu_exec_interrupt,
>       .cpu_exec_halt = tricore_cpu_has_work,
>   };
> diff --git a/target/tricore/helper.c b/target/tricore/helper.c
> index 7014255f77..8c6bf63298 100644
> --- a/target/tricore/helper.c
> +++ b/target/tricore/helper.c
> @@ -64,16 +64,19 @@ static void raise_mmu_exception(CPUTriCoreState *env, target_ulong address,
>   {
>   }
>   
> -bool tricore_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
> -                          MMUAccessType rw, int mmu_idx,
> -                          bool probe, uintptr_t retaddr)
> +bool tricore_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
> +                                vaddr address, MMUAccessType access_type,
> +                                int mmu_idx, MemOp memop, int size,
> +                                bool probe, uintptr_t retaddr)
>   {
>       CPUTriCoreState *env = cpu_env(cs);
>       hwaddr physical;
>       int prot;
>       int ret = 0;
> +    int rw = access_type & 1;
> +
> +    /* TODO: alignment faults not currently handled. */
>   
> -    rw &= 1;
>       ret = get_physical_address(env, &physical, &prot,
>                                  address, rw, mmu_idx);
>   
> @@ -82,9 +85,11 @@ bool tricore_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
>                     __func__, address, ret, physical, prot);
>   
>       if (ret == TLBRET_MATCH) {
> -        tlb_set_page(cs, address & TARGET_PAGE_MASK,
> -                     physical & TARGET_PAGE_MASK, prot | PAGE_EXEC,
> -                     mmu_idx, TARGET_PAGE_SIZE);
> +        memset(out, 0, sizeof(*out));
> +        out->phys_addr = physical;
> +        out->prot = prot | PAGE_EXEC;
> +        out->lg_page_size = TARGET_PAGE_BITS;
> +        out->attrs = MEMTXATTRS_UNSPECIFIED;
>           return true;
>       } else {
>           assert(ret < 0);

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>




* Re: [PATCH v2 50/54] target/xtensa: Convert to TCGCPUOps.tlb_fill_align
  2024-11-14 16:01 ` [PATCH v2 50/54] target/xtensa: " Richard Henderson
@ 2024-11-14 18:54   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:54 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   target/xtensa/cpu.h    |  8 +++++---
>   target/xtensa/cpu.c    |  2 +-
>   target/xtensa/helper.c | 28 ++++++++++++++++++++--------
>   3 files changed, 26 insertions(+), 12 deletions(-)
> 
> diff --git a/target/xtensa/cpu.h b/target/xtensa/cpu.h
> index 77e48eef19..68c3d90d41 100644
> --- a/target/xtensa/cpu.h
> +++ b/target/xtensa/cpu.h
> @@ -31,6 +31,7 @@
>   #include "cpu-qom.h"
>   #include "qemu/cpu-float.h"
>   #include "exec/cpu-defs.h"
> +#include "exec/memop.h"
>   #include "hw/clock.h"
>   #include "xtensa-isa.h"
>   
> @@ -580,9 +581,10 @@ struct XtensaCPUClass {
>   };
>   
>   #ifndef CONFIG_USER_ONLY
> -bool xtensa_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
> -                         MMUAccessType access_type, int mmu_idx,
> -                         bool probe, uintptr_t retaddr);
> +bool xtensa_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
> +                               vaddr addr, MMUAccessType access_type,
> +                               int mmu_idx, MemOp memop, int size,
> +                               bool probe, uintptr_t retaddr);
>   void xtensa_cpu_do_interrupt(CPUState *cpu);
>   bool xtensa_cpu_exec_interrupt(CPUState *cpu, int interrupt_request);
>   void xtensa_cpu_do_transaction_failed(CPUState *cs, hwaddr physaddr, vaddr addr,
> diff --git a/target/xtensa/cpu.c b/target/xtensa/cpu.c
> index 6f9039abae..3e4ec97e0e 100644
> --- a/target/xtensa/cpu.c
> +++ b/target/xtensa/cpu.c
> @@ -232,7 +232,7 @@ static const TCGCPUOps xtensa_tcg_ops = {
>       .restore_state_to_opc = xtensa_restore_state_to_opc,
>   
>   #ifndef CONFIG_USER_ONLY
> -    .tlb_fill = xtensa_cpu_tlb_fill,
> +    .tlb_fill_align = xtensa_cpu_tlb_fill_align,
>       .cpu_exec_interrupt = xtensa_cpu_exec_interrupt,
>       .cpu_exec_halt = xtensa_cpu_has_work,
>       .do_interrupt = xtensa_cpu_do_interrupt,
> diff --git a/target/xtensa/helper.c b/target/xtensa/helper.c
> index ca214b948a..69b0e661c8 100644
> --- a/target/xtensa/helper.c
> +++ b/target/xtensa/helper.c
> @@ -261,15 +261,26 @@ void xtensa_cpu_do_unaligned_access(CPUState *cs,
>                                     addr);
>   }
>   
> -bool xtensa_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
> -                         MMUAccessType access_type, int mmu_idx,
> -                         bool probe, uintptr_t retaddr)
> +bool xtensa_cpu_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
> +                               vaddr address, MMUAccessType access_type,
> +                               int mmu_idx, MemOp memop, int size,
> +                               bool probe, uintptr_t retaddr)
>   {
>       CPUXtensaState *env = cpu_env(cs);
>       uint32_t paddr;
>       uint32_t page_size;
>       unsigned access;
> -    int ret = xtensa_get_physical_addr(env, true, address, access_type,
> +    int ret;
> +
> +    if (address & ((1 << memop_alignment_bits(memop)) - 1)) {
> +        if (probe) {
> +            return false;
> +        }
> +        xtensa_cpu_do_unaligned_access(cs, address, access_type,
> +                                       mmu_idx, retaddr);
> +    }
> +
> +    ret = xtensa_get_physical_addr(env, true, address, access_type,
>                                          mmu_idx, &paddr, &page_size, &access);
>   
>       qemu_log_mask(CPU_LOG_MMU, "%s(%08" VADDR_PRIx
> @@ -277,10 +288,11 @@ bool xtensa_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
>                     __func__, address, access_type, mmu_idx, paddr, ret);
>   
>       if (ret == 0) {
> -        tlb_set_page(cs,
> -                     address & TARGET_PAGE_MASK,
> -                     paddr & TARGET_PAGE_MASK,
> -                     access, mmu_idx, page_size);
> +        memset(out, 0, sizeof(*out));
> +        out->phys_addr = paddr;
> +        out->prot = access;
> +        out->lg_page_size = ctz32(page_size);
> +        out->attrs = MEMTXATTRS_UNSPECIFIED;
>           return true;
>       } else if (probe) {
>           return false;

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>




* Re: [PATCH v2 51/54] accel/tcg: Drop TCGCPUOps.tlb_fill
  2024-11-14 16:01 ` [PATCH v2 51/54] accel/tcg: Drop TCGCPUOps.tlb_fill Richard Henderson
@ 2024-11-14 18:55   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:55 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> Now that all targets have been converted to tlb_fill_align,
> remove the tlb_fill hook.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   include/hw/core/tcg-cpu-ops.h | 10 ----------
>   accel/tcg/cputlb.c            | 19 ++++---------------
>   2 files changed, 4 insertions(+), 25 deletions(-)
> 
> diff --git a/include/hw/core/tcg-cpu-ops.h b/include/hw/core/tcg-cpu-ops.h
> index 663efb9133..70cafcc6cd 100644
> --- a/include/hw/core/tcg-cpu-ops.h
> +++ b/include/hw/core/tcg-cpu-ops.h
> @@ -157,16 +157,6 @@ struct TCGCPUOps {
>       bool (*tlb_fill_align)(CPUState *cpu, CPUTLBEntryFull *out, vaddr addr,
>                              MMUAccessType access_type, int mmu_idx,
>                              MemOp memop, int size, bool probe, uintptr_t ra);
> -    /**
> -     * @tlb_fill: Handle a softmmu tlb miss
> -     *
> -     * If the access is valid, call tlb_set_page and return true;
> -     * if the access is invalid and probe is true, return false;
> -     * otherwise raise an exception and do not return.
> -     */
> -    bool (*tlb_fill)(CPUState *cpu, vaddr address, int size,
> -                     MMUAccessType access_type, int mmu_idx,
> -                     bool probe, uintptr_t retaddr);
>       /**
>        * @do_transaction_failed: Callback for handling failed memory transactions
>        * (ie bus faults or external aborts; not MMU faults)
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 7f63dc3fd8..ec597ed6f5 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -1222,23 +1222,12 @@ static bool tlb_fill_align(CPUState *cpu, vaddr addr, MMUAccessType type,
>                              int mmu_idx, MemOp memop, int size,
>                              bool probe, uintptr_t ra)
>   {
> -    const TCGCPUOps *ops = cpu->cc->tcg_ops;
>       CPUTLBEntryFull full;
>   
> -    if (ops->tlb_fill_align) {
> -        if (ops->tlb_fill_align(cpu, &full, addr, type, mmu_idx,
> -                                memop, size, probe, ra)) {
> -            tlb_set_page_full(cpu, mmu_idx, addr, &full);
> -            return true;
> -        }
> -    } else {
> -        /* Legacy behaviour is alignment before paging. */
> -        if (addr & ((1u << memop_alignment_bits(memop)) - 1)) {
> -            ops->do_unaligned_access(cpu, addr, type, mmu_idx, ra);
> -        }
> -        if (ops->tlb_fill(cpu, addr, size, type, mmu_idx, probe, ra)) {
> -            return true;
> -        }
> +    if (cpu->cc->tcg_ops->tlb_fill_align(cpu, &full, addr, type, mmu_idx,
> +                                         memop, size, probe, ra)) {
> +        tlb_set_page_full(cpu, mmu_idx, addr, &full);
> +        return true;
>       }
>       assert(probe);
>       return false;

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>




* Re: [PATCH v2 52/54] accel/tcg: Unexport tlb_set_page*
  2024-11-14 16:01 ` [PATCH v2 52/54] accel/tcg: Unexport tlb_set_page* Richard Henderson
@ 2024-11-14 18:56   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:56 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> The new tlb_fill_align hook returns page data via structure
> rather than by function call, so we can make tlb_set_page_full
> be local to cputlb.c.  There are no users of tlb_set_page
> or tlb_set_page_with_attrs, so those can be eliminated.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   include/exec/exec-all.h | 57 -----------------------------------------
>   accel/tcg/cputlb.c      | 27 ++-----------------
>   2 files changed, 2 insertions(+), 82 deletions(-)
> 
> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index 69bdb77584..b65fc547bd 100644
> --- a/include/exec/exec-all.h
> +++ b/include/exec/exec-all.h
> @@ -184,63 +184,6 @@ void tlb_flush_range_by_mmuidx_all_cpus_synced(CPUState *cpu,
>                                                  vaddr len,
>                                                  uint16_t idxmap,
>                                                  unsigned bits);
> -
> -/**
> - * tlb_set_page_full:
> - * @cpu: CPU context
> - * @mmu_idx: mmu index of the tlb to modify
> - * @addr: virtual address of the entry to add
> - * @full: the details of the tlb entry
> - *
> - * Add an entry to @cpu tlb index @mmu_idx.  All of the fields of
> - * @full must be filled, except for xlat_section, and constitute
> - * the complete description of the translated page.
> - *
> - * This is generally called by the target tlb_fill function after
> - * having performed a successful page table walk to find the physical
> - * address and attributes for the translation.
> - *
> - * At most one entry for a given virtual address is permitted. Only a
> - * single TARGET_PAGE_SIZE region is mapped; @full->lg_page_size is only
> - * used by tlb_flush_page.
> - */
> -void tlb_set_page_full(CPUState *cpu, int mmu_idx, vaddr addr,
> -                       CPUTLBEntryFull *full);
> -
> -/**
> - * tlb_set_page_with_attrs:
> - * @cpu: CPU to add this TLB entry for
> - * @addr: virtual address of page to add entry for
> - * @paddr: physical address of the page
> - * @attrs: memory transaction attributes
> - * @prot: access permissions (PAGE_READ/PAGE_WRITE/PAGE_EXEC bits)
> - * @mmu_idx: MMU index to insert TLB entry for
> - * @size: size of the page in bytes
> - *
> - * Add an entry to this CPU's TLB (a mapping from virtual address
> - * @addr to physical address @paddr) with the specified memory
> - * transaction attributes. This is generally called by the target CPU
> - * specific code after it has been called through the tlb_fill()
> - * entry point and performed a successful page table walk to find
> - * the physical address and attributes for the virtual address
> - * which provoked the TLB miss.
> - *
> - * At most one entry for a given virtual address is permitted. Only a
> - * single TARGET_PAGE_SIZE region is mapped; the supplied @size is only
> - * used by tlb_flush_page.
> - */
> -void tlb_set_page_with_attrs(CPUState *cpu, vaddr addr,
> -                             hwaddr paddr, MemTxAttrs attrs,
> -                             int prot, int mmu_idx, vaddr size);
> -/* tlb_set_page:
> - *
> - * This function is equivalent to calling tlb_set_page_with_attrs()
> - * with an @attrs argument of MEMTXATTRS_UNSPECIFIED. It's provided
> - * as a convenience for CPUs which don't use memory transaction attributes.
> - */
> -void tlb_set_page(CPUState *cpu, vaddr addr,
> -                  hwaddr paddr, int prot,
> -                  int mmu_idx, vaddr size);
>   #else
>   static inline void tlb_init(CPUState *cpu)
>   {
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index ec597ed6f5..3d731b8f3d 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -1037,8 +1037,8 @@ static inline void tlb_set_compare(CPUTLBEntryFull *full, CPUTLBEntry *ent,
>    * Called from TCG-generated code, which is under an RCU read-side
>    * critical section.
>    */
> -void tlb_set_page_full(CPUState *cpu, int mmu_idx,
> -                       vaddr addr, CPUTLBEntryFull *full)
> +static void tlb_set_page_full(CPUState *cpu, int mmu_idx,
> +                              vaddr addr, CPUTLBEntryFull *full)
>   {
>       CPUTLB *tlb = &cpu->neg.tlb;
>       CPUTLBDesc *desc = &tlb->d[mmu_idx];
> @@ -1189,29 +1189,6 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
>       qemu_spin_unlock(&tlb->c.lock);
>   }
>   
> -void tlb_set_page_with_attrs(CPUState *cpu, vaddr addr,
> -                             hwaddr paddr, MemTxAttrs attrs, int prot,
> -                             int mmu_idx, uint64_t size)
> -{
> -    CPUTLBEntryFull full = {
> -        .phys_addr = paddr,
> -        .attrs = attrs,
> -        .prot = prot,
> -        .lg_page_size = ctz64(size)
> -    };
> -
> -    assert(is_power_of_2(size));
> -    tlb_set_page_full(cpu, mmu_idx, addr, &full);
> -}
> -
> -void tlb_set_page(CPUState *cpu, vaddr addr,
> -                  hwaddr paddr, int prot,
> -                  int mmu_idx, uint64_t size)
> -{
> -    tlb_set_page_with_attrs(cpu, addr, paddr, MEMTXATTRS_UNSPECIFIED,
> -                            prot, mmu_idx, size);
> -}
> -
>   /*
>    * Note: tlb_fill_align() can trigger a resize of the TLB.
>    * This means that all of the caller's prior references to the TLB table

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>




* Re: [PATCH v2 53/54] accel/tcg: Merge tlb_fill_align into callers
  2024-11-14 16:01 ` [PATCH v2 53/54] accel/tcg: Merge tlb_fill_align into callers Richard Henderson
@ 2024-11-14 18:57   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:57 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> In tlb_lookup, we still call tlb_set_page_full.
> In atomic_mmu_lookup, we're expecting noreturn.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   accel/tcg/cputlb.c | 31 ++++++-------------------------
>   1 file changed, 6 insertions(+), 25 deletions(-)
> 
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 3d731b8f3d..20af48c6c5 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -1189,27 +1189,6 @@ static void tlb_set_page_full(CPUState *cpu, int mmu_idx,
>       qemu_spin_unlock(&tlb->c.lock);
>   }
>   
> -/*
> - * Note: tlb_fill_align() can trigger a resize of the TLB.
> - * This means that all of the caller's prior references to the TLB table
> - * (e.g. CPUTLBEntry pointers) must be discarded and looked up again
> - * (e.g. via tlb_entry()).
> - */
> -static bool tlb_fill_align(CPUState *cpu, vaddr addr, MMUAccessType type,
> -                           int mmu_idx, MemOp memop, int size,
> -                           bool probe, uintptr_t ra)
> -{
> -    CPUTLBEntryFull full;
> -
> -    if (cpu->cc->tcg_ops->tlb_fill_align(cpu, &full, addr, type, mmu_idx,
> -                                         memop, size, probe, ra)) {
> -        tlb_set_page_full(cpu, mmu_idx, addr, &full);
> -        return true;
> -    }
> -    assert(probe);
> -    return false;
> -}
> -
>   static inline void cpu_unaligned_access(CPUState *cpu, vaddr addr,
>                                           MMUAccessType access_type,
>                                           int mmu_idx, uintptr_t retaddr)
> @@ -1281,11 +1260,13 @@ static bool tlb_lookup(CPUState *cpu, TLBLookupOutput *o,
>       }
>   
>       /* Finally, query the target hook. */
> -    if (!tlb_fill_align(cpu, addr, access_type, i->mmu_idx,
> -                        memop, i->size, probe, i->ra)) {
> +    if (!cpu->cc->tcg_ops->tlb_fill_align(cpu, &o->full, addr, access_type,
> +                                          i->mmu_idx, memop, i->size,
> +                                          probe, i->ra)) {
>           tcg_debug_assert(probe);
>           return false;
>       }
> +    tlb_set_page_full(cpu, i->mmu_idx, addr, &o->full);
>       o->did_tlb_fill = true;
>   
>       if (access_type == MMU_INST_FETCH) {
> @@ -1794,8 +1775,8 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
>        * We have just verified that the page is writable.
>        */
>       if (unlikely(!(o.full.prot & PAGE_READ))) {
> -        tlb_fill_align(cpu, addr, MMU_DATA_LOAD, i.mmu_idx,
> -                       0, i.size, false, i.ra);
> +        cpu->cc->tcg_ops->tlb_fill_align(cpu, &o.full, addr, MMU_DATA_LOAD,
> +                                         i.mmu_idx, 0, i.size, false, i.ra);
>           /*
>            * Since we don't support reads and writes to different
>            * addresses, and we do have the proper page loaded for

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>




* Re: [PATCH v2 54/54] accel/tcg: Return CPUTLBEntryTree from tlb_set_page_full
  2024-11-14 16:01 ` [PATCH v2 54/54] accel/tcg: Return CPUTLBEntryTree from tlb_set_page_full Richard Henderson
@ 2024-11-14 18:59   ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 18:59 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:01, Richard Henderson wrote:
> Avoid a lookup to find the node that we have just inserted.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   accel/tcg/cputlb.c | 14 ++++++--------
>   1 file changed, 6 insertions(+), 8 deletions(-)
> 
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 20af48c6c5..6d316e8767 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -1037,8 +1037,8 @@ static inline void tlb_set_compare(CPUTLBEntryFull *full, CPUTLBEntry *ent,
>    * Called from TCG-generated code, which is under an RCU read-side
>    * critical section.
>    */
> -static void tlb_set_page_full(CPUState *cpu, int mmu_idx,
> -                              vaddr addr, CPUTLBEntryFull *full)
> +static CPUTLBEntryTree *tlb_set_page_full(CPUState *cpu, int mmu_idx,
> +                                          vaddr addr, CPUTLBEntryFull *full)
>   {
>       CPUTLB *tlb = &cpu->neg.tlb;
>       CPUTLBDesc *desc = &tlb->d[mmu_idx];
> @@ -1187,6 +1187,8 @@ static void tlb_set_page_full(CPUState *cpu, int mmu_idx,
>       copy_tlb_helper_locked(te, &node->copy);
>       desc->n_used_entries++;
>       qemu_spin_unlock(&tlb->c.lock);
> +
> +    return node;
>   }
>   
>   static inline void cpu_unaligned_access(CPUState *cpu, vaddr addr,
> @@ -1266,18 +1268,14 @@ static bool tlb_lookup(CPUState *cpu, TLBLookupOutput *o,
>           tcg_debug_assert(probe);
>           return false;
>       }
> -    tlb_set_page_full(cpu, i->mmu_idx, addr, &o->full);
> +    node = tlb_set_page_full(cpu, i->mmu_idx, addr, &o->full);
>       o->did_tlb_fill = true;
>   
>       if (access_type == MMU_INST_FETCH) {
> -        node = tlbtree_lookup_addr(desc, addr);
> -        tcg_debug_assert(node);
>           goto found_code;
>       }
>   
> -    entry = tlbfast_entry(fast, addr);
> -    cmp = tlb_read_idx(entry, access_type);
> -    node = entry->tree;
> +    cmp = tlb_read_idx(&node->copy, access_type);
>       /*
>        * With PAGE_WRITE_INV, we set TLB_INVALID_MASK immediately,
>        * to force the next access through tlb_fill_align.  We've just

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>



* Re: [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree
  2024-11-14 16:00 [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
                   ` (53 preceding siblings ...)
  2024-11-14 16:01 ` [PATCH v2 54/54] accel/tcg: Return CPUTLBEntryTree from tlb_set_page_full Richard Henderson
@ 2024-11-14 19:56 ` Pierrick Bouvier
  2024-11-14 20:58   ` Richard Henderson
  54 siblings, 1 reply; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 19:56 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 08:00, Richard Henderson wrote:
> v1: 20241009150855.804605-1-richard.henderson@linaro.org
> 
> The initial idea was: how much can we do with an intelligent data
> structure for the same cost as a linear search through an array?
> 
> 
> r~
> 
> 
> Richard Henderson (54):
>    util/interval-tree: Introduce interval_tree_free_nodes
>    accel/tcg: Split out tlbfast_flush_locked
>    accel/tcg: Split out tlbfast_{index,entry}
>    accel/tcg: Split out tlbfast_flush_range_locked
>    accel/tcg: Fix flags usage in mmu_lookup1, atomic_mmu_lookup
>    accel/tcg: Assert non-zero length in tlb_flush_range_by_mmuidx*
>    accel/tcg: Assert bits in range in tlb_flush_range_by_mmuidx*
>    accel/tcg: Flush entire tlb when a masked range wraps
>    accel/tcg: Add IntervalTreeRoot to CPUTLBDesc
>    accel/tcg: Populate IntervalTree in tlb_set_page_full
>    accel/tcg: Remove IntervalTree entry in tlb_flush_page_locked
>    accel/tcg: Remove IntervalTree entries in tlb_flush_range_locked
>    accel/tcg: Process IntervalTree entries in tlb_reset_dirty
>    accel/tcg: Process IntervalTree entries in tlb_set_dirty
>    accel/tcg: Use tlb_hit_page in victim_tlb_hit
>    accel/tcg: Pass full addr to victim_tlb_hit
>    accel/tcg: Replace victim_tlb_hit with tlbtree_hit
>    accel/tcg: Remove the victim tlb
>    accel/tcg: Remove tlb_n_used_entries_inc
>    include/exec/tlb-common: Move CPUTLBEntryFull from hw/core/cpu.h
>    accel/tcg: Delay plugin adjustment in probe_access_internal
>    accel/tcg: Call cpu_ld*_code_mmu from cpu_ld*_code
>    accel/tcg: Check original prot bits for read in atomic_mmu_lookup
>    accel/tcg: Preserve tlb flags in tlb_set_compare
>    accel/tcg: Return CPUTLBEntryFull not pointer in probe_access_full_mmu
>    accel/tcg: Return CPUTLBEntryFull not pointer in probe_access_full
>    accel/tcg: Return CPUTLBEntryFull not pointer in probe_access_internal
>    accel/tcg: Introduce tlb_lookup
>    accel/tcg: Partially unify MMULookupPageData and TLBLookupOutput
>    accel/tcg: Merge mmu_lookup1 into mmu_lookup
>    accel/tcg: Always use IntervalTree for code lookups
>    accel/tcg: Link CPUTLBEntry to CPUTLBEntryTree
>    accel/tcg: Remove CPUTLBDesc.fulltlb
>    target/alpha: Convert to TCGCPUOps.tlb_fill_align
>    target/avr: Convert to TCGCPUOps.tlb_fill_align
>    target/i386: Convert to TCGCPUOps.tlb_fill_align
>    target/loongarch: Convert to TCGCPUOps.tlb_fill_align
>    target/m68k: Convert to TCGCPUOps.tlb_fill_align
>    target/m68k: Do not call tlb_set_page in helper_ptest
>    target/microblaze: Convert to TCGCPUOps.tlb_fill_align
>    target/mips: Convert to TCGCPUOps.tlb_fill_align
>    target/openrisc: Convert to TCGCPUOps.tlb_fill_align
>    target/ppc: Convert to TCGCPUOps.tlb_fill_align
>    target/riscv: Convert to TCGCPUOps.tlb_fill_align
>    target/rx: Convert to TCGCPUOps.tlb_fill_align
>    target/s390x: Convert to TCGCPUOps.tlb_fill_align
>    target/sh4: Convert to TCGCPUOps.tlb_fill_align
>    target/sparc: Convert to TCGCPUOps.tlb_fill_align
>    target/tricore: Convert to TCGCPUOps.tlb_fill_align
>    target/xtensa: Convert to TCGCPUOps.tlb_fill_align
>    accel/tcg: Drop TCGCPUOps.tlb_fill
>    accel/tcg: Unexport tlb_set_page*
>    accel/tcg: Merge tlb_fill_align into callers
>    accel/tcg: Return CPUTLBEntryTree from tlb_set_page_full
> 
>   include/exec/cpu-all.h               |   3 +
>   include/exec/exec-all.h              |  65 +-
>   include/exec/tlb-common.h            |  68 +-
>   include/hw/core/cpu.h                |  75 +-
>   include/hw/core/tcg-cpu-ops.h        |  10 -
>   include/qemu/interval-tree.h         |  11 +
>   target/alpha/cpu.h                   |   6 +-
>   target/avr/cpu.h                     |   7 +-
>   target/i386/tcg/helper-tcg.h         |   6 +-
>   target/loongarch/internals.h         |   7 +-
>   target/m68k/cpu.h                    |   7 +-
>   target/microblaze/cpu.h              |   7 +-
>   target/mips/tcg/tcg-internal.h       |   6 +-
>   target/openrisc/cpu.h                |   8 +-
>   target/ppc/internal.h                |   7 +-
>   target/riscv/cpu.h                   |   8 +-
>   target/s390x/s390x-internal.h        |   7 +-
>   target/sh4/cpu.h                     |   8 +-
>   target/sparc/cpu.h                   |   8 +-
>   target/tricore/cpu.h                 |   7 +-
>   target/xtensa/cpu.h                  |   8 +-
>   accel/tcg/cputlb.c                   | 994 +++++++++++++--------------
>   target/alpha/cpu.c                   |   2 +-
>   target/alpha/helper.c                |  23 +-
>   target/arm/ptw.c                     |  10 +-
>   target/arm/tcg/helper-a64.c          |   4 +-
>   target/arm/tcg/mte_helper.c          |  15 +-
>   target/arm/tcg/sve_helper.c          |   6 +-
>   target/avr/cpu.c                     |   2 +-
>   target/avr/helper.c                  |  19 +-
>   target/i386/tcg/sysemu/excp_helper.c |  36 +-
>   target/i386/tcg/tcg-cpu.c            |   2 +-
>   target/loongarch/cpu.c               |   2 +-
>   target/loongarch/tcg/tlb_helper.c    |  17 +-
>   target/m68k/cpu.c                    |   2 +-
>   target/m68k/helper.c                 |  32 +-
>   target/microblaze/cpu.c              |   2 +-
>   target/microblaze/helper.c           |  33 +-
>   target/mips/cpu.c                    |   2 +-
>   target/mips/tcg/sysemu/tlb_helper.c  |  29 +-
>   target/openrisc/cpu.c                |   2 +-
>   target/openrisc/mmu.c                |  39 +-
>   target/ppc/cpu_init.c                |   2 +-
>   target/ppc/mmu_helper.c              |  21 +-
>   target/riscv/cpu_helper.c            |  22 +-
>   target/riscv/tcg/tcg-cpu.c           |   2 +-
>   target/rx/cpu.c                      |  19 +-
>   target/s390x/cpu.c                   |   4 +-
>   target/s390x/tcg/excp_helper.c       |  23 +-
>   target/sh4/cpu.c                     |   2 +-
>   target/sh4/helper.c                  |  24 +-
>   target/sparc/cpu.c                   |   2 +-
>   target/sparc/mmu_helper.c            |  44 +-
>   target/tricore/cpu.c                 |   2 +-
>   target/tricore/helper.c              |  19 +-
>   target/xtensa/cpu.c                  |   2 +-
>   target/xtensa/helper.c               |  28 +-
>   util/interval-tree.c                 |  20 +
>   util/selfmap.c                       |  13 +-
>   59 files changed, 938 insertions(+), 923 deletions(-)
> 

I tested this change by booting a Debian x86_64 image; it works as expected.

I noticed that this change does not come for free (64s before, 82s after,
a 1.3x slowdown). Is that acceptable?

Pierrick



* Re: [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree
  2024-11-14 19:56 ` [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree Pierrick Bouvier
@ 2024-11-14 20:58   ` Richard Henderson
  2024-11-14 21:05     ` Pierrick Bouvier
  0 siblings, 1 reply; 114+ messages in thread
From: Richard Henderson @ 2024-11-14 20:58 UTC (permalink / raw)
  To: Pierrick Bouvier, qemu-devel

On 11/14/24 11:56, Pierrick Bouvier wrote:
> I tested this change by booting a Debian x86_64 image; it works as expected.
> 
> I noticed that this change does not come for free (64s before, 82s after,
> a 1.3x slowdown). Is that acceptable?
Well, no.  But I didn't notice any change during boot tests.  I used hyperfine over 'make 
check-functional'.

I would only expect benefits to be seen in longer-lived VMs, since a boot test 
doesn't run applications long enough to see tlb entries accumulate.  I have not attempted 
to create a reproducible test for that so far.
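
For the record, a sketch of that kind of comparison (the build directory
names and run counts below are placeholders, not my exact setup):

  # time the functional tests on builds without and with the series
  hyperfine --warmup 1 --runs 5 \
      'make -C build-before check-functional' \
      'make -C build-after check-functional'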


r~



* Re: [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree
  2024-11-14 20:58   ` Richard Henderson
@ 2024-11-14 21:05     ` Pierrick Bouvier
  2024-11-15 11:43       ` Alex Bennée
  0 siblings, 1 reply; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-14 21:05 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 11/14/24 12:58, Richard Henderson wrote:
> On 11/14/24 11:56, Pierrick Bouvier wrote:
>> I tested this change by booting a Debian x86_64 image; it works as expected.
>>
>> I noticed that this change does not come for free (64s before, 82s after,
>> a 1.3x slowdown). Is that acceptable?
> Well, no.  But I didn't notice any change during boot tests.  I used hyperfine over 'make
> check-functional'.
> 
> I would only expect benefits to be seen in longer-lived VMs, since a boot test
> doesn't run applications long enough to see tlb entries accumulate.  I have not attempted
> to create a reproducible test for that so far.
> 
> 

I didn't use check-functional either.
I used a vanilla Debian bookworm install with a modified /etc/rc.local
calling poweroff, and ran 3 times with and without the change, with
turbo disabled on my CPU.
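
The rc.local change is nothing more exotic than appending a poweroff;
roughly (a sketch, the rest of the file is the stock bookworm content):

  #!/bin/sh -e
  # ... stock rc.local contents ...
  # power the guest off once boot completes, so the wall-clock time of
  # the whole qemu run measures the boot
  poweroff
  exit 0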

> r~



* Re: [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree
  2024-11-14 21:05     ` Pierrick Bouvier
@ 2024-11-15 11:43       ` Alex Bennée
  2024-11-15 17:44         ` Pierrick Bouvier
  0 siblings, 1 reply; 114+ messages in thread
From: Alex Bennée @ 2024-11-15 11:43 UTC (permalink / raw)
  To: Pierrick Bouvier; +Cc: Richard Henderson, qemu-devel

Pierrick Bouvier <pierrick.bouvier@linaro.org> writes:

> On 11/14/24 12:58, Richard Henderson wrote:
>> On 11/14/24 11:56, Pierrick Bouvier wrote:
>>> I tested this change by booting a Debian x86_64 image; it works as expected.
>>>
>>> I noticed that this change does not come for free (64s before, 82s after,
>>> a 1.3x slowdown). Is that acceptable?
>> Well, no.  But I didn't notice any change during boot tests.  I used hyperfine over 'make
>> check-functional'.
>> I would only expect benefits to be seen in longer-lived VMs,
>> since a boot test
>> doesn't run applications long enough to see tlb entries accumulate.  I have not attempted
>> to create a reproducible test for that so far.
>> 
>
> I didn't use check-functional either.
> I used a vanilla Debian bookworm install with a modified /etc/rc.local
> calling poweroff, and ran 3 times with and without the change, with
> turbo disabled on my CPU.

If you want to really stress the VM handling, you should use stress-ng to
exercise page faulting and recovery. Wrap it up in a systemd unit for a
reproducible test:

  cat /etc/systemd/system/benchmark-stress-ng.service 
  # A benchmark target
  #
  # This shuts down once the boot has completed

  [Unit]
  Description=Default
  Requires=basic.target
  After=basic.target
  AllowIsolate=yes

  [Service]
  Type=oneshot
  ExecStart=stress-ng --perf --iomix 4 --vm 2 --timeout 10s
  ExecStartPost=/sbin/poweroff

  [Install]
  WantedBy=multi-user.target

and then call with something like:

  -append "root=/dev/sda2 console=ttyAMA0 systemd.unit=benchmark-stress-ng.service"

>
>> r~

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



* Re: [PATCH for-10.0 v2 00/54] accel/tcg: Convert victim tlb to IntervalTree
  2024-11-15 11:43       ` Alex Bennée
@ 2024-11-15 17:44         ` Pierrick Bouvier
  0 siblings, 0 replies; 114+ messages in thread
From: Pierrick Bouvier @ 2024-11-15 17:44 UTC (permalink / raw)
  To: Alex Bennée; +Cc: Richard Henderson, qemu-devel

On 11/15/24 03:43, Alex Bennée wrote:
> Pierrick Bouvier <pierrick.bouvier@linaro.org> writes:
> 
>> On 11/14/24 12:58, Richard Henderson wrote:
>>> On 11/14/24 11:56, Pierrick Bouvier wrote:
>>>> I tested this change by booting a Debian x86_64 image; it works as expected.
>>>>
>>>> I noticed that this change does not come for free (64s before, 82s after,
>>>> a 1.3x slowdown). Is that acceptable?
>>> Well, no.  But I didn't notice any change during boot tests.  I used hyperfine over 'make
>>> check-functional'.
>>> I would only expect benefits to be seen in longer-lived VMs,
>>> since a boot test
>>> doesn't run applications long enough to see tlb entries accumulate.  I have not attempted
>>> to create a reproducible test for that so far.
>>>
>>
>> I didn't use check-functional either.
>> I used a vanilla Debian bookworm install with a modified /etc/rc.local
>> calling poweroff, and ran 3 times with and without the change, with
>> turbo disabled on my CPU.
> 
> If you want to really stress the VM handling you should use stress-ng to
> exercise page faulting and recovery. Wrap it up in a systemd unit for a
> reproducible test:
> 
>    cat /etc/systemd/system/benchmark-stress-ng.service
>    # A benchmark target
>    #
>    # This shutsdown once the boot has completed
> 
>    [Unit]
>    Description=Default
>    Requires=basic.target
>    After=basic.target
>    AllowIsolate=yes
> 
>    [Service]
>    Type=oneshot
>    ExecStart=stress-ng --perf --iomix 4 --vm 2 --timeout 10s
>    ExecStartPost=/sbin/poweroff
> 
>    [Install]
>    WantedBy=multi-user.target
> 
> and then call with something like:
> 
>    -append "root=/dev/sda2 console=ttyAMA0 systemd.unit=benchmark-stress-ng.service"
> 

Thanks for the advice.

>>
>>> r~
> 


