* [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree
@ 2024-10-09 15:08 Richard Henderson
2024-10-09 15:08 ` [PATCH 01/23] util/interval-tree: Introduce interval_tree_free_nodes Richard Henderson
` (23 more replies)
0 siblings, 24 replies; 60+ messages in thread
From: Richard Henderson @ 2024-10-09 15:08 UTC (permalink / raw)
To: qemu-devel
Based-on: 20241009000453.315652-1-richard.henderson@linaro.org
("[PATCH v3 00/20] accel/tcg: Introduce tlb_fill_align hook")
The initial idea was: how much can we do with an intelligent data
structure for the same cost as a linear search through an array?
This is an initial installment along those lines, and about as far
as I can go without first converting all targets to the new
tlb_fill_align hook. Indeed, the final two patches will not
compile with all targets enabled, but hint at the direction of
the next steps.
I do not expect large perf changes with this patch set. I will
be happy if performance comes out even.
r~
Richard Henderson (23):
util/interval-tree: Introduce interval_tree_free_nodes
accel/tcg: Split out tlbfast_flush_locked
accel/tcg: Split out tlbfast_{index,entry}
accel/tcg: Split out tlbfast_flush_range_locked
accel/tcg: Fix flags usage in mmu_lookup1, atomic_mmu_lookup
accel/tcg: Early exit for zero length in tlb_flush_range_by_mmuidx*
accel/tcg: Flush entire tlb when a masked range wraps
accel/tcg: Add IntervalTreeRoot to CPUTLBDesc
accel/tcg: Populate IntervalTree in tlb_set_page_full
accel/tcg: Remove IntervalTree entry in tlb_flush_page_locked
accel/tcg: Remove IntervalTree entries in tlb_flush_range_locked
accel/tcg: Process IntervalTree entries in tlb_reset_dirty
accel/tcg: Process IntervalTree entries in tlb_set_dirty
accel/tcg: Replace victim_tlb_hit with tlbtree_hit
accel/tcg: Remove the victim tlb
include/exec/tlb-common: Move CPUTLBEntryFull from hw/core/cpu.h
accel/tcg: Delay plugin adjustment in probe_access_internal
accel/tcg: Call cpu_ld*_code_mmu from cpu_ld*_code
accel/tcg: Always use IntervalTree for code lookups
accel/tcg: Link CPUTLBEntry to CPUTLBEntryTree
accel/tcg: Remove CPUTLBDesc.fulltlb
accel/tcg: Drop TCGCPUOps.tlb_fill -- NOTYET
accel/tcg: Unexport tlb_set_page*
include/exec/cpu-all.h | 3 +
include/exec/exec-all.h | 57 ----
include/exec/tlb-common.h | 68 +++-
include/hw/core/cpu.h | 75 +----
include/hw/core/tcg-cpu-ops.h | 10 -
include/qemu/interval-tree.h | 11 +
accel/tcg/cputlb.c | 599 +++++++++++++++++++---------------
util/interval-tree.c | 20 ++
util/selfmap.c | 13 +-
9 files changed, 431 insertions(+), 425 deletions(-)
--
2.43.0
* [PATCH 01/23] util/interval-tree: Introduce interval_tree_free_nodes
2024-10-09 15:08 [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
@ 2024-10-09 15:08 ` Richard Henderson
2024-10-09 22:51 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 02/23] accel/tcg: Split out tlbfast_flush_locked Richard Henderson
` (22 subsequent siblings)
23 siblings, 1 reply; 60+ messages in thread
From: Richard Henderson @ 2024-10-09 15:08 UTC (permalink / raw)
To: qemu-devel
Provide a general-purpose release-all-nodes operation that allows
the IntervalTreeNode to be embedded within a larger structure.
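As a minimal usage sketch (the MyMapping type below is hypothetical;
the selfmap.c conversion in this patch is the real in-tree user), a
caller that embeds the node at a non-zero offset can free the whole
tree in one call:
    /* Assumes "qemu/osdep.h" and "qemu/interval-tree.h". */
    typedef struct MyMapping {
        uint64_t payload;           /* arbitrary data comes first */
        IntervalTreeNode itree;     /* node embedded at a non-zero offset */
    } MyMapping;
    static void my_mapping_free_all(IntervalTreeRoot *root)
    {
        /* g_free()s every MyMapping under @root and empties the root. */
        interval_tree_free_nodes(root, offsetof(MyMapping, itree));
    }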
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
include/qemu/interval-tree.h | 11 +++++++++++
util/interval-tree.c | 20 ++++++++++++++++++++
util/selfmap.c | 13 +------------
3 files changed, 32 insertions(+), 12 deletions(-)
diff --git a/include/qemu/interval-tree.h b/include/qemu/interval-tree.h
index 25006debe8..d90ea6d17f 100644
--- a/include/qemu/interval-tree.h
+++ b/include/qemu/interval-tree.h
@@ -96,4 +96,15 @@ IntervalTreeNode *interval_tree_iter_first(IntervalTreeRoot *root,
IntervalTreeNode *interval_tree_iter_next(IntervalTreeNode *node,
uint64_t start, uint64_t last);
+/**
+ * interval_tree_free_nodes:
+ * @root: root of the tree
+ * @it_offset: offset from outermost type to IntervalTreeNode
+ *
+ * Free, via g_free, all nodes under @root. IntervalTreeNode may
+ * not be the true type of the nodes allocated; @it_offset gives
+ * the offset from the outermost type to the IntervalTreeNode member.
+ */
+void interval_tree_free_nodes(IntervalTreeRoot *root, size_t it_offset);
+
#endif /* QEMU_INTERVAL_TREE_H */
diff --git a/util/interval-tree.c b/util/interval-tree.c
index 53465182e6..663d3ec222 100644
--- a/util/interval-tree.c
+++ b/util/interval-tree.c
@@ -639,6 +639,16 @@ static void rb_erase_augmented_cached(RBNode *node, RBRootLeftCached *root,
rb_erase_augmented(node, &root->rb_root, augment);
}
+static void rb_node_free(RBNode *rb, size_t rb_offset)
+{
+ if (rb->rb_left) {
+ rb_node_free(rb->rb_left, rb_offset);
+ }
+ if (rb->rb_right) {
+ rb_node_free(rb->rb_right, rb_offset);
+ }
+ g_free((void *)rb - rb_offset);
+}
/*
* Interval trees.
@@ -870,6 +880,16 @@ IntervalTreeNode *interval_tree_iter_next(IntervalTreeNode *node,
}
}
+void interval_tree_free_nodes(IntervalTreeRoot *root, size_t it_offset)
+{
+ if (root && root->rb_root.rb_node) {
+ rb_node_free(root->rb_root.rb_node,
+ it_offset + offsetof(IntervalTreeNode, rb));
+ root->rb_root.rb_node = NULL;
+ root->rb_leftmost = NULL;
+ }
+}
+
/* Occasionally useful for calling from within the debugger. */
#if 0
static void debug_interval_tree_int(IntervalTreeNode *node,
diff --git a/util/selfmap.c b/util/selfmap.c
index 483cb617e2..d2b86da301 100644
--- a/util/selfmap.c
+++ b/util/selfmap.c
@@ -87,23 +87,12 @@ IntervalTreeRoot *read_self_maps(void)
* @root: an interval tree
*
* Free a tree of MapInfo structures.
- * Since we allocated each MapInfo in one chunk, we need not consider the
- * contents and can simply free each RBNode.
*/
-static void free_rbnode(RBNode *n)
-{
- if (n) {
- free_rbnode(n->rb_left);
- free_rbnode(n->rb_right);
- g_free(n);
- }
-}
-
void free_self_maps(IntervalTreeRoot *root)
{
if (root) {
- free_rbnode(root->rb_root.rb_node);
+ interval_tree_free_nodes(root, offsetof(MapInfo, itree));
g_free(root);
}
}
--
2.43.0
* [PATCH 02/23] accel/tcg: Split out tlbfast_flush_locked
2024-10-09 15:08 [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
2024-10-09 15:08 ` [PATCH 01/23] util/interval-tree: Introduce interval_tree_free_nodes Richard Henderson
@ 2024-10-09 15:08 ` Richard Henderson
2024-10-09 18:54 ` Philippe Mathieu-Daudé
2024-10-09 22:53 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 03/23] accel/tcg: Split out tlbfast_{index,entry} Richard Henderson
` (21 subsequent siblings)
23 siblings, 2 replies; 60+ messages in thread
From: Richard Henderson @ 2024-10-09 15:08 UTC (permalink / raw)
To: qemu-devel
We will need to flush only the "fast" portion of the tlb,
allowing it to be re-filled from the "full" portion.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/cputlb.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index b76a4eac4e..c1838412e8 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -284,13 +284,18 @@ static void tlb_mmu_resize_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast,
}
}
-static void tlb_mmu_flush_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast)
+static void tlbfast_flush_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast)
{
desc->n_used_entries = 0;
+ memset(fast->table, -1, sizeof_tlb(fast));
+}
+
+static void tlb_mmu_flush_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast)
+{
+ tlbfast_flush_locked(desc, fast);
desc->large_page_addr = -1;
desc->large_page_mask = -1;
desc->vindex = 0;
- memset(fast->table, -1, sizeof_tlb(fast));
memset(desc->vtable, -1, sizeof(desc->vtable));
}
--
2.43.0
* [PATCH 03/23] accel/tcg: Split out tlbfast_{index,entry}
2024-10-09 15:08 [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
2024-10-09 15:08 ` [PATCH 01/23] util/interval-tree: Introduce interval_tree_free_nodes Richard Henderson
2024-10-09 15:08 ` [PATCH 02/23] accel/tcg: Split out tlbfast_flush_locked Richard Henderson
@ 2024-10-09 15:08 ` Richard Henderson
2024-10-09 22:55 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 04/23] accel/tcg: Split out tlbfast_flush_range_locked Richard Henderson
` (20 subsequent siblings)
23 siblings, 1 reply; 60+ messages in thread
From: Richard Henderson @ 2024-10-09 15:08 UTC (permalink / raw)
To: qemu-devel
Often we already have the CPUTLBDescFast structure pointer to hand.
Using it directly allows future code simplification.
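As a sketch of the intended call pattern, a caller that already holds
the fast pointer can index the fast table directly instead of
re-deriving cpu->neg.tlb.f[mmu_idx]:
    CPUTLBDescFast *fast = &cpu->neg.tlb.f[mmu_idx];
    uintptr_t index = tlbfast_index(fast, addr);     /* slot number */
    CPUTLBEntry *entry = tlbfast_entry(fast, addr);  /* &fast->table[index] */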
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/cputlb.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index c1838412e8..e37af24525 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -131,20 +131,28 @@ static inline uint64_t tlb_addr_write(const CPUTLBEntry *entry)
return tlb_read_idx(entry, MMU_DATA_STORE);
}
+static inline uintptr_t tlbfast_index(CPUTLBDescFast *fast, vaddr addr)
+{
+ return (addr >> TARGET_PAGE_BITS) & (fast->mask >> CPU_TLB_ENTRY_BITS);
+}
+
+static inline CPUTLBEntry *tlbfast_entry(CPUTLBDescFast *fast, vaddr addr)
+{
+ return fast->table + tlbfast_index(fast, addr);
+}
+
/* Find the TLB index corresponding to the mmu_idx + address pair. */
static inline uintptr_t tlb_index(CPUState *cpu, uintptr_t mmu_idx,
vaddr addr)
{
- uintptr_t size_mask = cpu->neg.tlb.f[mmu_idx].mask >> CPU_TLB_ENTRY_BITS;
-
- return (addr >> TARGET_PAGE_BITS) & size_mask;
+ return tlbfast_index(&cpu->neg.tlb.f[mmu_idx], addr);
}
/* Find the TLB entry corresponding to the mmu_idx + address pair. */
static inline CPUTLBEntry *tlb_entry(CPUState *cpu, uintptr_t mmu_idx,
vaddr addr)
{
- return &cpu->neg.tlb.f[mmu_idx].table[tlb_index(cpu, mmu_idx, addr)];
+ return tlbfast_entry(&cpu->neg.tlb.f[mmu_idx], addr);
}
static void tlb_window_reset(CPUTLBDesc *desc, int64_t ns,
--
2.43.0
* [PATCH 04/23] accel/tcg: Split out tlbfast_flush_range_locked
2024-10-09 15:08 [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
` (2 preceding siblings ...)
2024-10-09 15:08 ` [PATCH 03/23] accel/tcg: Split out tlbfast_{index,entry} Richard Henderson
@ 2024-10-09 15:08 ` Richard Henderson
2024-10-09 23:05 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 05/23] accel/tcg: Fix flags usage in mmu_lookup1, atomic_mmu_lookup Richard Henderson
` (19 subsequent siblings)
23 siblings, 1 reply; 60+ messages in thread
From: Richard Henderson @ 2024-10-09 15:08 UTC (permalink / raw)
To: qemu-devel
While this may at present be overly complicated for use by
single-page flushes, do so with the expectation that it will
eventually allow simplification of large-page handling.
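For example, the single-page caller converted in the hunk below passes
a mask of -1, so every address bit is significant and at most one
fast-table slot can match:
    tlbfast_flush_range_locked(desc, &cpu->neg.tlb.f[midx],
                               page, TARGET_PAGE_SIZE, -1);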
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/cputlb.c | 61 +++++++++++++++++++++++++---------------------
1 file changed, 33 insertions(+), 28 deletions(-)
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index e37af24525..6773874f2d 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -520,10 +520,37 @@ static inline void tlb_flush_vtlb_page_locked(CPUState *cpu, int mmu_idx,
tlb_flush_vtlb_page_mask_locked(cpu, mmu_idx, page, -1);
}
+static void tlbfast_flush_range_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast,
+ vaddr addr, vaddr len, vaddr mask)
+{
+ /*
+ * If @mask is smaller than the tlb size, there may be multiple entries
+ * within the TLB; for now, just flush the entire TLB.
+ * Otherwise all addresses that match under @mask hit the same TLB entry.
+ *
+ * If @len is larger than the tlb size, then it will take longer to
+ * test all of the entries in the TLB than it will to flush it all.
+ */
+ if (mask < fast->mask || len > fast->mask) {
+ tlbfast_flush_locked(desc, fast);
+ return;
+ }
+
+ for (vaddr i = 0; i < len; i += TARGET_PAGE_SIZE) {
+ vaddr page = addr + i;
+ CPUTLBEntry *entry = tlbfast_entry(fast, page);
+
+ if (tlb_flush_entry_mask_locked(entry, page, mask)) {
+ desc->n_used_entries--;
+ }
+ }
+}
+
static void tlb_flush_page_locked(CPUState *cpu, int midx, vaddr page)
{
- vaddr lp_addr = cpu->neg.tlb.d[midx].large_page_addr;
- vaddr lp_mask = cpu->neg.tlb.d[midx].large_page_mask;
+ CPUTLBDesc *desc = &cpu->neg.tlb.d[midx];
+ vaddr lp_addr = desc->large_page_addr;
+ vaddr lp_mask = desc->large_page_mask;
/* Check if we need to flush due to large pages. */
if ((page & lp_mask) == lp_addr) {
@@ -532,9 +559,8 @@ static void tlb_flush_page_locked(CPUState *cpu, int midx, vaddr page)
midx, lp_addr, lp_mask);
tlb_flush_one_mmuidx_locked(cpu, midx, get_clock_realtime());
} else {
- if (tlb_flush_entry_locked(tlb_entry(cpu, midx, page), page)) {
- tlb_n_used_entries_dec(cpu, midx);
- }
+ tlbfast_flush_range_locked(desc, &cpu->neg.tlb.f[midx],
+ page, TARGET_PAGE_SIZE, -1);
tlb_flush_vtlb_page_locked(cpu, midx, page);
}
}
@@ -689,24 +715,6 @@ static void tlb_flush_range_locked(CPUState *cpu, int midx,
CPUTLBDescFast *f = &cpu->neg.tlb.f[midx];
vaddr mask = MAKE_64BIT_MASK(0, bits);
- /*
- * If @bits is smaller than the tlb size, there may be multiple entries
- * within the TLB; otherwise all addresses that match under @mask hit
- * the same TLB entry.
- * TODO: Perhaps allow bits to be a few bits less than the size.
- * For now, just flush the entire TLB.
- *
- * If @len is larger than the tlb size, then it will take longer to
- * test all of the entries in the TLB than it will to flush it all.
- */
- if (mask < f->mask || len > f->mask) {
- tlb_debug("forcing full flush midx %d ("
- "%016" VADDR_PRIx "/%016" VADDR_PRIx "+%016" VADDR_PRIx ")\n",
- midx, addr, mask, len);
- tlb_flush_one_mmuidx_locked(cpu, midx, get_clock_realtime());
- return;
- }
-
/*
* Check if we need to flush due to large pages.
* Because large_page_mask contains all 1's from the msb,
@@ -720,13 +728,10 @@ static void tlb_flush_range_locked(CPUState *cpu, int midx,
return;
}
+ tlbfast_flush_range_locked(d, f, addr, len, mask);
+
for (vaddr i = 0; i < len; i += TARGET_PAGE_SIZE) {
vaddr page = addr + i;
- CPUTLBEntry *entry = tlb_entry(cpu, midx, page);
-
- if (tlb_flush_entry_mask_locked(entry, page, mask)) {
- tlb_n_used_entries_dec(cpu, midx);
- }
tlb_flush_vtlb_page_mask_locked(cpu, midx, page, mask);
}
}
--
2.43.0
* [PATCH 05/23] accel/tcg: Fix flags usage in mmu_lookup1, atomic_mmu_lookup
2024-10-09 15:08 [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
` (3 preceding siblings ...)
2024-10-09 15:08 ` [PATCH 04/23] accel/tcg: Split out tlbfast_flush_range_locked Richard Henderson
@ 2024-10-09 15:08 ` Richard Henderson
2024-10-09 23:18 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 06/23] accel/tcg: Early exit for zero length in tlb_flush_range_by_mmuidx* Richard Henderson
` (18 subsequent siblings)
23 siblings, 1 reply; 60+ messages in thread
From: Richard Henderson @ 2024-10-09 15:08 UTC (permalink / raw)
To: qemu-devel
The INVALID bit should only be auto-cleared when we have
just called tlb_fill, not along the victim_tlb_hit path.
In atomic_mmu_lookup, rename tlb_addr to flags, as that
is what we're actually carrying around.
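Condensed from the atomic_mmu_lookup hunks below (a sketch only; it
omits the index/tlbe refresh after the fill), the intended pattern is:
    int flags = TLB_FLAGS_MASK;              /* assume every flag possible */
    if (!tlb_hit(tlb_addr_write(tlbe), addr) &&
        !victim_tlb_hit(cpu, mmu_idx, index, MMU_DATA_STORE,
                        addr & TARGET_PAGE_MASK)) {
        tlb_fill_align(cpu, addr, MMU_DATA_STORE, mmu_idx,
                       mop, size, false, retaddr);
        /* Just filled: valid even if PAGE_WRITE_INV set TLB_INVALID_MASK. */
        flags &= ~TLB_INVALID_MASK;
    }
    flags &= tlbe->addr_read | tlb_addr_write(tlbe);  /* real r/w flags */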
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/cputlb.c | 33 ++++++++++++++++++++++-----------
1 file changed, 22 insertions(+), 11 deletions(-)
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 6773874f2d..fd8da8586f 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1657,7 +1657,7 @@ static bool mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
uint64_t tlb_addr = tlb_read_idx(entry, access_type);
bool maybe_resized = false;
CPUTLBEntryFull *full;
- int flags;
+ int flags = TLB_FLAGS_MASK & ~TLB_FORCE_SLOW;
/* If the TLB entry is for a different page, reload and try again. */
if (!tlb_hit(tlb_addr, addr)) {
@@ -1668,8 +1668,14 @@ static bool mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
maybe_resized = true;
index = tlb_index(cpu, mmu_idx, addr);
entry = tlb_entry(cpu, mmu_idx, addr);
+ /*
+ * With PAGE_WRITE_INV, we set TLB_INVALID_MASK immediately,
+ * to force the next access through tlb_fill. We've just
+ * called tlb_fill, so we know that this entry *is* valid.
+ */
+ flags &= ~TLB_INVALID_MASK;
}
- tlb_addr = tlb_read_idx(entry, access_type) & ~TLB_INVALID_MASK;
+ tlb_addr = tlb_read_idx(entry, access_type);
}
full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
@@ -1819,10 +1825,10 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
MemOp mop = get_memop(oi);
uintptr_t index;
CPUTLBEntry *tlbe;
- vaddr tlb_addr;
void *hostaddr;
CPUTLBEntryFull *full;
bool did_tlb_fill = false;
+ int flags;
tcg_debug_assert(mmu_idx < NB_MMU_MODES);
@@ -1833,8 +1839,8 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
tlbe = tlb_entry(cpu, mmu_idx, addr);
/* Check TLB entry and enforce page permissions. */
- tlb_addr = tlb_addr_write(tlbe);
- if (!tlb_hit(tlb_addr, addr)) {
+ flags = TLB_FLAGS_MASK;
+ if (!tlb_hit(tlb_addr_write(tlbe), addr)) {
if (!victim_tlb_hit(cpu, mmu_idx, index, MMU_DATA_STORE,
addr & TARGET_PAGE_MASK)) {
tlb_fill_align(cpu, addr, MMU_DATA_STORE, mmu_idx,
@@ -1842,8 +1848,13 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
did_tlb_fill = true;
index = tlb_index(cpu, mmu_idx, addr);
tlbe = tlb_entry(cpu, mmu_idx, addr);
+ /*
+ * With PAGE_WRITE_INV, we set TLB_INVALID_MASK immediately,
+ * to force the next access through tlb_fill. We've just
+ * called tlb_fill, so we know that this entry *is* valid.
+ */
+ flags &= ~TLB_INVALID_MASK;
}
- tlb_addr = tlb_addr_write(tlbe) & ~TLB_INVALID_MASK;
}
/*
@@ -1879,11 +1890,11 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
goto stop_the_world;
}
- /* Collect tlb flags for read. */
- tlb_addr |= tlbe->addr_read;
+ /* Collect tlb flags for read and write. */
+ flags &= tlbe->addr_read | tlb_addr_write(tlbe);
/* Notice an IO access or a needs-MMU-lookup access */
- if (unlikely(tlb_addr & (TLB_MMIO | TLB_DISCARD_WRITE))) {
+ if (unlikely(flags & (TLB_MMIO | TLB_DISCARD_WRITE))) {
/* There's really nothing that can be done to
support this apart from stop-the-world. */
goto stop_the_world;
@@ -1892,11 +1903,11 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
hostaddr = (void *)((uintptr_t)addr + tlbe->addend);
full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
- if (unlikely(tlb_addr & TLB_NOTDIRTY)) {
+ if (unlikely(flags & TLB_NOTDIRTY)) {
notdirty_write(cpu, addr, size, full, retaddr);
}
- if (unlikely(tlb_addr & TLB_FORCE_SLOW)) {
+ if (unlikely(flags & TLB_FORCE_SLOW)) {
int wp_flags = 0;
if (full->slow_flags[MMU_DATA_STORE] & TLB_WATCHPOINT) {
--
2.43.0
* [PATCH 06/23] accel/tcg: Early exit for zero length in tlb_flush_range_by_mmuidx*
2024-10-09 15:08 [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
` (4 preceding siblings ...)
2024-10-09 15:08 ` [PATCH 05/23] accel/tcg: Fix flags usage in mmu_lookup1, atomic_mmu_lookup Richard Henderson
@ 2024-10-09 15:08 ` Richard Henderson
2024-10-09 18:53 ` Philippe Mathieu-Daudé
2024-10-09 23:20 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 07/23] accel/tcg: Flush entire tlb when a masked range wraps Richard Henderson
` (17 subsequent siblings)
23 siblings, 2 replies; 60+ messages in thread
From: Richard Henderson @ 2024-10-09 15:08 UTC (permalink / raw)
To: qemu-devel
This probably never happens, but subsequent patches will assume a
non-zero length.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/cputlb.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index fd8da8586f..93b42d18ee 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -801,6 +801,9 @@ void tlb_flush_range_by_mmuidx(CPUState *cpu, vaddr addr,
* If all bits are significant, and len is small,
* this devolves to tlb_flush_page.
*/
+ if (len == 0) {
+ return;
+ }
if (bits >= TARGET_LONG_BITS && len <= TARGET_PAGE_SIZE) {
tlb_flush_page_by_mmuidx(cpu, addr, idxmap);
return;
@@ -839,6 +842,9 @@ void tlb_flush_range_by_mmuidx_all_cpus_synced(CPUState *src_cpu,
* If all bits are significant, and len is small,
* this devolves to tlb_flush_page.
*/
+ if (len == 0) {
+ return;
+ }
if (bits >= TARGET_LONG_BITS && len <= TARGET_PAGE_SIZE) {
tlb_flush_page_by_mmuidx_all_cpus_synced(src_cpu, addr, idxmap);
return;
--
2.43.0
* [PATCH 07/23] accel/tcg: Flush entire tlb when a masked range wraps
2024-10-09 15:08 [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
` (5 preceding siblings ...)
2024-10-09 15:08 ` [PATCH 06/23] accel/tcg: Early exit for zero length in tlb_flush_range_by_mmuidx* Richard Henderson
@ 2024-10-09 15:08 ` Richard Henderson
2024-10-09 23:28 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 08/23] accel/tcg: Add IntervalTreeRoot to CPUTLBDesc Richard Henderson
` (16 subsequent siblings)
23 siblings, 1 reply; 60+ messages in thread
From: Richard Henderson @ 2024-10-09 15:08 UTC (permalink / raw)
To: qemu-devel
We expect masked address spaces to be quite large, e.g. 56 bits
for AArch64 top-byte-ignore mode. We do not expect addr+len to
wrap around, but it is possible with AArch64 guest flush range
instructions.
Convert this unlikely case to a full tlb flush. This can simplify
the subroutines actually performing the range flush.
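A worked example of the new wrap test, using the 56-bit AArch64
top-byte-ignore case mentioned above (values chosen for illustration):
    /*
     * bits = 56, so only addr[55:0] are significant:
     *   addr                    = 0x00fffffffffff000
     *   len                     = 0x2000
     *   addr + len - 1          = 0x0100000000000fff
     *   addr ^ (addr + len - 1) = 0x01ffffffffffffff
     *   (addr ^ (addr + len - 1)) >> 56 = 0x1  => non-zero: the range wraps
     * so the code falls back to tlb_flush_by_mmuidx() for the whole idxmap.
     */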
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/cputlb.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 93b42d18ee..8affa25db3 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -808,8 +808,12 @@ void tlb_flush_range_by_mmuidx(CPUState *cpu, vaddr addr,
tlb_flush_page_by_mmuidx(cpu, addr, idxmap);
return;
}
- /* If no page bits are significant, this devolves to tlb_flush. */
- if (bits < TARGET_PAGE_BITS) {
+ /*
+ * If no page bits are significant, this devolves to full flush.
+ * If addr+len wraps in len bits, fall back to full flush.
+ */
+ if (bits < TARGET_PAGE_BITS
+ || (bits < TARGET_LONG_BITS && (addr ^ (addr + len - 1)) >> bits)) {
tlb_flush_by_mmuidx(cpu, idxmap);
return;
}
@@ -849,8 +853,12 @@ void tlb_flush_range_by_mmuidx_all_cpus_synced(CPUState *src_cpu,
tlb_flush_page_by_mmuidx_all_cpus_synced(src_cpu, addr, idxmap);
return;
}
- /* If no page bits are significant, this devolves to tlb_flush. */
- if (bits < TARGET_PAGE_BITS) {
+ /*
+ * If no page bits are significant, this devolves to full flush.
+ * If addr+len wraps in len bits, fall back to full flush.
+ */
+ if (bits < TARGET_PAGE_BITS
+ || (bits < TARGET_LONG_BITS && (addr ^ (addr + len - 1)) >> bits)) {
tlb_flush_by_mmuidx_all_cpus_synced(src_cpu, idxmap);
return;
}
--
2.43.0
* [PATCH 08/23] accel/tcg: Add IntervalTreeRoot to CPUTLBDesc
2024-10-09 15:08 [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
` (6 preceding siblings ...)
2024-10-09 15:08 ` [PATCH 07/23] accel/tcg: Flush entire tlb when a masked range wraps Richard Henderson
@ 2024-10-09 15:08 ` Richard Henderson
2024-10-09 23:31 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 09/23] accel/tcg: Populate IntervalTree in tlb_set_page_full Richard Henderson
` (15 subsequent siblings)
23 siblings, 1 reply; 60+ messages in thread
From: Richard Henderson @ 2024-10-09 15:08 UTC (permalink / raw)
To: qemu-devel
Add the data structures for tracking softmmu pages via
a balanced interval tree. So far, only initialize and
destroy the data structure.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
include/hw/core/cpu.h | 3 +++
accel/tcg/cputlb.c | 11 +++++++++++
2 files changed, 14 insertions(+)
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index d21a24c82f..b567abe3e2 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -34,6 +34,7 @@
#include "qemu/rcu_queue.h"
#include "qemu/queue.h"
#include "qemu/thread.h"
+#include "qemu/interval-tree.h"
#include "qom/object.h"
typedef int (*WriteCoreDumpFunction)(const void *buf, size_t size,
@@ -287,6 +288,8 @@ typedef struct CPUTLBDesc {
CPUTLBEntry vtable[CPU_VTLB_SIZE];
CPUTLBEntryFull vfulltlb[CPU_VTLB_SIZE];
CPUTLBEntryFull *fulltlb;
+ /* All active tlb entries for this address space. */
+ IntervalTreeRoot iroot;
} CPUTLBDesc;
/*
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 8affa25db3..435c2dc132 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -89,6 +89,13 @@ QEMU_BUILD_BUG_ON(sizeof(vaddr) > sizeof(run_on_cpu_data));
QEMU_BUILD_BUG_ON(NB_MMU_MODES > 16);
#define ALL_MMUIDX_BITS ((1 << NB_MMU_MODES) - 1)
+/* Extra data required to manage CPUTLBEntryFull within an interval tree. */
+typedef struct CPUTLBEntryTree {
+ IntervalTreeNode itree;
+ CPUTLBEntry copy;
+ CPUTLBEntryFull full;
+} CPUTLBEntryTree;
+
static inline size_t tlb_n_entries(CPUTLBDescFast *fast)
{
return (fast->mask >> CPU_TLB_ENTRY_BITS) + 1;
@@ -305,6 +312,7 @@ static void tlb_mmu_flush_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast)
desc->large_page_mask = -1;
desc->vindex = 0;
memset(desc->vtable, -1, sizeof(desc->vtable));
+ interval_tree_free_nodes(&desc->iroot, offsetof(CPUTLBEntryTree, itree));
}
static void tlb_flush_one_mmuidx_locked(CPUState *cpu, int mmu_idx,
@@ -326,6 +334,7 @@ static void tlb_mmu_init(CPUTLBDesc *desc, CPUTLBDescFast *fast, int64_t now)
fast->mask = (n_entries - 1) << CPU_TLB_ENTRY_BITS;
fast->table = g_new(CPUTLBEntry, n_entries);
desc->fulltlb = g_new(CPUTLBEntryFull, n_entries);
+ memset(&desc->iroot, 0, sizeof(desc->iroot));
tlb_mmu_flush_locked(desc, fast);
}
@@ -365,6 +374,8 @@ void tlb_destroy(CPUState *cpu)
g_free(fast->table);
g_free(desc->fulltlb);
+ interval_tree_free_nodes(&cpu->neg.tlb.d[i].iroot,
+ offsetof(CPUTLBEntryTree, itree));
}
}
--
2.43.0
* [PATCH 09/23] accel/tcg: Populate IntervalTree in tlb_set_page_full
2024-10-09 15:08 [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
` (7 preceding siblings ...)
2024-10-09 15:08 ` [PATCH 08/23] accel/tcg: Add IntervalTreeRoot to CPUTLBDesc Richard Henderson
@ 2024-10-09 15:08 ` Richard Henderson
2024-10-09 23:50 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 10/23] accel/tcg: Remove IntervalTree entry in tlb_flush_page_locked Richard Henderson
` (14 subsequent siblings)
23 siblings, 1 reply; 60+ messages in thread
From: Richard Henderson @ 2024-10-09 15:08 UTC (permalink / raw)
To: qemu-devel
Add or replace an entry in the IntervalTree for each
page installed into softmmu. We do not yet use the
tree for anything else.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/cputlb.c | 34 ++++++++++++++++++++++++++++------
1 file changed, 28 insertions(+), 6 deletions(-)
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 435c2dc132..d964e1b2e8 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -305,6 +305,17 @@ static void tlbfast_flush_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast)
memset(fast->table, -1, sizeof_tlb(fast));
}
+static CPUTLBEntryTree *tlbtree_lookup_range(CPUTLBDesc *desc, vaddr s, vaddr l)
+{
+ IntervalTreeNode *i = interval_tree_iter_first(&desc->iroot, s, l);
+ return i ? container_of(i, CPUTLBEntryTree, itree) : NULL;
+}
+
+static CPUTLBEntryTree *tlbtree_lookup_addr(CPUTLBDesc *desc, vaddr addr)
+{
+ return tlbtree_lookup_range(desc, addr, addr);
+}
+
static void tlb_mmu_flush_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast)
{
tlbfast_flush_locked(desc, fast);
@@ -1086,7 +1097,8 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
MemoryRegionSection *section;
unsigned int index, read_flags, write_flags;
uintptr_t addend;
- CPUTLBEntry *te, tn;
+ CPUTLBEntry *te;
+ CPUTLBEntryTree *node;
hwaddr iotlb, xlat, sz, paddr_page;
vaddr addr_page;
int asidx, wp_flags, prot;
@@ -1194,6 +1206,15 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
tlb_n_used_entries_dec(cpu, mmu_idx);
}
+ /* Replace an old IntervalTree entry, or create a new one. */
+ node = tlbtree_lookup_addr(desc, addr_page);
+ if (!node) {
+ node = g_new(CPUTLBEntryTree, 1);
+ node->itree.start = addr_page;
+ node->itree.last = addr_page + TARGET_PAGE_SIZE - 1;
+ interval_tree_insert(&node->itree, &desc->iroot);
+ }
+
/* refill the tlb */
/*
* When memory region is ram, iotlb contains a TARGET_PAGE_BITS
@@ -1215,15 +1236,15 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
full->phys_addr = paddr_page;
/* Now calculate the new entry */
- tn.addend = addend - addr_page;
+ node->copy.addend = addend - addr_page;
- tlb_set_compare(full, &tn, addr_page, read_flags,
+ tlb_set_compare(full, &node->copy, addr_page, read_flags,
MMU_INST_FETCH, prot & PAGE_EXEC);
if (wp_flags & BP_MEM_READ) {
read_flags |= TLB_WATCHPOINT;
}
- tlb_set_compare(full, &tn, addr_page, read_flags,
+ tlb_set_compare(full, &node->copy, addr_page, read_flags,
MMU_DATA_LOAD, prot & PAGE_READ);
if (prot & PAGE_WRITE_INV) {
@@ -1232,10 +1253,11 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
if (wp_flags & BP_MEM_WRITE) {
write_flags |= TLB_WATCHPOINT;
}
- tlb_set_compare(full, &tn, addr_page, write_flags,
+ tlb_set_compare(full, &node->copy, addr_page, write_flags,
MMU_DATA_STORE, prot & PAGE_WRITE);
- copy_tlb_helper_locked(te, &tn);
+ node->full = *full;
+ copy_tlb_helper_locked(te, &node->copy);
tlb_n_used_entries_inc(cpu, mmu_idx);
qemu_spin_unlock(&tlb->c.lock);
}
--
2.43.0
* [PATCH 10/23] accel/tcg: Remove IntervalTree entry in tlb_flush_page_locked
2024-10-09 15:08 [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
` (8 preceding siblings ...)
2024-10-09 15:08 ` [PATCH 09/23] accel/tcg: Populate IntervalTree in tlb_set_page_full Richard Henderson
@ 2024-10-09 15:08 ` Richard Henderson
2024-10-09 23:53 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 11/23] accel/tcg: Remove IntervalTree entries in tlb_flush_range_locked Richard Henderson
` (13 subsequent siblings)
23 siblings, 1 reply; 60+ messages in thread
From: Richard Henderson @ 2024-10-09 15:08 UTC (permalink / raw)
To: qemu-devel
Flush a page from the IntervalTree cache.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/cputlb.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index d964e1b2e8..772656c7f8 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -573,6 +573,7 @@ static void tlb_flush_page_locked(CPUState *cpu, int midx, vaddr page)
CPUTLBDesc *desc = &cpu->neg.tlb.d[midx];
vaddr lp_addr = desc->large_page_addr;
vaddr lp_mask = desc->large_page_mask;
+ CPUTLBEntryTree *node;
/* Check if we need to flush due to large pages. */
if ((page & lp_mask) == lp_addr) {
@@ -580,10 +581,17 @@ static void tlb_flush_page_locked(CPUState *cpu, int midx, vaddr page)
VADDR_PRIx "/%016" VADDR_PRIx ")\n",
midx, lp_addr, lp_mask);
tlb_flush_one_mmuidx_locked(cpu, midx, get_clock_realtime());
- } else {
- tlbfast_flush_range_locked(desc, &cpu->neg.tlb.f[midx],
- page, TARGET_PAGE_SIZE, -1);
- tlb_flush_vtlb_page_locked(cpu, midx, page);
+ return;
+ }
+
+ tlbfast_flush_range_locked(desc, &cpu->neg.tlb.f[midx],
+ page, TARGET_PAGE_SIZE, -1);
+ tlb_flush_vtlb_page_locked(cpu, midx, page);
+
+ node = tlbtree_lookup_addr(desc, page);
+ if (node) {
+ interval_tree_remove(&node->itree, &desc->iroot);
+ g_free(node);
}
}
--
2.43.0
* [PATCH 11/23] accel/tcg: Remove IntervalTree entries in tlb_flush_range_locked
2024-10-09 15:08 [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
` (9 preceding siblings ...)
2024-10-09 15:08 ` [PATCH 10/23] accel/tcg: Remove IntervalTree entry in tlb_flush_page_locked Richard Henderson
@ 2024-10-09 15:08 ` Richard Henderson
2024-10-09 23:57 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 12/23] accel/tcg: Process IntervalTree entries in tlb_reset_dirty Richard Henderson
` (12 subsequent siblings)
23 siblings, 1 reply; 60+ messages in thread
From: Richard Henderson @ 2024-10-09 15:08 UTC (permalink / raw)
To: qemu-devel
Flush a masked range of pages from the IntervalTree cache.
When the mask is not used there is a redundant comparison,
but that is better than duplicating code at this point.
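A worked example, assuming a 4K page and the 56-bit mask case from the
earlier wrap-handling patch (values chosen for illustration):
    /*
     * addr       = 0x2a00000000010000   top byte 0x2a ignored, bits = 56
     * mask       = 0x00ffffffffffffff   MAKE_64BIT_MASK(0, 56)
     * len        = 0x1000
     * addr_mask  = 0x0000000000010000   addr & mask
     * last_mask  = 0x0000000000010fff   addr_mask + len - 1
     * last_imask = 0xff00000000010fff   last_mask | ~mask
     *
     * The tree query [addr_mask, last_imask] over-approximates: an entry
     * cached at 0x4100000000010000 (same low 56 bits, different tag) is
     * inside the interval and is removed, while one at 0x4100000000020000
     * is also inside the interval but fails the explicit page_mask check
     * and is kept.  With an all-ones mask that check is the redundant
     * comparison mentioned above.
     */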
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/cputlb.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 772656c7f8..709ad75616 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -311,6 +311,13 @@ static CPUTLBEntryTree *tlbtree_lookup_range(CPUTLBDesc *desc, vaddr s, vaddr l)
return i ? container_of(i, CPUTLBEntryTree, itree) : NULL;
}
+static CPUTLBEntryTree *tlbtree_lookup_range_next(CPUTLBEntryTree *prev,
+ vaddr s, vaddr l)
+{
+ IntervalTreeNode *i = interval_tree_iter_next(&prev->itree, s, l);
+ return i ? container_of(i, CPUTLBEntryTree, itree) : NULL;
+}
+
static CPUTLBEntryTree *tlbtree_lookup_addr(CPUTLBDesc *desc, vaddr addr)
{
return tlbtree_lookup_range(desc, addr, addr);
@@ -744,6 +751,8 @@ static void tlb_flush_range_locked(CPUState *cpu, int midx,
CPUTLBDesc *d = &cpu->neg.tlb.d[midx];
CPUTLBDescFast *f = &cpu->neg.tlb.f[midx];
vaddr mask = MAKE_64BIT_MASK(0, bits);
+ CPUTLBEntryTree *node;
+ vaddr addr_mask, last_mask, last_imask;
/*
* Check if we need to flush due to large pages.
@@ -764,6 +773,22 @@ static void tlb_flush_range_locked(CPUState *cpu, int midx,
vaddr page = addr + i;
tlb_flush_vtlb_page_mask_locked(cpu, midx, page, mask);
}
+
+ addr_mask = addr & mask;
+ last_mask = addr_mask + len - 1;
+ last_imask = last_mask | ~mask;
+ node = tlbtree_lookup_range(d, addr_mask, last_imask);
+ while (node) {
+ CPUTLBEntryTree *next =
+ tlbtree_lookup_range_next(node, addr_mask, last_imask);
+ vaddr page_mask = node->itree.start & mask;
+
+ if (page_mask >= addr_mask && page_mask < last_mask) {
+ interval_tree_remove(&node->itree, &d->iroot);
+ g_free(node);
+ }
+ node = next;
+ }
}
typedef struct {
--
2.43.0
* [PATCH 12/23] accel/tcg: Process IntervalTree entries in tlb_reset_dirty
2024-10-09 15:08 [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
` (10 preceding siblings ...)
2024-10-09 15:08 ` [PATCH 11/23] accel/tcg: Remove IntervalTree entries in tlb_flush_range_locked Richard Henderson
@ 2024-10-09 15:08 ` Richard Henderson
2024-10-10 0:03 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 13/23] accel/tcg: Process IntervalTree entries in tlb_set_dirty Richard Henderson
` (11 subsequent siblings)
23 siblings, 1 reply; 60+ messages in thread
From: Richard Henderson @ 2024-10-09 15:08 UTC (permalink / raw)
To: qemu-devel
Update the addr_write copy within each interval tree node.
Tidy the iteration within the other two loops as well.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/cputlb.c | 19 +++++++++++--------
1 file changed, 11 insertions(+), 8 deletions(-)
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 709ad75616..95f78afee6 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1024,17 +1024,20 @@ void tlb_reset_dirty(CPUState *cpu, ram_addr_t start1, ram_addr_t length)
qemu_spin_lock(&cpu->neg.tlb.c.lock);
for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
- unsigned int i;
- unsigned int n = tlb_n_entries(&cpu->neg.tlb.f[mmu_idx]);
+ CPUTLBDesc *desc = &cpu->neg.tlb.d[mmu_idx];
+ CPUTLBDescFast *fast = &cpu->neg.tlb.f[mmu_idx];
- for (i = 0; i < n; i++) {
- tlb_reset_dirty_range_locked(&cpu->neg.tlb.f[mmu_idx].table[i],
- start1, length);
+ for (size_t i = 0, n = tlb_n_entries(fast); i < n; i++) {
+ tlb_reset_dirty_range_locked(&fast->table[i], start1, length);
}
- for (i = 0; i < CPU_VTLB_SIZE; i++) {
- tlb_reset_dirty_range_locked(&cpu->neg.tlb.d[mmu_idx].vtable[i],
- start1, length);
+ for (size_t i = 0; i < CPU_VTLB_SIZE; i++) {
+ tlb_reset_dirty_range_locked(&desc->vtable[i], start1, length);
+ }
+
+ for (CPUTLBEntryTree *t = tlbtree_lookup_range(desc, 0, -1); t;
+ t = tlbtree_lookup_range_next(t, 0, -1)) {
+ tlb_reset_dirty_range_locked(&t->copy, start1, length);
}
}
qemu_spin_unlock(&cpu->neg.tlb.c.lock);
--
2.43.0
* [PATCH 13/23] accel/tcg: Process IntervalTree entries in tlb_set_dirty
2024-10-09 15:08 [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
` (11 preceding siblings ...)
2024-10-09 15:08 ` [PATCH 12/23] accel/tcg: Process IntervalTree entries in tlb_reset_dirty Richard Henderson
@ 2024-10-09 15:08 ` Richard Henderson
2024-10-10 0:04 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 14/23] accel/tcg: Replace victim_tlb_hit with tlbtree_hit Richard Henderson
` (10 subsequent siblings)
23 siblings, 1 reply; 60+ messages in thread
From: Richard Henderson @ 2024-10-09 15:08 UTC (permalink / raw)
To: qemu-devel
Update the addr_write copy within an interval tree node.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/cputlb.c | 17 +++++++++++------
1 file changed, 11 insertions(+), 6 deletions(-)
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 95f78afee6..ec989f1290 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1063,13 +1063,18 @@ static void tlb_set_dirty(CPUState *cpu, vaddr addr)
addr &= TARGET_PAGE_MASK;
qemu_spin_lock(&cpu->neg.tlb.c.lock);
for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
- tlb_set_dirty1_locked(tlb_entry(cpu, mmu_idx, addr), addr);
- }
+ CPUTLBDesc *desc = &cpu->neg.tlb.d[mmu_idx];
+ CPUTLBEntryTree *node;
- for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
- int k;
- for (k = 0; k < CPU_VTLB_SIZE; k++) {
- tlb_set_dirty1_locked(&cpu->neg.tlb.d[mmu_idx].vtable[k], addr);
+ tlb_set_dirty1_locked(tlb_entry(cpu, mmu_idx, addr), addr);
+
+ for (int k = 0; k < CPU_VTLB_SIZE; k++) {
+ tlb_set_dirty1_locked(&desc->vtable[k], addr);
+ }
+
+ node = tlbtree_lookup_addr(desc, addr);
+ if (node) {
+ tlb_set_dirty1_locked(&node->copy, addr);
}
}
qemu_spin_unlock(&cpu->neg.tlb.c.lock);
--
2.43.0
* [PATCH 14/23] accel/tcg: Replace victim_tlb_hit with tlbtree_hit
2024-10-09 15:08 [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
` (12 preceding siblings ...)
2024-10-09 15:08 ` [PATCH 13/23] accel/tcg: Process IntervalTree entries in tlb_set_dirty Richard Henderson
@ 2024-10-09 15:08 ` Richard Henderson
2024-10-10 0:10 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 15/23] accel/tcg: Remove the victim tlb Richard Henderson
` (9 subsequent siblings)
23 siblings, 1 reply; 60+ messages in thread
From: Richard Henderson @ 2024-10-09 15:08 UTC (permalink / raw)
To: qemu-devel
Change from a linear search on the victim tlb
to a balanced binary tree search on the interval tree.
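Condensed from the hunk below (a sketch; the hunk is authoritative),
the new hit path is one tree lookup plus a refill of the direct-mapped
slot:
    CPUTLBEntryTree *node = tlbtree_lookup_addr(desc, addr);  /* O(log n) */
    if (node && tlb_hit(tlb_read_idx(&node->copy, access_type), addr)) {
        size_t index = tlbfast_index(fast, addr);
        qemu_spin_lock(&cpu->neg.tlb.c.lock);
        copy_tlb_helper_locked(&fast->table[index], &node->copy);
        qemu_spin_unlock(&cpu->neg.tlb.c.lock);
        desc->fulltlb[index] = node->full;   /* install the cached entry */
    }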
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/cputlb.c | 62 +++++++++++++++++++++++-----------------------
1 file changed, 31 insertions(+), 31 deletions(-)
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index ec989f1290..b10b0a357c 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1398,36 +1398,38 @@ static void io_failed(CPUState *cpu, CPUTLBEntryFull *full, vaddr addr,
}
}
-/* Return true if ADDR is present in the victim tlb, and has been copied
- back to the main tlb. */
-static bool victim_tlb_hit(CPUState *cpu, size_t mmu_idx, size_t index,
- MMUAccessType access_type, vaddr page)
+/*
+ * Return true if ADDR is present in the interval tree,
+ * and has been copied back to the main tlb.
+ */
+static bool tlbtree_hit(CPUState *cpu, int mmu_idx,
+ MMUAccessType access_type, vaddr addr)
{
- size_t vidx;
+ CPUTLBDesc *desc = &cpu->neg.tlb.d[mmu_idx];
+ CPUTLBDescFast *fast = &cpu->neg.tlb.f[mmu_idx];
+ CPUTLBEntryTree *node;
+ size_t index;
assert_cpu_is_self(cpu);
- for (vidx = 0; vidx < CPU_VTLB_SIZE; ++vidx) {
- CPUTLBEntry *vtlb = &cpu->neg.tlb.d[mmu_idx].vtable[vidx];
- uint64_t cmp = tlb_read_idx(vtlb, access_type);
-
- if (cmp == page) {
- /* Found entry in victim tlb, swap tlb and iotlb. */
- CPUTLBEntry tmptlb, *tlb = &cpu->neg.tlb.f[mmu_idx].table[index];
-
- qemu_spin_lock(&cpu->neg.tlb.c.lock);
- copy_tlb_helper_locked(&tmptlb, tlb);
- copy_tlb_helper_locked(tlb, vtlb);
- copy_tlb_helper_locked(vtlb, &tmptlb);
- qemu_spin_unlock(&cpu->neg.tlb.c.lock);
-
- CPUTLBEntryFull *f1 = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
- CPUTLBEntryFull *f2 = &cpu->neg.tlb.d[mmu_idx].vfulltlb[vidx];
- CPUTLBEntryFull tmpf;
- tmpf = *f1; *f1 = *f2; *f2 = tmpf;
- return true;
- }
+ node = tlbtree_lookup_addr(desc, addr);
+ if (!node) {
+ /* There is no cached mapping for this page. */
+ return false;
}
- return false;
+
+ if (!tlb_hit(tlb_read_idx(&node->copy, access_type), addr)) {
+ /* This access is not permitted. */
+ return false;
+ }
+
+ /* Install the cached entry. */
+ index = tlbfast_index(fast, addr);
+ qemu_spin_lock(&cpu->neg.tlb.c.lock);
+ copy_tlb_helper_locked(&fast->table[index], &node->copy);
+ qemu_spin_unlock(&cpu->neg.tlb.c.lock);
+
+ desc->fulltlb[index] = node->full;
+ return true;
}
static void notdirty_write(CPUState *cpu, vaddr mem_vaddr, unsigned size,
@@ -1469,7 +1471,7 @@ static int probe_access_internal(CPUState *cpu, vaddr addr,
CPUTLBEntryFull *full;
if (!tlb_hit_page(tlb_addr, page_addr)) {
- if (!victim_tlb_hit(cpu, mmu_idx, index, access_type, page_addr)) {
+ if (!tlbtree_hit(cpu, mmu_idx, access_type, page_addr)) {
if (!tlb_fill_align(cpu, addr, access_type, mmu_idx,
0, fault_size, nonfault, retaddr)) {
/* Non-faulting page table read failed. */
@@ -1749,8 +1751,7 @@ static bool mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
/* If the TLB entry is for a different page, reload and try again. */
if (!tlb_hit(tlb_addr, addr)) {
- if (!victim_tlb_hit(cpu, mmu_idx, index, access_type,
- addr & TARGET_PAGE_MASK)) {
+ if (!tlbtree_hit(cpu, mmu_idx, access_type, addr)) {
tlb_fill_align(cpu, addr, access_type, mmu_idx,
memop, data->size, false, ra);
maybe_resized = true;
@@ -1929,8 +1930,7 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
/* Check TLB entry and enforce page permissions. */
flags = TLB_FLAGS_MASK;
if (!tlb_hit(tlb_addr_write(tlbe), addr)) {
- if (!victim_tlb_hit(cpu, mmu_idx, index, MMU_DATA_STORE,
- addr & TARGET_PAGE_MASK)) {
+ if (!tlbtree_hit(cpu, mmu_idx, MMU_DATA_STORE, addr)) {
tlb_fill_align(cpu, addr, MMU_DATA_STORE, mmu_idx,
mop, size, false, retaddr);
did_tlb_fill = true;
--
2.43.0
* [PATCH 15/23] accel/tcg: Remove the victim tlb
2024-10-09 15:08 [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
` (13 preceding siblings ...)
2024-10-09 15:08 ` [PATCH 14/23] accel/tcg: Replace victim_tlb_hit with tlbtree_hit Richard Henderson
@ 2024-10-09 15:08 ` Richard Henderson
2024-10-10 0:12 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 16/23] include/exec/tlb-common: Move CPUTLBEntryFull from hw/core/cpu.h Richard Henderson
` (8 subsequent siblings)
23 siblings, 1 reply; 60+ messages in thread
From: Richard Henderson @ 2024-10-09 15:08 UTC (permalink / raw)
To: qemu-devel
This has been functionally replaced by the IntervalTree.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
include/hw/core/cpu.h | 8 ------
accel/tcg/cputlb.c | 64 -------------------------------------------
2 files changed, 72 deletions(-)
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index b567abe3e2..87b864f5c4 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -198,9 +198,6 @@ struct CPUClass {
*/
#define NB_MMU_MODES 16
-/* Use a fully associative victim tlb of 8 entries. */
-#define CPU_VTLB_SIZE 8
-
/*
* The full TLB entry, which is not accessed by generated TCG code,
* so the layout is not as critical as that of CPUTLBEntry. This is
@@ -282,11 +279,6 @@ typedef struct CPUTLBDesc {
/* maximum number of entries observed in the window */
size_t window_max_entries;
size_t n_used_entries;
- /* The next index to use in the tlb victim table. */
- size_t vindex;
- /* The tlb victim table, in two parts. */
- CPUTLBEntry vtable[CPU_VTLB_SIZE];
- CPUTLBEntryFull vfulltlb[CPU_VTLB_SIZE];
CPUTLBEntryFull *fulltlb;
/* All active tlb entries for this address space. */
IntervalTreeRoot iroot;
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index b10b0a357c..561f66c723 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -328,8 +328,6 @@ static void tlb_mmu_flush_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast)
tlbfast_flush_locked(desc, fast);
desc->large_page_addr = -1;
desc->large_page_mask = -1;
- desc->vindex = 0;
- memset(desc->vtable, -1, sizeof(desc->vtable));
interval_tree_free_nodes(&desc->iroot, offsetof(CPUTLBEntryTree, itree));
}
@@ -501,15 +499,6 @@ static inline bool tlb_hit_page_anyprot(CPUTLBEntry *tlb_entry, vaddr page)
return tlb_hit_page_mask_anyprot(tlb_entry, page, -1);
}
-/**
- * tlb_entry_is_empty - return true if the entry is not in use
- * @te: pointer to CPUTLBEntry
- */
-static inline bool tlb_entry_is_empty(const CPUTLBEntry *te)
-{
- return te->addr_read == -1 && te->addr_write == -1 && te->addr_code == -1;
-}
-
/* Called with tlb_c.lock held */
static bool tlb_flush_entry_mask_locked(CPUTLBEntry *tlb_entry,
vaddr page,
@@ -527,28 +516,6 @@ static inline bool tlb_flush_entry_locked(CPUTLBEntry *tlb_entry, vaddr page)
return tlb_flush_entry_mask_locked(tlb_entry, page, -1);
}
-/* Called with tlb_c.lock held */
-static void tlb_flush_vtlb_page_mask_locked(CPUState *cpu, int mmu_idx,
- vaddr page,
- vaddr mask)
-{
- CPUTLBDesc *d = &cpu->neg.tlb.d[mmu_idx];
- int k;
-
- assert_cpu_is_self(cpu);
- for (k = 0; k < CPU_VTLB_SIZE; k++) {
- if (tlb_flush_entry_mask_locked(&d->vtable[k], page, mask)) {
- tlb_n_used_entries_dec(cpu, mmu_idx);
- }
- }
-}
-
-static inline void tlb_flush_vtlb_page_locked(CPUState *cpu, int mmu_idx,
- vaddr page)
-{
- tlb_flush_vtlb_page_mask_locked(cpu, mmu_idx, page, -1);
-}
-
static void tlbfast_flush_range_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast,
vaddr addr, vaddr len, vaddr mask)
{
@@ -593,7 +560,6 @@ static void tlb_flush_page_locked(CPUState *cpu, int midx, vaddr page)
tlbfast_flush_range_locked(desc, &cpu->neg.tlb.f[midx],
page, TARGET_PAGE_SIZE, -1);
- tlb_flush_vtlb_page_locked(cpu, midx, page);
node = tlbtree_lookup_addr(desc, page);
if (node) {
@@ -769,11 +735,6 @@ static void tlb_flush_range_locked(CPUState *cpu, int midx,
tlbfast_flush_range_locked(d, f, addr, len, mask);
- for (vaddr i = 0; i < len; i += TARGET_PAGE_SIZE) {
- vaddr page = addr + i;
- tlb_flush_vtlb_page_mask_locked(cpu, midx, page, mask);
- }
-
addr_mask = addr & mask;
last_mask = addr_mask + len - 1;
last_imask = last_mask | ~mask;
@@ -1031,10 +992,6 @@ void tlb_reset_dirty(CPUState *cpu, ram_addr_t start1, ram_addr_t length)
tlb_reset_dirty_range_locked(&fast->table[i], start1, length);
}
- for (size_t i = 0; i < CPU_VTLB_SIZE; i++) {
- tlb_reset_dirty_range_locked(&desc->vtable[i], start1, length);
- }
-
for (CPUTLBEntryTree *t = tlbtree_lookup_range(desc, 0, -1); t;
t = tlbtree_lookup_range_next(t, 0, -1)) {
tlb_reset_dirty_range_locked(&t->copy, start1, length);
@@ -1068,10 +1025,6 @@ static void tlb_set_dirty(CPUState *cpu, vaddr addr)
tlb_set_dirty1_locked(tlb_entry(cpu, mmu_idx, addr), addr);
- for (int k = 0; k < CPU_VTLB_SIZE; k++) {
- tlb_set_dirty1_locked(&desc->vtable[k], addr);
- }
-
node = tlbtree_lookup_addr(desc, addr);
if (node) {
tlb_set_dirty1_locked(&node->copy, addr);
@@ -1230,23 +1183,6 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
/* Note that the tlb is no longer clean. */
tlb->c.dirty |= 1 << mmu_idx;
- /* Make sure there's no cached translation for the new page. */
- tlb_flush_vtlb_page_locked(cpu, mmu_idx, addr_page);
-
- /*
- * Only evict the old entry to the victim tlb if it's for a
- * different page; otherwise just overwrite the stale data.
- */
- if (!tlb_hit_page_anyprot(te, addr_page) && !tlb_entry_is_empty(te)) {
- unsigned vidx = desc->vindex++ % CPU_VTLB_SIZE;
- CPUTLBEntry *tv = &desc->vtable[vidx];
-
- /* Evict the old entry into the victim tlb. */
- copy_tlb_helper_locked(tv, te);
- desc->vfulltlb[vidx] = desc->fulltlb[index];
- tlb_n_used_entries_dec(cpu, mmu_idx);
- }
-
/* Replace an old IntervalTree entry, or create a new one. */
node = tlbtree_lookup_addr(desc, addr_page);
if (!node) {
--
2.43.0
* [PATCH 16/23] include/exec/tlb-common: Move CPUTLBEntryFull from hw/core/cpu.h
2024-10-09 15:08 [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
` (14 preceding siblings ...)
2024-10-09 15:08 ` [PATCH 15/23] accel/tcg: Remove the victim tlb Richard Henderson
@ 2024-10-09 15:08 ` Richard Henderson
2024-10-10 0:17 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 17/23] accel/tcg: Delay plugin adjustment in probe_access_internal Richard Henderson
` (7 subsequent siblings)
23 siblings, 1 reply; 60+ messages in thread
From: Richard Henderson @ 2024-10-09 15:08 UTC (permalink / raw)
To: qemu-devel
CPUTLBEntryFull structures are no longer directly included within
the CPUState structure. Move the structure definition out of cpu.h
to reduce visibility.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
include/exec/tlb-common.h | 63 +++++++++++++++++++++++++++++++++++++++
include/hw/core/cpu.h | 63 ---------------------------------------
2 files changed, 63 insertions(+), 63 deletions(-)
diff --git a/include/exec/tlb-common.h b/include/exec/tlb-common.h
index dc5a5faa0b..300f9fae67 100644
--- a/include/exec/tlb-common.h
+++ b/include/exec/tlb-common.h
@@ -53,4 +53,67 @@ typedef struct CPUTLBDescFast {
CPUTLBEntry *table;
} CPUTLBDescFast QEMU_ALIGNED(2 * sizeof(void *));
+/*
+ * The full TLB entry, which is not accessed by generated TCG code,
+ * so the layout is not as critical as that of CPUTLBEntry. This is
+ * also why we don't want to combine the two structs.
+ */
+struct CPUTLBEntryFull {
+ /*
+ * @xlat_section contains:
+ * - in the lower TARGET_PAGE_BITS, a physical section number
+ * - with the lower TARGET_PAGE_BITS masked off, an offset which
+ * must be added to the virtual address to obtain:
+ * + the ram_addr_t of the target RAM (if the physical section
+ * number is PHYS_SECTION_NOTDIRTY or PHYS_SECTION_ROM)
+ * + the offset within the target MemoryRegion (otherwise)
+ */
+ hwaddr xlat_section;
+
+ /*
+ * @phys_addr contains the physical address in the address space
+ * given by cpu_asidx_from_attrs(cpu, @attrs).
+ */
+ hwaddr phys_addr;
+
+ /* @attrs contains the memory transaction attributes for the page. */
+ MemTxAttrs attrs;
+
+ /* @prot contains the complete protections for the page. */
+ uint8_t prot;
+
+ /* @lg_page_size contains the log2 of the page size. */
+ uint8_t lg_page_size;
+
+ /* Additional tlb flags requested by tlb_fill. */
+ uint8_t tlb_fill_flags;
+
+ /*
+ * Additional tlb flags for use by the slow path. If non-zero,
+ * the corresponding CPUTLBEntry comparator must have TLB_FORCE_SLOW.
+ */
+ uint8_t slow_flags[MMU_ACCESS_COUNT];
+
+ /*
+ * Allow target-specific additions to this structure.
+ * This may be used to cache items from the guest cpu
+ * page tables for later use by the implementation.
+ */
+ union {
+ /*
+ * Cache the attrs and shareability fields from the page table entry.
+ *
+ * For ARMMMUIdx_Stage2*, pte_attrs is the S2 descriptor bits [5:2].
+ * Otherwise, pte_attrs is the same as the MAIR_EL1 8-bit format.
+ * For shareability and guarded, as in the SH and GP fields respectively
+ * of the VMSAv8-64 PTEs.
+ */
+ struct {
+ uint8_t pte_attrs;
+ uint8_t shareability;
+ bool guarded;
+ } arm;
+ } extra;
+};
+
#endif /* EXEC_TLB_COMMON_H */
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 87b864f5c4..6b1c2bfadd 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -198,69 +198,6 @@ struct CPUClass {
*/
#define NB_MMU_MODES 16
-/*
- * The full TLB entry, which is not accessed by generated TCG code,
- * so the layout is not as critical as that of CPUTLBEntry. This is
- * also why we don't want to combine the two structs.
- */
-struct CPUTLBEntryFull {
- /*
- * @xlat_section contains:
- * - in the lower TARGET_PAGE_BITS, a physical section number
- * - with the lower TARGET_PAGE_BITS masked off, an offset which
- * must be added to the virtual address to obtain:
- * + the ram_addr_t of the target RAM (if the physical section
- * number is PHYS_SECTION_NOTDIRTY or PHYS_SECTION_ROM)
- * + the offset within the target MemoryRegion (otherwise)
- */
- hwaddr xlat_section;
-
- /*
- * @phys_addr contains the physical address in the address space
- * given by cpu_asidx_from_attrs(cpu, @attrs).
- */
- hwaddr phys_addr;
-
- /* @attrs contains the memory transaction attributes for the page. */
- MemTxAttrs attrs;
-
- /* @prot contains the complete protections for the page. */
- uint8_t prot;
-
- /* @lg_page_size contains the log2 of the page size. */
- uint8_t lg_page_size;
-
- /* Additional tlb flags requested by tlb_fill. */
- uint8_t tlb_fill_flags;
-
- /*
- * Additional tlb flags for use by the slow path. If non-zero,
- * the corresponding CPUTLBEntry comparator must have TLB_FORCE_SLOW.
- */
- uint8_t slow_flags[MMU_ACCESS_COUNT];
-
- /*
- * Allow target-specific additions to this structure.
- * This may be used to cache items from the guest cpu
- * page tables for later use by the implementation.
- */
- union {
- /*
- * Cache the attrs and shareability fields from the page table entry.
- *
- * For ARMMMUIdx_Stage2*, pte_attrs is the S2 descriptor bits [5:2].
- * Otherwise, pte_attrs is the same as the MAIR_EL1 8-bit format.
- * For shareability and guarded, as in the SH and GP fields respectively
- * of the VMSAv8-64 PTEs.
- */
- struct {
- uint8_t pte_attrs;
- uint8_t shareability;
- bool guarded;
- } arm;
- } extra;
-};
-
/*
* Data elements that are per MMU mode, minus the bits accessed by
* the TCG fast path.
--
2.43.0
* [PATCH 17/23] accel/tcg: Delay plugin adjustment in probe_access_internal
2024-10-09 15:08 [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
` (15 preceding siblings ...)
2024-10-09 15:08 ` [PATCH 16/23] include/exec/tlb-common: Move CPUTLBEntryFull from hw/core/cpu.h Richard Henderson
@ 2024-10-09 15:08 ` Richard Henderson
2024-10-10 0:19 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 18/23] accel/tcg: Call cpu_ld*_code_mmu from cpu_ld*_code Richard Henderson
` (6 subsequent siblings)
23 siblings, 1 reply; 60+ messages in thread
From: Richard Henderson @ 2024-10-09 15:08 UTC (permalink / raw)
To: qemu-devel
Remove force_mmio and fold its computation into the if condition,
behind the short-circuiting tests that may avoid evaluating it.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/cputlb.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 561f66c723..59ee766d51 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1403,7 +1403,6 @@ static int probe_access_internal(CPUState *cpu, vaddr addr,
uint64_t tlb_addr = tlb_read_idx(entry, access_type);
vaddr page_addr = addr & TARGET_PAGE_MASK;
int flags = TLB_FLAGS_MASK & ~TLB_FORCE_SLOW;
- bool force_mmio = check_mem_cbs && cpu_plugin_mem_cbs_enabled(cpu);
CPUTLBEntryFull *full;
if (!tlb_hit_page(tlb_addr, page_addr)) {
@@ -1434,9 +1433,14 @@ static int probe_access_internal(CPUState *cpu, vaddr addr,
*pfull = full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
flags |= full->slow_flags[access_type];
- /* Fold all "mmio-like" bits into TLB_MMIO. This is not RAM. */
- if (unlikely(flags & ~(TLB_WATCHPOINT | TLB_NOTDIRTY | TLB_CHECK_ALIGNED))
- || (access_type != MMU_INST_FETCH && force_mmio)) {
+ /*
+ * Fold all "mmio-like" bits, and required plugin callbacks, to TLB_MMIO.
+ * These cannot be treated as RAM.
+ */
+ if ((flags & ~(TLB_WATCHPOINT | TLB_NOTDIRTY | TLB_CHECK_ALIGNED))
+ || (access_type != MMU_INST_FETCH
+ && check_mem_cbs
+ && cpu_plugin_mem_cbs_enabled(cpu))) {
*phost = NULL;
return TLB_MMIO;
}
--
2.43.0
* [PATCH 18/23] accel/tcg: Call cpu_ld*_code_mmu from cpu_ld*_code
2024-10-09 15:08 [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
` (16 preceding siblings ...)
2024-10-09 15:08 ` [PATCH 17/23] accel/tcg: Delay plugin adjustment in probe_access_internal Richard Henderson
@ 2024-10-09 15:08 ` Richard Henderson
2024-10-09 18:51 ` Philippe Mathieu-Daudé
2024-10-10 0:23 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 19/23] accel/tcg: Always use IntervalTree for code lookups Richard Henderson
` (5 subsequent siblings)
23 siblings, 2 replies; 60+ messages in thread
From: Richard Henderson @ 2024-10-09 15:08 UTC (permalink / raw)
To: qemu-devel
Ensure a common entry point for all code lookups.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/cputlb.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 59ee766d51..61daa89e06 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -2954,28 +2954,28 @@ uint32_t cpu_ldub_code(CPUArchState *env, abi_ptr addr)
{
CPUState *cs = env_cpu(env);
MemOpIdx oi = make_memop_idx(MO_UB, cpu_mmu_index(cs, true));
- return do_ld1_mmu(cs, addr, oi, 0, MMU_INST_FETCH);
+ return cpu_ldb_code_mmu(env, addr, oi, 0);
}
uint32_t cpu_lduw_code(CPUArchState *env, abi_ptr addr)
{
CPUState *cs = env_cpu(env);
MemOpIdx oi = make_memop_idx(MO_TEUW, cpu_mmu_index(cs, true));
- return do_ld2_mmu(cs, addr, oi, 0, MMU_INST_FETCH);
+ return cpu_ldw_code_mmu(env, addr, oi, 0);
}
uint32_t cpu_ldl_code(CPUArchState *env, abi_ptr addr)
{
CPUState *cs = env_cpu(env);
MemOpIdx oi = make_memop_idx(MO_TEUL, cpu_mmu_index(cs, true));
- return do_ld4_mmu(cs, addr, oi, 0, MMU_INST_FETCH);
+ return cpu_ldl_code_mmu(env, addr, oi, 0);
}
uint64_t cpu_ldq_code(CPUArchState *env, abi_ptr addr)
{
CPUState *cs = env_cpu(env);
MemOpIdx oi = make_memop_idx(MO_TEUQ, cpu_mmu_index(cs, true));
- return do_ld8_mmu(cs, addr, oi, 0, MMU_INST_FETCH);
+ return cpu_ldq_code_mmu(env, addr, oi, 0);
}
uint8_t cpu_ldb_code_mmu(CPUArchState *env, abi_ptr addr,
--
2.43.0
^ permalink raw reply related [flat|nested] 60+ messages in thread
* [PATCH 19/23] accel/tcg: Always use IntervalTree for code lookups
2024-10-09 15:08 [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
` (17 preceding siblings ...)
2024-10-09 15:08 ` [PATCH 18/23] accel/tcg: Call cpu_ld*_code_mmu from cpu_ld*_code Richard Henderson
@ 2024-10-09 15:08 ` Richard Henderson
2024-10-10 0:35 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 20/23] accel/tcg: Link CPUTLBEntry to CPUTLBEntryTree Richard Henderson
` (4 subsequent siblings)
23 siblings, 1 reply; 60+ messages in thread
From: Richard Henderson @ 2024-10-09 15:08 UTC (permalink / raw)
To: qemu-devel
Because translation is special, we don't need the speed
of the direct-mapped softmmu tlb. We cache the lookup in
DisasContextBase within the translator loop anyway.
Drop the addr_code comparator from CPUTLBEntry.
Go directly to the IntervalTree for MMU_INST_FETCH.
Derive exec flags from read flags.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
include/exec/cpu-all.h | 3 +
include/exec/tlb-common.h | 5 +-
accel/tcg/cputlb.c | 138 +++++++++++++++++++++++++++++---------
3 files changed, 110 insertions(+), 36 deletions(-)
diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index 6f09b86e7f..7f5a10962a 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -326,6 +326,9 @@ static inline int cpu_mmu_index(CPUState *cs, bool ifetch)
(TLB_INVALID_MASK | TLB_NOTDIRTY | TLB_MMIO \
| TLB_FORCE_SLOW | TLB_DISCARD_WRITE)
+/* Filter read flags to exec flags. */
+#define TLB_EXEC_FLAGS_MASK (TLB_MMIO)
+
/*
* Flags stored in CPUTLBEntryFull.slow_flags[x].
* TLB_FORCE_SLOW must be set in CPUTLBEntry.addr_idx[x].
diff --git a/include/exec/tlb-common.h b/include/exec/tlb-common.h
index 300f9fae67..feaa471299 100644
--- a/include/exec/tlb-common.h
+++ b/include/exec/tlb-common.h
@@ -26,7 +26,6 @@ typedef union CPUTLBEntry {
struct {
uint64_t addr_read;
uint64_t addr_write;
- uint64_t addr_code;
/*
* Addend to virtual address to get host address. IO accesses
* use the corresponding iotlb value.
@@ -35,7 +34,7 @@ typedef union CPUTLBEntry {
};
/*
* Padding to get a power of two size, as well as index
- * access to addr_{read,write,code}.
+ * access to addr_{read,write}.
*/
uint64_t addr_idx[(1 << CPU_TLB_ENTRY_BITS) / sizeof(uint64_t)];
} CPUTLBEntry;
@@ -92,7 +91,7 @@ struct CPUTLBEntryFull {
* Additional tlb flags for use by the slow path. If non-zero,
* the corresponding CPUTLBEntry comparator must have TLB_FORCE_SLOW.
*/
- uint8_t slow_flags[MMU_ACCESS_COUNT];
+ uint8_t slow_flags[2];
/*
* Allow target-specific additions to this structure.
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 61daa89e06..7c8308355d 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -114,8 +114,9 @@ static inline uint64_t tlb_read_idx(const CPUTLBEntry *entry,
MMU_DATA_LOAD * sizeof(uint64_t));
QEMU_BUILD_BUG_ON(offsetof(CPUTLBEntry, addr_write) !=
MMU_DATA_STORE * sizeof(uint64_t));
- QEMU_BUILD_BUG_ON(offsetof(CPUTLBEntry, addr_code) !=
- MMU_INST_FETCH * sizeof(uint64_t));
+
+ tcg_debug_assert(access_type == MMU_DATA_LOAD ||
+ access_type == MMU_DATA_STORE);
#if TARGET_LONG_BITS == 32
/* Use qatomic_read, in case of addr_write; only care about low bits. */
@@ -490,8 +491,7 @@ static bool tlb_hit_page_mask_anyprot(CPUTLBEntry *tlb_entry,
mask &= TARGET_PAGE_MASK | TLB_INVALID_MASK;
return (page == (tlb_entry->addr_read & mask) ||
- page == (tlb_addr_write(tlb_entry) & mask) ||
- page == (tlb_entry->addr_code & mask));
+ page == (tlb_addr_write(tlb_entry) & mask));
}
static inline bool tlb_hit_page_anyprot(CPUTLBEntry *tlb_entry, vaddr page)
@@ -1061,15 +1061,13 @@ static inline void tlb_set_compare(CPUTLBEntryFull *full, CPUTLBEntry *ent,
vaddr address, int flags,
MMUAccessType access_type, bool enable)
{
- if (enable) {
- address |= flags & TLB_FLAGS_MASK;
- flags &= TLB_SLOW_FLAGS_MASK;
- if (flags) {
- address |= TLB_FORCE_SLOW;
- }
- } else {
- address = -1;
- flags = 0;
+ if (!enable) {
+ address = TLB_INVALID_MASK;
+ }
+ address |= flags & TLB_FLAGS_MASK;
+ flags &= TLB_SLOW_FLAGS_MASK;
+ if (flags) {
+ address |= TLB_FORCE_SLOW;
}
ent->addr_idx[access_type] = address;
full->slow_flags[access_type] = flags;
@@ -1215,9 +1213,6 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
/* Now calculate the new entry */
node->copy.addend = addend - addr_page;
- tlb_set_compare(full, &node->copy, addr_page, read_flags,
- MMU_INST_FETCH, prot & PAGE_EXEC);
-
if (wp_flags & BP_MEM_READ) {
read_flags |= TLB_WATCHPOINT;
}
@@ -1392,21 +1387,52 @@ static void notdirty_write(CPUState *cpu, vaddr mem_vaddr, unsigned size,
}
}
-static int probe_access_internal(CPUState *cpu, vaddr addr,
- int fault_size, MMUAccessType access_type,
- int mmu_idx, bool nonfault,
- void **phost, CPUTLBEntryFull **pfull,
- uintptr_t retaddr, bool check_mem_cbs)
+static int probe_access_internal_code(CPUState *cpu, vaddr addr,
+ int fault_size, int mmu_idx,
+ bool nonfault,
+ void **phost, CPUTLBEntryFull **pfull,
+ uintptr_t retaddr)
+{
+ CPUTLBEntryTree *t = tlbtree_lookup_addr(&cpu->neg.tlb.d[mmu_idx], addr);
+ int flags;
+
+ if (!t || !(t->full.prot & PAGE_EXEC)) {
+ if (!tlb_fill_align(cpu, addr, MMU_INST_FETCH, mmu_idx,
+ 0, fault_size, nonfault, retaddr)) {
+ /* Non-faulting page table read failed. */
+ *phost = NULL;
+ *pfull = NULL;
+ return TLB_INVALID_MASK;
+ }
+ t = tlbtree_lookup_addr(&cpu->neg.tlb.d[mmu_idx], addr);
+ }
+ flags = t->copy.addr_read & TLB_EXEC_FLAGS_MASK;
+ *pfull = &t->full;
+
+ if (flags) {
+ *phost = NULL;
+ return TLB_MMIO;
+ }
+
+ /* Everything else is RAM. */
+ *phost = (void *)((uintptr_t)addr + t->copy.addend);
+ return flags;
+}
+
+static int probe_access_internal_data(CPUState *cpu, vaddr addr,
+ int fault_size, MMUAccessType access_type,
+ int mmu_idx, bool nonfault,
+ void **phost, CPUTLBEntryFull **pfull,
+ uintptr_t retaddr, bool check_mem_cbs)
{
uintptr_t index = tlb_index(cpu, mmu_idx, addr);
CPUTLBEntry *entry = tlb_entry(cpu, mmu_idx, addr);
uint64_t tlb_addr = tlb_read_idx(entry, access_type);
- vaddr page_addr = addr & TARGET_PAGE_MASK;
int flags = TLB_FLAGS_MASK & ~TLB_FORCE_SLOW;
CPUTLBEntryFull *full;
- if (!tlb_hit_page(tlb_addr, page_addr)) {
- if (!tlbtree_hit(cpu, mmu_idx, access_type, page_addr)) {
+ if (!tlb_hit(tlb_addr, addr)) {
+ if (!tlbtree_hit(cpu, mmu_idx, access_type, addr)) {
if (!tlb_fill_align(cpu, addr, access_type, mmu_idx,
0, fault_size, nonfault, retaddr)) {
/* Non-faulting page table read failed. */
@@ -1450,6 +1476,21 @@ static int probe_access_internal(CPUState *cpu, vaddr addr,
return flags;
}
+static int probe_access_internal(CPUState *cpu, vaddr addr,
+ int fault_size, MMUAccessType access_type,
+ int mmu_idx, bool nonfault,
+ void **phost, CPUTLBEntryFull **pfull,
+ uintptr_t retaddr, bool check_mem_cbs)
+{
+ if (access_type == MMU_INST_FETCH) {
+ return probe_access_internal_code(cpu, addr, fault_size, mmu_idx,
+ nonfault, phost, pfull, retaddr);
+ }
+ return probe_access_internal_data(cpu, addr, fault_size, access_type,
+ mmu_idx, nonfault, phost, pfull,
+ retaddr, check_mem_cbs);
+}
+
int probe_access_full(CPUArchState *env, vaddr addr, int size,
MMUAccessType access_type, int mmu_idx,
bool nonfault, void **phost, CPUTLBEntryFull **pfull,
@@ -1582,9 +1623,9 @@ tb_page_addr_t get_page_addr_code_hostp(CPUArchState *env, vaddr addr,
CPUTLBEntryFull *full;
void *p;
- (void)probe_access_internal(env_cpu(env), addr, 1, MMU_INST_FETCH,
- cpu_mmu_index(env_cpu(env), true), false,
- &p, &full, 0, false);
+ (void)probe_access_internal_code(env_cpu(env), addr, 1,
+ cpu_mmu_index(env_cpu(env), true),
+ false, &p, &full, 0);
if (p == NULL) {
return -1;
}
@@ -1678,8 +1719,31 @@ typedef struct MMULookupLocals {
* tlb_fill_align will longjmp out. Return true if the softmmu tlb for
* @mmu_idx may have resized.
*/
-static bool mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
- int mmu_idx, MMUAccessType access_type, uintptr_t ra)
+static bool mmu_lookup1_code(CPUState *cpu, MMULookupPageData *data,
+ MemOp memop, int mmu_idx, uintptr_t ra)
+{
+ vaddr addr = data->addr;
+ CPUTLBEntryTree *t = tlbtree_lookup_addr(&cpu->neg.tlb.d[mmu_idx], addr);
+ bool maybe_resized = true;
+
+ if (!t || !(t->full.prot & PAGE_EXEC)) {
+ tlb_fill_align(cpu, addr, MMU_INST_FETCH, mmu_idx,
+ memop, data->size, false, ra);
+ maybe_resized = true;
+ t = tlbtree_lookup_addr(&cpu->neg.tlb.d[mmu_idx], addr);
+ }
+
+ data->full = &t->full;
+ data->flags = t->copy.addr_read & TLB_EXEC_FLAGS_MASK;
+ /* Compute haddr speculatively; depending on flags it might be invalid. */
+ data->haddr = (void *)((uintptr_t)addr + t->copy.addend);
+
+ return maybe_resized;
+}
+
+static bool mmu_lookup1_data(CPUState *cpu, MMULookupPageData *data,
+ MemOp memop, int mmu_idx,
+ MMUAccessType access_type, uintptr_t ra)
{
vaddr addr = data->addr;
uintptr_t index = tlb_index(cpu, mmu_idx, addr);
@@ -1738,6 +1802,15 @@ static bool mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
return maybe_resized;
}
+static bool mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
+ int mmu_idx, MMUAccessType access_type, uintptr_t ra)
+{
+ if (access_type == MMU_INST_FETCH) {
+ return mmu_lookup1_code(cpu, data, memop, mmu_idx, ra);
+ }
+ return mmu_lookup1_data(cpu, data, memop, mmu_idx, access_type, ra);
+}
+
/**
* mmu_watch_or_dirty
* @cpu: generic cpu state
@@ -1885,13 +1958,13 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
}
}
+ full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
+
/*
* Let the guest notice RMW on a write-only page.
* We have just verified that the page is writable.
- * Subpage lookups may have left TLB_INVALID_MASK set,
- * but addr_read will only be -1 if PAGE_READ was unset.
*/
- if (unlikely(tlbe->addr_read == -1)) {
+ if (unlikely(!(full->prot & PAGE_READ))) {
tlb_fill_align(cpu, addr, MMU_DATA_LOAD, mmu_idx,
0, size, false, retaddr);
/*
@@ -1929,7 +2002,6 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
}
hostaddr = (void *)((uintptr_t)addr + tlbe->addend);
- full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
if (unlikely(flags & TLB_NOTDIRTY)) {
notdirty_write(cpu, addr, size, full, retaddr);
--
2.43.0
^ permalink raw reply related [flat|nested] 60+ messages in thread
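For reference, tlbtree_lookup_addr() is introduced earlier in the series and is
not visible in this hunk. A minimal sketch of the shape such a lookup takes,
assuming CPUTLBEntryTree embeds its IntervalTreeNode as 'itree' keyed on the
page's address range:
/* Sketch only: find the tree node covering a single virtual address. */
static CPUTLBEntryTree *tlbtree_lookup_addr(CPUTLBDesc *desc, vaddr addr)
{
    IntervalTreeNode *i = interval_tree_iter_first(&desc->iroot, addr, addr);

    return i ? container_of(i, CPUTLBEntryTree, itree) : NULL;
}
With a helper of that shape, MMU_INST_FETCH never touches the direct-mapped
table at all: the tree node carries both the comparator copy and the
CPUTLBEntryFull.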
* [PATCH 20/23] accel/tcg: Link CPUTLBEntry to CPUTLBEntryTree
2024-10-09 15:08 [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
` (18 preceding siblings ...)
2024-10-09 15:08 ` [PATCH 19/23] accel/tcg: Always use IntervalTree for code lookups Richard Henderson
@ 2024-10-09 15:08 ` Richard Henderson
2024-10-10 0:37 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 21/23] accel/tcg: Remove CPUTLBDesc.fulltlb Richard Henderson
` (3 subsequent siblings)
23 siblings, 1 reply; 60+ messages in thread
From: Richard Henderson @ 2024-10-09 15:08 UTC (permalink / raw)
To: qemu-devel
Link from the fast tlb entry to the interval tree node.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
include/exec/tlb-common.h | 2 ++
accel/tcg/cputlb.c | 59 ++++++++++++++-------------------------
2 files changed, 23 insertions(+), 38 deletions(-)
diff --git a/include/exec/tlb-common.h b/include/exec/tlb-common.h
index feaa471299..3b57d61112 100644
--- a/include/exec/tlb-common.h
+++ b/include/exec/tlb-common.h
@@ -31,6 +31,8 @@ typedef union CPUTLBEntry {
* use the corresponding iotlb value.
*/
uintptr_t addend;
+ /* The defining IntervalTree entry. */
+ struct CPUTLBEntryTree *tree;
};
/*
* Padding to get a power of two size, as well as index
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 7c8308355d..2a8d1b4fb2 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -505,7 +505,10 @@ static bool tlb_flush_entry_mask_locked(CPUTLBEntry *tlb_entry,
vaddr mask)
{
if (tlb_hit_page_mask_anyprot(tlb_entry, page, mask)) {
- memset(tlb_entry, -1, sizeof(*tlb_entry));
+ tlb_entry->addr_read = -1;
+ tlb_entry->addr_write = -1;
+ tlb_entry->addend = 0;
+ tlb_entry->tree = NULL;
return true;
}
return false;
@@ -1212,6 +1215,7 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
/* Now calculate the new entry */
node->copy.addend = addend - addr_page;
+ node->copy.tree = node;
if (wp_flags & BP_MEM_READ) {
read_flags |= TLB_WATCHPOINT;
@@ -1425,7 +1429,6 @@ static int probe_access_internal_data(CPUState *cpu, vaddr addr,
void **phost, CPUTLBEntryFull **pfull,
uintptr_t retaddr, bool check_mem_cbs)
{
- uintptr_t index = tlb_index(cpu, mmu_idx, addr);
CPUTLBEntry *entry = tlb_entry(cpu, mmu_idx, addr);
uint64_t tlb_addr = tlb_read_idx(entry, access_type);
int flags = TLB_FLAGS_MASK & ~TLB_FORCE_SLOW;
@@ -1442,7 +1445,6 @@ static int probe_access_internal_data(CPUState *cpu, vaddr addr,
}
/* TLB resize via tlb_fill_align may have moved the entry. */
- index = tlb_index(cpu, mmu_idx, addr);
entry = tlb_entry(cpu, mmu_idx, addr);
/*
@@ -1456,7 +1458,7 @@ static int probe_access_internal_data(CPUState *cpu, vaddr addr,
}
flags &= tlb_addr;
- *pfull = full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
+ *pfull = full = &entry->tree->full;
flags |= full->slow_flags[access_type];
/*
@@ -1659,7 +1661,6 @@ bool tlb_plugin_lookup(CPUState *cpu, vaddr addr, int mmu_idx,
bool is_store, struct qemu_plugin_hwaddr *data)
{
CPUTLBEntry *tlbe = tlb_entry(cpu, mmu_idx, addr);
- uintptr_t index = tlb_index(cpu, mmu_idx, addr);
MMUAccessType access_type = is_store ? MMU_DATA_STORE : MMU_DATA_LOAD;
uint64_t tlb_addr = tlb_read_idx(tlbe, access_type);
CPUTLBEntryFull *full;
@@ -1668,7 +1669,7 @@ bool tlb_plugin_lookup(CPUState *cpu, vaddr addr, int mmu_idx,
return false;
}
- full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
+ full = &tlbe->tree->full;
data->phys_addr = full->phys_addr | (addr & ~TARGET_PAGE_MASK);
/* We must have an iotlb entry for MMIO */
@@ -1716,20 +1717,17 @@ typedef struct MMULookupLocals {
*
* Resolve the translation for the one page at @data.addr, filling in
* the rest of @data with the results. If the translation fails,
- * tlb_fill_align will longjmp out. Return true if the softmmu tlb for
- * @mmu_idx may have resized.
+ * tlb_fill_align will longjmp out.
*/
-static bool mmu_lookup1_code(CPUState *cpu, MMULookupPageData *data,
+static void mmu_lookup1_code(CPUState *cpu, MMULookupPageData *data,
MemOp memop, int mmu_idx, uintptr_t ra)
{
vaddr addr = data->addr;
CPUTLBEntryTree *t = tlbtree_lookup_addr(&cpu->neg.tlb.d[mmu_idx], addr);
- bool maybe_resized = true;
if (!t || !(t->full.prot & PAGE_EXEC)) {
tlb_fill_align(cpu, addr, MMU_INST_FETCH, mmu_idx,
memop, data->size, false, ra);
- maybe_resized = true;
t = tlbtree_lookup_addr(&cpu->neg.tlb.d[mmu_idx], addr);
}
@@ -1737,19 +1735,16 @@ static bool mmu_lookup1_code(CPUState *cpu, MMULookupPageData *data,
data->flags = t->copy.addr_read & TLB_EXEC_FLAGS_MASK;
/* Compute haddr speculatively; depending on flags it might be invalid. */
data->haddr = (void *)((uintptr_t)addr + t->copy.addend);
-
- return maybe_resized;
}
-static bool mmu_lookup1_data(CPUState *cpu, MMULookupPageData *data,
+static void mmu_lookup1_data(CPUState *cpu, MMULookupPageData *data,
MemOp memop, int mmu_idx,
MMUAccessType access_type, uintptr_t ra)
{
vaddr addr = data->addr;
- uintptr_t index = tlb_index(cpu, mmu_idx, addr);
CPUTLBEntry *entry = tlb_entry(cpu, mmu_idx, addr);
uint64_t tlb_addr = tlb_read_idx(entry, access_type);
- bool maybe_resized = false;
+ bool did_tlb_fill = false;
CPUTLBEntryFull *full;
int flags = TLB_FLAGS_MASK & ~TLB_FORCE_SLOW;
@@ -1758,8 +1753,7 @@ static bool mmu_lookup1_data(CPUState *cpu, MMULookupPageData *data,
if (!tlbtree_hit(cpu, mmu_idx, access_type, addr)) {
tlb_fill_align(cpu, addr, access_type, mmu_idx,
memop, data->size, false, ra);
- maybe_resized = true;
- index = tlb_index(cpu, mmu_idx, addr);
+ did_tlb_fill = true;
entry = tlb_entry(cpu, mmu_idx, addr);
/*
* With PAGE_WRITE_INV, we set TLB_INVALID_MASK immediately,
@@ -1771,11 +1765,11 @@ static bool mmu_lookup1_data(CPUState *cpu, MMULookupPageData *data,
tlb_addr = tlb_read_idx(entry, access_type);
}
- full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
- flags = tlb_addr & (TLB_FLAGS_MASK & ~TLB_FORCE_SLOW);
+ full = &entry->tree->full;
+ flags &= tlb_addr;
flags |= full->slow_flags[access_type];
- if (likely(!maybe_resized)) {
+ if (likely(!did_tlb_fill)) {
/* Alignment has not been checked by tlb_fill_align. */
int a_bits = memop_alignment_bits(memop);
@@ -1798,17 +1792,15 @@ static bool mmu_lookup1_data(CPUState *cpu, MMULookupPageData *data,
data->flags = flags;
/* Compute haddr speculatively; depending on flags it might be invalid. */
data->haddr = (void *)((uintptr_t)addr + entry->addend);
-
- return maybe_resized;
}
-static bool mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
+static void mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
int mmu_idx, MMUAccessType access_type, uintptr_t ra)
{
if (access_type == MMU_INST_FETCH) {
- return mmu_lookup1_code(cpu, data, memop, mmu_idx, ra);
+ mmu_lookup1_code(cpu, data, memop, mmu_idx, ra);
}
- return mmu_lookup1_data(cpu, data, memop, mmu_idx, access_type, ra);
+ mmu_lookup1_data(cpu, data, memop, mmu_idx, access_type, ra);
}
/**
@@ -1889,15 +1881,9 @@ static bool mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
l->page[1].size = l->page[0].size - size0;
l->page[0].size = size0;
- /*
- * Lookup both pages, recognizing exceptions from either. If the
- * second lookup potentially resized, refresh first CPUTLBEntryFull.
- */
+ /* Lookup both pages, recognizing exceptions from either. */
mmu_lookup1(cpu, &l->page[0], l->memop, l->mmu_idx, type, ra);
- if (mmu_lookup1(cpu, &l->page[1], 0, l->mmu_idx, type, ra)) {
- uintptr_t index = tlb_index(cpu, l->mmu_idx, addr);
- l->page[0].full = &cpu->neg.tlb.d[l->mmu_idx].fulltlb[index];
- }
+ mmu_lookup1(cpu, &l->page[1], 0, l->mmu_idx, type, ra);
flags = l->page[0].flags | l->page[1].flags;
if (unlikely(flags & (TLB_WATCHPOINT | TLB_NOTDIRTY))) {
@@ -1925,7 +1911,6 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
{
uintptr_t mmu_idx = get_mmuidx(oi);
MemOp mop = get_memop(oi);
- uintptr_t index;
CPUTLBEntry *tlbe;
void *hostaddr;
CPUTLBEntryFull *full;
@@ -1937,7 +1922,6 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
/* Adjust the given return address. */
retaddr -= GETPC_ADJ;
- index = tlb_index(cpu, mmu_idx, addr);
tlbe = tlb_entry(cpu, mmu_idx, addr);
/* Check TLB entry and enforce page permissions. */
@@ -1947,7 +1931,6 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
tlb_fill_align(cpu, addr, MMU_DATA_STORE, mmu_idx,
mop, size, false, retaddr);
did_tlb_fill = true;
- index = tlb_index(cpu, mmu_idx, addr);
tlbe = tlb_entry(cpu, mmu_idx, addr);
/*
* With PAGE_WRITE_INV, we set TLB_INVALID_MASK immediately,
@@ -1958,7 +1941,7 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
}
}
- full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
+ full = &tlbe->tree->full;
/*
* Let the guest notice RMW on a write-only page.
--
2.43.0
^ permalink raw reply related [flat|nested] 60+ messages in thread
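The effect of the back-pointer is that any fast-table hit can reach its
CPUTLBEntryFull without recomputing an index into a parallel array. A sketch of
the access pattern (hypothetical helper name; the series open-codes it as
&entry->tree->full), with the caveat visible in tlb_flush_entry_mask_locked
above that a flushed entry has tree == NULL and must not be dereferenced:
/* Sketch only: fast entry -> owning tree node -> full page data. */
static inline CPUTLBEntryFull *tlb_entry_full(CPUTLBEntry *ent)
{
    /* Valid only after a TLB hit; flushes reset ent->tree to NULL. */
    return &ent->tree->full;
}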
* [PATCH 21/23] accel/tcg: Remove CPUTLBDesc.fulltlb
2024-10-09 15:08 [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
` (19 preceding siblings ...)
2024-10-09 15:08 ` [PATCH 20/23] accel/tcg: Link CPUTLBEntry to CPUTLBEntryTree Richard Henderson
@ 2024-10-09 15:08 ` Richard Henderson
2024-10-10 0:38 ` Pierrick Bouvier
2024-10-09 15:08 ` [NOTYET PATCH 22/23] accel/tcg: Drop TCGCPUOps.tlb_fill Richard Henderson
` (2 subsequent siblings)
23 siblings, 1 reply; 60+ messages in thread
From: Richard Henderson @ 2024-10-09 15:08 UTC (permalink / raw)
To: qemu-devel
This array is now write-only, and may be removed.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
include/hw/core/cpu.h | 1 -
accel/tcg/cputlb.c | 39 ++++++++-------------------------------
2 files changed, 8 insertions(+), 32 deletions(-)
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 6b1c2bfadd..3022529733 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -216,7 +216,6 @@ typedef struct CPUTLBDesc {
/* maximum number of entries observed in the window */
size_t window_max_entries;
size_t n_used_entries;
- CPUTLBEntryFull *fulltlb;
/* All active tlb entries for this address space. */
IntervalTreeRoot iroot;
} CPUTLBDesc;
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 2a8d1b4fb2..47b9557bb8 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -149,13 +149,6 @@ static inline CPUTLBEntry *tlbfast_entry(CPUTLBDescFast *fast, vaddr addr)
return fast->table + tlbfast_index(fast, addr);
}
-/* Find the TLB index corresponding to the mmu_idx + address pair. */
-static inline uintptr_t tlb_index(CPUState *cpu, uintptr_t mmu_idx,
- vaddr addr)
-{
- return tlbfast_index(&cpu->neg.tlb.f[mmu_idx], addr);
-}
-
/* Find the TLB entry corresponding to the mmu_idx + address pair. */
static inline CPUTLBEntry *tlb_entry(CPUState *cpu, uintptr_t mmu_idx,
vaddr addr)
@@ -270,22 +263,20 @@ static void tlb_mmu_resize_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast,
}
g_free(fast->table);
- g_free(desc->fulltlb);
tlb_window_reset(desc, now, 0);
/* desc->n_used_entries is cleared by the caller */
fast->mask = (new_size - 1) << CPU_TLB_ENTRY_BITS;
fast->table = g_try_new(CPUTLBEntry, new_size);
- desc->fulltlb = g_try_new(CPUTLBEntryFull, new_size);
/*
- * If the allocations fail, try smaller sizes. We just freed some
+ * If the allocation fails, try smaller sizes. We just freed some
* memory, so going back to half of new_size has a good chance of working.
* Increased memory pressure elsewhere in the system might cause the
* allocations to fail though, so we progressively reduce the allocation
* size, aborting if we cannot even allocate the smallest TLB we support.
*/
- while (fast->table == NULL || desc->fulltlb == NULL) {
+ while (fast->table == NULL) {
if (new_size == (1 << CPU_TLB_DYN_MIN_BITS)) {
error_report("%s: %s", __func__, strerror(errno));
abort();
@@ -294,9 +285,7 @@ static void tlb_mmu_resize_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast,
fast->mask = (new_size - 1) << CPU_TLB_ENTRY_BITS;
g_free(fast->table);
- g_free(desc->fulltlb);
fast->table = g_try_new(CPUTLBEntry, new_size);
- desc->fulltlb = g_try_new(CPUTLBEntryFull, new_size);
}
}
@@ -350,7 +339,6 @@ static void tlb_mmu_init(CPUTLBDesc *desc, CPUTLBDescFast *fast, int64_t now)
desc->n_used_entries = 0;
fast->mask = (n_entries - 1) << CPU_TLB_ENTRY_BITS;
fast->table = g_new(CPUTLBEntry, n_entries);
- desc->fulltlb = g_new(CPUTLBEntryFull, n_entries);
memset(&desc->iroot, 0, sizeof(desc->iroot));
tlb_mmu_flush_locked(desc, fast);
}
@@ -382,15 +370,9 @@ void tlb_init(CPUState *cpu)
void tlb_destroy(CPUState *cpu)
{
- int i;
-
qemu_spin_destroy(&cpu->neg.tlb.c.lock);
- for (i = 0; i < NB_MMU_MODES; i++) {
- CPUTLBDesc *desc = &cpu->neg.tlb.d[i];
- CPUTLBDescFast *fast = &cpu->neg.tlb.f[i];
-
- g_free(fast->table);
- g_free(desc->fulltlb);
+ for (int i = 0; i < NB_MMU_MODES; i++) {
+ g_free(cpu->neg.tlb.f[i].table);
interval_tree_free_nodes(&cpu->neg.tlb.d[i].iroot,
offsetof(CPUTLBEntryTree, itree));
}
@@ -1090,7 +1072,7 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
CPUTLB *tlb = &cpu->neg.tlb;
CPUTLBDesc *desc = &tlb->d[mmu_idx];
MemoryRegionSection *section;
- unsigned int index, read_flags, write_flags;
+ unsigned int read_flags, write_flags;
uintptr_t addend;
CPUTLBEntry *te;
CPUTLBEntryTree *node;
@@ -1169,7 +1151,6 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
wp_flags = cpu_watchpoint_address_matches(cpu, addr_page,
TARGET_PAGE_SIZE);
- index = tlb_index(cpu, mmu_idx, addr_page);
te = tlb_entry(cpu, mmu_idx, addr_page);
/*
@@ -1208,8 +1189,8 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
* subtract here is that of the page base, and not the same as the
* vaddr we add back in io_prepare()/get_page_addr_code().
*/
- desc->fulltlb[index] = *full;
- full = &desc->fulltlb[index];
+ node->full = *full;
+ full = &node->full;
full->xlat_section = iotlb - addr_page;
full->phys_addr = paddr_page;
@@ -1232,7 +1213,6 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
tlb_set_compare(full, &node->copy, addr_page, write_flags,
MMU_DATA_STORE, prot & PAGE_WRITE);
- node->full = *full;
copy_tlb_helper_locked(te, &node->copy);
tlb_n_used_entries_inc(cpu, mmu_idx);
qemu_spin_unlock(&tlb->c.lock);
@@ -1343,7 +1323,6 @@ static bool tlbtree_hit(CPUState *cpu, int mmu_idx,
CPUTLBDesc *desc = &cpu->neg.tlb.d[mmu_idx];
CPUTLBDescFast *fast = &cpu->neg.tlb.f[mmu_idx];
CPUTLBEntryTree *node;
- size_t index;
assert_cpu_is_self(cpu);
node = tlbtree_lookup_addr(desc, addr);
@@ -1358,12 +1337,10 @@ static bool tlbtree_hit(CPUState *cpu, int mmu_idx,
}
/* Install the cached entry. */
- index = tlbfast_index(fast, addr);
qemu_spin_lock(&cpu->neg.tlb.c.lock);
- copy_tlb_helper_locked(&fast->table[index], &node->copy);
+ copy_tlb_helper_locked(tlbfast_entry(fast, addr), &node->copy);
qemu_spin_unlock(&cpu->neg.tlb.c.lock);
- desc->fulltlb[index] = node->full;
return true;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 60+ messages in thread
* [NOTYET PATCH 22/23] accel/tcg: Drop TCGCPUOps.tlb_fill
2024-10-09 15:08 [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
` (20 preceding siblings ...)
2024-10-09 15:08 ` [PATCH 21/23] accel/tcg: Remove CPUTLBDesc.fulltlb Richard Henderson
@ 2024-10-09 15:08 ` Richard Henderson
2024-10-10 0:40 ` Pierrick Bouvier
2024-10-09 15:08 ` [NOTYET PATCH 23/23] accel/tcg: Unexport tlb_set_page* Richard Henderson
2024-10-09 16:27 ` [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree BALATON Zoltan
23 siblings, 1 reply; 60+ messages in thread
From: Richard Henderson @ 2024-10-09 15:08 UTC (permalink / raw)
To: qemu-devel
Now that all targets have been converted to tlb_fill_align,
remove the tlb_fill hook.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
include/hw/core/tcg-cpu-ops.h | 10 ----------
accel/tcg/cputlb.c | 19 ++++---------------
2 files changed, 4 insertions(+), 25 deletions(-)
diff --git a/include/hw/core/tcg-cpu-ops.h b/include/hw/core/tcg-cpu-ops.h
index c932690621..e73c8a03de 100644
--- a/include/hw/core/tcg-cpu-ops.h
+++ b/include/hw/core/tcg-cpu-ops.h
@@ -157,16 +157,6 @@ struct TCGCPUOps {
bool (*tlb_fill_align)(CPUState *cpu, CPUTLBEntryFull *out, vaddr addr,
MMUAccessType access_type, int mmu_idx,
MemOp memop, int size, bool probe, uintptr_t ra);
- /**
- * @tlb_fill: Handle a softmmu tlb miss
- *
- * If the access is valid, call tlb_set_page and return true;
- * if the access is invalid and probe is true, return false;
- * otherwise raise an exception and do not return.
- */
- bool (*tlb_fill)(CPUState *cpu, vaddr address, int size,
- MMUAccessType access_type, int mmu_idx,
- bool probe, uintptr_t retaddr);
/**
* @do_transaction_failed: Callback for handling failed memory transactions
* (ie bus faults or external aborts; not MMU faults)
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 47b9557bb8..55c7bf737b 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1251,23 +1251,12 @@ static bool tlb_fill_align(CPUState *cpu, vaddr addr, MMUAccessType type,
int mmu_idx, MemOp memop, int size,
bool probe, uintptr_t ra)
{
- const TCGCPUOps *ops = cpu->cc->tcg_ops;
CPUTLBEntryFull full;
- if (ops->tlb_fill_align) {
- if (ops->tlb_fill_align(cpu, &full, addr, type, mmu_idx,
- memop, size, probe, ra)) {
- tlb_set_page_full(cpu, mmu_idx, addr, &full);
- return true;
- }
- } else {
- /* Legacy behaviour is alignment before paging. */
- if (addr & ((1u << memop_alignment_bits(memop)) - 1)) {
- ops->do_unaligned_access(cpu, addr, type, mmu_idx, ra);
- }
- if (ops->tlb_fill(cpu, addr, size, type, mmu_idx, probe, ra)) {
- return true;
- }
+ if (cpu->cc->tcg_ops->tlb_fill_align(cpu, &full, addr, type, mmu_idx,
+ memop, size, probe, ra)) {
+ tlb_set_page_full(cpu, mmu_idx, addr, &full);
+ return true;
}
assert(probe);
return false;
--
2.43.0
^ permalink raw reply related [flat|nested] 60+ messages in thread
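For targets still to be converted, the remaining hook has the contract shown in
tcg-cpu-ops.h above: on success fill *out and return true, on a failed probe
return false, otherwise raise the exception and do not return. A hypothetical
conversion might look like the sketch below; mytarget_page_walk() and the
fault-raising steps are placeholders, not any real target's code.
/* Sketch only: the general shape of a target's tlb_fill_align hook. */
static bool mytarget_tlb_fill_align(CPUState *cs, CPUTLBEntryFull *out,
                                    vaddr addr, MMUAccessType access_type,
                                    int mmu_idx, MemOp memop, int size,
                                    bool probe, uintptr_t ra)
{
    hwaddr paddr;
    int prot;

    /* Alignment is now the target's job, before or during the walk. */
    if (addr & ((1u << memop_alignment_bits(memop)) - 1)) {
        /* raise the target's alignment fault here; does not return */
    }

    if (!mytarget_page_walk(cs, addr, access_type, mmu_idx, &paddr, &prot)) {
        if (probe) {
            return false;       /* caller asserts probe and reports failure */
        }
        /* raise the target's page fault here; does not return */
    }

    memset(out, 0, sizeof(*out));
    out->phys_addr = paddr;
    out->attrs = MEMTXATTRS_UNSPECIFIED;
    out->prot = prot;
    out->lg_page_size = TARGET_PAGE_BITS;
    return true;
}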
* [NOTYET PATCH 23/23] accel/tcg: Unexport tlb_set_page*
2024-10-09 15:08 [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
` (21 preceding siblings ...)
2024-10-09 15:08 ` [NOTYET PATCH 22/23] accel/tcg: Drop TCGCPUOps.tlb_fill Richard Henderson
@ 2024-10-09 15:08 ` Richard Henderson
2024-10-09 16:27 ` [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree BALATON Zoltan
23 siblings, 0 replies; 60+ messages in thread
From: Richard Henderson @ 2024-10-09 15:08 UTC (permalink / raw)
To: qemu-devel
The new tlb_fill_align hook returns page data via a structure
rather than by function call, so we can make tlb_set_page_full
local to cputlb.c. There are no users of tlb_set_page
or tlb_set_page_with_attrs, so those can be eliminated.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
include/exec/exec-all.h | 57 -----------------------------------------
accel/tcg/cputlb.c | 27 ++-----------------
2 files changed, 2 insertions(+), 82 deletions(-)
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 72240ef426..8e2ab26902 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -184,63 +184,6 @@ void tlb_flush_range_by_mmuidx_all_cpus_synced(CPUState *cpu,
vaddr len,
uint16_t idxmap,
unsigned bits);
-
-/**
- * tlb_set_page_full:
- * @cpu: CPU context
- * @mmu_idx: mmu index of the tlb to modify
- * @addr: virtual address of the entry to add
- * @full: the details of the tlb entry
- *
- * Add an entry to @cpu tlb index @mmu_idx. All of the fields of
- * @full must be filled, except for xlat_section, and constitute
- * the complete description of the translated page.
- *
- * This is generally called by the target tlb_fill function after
- * having performed a successful page table walk to find the physical
- * address and attributes for the translation.
- *
- * At most one entry for a given virtual address is permitted. Only a
- * single TARGET_PAGE_SIZE region is mapped; @full->lg_page_size is only
- * used by tlb_flush_page.
- */
-void tlb_set_page_full(CPUState *cpu, int mmu_idx, vaddr addr,
- CPUTLBEntryFull *full);
-
-/**
- * tlb_set_page_with_attrs:
- * @cpu: CPU to add this TLB entry for
- * @addr: virtual address of page to add entry for
- * @paddr: physical address of the page
- * @attrs: memory transaction attributes
- * @prot: access permissions (PAGE_READ/PAGE_WRITE/PAGE_EXEC bits)
- * @mmu_idx: MMU index to insert TLB entry for
- * @size: size of the page in bytes
- *
- * Add an entry to this CPU's TLB (a mapping from virtual address
- * @addr to physical address @paddr) with the specified memory
- * transaction attributes. This is generally called by the target CPU
- * specific code after it has been called through the tlb_fill()
- * entry point and performed a successful page table walk to find
- * the physical address and attributes for the virtual address
- * which provoked the TLB miss.
- *
- * At most one entry for a given virtual address is permitted. Only a
- * single TARGET_PAGE_SIZE region is mapped; the supplied @size is only
- * used by tlb_flush_page.
- */
-void tlb_set_page_with_attrs(CPUState *cpu, vaddr addr,
- hwaddr paddr, MemTxAttrs attrs,
- int prot, int mmu_idx, vaddr size);
-/* tlb_set_page:
- *
- * This function is equivalent to calling tlb_set_page_with_attrs()
- * with an @attrs argument of MEMTXATTRS_UNSPECIFIED. It's provided
- * as a convenience for CPUs which don't use memory transaction attributes.
- */
-void tlb_set_page(CPUState *cpu, vaddr addr,
- hwaddr paddr, int prot,
- int mmu_idx, vaddr size);
#else
static inline void tlb_init(CPUState *cpu)
{
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 55c7bf737b..5efd6e536c 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1066,8 +1066,8 @@ static inline void tlb_set_compare(CPUTLBEntryFull *full, CPUTLBEntry *ent,
* Called from TCG-generated code, which is under an RCU read-side
* critical section.
*/
-void tlb_set_page_full(CPUState *cpu, int mmu_idx,
- vaddr addr, CPUTLBEntryFull *full)
+static void tlb_set_page_full(CPUState *cpu, int mmu_idx,
+ vaddr addr, CPUTLBEntryFull *full)
{
CPUTLB *tlb = &cpu->neg.tlb;
CPUTLBDesc *desc = &tlb->d[mmu_idx];
@@ -1218,29 +1218,6 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
qemu_spin_unlock(&tlb->c.lock);
}
-void tlb_set_page_with_attrs(CPUState *cpu, vaddr addr,
- hwaddr paddr, MemTxAttrs attrs, int prot,
- int mmu_idx, uint64_t size)
-{
- CPUTLBEntryFull full = {
- .phys_addr = paddr,
- .attrs = attrs,
- .prot = prot,
- .lg_page_size = ctz64(size)
- };
-
- assert(is_power_of_2(size));
- tlb_set_page_full(cpu, mmu_idx, addr, &full);
-}
-
-void tlb_set_page(CPUState *cpu, vaddr addr,
- hwaddr paddr, int prot,
- int mmu_idx, uint64_t size)
-{
- tlb_set_page_with_attrs(cpu, addr, paddr, MEMTXATTRS_UNSPECIFIED,
- prot, mmu_idx, size);
-}
-
/*
* Note: tlb_fill_align() can trigger a resize of the TLB.
* This means that all of the caller's prior references to the TLB table
--
2.43.0
^ permalink raw reply related [flat|nested] 60+ messages in thread
* Re: [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree
2024-10-09 15:08 [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
` (22 preceding siblings ...)
2024-10-09 15:08 ` [NOTYET PATCH 23/23] accel/tcg: Unexport tlb_set_page* Richard Henderson
@ 2024-10-09 16:27 ` BALATON Zoltan
2024-10-09 17:10 ` Richard Henderson
23 siblings, 1 reply; 60+ messages in thread
From: BALATON Zoltan @ 2024-10-09 16:27 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel
On Wed, 9 Oct 2024, Richard Henderson wrote:
> Based-on: 20241009000453.315652-1-richard.henderson@linaro.org
> ("[PATCH v3 00/20] accel/tcg: Introduce tlb_fill_align hook")
>
> The initial idea was: how much can we do with an intelligent data
> structure for the same cost as a linear search through an array?
>
> This is an initial installment along these lines. This is about
> as far as I can go without first converting all targets to the
> new tlb_fill_align hook. Indeed, the final two patches will not
> compile with all targets enabled, but hint at the direction of
> the next steps.
>
> I do not expect large perf changes with this patch set. I will
> be happy if performance comes out even.
Then what's the point? The diffstat below does not show that it's less code,
and if it's also not more efficient, then what's the reason to change it? I'm
not opposed to any change, I just don't see an explanation of what motivates
this series.
Regards,
BALATON Zoltan
>
> r~
>
> Richard Henderson (23):
> util/interval-tree: Introduce interval_tree_free_nodes
> accel/tcg: Split out tlbfast_flush_locked
> accel/tcg: Split out tlbfast_{index,entry}
> accel/tcg: Split out tlbfast_flush_range_locked
> accel/tcg: Fix flags usage in mmu_lookup1, atomic_mmu_lookup
> accel/tcg: Early exit for zero length in tlb_flush_range_by_mmuidx*
> accel/tcg: Flush entire tlb when a masked range wraps
> accel/tcg: Add IntervalTreeRoot to CPUTLBDesc
> accel/tcg: Populate IntervalTree in tlb_set_page_full
> accel/tcg: Remove IntervalTree entry in tlb_flush_page_locked
> accel/tcg: Remove IntervalTree entries in tlb_flush_range_locked
> accel/tcg: Process IntervalTree entries in tlb_reset_dirty
> accel/tcg: Process IntervalTree entries in tlb_set_dirty
> accel/tcg: Replace victim_tlb_hit with tlbtree_hit
> accel/tcg: Remove the victim tlb
> include/exec/tlb-common: Move CPUTLBEntryFull from hw/core/cpu.h
> accel/tcg: Delay plugin adjustment in probe_access_internal
> accel/tcg: Call cpu_ld*_code_mmu from cpu_ld*_code
> accel/tcg: Always use IntervalTree for code lookups
> accel/tcg: Link CPUTLBEntry to CPUTLBEntryTree
> accel/tcg: Remove CPUTLBDesc.fulltlb
> accel/tcg: Drop TCGCPUOps.tlb_fill -- NOTYET
> accel/tcg: Unexport tlb_set_page*
>
> include/exec/cpu-all.h | 3 +
> include/exec/exec-all.h | 57 ----
> include/exec/tlb-common.h | 68 +++-
> include/hw/core/cpu.h | 75 +----
> include/hw/core/tcg-cpu-ops.h | 10 -
> include/qemu/interval-tree.h | 11 +
> accel/tcg/cputlb.c | 599 +++++++++++++++++++---------------
> util/interval-tree.c | 20 ++
> util/selfmap.c | 13 +-
> 9 files changed, 431 insertions(+), 425 deletions(-)
>
>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree
2024-10-09 16:27 ` [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree BALATON Zoltan
@ 2024-10-09 17:10 ` Richard Henderson
2024-10-10 0:50 ` Pierrick Bouvier
0 siblings, 1 reply; 60+ messages in thread
From: Richard Henderson @ 2024-10-09 17:10 UTC (permalink / raw)
To: BALATON Zoltan; +Cc: qemu-devel
On 10/9/24 09:27, BALATON Zoltan wrote:
> On Wed, 9 Oct 2024, Richard Henderson wrote:
>> Based-on: 20241009000453.315652-1-richard.henderson@linaro.org
>> ("[PATCH v3 00/20] accel/tcg: Introduce tlb_fill_align hook")
>>
>> The initial idea was: how much can we do with an intelligent data
>> structure for the same cost as a linear search through an array?
>>
>> This is an initial installment along these lines. This is about
>> as far as I can go without first converting all targets to the
>> new tlb_fill_align hook. Indeed, the final two patches will not
>> compile with all targets enabled, but hint at the direction of
>> the next steps.
>>
>> I do not expect large perf changes with this patch set. I will
>> be happy if performance comes out even.
>
> Then what's the point?
Eventually fixing the page size > TARGET_PAGE_SIZE performance issues.
E.g. with a 16k or 64k aarch64 guest kernel, we still have TARGET_PAGE_SIZE at 4k, so all
guest pages are "large", and so run into our current behaviour of flushing the entire tlb
too often.
Even without that, I expect further cleanups to improve performance; we're just not there yet.
r~
^ permalink raw reply [flat|nested] 60+ messages in thread
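To put rough numbers on the 64k case: with TARGET_PAGE_SIZE fixed at 4k, a
single guest page spans

    64 KiB / 4 KiB = 16 softmmu entries

so it is recorded via large_page_addr/large_page_mask, and (as
tlb_flush_page_locked shows earlier in the series) any page flush that falls
inside that range currently degenerates into a full per-mmuidx flush.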
* Re: [PATCH 18/23] accel/tcg: Call cpu_ld*_code_mmu from cpu_ld*_code
2024-10-09 15:08 ` [PATCH 18/23] accel/tcg: Call cpu_ld*_code_mmu from cpu_ld*_code Richard Henderson
@ 2024-10-09 18:51 ` Philippe Mathieu-Daudé
2024-10-10 0:23 ` Pierrick Bouvier
1 sibling, 0 replies; 60+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-10-09 18:51 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 9/10/24 12:08, Richard Henderson wrote:
> Ensure a common entry point for all code lookups.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/cputlb.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 06/23] accel/tcg: Early exit for zero length in tlb_flush_range_by_mmuidx*
2024-10-09 15:08 ` [PATCH 06/23] accel/tcg: Early exit for zero length in tlb_flush_range_by_mmuidx* Richard Henderson
@ 2024-10-09 18:53 ` Philippe Mathieu-Daudé
2024-10-09 23:20 ` Pierrick Bouvier
1 sibling, 0 replies; 60+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-10-09 18:53 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 9/10/24 12:08, Richard Henderson wrote:
> Probably never happens, but next patches will assume non-zero length.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/cputlb.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index fd8da8586f..93b42d18ee 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -801,6 +801,9 @@ void tlb_flush_range_by_mmuidx(CPUState *cpu, vaddr addr,
> * If all bits are significant, and len is small,
> * this devolves to tlb_flush_page.
> */
> + if (len == 0) {
> + return;
> + }
Maybe clearer to move the check before the comment, otherwise:
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> if (bits >= TARGET_LONG_BITS && len <= TARGET_PAGE_SIZE) {
> tlb_flush_page_by_mmuidx(cpu, addr, idxmap);
> return;
> @@ -839,6 +842,9 @@ void tlb_flush_range_by_mmuidx_all_cpus_synced(CPUState *src_cpu,
> * If all bits are significant, and len is small,
> * this devolves to tlb_flush_page.
> */
> + if (len == 0) {
> + return;
> + }
> if (bits >= TARGET_LONG_BITS && len <= TARGET_PAGE_SIZE) {
> tlb_flush_page_by_mmuidx_all_cpus_synced(src_cpu, addr, idxmap);
> return;
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 02/23] accel/tcg: Split out tlbfast_flush_locked
2024-10-09 15:08 ` [PATCH 02/23] accel/tcg: Split out tlbfast_flush_locked Richard Henderson
@ 2024-10-09 18:54 ` Philippe Mathieu-Daudé
2024-10-09 22:53 ` Pierrick Bouvier
1 sibling, 0 replies; 60+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-10-09 18:54 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 9/10/24 12:08, Richard Henderson wrote:
> We will have a need to flush only the "fast" portion
> of the tlb, allowing re-fill from the "full" portion.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/cputlb.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 01/23] util/interval-tree: Introduce interval_tree_free_nodes
2024-10-09 15:08 ` [PATCH 01/23] util/interval-tree: Introduce interval_tree_free_nodes Richard Henderson
@ 2024-10-09 22:51 ` Pierrick Bouvier
0 siblings, 0 replies; 60+ messages in thread
From: Pierrick Bouvier @ 2024-10-09 22:51 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 10/9/24 08:08, Richard Henderson wrote:
> Provide a general-purpose release-all-nodes operation, that allows
> for the IntervalTreeNode to be embeded within a larger structure.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> include/qemu/interval-tree.h | 11 +++++++++++
> util/interval-tree.c | 20 ++++++++++++++++++++
> util/selfmap.c | 13 +------------
> 3 files changed, 32 insertions(+), 12 deletions(-)
>
> diff --git a/include/qemu/interval-tree.h b/include/qemu/interval-tree.h
> index 25006debe8..d90ea6d17f 100644
> --- a/include/qemu/interval-tree.h
> +++ b/include/qemu/interval-tree.h
> @@ -96,4 +96,15 @@ IntervalTreeNode *interval_tree_iter_first(IntervalTreeRoot *root,
> IntervalTreeNode *interval_tree_iter_next(IntervalTreeNode *node,
> uint64_t start, uint64_t last);
>
> +/**
> + * interval_tree_free_nodes:
> + * @root: root of the tree
> + * @it_offset: offset from outermost type to IntervalTreeNode
> + *
> + * Free, via g_free, all nodes under @root. IntervalTreeNode may
> + * not be the true type of the nodes allocated; @it_offset gives
> + * the offset from the outermost type to the IntervalTreeNode member.
> + */
> +void interval_tree_free_nodes(IntervalTreeRoot *root, size_t it_offset);
> +
> #endif /* QEMU_INTERVAL_TREE_H */
> diff --git a/util/interval-tree.c b/util/interval-tree.c
> index 53465182e6..663d3ec222 100644
> --- a/util/interval-tree.c
> +++ b/util/interval-tree.c
> @@ -639,6 +639,16 @@ static void rb_erase_augmented_cached(RBNode *node, RBRootLeftCached *root,
> rb_erase_augmented(node, &root->rb_root, augment);
> }
>
> +static void rb_node_free(RBNode *rb, size_t rb_offset)
> +{
> + if (rb->rb_left) {
> + rb_node_free(rb->rb_left, rb_offset);
> + }
> + if (rb->rb_right) {
> + rb_node_free(rb->rb_right, rb_offset);
> + }
> + g_free((void *)rb - rb_offset);
> +}
>
> /*
> * Interval trees.
> @@ -870,6 +880,16 @@ IntervalTreeNode *interval_tree_iter_next(IntervalTreeNode *node,
> }
> }
>
> +void interval_tree_free_nodes(IntervalTreeRoot *root, size_t it_offset)
> +{
> + if (root && root->rb_root.rb_node) {
> + rb_node_free(root->rb_root.rb_node,
> + it_offset + offsetof(IntervalTreeNode, rb));
> + root->rb_root.rb_node = NULL;
> + root->rb_leftmost = NULL;
> + }
> +}
> +
> /* Occasionally useful for calling from within the debugger. */
> #if 0
> static void debug_interval_tree_int(IntervalTreeNode *node,
> diff --git a/util/selfmap.c b/util/selfmap.c
> index 483cb617e2..d2b86da301 100644
> --- a/util/selfmap.c
> +++ b/util/selfmap.c
> @@ -87,23 +87,12 @@ IntervalTreeRoot *read_self_maps(void)
> * @root: an interval tree
> *
> * Free a tree of MapInfo structures.
> - * Since we allocated each MapInfo in one chunk, we need not consider the
> - * contents and can simply free each RBNode.
> */
>
> -static void free_rbnode(RBNode *n)
> -{
> - if (n) {
> - free_rbnode(n->rb_left);
> - free_rbnode(n->rb_right);
> - g_free(n);
> - }
> -}
> -
> void free_self_maps(IntervalTreeRoot *root)
> {
> if (root) {
> - free_rbnode(root->rb_root.rb_node);
> + interval_tree_free_nodes(root, offsetof(MapInfo, itree));
> g_free(root);
> }
> }
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 02/23] accel/tcg: Split out tlbfast_flush_locked
2024-10-09 15:08 ` [PATCH 02/23] accel/tcg: Split out tlbfast_flush_locked Richard Henderson
2024-10-09 18:54 ` Philippe Mathieu-Daudé
@ 2024-10-09 22:53 ` Pierrick Bouvier
1 sibling, 0 replies; 60+ messages in thread
From: Pierrick Bouvier @ 2024-10-09 22:53 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 10/9/24 08:08, Richard Henderson wrote:
> We will have a need to flush only the "fast" portion
> of the tlb, allowing re-fill from the "full" portion.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/cputlb.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index b76a4eac4e..c1838412e8 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -284,13 +284,18 @@ static void tlb_mmu_resize_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast,
> }
> }
>
> -static void tlb_mmu_flush_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast)
> +static void tlbfast_flush_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast)
> {
> desc->n_used_entries = 0;
> + memset(fast->table, -1, sizeof_tlb(fast));
> +}
> +
> +static void tlb_mmu_flush_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast)
> +{
> + tlbfast_flush_locked(desc, fast);
> desc->large_page_addr = -1;
> desc->large_page_mask = -1;
> desc->vindex = 0;
> - memset(fast->table, -1, sizeof_tlb(fast));
> memset(desc->vtable, -1, sizeof(desc->vtable));
> }
>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 03/23] accel/tcg: Split out tlbfast_{index,entry}
2024-10-09 15:08 ` [PATCH 03/23] accel/tcg: Split out tlbfast_{index,entry} Richard Henderson
@ 2024-10-09 22:55 ` Pierrick Bouvier
0 siblings, 0 replies; 60+ messages in thread
From: Pierrick Bouvier @ 2024-10-09 22:55 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 10/9/24 08:08, Richard Henderson wrote:
> Often we already have the CPUTLBDescFast structure pointer.
> Allows future code simplification.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/cputlb.c | 16 ++++++++++++----
> 1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index c1838412e8..e37af24525 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -131,20 +131,28 @@ static inline uint64_t tlb_addr_write(const CPUTLBEntry *entry)
> return tlb_read_idx(entry, MMU_DATA_STORE);
> }
>
> +static inline uintptr_t tlbfast_index(CPUTLBDescFast *fast, vaddr addr)
> +{
> + return (addr >> TARGET_PAGE_BITS) & (fast->mask >> CPU_TLB_ENTRY_BITS);
> +}
> +
> +static inline CPUTLBEntry *tlbfast_entry(CPUTLBDescFast *fast, vaddr addr)
> +{
> + return fast->table + tlbfast_index(fast, addr);
> +}
> +
> /* Find the TLB index corresponding to the mmu_idx + address pair. */
> static inline uintptr_t tlb_index(CPUState *cpu, uintptr_t mmu_idx,
> vaddr addr)
> {
> - uintptr_t size_mask = cpu->neg.tlb.f[mmu_idx].mask >> CPU_TLB_ENTRY_BITS;
> -
> - return (addr >> TARGET_PAGE_BITS) & size_mask;
> + return tlbfast_index(&cpu->neg.tlb.f[mmu_idx], addr);
> }
>
> /* Find the TLB entry corresponding to the mmu_idx + address pair. */
> static inline CPUTLBEntry *tlb_entry(CPUState *cpu, uintptr_t mmu_idx,
> vaddr addr)
> {
> - return &cpu->neg.tlb.f[mmu_idx].table[tlb_index(cpu, mmu_idx, addr)];
> + return tlbfast_entry(&cpu->neg.tlb.f[mmu_idx], addr);
> }
>
> static void tlb_window_reset(CPUTLBDesc *desc, int64_t ns,
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 04/23] accel/tcg: Split out tlbfast_flush_range_locked
2024-10-09 15:08 ` [PATCH 04/23] accel/tcg: Split out tlbfast_flush_range_locked Richard Henderson
@ 2024-10-09 23:05 ` Pierrick Bouvier
2024-10-10 1:20 ` Richard Henderson
0 siblings, 1 reply; 60+ messages in thread
From: Pierrick Bouvier @ 2024-10-09 23:05 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 10/9/24 08:08, Richard Henderson wrote:
> While this may at present be overly complicated for use
> by single page flushes, do so with the expectation that
> this will eventually allow simplification of large pages.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/cputlb.c | 61 +++++++++++++++++++++++++---------------------
> 1 file changed, 33 insertions(+), 28 deletions(-)
>
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index e37af24525..6773874f2d 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -520,10 +520,37 @@ static inline void tlb_flush_vtlb_page_locked(CPUState *cpu, int mmu_idx,
> tlb_flush_vtlb_page_mask_locked(cpu, mmu_idx, page, -1);
> }
>
> +static void tlbfast_flush_range_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast,
> + vaddr addr, vaddr len, vaddr mask)
> +{
> + /*
> + * If @mask is smaller than the tlb size, there may be multiple entries
> + * within the TLB; for now, just flush the entire TLB.
> + * Otherwise all addresses that match under @mask hit the same TLB entry.
> + *
> + * If @len is larger than the tlb size, then it will take longer to
> + * test all of the entries in the TLB than it will to flush it all.
> + */
> + if (mask < fast->mask || len > fast->mask) {
> + tlbfast_flush_locked(desc, fast);
> + return;
> + }
> +
> + for (vaddr i = 0; i < len; i += TARGET_PAGE_SIZE) {
> + vaddr page = addr + i;
> + CPUTLBEntry *entry = tlbfast_entry(fast, page);
> +
> + if (tlb_flush_entry_mask_locked(entry, page, mask)) {
> + desc->n_used_entries--;
> + }
> + }
> +}
> +
> static void tlb_flush_page_locked(CPUState *cpu, int midx, vaddr page)
> {
> - vaddr lp_addr = cpu->neg.tlb.d[midx].large_page_addr;
> - vaddr lp_mask = cpu->neg.tlb.d[midx].large_page_mask;
> + CPUTLBDesc *desc = &cpu->neg.tlb.d[midx];
> + vaddr lp_addr = desc->large_page_addr;
> + vaddr lp_mask = desc->large_page_mask;
>
> /* Check if we need to flush due to large pages. */
> if ((page & lp_mask) == lp_addr) {
> @@ -532,9 +559,8 @@ static void tlb_flush_page_locked(CPUState *cpu, int midx, vaddr page)
> midx, lp_addr, lp_mask);
> tlb_flush_one_mmuidx_locked(cpu, midx, get_clock_realtime());
> } else {
> - if (tlb_flush_entry_locked(tlb_entry(cpu, midx, page), page)) {
> - tlb_n_used_entries_dec(cpu, midx);
> - }
> + tlbfast_flush_range_locked(desc, &cpu->neg.tlb.f[midx],
> + page, TARGET_PAGE_SIZE, -1);
> tlb_flush_vtlb_page_locked(cpu, midx, page);
> }
> }
> @@ -689,24 +715,6 @@ static void tlb_flush_range_locked(CPUState *cpu, int midx,
> CPUTLBDescFast *f = &cpu->neg.tlb.f[midx];
> vaddr mask = MAKE_64BIT_MASK(0, bits);
>
> - /*
> - * If @bits is smaller than the tlb size, there may be multiple entries
> - * within the TLB; otherwise all addresses that match under @mask hit
> - * the same TLB entry.
> - * TODO: Perhaps allow bits to be a few bits less than the size.
> - * For now, just flush the entire TLB.
> - *
> - * If @len is larger than the tlb size, then it will take longer to
> - * test all of the entries in the TLB than it will to flush it all.
> - */
> - if (mask < f->mask || len > f->mask) {
> - tlb_debug("forcing full flush midx %d ("
> - "%016" VADDR_PRIx "/%016" VADDR_PRIx "+%016" VADDR_PRIx ")\n",
> - midx, addr, mask, len);
> - tlb_flush_one_mmuidx_locked(cpu, midx, get_clock_realtime());
> - return;
> - }
> -
> /*
> * Check if we need to flush due to large pages.
> * Because large_page_mask contains all 1's from the msb,
> @@ -720,13 +728,10 @@ static void tlb_flush_range_locked(CPUState *cpu, int midx,
> return;
> }
>
> + tlbfast_flush_range_locked(d, f, addr, len, mask);
> +
> for (vaddr i = 0; i < len; i += TARGET_PAGE_SIZE) {
> vaddr page = addr + i;
> - CPUTLBEntry *entry = tlb_entry(cpu, midx, page);
> -
> - if (tlb_flush_entry_mask_locked(entry, page, mask)) {
> - tlb_n_used_entries_dec(cpu, midx);
> - }
> tlb_flush_vtlb_page_mask_locked(cpu, midx, page, mask);
> }
> }
Why don't we have the same kind of change for
tlb_flush_vtlb_page_mask_locked?
We now have two loops (one for the entry mask, and one for the page mask).
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 05/23] accel/tcg: Fix flags usage in mmu_lookup1, atomic_mmu_lookup
2024-10-09 15:08 ` [PATCH 05/23] accel/tcg: Fix flags usage in mmu_lookup1, atomic_mmu_lookup Richard Henderson
@ 2024-10-09 23:18 ` Pierrick Bouvier
0 siblings, 0 replies; 60+ messages in thread
From: Pierrick Bouvier @ 2024-10-09 23:18 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 10/9/24 08:08, Richard Henderson wrote:
> The INVALID bit should only be auto-cleared when we have
> just called tlb_fill, not along the victim_tlb_hit path.
>
> In atomic_mmu_lookup, rename tlb_addr to flags, as that
> is what we're actually carrying around.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/cputlb.c | 33 ++++++++++++++++++++++-----------
> 1 file changed, 22 insertions(+), 11 deletions(-)
>
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 6773874f2d..fd8da8586f 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -1657,7 +1657,7 @@ static bool mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
> uint64_t tlb_addr = tlb_read_idx(entry, access_type);
> bool maybe_resized = false;
> CPUTLBEntryFull *full;
> - int flags;
> + int flags = TLB_FLAGS_MASK & ~TLB_FORCE_SLOW;
>
> /* If the TLB entry is for a different page, reload and try again. */
> if (!tlb_hit(tlb_addr, addr)) {
> @@ -1668,8 +1668,14 @@ static bool mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
> maybe_resized = true;
> index = tlb_index(cpu, mmu_idx, addr);
> entry = tlb_entry(cpu, mmu_idx, addr);
> + /*
> + * With PAGE_WRITE_INV, we set TLB_INVALID_MASK immediately,
> + * to force the next access through tlb_fill. We've just
> + * called tlb_fill, so we know that this entry *is* valid.
> + */
> + flags &= ~TLB_INVALID_MASK;
> }
> - tlb_addr = tlb_read_idx(entry, access_type) & ~TLB_INVALID_MASK;
> + tlb_addr = tlb_read_idx(entry, access_type);
> }
>
> full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
> @@ -1819,10 +1825,10 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
> MemOp mop = get_memop(oi);
> uintptr_t index;
> CPUTLBEntry *tlbe;
> - vaddr tlb_addr;
> void *hostaddr;
> CPUTLBEntryFull *full;
> bool did_tlb_fill = false;
> + int flags;
>
> tcg_debug_assert(mmu_idx < NB_MMU_MODES);
>
> @@ -1833,8 +1839,8 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
> tlbe = tlb_entry(cpu, mmu_idx, addr);
>
> /* Check TLB entry and enforce page permissions. */
> - tlb_addr = tlb_addr_write(tlbe);
> - if (!tlb_hit(tlb_addr, addr)) {
> + flags = TLB_FLAGS_MASK;
> + if (!tlb_hit(tlb_addr_write(tlbe), addr)) {
> if (!victim_tlb_hit(cpu, mmu_idx, index, MMU_DATA_STORE,
> addr & TARGET_PAGE_MASK)) {
> tlb_fill_align(cpu, addr, MMU_DATA_STORE, mmu_idx,
> @@ -1842,8 +1848,13 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
> did_tlb_fill = true;
> index = tlb_index(cpu, mmu_idx, addr);
> tlbe = tlb_entry(cpu, mmu_idx, addr);
> + /*
> + * With PAGE_WRITE_INV, we set TLB_INVALID_MASK immediately,
> + * to force the next access through tlb_fill. We've just
> + * called tlb_fill, so we know that this entry *is* valid.
> + */
> + flags &= ~TLB_INVALID_MASK;
> }
> - tlb_addr = tlb_addr_write(tlbe) & ~TLB_INVALID_MASK;
> }
>
> /*
> @@ -1879,11 +1890,11 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
> goto stop_the_world;
> }
>
> - /* Collect tlb flags for read. */
> - tlb_addr |= tlbe->addr_read;
> + /* Collect tlb flags for read and write. */
> + flags &= tlbe->addr_read | tlb_addr_write(tlbe);
>
> /* Notice an IO access or a needs-MMU-lookup access */
> - if (unlikely(tlb_addr & (TLB_MMIO | TLB_DISCARD_WRITE))) {
> + if (unlikely(flags & (TLB_MMIO | TLB_DISCARD_WRITE))) {
> /* There's really nothing that can be done to
> support this apart from stop-the-world. */
> goto stop_the_world;
> @@ -1892,11 +1903,11 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
> hostaddr = (void *)((uintptr_t)addr + tlbe->addend);
> full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
>
> - if (unlikely(tlb_addr & TLB_NOTDIRTY)) {
> + if (unlikely(flags & TLB_NOTDIRTY)) {
> notdirty_write(cpu, addr, size, full, retaddr);
> }
>
> - if (unlikely(tlb_addr & TLB_FORCE_SLOW)) {
> + if (unlikely(flags & TLB_FORCE_SLOW)) {
> int wp_flags = 0;
>
> if (full->slow_flags[MMU_DATA_STORE] & TLB_WATCHPOINT) {
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 06/23] accel/tcg: Early exit for zero length in tlb_flush_range_by_mmuidx*
2024-10-09 15:08 ` [PATCH 06/23] accel/tcg: Early exit for zero length in tlb_flush_range_by_mmuidx* Richard Henderson
2024-10-09 18:53 ` Philippe Mathieu-Daudé
@ 2024-10-09 23:20 ` Pierrick Bouvier
1 sibling, 0 replies; 60+ messages in thread
From: Pierrick Bouvier @ 2024-10-09 23:20 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 10/9/24 08:08, Richard Henderson wrote:
> Probably never happens, but next patches will assume non-zero length.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/cputlb.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index fd8da8586f..93b42d18ee 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -801,6 +801,9 @@ void tlb_flush_range_by_mmuidx(CPUState *cpu, vaddr addr,
> * If all bits are significant, and len is small,
> * this devolves to tlb_flush_page.
> */
> + if (len == 0) {
> + return;
> + }
> if (bits >= TARGET_LONG_BITS && len <= TARGET_PAGE_SIZE) {
> tlb_flush_page_by_mmuidx(cpu, addr, idxmap);
> return;
> @@ -839,6 +842,9 @@ void tlb_flush_range_by_mmuidx_all_cpus_synced(CPUState *src_cpu,
> * If all bits are significant, and len is small,
> * this devolves to tlb_flush_page.
> */
> + if (len == 0) {
> + return;
> + }
> if (bits >= TARGET_LONG_BITS && len <= TARGET_PAGE_SIZE) {
> tlb_flush_page_by_mmuidx_all_cpus_synced(src_cpu, addr, idxmap);
> return;
If this "probably never happens", wouldn't that be better to assert that
len is strictly positive?
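Something like this (sketch), at the top of both functions, instead of the
early return:

      tcg_debug_assert(len != 0);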
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 07/23] accel/tcg: Flush entire tlb when a masked range wraps
2024-10-09 15:08 ` [PATCH 07/23] accel/tcg: Flush entire tlb when a masked range wraps Richard Henderson
@ 2024-10-09 23:28 ` Pierrick Bouvier
0 siblings, 0 replies; 60+ messages in thread
From: Pierrick Bouvier @ 2024-10-09 23:28 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 10/9/24 08:08, Richard Henderson wrote:
> We expect masked address spaces to be quite large, e.g. 56 bits
> for AArch64 top-byte-ignore mode. We do not expect addr+len to
> wrap around, but it is possible with AArch64 guest flush range
> instructions.
>
> Convert this unlikely case to a full tlb flush. This can simplify
> the subroutines actually performing the range flush.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/cputlb.c | 16 ++++++++++++----
> 1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 93b42d18ee..8affa25db3 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -808,8 +808,12 @@ void tlb_flush_range_by_mmuidx(CPUState *cpu, vaddr addr,
> tlb_flush_page_by_mmuidx(cpu, addr, idxmap);
> return;
> }
> - /* If no page bits are significant, this devolves to tlb_flush. */
> - if (bits < TARGET_PAGE_BITS) {
> + /*
> + * If no page bits are significant, this devolves to full flush.
> + * If addr+len wraps in len bits, fall back to full flush.
> + */
> + if (bits < TARGET_PAGE_BITS
> + || (bits < TARGET_LONG_BITS && (addr ^ (addr + len - 1)) >> bits)) {
> tlb_flush_by_mmuidx(cpu, idxmap);
> return;
> }
> @@ -849,8 +853,12 @@ void tlb_flush_range_by_mmuidx_all_cpus_synced(CPUState *src_cpu,
> tlb_flush_page_by_mmuidx_all_cpus_synced(src_cpu, addr, idxmap);
> return;
> }
> - /* If no page bits are significant, this devolves to tlb_flush. */
> - if (bits < TARGET_PAGE_BITS) {
> + /*
> + * If no page bits are significant, this devolves to full flush.
> + * If addr+len wraps in len bits, fall back to full flush.
> + */
> + if (bits < TARGET_PAGE_BITS
> + || (bits < TARGET_LONG_BITS && (addr ^ (addr + len - 1)) >> bits)) {
> tlb_flush_by_mmuidx_all_cpus_synced(src_cpu, idxmap);
> return;
> }
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 08/23] accel/tcg: Add IntervalTreeRoot to CPUTLBDesc
2024-10-09 15:08 ` [PATCH 08/23] accel/tcg: Add IntervalTreeRoot to CPUTLBDesc Richard Henderson
@ 2024-10-09 23:31 ` Pierrick Bouvier
0 siblings, 0 replies; 60+ messages in thread
From: Pierrick Bouvier @ 2024-10-09 23:31 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 10/9/24 08:08, Richard Henderson wrote:
> Add the data structures for tracking softmmu pages via
> a balanced interval tree. So far, only initialize and
> destroy the data structure.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> include/hw/core/cpu.h | 3 +++
> accel/tcg/cputlb.c | 11 +++++++++++
> 2 files changed, 14 insertions(+)
>
> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> index d21a24c82f..b567abe3e2 100644
> --- a/include/hw/core/cpu.h
> +++ b/include/hw/core/cpu.h
> @@ -34,6 +34,7 @@
> #include "qemu/rcu_queue.h"
> #include "qemu/queue.h"
> #include "qemu/thread.h"
> +#include "qemu/interval-tree.h"
> #include "qom/object.h"
>
> typedef int (*WriteCoreDumpFunction)(const void *buf, size_t size,
> @@ -287,6 +288,8 @@ typedef struct CPUTLBDesc {
> CPUTLBEntry vtable[CPU_VTLB_SIZE];
> CPUTLBEntryFull vfulltlb[CPU_VTLB_SIZE];
> CPUTLBEntryFull *fulltlb;
> + /* All active tlb entries for this address space. */
> + IntervalTreeRoot iroot;
> } CPUTLBDesc;
>
> /*
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 8affa25db3..435c2dc132 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -89,6 +89,13 @@ QEMU_BUILD_BUG_ON(sizeof(vaddr) > sizeof(run_on_cpu_data));
> QEMU_BUILD_BUG_ON(NB_MMU_MODES > 16);
> #define ALL_MMUIDX_BITS ((1 << NB_MMU_MODES) - 1)
>
> +/* Extra data required to manage CPUTLBEntryFull within an interval tree. */
> +typedef struct CPUTLBEntryTree {
> + IntervalTreeNode itree;
> + CPUTLBEntry copy;
> + CPUTLBEntryFull full;
> +} CPUTLBEntryTree;
> +
> static inline size_t tlb_n_entries(CPUTLBDescFast *fast)
> {
> return (fast->mask >> CPU_TLB_ENTRY_BITS) + 1;
> @@ -305,6 +312,7 @@ static void tlb_mmu_flush_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast)
> desc->large_page_mask = -1;
> desc->vindex = 0;
> memset(desc->vtable, -1, sizeof(desc->vtable));
> + interval_tree_free_nodes(&desc->iroot, offsetof(CPUTLBEntryTree, itree));
> }
>
> static void tlb_flush_one_mmuidx_locked(CPUState *cpu, int mmu_idx,
> @@ -326,6 +334,7 @@ static void tlb_mmu_init(CPUTLBDesc *desc, CPUTLBDescFast *fast, int64_t now)
> fast->mask = (n_entries - 1) << CPU_TLB_ENTRY_BITS;
> fast->table = g_new(CPUTLBEntry, n_entries);
> desc->fulltlb = g_new(CPUTLBEntryFull, n_entries);
> + memset(&desc->iroot, 0, sizeof(desc->iroot));
> tlb_mmu_flush_locked(desc, fast);
> }
>
> @@ -365,6 +374,8 @@ void tlb_destroy(CPUState *cpu)
>
> g_free(fast->table);
> g_free(desc->fulltlb);
> + interval_tree_free_nodes(&cpu->neg.tlb.d[i].iroot,
> + offsetof(CPUTLBEntryTree, itree));
> }
> }
>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 09/23] accel/tcg: Populate IntervalTree in tlb_set_page_full
2024-10-09 15:08 ` [PATCH 09/23] accel/tcg: Populate IntervalTree in tlb_set_page_full Richard Henderson
@ 2024-10-09 23:50 ` Pierrick Bouvier
0 siblings, 0 replies; 60+ messages in thread
From: Pierrick Bouvier @ 2024-10-09 23:50 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 10/9/24 08:08, Richard Henderson wrote:
> Add or replace an entry in the IntervalTree for each
> page installed into softmmu. We do not yet use the
> tree for anything else.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/cputlb.c | 34 ++++++++++++++++++++++++++++------
> 1 file changed, 28 insertions(+), 6 deletions(-)
>
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 435c2dc132..d964e1b2e8 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -305,6 +305,17 @@ static void tlbfast_flush_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast)
> memset(fast->table, -1, sizeof_tlb(fast));
> }
>
> +static CPUTLBEntryTree *tlbtree_lookup_range(CPUTLBDesc *desc, vaddr s, vaddr l)
> +{
> + IntervalTreeNode *i = interval_tree_iter_first(&desc->iroot, s, l);
> + return i ? container_of(i, CPUTLBEntryTree, itree) : NULL;
> +}
> +
> +static CPUTLBEntryTree *tlbtree_lookup_addr(CPUTLBDesc *desc, vaddr addr)
> +{
> + return tlbtree_lookup_range(desc, addr, addr);
> +}
> +
> static void tlb_mmu_flush_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast)
> {
> tlbfast_flush_locked(desc, fast);
> @@ -1086,7 +1097,8 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
> MemoryRegionSection *section;
> unsigned int index, read_flags, write_flags;
> uintptr_t addend;
> - CPUTLBEntry *te, tn;
> + CPUTLBEntry *te;
> + CPUTLBEntryTree *node;
> hwaddr iotlb, xlat, sz, paddr_page;
> vaddr addr_page;
> int asidx, wp_flags, prot;
> @@ -1194,6 +1206,15 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
> tlb_n_used_entries_dec(cpu, mmu_idx);
> }
>
> + /* Replace an old IntervalTree entry, or create a new one. */
> + node = tlbtree_lookup_addr(desc, addr_page);
> + if (!node) {
> + node = g_new(CPUTLBEntryTree, 1);
> + node->itree.start = addr_page;
> + node->itree.last = addr_page + TARGET_PAGE_SIZE - 1;
> + interval_tree_insert(&node->itree, &desc->iroot);
> + }
> +
> /* refill the tlb */
> /*
> * When memory region is ram, iotlb contains a TARGET_PAGE_BITS
> @@ -1215,15 +1236,15 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
> full->phys_addr = paddr_page;
>
> /* Now calculate the new entry */
> - tn.addend = addend - addr_page;
> + node->copy.addend = addend - addr_page;
>
> - tlb_set_compare(full, &tn, addr_page, read_flags,
> + tlb_set_compare(full, &node->copy, addr_page, read_flags,
> MMU_INST_FETCH, prot & PAGE_EXEC);
>
> if (wp_flags & BP_MEM_READ) {
> read_flags |= TLB_WATCHPOINT;
> }
> - tlb_set_compare(full, &tn, addr_page, read_flags,
> + tlb_set_compare(full, &node->copy, addr_page, read_flags,
> MMU_DATA_LOAD, prot & PAGE_READ);
>
> if (prot & PAGE_WRITE_INV) {
> @@ -1232,10 +1253,11 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
> if (wp_flags & BP_MEM_WRITE) {
> write_flags |= TLB_WATCHPOINT;
> }
> - tlb_set_compare(full, &tn, addr_page, write_flags,
> + tlb_set_compare(full, &node->copy, addr_page, write_flags,
> MMU_DATA_STORE, prot & PAGE_WRITE);
>
> - copy_tlb_helper_locked(te, &tn);
> + node->full = *full;
> + copy_tlb_helper_locked(te, &node->copy);
> tlb_n_used_entries_inc(cpu, mmu_idx);
> qemu_spin_unlock(&tlb->c.lock);
> }
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 10/23] accel/tcg: Remove IntervalTree entry in tlb_flush_page_locked
2024-10-09 15:08 ` [PATCH 10/23] accel/tcg: Remove IntervalTree entry in tlb_flush_page_locked Richard Henderson
@ 2024-10-09 23:53 ` Pierrick Bouvier
0 siblings, 0 replies; 60+ messages in thread
From: Pierrick Bouvier @ 2024-10-09 23:53 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 10/9/24 08:08, Richard Henderson wrote:
> Flush a page from the IntervalTree cache.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/cputlb.c | 16 ++++++++++++----
> 1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index d964e1b2e8..772656c7f8 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -573,6 +573,7 @@ static void tlb_flush_page_locked(CPUState *cpu, int midx, vaddr page)
> CPUTLBDesc *desc = &cpu->neg.tlb.d[midx];
> vaddr lp_addr = desc->large_page_addr;
> vaddr lp_mask = desc->large_page_mask;
> + CPUTLBEntryTree *node;
>
> /* Check if we need to flush due to large pages. */
> if ((page & lp_mask) == lp_addr) {
> @@ -580,10 +581,17 @@ static void tlb_flush_page_locked(CPUState *cpu, int midx, vaddr page)
> VADDR_PRIx "/%016" VADDR_PRIx ")\n",
> midx, lp_addr, lp_mask);
> tlb_flush_one_mmuidx_locked(cpu, midx, get_clock_realtime());
> - } else {
> - tlbfast_flush_range_locked(desc, &cpu->neg.tlb.f[midx],
> - page, TARGET_PAGE_SIZE, -1);
> - tlb_flush_vtlb_page_locked(cpu, midx, page);
> + return;
> + }
> +
> + tlbfast_flush_range_locked(desc, &cpu->neg.tlb.f[midx],
> + page, TARGET_PAGE_SIZE, -1);
> + tlb_flush_vtlb_page_locked(cpu, midx, page);
> +
> + node = tlbtree_lookup_addr(desc, page);
> + if (node) {
> + interval_tree_remove(&node->itree, &desc->iroot);
> + g_free(node);
> }
> }
>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 11/23] accel/tcg: Remove IntervalTree entries in tlb_flush_range_locked
2024-10-09 15:08 ` [PATCH 11/23] accel/tcg: Remove IntervalTree entries in tlb_flush_range_locked Richard Henderson
@ 2024-10-09 23:57 ` Pierrick Bouvier
0 siblings, 0 replies; 60+ messages in thread
From: Pierrick Bouvier @ 2024-10-09 23:57 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 10/9/24 08:08, Richard Henderson wrote:
> Flush a masked range of pages from the IntervalTree cache.
> When the mask is not used there is a redundant comparison,
> but that is better than duplicating code at this point.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/cputlb.c | 25 +++++++++++++++++++++++++
> 1 file changed, 25 insertions(+)
>
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 772656c7f8..709ad75616 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -311,6 +311,13 @@ static CPUTLBEntryTree *tlbtree_lookup_range(CPUTLBDesc *desc, vaddr s, vaddr l)
> return i ? container_of(i, CPUTLBEntryTree, itree) : NULL;
> }
>
> +static CPUTLBEntryTree *tlbtree_lookup_range_next(CPUTLBEntryTree *prev,
> + vaddr s, vaddr l)
> +{
> + IntervalTreeNode *i = interval_tree_iter_next(&prev->itree, s, l);
> + return i ? container_of(i, CPUTLBEntryTree, itree) : NULL;
> +}
> +
> static CPUTLBEntryTree *tlbtree_lookup_addr(CPUTLBDesc *desc, vaddr addr)
> {
> return tlbtree_lookup_range(desc, addr, addr);
> @@ -744,6 +751,8 @@ static void tlb_flush_range_locked(CPUState *cpu, int midx,
> CPUTLBDesc *d = &cpu->neg.tlb.d[midx];
> CPUTLBDescFast *f = &cpu->neg.tlb.f[midx];
> vaddr mask = MAKE_64BIT_MASK(0, bits);
> + CPUTLBEntryTree *node;
> + vaddr addr_mask, last_mask, last_imask;
>
> /*
> * Check if we need to flush due to large pages.
> @@ -764,6 +773,22 @@ static void tlb_flush_range_locked(CPUState *cpu, int midx,
> vaddr page = addr + i;
> tlb_flush_vtlb_page_mask_locked(cpu, midx, page, mask);
> }
> +
> + addr_mask = addr & mask;
> + last_mask = addr_mask + len - 1;
> + last_imask = last_mask | ~mask;
> + node = tlbtree_lookup_range(d, addr_mask, last_imask);
> + while (node) {
> + CPUTLBEntryTree *next =
> + tlbtree_lookup_range_next(node, addr_mask, last_imask);
> + vaddr page_mask = node->itree.start & mask;
> +
> + if (page_mask >= addr_mask && page_mask < last_mask) {
> + interval_tree_remove(&node->itree, &d->iroot);
> + g_free(node);
> + }
> + node = next;
> + }
> }
>
> typedef struct {
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 12/23] accel/tcg: Process IntervalTree entries in tlb_reset_dirty
2024-10-09 15:08 ` [PATCH 12/23] accel/tcg: Process IntervalTree entries in tlb_reset_dirty Richard Henderson
@ 2024-10-10 0:03 ` Pierrick Bouvier
0 siblings, 0 replies; 60+ messages in thread
From: Pierrick Bouvier @ 2024-10-10 0:03 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 10/9/24 08:08, Richard Henderson wrote:
> Update the addr_write copy within each interval tree node.
> Tidy the iteration within the other two loops as well.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/cputlb.c | 19 +++++++++++--------
> 1 file changed, 11 insertions(+), 8 deletions(-)
>
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 709ad75616..95f78afee6 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -1024,17 +1024,20 @@ void tlb_reset_dirty(CPUState *cpu, ram_addr_t start1, ram_addr_t length)
>
> qemu_spin_lock(&cpu->neg.tlb.c.lock);
> for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
> - unsigned int i;
> - unsigned int n = tlb_n_entries(&cpu->neg.tlb.f[mmu_idx]);
> + CPUTLBDesc *desc = &cpu->neg.tlb.d[mmu_idx];
> + CPUTLBDescFast *fast = &cpu->neg.tlb.f[mmu_idx];
>
> - for (i = 0; i < n; i++) {
> - tlb_reset_dirty_range_locked(&cpu->neg.tlb.f[mmu_idx].table[i],
> - start1, length);
> + for (size_t i = 0, n = tlb_n_entries(fast); i < n; i++) {
> + tlb_reset_dirty_range_locked(&fast->table[i], start1, length);
> }
>
> - for (i = 0; i < CPU_VTLB_SIZE; i++) {
> - tlb_reset_dirty_range_locked(&cpu->neg.tlb.d[mmu_idx].vtable[i],
> - start1, length);
> + for (size_t i = 0; i < CPU_VTLB_SIZE; i++) {
> + tlb_reset_dirty_range_locked(&desc->vtable[i], start1, length);
> + }
> +
> + for (CPUTLBEntryTree *t = tlbtree_lookup_range(desc, 0, -1); t;
> + t = tlbtree_lookup_range_next(t, 0, -1)) {
How about introducing an interval_tree_foreach function that visits every
node? Iterating over [0, 0xffffffff...] does the job, but the intent is not
really obvious when reading the code.
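For instance (a sketch on top of the existing iterator API; the macro name
is made up):

  #define interval_tree_foreach(node, root)                       \
      for (node = interval_tree_iter_first(root, 0, -1); node;    \
           node = interval_tree_iter_next(node, 0, -1))

which would make the "walk every node" intent explicit at the call site.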
> + tlb_reset_dirty_range_locked(&t->copy, start1, length);
> }
> }
> qemu_spin_unlock(&cpu->neg.tlb.c.lock);
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 13/23] accel/tcg: Process IntervalTree entries in tlb_set_dirty
2024-10-09 15:08 ` [PATCH 13/23] accel/tcg: Process IntervalTree entries in tlb_set_dirty Richard Henderson
@ 2024-10-10 0:04 ` Pierrick Bouvier
0 siblings, 0 replies; 60+ messages in thread
From: Pierrick Bouvier @ 2024-10-10 0:04 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 10/9/24 08:08, Richard Henderson wrote:
> Update the addr_write copy within an interval tree node.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/cputlb.c | 17 +++++++++++------
> 1 file changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 95f78afee6..ec989f1290 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -1063,13 +1063,18 @@ static void tlb_set_dirty(CPUState *cpu, vaddr addr)
> addr &= TARGET_PAGE_MASK;
> qemu_spin_lock(&cpu->neg.tlb.c.lock);
> for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
> - tlb_set_dirty1_locked(tlb_entry(cpu, mmu_idx, addr), addr);
> - }
> + CPUTLBDesc *desc = &cpu->neg.tlb.d[mmu_idx];
> + CPUTLBEntryTree *node;
>
> - for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
> - int k;
> - for (k = 0; k < CPU_VTLB_SIZE; k++) {
> - tlb_set_dirty1_locked(&cpu->neg.tlb.d[mmu_idx].vtable[k], addr);
> + tlb_set_dirty1_locked(tlb_entry(cpu, mmu_idx, addr), addr);
> +
> + for (int k = 0; k < CPU_VTLB_SIZE; k++) {
> + tlb_set_dirty1_locked(&desc->vtable[k], addr);
> + }
> +
> + node = tlbtree_lookup_addr(desc, addr);
> + if (node) {
> + tlb_set_dirty1_locked(&node->copy, addr);
> }
> }
> qemu_spin_unlock(&cpu->neg.tlb.c.lock);
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 14/23] accel/tcg: Replace victim_tlb_hit with tlbtree_hit
2024-10-09 15:08 ` [PATCH 14/23] accel/tcg: Replace victim_tlb_hit with tlbtree_hit Richard Henderson
@ 2024-10-10 0:10 ` Pierrick Bouvier
2024-10-10 19:29 ` Richard Henderson
0 siblings, 1 reply; 60+ messages in thread
From: Pierrick Bouvier @ 2024-10-10 0:10 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 10/9/24 08:08, Richard Henderson wrote:
> Change from a linear search on the victim tlb
> to a balanced binary tree search on the interval tree.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/cputlb.c | 62 +++++++++++++++++++++++-----------------------
> 1 file changed, 31 insertions(+), 31 deletions(-)
>
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index ec989f1290..b10b0a357c 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -1398,36 +1398,38 @@ static void io_failed(CPUState *cpu, CPUTLBEntryFull *full, vaddr addr,
> }
> }
>
> -/* Return true if ADDR is present in the victim tlb, and has been copied
> - back to the main tlb. */
> -static bool victim_tlb_hit(CPUState *cpu, size_t mmu_idx, size_t index,
> - MMUAccessType access_type, vaddr page)
> +/*
> + * Return true if ADDR is present in the interval tree,
> + * and has been copied back to the main tlb.
> + */
> +static bool tlbtree_hit(CPUState *cpu, int mmu_idx,
> + MMUAccessType access_type, vaddr addr)
> {
> - size_t vidx;
> + CPUTLBDesc *desc = &cpu->neg.tlb.d[mmu_idx];
> + CPUTLBDescFast *fast = &cpu->neg.tlb.f[mmu_idx];
> + CPUTLBEntryTree *node;
> + size_t index;
>
> assert_cpu_is_self(cpu);
> - for (vidx = 0; vidx < CPU_VTLB_SIZE; ++vidx) {
> - CPUTLBEntry *vtlb = &cpu->neg.tlb.d[mmu_idx].vtable[vidx];
> - uint64_t cmp = tlb_read_idx(vtlb, access_type);
> -
> - if (cmp == page) {
> - /* Found entry in victim tlb, swap tlb and iotlb. */
> - CPUTLBEntry tmptlb, *tlb = &cpu->neg.tlb.f[mmu_idx].table[index];
> -
> - qemu_spin_lock(&cpu->neg.tlb.c.lock);
> - copy_tlb_helper_locked(&tmptlb, tlb);
> - copy_tlb_helper_locked(tlb, vtlb);
> - copy_tlb_helper_locked(vtlb, &tmptlb);
> - qemu_spin_unlock(&cpu->neg.tlb.c.lock);
> -
> - CPUTLBEntryFull *f1 = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
> - CPUTLBEntryFull *f2 = &cpu->neg.tlb.d[mmu_idx].vfulltlb[vidx];
> - CPUTLBEntryFull tmpf;
> - tmpf = *f1; *f1 = *f2; *f2 = tmpf;
> - return true;
> - }
> + node = tlbtree_lookup_addr(desc, addr);
> + if (!node) {
> + /* There is no cached mapping for this page. */
> + return false;
> }
> - return false;
> +
> + if (!tlb_hit(tlb_read_idx(&node->copy, access_type), addr)) {
> + /* This access is not permitted. */
> + return false;
> + }
This is not something we were checking before. If this is an addition,
maybe it would be better to split this out of this commit. Or maybe I
missed a step in previous commits :)
> +
> + /* Install the cached entry. */
> + index = tlbfast_index(fast, addr);
> + qemu_spin_lock(&cpu->neg.tlb.c.lock);
> + copy_tlb_helper_locked(&fast->table[index], &node->copy);
> + qemu_spin_unlock(&cpu->neg.tlb.c.lock);
> +
> + desc->fulltlb[index] = node->full;
> + return true;
> }
>
> static void notdirty_write(CPUState *cpu, vaddr mem_vaddr, unsigned size,
> @@ -1469,7 +1471,7 @@ static int probe_access_internal(CPUState *cpu, vaddr addr,
> CPUTLBEntryFull *full;
>
> if (!tlb_hit_page(tlb_addr, page_addr)) {
> - if (!victim_tlb_hit(cpu, mmu_idx, index, access_type, page_addr)) {
> + if (!tlbtree_hit(cpu, mmu_idx, access_type, page_addr)) {
> if (!tlb_fill_align(cpu, addr, access_type, mmu_idx,
> 0, fault_size, nonfault, retaddr)) {
> /* Non-faulting page table read failed. */
> @@ -1749,8 +1751,7 @@ static bool mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
>
> /* If the TLB entry is for a different page, reload and try again. */
> if (!tlb_hit(tlb_addr, addr)) {
> - if (!victim_tlb_hit(cpu, mmu_idx, index, access_type,
> - addr & TARGET_PAGE_MASK)) {
> + if (!tlbtree_hit(cpu, mmu_idx, access_type, addr)) {
> tlb_fill_align(cpu, addr, access_type, mmu_idx,
> memop, data->size, false, ra);
> maybe_resized = true;
> @@ -1929,8 +1930,7 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
> /* Check TLB entry and enforce page permissions. */
> flags = TLB_FLAGS_MASK;
> if (!tlb_hit(tlb_addr_write(tlbe), addr)) {
> - if (!victim_tlb_hit(cpu, mmu_idx, index, MMU_DATA_STORE,
> - addr & TARGET_PAGE_MASK)) {
> + if (!tlbtree_hit(cpu, mmu_idx, MMU_DATA_STORE, addr)) {
> tlb_fill_align(cpu, addr, MMU_DATA_STORE, mmu_idx,
> mop, size, false, retaddr);
> did_tlb_fill = true;
Else, hurrah!
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 15/23] accel/tcg: Remove the victim tlb
2024-10-09 15:08 ` [PATCH 15/23] accel/tcg: Remove the victim tlb Richard Henderson
@ 2024-10-10 0:12 ` Pierrick Bouvier
0 siblings, 0 replies; 60+ messages in thread
From: Pierrick Bouvier @ 2024-10-10 0:12 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 10/9/24 08:08, Richard Henderson wrote:
> This has been functionally replaced by the IntervalTree.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> include/hw/core/cpu.h | 8 ------
> accel/tcg/cputlb.c | 64 -------------------------------------------
> 2 files changed, 72 deletions(-)
>
> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> index b567abe3e2..87b864f5c4 100644
> --- a/include/hw/core/cpu.h
> +++ b/include/hw/core/cpu.h
> @@ -198,9 +198,6 @@ struct CPUClass {
> */
> #define NB_MMU_MODES 16
>
> -/* Use a fully associative victim tlb of 8 entries. */
> -#define CPU_VTLB_SIZE 8
> -
> /*
> * The full TLB entry, which is not accessed by generated TCG code,
> * so the layout is not as critical as that of CPUTLBEntry. This is
> @@ -282,11 +279,6 @@ typedef struct CPUTLBDesc {
> /* maximum number of entries observed in the window */
> size_t window_max_entries;
> size_t n_used_entries;
> - /* The next index to use in the tlb victim table. */
> - size_t vindex;
> - /* The tlb victim table, in two parts. */
> - CPUTLBEntry vtable[CPU_VTLB_SIZE];
> - CPUTLBEntryFull vfulltlb[CPU_VTLB_SIZE];
> CPUTLBEntryFull *fulltlb;
> /* All active tlb entries for this address space. */
> IntervalTreeRoot iroot;
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index b10b0a357c..561f66c723 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -328,8 +328,6 @@ static void tlb_mmu_flush_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast)
> tlbfast_flush_locked(desc, fast);
> desc->large_page_addr = -1;
> desc->large_page_mask = -1;
> - desc->vindex = 0;
> - memset(desc->vtable, -1, sizeof(desc->vtable));
> interval_tree_free_nodes(&desc->iroot, offsetof(CPUTLBEntryTree, itree));
> }
>
> @@ -501,15 +499,6 @@ static inline bool tlb_hit_page_anyprot(CPUTLBEntry *tlb_entry, vaddr page)
> return tlb_hit_page_mask_anyprot(tlb_entry, page, -1);
> }
>
> -/**
> - * tlb_entry_is_empty - return true if the entry is not in use
> - * @te: pointer to CPUTLBEntry
> - */
> -static inline bool tlb_entry_is_empty(const CPUTLBEntry *te)
> -{
> - return te->addr_read == -1 && te->addr_write == -1 && te->addr_code == -1;
> -}
> -
> /* Called with tlb_c.lock held */
> static bool tlb_flush_entry_mask_locked(CPUTLBEntry *tlb_entry,
> vaddr page,
> @@ -527,28 +516,6 @@ static inline bool tlb_flush_entry_locked(CPUTLBEntry *tlb_entry, vaddr page)
> return tlb_flush_entry_mask_locked(tlb_entry, page, -1);
> }
>
> -/* Called with tlb_c.lock held */
> -static void tlb_flush_vtlb_page_mask_locked(CPUState *cpu, int mmu_idx,
> - vaddr page,
> - vaddr mask)
> -{
> - CPUTLBDesc *d = &cpu->neg.tlb.d[mmu_idx];
> - int k;
> -
> - assert_cpu_is_self(cpu);
> - for (k = 0; k < CPU_VTLB_SIZE; k++) {
> - if (tlb_flush_entry_mask_locked(&d->vtable[k], page, mask)) {
> - tlb_n_used_entries_dec(cpu, mmu_idx);
> - }
> - }
> -}
> -
> -static inline void tlb_flush_vtlb_page_locked(CPUState *cpu, int mmu_idx,
> - vaddr page)
> -{
> - tlb_flush_vtlb_page_mask_locked(cpu, mmu_idx, page, -1);
> -}
> -
> static void tlbfast_flush_range_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast,
> vaddr addr, vaddr len, vaddr mask)
> {
> @@ -593,7 +560,6 @@ static void tlb_flush_page_locked(CPUState *cpu, int midx, vaddr page)
>
> tlbfast_flush_range_locked(desc, &cpu->neg.tlb.f[midx],
> page, TARGET_PAGE_SIZE, -1);
> - tlb_flush_vtlb_page_locked(cpu, midx, page);
>
> node = tlbtree_lookup_addr(desc, page);
> if (node) {
> @@ -769,11 +735,6 @@ static void tlb_flush_range_locked(CPUState *cpu, int midx,
>
> tlbfast_flush_range_locked(d, f, addr, len, mask);
>
> - for (vaddr i = 0; i < len; i += TARGET_PAGE_SIZE) {
> - vaddr page = addr + i;
> - tlb_flush_vtlb_page_mask_locked(cpu, midx, page, mask);
> - }
> -
> addr_mask = addr & mask;
> last_mask = addr_mask + len - 1;
> last_imask = last_mask | ~mask;
> @@ -1031,10 +992,6 @@ void tlb_reset_dirty(CPUState *cpu, ram_addr_t start1, ram_addr_t length)
> tlb_reset_dirty_range_locked(&fast->table[i], start1, length);
> }
>
> - for (size_t i = 0; i < CPU_VTLB_SIZE; i++) {
> - tlb_reset_dirty_range_locked(&desc->vtable[i], start1, length);
> - }
> -
> for (CPUTLBEntryTree *t = tlbtree_lookup_range(desc, 0, -1); t;
> t = tlbtree_lookup_range_next(t, 0, -1)) {
> tlb_reset_dirty_range_locked(&t->copy, start1, length);
> @@ -1068,10 +1025,6 @@ static void tlb_set_dirty(CPUState *cpu, vaddr addr)
>
> tlb_set_dirty1_locked(tlb_entry(cpu, mmu_idx, addr), addr);
>
> - for (int k = 0; k < CPU_VTLB_SIZE; k++) {
> - tlb_set_dirty1_locked(&desc->vtable[k], addr);
> - }
> -
> node = tlbtree_lookup_addr(desc, addr);
> if (node) {
> tlb_set_dirty1_locked(&node->copy, addr);
> @@ -1230,23 +1183,6 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
> /* Note that the tlb is no longer clean. */
> tlb->c.dirty |= 1 << mmu_idx;
>
> - /* Make sure there's no cached translation for the new page. */
> - tlb_flush_vtlb_page_locked(cpu, mmu_idx, addr_page);
> -
> - /*
> - * Only evict the old entry to the victim tlb if it's for a
> - * different page; otherwise just overwrite the stale data.
> - */
> - if (!tlb_hit_page_anyprot(te, addr_page) && !tlb_entry_is_empty(te)) {
> - unsigned vidx = desc->vindex++ % CPU_VTLB_SIZE;
> - CPUTLBEntry *tv = &desc->vtable[vidx];
> -
> - /* Evict the old entry into the victim tlb. */
> - copy_tlb_helper_locked(tv, te);
> - desc->vfulltlb[vidx] = desc->fulltlb[index];
> - tlb_n_used_entries_dec(cpu, mmu_idx);
> - }
> -
> /* Replace an old IntervalTree entry, or create a new one. */
> node = tlbtree_lookup_addr(desc, addr_page);
> if (!node) {
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 16/23] include/exec/tlb-common: Move CPUTLBEntryFull from hw/core/cpu.h
2024-10-09 15:08 ` [PATCH 16/23] include/exec/tlb-common: Move CPUTLBEntryFull from hw/core/cpu.h Richard Henderson
@ 2024-10-10 0:17 ` Pierrick Bouvier
0 siblings, 0 replies; 60+ messages in thread
From: Pierrick Bouvier @ 2024-10-10 0:17 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 10/9/24 08:08, Richard Henderson wrote:
> CPUTLBEntryFull structures are no longer directly included within
> the CPUState structure. Move the structure definition out of cpu.h
> to reduce visibility.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> include/exec/tlb-common.h | 63 +++++++++++++++++++++++++++++++++++++++
> include/hw/core/cpu.h | 63 ---------------------------------------
> 2 files changed, 63 insertions(+), 63 deletions(-)
>
> diff --git a/include/exec/tlb-common.h b/include/exec/tlb-common.h
> index dc5a5faa0b..300f9fae67 100644
> --- a/include/exec/tlb-common.h
> +++ b/include/exec/tlb-common.h
> @@ -53,4 +53,67 @@ typedef struct CPUTLBDescFast {
> CPUTLBEntry *table;
> } CPUTLBDescFast QEMU_ALIGNED(2 * sizeof(void *));
>
> +/*
> + * The full TLB entry, which is not accessed by generated TCG code,
> + * so the layout is not as critical as that of CPUTLBEntry. This is
> + * also why we don't want to combine the two structs.
> + */
> +struct CPUTLBEntryFull {
> + /*
> + * @xlat_section contains:
> + * - in the lower TARGET_PAGE_BITS, a physical section number
> + * - with the lower TARGET_PAGE_BITS masked off, an offset which
> + * must be added to the virtual address to obtain:
> + * + the ram_addr_t of the target RAM (if the physical section
> + * number is PHYS_SECTION_NOTDIRTY or PHYS_SECTION_ROM)
> + * + the offset within the target MemoryRegion (otherwise)
> + */
> + hwaddr xlat_section;
> +
> + /*
> + * @phys_addr contains the physical address in the address space
> + * given by cpu_asidx_from_attrs(cpu, @attrs).
> + */
> + hwaddr phys_addr;
> +
> + /* @attrs contains the memory transaction attributes for the page. */
> + MemTxAttrs attrs;
> +
> + /* @prot contains the complete protections for the page. */
> + uint8_t prot;
> +
> + /* @lg_page_size contains the log2 of the page size. */
> + uint8_t lg_page_size;
> +
> + /* Additional tlb flags requested by tlb_fill. */
> + uint8_t tlb_fill_flags;
> +
> + /*
> + * Additional tlb flags for use by the slow path. If non-zero,
> + * the corresponding CPUTLBEntry comparator must have TLB_FORCE_SLOW.
> + */
> + uint8_t slow_flags[MMU_ACCESS_COUNT];
> +
> + /*
> + * Allow target-specific additions to this structure.
> + * This may be used to cache items from the guest cpu
> + * page tables for later use by the implementation.
> + */
> + union {
> + /*
> + * Cache the attrs and shareability fields from the page table entry.
> + *
> + * For ARMMMUIdx_Stage2*, pte_attrs is the S2 descriptor bits [5:2].
> + * Otherwise, pte_attrs is the same as the MAIR_EL1 8-bit format.
> + * For shareability and guarded, as in the SH and GP fields respectively
> + * of the VMSAv8-64 PTEs.
> + */
> + struct {
> + uint8_t pte_attrs;
> + uint8_t shareability;
> + bool guarded;
> + } arm;
> + } extra;
> +};
> +
> #endif /* EXEC_TLB_COMMON_H */
> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> index 87b864f5c4..6b1c2bfadd 100644
> --- a/include/hw/core/cpu.h
> +++ b/include/hw/core/cpu.h
> @@ -198,69 +198,6 @@ struct CPUClass {
> */
> #define NB_MMU_MODES 16
>
> -/*
> - * The full TLB entry, which is not accessed by generated TCG code,
> - * so the layout is not as critical as that of CPUTLBEntry. This is
> - * also why we don't want to combine the two structs.
> - */
> -struct CPUTLBEntryFull {
> - /*
> - * @xlat_section contains:
> - * - in the lower TARGET_PAGE_BITS, a physical section number
> - * - with the lower TARGET_PAGE_BITS masked off, an offset which
> - * must be added to the virtual address to obtain:
> - * + the ram_addr_t of the target RAM (if the physical section
> - * number is PHYS_SECTION_NOTDIRTY or PHYS_SECTION_ROM)
> - * + the offset within the target MemoryRegion (otherwise)
> - */
> - hwaddr xlat_section;
> -
> - /*
> - * @phys_addr contains the physical address in the address space
> - * given by cpu_asidx_from_attrs(cpu, @attrs).
> - */
> - hwaddr phys_addr;
> -
> - /* @attrs contains the memory transaction attributes for the page. */
> - MemTxAttrs attrs;
> -
> - /* @prot contains the complete protections for the page. */
> - uint8_t prot;
> -
> - /* @lg_page_size contains the log2 of the page size. */
> - uint8_t lg_page_size;
> -
> - /* Additional tlb flags requested by tlb_fill. */
> - uint8_t tlb_fill_flags;
> -
> - /*
> - * Additional tlb flags for use by the slow path. If non-zero,
> - * the corresponding CPUTLBEntry comparator must have TLB_FORCE_SLOW.
> - */
> - uint8_t slow_flags[MMU_ACCESS_COUNT];
> -
> - /*
> - * Allow target-specific additions to this structure.
> - * This may be used to cache items from the guest cpu
> - * page tables for later use by the implementation.
> - */
> - union {
> - /*
> - * Cache the attrs and shareability fields from the page table entry.
> - *
> - * For ARMMMUIdx_Stage2*, pte_attrs is the S2 descriptor bits [5:2].
> - * Otherwise, pte_attrs is the same as the MAIR_EL1 8-bit format.
> - * For shareability and guarded, as in the SH and GP fields respectively
> - * of the VMSAv8-64 PTEs.
> - */
> - struct {
> - uint8_t pte_attrs;
> - uint8_t shareability;
> - bool guarded;
> - } arm;
> - } extra;
> -};
> -
> /*
> * Data elements that are per MMU mode, minus the bits accessed by
> * the TCG fast path.
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 17/23] accel/tcg: Delay plugin adjustment in probe_access_internal
2024-10-09 15:08 ` [PATCH 17/23] accel/tcg: Delay plugin adjustment in probe_access_internal Richard Henderson
@ 2024-10-10 0:19 ` Pierrick Bouvier
0 siblings, 0 replies; 60+ messages in thread
From: Pierrick Bouvier @ 2024-10-10 0:19 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 10/9/24 08:08, Richard Henderson wrote:
> Remove force_mmio and place the expression into the IF
> expression, behind the short-circuit logic expressions
> that might eliminate its computation.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/cputlb.c | 12 ++++++++----
> 1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 561f66c723..59ee766d51 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -1403,7 +1403,6 @@ static int probe_access_internal(CPUState *cpu, vaddr addr,
> uint64_t tlb_addr = tlb_read_idx(entry, access_type);
> vaddr page_addr = addr & TARGET_PAGE_MASK;
> int flags = TLB_FLAGS_MASK & ~TLB_FORCE_SLOW;
> - bool force_mmio = check_mem_cbs && cpu_plugin_mem_cbs_enabled(cpu);
> CPUTLBEntryFull *full;
>
> if (!tlb_hit_page(tlb_addr, page_addr)) {
> @@ -1434,9 +1433,14 @@ static int probe_access_internal(CPUState *cpu, vaddr addr,
> *pfull = full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
> flags |= full->slow_flags[access_type];
>
> - /* Fold all "mmio-like" bits into TLB_MMIO. This is not RAM. */
> - if (unlikely(flags & ~(TLB_WATCHPOINT | TLB_NOTDIRTY | TLB_CHECK_ALIGNED))
> - || (access_type != MMU_INST_FETCH && force_mmio)) {
> + /*
> + * Fold all "mmio-like" bits, and required plugin callbacks, to TLB_MMIO.
> + * These cannot be treated as RAM.
> + */
> + if ((flags & ~(TLB_WATCHPOINT | TLB_NOTDIRTY | TLB_CHECK_ALIGNED))
> + || (access_type != MMU_INST_FETCH
> + && check_mem_cbs
> + && cpu_plugin_mem_cbs_enabled(cpu))) {
> *phost = NULL;
> return TLB_MMIO;
> }
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 18/23] accel/tcg: Call cpu_ld*_code_mmu from cpu_ld*_code
2024-10-09 15:08 ` [PATCH 18/23] accel/tcg: Call cpu_ld*_code_mmu from cpu_ld*_code Richard Henderson
2024-10-09 18:51 ` Philippe Mathieu-Daudé
@ 2024-10-10 0:23 ` Pierrick Bouvier
2024-10-10 19:31 ` Richard Henderson
1 sibling, 1 reply; 60+ messages in thread
From: Pierrick Bouvier @ 2024-10-10 0:23 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 10/9/24 08:08, Richard Henderson wrote:
> Ensure a common entry point for all code lookups.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/cputlb.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 59ee766d51..61daa89e06 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -2954,28 +2954,28 @@ uint32_t cpu_ldub_code(CPUArchState *env, abi_ptr addr)
> {
> CPUState *cs = env_cpu(env);
> MemOpIdx oi = make_memop_idx(MO_UB, cpu_mmu_index(cs, true));
> - return do_ld1_mmu(cs, addr, oi, 0, MMU_INST_FETCH);
> + return cpu_ldb_code_mmu(env, addr, oi, 0);
> }
>
> uint32_t cpu_lduw_code(CPUArchState *env, abi_ptr addr)
> {
> CPUState *cs = env_cpu(env);
> MemOpIdx oi = make_memop_idx(MO_TEUW, cpu_mmu_index(cs, true));
> - return do_ld2_mmu(cs, addr, oi, 0, MMU_INST_FETCH);
> + return cpu_ldw_code_mmu(env, addr, oi, 0);
> }
>
> uint32_t cpu_ldl_code(CPUArchState *env, abi_ptr addr)
> {
> CPUState *cs = env_cpu(env);
> MemOpIdx oi = make_memop_idx(MO_TEUL, cpu_mmu_index(cs, true));
> - return do_ld4_mmu(cs, addr, oi, 0, MMU_INST_FETCH);
> + return cpu_ldl_code_mmu(env, addr, oi, 0);
> }
>
> uint64_t cpu_ldq_code(CPUArchState *env, abi_ptr addr)
> {
> CPUState *cs = env_cpu(env);
> MemOpIdx oi = make_memop_idx(MO_TEUQ, cpu_mmu_index(cs, true));
> - return do_ld8_mmu(cs, addr, oi, 0, MMU_INST_FETCH);
> + return cpu_ldq_code_mmu(env, addr, oi, 0);
> }
>
> uint8_t cpu_ldb_code_mmu(CPUArchState *env, abi_ptr addr,
This will still call the same functions behind the _code_mmu variants, but I
guess it's more coherent overall.
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 19/23] accel/tcg: Always use IntervalTree for code lookups
2024-10-09 15:08 ` [PATCH 19/23] accel/tcg: Always use IntervalTree for code lookups Richard Henderson
@ 2024-10-10 0:35 ` Pierrick Bouvier
2024-10-11 14:47 ` Richard Henderson
0 siblings, 1 reply; 60+ messages in thread
From: Pierrick Bouvier @ 2024-10-10 0:35 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 10/9/24 08:08, Richard Henderson wrote:
> Because translation is special, we don't need the speed
> of the direct-mapped softmmu tlb. We cache lookups in
> DisasContextBase within the translator loop anyway.
>
> Drop the addr_code comparator from CPUTLBEntry.
> Go directly to the IntervalTree for MMU_INST_FETCH.
> Derive exec flags from read flags.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> include/exec/cpu-all.h | 3 +
> include/exec/tlb-common.h | 5 +-
> accel/tcg/cputlb.c | 138 +++++++++++++++++++++++++++++---------
> 3 files changed, 110 insertions(+), 36 deletions(-)
>
> diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
> index 6f09b86e7f..7f5a10962a 100644
> --- a/include/exec/cpu-all.h
> +++ b/include/exec/cpu-all.h
> @@ -326,6 +326,9 @@ static inline int cpu_mmu_index(CPUState *cs, bool ifetch)
> (TLB_INVALID_MASK | TLB_NOTDIRTY | TLB_MMIO \
> | TLB_FORCE_SLOW | TLB_DISCARD_WRITE)
>
> +/* Filter read flags to exec flags. */
> +#define TLB_EXEC_FLAGS_MASK (TLB_MMIO)
> +
> /*
> * Flags stored in CPUTLBEntryFull.slow_flags[x].
> * TLB_FORCE_SLOW must be set in CPUTLBEntry.addr_idx[x].
> diff --git a/include/exec/tlb-common.h b/include/exec/tlb-common.h
> index 300f9fae67..feaa471299 100644
> --- a/include/exec/tlb-common.h
> +++ b/include/exec/tlb-common.h
> @@ -26,7 +26,6 @@ typedef union CPUTLBEntry {
> struct {
> uint64_t addr_read;
> uint64_t addr_write;
> - uint64_t addr_code;
> /*
> * Addend to virtual address to get host address. IO accesses
> * use the corresponding iotlb value.
> @@ -35,7 +34,7 @@ typedef union CPUTLBEntry {
> };
> /*
> * Padding to get a power of two size, as well as index
> - * access to addr_{read,write,code}.
> + * access to addr_{read,write}.
> */
> uint64_t addr_idx[(1 << CPU_TLB_ENTRY_BITS) / sizeof(uint64_t)];
> } CPUTLBEntry;
> @@ -92,7 +91,7 @@ struct CPUTLBEntryFull {
> * Additional tlb flags for use by the slow path. If non-zero,
> * the corresponding CPUTLBEntry comparator must have TLB_FORCE_SLOW.
> */
> - uint8_t slow_flags[MMU_ACCESS_COUNT];
> + uint8_t slow_flags[2];
>
> /*
> * Allow target-specific additions to this structure.
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 61daa89e06..7c8308355d 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -114,8 +114,9 @@ static inline uint64_t tlb_read_idx(const CPUTLBEntry *entry,
> MMU_DATA_LOAD * sizeof(uint64_t));
> QEMU_BUILD_BUG_ON(offsetof(CPUTLBEntry, addr_write) !=
> MMU_DATA_STORE * sizeof(uint64_t));
> - QEMU_BUILD_BUG_ON(offsetof(CPUTLBEntry, addr_code) !=
> - MMU_INST_FETCH * sizeof(uint64_t));
> +
> + tcg_debug_assert(access_type == MMU_DATA_LOAD ||
> + access_type == MMU_DATA_STORE);
>
> #if TARGET_LONG_BITS == 32
> /* Use qatomic_read, in case of addr_write; only care about low bits. */
> @@ -490,8 +491,7 @@ static bool tlb_hit_page_mask_anyprot(CPUTLBEntry *tlb_entry,
> mask &= TARGET_PAGE_MASK | TLB_INVALID_MASK;
>
> return (page == (tlb_entry->addr_read & mask) ||
> - page == (tlb_addr_write(tlb_entry) & mask) ||
> - page == (tlb_entry->addr_code & mask));
> + page == (tlb_addr_write(tlb_entry) & mask));
> }
>
> static inline bool tlb_hit_page_anyprot(CPUTLBEntry *tlb_entry, vaddr page)
> @@ -1061,15 +1061,13 @@ static inline void tlb_set_compare(CPUTLBEntryFull *full, CPUTLBEntry *ent,
> vaddr address, int flags,
> MMUAccessType access_type, bool enable)
> {
> - if (enable) {
> - address |= flags & TLB_FLAGS_MASK;
> - flags &= TLB_SLOW_FLAGS_MASK;
> - if (flags) {
> - address |= TLB_FORCE_SLOW;
> - }
> - } else {
> - address = -1;
> - flags = 0;
> + if (!enable) {
> + address = TLB_INVALID_MASK;
> + }
> + address |= flags & TLB_FLAGS_MASK;
> + flags &= TLB_SLOW_FLAGS_MASK;
> + if (flags) {
> + address |= TLB_FORCE_SLOW;
> }
I'm not sure I follow this change correctly.
After the change, the final address and flags values depend only on the
flags parameter, while before they also depended on the enable parameter.
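If I read it right, for enable == false the stored values change from
(sketch):

      ent->addr_idx[access_type] = -1;
      full->slow_flags[access_type] = 0;

to:

      ent->addr_idx[access_type] = TLB_INVALID_MASK
                                   | (flags & TLB_FLAGS_MASK)
                                   | (flags & TLB_SLOW_FLAGS_MASK
                                      ? TLB_FORCE_SLOW : 0);
      full->slow_flags[access_type] = flags & TLB_SLOW_FLAGS_MASK;

i.e. the flags are now preserved even when the comparator is disabled. Is
that the intent?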
> ent->addr_idx[access_type] = address;
> full->slow_flags[access_type] = flags;
> @@ -1215,9 +1213,6 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
> /* Now calculate the new entry */
> node->copy.addend = addend - addr_page;
>
> - tlb_set_compare(full, &node->copy, addr_page, read_flags,
> - MMU_INST_FETCH, prot & PAGE_EXEC);
> -
> if (wp_flags & BP_MEM_READ) {
> read_flags |= TLB_WATCHPOINT;
> }
> @@ -1392,21 +1387,52 @@ static void notdirty_write(CPUState *cpu, vaddr mem_vaddr, unsigned size,
> }
> }
>
> -static int probe_access_internal(CPUState *cpu, vaddr addr,
> - int fault_size, MMUAccessType access_type,
> - int mmu_idx, bool nonfault,
> - void **phost, CPUTLBEntryFull **pfull,
> - uintptr_t retaddr, bool check_mem_cbs)
> +static int probe_access_internal_code(CPUState *cpu, vaddr addr,
> + int fault_size, int mmu_idx,
> + bool nonfault,
> + void **phost, CPUTLBEntryFull **pfull,
> + uintptr_t retaddr)
> +{
> + CPUTLBEntryTree *t = tlbtree_lookup_addr(&cpu->neg.tlb.d[mmu_idx], addr);
> + int flags;
> +
> + if (!t || !(t->full.prot & PAGE_EXEC)) {
> + if (!tlb_fill_align(cpu, addr, MMU_INST_FETCH, mmu_idx,
> + 0, fault_size, nonfault, retaddr)) {
> + /* Non-faulting page table read failed. */
> + *phost = NULL;
> + *pfull = NULL;
> + return TLB_INVALID_MASK;
> + }
> + t = tlbtree_lookup_addr(&cpu->neg.tlb.d[mmu_idx], addr);
> + }
> + flags = t->copy.addr_read & TLB_EXEC_FLAGS_MASK;
> + *pfull = &t->full;
> +
> + if (flags) {
> + *phost = NULL;
> + return TLB_MMIO;
> + }
> +
> + /* Everything else is RAM. */
> + *phost = (void *)((uintptr_t)addr + t->copy.addend);
> + return flags;
> +}
> +
> +static int probe_access_internal_data(CPUState *cpu, vaddr addr,
> + int fault_size, MMUAccessType access_type,
> + int mmu_idx, bool nonfault,
> + void **phost, CPUTLBEntryFull **pfull,
> + uintptr_t retaddr, bool check_mem_cbs)
> {
> uintptr_t index = tlb_index(cpu, mmu_idx, addr);
> CPUTLBEntry *entry = tlb_entry(cpu, mmu_idx, addr);
> uint64_t tlb_addr = tlb_read_idx(entry, access_type);
> - vaddr page_addr = addr & TARGET_PAGE_MASK;
> int flags = TLB_FLAGS_MASK & ~TLB_FORCE_SLOW;
> CPUTLBEntryFull *full;
>
> - if (!tlb_hit_page(tlb_addr, page_addr)) {
> - if (!tlbtree_hit(cpu, mmu_idx, access_type, page_addr)) {
> + if (!tlb_hit(tlb_addr, addr)) {
> + if (!tlbtree_hit(cpu, mmu_idx, access_type, addr)) {
> if (!tlb_fill_align(cpu, addr, access_type, mmu_idx,
> 0, fault_size, nonfault, retaddr)) {
> /* Non-faulting page table read failed. */
> @@ -1450,6 +1476,21 @@ static int probe_access_internal(CPUState *cpu, vaddr addr,
> return flags;
> }
>
> +static int probe_access_internal(CPUState *cpu, vaddr addr,
> + int fault_size, MMUAccessType access_type,
> + int mmu_idx, bool nonfault,
> + void **phost, CPUTLBEntryFull **pfull,
> + uintptr_t retaddr, bool check_mem_cbs)
> +{
> + if (access_type == MMU_INST_FETCH) {
> + return probe_access_internal_code(cpu, addr, fault_size, mmu_idx,
> + nonfault, phost, pfull, retaddr);
> + }
> + return probe_access_internal_data(cpu, addr, fault_size, access_type,
> + mmu_idx, nonfault, phost, pfull,
> + retaddr, check_mem_cbs);
> +}
> +
> int probe_access_full(CPUArchState *env, vaddr addr, int size,
> MMUAccessType access_type, int mmu_idx,
> bool nonfault, void **phost, CPUTLBEntryFull **pfull,
> @@ -1582,9 +1623,9 @@ tb_page_addr_t get_page_addr_code_hostp(CPUArchState *env, vaddr addr,
> CPUTLBEntryFull *full;
> void *p;
>
> - (void)probe_access_internal(env_cpu(env), addr, 1, MMU_INST_FETCH,
> - cpu_mmu_index(env_cpu(env), true), false,
> - &p, &full, 0, false);
> + (void)probe_access_internal_code(env_cpu(env), addr, 1,
> + cpu_mmu_index(env_cpu(env), true),
> + false, &p, &full, 0);
> if (p == NULL) {
> return -1;
> }
> @@ -1678,8 +1719,31 @@ typedef struct MMULookupLocals {
> * tlb_fill_align will longjmp out. Return true if the softmmu tlb for
> * @mmu_idx may have resized.
> */
> -static bool mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
> - int mmu_idx, MMUAccessType access_type, uintptr_t ra)
> +static bool mmu_lookup1_code(CPUState *cpu, MMULookupPageData *data,
> + MemOp memop, int mmu_idx, uintptr_t ra)
> +{
> + vaddr addr = data->addr;
> + CPUTLBEntryTree *t = tlbtree_lookup_addr(&cpu->neg.tlb.d[mmu_idx], addr);
> + bool maybe_resized = true;
> +
> + if (!t || !(t->full.prot & PAGE_EXEC)) {
> + tlb_fill_align(cpu, addr, MMU_INST_FETCH, mmu_idx,
> + memop, data->size, false, ra);
> + maybe_resized = true;
> + t = tlbtree_lookup_addr(&cpu->neg.tlb.d[mmu_idx], addr);
> + }
> +
> + data->full = &t->full;
> + data->flags = t->copy.addr_read & TLB_EXEC_FLAGS_MASK;
> + /* Compute haddr speculatively; depending on flags it might be invalid. */
> + data->haddr = (void *)((uintptr_t)addr + t->copy.addend);
> +
> + return maybe_resized;
> +}
> +
> +static bool mmu_lookup1_data(CPUState *cpu, MMULookupPageData *data,
> + MemOp memop, int mmu_idx,
> + MMUAccessType access_type, uintptr_t ra)
> {
> vaddr addr = data->addr;
> uintptr_t index = tlb_index(cpu, mmu_idx, addr);
> @@ -1738,6 +1802,15 @@ static bool mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
> return maybe_resized;
> }
>
> +static bool mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
> + int mmu_idx, MMUAccessType access_type, uintptr_t ra)
> +{
> + if (access_type == MMU_INST_FETCH) {
> + return mmu_lookup1_code(cpu, data, memop, mmu_idx, ra);
> + }
> + return mmu_lookup1_data(cpu, data, memop, mmu_idx, access_type, ra);
> +}
> +
> /**
> * mmu_watch_or_dirty
> * @cpu: generic cpu state
> @@ -1885,13 +1958,13 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
> }
> }
>
> + full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
> +
> /*
> * Let the guest notice RMW on a write-only page.
> * We have just verified that the page is writable.
> - * Subpage lookups may have left TLB_INVALID_MASK set,
> - * but addr_read will only be -1 if PAGE_READ was unset.
> */
> - if (unlikely(tlbe->addr_read == -1)) {
> + if (unlikely(!(full->prot & PAGE_READ))) {
> tlb_fill_align(cpu, addr, MMU_DATA_LOAD, mmu_idx,
> 0, size, false, retaddr);
> /*
> @@ -1929,7 +2002,6 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
> }
>
> hostaddr = (void *)((uintptr_t)addr + tlbe->addend);
> - full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
>
> if (unlikely(flags & TLB_NOTDIRTY)) {
> notdirty_write(cpu, addr, size, full, retaddr);
It sounds good to have a fast path for code fetches. Did you measure the
benefit, or did you just implement it expecting it to be worthwhile?
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 20/23] accel/tcg: Link CPUTLBEntry to CPUTLBEntryTree
2024-10-09 15:08 ` [PATCH 20/23] accel/tcg: Link CPUTLBEntry to CPUTLBEntryTree Richard Henderson
@ 2024-10-10 0:37 ` Pierrick Bouvier
0 siblings, 0 replies; 60+ messages in thread
From: Pierrick Bouvier @ 2024-10-10 0:37 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 10/9/24 08:08, Richard Henderson wrote:
> Link from the fast tlb entry to the interval tree node.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> include/exec/tlb-common.h | 2 ++
> accel/tcg/cputlb.c | 59 ++++++++++++++-------------------------
> 2 files changed, 23 insertions(+), 38 deletions(-)
>
> diff --git a/include/exec/tlb-common.h b/include/exec/tlb-common.h
> index feaa471299..3b57d61112 100644
> --- a/include/exec/tlb-common.h
> +++ b/include/exec/tlb-common.h
> @@ -31,6 +31,8 @@ typedef union CPUTLBEntry {
> * use the corresponding iotlb value.
> */
> uintptr_t addend;
> + /* The defining IntervalTree entry. */
> + struct CPUTLBEntryTree *tree;
> };
> /*
> * Padding to get a power of two size, as well as index
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 7c8308355d..2a8d1b4fb2 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -505,7 +505,10 @@ static bool tlb_flush_entry_mask_locked(CPUTLBEntry *tlb_entry,
> vaddr mask)
> {
> if (tlb_hit_page_mask_anyprot(tlb_entry, page, mask)) {
> - memset(tlb_entry, -1, sizeof(*tlb_entry));
> + tlb_entry->addr_read = -1;
> + tlb_entry->addr_write = -1;
> + tlb_entry->addend = 0;
> + tlb_entry->tree = NULL;
> return true;
> }
> return false;
> @@ -1212,6 +1215,7 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
>
> /* Now calculate the new entry */
> node->copy.addend = addend - addr_page;
> + node->copy.tree = node;
>
> if (wp_flags & BP_MEM_READ) {
> read_flags |= TLB_WATCHPOINT;
> @@ -1425,7 +1429,6 @@ static int probe_access_internal_data(CPUState *cpu, vaddr addr,
> void **phost, CPUTLBEntryFull **pfull,
> uintptr_t retaddr, bool check_mem_cbs)
> {
> - uintptr_t index = tlb_index(cpu, mmu_idx, addr);
> CPUTLBEntry *entry = tlb_entry(cpu, mmu_idx, addr);
> uint64_t tlb_addr = tlb_read_idx(entry, access_type);
> int flags = TLB_FLAGS_MASK & ~TLB_FORCE_SLOW;
> @@ -1442,7 +1445,6 @@ static int probe_access_internal_data(CPUState *cpu, vaddr addr,
> }
>
> /* TLB resize via tlb_fill_align may have moved the entry. */
> - index = tlb_index(cpu, mmu_idx, addr);
> entry = tlb_entry(cpu, mmu_idx, addr);
>
> /*
> @@ -1456,7 +1458,7 @@ static int probe_access_internal_data(CPUState *cpu, vaddr addr,
> }
> flags &= tlb_addr;
>
> - *pfull = full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
> + *pfull = full = &entry->tree->full;
> flags |= full->slow_flags[access_type];
>
> /*
> @@ -1659,7 +1661,6 @@ bool tlb_plugin_lookup(CPUState *cpu, vaddr addr, int mmu_idx,
> bool is_store, struct qemu_plugin_hwaddr *data)
> {
> CPUTLBEntry *tlbe = tlb_entry(cpu, mmu_idx, addr);
> - uintptr_t index = tlb_index(cpu, mmu_idx, addr);
> MMUAccessType access_type = is_store ? MMU_DATA_STORE : MMU_DATA_LOAD;
> uint64_t tlb_addr = tlb_read_idx(tlbe, access_type);
> CPUTLBEntryFull *full;
> @@ -1668,7 +1669,7 @@ bool tlb_plugin_lookup(CPUState *cpu, vaddr addr, int mmu_idx,
> return false;
> }
>
> - full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
> + full = &tlbe->tree->full;
> data->phys_addr = full->phys_addr | (addr & ~TARGET_PAGE_MASK);
>
> /* We must have an iotlb entry for MMIO */
> @@ -1716,20 +1717,17 @@ typedef struct MMULookupLocals {
> *
> * Resolve the translation for the one page at @data.addr, filling in
> * the rest of @data with the results. If the translation fails,
> - * tlb_fill_align will longjmp out. Return true if the softmmu tlb for
> - * @mmu_idx may have resized.
> + * tlb_fill_align will longjmp out.
> */
> -static bool mmu_lookup1_code(CPUState *cpu, MMULookupPageData *data,
> +static void mmu_lookup1_code(CPUState *cpu, MMULookupPageData *data,
> MemOp memop, int mmu_idx, uintptr_t ra)
> {
> vaddr addr = data->addr;
> CPUTLBEntryTree *t = tlbtree_lookup_addr(&cpu->neg.tlb.d[mmu_idx], addr);
> - bool maybe_resized = true;
>
> if (!t || !(t->full.prot & PAGE_EXEC)) {
> tlb_fill_align(cpu, addr, MMU_INST_FETCH, mmu_idx,
> memop, data->size, false, ra);
> - maybe_resized = true;
> t = tlbtree_lookup_addr(&cpu->neg.tlb.d[mmu_idx], addr);
> }
>
> @@ -1737,19 +1735,16 @@ static bool mmu_lookup1_code(CPUState *cpu, MMULookupPageData *data,
> data->flags = t->copy.addr_read & TLB_EXEC_FLAGS_MASK;
> /* Compute haddr speculatively; depending on flags it might be invalid. */
> data->haddr = (void *)((uintptr_t)addr + t->copy.addend);
> -
> - return maybe_resized;
> }
>
> -static bool mmu_lookup1_data(CPUState *cpu, MMULookupPageData *data,
> +static void mmu_lookup1_data(CPUState *cpu, MMULookupPageData *data,
> MemOp memop, int mmu_idx,
> MMUAccessType access_type, uintptr_t ra)
> {
> vaddr addr = data->addr;
> - uintptr_t index = tlb_index(cpu, mmu_idx, addr);
> CPUTLBEntry *entry = tlb_entry(cpu, mmu_idx, addr);
> uint64_t tlb_addr = tlb_read_idx(entry, access_type);
> - bool maybe_resized = false;
> + bool did_tlb_fill = false;
> CPUTLBEntryFull *full;
> int flags = TLB_FLAGS_MASK & ~TLB_FORCE_SLOW;
>
> @@ -1758,8 +1753,7 @@ static bool mmu_lookup1_data(CPUState *cpu, MMULookupPageData *data,
> if (!tlbtree_hit(cpu, mmu_idx, access_type, addr)) {
> tlb_fill_align(cpu, addr, access_type, mmu_idx,
> memop, data->size, false, ra);
> - maybe_resized = true;
> - index = tlb_index(cpu, mmu_idx, addr);
> + did_tlb_fill = true;
> entry = tlb_entry(cpu, mmu_idx, addr);
> /*
> * With PAGE_WRITE_INV, we set TLB_INVALID_MASK immediately,
> @@ -1771,11 +1765,11 @@ static bool mmu_lookup1_data(CPUState *cpu, MMULookupPageData *data,
> tlb_addr = tlb_read_idx(entry, access_type);
> }
>
> - full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
> - flags = tlb_addr & (TLB_FLAGS_MASK & ~TLB_FORCE_SLOW);
> + full = &entry->tree->full;
> + flags &= tlb_addr;
> flags |= full->slow_flags[access_type];
>
> - if (likely(!maybe_resized)) {
> + if (likely(!did_tlb_fill)) {
> /* Alignment has not been checked by tlb_fill_align. */
> int a_bits = memop_alignment_bits(memop);
>
> @@ -1798,17 +1792,15 @@ static bool mmu_lookup1_data(CPUState *cpu, MMULookupPageData *data,
> data->flags = flags;
> /* Compute haddr speculatively; depending on flags it might be invalid. */
> data->haddr = (void *)((uintptr_t)addr + entry->addend);
> -
> - return maybe_resized;
> }
>
> -static bool mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
> +static void mmu_lookup1(CPUState *cpu, MMULookupPageData *data, MemOp memop,
> int mmu_idx, MMUAccessType access_type, uintptr_t ra)
> {
> if (access_type == MMU_INST_FETCH) {
> - return mmu_lookup1_code(cpu, data, memop, mmu_idx, ra);
> + mmu_lookup1_code(cpu, data, memop, mmu_idx, ra);
> }
> - return mmu_lookup1_data(cpu, data, memop, mmu_idx, access_type, ra);
> + mmu_lookup1_data(cpu, data, memop, mmu_idx, access_type, ra);
> }
>
> /**
> @@ -1889,15 +1881,9 @@ static bool mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
> l->page[1].size = l->page[0].size - size0;
> l->page[0].size = size0;
>
> - /*
> - * Lookup both pages, recognizing exceptions from either. If the
> - * second lookup potentially resized, refresh first CPUTLBEntryFull.
> - */
> + /* Lookup both pages, recognizing exceptions from either. */
> mmu_lookup1(cpu, &l->page[0], l->memop, l->mmu_idx, type, ra);
> - if (mmu_lookup1(cpu, &l->page[1], 0, l->mmu_idx, type, ra)) {
> - uintptr_t index = tlb_index(cpu, l->mmu_idx, addr);
> - l->page[0].full = &cpu->neg.tlb.d[l->mmu_idx].fulltlb[index];
> - }
> + mmu_lookup1(cpu, &l->page[1], 0, l->mmu_idx, type, ra);
>
> flags = l->page[0].flags | l->page[1].flags;
> if (unlikely(flags & (TLB_WATCHPOINT | TLB_NOTDIRTY))) {
> @@ -1925,7 +1911,6 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
> {
> uintptr_t mmu_idx = get_mmuidx(oi);
> MemOp mop = get_memop(oi);
> - uintptr_t index;
> CPUTLBEntry *tlbe;
> void *hostaddr;
> CPUTLBEntryFull *full;
> @@ -1937,7 +1922,6 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
> /* Adjust the given return address. */
> retaddr -= GETPC_ADJ;
>
> - index = tlb_index(cpu, mmu_idx, addr);
> tlbe = tlb_entry(cpu, mmu_idx, addr);
>
> /* Check TLB entry and enforce page permissions. */
> @@ -1947,7 +1931,6 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
> tlb_fill_align(cpu, addr, MMU_DATA_STORE, mmu_idx,
> mop, size, false, retaddr);
> did_tlb_fill = true;
> - index = tlb_index(cpu, mmu_idx, addr);
> tlbe = tlb_entry(cpu, mmu_idx, addr);
> /*
> * With PAGE_WRITE_INV, we set TLB_INVALID_MASK immediately,
> @@ -1958,7 +1941,7 @@ static void *atomic_mmu_lookup(CPUState *cpu, vaddr addr, MemOpIdx oi,
> }
> }
>
> - full = &cpu->neg.tlb.d[mmu_idx].fulltlb[index];
> + full = &tlbe->tree->full;
>
> /*
> * Let the guest notice RMW on a write-only page.
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 21/23] accel/tcg: Remove CPUTLBDesc.fulltlb
2024-10-09 15:08 ` [PATCH 21/23] accel/tcg: Remove CPUTLBDesc.fulltlb Richard Henderson
@ 2024-10-10 0:38 ` Pierrick Bouvier
0 siblings, 0 replies; 60+ messages in thread
From: Pierrick Bouvier @ 2024-10-10 0:38 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 10/9/24 08:08, Richard Henderson wrote:
> This array is now write-only, and may be removed.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> include/hw/core/cpu.h | 1 -
> accel/tcg/cputlb.c | 39 ++++++++-------------------------------
> 2 files changed, 8 insertions(+), 32 deletions(-)
>
> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> index 6b1c2bfadd..3022529733 100644
> --- a/include/hw/core/cpu.h
> +++ b/include/hw/core/cpu.h
> @@ -216,7 +216,6 @@ typedef struct CPUTLBDesc {
> /* maximum number of entries observed in the window */
> size_t window_max_entries;
> size_t n_used_entries;
> - CPUTLBEntryFull *fulltlb;
> /* All active tlb entries for this address space. */
> IntervalTreeRoot iroot;
> } CPUTLBDesc;
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 2a8d1b4fb2..47b9557bb8 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -149,13 +149,6 @@ static inline CPUTLBEntry *tlbfast_entry(CPUTLBDescFast *fast, vaddr addr)
> return fast->table + tlbfast_index(fast, addr);
> }
>
> -/* Find the TLB index corresponding to the mmu_idx + address pair. */
> -static inline uintptr_t tlb_index(CPUState *cpu, uintptr_t mmu_idx,
> - vaddr addr)
> -{
> - return tlbfast_index(&cpu->neg.tlb.f[mmu_idx], addr);
> -}
> -
> /* Find the TLB entry corresponding to the mmu_idx + address pair. */
> static inline CPUTLBEntry *tlb_entry(CPUState *cpu, uintptr_t mmu_idx,
> vaddr addr)
> @@ -270,22 +263,20 @@ static void tlb_mmu_resize_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast,
> }
>
> g_free(fast->table);
> - g_free(desc->fulltlb);
>
> tlb_window_reset(desc, now, 0);
> /* desc->n_used_entries is cleared by the caller */
> fast->mask = (new_size - 1) << CPU_TLB_ENTRY_BITS;
> fast->table = g_try_new(CPUTLBEntry, new_size);
> - desc->fulltlb = g_try_new(CPUTLBEntryFull, new_size);
>
> /*
> - * If the allocations fail, try smaller sizes. We just freed some
> + * If the allocation fails, try smaller sizes. We just freed some
> * memory, so going back to half of new_size has a good chance of working.
> * Increased memory pressure elsewhere in the system might cause the
> * allocations to fail though, so we progressively reduce the allocation
> * size, aborting if we cannot even allocate the smallest TLB we support.
> */
> - while (fast->table == NULL || desc->fulltlb == NULL) {
> + while (fast->table == NULL) {
> if (new_size == (1 << CPU_TLB_DYN_MIN_BITS)) {
> error_report("%s: %s", __func__, strerror(errno));
> abort();
> @@ -294,9 +285,7 @@ static void tlb_mmu_resize_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast,
> fast->mask = (new_size - 1) << CPU_TLB_ENTRY_BITS;
>
> g_free(fast->table);
> - g_free(desc->fulltlb);
> fast->table = g_try_new(CPUTLBEntry, new_size);
> - desc->fulltlb = g_try_new(CPUTLBEntryFull, new_size);
> }
> }
>
> @@ -350,7 +339,6 @@ static void tlb_mmu_init(CPUTLBDesc *desc, CPUTLBDescFast *fast, int64_t now)
> desc->n_used_entries = 0;
> fast->mask = (n_entries - 1) << CPU_TLB_ENTRY_BITS;
> fast->table = g_new(CPUTLBEntry, n_entries);
> - desc->fulltlb = g_new(CPUTLBEntryFull, n_entries);
> memset(&desc->iroot, 0, sizeof(desc->iroot));
> tlb_mmu_flush_locked(desc, fast);
> }
> @@ -382,15 +370,9 @@ void tlb_init(CPUState *cpu)
>
> void tlb_destroy(CPUState *cpu)
> {
> - int i;
> -
> qemu_spin_destroy(&cpu->neg.tlb.c.lock);
> - for (i = 0; i < NB_MMU_MODES; i++) {
> - CPUTLBDesc *desc = &cpu->neg.tlb.d[i];
> - CPUTLBDescFast *fast = &cpu->neg.tlb.f[i];
> -
> - g_free(fast->table);
> - g_free(desc->fulltlb);
> + for (int i = 0; i < NB_MMU_MODES; i++) {
> + g_free(cpu->neg.tlb.f[i].table);
> interval_tree_free_nodes(&cpu->neg.tlb.d[i].iroot,
> offsetof(CPUTLBEntryTree, itree));
> }
> @@ -1090,7 +1072,7 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
> CPUTLB *tlb = &cpu->neg.tlb;
> CPUTLBDesc *desc = &tlb->d[mmu_idx];
> MemoryRegionSection *section;
> - unsigned int index, read_flags, write_flags;
> + unsigned int read_flags, write_flags;
> uintptr_t addend;
> CPUTLBEntry *te;
> CPUTLBEntryTree *node;
> @@ -1169,7 +1151,6 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
> wp_flags = cpu_watchpoint_address_matches(cpu, addr_page,
> TARGET_PAGE_SIZE);
>
> - index = tlb_index(cpu, mmu_idx, addr_page);
> te = tlb_entry(cpu, mmu_idx, addr_page);
>
> /*
> @@ -1208,8 +1189,8 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
> * subtract here is that of the page base, and not the same as the
> * vaddr we add back in io_prepare()/get_page_addr_code().
> */
> - desc->fulltlb[index] = *full;
> - full = &desc->fulltlb[index];
> + node->full = *full;
> + full = &node->full;
> full->xlat_section = iotlb - addr_page;
> full->phys_addr = paddr_page;
>
> @@ -1232,7 +1213,6 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
> tlb_set_compare(full, &node->copy, addr_page, write_flags,
> MMU_DATA_STORE, prot & PAGE_WRITE);
>
> - node->full = *full;
> copy_tlb_helper_locked(te, &node->copy);
> tlb_n_used_entries_inc(cpu, mmu_idx);
> qemu_spin_unlock(&tlb->c.lock);
> @@ -1343,7 +1323,6 @@ static bool tlbtree_hit(CPUState *cpu, int mmu_idx,
> CPUTLBDesc *desc = &cpu->neg.tlb.d[mmu_idx];
> CPUTLBDescFast *fast = &cpu->neg.tlb.f[mmu_idx];
> CPUTLBEntryTree *node;
> - size_t index;
>
> assert_cpu_is_self(cpu);
> node = tlbtree_lookup_addr(desc, addr);
> @@ -1358,12 +1337,10 @@ static bool tlbtree_hit(CPUState *cpu, int mmu_idx,
> }
>
> /* Install the cached entry. */
> - index = tlbfast_index(fast, addr);
> qemu_spin_lock(&cpu->neg.tlb.c.lock);
> - copy_tlb_helper_locked(&fast->table[index], &node->copy);
> + copy_tlb_helper_locked(tlbfast_entry(fast, addr), &node->copy);
> qemu_spin_unlock(&cpu->neg.tlb.c.lock);
>
> - desc->fulltlb[index] = node->full;
> return true;
> }
>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [NOTYET PATCH 22/23] accel/tcg: Drop TCGCPUOps.tlb_fill
2024-10-09 15:08 ` [NOTYET PATCH 22/23] accel/tcg: Drop TCGCPUOps.tlb_fill Richard Henderson
@ 2024-10-10 0:40 ` Pierrick Bouvier
0 siblings, 0 replies; 60+ messages in thread
From: Pierrick Bouvier @ 2024-10-10 0:40 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 10/9/24 08:08, Richard Henderson wrote:
> Now that all targets have been converted to tlb_fill_align,
> remove the tlb_fill hook.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> include/hw/core/tcg-cpu-ops.h | 10 ----------
> accel/tcg/cputlb.c | 19 ++++---------------
> 2 files changed, 4 insertions(+), 25 deletions(-)
>
> diff --git a/include/hw/core/tcg-cpu-ops.h b/include/hw/core/tcg-cpu-ops.h
> index c932690621..e73c8a03de 100644
> --- a/include/hw/core/tcg-cpu-ops.h
> +++ b/include/hw/core/tcg-cpu-ops.h
> @@ -157,16 +157,6 @@ struct TCGCPUOps {
> bool (*tlb_fill_align)(CPUState *cpu, CPUTLBEntryFull *out, vaddr addr,
> MMUAccessType access_type, int mmu_idx,
> MemOp memop, int size, bool probe, uintptr_t ra);
> - /**
> - * @tlb_fill: Handle a softmmu tlb miss
> - *
> - * If the access is valid, call tlb_set_page and return true;
> - * if the access is invalid and probe is true, return false;
> - * otherwise raise an exception and do not return.
> - */
> - bool (*tlb_fill)(CPUState *cpu, vaddr address, int size,
> - MMUAccessType access_type, int mmu_idx,
> - bool probe, uintptr_t retaddr);
> /**
> * @do_transaction_failed: Callback for handling failed memory transactions
> * (ie bus faults or external aborts; not MMU faults)
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 47b9557bb8..55c7bf737b 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -1251,23 +1251,12 @@ static bool tlb_fill_align(CPUState *cpu, vaddr addr, MMUAccessType type,
> int mmu_idx, MemOp memop, int size,
> bool probe, uintptr_t ra)
> {
> - const TCGCPUOps *ops = cpu->cc->tcg_ops;
> CPUTLBEntryFull full;
>
> - if (ops->tlb_fill_align) {
> - if (ops->tlb_fill_align(cpu, &full, addr, type, mmu_idx,
> - memop, size, probe, ra)) {
> - tlb_set_page_full(cpu, mmu_idx, addr, &full);
> - return true;
> - }
> - } else {
> - /* Legacy behaviour is alignment before paging. */
> - if (addr & ((1u << memop_alignment_bits(memop)) - 1)) {
> - ops->do_unaligned_access(cpu, addr, type, mmu_idx, ra);
> - }
> - if (ops->tlb_fill(cpu, addr, size, type, mmu_idx, probe, ra)) {
> - return true;
> - }
> + if (cpu->cc->tcg_ops->tlb_fill_align(cpu, &full, addr, type, mmu_idx,
> + memop, size, probe, ra)) {
> + tlb_set_page_full(cpu, mmu_idx, addr, &full);
> + return true;
> }
> assert(probe);
> return false;
Looks good!
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree
2024-10-09 17:10 ` Richard Henderson
@ 2024-10-10 0:50 ` Pierrick Bouvier
2024-10-15 0:07 ` Richard Henderson
0 siblings, 1 reply; 60+ messages in thread
From: Pierrick Bouvier @ 2024-10-10 0:50 UTC (permalink / raw)
To: Richard Henderson, BALATON Zoltan; +Cc: qemu-devel
On 10/9/24 10:10, Richard Henderson wrote:
> On 10/9/24 09:27, BALATON Zoltan wrote:
>> On Wed, 9 Oct 2024, Richard Henderson wrote:
>>> Based-on: 20241009000453.315652-1-richard.henderson@linaro.org
>>> ("[PATCH v3 00/20] accel/tcg: Introduce tlb_fill_align hook")
>>>
>>> The initial idea was: how much can we do with an intelligent data
>>> structure for the same cost as a linear search through an array?
>>>
>>> This is an initial installment along these lines. This is about
>>> as far as I can go without first converting all targets to the
>>> new tlb_fill_align hook. Indeed, the final two patches will not
>>> compile with all targets enabled, but hint at the direction of
>>> the next steps.
>>>
>>> I do not expect large perf changes with this patch set. I will
>>> be happy if performance comes out even.
>>
>> Then what's the point?
>
> Eventually fixing the page size > TARGET_PAGE_SIZE performance issues.
>
> E.g. with a 16k or 64k aarch64 guest kernel, we still have TARGET_PAGE_SIZE at 4k, so all
> guest pages are "large", and so run into our current behaviour of flushing the entire tlb
> too often.
>
> Even without that, I expect further cleanups to improve performance, we're just not there yet.
>
>
> r~
>
Would merging pages over a given range be something we could benefit from
too? In this case, entries in our tlbtree would have varying sizes,
allowing us to cover more space with a single entry.
It would allow us to have a shallower tlbtree (given that it's
rebalanced when modified) and speed up walking operations.
I'm not sure whether it helps performance-wise, though.
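For what it's worth, the interval-tree API is already range-based
([start, last] inclusive), so a variable-size node would mostly be a matter
of how it is inserted and invalidated. A rough sketch of the idea, using a
made-up CPUTLBEntryRange type rather than anything from this series:

#include "qemu/osdep.h"
#include "qemu/interval-tree.h"

/* Hypothetical node covering a whole mapping rather than a single page. */
typedef struct CPUTLBEntryRange {
    IntervalTreeNode itree;   /* itree.start/itree.last bound the range */
    /* per-range translation data would go here */
} CPUTLBEntryRange;

/* Insert a node covering the inclusive range [addr, addr + len - 1]. */
static void tlbtree_insert_range(IntervalTreeRoot *root,
                                 CPUTLBEntryRange *node,
                                 uint64_t addr, uint64_t len)
{
    node->itree.start = addr;
    node->itree.last = addr + len - 1;
    interval_tree_insert(&node->itree, root);
}

/* Lookup is unchanged: any node overlapping @addr is a hit. */
static CPUTLBEntryRange *tlbtree_lookup_range(IntervalTreeRoot *root,
                                              uint64_t addr)
{
    IntervalTreeNode *n = interval_tree_iter_first(root, addr, addr);
    return n ? container_of(n, CPUTLBEntryRange, itree) : NULL;
}

Whether invalidation and permission changes stay cheap over such ranges is
the open question.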
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 04/23] accel/tcg: Split out tlbfast_flush_range_locked
2024-10-09 23:05 ` Pierrick Bouvier
@ 2024-10-10 1:20 ` Richard Henderson
2024-10-11 17:09 ` Pierrick Bouvier
0 siblings, 1 reply; 60+ messages in thread
From: Richard Henderson @ 2024-10-10 1:20 UTC (permalink / raw)
To: Pierrick Bouvier, qemu-devel
On 10/9/24 16:05, Pierrick Bouvier wrote:
>> @@ -720,13 +728,10 @@ static void tlb_flush_range_locked(CPUState *cpu, int midx,
>> return;
>> }
>> + tlbfast_flush_range_locked(d, f, addr, len, mask);
>> +
>> for (vaddr i = 0; i < len; i += TARGET_PAGE_SIZE) {
>> vaddr page = addr + i;
>> - CPUTLBEntry *entry = tlb_entry(cpu, midx, page);
>> -
>> - if (tlb_flush_entry_mask_locked(entry, page, mask)) {
>> - tlb_n_used_entries_dec(cpu, midx);
>> - }
>> tlb_flush_vtlb_page_mask_locked(cpu, midx, page, mask);
>> }
>> }
>
> Why don't we have the same kind of change for tlb_flush_vtlb_page_mask_locked?
>
> We now have two loops (for entry mask, and for page mask).
It goes away in patch 15.
r~
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 14/23] accel/tcg: Replace victim_tlb_hit with tlbtree_hit
2024-10-10 0:10 ` Pierrick Bouvier
@ 2024-10-10 19:29 ` Richard Henderson
2024-10-11 17:11 ` Pierrick Bouvier
0 siblings, 1 reply; 60+ messages in thread
From: Richard Henderson @ 2024-10-10 19:29 UTC (permalink / raw)
To: Pierrick Bouvier, qemu-devel
On 10/9/24 17:10, Pierrick Bouvier wrote:
>> +static bool tlbtree_hit(CPUState *cpu, int mmu_idx,
>> + MMUAccessType access_type, vaddr addr)
>> {
>> - size_t vidx;
>> + CPUTLBDesc *desc = &cpu->neg.tlb.d[mmu_idx];
>> + CPUTLBDescFast *fast = &cpu->neg.tlb.f[mmu_idx];
>> + CPUTLBEntryTree *node;
>> + size_t index;
>> assert_cpu_is_self(cpu);
>> - for (vidx = 0; vidx < CPU_VTLB_SIZE; ++vidx) {
>> - CPUTLBEntry *vtlb = &cpu->neg.tlb.d[mmu_idx].vtable[vidx];
>> - uint64_t cmp = tlb_read_idx(vtlb, access_type);
>> -
>> - if (cmp == page) {
...
>> + if (!tlb_hit(tlb_read_idx(&node->copy, access_type), addr)) {
>> + /* This access is not permitted. */
>> + return false;
>> + }
>
> This is not something we were checking before. If this is an addition, maybe it would be
> better to split this out of this commit. Or maybe I missed a step in previous commits :)
It's there, with the comparison and page mask, but I agree it's not obvious.
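A toy comparison, with placeholder constants (the real bit layout lives in
QEMU's headers), shows why the old equality test already carried the
permission check: a disabled comparator is -1 and can never equal a
page-aligned address, while tlbtree_hit() now has to say so explicitly via
tlb_hit().

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Placeholder values for illustration; the real layout differs. */
#define TARGET_PAGE_BITS  12
#define TARGET_PAGE_MASK  (~(uint64_t)((1u << TARGET_PAGE_BITS) - 1))
#define TLB_INVALID_MASK  (1u << 0)

/* Old victim-tlb check: plain equality against the page address. */
static bool victim_hit(uint64_t comparator, uint64_t page)
{
    return comparator == page;
}

/* tlb_hit()-style check, as tlbtree_hit() now does explicitly. */
static bool tree_hit(uint64_t comparator, uint64_t addr)
{
    return (addr & TARGET_PAGE_MASK) ==
           (comparator & (TARGET_PAGE_MASK | TLB_INVALID_MASK));
}

int main(void)
{
    uint64_t page = 0x4000;           /* page-aligned guest address */
    uint64_t denied = (uint64_t)-1;   /* comparator for a forbidden access */

    printf("denied:  %d %d\n", victim_hit(denied, page), tree_hit(denied, page));
    printf("allowed: %d %d\n", victim_hit(page, page), tree_hit(page, page));
    return 0;
}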
r~
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 18/23] accel/tcg: Call cpu_ld*_code_mmu from cpu_ld*_code
2024-10-10 0:23 ` Pierrick Bouvier
@ 2024-10-10 19:31 ` Richard Henderson
0 siblings, 0 replies; 60+ messages in thread
From: Richard Henderson @ 2024-10-10 19:31 UTC (permalink / raw)
To: Pierrick Bouvier, qemu-devel
On 10/9/24 17:23, Pierrick Bouvier wrote:
> On 10/9/24 08:08, Richard Henderson wrote:
>> Ensure a common entry point for all code lookups.
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>> accel/tcg/cputlb.c | 8 ++++----
>> 1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
>> index 59ee766d51..61daa89e06 100644
>> --- a/accel/tcg/cputlb.c
>> +++ b/accel/tcg/cputlb.c
>> @@ -2954,28 +2954,28 @@ uint32_t cpu_ldub_code(CPUArchState *env, abi_ptr addr)
>> {
>> CPUState *cs = env_cpu(env);
>> MemOpIdx oi = make_memop_idx(MO_UB, cpu_mmu_index(cs, true));
>> - return do_ld1_mmu(cs, addr, oi, 0, MMU_INST_FETCH);
>> + return cpu_ldb_code_mmu(env, addr, oi, 0);
>> }
>> uint32_t cpu_lduw_code(CPUArchState *env, abi_ptr addr)
>> {
>> CPUState *cs = env_cpu(env);
>> MemOpIdx oi = make_memop_idx(MO_TEUW, cpu_mmu_index(cs, true));
>> - return do_ld2_mmu(cs, addr, oi, 0, MMU_INST_FETCH);
>> + return cpu_ldw_code_mmu(env, addr, oi, 0);
>> }
>> uint32_t cpu_ldl_code(CPUArchState *env, abi_ptr addr)
>> {
>> CPUState *cs = env_cpu(env);
>> MemOpIdx oi = make_memop_idx(MO_TEUL, cpu_mmu_index(cs, true));
>> - return do_ld4_mmu(cs, addr, oi, 0, MMU_INST_FETCH);
>> + return cpu_ldl_code_mmu(env, addr, oi, 0);
>> }
>> uint64_t cpu_ldq_code(CPUArchState *env, abi_ptr addr)
>> {
>> CPUState *cs = env_cpu(env);
>> MemOpIdx oi = make_memop_idx(MO_TEUQ, cpu_mmu_index(cs, true));
>> - return do_ld8_mmu(cs, addr, oi, 0, MMU_INST_FETCH);
>> + return cpu_ldq_code_mmu(env, addr, oi, 0);
>> }
>> uint8_t cpu_ldb_code_mmu(CPUArchState *env, abi_ptr addr,
>
> This will still call the same functions behind _code_mmu variants, but I guess it's more
> coherent overall.
>
> Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Previously, I was modifying cpu_ld*_code_mmu, so this mattered.
But I dropped that, so I should drop this as well.
r~
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 19/23] accel/tcg: Always use IntervalTree for code lookups
2024-10-10 0:35 ` Pierrick Bouvier
@ 2024-10-11 14:47 ` Richard Henderson
2024-10-11 17:55 ` Pierrick Bouvier
0 siblings, 1 reply; 60+ messages in thread
From: Richard Henderson @ 2024-10-11 14:47 UTC (permalink / raw)
To: Pierrick Bouvier, qemu-devel
On 10/9/24 17:35, Pierrick Bouvier wrote:
>> @@ -1061,15 +1061,13 @@ static inline void tlb_set_compare(CPUTLBEntryFull *full,
>> CPUTLBEntry *ent,
>> vaddr address, int flags,
>> MMUAccessType access_type, bool enable)
>> {
>> - if (enable) {
>> - address |= flags & TLB_FLAGS_MASK;
>> - flags &= TLB_SLOW_FLAGS_MASK;
>> - if (flags) {
>> - address |= TLB_FORCE_SLOW;
>> - }
>> - } else {
>> - address = -1;
>> - flags = 0;
>> + if (!enable) {
>> + address = TLB_INVALID_MASK;
>> + }
>> + address |= flags & TLB_FLAGS_MASK;
>> + flags &= TLB_SLOW_FLAGS_MASK;
>> + if (flags) {
>> + address |= TLB_FORCE_SLOW;
>> }
>
> I'm not sure I follow this change correctly.
> Afterwards, the final address and flags values depend only on the flags parameter, while
> before they depended on both the flags and enable parameters.
This deserves to be split out; I even thought about it Monday night but then forgot when I
restarted on Tuesday morning.
Before, address is -1 for disabled, mostly because that mirrors what you get with memset
to initialize the tlb. All of the flags are discarded. But the only thing that's
important is that TLB_INVALID_MASK is set.
After, TLB_INVALID_MASK is still set, but the flags are retained. This is because we want
a source of those flags to use for MMU_INST_FETCH. With this patch set we no longer store
flags for execute and instead grab them from the flags for read. From tlb_set_page_full...
>> @@ -1215,9 +1213,6 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
>> /* Now calculate the new entry */
>> node->copy.addend = addend - addr_page;
>> - tlb_set_compare(full, &node->copy, addr_page, read_flags,
>> - MMU_INST_FETCH, prot & PAGE_EXEC);
>> -
>> if (wp_flags & BP_MEM_READ) {
>> read_flags |= TLB_WATCHPOINT;
>> }
... we can see that the only difference between the two is the watchpoint bit.
Importantly, TLB_MMIO is common to all three comparators.
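As a concrete example, consider an execute-only MMIO page: PAGE_READ is
absent, so the read comparator is disabled and can never hit, yet the fetch
path can still pull TLB_MMIO back out of it via
addr_read & TLB_EXEC_FLAGS_MASK. A sketch with placeholder bit values,
following the shape of the new helper above:

#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bits for illustration; the real values differ. */
#define TLB_INVALID_MASK     (1u << 0)
#define TLB_MMIO             (1u << 1)
#define TLB_FORCE_SLOW       (1u << 2)
#define TLB_WATCHPOINT       (1u << 3)      /* a "slow" flag */
#define TLB_FLAGS_MASK       (TLB_INVALID_MASK | TLB_MMIO | TLB_FORCE_SLOW)
#define TLB_SLOW_FLAGS_MASK  (TLB_WATCHPOINT)
#define TLB_EXEC_FLAGS_MASK  (TLB_MMIO)

/* Shape of the new tlb_set_compare(), per the hunk quoted above. */
static uint64_t set_compare(uint64_t address, int flags, bool enable)
{
    if (!enable) {
        address = TLB_INVALID_MASK;
    }
    address |= flags & TLB_FLAGS_MASK;
    if (flags & TLB_SLOW_FLAGS_MASK) {
        address |= TLB_FORCE_SLOW;
    }
    return address;
}

int main(void)
{
    /* Execute-only MMIO page: PAGE_READ is absent, so read is disabled. */
    uint64_t addr_read = set_compare(0x4000, TLB_MMIO, /*enable=*/false);

    /* The comparator can never hit... */
    assert(addr_read & TLB_INVALID_MASK);
    /* ...but the fetch path can still recover TLB_MMIO from it. */
    assert((addr_read & TLB_EXEC_FLAGS_MASK) == TLB_MMIO);
    return 0;
}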
> Sounds good to have a fast path for code fetch. Did you measure the benefit, or just
> implement this thinking it's worth it?
This is not about a fast path for code fetch, but always using the *slow* path. The
object is to repurpose one word from CPUTLBEntry, removed here and added back in the next
patch, to link CPUTLBEntry to CPUTLBEntryTree without changing sizeof(CPUTLBEntry).
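In outline, the net layout after those two patches (abridged from the quoted
tlb-common.h hunk; comments trimmed, and the padding array size assumes a
64-bit host):

#include <stdint.h>

struct CPUTLBEntryTree;     /* defined in cputlb.c */

typedef union CPUTLBEntry {
    struct {
        uint64_t addr_read;
        uint64_t addr_write;
        /* addr_code is gone: code lookups now go through the tree. */
        uintptr_t addend;                 /* host addend for RAM pages */
        struct CPUTLBEntryTree *tree;     /* the defining IntervalTree entry */
    };
    /* Padding to a power of two size, as before. */
    uint64_t addr_idx[4];
} CPUTLBEntry;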
r~
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 04/23] accel/tcg: Split out tlbfast_flush_range_locked
2024-10-10 1:20 ` Richard Henderson
@ 2024-10-11 17:09 ` Pierrick Bouvier
0 siblings, 0 replies; 60+ messages in thread
From: Pierrick Bouvier @ 2024-10-11 17:09 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 10/9/24 18:20, Richard Henderson wrote:
> On 10/9/24 16:05, Pierrick Bouvier wrote:
>>> @@ -720,13 +728,10 @@ static void tlb_flush_range_locked(CPUState *cpu, int midx,
>>> return;
>>> }
>>> + tlbfast_flush_range_locked(d, f, addr, len, mask);
>>> +
>>> for (vaddr i = 0; i < len; i += TARGET_PAGE_SIZE) {
>>> vaddr page = addr + i;
>>> - CPUTLBEntry *entry = tlb_entry(cpu, midx, page);
>>> -
>>> - if (tlb_flush_entry_mask_locked(entry, page, mask)) {
>>> - tlb_n_used_entries_dec(cpu, midx);
>>> - }
>>> tlb_flush_vtlb_page_mask_locked(cpu, midx, page, mask);
>>> }
>>> }
>>
>> Why don't we have the same kind of change for tlb_flush_vtlb_page_mask_locked?
>>
>> We now have two loops (for entry mask, and for page mask).
>
> It goes away in patch 15.
>
> r~
Right, looks good.
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 14/23] accel/tcg: Replace victim_tlb_hit with tlbtree_hit
2024-10-10 19:29 ` Richard Henderson
@ 2024-10-11 17:11 ` Pierrick Bouvier
0 siblings, 0 replies; 60+ messages in thread
From: Pierrick Bouvier @ 2024-10-11 17:11 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 10/10/24 12:29, Richard Henderson wrote:
> On 10/9/24 17:10, Pierrick Bouvier wrote:
>>> +static bool tlbtree_hit(CPUState *cpu, int mmu_idx,
>>> + MMUAccessType access_type, vaddr addr)
>>> {
>>> - size_t vidx;
>>> + CPUTLBDesc *desc = &cpu->neg.tlb.d[mmu_idx];
>>> + CPUTLBDescFast *fast = &cpu->neg.tlb.f[mmu_idx];
>>> + CPUTLBEntryTree *node;
>>> + size_t index;
>>> assert_cpu_is_self(cpu);
>>> - for (vidx = 0; vidx < CPU_VTLB_SIZE; ++vidx) {
>>> - CPUTLBEntry *vtlb = &cpu->neg.tlb.d[mmu_idx].vtable[vidx];
>>> - uint64_t cmp = tlb_read_idx(vtlb, access_type);
>>> -
>>> - if (cmp == page) {
> ...
>>> + if (!tlb_hit(tlb_read_idx(&node->copy, access_type), addr)) {
>>> + /* This access is not permitted. */
>>> + return false;
>>> + }
>>
>> This is not something we were checking before. If this is an addition, maybe it would be
>> better to split this out of this commit. Or maybe I missed a step in previous commits :)
>
> It's there, with the comparison and page mask, but I agree it's not obvious.
>
>
> r~
Subtle indeed; you don't expect the & TARGET_PAGE_MASK comparison to imply
that protection is checked, even if it makes sense once stated.
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 19/23] accel/tcg: Always use IntervalTree for code lookups
2024-10-11 14:47 ` Richard Henderson
@ 2024-10-11 17:55 ` Pierrick Bouvier
0 siblings, 0 replies; 60+ messages in thread
From: Pierrick Bouvier @ 2024-10-11 17:55 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
On 10/11/24 07:47, Richard Henderson wrote:
> On 10/9/24 17:35, Pierrick Bouvier wrote:
>>> @@ -1061,15 +1061,13 @@ static inline void tlb_set_compare(CPUTLBEntryFull *full,
>>> CPUTLBEntry *ent,
>>> vaddr address, int flags,
>>> MMUAccessType access_type, bool enable)
>>> {
>>> - if (enable) {
>>> - address |= flags & TLB_FLAGS_MASK;
>>> - flags &= TLB_SLOW_FLAGS_MASK;
>>> - if (flags) {
>>> - address |= TLB_FORCE_SLOW;
>>> - }
>>> - } else {
>>> - address = -1;
>>> - flags = 0;
>>> + if (!enable) {
>>> + address = TLB_INVALID_MASK;
>>> + }
>>> + address |= flags & TLB_FLAGS_MASK;
>>> + flags &= TLB_SLOW_FLAGS_MASK;
>>> + if (flags) {
>>> + address |= TLB_FORCE_SLOW;
>>> }
>>
>> I'm not sure I follow this change correctly.
>> Afterwards, the final address and flags values depend only on the flags parameter, while
>> before they depended on both the flags and enable parameters.
>
> This deserves to be split out; I even thought about it Monday night but then forgot when I
> restarted on Tuesday morning.
>
Agree.
> Before, address is -1 for disabled, mostly because that mirrors what you get with memset
> to initialize the tlb. All of the flags are discarded. But the only thing that's
> important is that TLB_INVALID_MASK is set.
> After, TLB_INVALID_MASK is still set, but the flags are retained. This is because we want
> a source of those flags to use for MMU_INST_FETCH. With this patch set we no longer store
> flags for execute and instead grab them from the flags for read. From tlb_set_page_full...
>
That's subtle. Setting TLB_INVALID_MASK seems to imply we won't check any
flags on this entry. Maybe just adding a comment here would clarify this.
>
>>> @@ -1215,9 +1213,6 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
>>> /* Now calculate the new entry */
>>> node->copy.addend = addend - addr_page;
>>> - tlb_set_compare(full, &node->copy, addr_page, read_flags,
>>> - MMU_INST_FETCH, prot & PAGE_EXEC);
>>> -
>>> if (wp_flags & BP_MEM_READ) {
>>> read_flags |= TLB_WATCHPOINT;
>>> }
>
> ... we can see that the only difference between the two is the watchpoint bit.
> Importantly, TLB_MMIO is common to all three comparators.
>
>> Sounds good to have a fast path for code fetch. Did you measure the benefit, or just
>> implement this thinking it's worth it?
>
> This is not about a fast path for code fetch, but always using the *slow* path. The
> object is to repurpose one word from CPUTLBEntry, removed here and added back in the next
> patch, to link CPUTLBEntry to CPUTLBEntryTree without changing sizeof(CPUTLBEntry).
>
I missed the fact that MMU_INST_FETCH was used only when fetching code
at translation time.
>
> r~
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree
2024-10-10 0:50 ` Pierrick Bouvier
@ 2024-10-15 0:07 ` Richard Henderson
0 siblings, 0 replies; 60+ messages in thread
From: Richard Henderson @ 2024-10-15 0:07 UTC (permalink / raw)
To: Pierrick Bouvier, BALATON Zoltan; +Cc: qemu-devel
On 10/9/24 17:50, Pierrick Bouvier wrote:
>> Eventually fixing the page size > TARGET_PAGE_SIZE performance issues.
>>
>> E.g. with a 16k or 64k aarch64 guest kernel, we still have TARGET_PAGE_SIZE at 4k, so all
>> guest pages are "large", and so run into our current behaviour of flushing the entire tlb
>> too often.
>>
>> Even without that, I expect further cleanups to improve performance, we're just not
>> there yet.
>>
>>
>> r~
>>
>
> Would merging pages over a given range be something we could benefit from too? In this
> case, entries in our tlbtree would have varying sizes, allowing us to cover more space with
> a single entry.
I don't know. I kinda doubt it, because of tlb flushing mechanics.
But we shall see once everything up until that point is done.
r~
^ permalink raw reply [flat|nested] 60+ messages in thread
end of thread
Thread overview: 60+ messages
2024-10-09 15:08 [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree Richard Henderson
2024-10-09 15:08 ` [PATCH 01/23] util/interval-tree: Introduce interval_tree_free_nodes Richard Henderson
2024-10-09 22:51 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 02/23] accel/tcg: Split out tlbfast_flush_locked Richard Henderson
2024-10-09 18:54 ` Philippe Mathieu-Daudé
2024-10-09 22:53 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 03/23] accel/tcg: Split out tlbfast_{index,entry} Richard Henderson
2024-10-09 22:55 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 04/23] accel/tcg: Split out tlbfast_flush_range_locked Richard Henderson
2024-10-09 23:05 ` Pierrick Bouvier
2024-10-10 1:20 ` Richard Henderson
2024-10-11 17:09 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 05/23] accel/tcg: Fix flags usage in mmu_lookup1, atomic_mmu_lookup Richard Henderson
2024-10-09 23:18 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 06/23] accel/tcg: Early exit for zero length in tlb_flush_range_by_mmuidx* Richard Henderson
2024-10-09 18:53 ` Philippe Mathieu-Daudé
2024-10-09 23:20 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 07/23] accel/tcg: Flush entire tlb when a masked range wraps Richard Henderson
2024-10-09 23:28 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 08/23] accel/tcg: Add IntervalTreeRoot to CPUTLBDesc Richard Henderson
2024-10-09 23:31 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 09/23] accel/tcg: Populate IntervalTree in tlb_set_page_full Richard Henderson
2024-10-09 23:50 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 10/23] accel/tcg: Remove IntervalTree entry in tlb_flush_page_locked Richard Henderson
2024-10-09 23:53 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 11/23] accel/tcg: Remove IntervalTree entries in tlb_flush_range_locked Richard Henderson
2024-10-09 23:57 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 12/23] accel/tcg: Process IntervalTree entries in tlb_reset_dirty Richard Henderson
2024-10-10 0:03 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 13/23] accel/tcg: Process IntervalTree entries in tlb_set_dirty Richard Henderson
2024-10-10 0:04 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 14/23] accel/tcg: Replace victim_tlb_hit with tlbtree_hit Richard Henderson
2024-10-10 0:10 ` Pierrick Bouvier
2024-10-10 19:29 ` Richard Henderson
2024-10-11 17:11 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 15/23] accel/tcg: Remove the victim tlb Richard Henderson
2024-10-10 0:12 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 16/23] include/exec/tlb-common: Move CPUTLBEntryFull from hw/core/cpu.h Richard Henderson
2024-10-10 0:17 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 17/23] accel/tcg: Delay plugin adjustment in probe_access_internal Richard Henderson
2024-10-10 0:19 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 18/23] accel/tcg: Call cpu_ld*_code_mmu from cpu_ld*_code Richard Henderson
2024-10-09 18:51 ` Philippe Mathieu-Daudé
2024-10-10 0:23 ` Pierrick Bouvier
2024-10-10 19:31 ` Richard Henderson
2024-10-09 15:08 ` [PATCH 19/23] accel/tcg: Always use IntervalTree for code lookups Richard Henderson
2024-10-10 0:35 ` Pierrick Bouvier
2024-10-11 14:47 ` Richard Henderson
2024-10-11 17:55 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 20/23] accel/tcg: Link CPUTLBEntry to CPUTLBEntryTree Richard Henderson
2024-10-10 0:37 ` Pierrick Bouvier
2024-10-09 15:08 ` [PATCH 21/23] accel/tcg: Remove CPUTLBDesc.fulltlb Richard Henderson
2024-10-10 0:38 ` Pierrick Bouvier
2024-10-09 15:08 ` [NOTYET PATCH 22/23] accel/tcg: Drop TCGCPUOps.tlb_fill Richard Henderson
2024-10-10 0:40 ` Pierrick Bouvier
2024-10-09 15:08 ` [NOTYET PATCH 23/23] accel/tcg: Unexport tlb_set_page* Richard Henderson
2024-10-09 16:27 ` [RFC PATCH 00/23] accel/tcg: Convert victim tlb to IntervalTree BALATON Zoltan
2024-10-09 17:10 ` Richard Henderson
2024-10-10 0:50 ` Pierrick Bouvier
2024-10-15 0:07 ` Richard Henderson