* [PATCH 0/9] Collapse CPUNegativeOffsetState into CPUState
@ 2023-06-30 12:25 Anton Johansson via
2023-06-30 12:25 ` [PATCH 1/9] target/arm: Replace TARGET_PAGE_ENTRY_EXTRA Anton Johansson via
` (9 more replies)
0 siblings, 10 replies; 16+ messages in thread
From: Anton Johansson via @ 2023-06-30 12:25 UTC (permalink / raw)
To: qemu-devel
Cc: ale, richard.henderson, pbonzini, eduardo, philmd,
marcel.apfelbaum, peter.maydell, wangyanan55
CPUNegativeOffsetState is a struct placed immediately before
CPUArchState in the ArchCPU struct. Its purpose is to ensure that
certain fields (CPUTLBDescFast, IcountDecr) lie within a small negative
offset of CPUArchState in memory. This is desirable for better code
generation on arm[32|64] and riscv hosts, whose addressing modes offer
12 and 11 bits of displacement respectively.
This patchset removes CPUNegativeOffsetState, moves its fields to
CPUState, and statically asserts that the offset from CPUArchState to
these fields is expressible in 11 bits of displacement (>= -(1 << 11)).
To achieve this, the TARGET_PAGE_ENTRY_EXTRA macro in CPUTLBEntryFull
had to be replaced with a union, making CPUTLB target-independent.
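In terms of struct layout, the series amounts to the following (a
sketch, not literal code; target-specific members elided):

    /* before */
    struct ArchCPU {
        CPUState parent_obj;
        CPUNegativeOffsetState neg;  /* CPUTLB tlb; IcountDecr icount_decr; */
        CPUArchState env;
        ...
    };

    /* after this series */
    struct ArchCPU {
        CPUState parent_obj;  /* now ends with CPUTLB tlb; IcountDecr icount_decr; */
        CPUArchState env;
        ...
    };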
The motivation for this patchset is twofold:
1. Parts of the codebase that previously depended on CPUArchState
to access either CPUTLB or IcountDecr now only depend on the
target-agnostic CPUState. This is a step towards building
accel/ once for system- and once for user-mode.
2. Targets no longer have to define a CPUNegativeOffsetState
member of ArchCPU, and QEMU will fail to compile if CPUTLB
and IcountDecr drift too far from CPUArchState.
Patches will follow that convert accel/tcg/cputlb.c and
accel/tcg/user-exec.c away from CPUArchState.
Anton Johansson (9):
target/arm: Replace TARGET_PAGE_ENTRY_EXTRA
include: Move MMUAccessType to tlb-common.h
include/exec: Move CPUTLB and friends to tlb-common.h
include/hw: introduce CPU_MAX_NEGATIVE_ENV_OFFSET
accel: Move CPUTLB to CPUState and assert offset
Move IcountDecr to CPUState and assert offset
include/exec: Remove [cpu|env]_neg() functions
target: Remove CPUNegativeOffsetState field from ArchCPU
include/exec: Remove CPUNegativeOffsetState
include/exec/cpu-all.h | 28 +-----
include/exec/cpu-defs.h | 141 ----------------------------
include/exec/exec-all.h | 2 +-
include/exec/tlb-common.h | 153 +++++++++++++++++++++++++++++++
include/hw/core/cpu.h | 27 ++++--
target/alpha/cpu.h | 1 -
target/arm/cpu-param.h | 12 ---
target/arm/cpu.h | 1 -
target/avr/cpu.h | 1 -
target/cris/cpu.h | 1 -
target/hexagon/cpu.h | 1 -
target/hppa/cpu.h | 1 -
target/i386/cpu.h | 1 -
target/loongarch/cpu.h | 1 -
target/m68k/cpu.h | 1 -
target/microblaze/cpu.h | 1 -
target/mips/cpu.h | 3 +-
target/nios2/cpu.h | 1 -
target/openrisc/cpu.h | 1 -
target/ppc/cpu.h | 1 -
target/riscv/cpu.h | 1 -
target/rx/cpu.h | 1 -
target/s390x/cpu.h | 1 -
target/sh4/cpu.h | 1 -
target/sparc/cpu.h | 1 -
target/tricore/cpu.h | 1 -
target/xtensa/cpu.h | 3 +-
accel/tcg/cpu-exec.c | 14 +--
accel/tcg/tcg-accel-ops-icount.c | 6 +-
accel/tcg/tcg-accel-ops.c | 2 +-
accel/tcg/translate-all.c | 19 +++-
accel/tcg/translator.c | 15 ++-
softmmu/icount.c | 2 +-
target/arm/ptw.c | 4 +-
target/arm/tcg/mte_helper.c | 2 +-
target/arm/tcg/sve_helper.c | 2 +-
target/arm/tcg/tlb_helper.c | 4 +-
target/arm/tcg/translate-a64.c | 2 +-
38 files changed, 222 insertions(+), 238 deletions(-)
--
2.41.0
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 1/9] target/arm: Replace TARGET_PAGE_ENTRY_EXTRA
2023-06-30 12:25 [PATCH 0/9] Collapse CPUNegativeOffsetState into CPUState Anton Johansson via
@ 2023-06-30 12:25 ` Anton Johansson via
2023-06-30 12:25 ` [PATCH 2/9] include: Move MMUAccessType to tlb-common.h Anton Johansson via
` (8 subsequent siblings)
9 siblings, 0 replies; 16+ messages in thread
From: Anton Johansson via @ 2023-06-30 12:25 UTC (permalink / raw)
To: qemu-devel
Cc: ale, richard.henderson, pbonzini, eduardo, philmd,
marcel.apfelbaum, peter.maydell, wangyanan55
From: Anton Johansson <97antjoh@gmail.com>
TARGET_PAGE_ENTRY_EXTRA is a macro that lets targets specify additional
fields to be cached alongside the full TLB entry. Replace it with a
union in CPUTLBEntryFull, making CPUTLB target-agnostic at the cost of
a slightly larger CPUTLBEntryFull for non-arm targets.
(arm is the only target actually making use of this feature.)
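Should another target ever need the same facility, it would add its own
member to this union instead of defining a macro; a purely hypothetical
sketch (no such target exists in this series):

    union {
        struct {
            uint8_t pte_attrs;
            uint8_t shareability;
            bool guarded;
        } arm;
        struct {
            uint8_t mmu_attrs;   /* hypothetical field, for illustration only */
        } othertarget;
    } extra;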
Signed-off-by: Anton Johansson <anjo@rev.ng>
---
include/exec/cpu-defs.h | 18 +++++++++++++++---
target/arm/cpu-param.h | 12 ------------
target/arm/ptw.c | 4 ++--
target/arm/tcg/mte_helper.c | 2 +-
target/arm/tcg/sve_helper.c | 2 +-
target/arm/tcg/tlb_helper.c | 4 ++--
target/arm/tcg/translate-a64.c | 2 +-
7 files changed, 22 insertions(+), 22 deletions(-)
diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
index fb4c8d480f..0a600a312b 100644
--- a/include/exec/cpu-defs.h
+++ b/include/exec/cpu-defs.h
@@ -135,9 +135,21 @@ typedef struct CPUTLBEntryFull {
* This may be used to cache items from the guest cpu
* page tables for later use by the implementation.
*/
-#ifdef TARGET_PAGE_ENTRY_EXTRA
- TARGET_PAGE_ENTRY_EXTRA
-#endif
+ union {
+ /*
+ * Cache the attrs and shareability fields from the page table entry.
+ *
+ * For ARMMMUIdx_Stage2*, pte_attrs is the S2 descriptor bits [5:2].
+ * Otherwise, pte_attrs is the same as the MAIR_EL1 8-bit format.
+ * For shareability and guarded, as in the SH and GP fields respectively
+ * of the VMSAv8-64 PTEs.
+ */
+ struct {
+ uint8_t pte_attrs;
+ uint8_t shareability;
+ bool guarded;
+ } arm;
+ } extra;
} CPUTLBEntryFull;
#endif /* CONFIG_SOFTMMU */
diff --git a/target/arm/cpu-param.h b/target/arm/cpu-param.h
index b3b35f7aa1..f9b462a98f 100644
--- a/target/arm/cpu-param.h
+++ b/target/arm/cpu-param.h
@@ -31,18 +31,6 @@
# define TARGET_PAGE_BITS_VARY
# define TARGET_PAGE_BITS_MIN 10
-/*
- * Cache the attrs and shareability fields from the page table entry.
- *
- * For ARMMMUIdx_Stage2*, pte_attrs is the S2 descriptor bits [5:2].
- * Otherwise, pte_attrs is the same as the MAIR_EL1 8-bit format.
- * For shareability and guarded, as in the SH and GP fields respectively
- * of the VMSAv8-64 PTEs.
- */
-# define TARGET_PAGE_ENTRY_EXTRA \
- uint8_t pte_attrs; \
- uint8_t shareability; \
- bool guarded;
#endif
#endif
diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index 6015121b99..22e02579cb 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -499,7 +499,7 @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
}
ptw->out_phys = full->phys_addr | (addr & ~TARGET_PAGE_MASK);
ptw->out_rw = full->prot & PAGE_WRITE;
- pte_attrs = full->pte_attrs;
+ pte_attrs = full->extra.arm.pte_attrs;
ptw->out_secure = full->attrs.secure;
ptw->out_space = full->attrs.space;
#else
@@ -1933,7 +1933,7 @@ static bool get_phys_addr_lpae(CPUARMState *env, S1Translate *ptw,
/* When in aarch64 mode, and BTI is enabled, remember GP in the TLB. */
if (aarch64 && cpu_isar_feature(aa64_bti, cpu)) {
- result->f.guarded = extract64(attrs, 50, 1); /* GP */
+ result->f.extra.arm.guarded = extract64(attrs, 50, 1); /* GP */
}
}
diff --git a/target/arm/tcg/mte_helper.c b/target/arm/tcg/mte_helper.c
index 9c64def081..be30c2bb51 100644
--- a/target/arm/tcg/mte_helper.c
+++ b/target/arm/tcg/mte_helper.c
@@ -124,7 +124,7 @@ static uint8_t *allocation_tag_mem(CPUARMState *env, int ptr_mmu_idx,
assert(!(flags & TLB_INVALID_MASK));
/* If the virtual page MemAttr != Tagged, access unchecked. */
- if (full->pte_attrs != 0xf0) {
+ if (full->extra.arm.pte_attrs != 0xf0) {
return NULL;
}
diff --git a/target/arm/tcg/sve_helper.c b/target/arm/tcg/sve_helper.c
index 0097522470..996c04d3d9 100644
--- a/target/arm/tcg/sve_helper.c
+++ b/target/arm/tcg/sve_helper.c
@@ -5373,7 +5373,7 @@ bool sve_probe_page(SVEHostPage *info, bool nofault, CPUARMState *env,
info->tagged = (flags & PAGE_ANON) && (flags & PAGE_MTE);
#else
info->attrs = full->attrs;
- info->tagged = full->pte_attrs == 0xf0;
+ info->tagged = full->extra.arm.pte_attrs == 0xf0;
#endif
/* Ensure that info->host[] is relative to addr, not addr + mem_off. */
diff --git a/target/arm/tcg/tlb_helper.c b/target/arm/tcg/tlb_helper.c
index b22b2a4c6e..59bff8b452 100644
--- a/target/arm/tcg/tlb_helper.c
+++ b/target/arm/tcg/tlb_helper.c
@@ -334,8 +334,8 @@ bool arm_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
address &= TARGET_PAGE_MASK;
}
- res.f.pte_attrs = res.cacheattrs.attrs;
- res.f.shareability = res.cacheattrs.shareability;
+ res.f.extra.arm.pte_attrs = res.cacheattrs.attrs;
+ res.f.extra.arm.shareability = res.cacheattrs.shareability;
tlb_set_page_full(cs, mmu_idx, address, &res.f);
return true;
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index 3baab6aa60..0c5e275ac8 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -13763,7 +13763,7 @@ static bool is_guarded_page(CPUARMState *env, DisasContext *s)
false, &host, &full, 0);
assert(!(flags & TLB_INVALID_MASK));
- return full->guarded;
+ return full->extra.arm.guarded;
#endif
}
--
2.41.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 2/9] include: Move MMUAccessType to tlb-common.h
2023-06-30 12:25 [PATCH 0/9] Collapse CPUNegativeOffsetState into CPUState Anton Johansson via
2023-06-30 12:25 ` [PATCH 1/9] target/arm: Replace TARGET_PAGE_ENTRY_EXTRA Anton Johansson via
@ 2023-06-30 12:25 ` Anton Johansson via
2023-06-30 12:25 ` [PATCH 3/9] include/exec: Move CPUTLB and friends " Anton Johansson via
` (7 subsequent siblings)
9 siblings, 0 replies; 16+ messages in thread
From: Anton Johansson via @ 2023-06-30 12:25 UTC (permalink / raw)
To: qemu-devel
Cc: ale, richard.henderson, pbonzini, eduardo, philmd,
marcel.apfelbaum, peter.maydell, wangyanan55
This commit prepares for the following commits by avoiding a circular
inclusion: CPUTLB and friends will be moved from cpu-defs.h to
tlb-common.h and depend on MMU_ACCESS_COUNT, which is currently defined
in hw/core/cpu.h, while hw/core/cpu.h will in turn need to include
tlb-common.h once the CPUTLB field has been moved to CPUState. Moving
MMUAccessType to tlb-common.h breaks that cycle.
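The resulting include direction, sketched:

    /*
     *  include/exec/tlb-common.h
     *      defines MMUAccessType and MMU_ACCESS_COUNT (this patch),
     *      later also CPUTLB and friends (patch 3)
     *            ^
     *            | #include
     *  include/hw/core/cpu.h
     *      will embed CPUTLB in CPUState (patch 5)
     */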
Signed-off-by: Anton Johansson <anjo@rev.ng>
---
include/exec/tlb-common.h | 7 ++++++-
include/hw/core/cpu.h | 8 +-------
2 files changed, 7 insertions(+), 8 deletions(-)
diff --git a/include/exec/tlb-common.h b/include/exec/tlb-common.h
index dc5a5faa0b..d1203354b4 100644
--- a/include/exec/tlb-common.h
+++ b/include/exec/tlb-common.h
@@ -19,7 +19,12 @@
#ifndef EXEC_TLB_COMMON_H
#define EXEC_TLB_COMMON_H 1
-#define CPU_TLB_ENTRY_BITS 5
+typedef enum MMUAccessType {
+ MMU_DATA_LOAD = 0,
+ MMU_DATA_STORE = 1,
+ MMU_INST_FETCH = 2
+#define MMU_ACCESS_COUNT 3
+} MMUAccessType;
/* Minimalized TLB entry for use by TCG fast path. */
typedef union CPUTLBEntry {
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index b08f8b7079..c226d7263c 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -25,6 +25,7 @@
#include "exec/cpu-common.h"
#include "exec/hwaddr.h"
#include "exec/memattrs.h"
+#include "exec/tlb-common.h"
#include "qapi/qapi-types-run-state.h"
#include "qemu/bitmap.h"
#include "qemu/rcu_queue.h"
@@ -80,13 +81,6 @@ DECLARE_CLASS_CHECKERS(CPUClass, CPU,
typedef struct ArchCPU CpuInstanceType; \
OBJECT_DECLARE_TYPE(ArchCPU, CpuClassType, CPU_MODULE_OBJ_NAME);
-typedef enum MMUAccessType {
- MMU_DATA_LOAD = 0,
- MMU_DATA_STORE = 1,
- MMU_INST_FETCH = 2
-#define MMU_ACCESS_COUNT 3
-} MMUAccessType;
-
typedef struct CPUWatchpoint CPUWatchpoint;
/* see tcg-cpu-ops.h */
--
2.41.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 3/9] include/exec: Move CPUTLB and friends to tlb-common.h
2023-06-30 12:25 [PATCH 0/9] Collapse CPUNegativeOffsetState into CPUState Anton Johansson via
2023-06-30 12:25 ` [PATCH 1/9] target/arm: Replace TARGET_PAGE_ENTRY_EXTRA Anton Johansson via
2023-06-30 12:25 ` [PATCH 2/9] include: Move MMUAccessType to tlb-common.h Anton Johansson via
@ 2023-06-30 12:25 ` Anton Johansson via
2023-06-30 12:25 ` [PATCH 4/9] include/hw: introduce CPU_MAX_NEGATIVE_ENV_OFFSET Anton Johansson via
` (6 subsequent siblings)
9 siblings, 0 replies; 16+ messages in thread
From: Anton Johansson via @ 2023-06-30 12:25 UTC (permalink / raw)
To: qemu-devel
Cc: ale, richard.henderson, pbonzini, eduardo, philmd,
marcel.apfelbaum, peter.maydell, wangyanan55
CPUTLB is now target-agnostic and can be moved to a more suitable
header.
Signed-off-by: Anton Johansson <anjo@rev.ng>
---
include/exec/cpu-defs.h | 145 +------------------------------------
include/exec/tlb-common.h | 148 ++++++++++++++++++++++++++++++++++++++
2 files changed, 149 insertions(+), 144 deletions(-)
diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
index 0a600a312b..dff6c37f6b 100644
--- a/include/exec/cpu-defs.h
+++ b/include/exec/cpu-defs.h
@@ -25,10 +25,7 @@
#include "qemu/host-utils.h"
#include "qemu/thread.h"
-#ifndef CONFIG_USER_ONLY
-#include "exec/hwaddr.h"
-#endif
-#include "exec/memattrs.h"
+#include "exec/tlb-common.h"
#include "hw/core/cpu.h"
#include "cpu-param.h"
@@ -54,17 +51,7 @@
#include "exec/target_long.h"
-/*
- * Fix the number of mmu modes to 16, which is also the maximum
- * supported by the softmmu tlb api.
- */
-#define NB_MMU_MODES 16
-
#if defined(CONFIG_SOFTMMU) && defined(CONFIG_TCG)
-#include "exec/tlb-common.h"
-
-/* use a fully associative victim tlb of 8 entries */
-#define CPU_VTLB_SIZE 8
#define CPU_TLB_DYN_MIN_BITS 6
#define CPU_TLB_DYN_DEFAULT_BITS 8
@@ -91,136 +78,6 @@
#endif /* CONFIG_SOFTMMU && CONFIG_TCG */
-#if defined(CONFIG_SOFTMMU)
-/*
- * The full TLB entry, which is not accessed by generated TCG code,
- * so the layout is not as critical as that of CPUTLBEntry. This is
- * also why we don't want to combine the two structs.
- */
-typedef struct CPUTLBEntryFull {
- /*
- * @xlat_section contains:
- * - in the lower TARGET_PAGE_BITS, a physical section number
- * - with the lower TARGET_PAGE_BITS masked off, an offset which
- * must be added to the virtual address to obtain:
- * + the ram_addr_t of the target RAM (if the physical section
- * number is PHYS_SECTION_NOTDIRTY or PHYS_SECTION_ROM)
- * + the offset within the target MemoryRegion (otherwise)
- */
- hwaddr xlat_section;
-
- /*
- * @phys_addr contains the physical address in the address space
- * given by cpu_asidx_from_attrs(cpu, @attrs).
- */
- hwaddr phys_addr;
-
- /* @attrs contains the memory transaction attributes for the page. */
- MemTxAttrs attrs;
-
- /* @prot contains the complete protections for the page. */
- uint8_t prot;
-
- /* @lg_page_size contains the log2 of the page size. */
- uint8_t lg_page_size;
-
- /*
- * Additional tlb flags for use by the slow path. If non-zero,
- * the corresponding CPUTLBEntry comparator must have TLB_FORCE_SLOW.
- */
- uint8_t slow_flags[MMU_ACCESS_COUNT];
-
- /*
- * Allow target-specific additions to this structure.
- * This may be used to cache items from the guest cpu
- * page tables for later use by the implementation.
- */
- union {
- /*
- * Cache the attrs and shareability fields from the page table entry.
- *
- * For ARMMMUIdx_Stage2*, pte_attrs is the S2 descriptor bits [5:2].
- * Otherwise, pte_attrs is the same as the MAIR_EL1 8-bit format.
- * For shareability and guarded, as in the SH and GP fields respectively
- * of the VMSAv8-64 PTEs.
- */
- struct {
- uint8_t pte_attrs;
- uint8_t shareability;
- bool guarded;
- } arm;
- } extra;
-} CPUTLBEntryFull;
-#endif /* CONFIG_SOFTMMU */
-
-#if defined(CONFIG_SOFTMMU) && defined(CONFIG_TCG)
-/*
- * Data elements that are per MMU mode, minus the bits accessed by
- * the TCG fast path.
- */
-typedef struct CPUTLBDesc {
- /*
- * Describe a region covering all of the large pages allocated
- * into the tlb. When any page within this region is flushed,
- * we must flush the entire tlb. The region is matched if
- * (addr & large_page_mask) == large_page_addr.
- */
- vaddr large_page_addr;
- vaddr large_page_mask;
- /* host time (in ns) at the beginning of the time window */
- int64_t window_begin_ns;
- /* maximum number of entries observed in the window */
- size_t window_max_entries;
- size_t n_used_entries;
- /* The next index to use in the tlb victim table. */
- size_t vindex;
- /* The tlb victim table, in two parts. */
- CPUTLBEntry vtable[CPU_VTLB_SIZE];
- CPUTLBEntryFull vfulltlb[CPU_VTLB_SIZE];
- CPUTLBEntryFull *fulltlb;
-} CPUTLBDesc;
-
-/*
- * Data elements that are shared between all MMU modes.
- */
-typedef struct CPUTLBCommon {
- /* Serialize updates to f.table and d.vtable, and others as noted. */
- QemuSpin lock;
- /*
- * Within dirty, for each bit N, modifications have been made to
- * mmu_idx N since the last time that mmu_idx was flushed.
- * Protected by tlb_c.lock.
- */
- uint16_t dirty;
- /*
- * Statistics. These are not lock protected, but are read and
- * written atomically. This allows the monitor to print a snapshot
- * of the stats without interfering with the cpu.
- */
- size_t full_flush_count;
- size_t part_flush_count;
- size_t elide_flush_count;
-} CPUTLBCommon;
-
-/*
- * The entire softmmu tlb, for all MMU modes.
- * The meaning of each of the MMU modes is defined in the target code.
- * Since this is placed within CPUNegativeOffsetState, the smallest
- * negative offsets are at the end of the struct.
- */
-
-typedef struct CPUTLB {
- CPUTLBCommon c;
- CPUTLBDesc d[NB_MMU_MODES];
- CPUTLBDescFast f[NB_MMU_MODES];
-} CPUTLB;
-
-#else
-
-typedef struct CPUTLB { } CPUTLB;
-
-#endif /* CONFIG_SOFTMMU && CONFIG_TCG */
-
/*
* This structure must be placed in ArchCPU immediately
* before CPUArchState, as a field named "neg".
diff --git a/include/exec/tlb-common.h b/include/exec/tlb-common.h
index d1203354b4..838a1f0d2a 100644
--- a/include/exec/tlb-common.h
+++ b/include/exec/tlb-common.h
@@ -19,6 +19,17 @@
#ifndef EXEC_TLB_COMMON_H
#define EXEC_TLB_COMMON_H 1
+#ifndef CONFIG_USER_ONLY
+#include "exec/hwaddr.h"
+#endif
+#include "exec/memattrs.h"
+
+/*
+ * Fix the number of mmu modes to 16, which is also the maximum
+ * supported by the softmmu tlb api.
+ */
+#define NB_MMU_MODES 16
+
typedef enum MMUAccessType {
MMU_DATA_LOAD = 0,
MMU_DATA_STORE = 1,
@@ -26,6 +37,13 @@ typedef enum MMUAccessType {
#define MMU_ACCESS_COUNT 3
} MMUAccessType;
+#if defined(CONFIG_SOFTMMU) && defined(CONFIG_TCG)
+/* use a fully associative victim tlb of 8 entries */
+#define CPU_VTLB_SIZE 8
+#endif
+
+#define CPU_TLB_ENTRY_BITS 5
+
/* Minimalized TLB entry for use by TCG fast path. */
typedef union CPUTLBEntry {
struct {
@@ -58,4 +76,134 @@ typedef struct CPUTLBDescFast {
CPUTLBEntry *table;
} CPUTLBDescFast QEMU_ALIGNED(2 * sizeof(void *));
+#if defined(CONFIG_SOFTMMU)
+/*
+ * The full TLB entry, which is not accessed by generated TCG code,
+ * so the layout is not as critical as that of CPUTLBEntry. This is
+ * also why we don't want to combine the two structs.
+ */
+typedef struct CPUTLBEntryFull {
+ /*
+ * @xlat_section contains:
+ * - in the lower TARGET_PAGE_BITS, a physical section number
+ * - with the lower TARGET_PAGE_BITS masked off, an offset which
+ * must be added to the virtual address to obtain:
+ * + the ram_addr_t of the target RAM (if the physical section
+ * number is PHYS_SECTION_NOTDIRTY or PHYS_SECTION_ROM)
+ * + the offset within the target MemoryRegion (otherwise)
+ */
+ hwaddr xlat_section;
+
+ /*
+ * @phys_addr contains the physical address in the address space
+ * given by cpu_asidx_from_attrs(cpu, @attrs).
+ */
+ hwaddr phys_addr;
+
+ /* @attrs contains the memory transaction attributes for the page. */
+ MemTxAttrs attrs;
+
+ /* @prot contains the complete protections for the page. */
+ uint8_t prot;
+
+ /* @lg_page_size contains the log2 of the page size. */
+ uint8_t lg_page_size;
+
+ /*
+ * Additional tlb flags for use by the slow path. If non-zero,
+ * the corresponding CPUTLBEntry comparator must have TLB_FORCE_SLOW.
+ */
+ uint8_t slow_flags[MMU_ACCESS_COUNT];
+
+ /*
+ * Allow target-specific additions to this structure.
+ * This may be used to cache items from the guest cpu
+ * page tables for later use by the implementation.
+ */
+ union {
+ /*
+ * Cache the attrs and shareability fields from the page table entry.
+ *
+ * For ARMMMUIdx_Stage2*, pte_attrs is the S2 descriptor bits [5:2].
+ * Otherwise, pte_attrs is the same as the MAIR_EL1 8-bit format.
+ * For shareability and guarded, as in the SH and GP fields respectively
+ * of the VMSAv8-64 PTEs.
+ */
+ struct {
+ uint8_t pte_attrs;
+ uint8_t shareability;
+ bool guarded;
+ } arm;
+ } extra;
+} CPUTLBEntryFull;
+#endif /* CONFIG_SOFTMMU */
+
+#if defined(CONFIG_SOFTMMU) && defined(CONFIG_TCG)
+/*
+ * Data elements that are per MMU mode, minus the bits accessed by
+ * the TCG fast path.
+ */
+typedef struct CPUTLBDesc {
+ /*
+ * Describe a region covering all of the large pages allocated
+ * into the tlb. When any page within this region is flushed,
+ * we must flush the entire tlb. The region is matched if
+ * (addr & large_page_mask) == large_page_addr.
+ */
+ vaddr large_page_addr;
+ vaddr large_page_mask;
+ /* host time (in ns) at the beginning of the time window */
+ int64_t window_begin_ns;
+ /* maximum number of entries observed in the window */
+ size_t window_max_entries;
+ size_t n_used_entries;
+ /* The next index to use in the tlb victim table. */
+ size_t vindex;
+ /* The tlb victim table, in two parts. */
+ CPUTLBEntry vtable[CPU_VTLB_SIZE];
+ CPUTLBEntryFull vfulltlb[CPU_VTLB_SIZE];
+ CPUTLBEntryFull *fulltlb;
+} CPUTLBDesc;
+
+/*
+ * Data elements that are shared between all MMU modes.
+ */
+typedef struct CPUTLBCommon {
+ /* Serialize updates to f.table and d.vtable, and others as noted. */
+ QemuSpin lock;
+ /*
+ * Within dirty, for each bit N, modifications have been made to
+ * mmu_idx N since the last time that mmu_idx was flushed.
+ * Protected by tlb_c.lock.
+ */
+ uint16_t dirty;
+ /*
+ * Statistics. These are not lock protected, but are read and
+ * written atomically. This allows the monitor to print a snapshot
+ * of the stats without interfering with the cpu.
+ */
+ size_t full_flush_count;
+ size_t part_flush_count;
+ size_t elide_flush_count;
+} CPUTLBCommon;
+
+/*
+ * The entire softmmu tlb, for all MMU modes.
+ * The meaning of each of the MMU modes is defined in the target code.
+ * Since this is placed within CPUNegativeOffsetState, the smallest
+ * negative offsets are at the end of the struct.
+ */
+
+typedef struct CPUTLB {
+ CPUTLBCommon c;
+ CPUTLBDesc d[NB_MMU_MODES];
+ CPUTLBDescFast f[NB_MMU_MODES];
+} CPUTLB;
+
+#else
+
+typedef struct CPUTLB { } CPUTLB;
+
+#endif /* CONFIG_SOFTMMU && CONFIG_TCG */
+
#endif /* EXEC_TLB_COMMON_H */
--
2.41.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 4/9] include/hw: introduce CPU_MAX_NEGATIVE_ENV_OFFSET
2023-06-30 12:25 [PATCH 0/9] Collapse CPUNegativeOffsetState into CPUState Anton Johansson via
` (2 preceding siblings ...)
2023-06-30 12:25 ` [PATCH 3/9] include/exec: Move CPUTLB and friends " Anton Johansson via
@ 2023-06-30 12:25 ` Anton Johansson via
2023-06-30 14:19 ` Richard Henderson
2023-06-30 12:25 ` [PATCH 5/9] accel: Move CPUTLB to CPUState and assert offset Anton Johansson via
` (5 subsequent siblings)
9 siblings, 1 reply; 16+ messages in thread
From: Anton Johansson via @ 2023-06-30 12:25 UTC (permalink / raw)
To: qemu-devel
Cc: ale, richard.henderson, pbonzini, eduardo, philmd,
marcel.apfelbaum, peter.maydell, wangyanan55
For reasons related to code-generation quality, the offsets of
CPUTLBDescFast and IcountDecr from CPUArchState need to fit within
11 bits of displacement (arm[32|64] and riscv addressing modes).
This commit introduces a new constant holding the maximum allowed
negative offset, so that later patches can statically assert that the
limit holds.
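The intended use, as it appears in the translate-all.c hunk of patch 5,
is roughly the following (sketch):

    #define TLB_FAST_OFFSET \
        ((int)offsetof(ArchCPU, parent_obj.tlb.f) - (int)offsetof(ArchCPU, env))

    QEMU_BUILD_BUG_ON(TLB_FAST_OFFSET < CPU_MAX_NEGATIVE_ENV_OFFSET ||
                      TLB_FAST_OFFSET > 0);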
Signed-off-by: Anton Johansson <anjo@rev.ng>
---
include/hw/core/cpu.h | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index c226d7263c..0377f74d48 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -259,6 +259,17 @@ struct qemu_work_item;
#define CPU_UNSET_NUMA_NODE_ID -1
+/*
+ * For reasons related to code-generation quality the fast path
+ * CPUTLBDescFast array and IcountDecr fields need to be located within a
+ * small negative offset of CPUArchState. This requirement comes from
+ * host-specific addressing modes of arm[32|64] and riscv which uses 12-
+ * and 11 bits of displacement respectively.
+ */
+#define CPU_MIN_DISPLACEMENT_BITS 11
+#define CPU_MAX_NEGATIVE_ENV_OFFSET \
+ (-(1 << CPU_MIN_DISPLACEMENT_BITS))
+
/**
* CPUState:
* @cpu_index: CPU index (informative).
--
2.41.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 5/9] accel: Move CPUTLB to CPUState and assert offset
2023-06-30 12:25 [PATCH 0/9] Collapse CPUNegativeOffsetState into CPUState Anton Johansson via
` (3 preceding siblings ...)
2023-06-30 12:25 ` [PATCH 4/9] include/hw: introduce CPU_MAX_NEGATIVE_ENV_OFFSET Anton Johansson via
@ 2023-06-30 12:25 ` Anton Johansson via
2023-06-30 14:16 ` Richard Henderson
2023-06-30 12:25 ` [PATCH 6/9] Move IcountDecr " Anton Johansson via
` (4 subsequent siblings)
9 siblings, 1 reply; 16+ messages in thread
From: Anton Johansson via @ 2023-06-30 12:25 UTC (permalink / raw)
To: qemu-devel
Cc: ale, richard.henderson, pbonzini, eduardo, philmd,
marcel.apfelbaum, peter.maydell, wangyanan55
As CPUTLB is now target-agnostic, it can be moved from
CPUNegativeOffsetState to CPUState, and the negative offset from
CPUArchState can instead be statically asserted to be no smaller than
CPU_MAX_NEGATIVE_ENV_OFFSET.
This also opens the door to reducing the dependency of common code
on CPUArchState.
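For instance, with the field in CPUState a target-agnostic accessor
could be as simple as the sketch below; no such helper is added by this
patch, which only redirects env_tlb() to &env_cpu(env)->tlb:

    static inline CPUTLB *cpu_tlb(CPUState *cpu)
    {
        return &cpu->tlb;
    }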
Signed-off-by: Anton Johansson <anjo@rev.ng>
---
include/exec/cpu-all.h | 2 +-
include/exec/cpu-defs.h | 2 --
include/exec/tlb-common.h | 4 ++--
include/hw/core/cpu.h | 7 +++++++
accel/tcg/translate-all.c | 13 +++++++++++--
5 files changed, 21 insertions(+), 7 deletions(-)
diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index 8018ce783e..706daa49ec 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -453,7 +453,7 @@ static inline CPUNegativeOffsetState *cpu_neg(CPUState *cpu)
*/
static inline CPUTLB *env_tlb(CPUArchState *env)
{
- return &env_neg(env)->tlb;
+ return &env_cpu(env)->tlb;
}
#endif /* CPU_ALL_H */
diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
index dff6c37f6b..add0f3c541 100644
--- a/include/exec/cpu-defs.h
+++ b/include/exec/cpu-defs.h
@@ -25,7 +25,6 @@
#include "qemu/host-utils.h"
#include "qemu/thread.h"
-#include "exec/tlb-common.h"
#include "hw/core/cpu.h"
#include "cpu-param.h"
@@ -83,7 +82,6 @@
* before CPUArchState, as a field named "neg".
*/
typedef struct CPUNegativeOffsetState {
- CPUTLB tlb;
IcountDecr icount_decr;
} CPUNegativeOffsetState;
diff --git a/include/exec/tlb-common.h b/include/exec/tlb-common.h
index 838a1f0d2a..450c0156bf 100644
--- a/include/exec/tlb-common.h
+++ b/include/exec/tlb-common.h
@@ -190,8 +190,8 @@ typedef struct CPUTLBCommon {
/*
* The entire softmmu tlb, for all MMU modes.
* The meaning of each of the MMU modes is defined in the target code.
- * Since this is placed within CPUNegativeOffsetState, the smallest
- * negative offsets are at the end of the struct.
+ * Since this is placed within CPUState, the smallest negative offsets
+ * are at the end of the struct.
*/
typedef struct CPUTLB {
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 0377f74d48..adf6158899 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -448,6 +448,13 @@ struct CPUState {
/* track IOMMUs whose translations we've cached in the TCG TLB */
GArray *iommu_notifiers;
+
+ /*
+ * The following fields needs to be within CPU_MAX_NEGATIVE_ENV_OFFSET of
+ * CPUArchState. As CPUArchState is assumed to follow CPUState in the
+ * ArchCPU struct these are placed last. This is checked statically.
+ */
+ CPUTLB tlb;
};
typedef QTAILQ_HEAD(CPUTailQ, CPUState) CPUTailQ;
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index d3d4fbc1a4..5582aaf653 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -339,8 +339,17 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
tcg_ctx->page_bits = TARGET_PAGE_BITS;
tcg_ctx->page_mask = TARGET_PAGE_MASK;
tcg_ctx->tlb_dyn_max_bits = CPU_TLB_DYN_MAX_BITS;
- tcg_ctx->tlb_fast_offset =
- (int)offsetof(ArchCPU, neg.tlb.f) - (int)offsetof(ArchCPU, env);
+
+#define TLB_FAST_OFFSET \
+ ((int)offsetof(ArchCPU, parent_obj.tlb.f) - (int)offsetof(ArchCPU, env))
+
+ QEMU_BUILD_BUG_ON(TLB_FAST_OFFSET < CPU_MAX_NEGATIVE_ENV_OFFSET ||
+ TLB_FAST_OFFSET > 0);
+
+ tcg_ctx->tlb_fast_offset = TLB_FAST_OFFSET;
+
+#undef TLB_FAST_OFFSET
+
#endif
tcg_ctx->insn_start_words = TARGET_INSN_START_WORDS;
#ifdef TCG_GUEST_DEFAULT_MO
--
2.41.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 6/9] Move IcountDecr to CPUState and assert offset
2023-06-30 12:25 [PATCH 0/9] Collapse CPUNegativeOffsetState into CPUState Anton Johansson via
` (4 preceding siblings ...)
2023-06-30 12:25 ` [PATCH 5/9] accel: Move CPUTLB to CPUState and assert offset Anton Johansson via
@ 2023-06-30 12:25 ` Anton Johansson via
2023-06-30 12:25 ` [PATCH 7/9] include/exec: Remove [cpu|env]_neg() functions Anton Johansson via
` (3 subsequent siblings)
9 siblings, 0 replies; 16+ messages in thread
From: Anton Johansson via @ 2023-06-30 12:25 UTC (permalink / raw)
To: qemu-devel
Cc: ale, richard.henderson, pbonzini, eduardo, philmd,
marcel.apfelbaum, peter.maydell, wangyanan55
Instead of relying on CPUNegativeOffsetState to keep the negative
offset between IcountDecr and CPUArchState small enough, the offset can
be statically asserted to be no smaller than CPU_MAX_NEGATIVE_ENV_OFFSET.
Additionally, moving IcountDecr to CPUState allows the dependency of
common code on CPUArchState to be reduced in the future.
Signed-off-by: Anton Johansson <anjo@rev.ng>
---
include/exec/cpu-all.h | 2 +-
include/exec/cpu-defs.h | 1 -
include/exec/exec-all.h | 2 +-
include/hw/core/cpu.h | 1 +
accel/tcg/cpu-exec.c | 14 +++++++-------
accel/tcg/tcg-accel-ops-icount.c | 6 +++---
accel/tcg/tcg-accel-ops.c | 2 +-
accel/tcg/translate-all.c | 6 +++---
accel/tcg/translator.c | 15 +++++++++++----
softmmu/icount.c | 2 +-
10 files changed, 29 insertions(+), 22 deletions(-)
diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index 706daa49ec..a31df11770 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -396,7 +396,7 @@ void tcg_exec_unrealizefn(CPUState *cpu);
static inline void cpu_set_cpustate_pointers(ArchCPU *cpu)
{
cpu->parent_obj.env_ptr = &cpu->env;
- cpu->parent_obj.icount_decr_ptr = &cpu->neg.icount_decr;
+ cpu->parent_obj.icount_decr_ptr = &cpu->parent_obj.icount_decr;
}
/**
diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
index add0f3c541..429d9525f7 100644
--- a/include/exec/cpu-defs.h
+++ b/include/exec/cpu-defs.h
@@ -82,7 +82,6 @@
* before CPUArchState, as a field named "neg".
*/
typedef struct CPUNegativeOffsetState {
- IcountDecr icount_decr;
} CPUNegativeOffsetState;
#endif
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 200c27eadf..3bc4e34909 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -71,7 +71,7 @@ G_NORETURN void cpu_loop_exit_atomic(CPUState *cpu, uintptr_t pc);
*/
static inline bool cpu_loop_exit_requested(CPUState *cpu)
{
- return (int32_t)qatomic_read(&cpu_neg(cpu)->icount_decr.u32) < 0;
+ return (int32_t)qatomic_read(&cpu->icount_decr.u32) < 0;
}
#if !defined(CONFIG_USER_ONLY) && defined(CONFIG_TCG)
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index adf6158899..c2e78b5fa9 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -455,6 +455,7 @@ struct CPUState {
* ArchCPU struct these are placed last. This is checked statically.
*/
CPUTLB tlb;
+ IcountDecr icount_decr;
};
typedef QTAILQ_HEAD(CPUTailQ, CPUState) CPUTailQ;
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index ba1890a373..0a49b6fd7e 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -73,7 +73,7 @@ static void align_clocks(SyncClocks *sc, CPUState *cpu)
return;
}
- cpu_icount = cpu->icount_extra + cpu_neg(cpu)->icount_decr.u16.low;
+ cpu_icount = cpu->icount_extra + cpu->icount_decr.u16.low;
sc->diff_clk += icount_to_ns(sc->last_cpu_icount - cpu_icount);
sc->last_cpu_icount = cpu_icount;
@@ -124,7 +124,7 @@ static void init_delay_params(SyncClocks *sc, CPUState *cpu)
sc->realtime_clock = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL_RT);
sc->diff_clk = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) - sc->realtime_clock;
sc->last_cpu_icount
- = cpu->icount_extra + cpu_neg(cpu)->icount_decr.u16.low;
+ = cpu->icount_extra + cpu->icount_decr.u16.low;
if (sc->diff_clk < max_delay) {
max_delay = sc->diff_clk;
}
@@ -689,7 +689,7 @@ static inline bool cpu_handle_exception(CPUState *cpu, int *ret)
if (cpu->exception_index < 0) {
#ifndef CONFIG_USER_ONLY
if (replay_has_exception()
- && cpu_neg(cpu)->icount_decr.u16.low + cpu->icount_extra == 0) {
+ && cpu->icount_decr.u16.low + cpu->icount_extra == 0) {
/* Execute just one insn to trigger exception pending in the log */
cpu->cflags_next_tb = (curr_cflags(cpu) & ~CF_USE_ICOUNT)
| CF_NOIRQ | 1;
@@ -779,7 +779,7 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
* Ensure zeroing happens before reading cpu->exit_request or
* cpu->interrupt_request (see also smp_wmb in cpu_exit())
*/
- qatomic_set_mb(&cpu_neg(cpu)->icount_decr.u16.high, 0);
+ qatomic_set_mb(&cpu->icount_decr.u16.high, 0);
if (unlikely(qatomic_read(&cpu->interrupt_request))) {
int interrupt_request;
@@ -870,7 +870,7 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
if (unlikely(qatomic_read(&cpu->exit_request))
|| (icount_enabled()
&& (cpu->cflags_next_tb == -1 || cpu->cflags_next_tb & CF_USE_ICOUNT)
- && cpu_neg(cpu)->icount_decr.u16.low + cpu->icount_extra == 0)) {
+ && cpu->icount_decr.u16.low + cpu->icount_extra == 0)) {
qatomic_set(&cpu->exit_request, 0);
if (cpu->exception_index == -1) {
cpu->exception_index = EXCP_INTERRUPT;
@@ -895,7 +895,7 @@ static inline void cpu_loop_exec_tb(CPUState *cpu, TranslationBlock *tb,
}
*last_tb = NULL;
- insns_left = qatomic_read(&cpu_neg(cpu)->icount_decr.u32);
+ insns_left = qatomic_read(&cpu->icount_decr.u32);
if (insns_left < 0) {
/* Something asked us to stop executing chained TBs; just
* continue round the main loop. Whatever requested the exit
@@ -914,7 +914,7 @@ static inline void cpu_loop_exec_tb(CPUState *cpu, TranslationBlock *tb,
icount_update(cpu);
/* Refill decrementer and continue execution. */
insns_left = MIN(0xffff, cpu->icount_budget);
- cpu_neg(cpu)->icount_decr.u16.low = insns_left;
+ cpu->icount_decr.u16.low = insns_left;
cpu->icount_extra = cpu->icount_budget - insns_left;
/*
diff --git a/accel/tcg/tcg-accel-ops-icount.c b/accel/tcg/tcg-accel-ops-icount.c
index 3d2cfbbc97..1bb31fed2c 100644
--- a/accel/tcg/tcg-accel-ops-icount.c
+++ b/accel/tcg/tcg-accel-ops-icount.c
@@ -111,14 +111,14 @@ void icount_prepare_for_run(CPUState *cpu, int64_t cpu_budget)
* each vCPU execution. However u16.high can be raised
* asynchronously by cpu_exit/cpu_interrupt/tcg_handle_interrupt
*/
- g_assert(cpu_neg(cpu)->icount_decr.u16.low == 0);
+ g_assert(cpu->icount_decr.u16.low == 0);
g_assert(cpu->icount_extra == 0);
replay_mutex_lock();
cpu->icount_budget = MIN(icount_get_limit(), cpu_budget);
insns_left = MIN(0xffff, cpu->icount_budget);
- cpu_neg(cpu)->icount_decr.u16.low = insns_left;
+ cpu->icount_decr.u16.low = insns_left;
cpu->icount_extra = cpu->icount_budget - insns_left;
if (cpu->icount_budget == 0) {
@@ -138,7 +138,7 @@ void icount_process_data(CPUState *cpu)
icount_update(cpu);
/* Reset the counters */
- cpu_neg(cpu)->icount_decr.u16.low = 0;
+ cpu->icount_decr.u16.low = 0;
cpu->icount_extra = 0;
cpu->icount_budget = 0;
diff --git a/accel/tcg/tcg-accel-ops.c b/accel/tcg/tcg-accel-ops.c
index 3973591508..1b957eb37b 100644
--- a/accel/tcg/tcg-accel-ops.c
+++ b/accel/tcg/tcg-accel-ops.c
@@ -91,7 +91,7 @@ void tcg_handle_interrupt(CPUState *cpu, int mask)
if (!qemu_cpu_is_self(cpu)) {
qemu_cpu_kick(cpu);
} else {
- qatomic_set(&cpu_neg(cpu)->icount_decr.u16.high, -1);
+ qatomic_set(&cpu->icount_decr.u16.high, -1);
}
}
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 5582aaf653..6069f6b8ba 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -214,7 +214,7 @@ void cpu_restore_state_from_tb(CPUState *cpu, TranslationBlock *tb,
* Reset the cycle counter to the start of the block and
* shift if to the number of actually executed instructions.
*/
- cpu_neg(cpu)->icount_decr.u16.low += insns_left;
+ cpu->icount_decr.u16.low += insns_left;
}
cpu->cc->tcg_ops->restore_state_to_opc(cpu, tb, data);
@@ -598,7 +598,7 @@ void cpu_io_recompile(CPUState *cpu, uintptr_t retaddr)
cc = CPU_GET_CLASS(cpu);
if (cc->tcg_ops->io_recompile_replay_branch &&
cc->tcg_ops->io_recompile_replay_branch(cpu, tb)) {
- cpu_neg(cpu)->icount_decr.u16.low++;
+ cpu->icount_decr.u16.low++;
n = 2;
}
@@ -754,7 +754,7 @@ void cpu_interrupt(CPUState *cpu, int mask)
{
g_assert(qemu_mutex_iothread_locked());
cpu->interrupt_request |= mask;
- qatomic_set(&cpu_neg(cpu)->icount_decr.u16.high, -1);
+ qatomic_set(&cpu->icount_decr.u16.high, -1);
}
#endif /* CONFIG_USER_ONLY */
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 0fd9efceba..7eba58137d 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -52,9 +52,16 @@ static TCGOp *gen_tb_start(uint32_t cflags)
TCGv_i32 count = tcg_temp_new_i32();
TCGOp *icount_start_insn = NULL;
- tcg_gen_ld_i32(count, cpu_env,
- offsetof(ArchCPU, neg.icount_decr.u32) -
- offsetof(ArchCPU, env));
+#define ICOUNT_DECR_OFFSET \
+ ((int)offsetof(ArchCPU, parent_obj.icount_decr.u32) - \
+ (int)offsetof(ArchCPU, env))
+
+ QEMU_BUILD_BUG_ON(ICOUNT_DECR_OFFSET < CPU_MAX_NEGATIVE_ENV_OFFSET ||
+ ICOUNT_DECR_OFFSET > 0);
+
+ tcg_gen_ld_i32(count, cpu_env, ICOUNT_DECR_OFFSET);
+
+#undef ICOUNT_DECR_OFFSET
if (cflags & CF_USE_ICOUNT) {
/*
@@ -82,7 +89,7 @@ static TCGOp *gen_tb_start(uint32_t cflags)
if (cflags & CF_USE_ICOUNT) {
tcg_gen_st16_i32(count, cpu_env,
- offsetof(ArchCPU, neg.icount_decr.u16.low) -
+ offsetof(ArchCPU, parent_obj.icount_decr.u16.low) -
offsetof(ArchCPU, env));
/*
* cpu->can_do_io is cleared automatically here at the beginning of
diff --git a/softmmu/icount.c b/softmmu/icount.c
index a5cef9c60a..344af5fc3c 100644
--- a/softmmu/icount.c
+++ b/softmmu/icount.c
@@ -75,7 +75,7 @@ static void icount_enable_adaptive(void)
static int64_t icount_get_executed(CPUState *cpu)
{
return (cpu->icount_budget -
- (cpu_neg(cpu)->icount_decr.u16.low + cpu->icount_extra));
+ (cpu->icount_decr.u16.low + cpu->icount_extra));
}
/*
--
2.41.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 7/9] include/exec: Remove [cpu|env]_neg() functions
2023-06-30 12:25 [PATCH 0/9] Collapse CPUNegativeOffsetState into CPUState Anton Johansson via
` (5 preceding siblings ...)
2023-06-30 12:25 ` [PATCH 6/9] Move IcountDecr " Anton Johansson via
@ 2023-06-30 12:25 ` Anton Johansson via
2023-06-30 12:25 ` [PATCH 8/9] target: Remove CPUNegativeOffsetState field from ArchCPU Anton Johansson via
` (2 subsequent siblings)
9 siblings, 0 replies; 16+ messages in thread
From: Anton Johansson via @ 2023-06-30 12:25 UTC (permalink / raw)
To: qemu-devel
Cc: ale, richard.henderson, pbonzini, eduardo, philmd,
marcel.apfelbaum, peter.maydell, wangyanan55
These functions are no longer used to access CPUTLB or IcountDecr and
can be safely removed.
Signed-off-by: Anton Johansson <anjo@rev.ng>
---
include/exec/cpu-all.h | 24 ------------------------
1 file changed, 24 deletions(-)
diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index a31df11770..fee0e7df73 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -421,30 +421,6 @@ static inline CPUState *env_cpu(CPUArchState *env)
return &env_archcpu(env)->parent_obj;
}
-/**
- * env_neg(env)
- * @env: The architecture environment
- *
- * Return the CPUNegativeOffsetState associated with the environment.
- */
-static inline CPUNegativeOffsetState *env_neg(CPUArchState *env)
-{
- ArchCPU *arch_cpu = container_of(env, ArchCPU, env);
- return &arch_cpu->neg;
-}
-
-/**
- * cpu_neg(cpu)
- * @cpu: The generic CPUState
- *
- * Return the CPUNegativeOffsetState associated with the cpu.
- */
-static inline CPUNegativeOffsetState *cpu_neg(CPUState *cpu)
-{
- ArchCPU *arch_cpu = container_of(cpu, ArchCPU, parent_obj);
- return &arch_cpu->neg;
-}
-
/**
* env_tlb(env)
* @env: The architecture environment
--
2.41.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 8/9] target: Remove CPUNegativeOffsetState field from ArchCPU
2023-06-30 12:25 [PATCH 0/9] Collapse CPUNegativeOffsetState into CPUState Anton Johansson via
` (6 preceding siblings ...)
2023-06-30 12:25 ` [PATCH 7/9] include/exec: Remove [cpu|env]_neg() functions Anton Johansson via
@ 2023-06-30 12:25 ` Anton Johansson via
2023-06-30 12:25 ` [PATCH 9/9] include/exec: Remove CPUNegativeOffsetState Anton Johansson via
2023-07-01 9:21 ` [PATCH 0/9] Collapse CPUNegativeOffsetState into CPUState Paolo Bonzini
9 siblings, 0 replies; 16+ messages in thread
From: Anton Johansson via @ 2023-06-30 12:25 UTC (permalink / raw)
To: qemu-devel
Cc: ale, richard.henderson, pbonzini, eduardo, philmd,
marcel.apfelbaum, peter.maydell, wangyanan55
All fields in CPUNegativeOffsetState have been moved to CPUState.
Signed-off-by: Anton Johansson <anjo@rev.ng>
---
target/alpha/cpu.h | 1 -
target/arm/cpu.h | 1 -
target/avr/cpu.h | 1 -
target/cris/cpu.h | 1 -
target/hexagon/cpu.h | 1 -
target/hppa/cpu.h | 1 -
target/i386/cpu.h | 1 -
target/loongarch/cpu.h | 1 -
target/m68k/cpu.h | 1 -
target/microblaze/cpu.h | 1 -
target/mips/cpu.h | 3 +--
target/nios2/cpu.h | 1 -
target/openrisc/cpu.h | 1 -
target/ppc/cpu.h | 1 -
target/riscv/cpu.h | 1 -
target/rx/cpu.h | 1 -
target/s390x/cpu.h | 1 -
target/sh4/cpu.h | 1 -
target/sparc/cpu.h | 1 -
target/tricore/cpu.h | 1 -
target/xtensa/cpu.h | 3 +--
21 files changed, 2 insertions(+), 23 deletions(-)
diff --git a/target/alpha/cpu.h b/target/alpha/cpu.h
index fcd20bfd3a..2e4b7cfc53 100644
--- a/target/alpha/cpu.h
+++ b/target/alpha/cpu.h
@@ -263,7 +263,6 @@ struct ArchCPU {
CPUState parent_obj;
/*< public >*/
- CPUNegativeOffsetState neg;
CPUAlphaState env;
/* This alarm doesn't exist in real hardware; we wish it did. */
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 4d6c0f95d5..7bced90c34 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -856,7 +856,6 @@ struct ArchCPU {
CPUState parent_obj;
/*< public >*/
- CPUNegativeOffsetState neg;
CPUARMState env;
/* Coprocessor information */
diff --git a/target/avr/cpu.h b/target/avr/cpu.h
index 7225174668..4ce22d8e4f 100644
--- a/target/avr/cpu.h
+++ b/target/avr/cpu.h
@@ -148,7 +148,6 @@ struct ArchCPU {
CPUState parent_obj;
/*< public >*/
- CPUNegativeOffsetState neg;
CPUAVRState env;
};
diff --git a/target/cris/cpu.h b/target/cris/cpu.h
index 8e37c6e50d..676b8e93ca 100644
--- a/target/cris/cpu.h
+++ b/target/cris/cpu.h
@@ -178,7 +178,6 @@ struct ArchCPU {
CPUState parent_obj;
/*< public >*/
- CPUNegativeOffsetState neg;
CPUCRISState env;
};
diff --git a/target/hexagon/cpu.h b/target/hexagon/cpu.h
index daef5c3f00..b2de6e1d5d 100644
--- a/target/hexagon/cpu.h
+++ b/target/hexagon/cpu.h
@@ -141,7 +141,6 @@ struct ArchCPU {
/*< private >*/
CPUState parent_obj;
/*< public >*/
- CPUNegativeOffsetState neg;
CPUHexagonState env;
bool lldb_compat;
diff --git a/target/hppa/cpu.h b/target/hppa/cpu.h
index 7373177b55..b36de8f431 100644
--- a/target/hppa/cpu.h
+++ b/target/hppa/cpu.h
@@ -219,7 +219,6 @@ struct ArchCPU {
CPUState parent_obj;
/*< public >*/
- CPUNegativeOffsetState neg;
CPUHPPAState env;
QEMUTimer *alarm_timer;
};
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 2c9b0d2ebc..3bc215dc5d 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1889,7 +1889,6 @@ struct ArchCPU {
CPUState parent_obj;
/*< public >*/
- CPUNegativeOffsetState neg;
CPUX86State env;
VMChangeStateEntry *vmsentry;
diff --git a/target/loongarch/cpu.h b/target/loongarch/cpu.h
index ed04027af1..87529c93bf 100644
--- a/target/loongarch/cpu.h
+++ b/target/loongarch/cpu.h
@@ -367,7 +367,6 @@ struct ArchCPU {
CPUState parent_obj;
/*< public >*/
- CPUNegativeOffsetState neg;
CPULoongArchState env;
QEMUTimer timer;
diff --git a/target/m68k/cpu.h b/target/m68k/cpu.h
index cf70282717..20afb0c94d 100644
--- a/target/m68k/cpu.h
+++ b/target/m68k/cpu.h
@@ -168,7 +168,6 @@ struct ArchCPU {
CPUState parent_obj;
/*< public >*/
- CPUNegativeOffsetState neg;
CPUM68KState env;
};
diff --git a/target/microblaze/cpu.h b/target/microblaze/cpu.h
index a7b040abd4..baf7d979f6 100644
--- a/target/microblaze/cpu.h
+++ b/target/microblaze/cpu.h
@@ -352,7 +352,6 @@ struct ArchCPU {
bool ns_axi_dc;
bool ns_axi_ic;
- CPUNegativeOffsetState neg;
CPUMBState env;
MicroBlazeCPUConfig cfg;
};
diff --git a/target/mips/cpu.h b/target/mips/cpu.h
index a3bc646976..d118170a62 100644
--- a/target/mips/cpu.h
+++ b/target/mips/cpu.h
@@ -1177,9 +1177,8 @@ struct ArchCPU {
CPUState parent_obj;
/*< public >*/
- Clock *clock;
- CPUNegativeOffsetState neg;
CPUMIPSState env;
+ Clock *clock;
};
diff --git a/target/nios2/cpu.h b/target/nios2/cpu.h
index 477a3161fd..70b6377a4f 100644
--- a/target/nios2/cpu.h
+++ b/target/nios2/cpu.h
@@ -218,7 +218,6 @@ struct ArchCPU {
CPUState parent_obj;
/*< public >*/
- CPUNegativeOffsetState neg;
CPUNios2State env;
bool diverr_present;
diff --git a/target/openrisc/cpu.h b/target/openrisc/cpu.h
index 92c38f54c2..e823858e40 100644
--- a/target/openrisc/cpu.h
+++ b/target/openrisc/cpu.h
@@ -305,7 +305,6 @@ struct ArchCPU {
CPUState parent_obj;
/*< public >*/
- CPUNegativeOffsetState neg;
CPUOpenRISCState env;
};
diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index af12c93ebc..c480a1b37f 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -1311,7 +1311,6 @@ struct ArchCPU {
CPUState parent_obj;
/*< public >*/
- CPUNegativeOffsetState neg;
CPUPPCState env;
int vcpu_id;
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 7adb8706ac..4b9e0f818e 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -383,7 +383,6 @@ struct ArchCPU {
/* < private > */
CPUState parent_obj;
/* < public > */
- CPUNegativeOffsetState neg;
CPURISCVState env;
char *dyn_csr_xml;
diff --git a/target/rx/cpu.h b/target/rx/cpu.h
index 7f03ffcfed..f66754eb8a 100644
--- a/target/rx/cpu.h
+++ b/target/rx/cpu.h
@@ -111,7 +111,6 @@ struct ArchCPU {
CPUState parent_obj;
/*< public >*/
- CPUNegativeOffsetState neg;
CPURXState env;
};
diff --git a/target/s390x/cpu.h b/target/s390x/cpu.h
index eb5b65b7d3..7592ad53a8 100644
--- a/target/s390x/cpu.h
+++ b/target/s390x/cpu.h
@@ -168,7 +168,6 @@ struct ArchCPU {
CPUState parent_obj;
/*< public >*/
- CPUNegativeOffsetState neg;
CPUS390XState env;
S390CPUModel *model;
/* needed for live migration */
diff --git a/target/sh4/cpu.h b/target/sh4/cpu.h
index 1399d3840f..f75a235973 100644
--- a/target/sh4/cpu.h
+++ b/target/sh4/cpu.h
@@ -208,7 +208,6 @@ struct ArchCPU {
CPUState parent_obj;
/*< public >*/
- CPUNegativeOffsetState neg;
CPUSH4State env;
};
diff --git a/target/sparc/cpu.h b/target/sparc/cpu.h
index 95d2d0da71..9385228a96 100644
--- a/target/sparc/cpu.h
+++ b/target/sparc/cpu.h
@@ -561,7 +561,6 @@ struct ArchCPU {
CPUState parent_obj;
/*< public >*/
- CPUNegativeOffsetState neg;
CPUSPARCState env;
};
diff --git a/target/tricore/cpu.h b/target/tricore/cpu.h
index a50b91cc36..7a48077485 100644
--- a/target/tricore/cpu.h
+++ b/target/tricore/cpu.h
@@ -192,7 +192,6 @@ struct ArchCPU {
CPUState parent_obj;
/*< public >*/
- CPUNegativeOffsetState neg;
CPUTriCoreState env;
};
diff --git a/target/xtensa/cpu.h b/target/xtensa/cpu.h
index 87fe992ba6..c6bbef1e5d 100644
--- a/target/xtensa/cpu.h
+++ b/target/xtensa/cpu.h
@@ -560,9 +560,8 @@ struct ArchCPU {
CPUState parent_obj;
/*< public >*/
- Clock *clock;
- CPUNegativeOffsetState neg;
CPUXtensaState env;
+ Clock *clock;
};
--
2.41.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 9/9] include/exec: Remove CPUNegativeOffsetState
2023-06-30 12:25 [PATCH 0/9] Collapse CPUNegativeOffsetState into CPUState Anton Johansson via
` (7 preceding siblings ...)
2023-06-30 12:25 ` [PATCH 8/9] target: Remove CPUNegativeOffsetState field from ArchCPU Anton Johansson via
@ 2023-06-30 12:25 ` Anton Johansson via
2023-07-01 9:21 ` [PATCH 0/9] Collapse CPUNegativeOffsetState into CPUState Paolo Bonzini
9 siblings, 0 replies; 16+ messages in thread
From: Anton Johansson via @ 2023-06-30 12:25 UTC (permalink / raw)
To: qemu-devel
Cc: ale, richard.henderson, pbonzini, eduardo, philmd,
marcel.apfelbaum, peter.maydell, wangyanan55
Signed-off-by: Anton Johansson <anjo@rev.ng>
---
include/exec/cpu-defs.h | 7 -------
1 file changed, 7 deletions(-)
diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
index 429d9525f7..e5125abc07 100644
--- a/include/exec/cpu-defs.h
+++ b/include/exec/cpu-defs.h
@@ -77,11 +77,4 @@
#endif /* CONFIG_SOFTMMU && CONFIG_TCG */
-/*
- * This structure must be placed in ArchCPU immediately
- * before CPUArchState, as a field named "neg".
- */
-typedef struct CPUNegativeOffsetState {
-} CPUNegativeOffsetState;
-
#endif
--
2.41.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH 5/9] accel: Move CPUTLB to CPUState and assert offset
2023-06-30 12:25 ` [PATCH 5/9] accel: Move CPUTLB to CPUState and assert offset Anton Johansson via
@ 2023-06-30 14:16 ` Richard Henderson
2023-07-04 8:45 ` Anton Johansson via
0 siblings, 1 reply; 16+ messages in thread
From: Richard Henderson @ 2023-06-30 14:16 UTC (permalink / raw)
To: Anton Johansson, qemu-devel
Cc: ale, pbonzini, eduardo, philmd, marcel.apfelbaum, peter.maydell,
wangyanan55
On 6/30/23 14:25, Anton Johansson wrote:
> @@ -448,6 +448,13 @@ struct CPUState {
>
> /* track IOMMUs whose translations we've cached in the TCG TLB */
> GArray *iommu_notifiers;
> +
> + /*
> + * The following fields needs to be within CPU_MAX_NEGATIVE_ENV_OFFSET of
> + * CPUArchState. As CPUArchState is assumed to follow CPUState in the
> + * ArchCPU struct these are placed last. This is checked statically.
> + */
> + CPUTLB tlb;
> };
This is what we had before CPUNegativeOffsetState, comment and all, and over the course of
time the comment was ignored and the CPUTLB crept toward the middle of the structure.
I really don't see how this merge helps. There's nothing target-specific about
CPUTLBDescFast or CPUNegativeOffsetState, and keeping them separate enforces their importance.
r~
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 4/9] include/hw: introduce CPU_MAX_NEGATIVE_ENV_OFFSET
2023-06-30 12:25 ` [PATCH 4/9] include/hw: introduce CPU_MAX_NEGATIVE_ENV_OFFSET Anton Johansson via
@ 2023-06-30 14:19 ` Richard Henderson
2023-07-04 8:49 ` Anton Johansson via
0 siblings, 1 reply; 16+ messages in thread
From: Richard Henderson @ 2023-06-30 14:19 UTC (permalink / raw)
To: Anton Johansson, qemu-devel
Cc: ale, pbonzini, eduardo, philmd, marcel.apfelbaum, peter.maydell,
wangyanan55
On 6/30/23 14:25, Anton Johansson wrote:
> For reasons related to code-generation quality, the offset of
> CPUTLBDescFast and IcountDecr from CPUArchState needs to fit within
> 11 bits of displacement (arm[32|64] and riscv addressing modes).
>
> This commit introduces a new constant to store the maximum allowed
> negative offset, so it can be statically asserted to hold later on.
>
> Signed-off-by: Anton Johansson <anjo@rev.ng>
> ---
> include/hw/core/cpu.h | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> index c226d7263c..0377f74d48 100644
> --- a/include/hw/core/cpu.h
> +++ b/include/hw/core/cpu.h
> @@ -259,6 +259,17 @@ struct qemu_work_item;
>
> #define CPU_UNSET_NUMA_NODE_ID -1
>
> +/*
> + * For reasons related to code-generation quality the fast path
> + * CPUTLBDescFast array and IcountDecr fields need to be located within a
> + * small negative offset of CPUArchState. This requirement comes from
> + * host-specific addressing modes of arm[32|64] and riscv which uses 12-
> + * and 11 bits of displacement respectively.
> + */
> +#define CPU_MIN_DISPLACEMENT_BITS 11
> +#define CPU_MAX_NEGATIVE_ENV_OFFSET \
> + (-(1 << CPU_MIN_DISPLACEMENT_BITS))
You'd want 6 bits, for AArch64 LDP (7-bit signed immediate).
r~
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 0/9] Collapse CPUNegativeOffsetState into CPUState
2023-06-30 12:25 [PATCH 0/9] Collapse CPUNegativeOffsetState into CPUState Anton Johansson via
` (8 preceding siblings ...)
2023-06-30 12:25 ` [PATCH 9/9] include/exec: Remove CPUNegativeOffsetState Anton Johansson via
@ 2023-07-01 9:21 ` Paolo Bonzini
2023-07-04 8:54 ` Anton Johansson via
9 siblings, 1 reply; 16+ messages in thread
From: Paolo Bonzini @ 2023-07-01 9:21 UTC (permalink / raw)
To: Anton Johansson, qemu-devel
Cc: ale, richard.henderson, eduardo, philmd, marcel.apfelbaum,
peter.maydell, wangyanan55
On 6/30/23 14:25, Anton Johansson via wrote:
> CPUNegativeOffsetState is a struct placed immediately before
> CPUArchState in the ArchCPU struct. Its purpose is to ensure that
> certain fields (CPUTLBDescFast, IcountDecr) lay within a small negative
> offset of CPUArchState in memory. This is desired for better
> code-generation on arm[32|64] and riscv hosts which has addressing
> modes with 12- and 11 bits of displacement respectively.
The purpose is also to ensure that general purpose registers stay close
to the beginning of the CPUArchState and thus can also be accessed with
a small displacement.
Can you check if this becomes worse for any architecture? On some
64-bit targets, 8 bytes * 32 registers is 512 bytes and it's a
substantial part of the 11 bits that are available.
Paolo
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 5/9] accel: Move CPUTLB to CPUState and assert offset
2023-06-30 14:16 ` Richard Henderson
@ 2023-07-04 8:45 ` Anton Johansson via
0 siblings, 0 replies; 16+ messages in thread
From: Anton Johansson via @ 2023-07-04 8:45 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
Cc: ale, pbonzini, eduardo, philmd, marcel.apfelbaum, peter.maydell,
wangyanan55
On 6/30/23 16:16, Richard Henderson wrote:
> On 6/30/23 14:25, Anton Johansson wrote:
>> @@ -448,6 +448,13 @@ struct CPUState {
>> /* track IOMMUs whose translations we've cached in the TCG
>> TLB */
>> GArray *iommu_notifiers;
>> +
>> + /*
>> + * The following fields needs to be within
>> CPU_MAX_NEGATIVE_ENV_OFFSET of
>> + * CPUArchState. As CPUArchState is assumed to follow CPUState
>> in the
>> + * ArchCPU struct these are placed last. This is checked
>> statically.
>> + */
>> + CPUTLB tlb;
>> };
>
> This is what we had before CPUNegativeOffsetState, comment and all,
> and over the course of time the comment was ignored and the CPUTLB
> crept toward the middle of the structure.
I recall you talking about this earlier. Is there any reason the
drifting of CPUTLB/IcountDecr from env wouldn't be caught by the static
asserts we've added?
>
> I really don't see how this merge helps. There's nothing
> target-specific about CPUTLBDescFast or CPUNegativeOffsetState, and
> keeping them separate enforces their importance.
>
There isn't anything target-specific about CPUTLBDescFast, but its
offset in ArchCPU does depend on the target. This is due to the
TARGET_PAGE_ENTRY_EXTRA macro in CPUTLBEntryFull, which is replaced
with a union in the first patch of this patchset.
On second thought, I think you're right: keeping CPUTLB and IcountDecr
together to emphasize their importance makes sense. There would still
be an implicit assumption that the target keeps CPUArchState and
CPUState close together, but at least CPUNegativeOffsetState signals
that intent more clearly.
Even so, statically asserting the offsets to CPUTLB and IcountDecr
would be a good idea.
The main concern of this patchset is being able to access the tlb and
icount_decr fields in a target-independent way when only a CPUState is
available. Merging CPUNegativeOffsetState simply made that access
simpler; let's have a cpu_tlb(CPUState *) function instead.
I'll pull out the patch making CPUTLB target-independent and include it
in the patchset that converts cputlb.c and user-exec.c from
CPUArchState to CPUState. The static assert patches will be resubmitted
separately.
Thanks,
--
Anton Johansson,
rev.ng Labs Srl.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 4/9] include/hw: introduce CPU_MAX_NEGATIVE_ENV_OFFSET
2023-06-30 14:19 ` Richard Henderson
@ 2023-07-04 8:49 ` Anton Johansson via
0 siblings, 0 replies; 16+ messages in thread
From: Anton Johansson via @ 2023-07-04 8:49 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
Cc: ale, pbonzini, eduardo, philmd, marcel.apfelbaum, peter.maydell,
wangyanan55
On 6/30/23 16:19, Richard Henderson wrote:
> On 6/30/23 14:25, Anton Johansson wrote:
>> For reasons related to code-generation quality, the offset of
>> CPUTLBDescFast and IcountDecr from CPUArchState needs to fit within
>> 11 bits of displacement (arm[32|64] and riscv addressing modes).
>>
>> This commit introduces a new constant to store the maximum allowed
>> negative offset, so it can be statically asserted to hold later on.
>>
>> Signed-off-by: Anton Johansson <anjo@rev.ng>
>> ---
>> include/hw/core/cpu.h | 11 +++++++++++
>> 1 file changed, 11 insertions(+)
>>
>> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
>> index c226d7263c..0377f74d48 100644
>> --- a/include/hw/core/cpu.h
>> +++ b/include/hw/core/cpu.h
>> @@ -259,6 +259,17 @@ struct qemu_work_item;
>> #define CPU_UNSET_NUMA_NODE_ID -1
>> +/*
>> + * For reasons related to code-generation quality the fast path
>> + * CPUTLBDescFast array and IcountDecr fields need to be located
>> within a
>> + * small negative offset of CPUArchState. This requirement comes from
>> + * host-specific addressing modes of arm[32|64] and riscv which uses
>> 12-
>> + * and 11 bits of displacement respectively.
>> + */
>> +#define CPU_MIN_DISPLACEMENT_BITS 11
>> +#define CPU_MAX_NEGATIVE_ENV_OFFSET \
>> + (-(1 << CPU_MIN_DISPLACEMENT_BITS))
>
> You'd want 6 bits, for AArch64 LDP (7-bit signed immediate).
Ah right, but the immediate is scaled, so the limit would be -512 for
AArch64 and -256 for arm32; that also agrees with the smallest
MIN_TLB_MASK_TABLE_OFS in the backends.
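Spelling out the AArch64 case (assuming 64-bit registers, i.e. the
immediate is scaled by 8):

    7-bit signed immediate:       -64 .. +63
    scaled by 8 for 64-bit LDP:  -512 .. +504 bytes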
--
Anton Johansson,
rev.ng Labs Srl.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 0/9] Collapse CPUNegativeOffsetState into CPUState
2023-07-01 9:21 ` [PATCH 0/9] Collapse CPUNegativeOffsetState into CPUState Paolo Bonzini
@ 2023-07-04 8:54 ` Anton Johansson via
0 siblings, 0 replies; 16+ messages in thread
From: Anton Johansson via @ 2023-07-04 8:54 UTC (permalink / raw)
To: Paolo Bonzini, qemu-devel
Cc: ale, richard.henderson, eduardo, philmd, marcel.apfelbaum,
peter.maydell, wangyanan55
On 7/1/23 11:21, Paolo Bonzini wrote:
> On 6/30/23 14:25, Anton Johansson via wrote:
>> CPUNegativeOffsetState is a struct placed immediately before
>> CPUArchState in the ArchCPU struct. Its purpose is to ensure that
>> certain fields (CPUTLBDescFast, IcountDecr) lay within a small negative
>> offset of CPUArchState in memory. This is desired for better
>> code-generation on arm[32|64] and riscv hosts which has addressing
>> modes with 12- and 11 bits of displacement respectively.
>
> The purpose is also to ensure that general purpose registers stay
> close to the beginning of the CPUArchState and thus can also be
> accessed with a small displacement.
>
> Can you check if this becomes worse for any architecture? On some
> 64-bit targets, 8 bytes * 32 registers is 512 bytes and it's a
> substantial part of the 11 bits that are available.
>
> Paolo
>
I'm dropping the CPUNegativeOffsetState changes, but the fields would
still have had a negative displacement from the beginning of env, so
the positive offset of the gprs from env wouldn't be affected.
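Schematically (assuming CPUArchState directly follows CPUState in
ArchCPU):

    ... | tlb | icount_decr | CPUArchState env { gprs, ... } | ...
          negative offsets     positive offsets from env,
          from env, bounded    unchanged by this series
          by the new asserts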
--
Anton Johansson,
rev.ng Labs Srl.
^ permalink raw reply [flat|nested] 16+ messages in thread