qemu-devel.nongnu.org archive mirror
* [PATCH for-7.2 00/21] accel/tcg: minimize tlb lookups during translate + user-only PROT_EXEC fixes
@ 2022-08-12 18:07 Richard Henderson
  2022-08-12 18:07 ` [PATCH for-7.2 01/21] linux-user/arm: Mark the commpage executable Richard Henderson
                   ` (21 more replies)
  0 siblings, 22 replies; 32+ messages in thread
From: Richard Henderson @ 2022-08-12 18:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent, iii, alex.bennee

This is part of a larger body of work, but in the process of
reorganizing I was reminded that PROT_EXEC wasn't being enforced
properly for user-only.  As this has come up in the context of
some of Ilya's patches, I thought I'd go ahead and post this part.


r~


Ilya Leoshkevich (1):
  accel/tcg: Introduce is_same_page()

Richard Henderson (20):
  linux-user/arm: Mark the commpage executable
  linux-user/hppa: Allocate page zero as a commpage
  linux-user/x86_64: Allocate vsyscall page as a commpage
  linux-user: Honor PT_GNU_STACK
  tests/tcg/i386: Move smc_code2 to an executable section
  accel/tcg: Remove PageDesc code_bitmap
  accel/tcg: Use bool for page_find_alloc
  accel/tcg: Merge tb_htable_lookup into caller
  accel/tcg: Move qemu_ram_addr_from_host_nofail to physmem.c
  accel/tcg: Properly implement get_page_addr_code for user-only
  accel/tcg: Use probe_access_internal for softmmu
    get_page_addr_code_hostp
  accel/tcg: Add nofault parameter to get_page_addr_code_hostp
  accel/tcg: Unlock mmap_lock after longjmp
  accel/tcg: Hoist get_page_addr_code out of tb_lookup
  accel/tcg: Hoist get_page_addr_code out of tb_gen_code
  accel/tcg: Raise PROT_EXEC exception early
  accel/tcg: Remove translator_ldsw
  accel/tcg: Add pc and host_pc params to gen_intermediate_code
  accel/tcg: Add fast path for translator_ld*
  accel/tcg: Use DisasContextBase in plugin_gen_tb_start

 accel/tcg/internal.h          |   7 +-
 include/elf.h                 |   1 +
 include/exec/cpu-common.h     |   1 +
 include/exec/exec-all.h       |  87 +++++-----------
 include/exec/plugin-gen.h     |   7 +-
 include/exec/translator.h     |  85 ++++++++++++----
 linux-user/arm/target_cpu.h   |   4 +-
 linux-user/qemu.h             |   1 +
 accel/tcg/cpu-exec.c          | 184 ++++++++++++++++++----------------
 accel/tcg/cputlb.c            |  93 +++++------------
 accel/tcg/plugin-gen.c        |  23 +++--
 accel/tcg/translate-all.c     | 120 ++++------------------
 accel/tcg/translator.c        | 122 +++++++++++++++++-----
 accel/tcg/user-exec.c         |  15 +++
 linux-user/elfload.c          |  80 ++++++++++++++-
 softmmu/physmem.c             |  12 +++
 target/alpha/translate.c      |   5 +-
 target/arm/translate.c        |   5 +-
 target/avr/translate.c        |   5 +-
 target/cris/translate.c       |   5 +-
 target/hexagon/translate.c    |   6 +-
 target/hppa/translate.c       |   5 +-
 target/i386/tcg/translate.c   |   7 +-
 target/loongarch/translate.c  |   6 +-
 target/m68k/translate.c       |   5 +-
 target/microblaze/translate.c |   5 +-
 target/mips/tcg/translate.c   |   5 +-
 target/nios2/translate.c      |   5 +-
 target/openrisc/translate.c   |   6 +-
 target/ppc/translate.c        |   5 +-
 target/riscv/translate.c      |   5 +-
 target/rx/translate.c         |   5 +-
 target/s390x/tcg/translate.c  |   5 +-
 target/sh4/translate.c        |   5 +-
 target/sparc/translate.c      |   5 +-
 target/tricore/translate.c    |   6 +-
 target/xtensa/translate.c     |   6 +-
 tests/tcg/i386/test-i386.c    |   2 +-
 38 files changed, 532 insertions(+), 424 deletions(-)

-- 
2.34.1




* [PATCH for-7.2 01/21] linux-user/arm: Mark the commpage executable
  2022-08-12 18:07 [PATCH for-7.2 00/21] accel/tcg: minimize tlb lookups during translate + user-only PROT_EXEC fixes Richard Henderson
@ 2022-08-12 18:07 ` Richard Henderson
  2022-08-12 18:07 ` [PATCH for-7.2 02/21] linux-user/hppa: Allocate page zero as a commpage Richard Henderson
                   ` (20 subsequent siblings)
  21 siblings, 0 replies; 32+ messages in thread
From: Richard Henderson @ 2022-08-12 18:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent, iii, alex.bennee

We're about to start validating PAGE_EXEC, which means
that we've got to mark the commpage executable.  We had
been placing the commpage outside of reserved_va, which
was incorrect and led to an abort.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/arm/target_cpu.h | 4 ++--
 linux-user/elfload.c        | 6 +++++-
 2 files changed, 7 insertions(+), 3 deletions(-)
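
For illustration only, a minimal sketch of the address arithmetic behind
this change. It assumes a 4 KiB host page and 0xffff0f00 as the value of
HI_COMMPAGE, and it treats the limit as the highest usable guest address,
which is a simplification. The helper names are hypothetical and do not
exist in QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Round addr down to the start of its page.  page_size must be a power
 * of two.  This is the same arithmetic as
 * "HI_COMMPAGE & -qemu_host_page_size" in init_guest_commpage().
 */
static uint32_t page_base(uint32_t addr, uint32_t page_size)
{
    return addr & -page_size;
}

/*
 * Return true if every byte of [base, base + page_size) is at or below
 * max_va.  Written without computing base + page_size, which could wrap
 * around at the top of the 32-bit address space.
 */
static bool page_within(uint32_t base, uint32_t page_size, uint32_t max_va)
{
    return base <= max_va && page_size - 1 <= max_va - base;
}
```

Under these assumptions the commpage starts at 0xffff0000, so it does not
fit below the old limit of 0xffff0000 but does fit below 0xffffffff.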

diff --git a/linux-user/arm/target_cpu.h b/linux-user/arm/target_cpu.h
index 709d19bc9e..89ba274cfc 100644
--- a/linux-user/arm/target_cpu.h
+++ b/linux-user/arm/target_cpu.h
@@ -34,9 +34,9 @@ static inline unsigned long arm_max_reserved_va(CPUState *cs)
     } else {
         /*
          * We need to be able to map the commpage.
-         * See validate_guest_space in linux-user/elfload.c.
+         * See init_guest_commpage in linux-user/elfload.c.
          */
-        return 0xffff0000ul;
+        return 0xfffffffful;
     }
 }
 #define MAX_RESERVED_VA  arm_max_reserved_va
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index ce902dbd56..3e3dc02499 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -398,7 +398,8 @@ enum {
 
 static bool init_guest_commpage(void)
 {
-    void *want = g2h_untagged(HI_COMMPAGE & -qemu_host_page_size);
+    abi_ptr commpage = HI_COMMPAGE & -qemu_host_page_size;
+    void *want = g2h_untagged(commpage);
     void *addr = mmap(want, qemu_host_page_size, PROT_READ | PROT_WRITE,
                       MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0);
 
@@ -417,6 +418,9 @@ static bool init_guest_commpage(void)
         perror("Protecting guest commpage");
         exit(EXIT_FAILURE);
     }
+
+    page_set_flags(commpage, commpage + qemu_host_page_size,
+                   PAGE_READ | PAGE_EXEC | PAGE_VALID);
     return true;
 }
 
-- 
2.34.1




* [PATCH for-7.2 02/21] linux-user/hppa: Allocate page zero as a commpage
  2022-08-12 18:07 [PATCH for-7.2 00/21] accel/tcg: minimize tlb lookups during translate + user-only PROT_EXEC fixes Richard Henderson
  2022-08-12 18:07 ` [PATCH for-7.2 01/21] linux-user/arm: Mark the commpage executable Richard Henderson
@ 2022-08-12 18:07 ` Richard Henderson
  2022-08-12 18:07 ` [PATCH for-7.2 03/21] linux-user/x86_64: Allocate vsyscall page " Richard Henderson
                   ` (19 subsequent siblings)
  21 siblings, 0 replies; 32+ messages in thread
From: Richard Henderson @ 2022-08-12 18:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent, iii, alex.bennee

We're about to start validating PAGE_EXEC, which means that we've
got to mark page zero executable.  We had been special casing this
entirely within translate.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/elfload.c | 34 +++++++++++++++++++++++++++++++---
 1 file changed, 31 insertions(+), 3 deletions(-)
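
For illustration only: once page zero is a real commpage address, 0 can
no longer double as the "no low commpage" marker, hence the switch to -1.
A minimal sketch of the resulting logic follows. NO_LO_COMMPAGE and
lowest_needed() are hypothetical names, loosely modelled on the
pgb_static() hunk in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name for the -1 sentinel meaning "no low commpage". */
#define NO_LO_COMMPAGE ((uint64_t)-1)

/*
 * Return the lowest guest address that must be mappable: the image load
 * address, lowered to the commpage's aligned base if a low commpage
 * exists.  align must be a power of two.
 */
static uint64_t lowest_needed(uint64_t loaddr, uint64_t lo_commpage,
                              uint64_t align)
{
    if (lo_commpage != NO_LO_COMMPAGE) {
        uint64_t page = lo_commpage & -align;
        return page < loaddr ? page : loaddr;
    }
    return loaddr;
}
```

With a 0 sentinel, a commpage at address 0 would have been
indistinguishable from "none" and the load address would not have been
lowered.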

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 3e3dc02499..29d910c4cc 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -1646,6 +1646,34 @@ static inline void init_thread(struct target_pt_regs *regs,
     regs->gr[31] = infop->entry;
 }
 
+#define LO_COMMPAGE  0
+
+static bool init_guest_commpage(void)
+{
+    void *want = g2h_untagged(LO_COMMPAGE);
+    void *addr = mmap(want, qemu_host_page_size, PROT_NONE,
+                      MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0);
+
+    if (addr == MAP_FAILED) {
+        perror("Allocating guest commpage");
+        exit(EXIT_FAILURE);
+    }
+    if (addr != want) {
+        return false;
+    }
+
+    /*
+     * On Linux, page zero is normally marked execute only + gateway.
+     * Normal read or write is supposed to fail (thus PROT_NONE above),
+     * but specific offsets have kernel code mapped to raise permissions
+     * and implement syscalls.  Here, simply mark the page executable.
+     * Special case the entry points during translation (see do_page_zero).
+     */
+    page_set_flags(LO_COMMPAGE, LO_COMMPAGE + TARGET_PAGE_SIZE,
+                   PAGE_EXEC | PAGE_VALID);
+    return true;
+}
+
 #endif /* TARGET_HPPA */
 
 #ifdef TARGET_XTENSA
@@ -2326,12 +2354,12 @@ static abi_ulong create_elf_tables(abi_ulong p, int argc, int envc,
 }
 
 #if defined(HI_COMMPAGE)
-#define LO_COMMPAGE 0
+#define LO_COMMPAGE -1
 #elif defined(LO_COMMPAGE)
 #define HI_COMMPAGE 0
 #else
 #define HI_COMMPAGE 0
-#define LO_COMMPAGE 0
+#define LO_COMMPAGE -1
 #define init_guest_commpage() true
 #endif
 
@@ -2555,7 +2583,7 @@ static void pgb_static(const char *image_name, abi_ulong orig_loaddr,
         } else {
             offset = -(HI_COMMPAGE & -align);
         }
-    } else if (LO_COMMPAGE != 0) {
+    } else if (LO_COMMPAGE != -1) {
         loaddr = MIN(loaddr, LO_COMMPAGE & -align);
     }
 
-- 
2.34.1




* [PATCH for-7.2 03/21] linux-user/x86_64: Allocate vsyscall page as a commpage
  2022-08-12 18:07 [PATCH for-7.2 00/21] accel/tcg: minimize tlb lookups during translate + user-only PROT_EXEC fixes Richard Henderson
  2022-08-12 18:07 ` [PATCH for-7.2 01/21] linux-user/arm: Mark the commpage executable Richard Henderson
  2022-08-12 18:07 ` [PATCH for-7.2 02/21] linux-user/hppa: Allocate page zero as a commpage Richard Henderson
@ 2022-08-12 18:07 ` Richard Henderson
  2022-08-12 18:07 ` [PATCH for-7.2 04/21] linux-user: Honor PT_GNU_STACK Richard Henderson
                   ` (18 subsequent siblings)
  21 siblings, 0 replies; 32+ messages in thread
From: Richard Henderson @ 2022-08-12 18:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent, iii, alex.bennee

We're about to start validating PAGE_EXEC, which means that we've
got to mark the vsyscall page executable.  We had been special
casing this entirely within translate.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/elfload.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)
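
For illustration only, a sketch of the reserved_va check added below. It
assumes the vsyscall page sits at (-10 << 20), i.e. 0xffffffffff600000,
and a 4 KiB target page. The EX_ names and vsyscall_page_ok() are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, for illustration. */
#define EX_VSYSCALL_PAGE  ((uint64_t)-10 << 20)   /* 0xffffffffff600000 */
#define EX_PAGE_SIZE      UINT64_C(0x1000)

/*
 * Mirror of the condition in the patch: with no reservation
 * (reserved_va == 0) the page flags can always be set; otherwise the
 * end of the vsyscall page must lie below reserved_va.
 */
static bool vsyscall_page_ok(uint64_t reserved_va)
{
    return reserved_va == 0 ||
           EX_VSYSCALL_PAGE + EX_PAGE_SIZE < reserved_va;
}
```

Because the page lives near the top of the 64-bit address space, a
reservation such as 1 << 47 fails this check, which matches the "Cannot
allocate vsyscall page" error path.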

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 29d910c4cc..e315155dad 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -195,6 +195,27 @@ static void elf_core_copy_regs(target_elf_gregset_t *regs, const CPUX86State *en
     (*regs)[26] = tswapreg(env->segs[R_GS].selector & 0xffff);
 }
 
+#define HI_COMMPAGE  TARGET_VSYSCALL_PAGE
+
+static bool init_guest_commpage(void)
+{
+    /*
+     * The vsyscall page is at a high negative address aka kernel space,
+     * which means that we cannot actually allocate it with target_mmap.
+     * We still should be able to use page_set_flags, unless the user
+     * has specified -R reserved_va, which would trigger an assert().
+     */
+    if (reserved_va != 0 &&
+        TARGET_VSYSCALL_PAGE + TARGET_PAGE_SIZE >= reserved_va) {
+        error_report("Cannot allocate vsyscall page");
+        exit(EXIT_FAILURE);
+    }
+    page_set_flags(TARGET_VSYSCALL_PAGE,
+                   TARGET_VSYSCALL_PAGE + TARGET_PAGE_SIZE,
+                   PAGE_EXEC | PAGE_VALID);
+    return true;
+}
+
 #else
 
 #define ELF_START_MMAP 0x80000000
-- 
2.34.1




* [PATCH for-7.2 04/21] linux-user: Honor PT_GNU_STACK
  2022-08-12 18:07 [PATCH for-7.2 00/21] accel/tcg: minimize tlb lookups during translate + user-only PROT_EXEC fixes Richard Henderson
                   ` (2 preceding siblings ...)
  2022-08-12 18:07 ` [PATCH for-7.2 03/21] linux-user/x86_64: Allocate vsyscall page " Richard Henderson
@ 2022-08-12 18:07 ` Richard Henderson
  2022-08-12 18:07 ` [PATCH for-7.2 05/21] tests/tcg/i386: Move smc_code2 to an executable section Richard Henderson
                   ` (17 subsequent siblings)
  21 siblings, 0 replies; 32+ messages in thread
From: Richard Henderson @ 2022-08-12 18:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent, iii, alex.bennee

Map the stack executable if required, either by the target's default
or on demand via the binary's PT_GNU_STACK program header.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/elf.h        |  1 +
 linux-user/qemu.h    |  1 +
 linux-user/elfload.c | 19 ++++++++++++++++++-
 3 files changed, 20 insertions(+), 1 deletion(-)
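
For illustration only, a self-contained sketch of the decision this patch
makes while walking the program headers. struct ex_phdr is a cut-down,
hypothetical stand-in for the real ELF program header, and
stack_is_executable() is not a QEMU function. The constants follow
include/elf.h (PT_LOOS + 0x474e551) and the standard PF_X value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EX_PT_LOAD       0x1u
#define EX_PT_GNU_STACK  0x6474e551u   /* PT_LOOS (0x60000000) + 0x474e551 */
#define EX_PF_X          0x1u

/* Hypothetical reduced program header: only the fields used here. */
struct ex_phdr {
    uint32_t p_type;
    uint32_t p_flags;
};

/*
 * Start from the per-target default (EXSTACK_DEFAULT in the patch) and
 * let a PT_GNU_STACK header, if present, override it with its PF_X bit.
 */
static bool stack_is_executable(const struct ex_phdr *phdr, size_t n,
                                bool arch_default)
{
    bool exec_stack = arch_default;

    for (size_t i = 0; i < n; i++) {
        if (phdr[i].p_type == EX_PT_GNU_STACK) {
            exec_stack = (phdr[i].p_flags & EX_PF_X) != 0;
        }
    }
    return exec_stack;
}
```

A binary without PT_GNU_STACK keeps the target default, while one that
carries the header gets exactly what its flags request, regardless of
the default.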

diff --git a/include/elf.h b/include/elf.h
index 3a4bcb646a..3d6b9062c0 100644
--- a/include/elf.h
+++ b/include/elf.h
@@ -31,6 +31,7 @@ typedef int64_t  Elf64_Sxword;
 #define PT_LOPROC  0x70000000
 #define PT_HIPROC  0x7fffffff
 
+#define PT_GNU_STACK      (PT_LOOS + 0x474e551)
 #define PT_GNU_PROPERTY   (PT_LOOS + 0x474e553)
 
 #define PT_MIPS_REGINFO   0x70000000
diff --git a/linux-user/qemu.h b/linux-user/qemu.h
index 7d90de1b15..e2e93fbd1d 100644
--- a/linux-user/qemu.h
+++ b/linux-user/qemu.h
@@ -48,6 +48,7 @@ struct image_info {
         uint32_t        elf_flags;
         int             personality;
         abi_ulong       alignment;
+        bool            exec_stack;
 
         /* Generic semihosting knows about these pointers. */
         abi_ulong       arg_strings;   /* strings for argv */
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index e315155dad..b1169ca6df 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -232,6 +232,7 @@ static bool init_guest_commpage(void)
 #define ELF_ARCH        EM_386
 
 #define ELF_PLATFORM get_elf_platform()
+#define EXSTACK_DEFAULT true
 
 static const char *get_elf_platform(void)
 {
@@ -308,6 +309,7 @@ static void elf_core_copy_regs(target_elf_gregset_t *regs, const CPUX86State *en
 
 #define ELF_ARCH        EM_ARM
 #define ELF_CLASS       ELFCLASS32
+#define EXSTACK_DEFAULT true
 
 static inline void init_thread(struct target_pt_regs *regs,
                                struct image_info *infop)
@@ -776,6 +778,7 @@ static inline void init_thread(struct target_pt_regs *regs,
 #else
 
 #define ELF_CLASS       ELFCLASS32
+#define EXSTACK_DEFAULT true
 
 #endif
 
@@ -973,6 +976,7 @@ static void elf_core_copy_regs(target_elf_gregset_t *regs, const CPUPPCState *en
 
 #define ELF_CLASS   ELFCLASS64
 #define ELF_ARCH    EM_LOONGARCH
+#define EXSTACK_DEFAULT true
 
 #define elf_check_arch(x) ((x) == EM_LOONGARCH)
 
@@ -1068,6 +1072,7 @@ static uint32_t get_elf_hwcap(void)
 #define ELF_CLASS   ELFCLASS32
 #endif
 #define ELF_ARCH    EM_MIPS
+#define EXSTACK_DEFAULT true
 
 #ifdef TARGET_ABI_MIPSN32
 #define elf_check_abi(x) ((x) & EF_MIPS_ABI2)
@@ -1806,6 +1811,10 @@ static inline void init_thread(struct target_pt_regs *regs,
 #define bswaptls(ptr) bswap32s(ptr)
 #endif
 
+#ifndef EXSTACK_DEFAULT
+#define EXSTACK_DEFAULT false
+#endif
+
 #include "elf.h"
 
 /* We must delay the following stanzas until after "elf.h". */
@@ -2081,6 +2090,7 @@ static abi_ulong setup_arg_pages(struct linux_binprm *bprm,
                                  struct image_info *info)
 {
     abi_ulong size, error, guard;
+    int prot;
 
     size = guest_stack_size;
     if (size < STACK_LOWER_LIMIT) {
@@ -2091,7 +2101,11 @@ static abi_ulong setup_arg_pages(struct linux_binprm *bprm,
         guard = qemu_real_host_page_size();
     }
 
-    error = target_mmap(0, size + guard, PROT_READ | PROT_WRITE,
+    prot = PROT_READ | PROT_WRITE;
+    if (info->exec_stack) {
+        prot |= PROT_EXEC;
+    }
+    error = target_mmap(0, size + guard, prot,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
     if (error == -1) {
         perror("mmap stack");
@@ -2919,6 +2933,7 @@ static void load_elf_image(const char *image_name, int image_fd,
      */
     loaddr = -1, hiaddr = 0;
     info->alignment = 0;
+    info->exec_stack = EXSTACK_DEFAULT;
     for (i = 0; i < ehdr->e_phnum; ++i) {
         struct elf_phdr *eppnt = phdr + i;
         if (eppnt->p_type == PT_LOAD) {
@@ -2961,6 +2976,8 @@ static void load_elf_image(const char *image_name, int image_fd,
             if (!parse_elf_properties(image_fd, info, eppnt, bprm_buf, &err)) {
                 goto exit_errmsg;
             }
+        } else if (eppnt->p_type == PT_GNU_STACK) {
+            info->exec_stack = eppnt->p_flags & PF_X;
         }
     }
 
-- 
2.34.1




* [PATCH for-7.2 05/21] tests/tcg/i386: Move smc_code2 to an executable section
  2022-08-12 18:07 [PATCH for-7.2 00/21] accel/tcg: minimize tlb lookups during translate + user-only PROT_EXEC fixes Richard Henderson
                   ` (3 preceding siblings ...)
  2022-08-12 18:07 ` [PATCH for-7.2 04/21] linux-user: Honor PT_GNU_STACK Richard Henderson
@ 2022-08-12 18:07 ` Richard Henderson
  2022-08-12 18:07 ` [PATCH for-7.2 06/21] accel/tcg: Remove PageDesc code_bitmap Richard Henderson
                   ` (16 subsequent siblings)
  21 siblings, 0 replies; 32+ messages in thread
From: Richard Henderson @ 2022-08-12 18:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent, iii, alex.bennee

We're about to start validating PAGE_EXEC, which means
that we've got to put this code into a section that is
both writable and executable.

Note that this test did not run on hardware beforehand either.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tests/tcg/i386/test-i386.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/tcg/i386/test-i386.c b/tests/tcg/i386/test-i386.c
index ac8d5a3c1f..e6b308a2c0 100644
--- a/tests/tcg/i386/test-i386.c
+++ b/tests/tcg/i386/test-i386.c
@@ -1998,7 +1998,7 @@ uint8_t code[] = {
     0xc3, /* ret */
 };
 
-asm(".section \".data\"\n"
+asm(".section \".data_x\",\"awx\"\n"
     "smc_code2:\n"
     "movl 4(%esp), %eax\n"
     "movl %eax, smc_patch_addr2 + 1\n"
-- 
2.34.1




* [PATCH for-7.2 06/21] accel/tcg: Remove PageDesc code_bitmap
  2022-08-12 18:07 [PATCH for-7.2 00/21] accel/tcg: minimize tlb lookups during translate + user-only PROT_EXEC fixes Richard Henderson
                   ` (4 preceding siblings ...)
  2022-08-12 18:07 ` [PATCH for-7.2 05/21] tests/tcg/i386: Move smc_code2 to an executable section Richard Henderson
@ 2022-08-12 18:07 ` Richard Henderson
  2022-08-12 18:07 ` [PATCH for-7.2 07/21] accel/tcg: Use bool for page_find_alloc Richard Henderson
                   ` (15 subsequent siblings)
  21 siblings, 0 replies; 32+ messages in thread
From: Richard Henderson @ 2022-08-12 18:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent, iii, alex.bennee

This bitmap is created and discarded immediately.
We gain nothing by its existence.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/translate-all.c | 78 ++-------------------------------------
 1 file changed, 4 insertions(+), 74 deletions(-)

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index ef62a199c7..cf99b2b876 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -101,21 +101,14 @@
 #define assert_memory_lock() tcg_debug_assert(have_mmap_lock())
 #endif
 
-#define SMC_BITMAP_USE_THRESHOLD 10
-
 typedef struct PageDesc {
     /* list of TBs intersecting this ram page */
     uintptr_t first_tb;
-#ifdef CONFIG_SOFTMMU
-    /* in order to optimize self modifying code, we count the number
-       of lookups we do to a given page to use a bitmap */
-    unsigned long *code_bitmap;
-    unsigned int code_write_count;
-#else
+#ifdef CONFIG_USER_ONLY
     unsigned long flags;
     void *target_data;
 #endif
-#ifndef CONFIG_USER_ONLY
+#ifdef CONFIG_SOFTMMU
     QemuSpin lock;
 #endif
 } PageDesc;
@@ -906,17 +899,6 @@ void tb_htable_init(void)
     qht_init(&tb_ctx.htable, tb_cmp, CODE_GEN_HTABLE_SIZE, mode);
 }
 
-/* call with @p->lock held */
-static inline void invalidate_page_bitmap(PageDesc *p)
-{
-    assert_page_locked(p);
-#ifdef CONFIG_SOFTMMU
-    g_free(p->code_bitmap);
-    p->code_bitmap = NULL;
-    p->code_write_count = 0;
-#endif
-}
-
 /* Set to NULL all the 'first_tb' fields in all PageDescs. */
 static void page_flush_tb_1(int level, void **lp)
 {
@@ -931,7 +913,6 @@ static void page_flush_tb_1(int level, void **lp)
         for (i = 0; i < V_L2_SIZE; ++i) {
             page_lock(&pd[i]);
             pd[i].first_tb = (uintptr_t)NULL;
-            invalidate_page_bitmap(pd + i);
             page_unlock(&pd[i]);
         }
     } else {
@@ -1196,11 +1177,9 @@ static void do_tb_phys_invalidate(TranslationBlock *tb, bool rm_from_page_list)
     if (rm_from_page_list) {
         p = page_find(tb->page_addr[0] >> TARGET_PAGE_BITS);
         tb_page_remove(p, tb);
-        invalidate_page_bitmap(p);
         if (tb->page_addr[1] != -1) {
             p = page_find(tb->page_addr[1] >> TARGET_PAGE_BITS);
             tb_page_remove(p, tb);
-            invalidate_page_bitmap(p);
         }
     }
 
@@ -1245,35 +1224,6 @@ void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr)
     }
 }
 
-#ifdef CONFIG_SOFTMMU
-/* call with @p->lock held */
-static void build_page_bitmap(PageDesc *p)
-{
-    int n, tb_start, tb_end;
-    TranslationBlock *tb;
-
-    assert_page_locked(p);
-    p->code_bitmap = bitmap_new(TARGET_PAGE_SIZE);
-
-    PAGE_FOR_EACH_TB(p, tb, n) {
-        /* NOTE: this is subtle as a TB may span two physical pages */
-        if (n == 0) {
-            /* NOTE: tb_end may be after the end of the page, but
-               it is not a problem */
-            tb_start = tb->pc & ~TARGET_PAGE_MASK;
-            tb_end = tb_start + tb->size;
-            if (tb_end > TARGET_PAGE_SIZE) {
-                tb_end = TARGET_PAGE_SIZE;
-             }
-        } else {
-            tb_start = 0;
-            tb_end = ((tb->pc + tb->size) & ~TARGET_PAGE_MASK);
-        }
-        bitmap_set(p->code_bitmap, tb_start, tb_end - tb_start);
-    }
-}
-#endif
-
 /* add the tb in the target page and protect it if necessary
  *
  * Called with mmap_lock held for user-mode emulation.
@@ -1294,7 +1244,6 @@ static inline void tb_page_add(PageDesc *p, TranslationBlock *tb,
     page_already_protected = p->first_tb != (uintptr_t)NULL;
 #endif
     p->first_tb = (uintptr_t)tb | n;
-    invalidate_page_bitmap(p);
 
 #if defined(CONFIG_USER_ONLY)
     /* translator_loop() must have made all TB pages non-writable */
@@ -1356,10 +1305,8 @@ tb_link_page(TranslationBlock *tb, tb_page_addr_t phys_pc,
     /* remove TB from the page(s) if we couldn't insert it */
     if (unlikely(existing_tb)) {
         tb_page_remove(p, tb);
-        invalidate_page_bitmap(p);
         if (p2) {
             tb_page_remove(p2, tb);
-            invalidate_page_bitmap(p2);
         }
         tb = existing_tb;
     }
@@ -1736,7 +1683,6 @@ tb_invalidate_phys_page_range__locked(struct page_collection *pages,
 #if !defined(CONFIG_USER_ONLY)
     /* if no code remaining, no need to continue to use slow writes */
     if (!p->first_tb) {
-        invalidate_page_bitmap(p);
         tlb_unprotect_code(start);
     }
 #endif
@@ -1832,24 +1778,8 @@ void tb_invalidate_phys_page_fast(struct page_collection *pages,
     }
 
     assert_page_locked(p);
-    if (!p->code_bitmap &&
-        ++p->code_write_count >= SMC_BITMAP_USE_THRESHOLD) {
-        build_page_bitmap(p);
-    }
-    if (p->code_bitmap) {
-        unsigned int nr;
-        unsigned long b;
-
-        nr = start & ~TARGET_PAGE_MASK;
-        b = p->code_bitmap[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG - 1));
-        if (b & ((1 << len) - 1)) {
-            goto do_invalidate;
-        }
-    } else {
-    do_invalidate:
-        tb_invalidate_phys_page_range__locked(pages, p, start, start + len,
-                                              retaddr);
-    }
+    tb_invalidate_phys_page_range__locked(pages, p, start, start + len,
+                                          retaddr);
 }
 #else
 /* Called with mmap_lock held. If pc is not 0 then it indicates the
-- 
2.34.1




* [PATCH for-7.2 07/21] accel/tcg: Use bool for page_find_alloc
  2022-08-12 18:07 [PATCH for-7.2 00/21] accel/tcg: minimize tlb lookups during translate + user-only PROT_EXEC fixes Richard Henderson
                   ` (5 preceding siblings ...)
  2022-08-12 18:07 ` [PATCH for-7.2 06/21] accel/tcg: Remove PageDesc code_bitmap Richard Henderson
@ 2022-08-12 18:07 ` Richard Henderson
  2022-08-12 18:07 ` [PATCH for-7.2 08/21] accel/tcg: Merge tb_htable_lookup into caller Richard Henderson
                   ` (14 subsequent siblings)
  21 siblings, 0 replies; 32+ messages in thread
From: Richard Henderson @ 2022-08-12 18:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent, iii, alex.bennee

Bool is a more appropriate type for the alloc parameter.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/translate-all.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index cf99b2b876..65a23f47d6 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -464,7 +464,7 @@ void page_init(void)
 #endif
 }
 
-static PageDesc *page_find_alloc(tb_page_addr_t index, int alloc)
+static PageDesc *page_find_alloc(tb_page_addr_t index, bool alloc)
 {
     PageDesc *pd;
     void **lp;
@@ -532,11 +532,11 @@ static PageDesc *page_find_alloc(tb_page_addr_t index, int alloc)
 
 static inline PageDesc *page_find(tb_page_addr_t index)
 {
-    return page_find_alloc(index, 0);
+    return page_find_alloc(index, false);
 }
 
 static void page_lock_pair(PageDesc **ret_p1, tb_page_addr_t phys1,
-                           PageDesc **ret_p2, tb_page_addr_t phys2, int alloc);
+                           PageDesc **ret_p2, tb_page_addr_t phys2, bool alloc);
 
 /* In user-mode page locks aren't used; mmap_lock is enough */
 #ifdef CONFIG_USER_ONLY
@@ -650,7 +650,7 @@ static inline void page_unlock(PageDesc *pd)
 /* lock the page(s) of a TB in the correct acquisition order */
 static inline void page_lock_tb(const TranslationBlock *tb)
 {
-    page_lock_pair(NULL, tb->page_addr[0], NULL, tb->page_addr[1], 0);
+    page_lock_pair(NULL, tb->page_addr[0], NULL, tb->page_addr[1], false);
 }
 
 static inline void page_unlock_tb(const TranslationBlock *tb)
@@ -839,7 +839,7 @@ void page_collection_unlock(struct page_collection *set)
 #endif /* !CONFIG_USER_ONLY */
 
 static void page_lock_pair(PageDesc **ret_p1, tb_page_addr_t phys1,
-                           PageDesc **ret_p2, tb_page_addr_t phys2, int alloc)
+                           PageDesc **ret_p2, tb_page_addr_t phys2, bool alloc)
 {
     PageDesc *p1, *p2;
     tb_page_addr_t page1;
@@ -1289,7 +1289,7 @@ tb_link_page(TranslationBlock *tb, tb_page_addr_t phys_pc,
      * Note that inserting into the hash table first isn't an option, since
      * we can only insert TBs that are fully initialized.
      */
-    page_lock_pair(&p, phys_pc, &p2, phys_page2, 1);
+    page_lock_pair(&p, phys_pc, &p2, phys_page2, true);
     tb_page_add(p, tb, 0, phys_pc & TARGET_PAGE_MASK);
     if (p2) {
         tb_page_add(p2, tb, 1, phys_page2);
@@ -2224,7 +2224,7 @@ void page_set_flags(target_ulong start, target_ulong end, int flags)
     for (addr = start, len = end - start;
          len != 0;
          len -= TARGET_PAGE_SIZE, addr += TARGET_PAGE_SIZE) {
-        PageDesc *p = page_find_alloc(addr >> TARGET_PAGE_BITS, 1);
+        PageDesc *p = page_find_alloc(addr >> TARGET_PAGE_BITS, true);
 
         /* If the write protection bit is set, then we invalidate
            the code inside.  */
-- 
2.34.1




* [PATCH for-7.2 08/21] accel/tcg: Merge tb_htable_lookup into caller
  2022-08-12 18:07 [PATCH for-7.2 00/21] accel/tcg: minimize tlb lookups during translate + user-only PROT_EXEC fixes Richard Henderson
                   ` (6 preceding siblings ...)
  2022-08-12 18:07 ` [PATCH for-7.2 07/21] accel/tcg: Use bool for page_find_alloc Richard Henderson
@ 2022-08-12 18:07 ` Richard Henderson
  2022-08-12 18:07 ` [PATCH for-7.2 09/21] accel/tcg: Move qemu_ram_addr_from_host_nofail to physmem.c Richard Henderson
                   ` (13 subsequent siblings)
  21 siblings, 0 replies; 32+ messages in thread
From: Richard Henderson @ 2022-08-12 18:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent, iii, alex.bennee

This function is used only once, so merge it into
its only caller, tb_lookup.  This requires moving
the support routine, tb_lookup_cmp, and its private
data structure, tb_desc, up in the file.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/exec-all.h |   3 -
 accel/tcg/cpu-exec.c    | 134 +++++++++++++++++++---------------------
 2 files changed, 64 insertions(+), 73 deletions(-)
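
For illustration only, a simplified sketch of the two-level lookup shape
that tb_lookup has after the merge: a small direct-mapped cache indexed
by pc, backed by a slower full lookup whose result is written back into
the cache. Everything here (struct ex_tb, the linear scan standing in
for the hash table, the modulo hash) is a hypothetical stand-in, not
QEMU code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_JMP_CACHE_SIZE 16u
#define EX_TABLE_SIZE     64u

struct ex_tb {
    uint64_t pc;
    uint32_t flags;
    int valid;
};

static struct ex_tb ex_table[EX_TABLE_SIZE];          /* slow-path store */
static struct ex_tb *ex_jmp_cache[EX_JMP_CACHE_SIZE]; /* fast-path cache */
static unsigned ex_slow_lookups;                      /* slow-path counter */

/* Slow path: a linear scan stands in for the real hash table lookup. */
static struct ex_tb *ex_slow_lookup(uint64_t pc, uint32_t flags)
{
    ex_slow_lookups++;
    for (size_t i = 0; i < EX_TABLE_SIZE; i++) {
        if (ex_table[i].valid && ex_table[i].pc == pc &&
            ex_table[i].flags == flags) {
            return &ex_table[i];
        }
    }
    return NULL;
}

/* Fast path first; on a miss, consult the slow path and refill the cache. */
static struct ex_tb *ex_tb_lookup(uint64_t pc, uint32_t flags)
{
    unsigned h = (unsigned)(pc % EX_JMP_CACHE_SIZE);
    struct ex_tb *tb = ex_jmp_cache[h];

    if (tb && tb->pc == pc && tb->flags == flags) {
        return tb;
    }
    tb = ex_slow_lookup(pc, flags);
    if (tb != NULL) {
        ex_jmp_cache[h] = tb;
    }
    return tb;
}
```

Keeping both levels in one function means the slow path's intermediate
results are visible to the caller's code, which is the property the
later hoisting patches in this series appear to rely on.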

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 311e5fb422..e7e30d55b8 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -552,9 +552,6 @@ void tb_invalidate_phys_addr(AddressSpace *as, hwaddr addr, MemTxAttrs attrs);
 #endif
 void tb_flush(CPUState *cpu);
 void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr);
-TranslationBlock *tb_htable_lookup(CPUState *cpu, target_ulong pc,
-                                   target_ulong cs_base, uint32_t flags,
-                                   uint32_t cflags);
 void tb_set_jmp_target(TranslationBlock *tb, int n, uintptr_t addr);
 
 /* GETPC is the true target of the return instruction that we'll execute.  */
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index a565a3f8ec..f6c0c0aff6 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -170,19 +170,60 @@ uint32_t curr_cflags(CPUState *cpu)
     return cflags;
 }
 
-/* Might cause an exception, so have a longjmp destination ready */
-static inline TranslationBlock *tb_lookup(CPUState *cpu, target_ulong pc,
-                                          target_ulong cs_base,
-                                          uint32_t flags, uint32_t cflags)
+struct tb_desc {
+    target_ulong pc;
+    target_ulong cs_base;
+    CPUArchState *env;
+    tb_page_addr_t phys_page1;
+    uint32_t flags;
+    uint32_t cflags;
+    uint32_t trace_vcpu_dstate;
+};
+
+static bool tb_lookup_cmp(const void *p, const void *d)
 {
+    const TranslationBlock *tb = p;
+    const struct tb_desc *desc = d;
+
+    if (tb->pc == desc->pc &&
+        tb->page_addr[0] == desc->phys_page1 &&
+        tb->cs_base == desc->cs_base &&
+        tb->flags == desc->flags &&
+        tb->trace_vcpu_dstate == desc->trace_vcpu_dstate &&
+        tb_cflags(tb) == desc->cflags) {
+        /* check next page if needed */
+        if (tb->page_addr[1] == -1) {
+            return true;
+        } else {
+            tb_page_addr_t phys_page2;
+            target_ulong virt_page2;
+
+            virt_page2 = (desc->pc & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
+            phys_page2 = get_page_addr_code(desc->env, virt_page2);
+            if (tb->page_addr[1] == phys_page2) {
+                return true;
+            }
+        }
+    }
+    return false;
+}
+
+/* Might cause an exception, so have a longjmp destination ready */
+static TranslationBlock *tb_lookup(CPUState *cpu, target_ulong pc,
+                                   target_ulong cs_base,
+                                   uint32_t flags, uint32_t cflags)
+{
+    CPUArchState *env = cpu->env_ptr;
     TranslationBlock *tb;
-    uint32_t hash;
+    tb_page_addr_t phys_pc;
+    struct tb_desc desc;
+    uint32_t jmp_hash, tb_hash;
 
     /* we should never be trying to look up an INVALID tb */
     tcg_debug_assert(!(cflags & CF_INVALID));
 
-    hash = tb_jmp_cache_hash_func(pc);
-    tb = qatomic_rcu_read(&cpu->tb_jmp_cache[hash]);
+    jmp_hash = tb_jmp_cache_hash_func(pc);
+    tb = qatomic_rcu_read(&cpu->tb_jmp_cache[jmp_hash]);
 
     if (likely(tb &&
                tb->pc == pc &&
@@ -192,11 +233,25 @@ static inline TranslationBlock *tb_lookup(CPUState *cpu, target_ulong pc,
                tb_cflags(tb) == cflags)) {
         return tb;
     }
-    tb = tb_htable_lookup(cpu, pc, cs_base, flags, cflags);
+
+    desc.env = env;
+    desc.cs_base = cs_base;
+    desc.flags = flags;
+    desc.cflags = cflags;
+    desc.trace_vcpu_dstate = *cpu->trace_dstate;
+    desc.pc = pc;
+    phys_pc = get_page_addr_code(desc.env, pc);
+    if (phys_pc == -1) {
+        return NULL;
+    }
+    desc.phys_page1 = phys_pc & TARGET_PAGE_MASK;
+    tb_hash = tb_hash_func(phys_pc, pc, flags, cflags, *cpu->trace_dstate);
+    tb = qht_lookup_custom(&tb_ctx.htable, &desc, tb_hash, tb_lookup_cmp);
     if (tb == NULL) {
         return NULL;
     }
-    qatomic_set(&cpu->tb_jmp_cache[hash], tb);
+
+    qatomic_set(&cpu->tb_jmp_cache[jmp_hash], tb);
     return tb;
 }
 
@@ -487,67 +542,6 @@ void cpu_exec_step_atomic(CPUState *cpu)
     end_exclusive();
 }
 
-struct tb_desc {
-    target_ulong pc;
-    target_ulong cs_base;
-    CPUArchState *env;
-    tb_page_addr_t phys_page1;
-    uint32_t flags;
-    uint32_t cflags;
-    uint32_t trace_vcpu_dstate;
-};
-
-static bool tb_lookup_cmp(const void *p, const void *d)
-{
-    const TranslationBlock *tb = p;
-    const struct tb_desc *desc = d;
-
-    if (tb->pc == desc->pc &&
-        tb->page_addr[0] == desc->phys_page1 &&
-        tb->cs_base == desc->cs_base &&
-        tb->flags == desc->flags &&
-        tb->trace_vcpu_dstate == desc->trace_vcpu_dstate &&
-        tb_cflags(tb) == desc->cflags) {
-        /* check next page if needed */
-        if (tb->page_addr[1] == -1) {
-            return true;
-        } else {
-            tb_page_addr_t phys_page2;
-            target_ulong virt_page2;
-
-            virt_page2 = (desc->pc & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
-            phys_page2 = get_page_addr_code(desc->env, virt_page2);
-            if (tb->page_addr[1] == phys_page2) {
-                return true;
-            }
-        }
-    }
-    return false;
-}
-
-TranslationBlock *tb_htable_lookup(CPUState *cpu, target_ulong pc,
-                                   target_ulong cs_base, uint32_t flags,
-                                   uint32_t cflags)
-{
-    tb_page_addr_t phys_pc;
-    struct tb_desc desc;
-    uint32_t h;
-
-    desc.env = cpu->env_ptr;
-    desc.cs_base = cs_base;
-    desc.flags = flags;
-    desc.cflags = cflags;
-    desc.trace_vcpu_dstate = *cpu->trace_dstate;
-    desc.pc = pc;
-    phys_pc = get_page_addr_code(desc.env, pc);
-    if (phys_pc == -1) {
-        return NULL;
-    }
-    desc.phys_page1 = phys_pc & TARGET_PAGE_MASK;
-    h = tb_hash_func(phys_pc, pc, flags, cflags, *cpu->trace_dstate);
-    return qht_lookup_custom(&tb_ctx.htable, &desc, h, tb_lookup_cmp);
-}
-
 void tb_set_jmp_target(TranslationBlock *tb, int n, uintptr_t addr)
 {
     if (TCG_TARGET_HAS_direct_jump) {
-- 
2.34.1




* [PATCH for-7.2 09/21] accel/tcg: Move qemu_ram_addr_from_host_nofail to physmem.c
  2022-08-12 18:07 [PATCH for-7.2 00/21] accel/tcg: minimize tlb lookups during translate + user-only PROT_EXEC fixes Richard Henderson
                   ` (7 preceding siblings ...)
  2022-08-12 18:07 ` [PATCH for-7.2 08/21] accel/tcg: Merge tb_htable_lookup into caller Richard Henderson
@ 2022-08-12 18:07 ` Richard Henderson
  2022-08-12 18:07 ` [PATCH for-7.2 10/21] accel/tcg: Properly implement get_page_addr_code for user-only Richard Henderson
                   ` (12 subsequent siblings)
  21 siblings, 0 replies; 32+ messages in thread
From: Richard Henderson @ 2022-08-12 18:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent, iii, alex.bennee

The base qemu_ram_addr_from_host function is already in
softmmu/physmem.c; move the nofail version to be adjacent.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/cpu-common.h |  1 +
 accel/tcg/cputlb.c        | 12 ------------
 softmmu/physmem.c         | 12 ++++++++++++
 3 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index 2281be4e10..d909429427 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -72,6 +72,7 @@ typedef uintptr_t ram_addr_t;
 void qemu_ram_remap(ram_addr_t addr, ram_addr_t length);
 /* This should not be used by devices.  */
 ram_addr_t qemu_ram_addr_from_host(void *ptr);
+ram_addr_t qemu_ram_addr_from_host_nofail(void *ptr);
 RAMBlock *qemu_ram_block_by_name(const char *name);
 RAMBlock *qemu_ram_block_from_host(void *ptr, bool round_offset,
                                    ram_addr_t *offset);
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index a46f3a654d..5db56bcd1e 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1283,18 +1283,6 @@ void tlb_set_page(CPUState *cpu, target_ulong vaddr,
                             prot, mmu_idx, size);
 }
 
-static inline ram_addr_t qemu_ram_addr_from_host_nofail(void *ptr)
-{
-    ram_addr_t ram_addr;
-
-    ram_addr = qemu_ram_addr_from_host(ptr);
-    if (ram_addr == RAM_ADDR_INVALID) {
-        error_report("Bad ram pointer %p", ptr);
-        abort();
-    }
-    return ram_addr;
-}
-
 /*
  * Note: tlb_fill() can trigger a resize of the TLB. This means that all of the
  * caller's prior references to the TLB table (e.g. CPUTLBEntry pointers) must
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index dc3c3e5f2e..d4c30e99ea 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -2460,6 +2460,18 @@ ram_addr_t qemu_ram_addr_from_host(void *ptr)
     return block->offset + offset;
 }
 
+ram_addr_t qemu_ram_addr_from_host_nofail(void *ptr)
+{
+    ram_addr_t ram_addr;
+
+    ram_addr = qemu_ram_addr_from_host(ptr);
+    if (ram_addr == RAM_ADDR_INVALID) {
+        error_report("Bad ram pointer %p", ptr);
+        abort();
+    }
+    return ram_addr;
+}
+
 static MemTxResult flatview_read(FlatView *fv, hwaddr addr,
                                  MemTxAttrs attrs, void *buf, hwaddr len);
 static MemTxResult flatview_write(FlatView *fv, hwaddr addr, MemTxAttrs attrs,
-- 
2.34.1




* [PATCH for-7.2 10/21] accel/tcg: Properly implement get_page_addr_code for user-only
  2022-08-12 18:07 [PATCH for-7.2 00/21] accel/tcg: minimize tlb lookups during translate + user-only PROT_EXEC fixes Richard Henderson
                   ` (8 preceding siblings ...)
  2022-08-12 18:07 ` [PATCH for-7.2 09/21] accel/tcg: Move qemu_ram_addr_from_host_nofail to physmem.c Richard Henderson
@ 2022-08-12 18:07 ` Richard Henderson
  2022-08-12 18:07 ` [PATCH for-7.2 11/21] accel/tcg: Use probe_access_internal for softmmu get_page_addr_code_hostp Richard Henderson
                   ` (11 subsequent siblings)
  21 siblings, 0 replies; 32+ messages in thread
From: Richard Henderson @ 2022-08-12 18:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent, iii, alex.bennee

The current implementation is a no-op, simply returning addr.
This is incorrect, because we ought to be checking the page
permissions for execution.

Make get_page_addr_code inline for both implementations.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
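Note (not part of the commit): the behavioural change can be
illustrated with a minimal standalone sketch.  Everything below is a
toy model; the toy_* names, the four-page address space and the
protection table are assumptions made for illustration and are not
QEMU APIs.  It contrasts the old identity mapping with a lookup that
returns -1 when the page is not mapped executable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins; none of these names exist in QEMU. */
#define TOY_PAGE_BITS 12
#define TOY_NUM_PAGES 4
#define TOY_PROT_READ 1
#define TOY_PROT_EXEC 4

typedef uint64_t toy_addr_t;

/* Per-page protection bits for a tiny 4-page guest address space. */
static const int toy_page_prot[TOY_NUM_PAGES] = {
    TOY_PROT_READ | TOY_PROT_EXEC,  /* page 0: code */
    TOY_PROT_READ,                  /* page 1: data, not executable */
    0,                              /* page 2: unmapped */
    TOY_PROT_READ | TOY_PROT_EXEC,  /* page 3: code */
};

/* Old behaviour: identity mapping with no permission check. */
static toy_addr_t toy_code_addr_unchecked(toy_addr_t addr)
{
    return addr;
}

/* New behaviour: return -1 unless the page is mapped executable. */
static toy_addr_t toy_code_addr_checked(toy_addr_t addr)
{
    toy_addr_t page = addr >> TOY_PAGE_BITS;

    if (page >= TOY_NUM_PAGES || !(toy_page_prot[page] & TOY_PROT_EXEC)) {
        return (toy_addr_t)-1;
    }
    return addr;
}
```

With this table, toy_code_addr_unchecked(0x1008) hands back an address
inside the non-executable data page, while toy_code_addr_checked(0x1008)
returns -1 so that the caller can refuse to translate from it.
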
 include/exec/exec-all.h | 85 ++++++++++++++---------------------------
 accel/tcg/cputlb.c      |  5 ---
 accel/tcg/user-exec.c   | 15 ++++++++
 3 files changed, 43 insertions(+), 62 deletions(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index e7e30d55b8..9f35e3b7a9 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -595,43 +595,44 @@ struct MemoryRegionSection *iotlb_to_section(CPUState *cpu,
                                              hwaddr index, MemTxAttrs attrs);
 #endif
 
-#if defined(CONFIG_USER_ONLY)
-void mmap_lock(void);
-void mmap_unlock(void);
-bool have_mmap_lock(void);
-
 /**
- * get_page_addr_code() - user-mode version
+ * get_page_addr_code_hostp()
  * @env: CPUArchState
  * @addr: guest virtual address of guest code
  *
- * Returns @addr.
+ * See get_page_addr_code() (full-system version) for documentation on the
+ * return value.
+ *
+ * Sets *@hostp (when @hostp is non-NULL) as follows.
+ * If the return value is -1, sets *@hostp to NULL. Otherwise, sets *@hostp
+ * to the host address where @addr's content is kept.
+ *
+ * Note: this function can trigger an exception.
+ */
+tb_page_addr_t get_page_addr_code_hostp(CPUArchState *env, target_ulong addr,
+                                        void **hostp);
+
+/**
+ * get_page_addr_code()
+ * @env: CPUArchState
+ * @addr: guest virtual address of guest code
+ *
+ * If we cannot translate and execute from the entire RAM page, or if
+ * the region is not backed by RAM, returns -1. Otherwise, returns the
+ * ram_addr_t corresponding to the guest code at @addr.
+ *
+ * Note: this function can trigger an exception.
  */
 static inline tb_page_addr_t get_page_addr_code(CPUArchState *env,
                                                 target_ulong addr)
 {
-    return addr;
+    return get_page_addr_code_hostp(env, addr, NULL);
 }
 
-/**
- * get_page_addr_code_hostp() - user-mode version
- * @env: CPUArchState
- * @addr: guest virtual address of guest code
- *
- * Returns @addr.
- *
- * If @hostp is non-NULL, sets *@hostp to the host address where @addr's content
- * is kept.
- */
-static inline tb_page_addr_t get_page_addr_code_hostp(CPUArchState *env,
-                                                      target_ulong addr,
-                                                      void **hostp)
-{
-    if (hostp) {
-        *hostp = g2h_untagged(addr);
-    }
-    return addr;
-}
+#if defined(CONFIG_USER_ONLY)
+void mmap_lock(void);
+void mmap_unlock(void);
+bool have_mmap_lock(void);
 
 /**
  * adjust_signal_pc:
@@ -688,36 +689,6 @@ G_NORETURN void cpu_loop_exit_sigbus(CPUState *cpu, target_ulong addr,
 static inline void mmap_lock(void) {}
 static inline void mmap_unlock(void) {}
 
-/**
- * get_page_addr_code() - full-system version
- * @env: CPUArchState
- * @addr: guest virtual address of guest code
- *
- * If we cannot translate and execute from the entire RAM page, or if
- * the region is not backed by RAM, returns -1. Otherwise, returns the
- * ram_addr_t corresponding to the guest code at @addr.
- *
- * Note: this function can trigger an exception.
- */
-tb_page_addr_t get_page_addr_code(CPUArchState *env, target_ulong addr);
-
-/**
- * get_page_addr_code_hostp() - full-system version
- * @env: CPUArchState
- * @addr: guest virtual address of guest code
- *
- * See get_page_addr_code() (full-system version) for documentation on the
- * return value.
- *
- * Sets *@hostp (when @hostp is non-NULL) as follows.
- * If the return value is -1, sets *@hostp to NULL. Otherwise, sets *@hostp
- * to the host address where @addr's content is kept.
- *
- * Note: this function can trigger an exception.
- */
-tb_page_addr_t get_page_addr_code_hostp(CPUArchState *env, target_ulong addr,
-                                        void **hostp);
-
 void tlb_reset_dirty(CPUState *cpu, ram_addr_t start1, ram_addr_t length);
 void tlb_set_dirty(CPUState *cpu, target_ulong vaddr);
 
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 5db56bcd1e..80a3eb4f1c 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1532,11 +1532,6 @@ tb_page_addr_t get_page_addr_code_hostp(CPUArchState *env, target_ulong addr,
     return qemu_ram_addr_from_host_nofail(p);
 }
 
-tb_page_addr_t get_page_addr_code(CPUArchState *env, target_ulong addr)
-{
-    return get_page_addr_code_hostp(env, addr, NULL);
-}
-
 static void notdirty_write(CPUState *cpu, vaddr mem_vaddr, unsigned size,
                            CPUIOTLBEntry *iotlbentry, uintptr_t retaddr)
 {
diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index 20ada5472b..a20234fb02 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -199,6 +199,21 @@ void *probe_access(CPUArchState *env, target_ulong addr, int size,
     return size ? g2h(env_cpu(env), addr) : NULL;
 }
 
+tb_page_addr_t get_page_addr_code_hostp(CPUArchState *env, target_ulong addr,
+                                        void **hostp)
+{
+    int flags;
+
+    flags = probe_access_internal(env, addr, 1, MMU_INST_FETCH, true, 0);
+    if (unlikely(flags)) {
+        return -1;
+    }
+    if (hostp) {
+        *hostp = g2h_untagged(addr);
+    }
+    return addr;
+}
+
 /* The softmmu versions of these helpers are in cputlb.c.  */
 
 /*
-- 
2.34.1




* [PATCH for-7.2 11/21] accel/tcg: Use probe_access_internal for softmmu get_page_addr_code_hostp
  2022-08-12 18:07 [PATCH for-7.2 00/21] accel/tcg: minimize tlb lookups during translate + user-only PROT_EXEC fixes Richard Henderson
                   ` (9 preceding siblings ...)
  2022-08-12 18:07 ` [PATCH for-7.2 10/21] accel/tcg: Properly implement get_page_addr_code for user-only Richard Henderson
@ 2022-08-12 18:07 ` Richard Henderson
  2022-08-12 18:07 ` [PATCH for-7.2 12/21] accel/tcg: Add nofault parameter to get_page_addr_code_hostp Richard Henderson
                   ` (10 subsequent siblings)
  21 siblings, 0 replies; 32+ messages in thread
From: Richard Henderson @ 2022-08-12 18:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent, iii, alex.bennee

Simplify the implementation of get_page_addr_code_hostp
by reusing the existing probe_access infrastructure.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 76 ++++++++++++++++------------------------------
 1 file changed, 26 insertions(+), 50 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 80a3eb4f1c..2dc2affa12 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1482,56 +1482,6 @@ static bool victim_tlb_hit(CPUArchState *env, size_t mmu_idx, size_t index,
   victim_tlb_hit(env, mmu_idx, index, offsetof(CPUTLBEntry, TY), \
                  (ADDR) & TARGET_PAGE_MASK)
 
-/*
- * Return a ram_addr_t for the virtual address for execution.
- *
- * Return -1 if we can't translate and execute from an entire page
- * of RAM.  This will force us to execute by loading and translating
- * one insn at a time, without caching.
- *
- * NOTE: This function will trigger an exception if the page is
- * not executable.
- */
-tb_page_addr_t get_page_addr_code_hostp(CPUArchState *env, target_ulong addr,
-                                        void **hostp)
-{
-    uintptr_t mmu_idx = cpu_mmu_index(env, true);
-    uintptr_t index = tlb_index(env, mmu_idx, addr);
-    CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr);
-    void *p;
-
-    if (unlikely(!tlb_hit(entry->addr_code, addr))) {
-        if (!VICTIM_TLB_HIT(addr_code, addr)) {
-            tlb_fill(env_cpu(env), addr, 0, MMU_INST_FETCH, mmu_idx, 0);
-            index = tlb_index(env, mmu_idx, addr);
-            entry = tlb_entry(env, mmu_idx, addr);
-
-            if (unlikely(entry->addr_code & TLB_INVALID_MASK)) {
-                /*
-                 * The MMU protection covers a smaller range than a target
-                 * page, so we must redo the MMU check for every insn.
-                 */
-                return -1;
-            }
-        }
-        assert(tlb_hit(entry->addr_code, addr));
-    }
-
-    if (unlikely(entry->addr_code & TLB_MMIO)) {
-        /* The region is not backed by RAM.  */
-        if (hostp) {
-            *hostp = NULL;
-        }
-        return -1;
-    }
-
-    p = (void *)((uintptr_t)addr + entry->addend);
-    if (hostp) {
-        *hostp = p;
-    }
-    return qemu_ram_addr_from_host_nofail(p);
-}
-
 static void notdirty_write(CPUState *cpu, vaddr mem_vaddr, unsigned size,
                            CPUIOTLBEntry *iotlbentry, uintptr_t retaddr)
 {
@@ -1687,6 +1637,32 @@ void *tlb_vaddr_to_host(CPUArchState *env, abi_ptr addr,
     return flags ? NULL : host;
 }
 
+/*
+ * Return a ram_addr_t for the virtual address for execution.
+ *
+ * Return -1 if we can't translate and execute from an entire page
+ * of RAM.  This will force us to execute by loading and translating
+ * one insn at a time, without caching.
+ *
+ * NOTE: This function will trigger an exception if the page is
+ * not executable.
+ */
+tb_page_addr_t get_page_addr_code_hostp(CPUArchState *env, target_ulong addr,
+                                        void **hostp)
+{
+    void *p;
+
+    (void)probe_access_internal(env, addr, 1, MMU_INST_FETCH,
+                                cpu_mmu_index(env, true), true, &p, 0);
+    if (p == NULL) {
+        return -1;
+    }
+    if (hostp) {
+        *hostp = p;
+    }
+    return qemu_ram_addr_from_host_nofail(p);
+}
+
 #ifdef CONFIG_PLUGIN
 /*
  * Perform a TLB lookup and populate the qemu_plugin_hwaddr structure.
-- 
2.34.1




* [PATCH for-7.2 12/21] accel/tcg: Add nofault parameter to get_page_addr_code_hostp
  2022-08-12 18:07 [PATCH for-7.2 00/21] accel/tcg: minimize tlb lookups during translate + user-only PROT_EXEC fixes Richard Henderson
                   ` (10 preceding siblings ...)
  2022-08-12 18:07 ` [PATCH for-7.2 11/21] accel/tcg: Use probe_access_internal for softmmu get_page_addr_code_hostp Richard Henderson
@ 2022-08-12 18:07 ` Richard Henderson
  2022-08-12 18:07 ` [PATCH for-7.2 13/21] accel/tcg: Unlock mmap_lock after longjmp Richard Henderson
                   ` (9 subsequent siblings)
  21 siblings, 0 replies; 32+ messages in thread
From: Richard Henderson @ 2022-08-12 18:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent, iii, alex.bennee

Add a nofault parameter to get_page_addr_code_hostp and pass it
through to probe_access_internal.  All existing callers pass true,
which is the value that was previously hard-coded, so there is no
functional change.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
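Note (not part of the commit): a minimal standalone sketch of what a
nofault flag means for a probing function.  The toy_* names and the
"addresses below 0x1000 are executable" rule are assumptions for
illustration only, and plain setjmp/longjmp stands in for QEMU's
sigsetjmp-based exception path.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stdint.h>

enum { TOY_OK, TOY_MINUS_ONE, TOY_EXCEPTION };

static jmp_buf toy_jmp_env;     /* hypothetical stand-in for cpu->jmp_env */

/* Pretend that only addresses below 0x1000 are executable. */
static bool toy_is_executable(uint64_t addr)
{
    return addr < 0x1000;
}

/*
 * With nofault set, failure is reported through the return value.
 * Without it, failure unwinds to the longjmp destination instead.
 */
static uint64_t toy_probe_code(uint64_t addr, bool nofault)
{
    if (toy_is_executable(addr)) {
        return addr;
    }
    if (nofault) {
        return (uint64_t)-1;
    }
    longjmp(toy_jmp_env, 1);
}

/* Report which of the three outcomes a probe produced. */
static int toy_classify(uint64_t addr, bool nofault)
{
    if (setjmp(toy_jmp_env) == 0) {
        uint64_t ret = toy_probe_code(addr, nofault);
        return ret == (uint64_t)-1 ? TOY_MINUS_ONE : TOY_OK;
    }
    return TOY_EXCEPTION;
}
```

A caller that wants to handle the failure itself passes nofault = true
and checks for -1; a caller that has a longjmp destination ready can
pass false and let the failure unwind.
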
 include/exec/exec-all.h | 10 +++++-----
 accel/tcg/cputlb.c      |  8 ++++----
 accel/tcg/plugin-gen.c  |  4 ++--
 accel/tcg/user-exec.c   |  4 ++--
 4 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 9f35e3b7a9..7a6dc44d86 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -599,6 +599,8 @@ struct MemoryRegionSection *iotlb_to_section(CPUState *cpu,
  * get_page_addr_code_hostp()
  * @env: CPUArchState
  * @addr: guest virtual address of guest code
+ * @nofault: do not raise an exception
+ * @hostp: output for host pointer
  *
  * See get_page_addr_code() (full-system version) for documentation on the
  * return value.
@@ -607,10 +609,10 @@ struct MemoryRegionSection *iotlb_to_section(CPUState *cpu,
  * If the return value is -1, sets *@hostp to NULL. Otherwise, sets *@hostp
  * to the host address where @addr's content is kept.
  *
- * Note: this function can trigger an exception.
+ * Note: Unless @nofault, this function can trigger an exception.
  */
 tb_page_addr_t get_page_addr_code_hostp(CPUArchState *env, target_ulong addr,
-                                        void **hostp);
+                                        bool nofault, void **hostp);
 
 /**
  * get_page_addr_code()
@@ -620,13 +622,11 @@ tb_page_addr_t get_page_addr_code_hostp(CPUArchState *env, target_ulong addr,
  * If we cannot translate and execute from the entire RAM page, or if
  * the region is not backed by RAM, returns -1. Otherwise, returns the
  * ram_addr_t corresponding to the guest code at @addr.
- *
- * Note: this function can trigger an exception.
  */
 static inline tb_page_addr_t get_page_addr_code(CPUArchState *env,
                                                 target_ulong addr)
 {
-    return get_page_addr_code_hostp(env, addr, NULL);
+    return get_page_addr_code_hostp(env, addr, true, NULL);
 }
 
 #if defined(CONFIG_USER_ONLY)
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 2dc2affa12..ae7b40dd51 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1644,16 +1644,16 @@ void *tlb_vaddr_to_host(CPUArchState *env, abi_ptr addr,
  * of RAM.  This will force us to execute by loading and translating
  * one insn at a time, without caching.
  *
- * NOTE: This function will trigger an exception if the page is
- * not executable.
+ * NOTE: Unless @nofault, this function will trigger an exception
+ * if the page is not executable.
  */
 tb_page_addr_t get_page_addr_code_hostp(CPUArchState *env, target_ulong addr,
-                                        void **hostp)
+                                        bool nofault, void **hostp)
 {
     void *p;
 
     (void)probe_access_internal(env, addr, 1, MMU_INST_FETCH,
-                                cpu_mmu_index(env, true), true, &p, 0);
+                                cpu_mmu_index(env, true), nofault, &p, 0);
     if (p == NULL) {
         return -1;
     }
diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index 3d0b101e34..8377c15383 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -872,7 +872,7 @@ bool plugin_gen_tb_start(CPUState *cpu, const TranslationBlock *tb, bool mem_onl
 
         ptb->vaddr = tb->pc;
         ptb->vaddr2 = -1;
-        get_page_addr_code_hostp(cpu->env_ptr, tb->pc, &ptb->haddr1);
+        get_page_addr_code_hostp(cpu->env_ptr, tb->pc, true, &ptb->haddr1);
         ptb->haddr2 = NULL;
         ptb->mem_only = mem_only;
 
@@ -902,7 +902,7 @@ void plugin_gen_insn_start(CPUState *cpu, const DisasContextBase *db)
         unlikely((db->pc_next & TARGET_PAGE_MASK) !=
                  (db->pc_first & TARGET_PAGE_MASK))) {
         get_page_addr_code_hostp(cpu->env_ptr, db->pc_next,
-                                 &ptb->haddr2);
+                                 true, &ptb->haddr2);
         ptb->vaddr2 = db->pc_next;
     }
     if (likely(ptb->vaddr2 == -1)) {
diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index a20234fb02..1b3403a064 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -200,11 +200,11 @@ void *probe_access(CPUArchState *env, target_ulong addr, int size,
 }
 
 tb_page_addr_t get_page_addr_code_hostp(CPUArchState *env, target_ulong addr,
-                                        void **hostp)
+                                        bool nofault, void **hostp)
 {
     int flags;
 
-    flags = probe_access_internal(env, addr, 1, MMU_INST_FETCH, true, 0);
+    flags = probe_access_internal(env, addr, 1, MMU_INST_FETCH, nofault, 0);
     if (unlikely(flags)) {
         return -1;
     }
-- 
2.34.1




* [PATCH for-7.2 13/21] accel/tcg: Unlock mmap_lock after longjmp
  2022-08-12 18:07 [PATCH for-7.2 00/21] accel/tcg: minimize tlb lookups during translate + user-only PROT_EXEC fixes Richard Henderson
                   ` (11 preceding siblings ...)
  2022-08-12 18:07 ` [PATCH for-7.2 12/21] accel/tcg: Add nofault parameter to get_page_addr_code_hostp Richard Henderson
@ 2022-08-12 18:07 ` Richard Henderson
  2022-08-12 18:07 ` [PATCH for-7.2 14/21] accel/tcg: Hoist get_page_addr_code out of tb_lookup Richard Henderson
                   ` (8 subsequent siblings)
  21 siblings, 0 replies; 32+ messages in thread
From: Richard Henderson @ 2022-08-12 18:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent, iii, alex.bennee

The mmap_lock is held around tb_gen_code.  While the comment
is correct that the lock is dropped when tb_gen_code runs out
of memory, the lock is *not* dropped when an exception is
raised reading code for translation.  Drop the lock at the
longjmp destination if it is still held, instead of asserting
that it is not.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
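Note (not part of the commit): the pattern can be sketched in
isolation.  This is a toy model under stated assumptions: the toy_*
names are hypothetical, the lock is a simple depth counter, and plain
setjmp/longjmp replaces sigsetjmp/siglongjmp.  It shows a lock that is
taken before translation, an exception that skips the normal unlock,
and the conditional unlock at the longjmp destination.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

static jmp_buf toy_jmp_env;     /* hypothetical stand-in for cpu->jmp_env */
static int toy_lock_depth;      /* toy lock: a depth counter, not a mutex */

static void toy_mmap_lock(void)      { toy_lock_depth++; }
static void toy_mmap_unlock(void)    { toy_lock_depth--; }
static bool toy_have_mmap_lock(void) { return toy_lock_depth > 0; }

/* Translation that may raise an exception while the lock is held. */
static void toy_translate(bool fault)
{
    if (fault) {
        longjmp(toy_jmp_env, 1);
    }
}

/* Returns true if the exception path was taken. */
static bool toy_exec_once(bool fault)
{
    if (setjmp(toy_jmp_env) == 0) {
        toy_mmap_lock();
        toy_translate(fault);
        toy_mmap_unlock();      /* skipped when toy_translate unwinds */
        return false;
    }
    /* longjmp destination: the lock may still be held, so release it. */
    if (toy_have_mmap_lock()) {
        toy_mmap_unlock();
    }
    return true;
}
```

An assertion that the lock is not held would fire on the faulting
path in this model; the conditional unlock leaves the depth at zero
on both paths.
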
 accel/tcg/cpu-exec.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index f6c0c0aff6..a9b7053274 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -517,13 +517,11 @@ void cpu_exec_step_atomic(CPUState *cpu)
         cpu_tb_exec(cpu, tb, &tb_exit);
         cpu_exec_exit(cpu);
     } else {
-        /*
-         * The mmap_lock is dropped by tb_gen_code if it runs out of
-         * memory.
-         */
 #ifndef CONFIG_SOFTMMU
         clear_helper_retaddr();
-        tcg_debug_assert(!have_mmap_lock());
+        if (have_mmap_lock()) {
+            mmap_unlock();
+        }
 #endif
         if (qemu_mutex_iothread_locked()) {
             qemu_mutex_unlock_iothread();
@@ -930,7 +928,9 @@ int cpu_exec(CPUState *cpu)
 
 #ifndef CONFIG_SOFTMMU
         clear_helper_retaddr();
-        tcg_debug_assert(!have_mmap_lock());
+        if (have_mmap_lock()) {
+            mmap_unlock();
+        }
 #endif
         if (qemu_mutex_iothread_locked()) {
             qemu_mutex_unlock_iothread();
-- 
2.34.1




* [PATCH for-7.2 14/21] accel/tcg: Hoist get_page_addr_code out of tb_lookup
  2022-08-12 18:07 [PATCH for-7.2 00/21] accel/tcg: minimize tlb lookups during translate + user-only PROT_EXEC fixes Richard Henderson
                   ` (12 preceding siblings ...)
  2022-08-12 18:07 ` [PATCH for-7.2 13/21] accel/tcg: Unlock mmap_lock after longjmp Richard Henderson
@ 2022-08-12 18:07 ` Richard Henderson
  2022-08-16 23:43   ` Ilya Leoshkevich
  2022-08-12 18:08 ` [PATCH for-7.2 15/21] accel/tcg: Hoist get_page_addr_code out of tb_gen_code Richard Henderson
                   ` (7 subsequent siblings)
  21 siblings, 1 reply; 32+ messages in thread
From: Richard Henderson @ 2022-08-12 18:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent, iii, alex.bennee

We will want to re-use the result of get_page_addr_code
beyond the scope of tb_lookup.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cpu-exec.c | 34 ++++++++++++++++++++++++----------
 1 file changed, 24 insertions(+), 10 deletions(-)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index a9b7053274..889355b341 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -209,13 +209,12 @@ static bool tb_lookup_cmp(const void *p, const void *d)
 }
 
 /* Might cause an exception, so have a longjmp destination ready */
-static TranslationBlock *tb_lookup(CPUState *cpu, target_ulong pc,
-                                   target_ulong cs_base,
+static TranslationBlock *tb_lookup(CPUState *cpu, tb_page_addr_t phys_pc,
+                                   target_ulong pc, target_ulong cs_base,
                                    uint32_t flags, uint32_t cflags)
 {
     CPUArchState *env = cpu->env_ptr;
     TranslationBlock *tb;
-    tb_page_addr_t phys_pc;
     struct tb_desc desc;
     uint32_t jmp_hash, tb_hash;
 
@@ -240,11 +239,8 @@ static TranslationBlock *tb_lookup(CPUState *cpu, target_ulong pc,
     desc.cflags = cflags;
     desc.trace_vcpu_dstate = *cpu->trace_dstate;
     desc.pc = pc;
-    phys_pc = get_page_addr_code(desc.env, pc);
-    if (phys_pc == -1) {
-        return NULL;
-    }
     desc.phys_page1 = phys_pc & TARGET_PAGE_MASK;
+
     tb_hash = tb_hash_func(phys_pc, pc, flags, cflags, *cpu->trace_dstate);
     tb = qht_lookup_custom(&tb_ctx.htable, &desc, tb_hash, tb_lookup_cmp);
     if (tb == NULL) {
@@ -371,6 +367,7 @@ const void *HELPER(lookup_tb_ptr)(CPUArchState *env)
     TranslationBlock *tb;
     target_ulong cs_base, pc;
     uint32_t flags, cflags;
+    tb_page_addr_t phys_pc;
 
     cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
 
@@ -379,7 +376,12 @@ const void *HELPER(lookup_tb_ptr)(CPUArchState *env)
         cpu_loop_exit(cpu);
     }
 
-    tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
+    phys_pc = get_page_addr_code(env, pc);
+    if (phys_pc == -1) {
+        return tcg_code_gen_epilogue;
+    }
+
+    tb = tb_lookup(cpu, phys_pc, pc, cs_base, flags, cflags);
     if (tb == NULL) {
         return tcg_code_gen_epilogue;
     }
@@ -482,6 +484,7 @@ void cpu_exec_step_atomic(CPUState *cpu)
     TranslationBlock *tb;
     target_ulong cs_base, pc;
     uint32_t flags, cflags;
+    tb_page_addr_t phys_pc;
     int tb_exit;
 
     if (sigsetjmp(cpu->jmp_env, 0) == 0) {
@@ -504,7 +507,12 @@ void cpu_exec_step_atomic(CPUState *cpu)
          * Any breakpoint for this insn will have been recognized earlier.
          */
 
-        tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
+        phys_pc = get_page_addr_code(env, pc);
+        if (phys_pc == -1) {
+            tb = NULL;
+        } else {
+            tb = tb_lookup(cpu, phys_pc, pc, cs_base, flags, cflags);
+        }
         if (tb == NULL) {
             mmap_lock();
             tb = tb_gen_code(cpu, pc, cs_base, flags, cflags);
@@ -949,6 +957,7 @@ int cpu_exec(CPUState *cpu)
             TranslationBlock *tb;
             target_ulong cs_base, pc;
             uint32_t flags, cflags;
+            tb_page_addr_t phys_pc;
 
             cpu_get_tb_cpu_state(cpu->env_ptr, &pc, &cs_base, &flags);
 
@@ -970,7 +979,12 @@ int cpu_exec(CPUState *cpu)
                 break;
             }
 
-            tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
+            phys_pc = get_page_addr_code(cpu->env_ptr, pc);
+            if (phys_pc == -1) {
+                tb = NULL;
+            } else {
+                tb = tb_lookup(cpu, phys_pc, pc, cs_base, flags, cflags);
+            }
             if (tb == NULL) {
                 mmap_lock();
                 tb = tb_gen_code(cpu, pc, cs_base, flags, cflags);
-- 
2.34.1




* [PATCH for-7.2 15/21] accel/tcg: Hoist get_page_addr_code out of tb_gen_code
  2022-08-12 18:07 [PATCH for-7.2 00/21] accel/tcg: minimize tlb lookups during translate + user-only PROT_EXEC fixes Richard Henderson
                   ` (13 preceding siblings ...)
  2022-08-12 18:07 ` [PATCH for-7.2 14/21] accel/tcg: Hoist get_page_addr_code out of tb_lookup Richard Henderson
@ 2022-08-12 18:08 ` Richard Henderson
  2022-08-12 18:08 ` [PATCH for-7.2 16/21] accel/tcg: Raise PROT_EXEC exception early Richard Henderson
                   ` (6 subsequent siblings)
  21 siblings, 0 replies; 32+ messages in thread
From: Richard Henderson @ 2022-08-12 18:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent, iii, alex.bennee

Reuse the get_page_addr_code result just computed for tb_lookup.
Pass in host_pc while touching these lines, to be used shortly.
We must widen the scope of the mmap_lock, so that the page table
lookup that is finally used is covered by the lock.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
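Note (not part of the commit): a minimal standalone sketch of the
hoisting idea, in which the address lookup is done once by the caller
and the result is handed to both the cache lookup and the generator.
All toy_* names, the fixed-size cache and the address arithmetic are
assumptions for illustration; the mmap_lock scope is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NO_PAGE ((uint64_t)-1)
#define TOY_CACHE_SIZE 4

typedef struct {
    uint64_t phys_pc;
    uint64_t pc;
} ToyTB;

static int toy_page_lookups;            /* counts address translations */
static ToyTB toy_cache[TOY_CACHE_SIZE];
static size_t toy_cache_len;
static ToyTB toy_uncached_tb;           /* used when the block is not cached */

/* Stand-in for the expensive virtual-to-physical code lookup. */
static uint64_t toy_get_page_addr_code(uint64_t pc)
{
    toy_page_lookups++;
    return pc >= 0x8000 ? TOY_NO_PAGE : pc + 0x100000;
}

/* The lookup receives phys_pc from its caller instead of recomputing it. */
static ToyTB *toy_tb_lookup(uint64_t phys_pc, uint64_t pc)
{
    for (size_t i = 0; i < toy_cache_len; i++) {
        if (toy_cache[i].phys_pc == phys_pc && toy_cache[i].pc == pc) {
            return &toy_cache[i];
        }
    }
    return NULL;
}

/* The generator reuses the same phys_pc as well. */
static ToyTB *toy_tb_gen(uint64_t phys_pc, uint64_t pc)
{
    ToyTB *tb;

    if (phys_pc == TOY_NO_PAGE || toy_cache_len == TOY_CACHE_SIZE) {
        tb = &toy_uncached_tb;          /* one-shot block, never cached */
    } else {
        tb = &toy_cache[toy_cache_len++];
    }
    tb->phys_pc = phys_pc;
    tb->pc = pc;
    return tb;
}

static ToyTB *toy_find_or_gen(uint64_t pc)
{
    uint64_t phys_pc = toy_get_page_addr_code(pc);  /* exactly once */
    ToyTB *tb = NULL;

    if (phys_pc != TOY_NO_PAGE) {
        tb = toy_tb_lookup(phys_pc, pc);
    }
    if (tb == NULL) {
        tb = toy_tb_gen(phys_pc, pc);
    }
    return tb;
}
```

Each call to toy_find_or_gen performs one address translation
regardless of whether the block is found or has to be generated,
which is the property the counter toy_page_lookups makes visible.
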
 accel/tcg/internal.h      |  7 ++++---
 accel/tcg/cpu-exec.c      | 20 ++++++++++++--------
 accel/tcg/translate-all.c |  5 ++---
 3 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/accel/tcg/internal.h b/accel/tcg/internal.h
index 3092bfa964..920d35e8bb 100644
--- a/accel/tcg/internal.h
+++ b/accel/tcg/internal.h
@@ -11,9 +11,10 @@
 
 #include "exec/exec-all.h"
 
-TranslationBlock *tb_gen_code(CPUState *cpu, target_ulong pc,
-                              target_ulong cs_base, uint32_t flags,
-                              int cflags);
+TranslationBlock *tb_gen_code(CPUState *cpu,
+                              tb_page_addr_t phys_pc, void *host_pc,
+                              target_ulong pc, target_ulong cs_base,
+                              uint32_t flags, int cflags);
 G_NORETURN void cpu_io_recompile(CPUState *cpu, uintptr_t retaddr);
 void page_init(void);
 void tb_htable_init(void);
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 889355b341..5278d1837b 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -485,6 +485,7 @@ void cpu_exec_step_atomic(CPUState *cpu)
     target_ulong cs_base, pc;
     uint32_t flags, cflags;
     tb_page_addr_t phys_pc;
+    void *host_pc;
     int tb_exit;
 
     if (sigsetjmp(cpu->jmp_env, 0) == 0) {
@@ -507,17 +508,17 @@ void cpu_exec_step_atomic(CPUState *cpu)
          * Any breakpoint for this insn will have been recognized earlier.
          */
 
-        phys_pc = get_page_addr_code(env, pc);
+        mmap_lock();
+        phys_pc = get_page_addr_code_hostp(env, pc, true, &host_pc);
         if (phys_pc == -1) {
             tb = NULL;
         } else {
             tb = tb_lookup(cpu, phys_pc, pc, cs_base, flags, cflags);
         }
         if (tb == NULL) {
-            mmap_lock();
-            tb = tb_gen_code(cpu, pc, cs_base, flags, cflags);
-            mmap_unlock();
+            tb = tb_gen_code(cpu, phys_pc, host_pc, pc, cs_base, flags, cflags);
         }
+        mmap_unlock();
 
         cpu_exec_enter(cpu);
         /* execute the generated code */
@@ -958,6 +959,7 @@ int cpu_exec(CPUState *cpu)
             target_ulong cs_base, pc;
             uint32_t flags, cflags;
             tb_page_addr_t phys_pc;
+            void *host_pc;
 
             cpu_get_tb_cpu_state(cpu->env_ptr, &pc, &cs_base, &flags);
 
@@ -979,22 +981,24 @@ int cpu_exec(CPUState *cpu)
                 break;
             }
 
-            phys_pc = get_page_addr_code(cpu->env_ptr, pc);
+            mmap_lock();
+            phys_pc = get_page_addr_code_hostp(cpu->env_ptr, pc,
+                                               true, &host_pc);
             if (phys_pc == -1) {
                 tb = NULL;
             } else {
                 tb = tb_lookup(cpu, phys_pc, pc, cs_base, flags, cflags);
             }
             if (tb == NULL) {
-                mmap_lock();
-                tb = tb_gen_code(cpu, pc, cs_base, flags, cflags);
-                mmap_unlock();
+                tb = tb_gen_code(cpu, phys_pc, host_pc, pc,
+                                 cs_base, flags, cflags);
                 /*
                  * We add the TB in the virtual pc hash table
                  * for the fast lookup
                  */
                 qatomic_set(&cpu->tb_jmp_cache[tb_jmp_cache_hash_func(pc)], tb);
             }
+            mmap_unlock();
 
 #ifndef CONFIG_USER_ONLY
             /*
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 65a23f47d6..86e7644c1b 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1326,12 +1326,13 @@ tb_link_page(TranslationBlock *tb, tb_page_addr_t phys_pc,
 
 /* Called with mmap_lock held for user mode emulation.  */
 TranslationBlock *tb_gen_code(CPUState *cpu,
+                              tb_page_addr_t phys_pc, void *host_pc,
                               target_ulong pc, target_ulong cs_base,
                               uint32_t flags, int cflags)
 {
     CPUArchState *env = cpu->env_ptr;
     TranslationBlock *tb, *existing_tb;
-    tb_page_addr_t phys_pc, phys_page2;
+    tb_page_addr_t phys_page2;
     target_ulong virt_page2;
     tcg_insn_unit *gen_code_buf;
     int gen_code_size, search_size, max_insns;
@@ -1343,8 +1344,6 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
     assert_memory_lock();
     qemu_thread_jit_write();
 
-    phys_pc = get_page_addr_code(env, pc);
-
     if (phys_pc == -1) {
         /* Generate a one-shot TB with 1 insn in it */
         cflags = (cflags & ~CF_COUNT_MASK) | CF_LAST_IO | 1;
-- 
2.34.1




* [PATCH for-7.2 16/21] accel/tcg: Raise PROT_EXEC exception early
  2022-08-12 18:07 [PATCH for-7.2 00/21] accel/tcg: minimize tlb lookups during translate + user-only PROT_EXEC fixes Richard Henderson
                   ` (14 preceding siblings ...)
  2022-08-12 18:08 ` [PATCH for-7.2 15/21] accel/tcg: Hoist get_page_addr_code out of tb_gen_code Richard Henderson
@ 2022-08-12 18:08 ` Richard Henderson
  2022-08-12 18:08 ` [PATCH for-7.2 17/21] accel/tcg: Introduce is_same_page() Richard Henderson
                   ` (5 subsequent siblings)
  21 siblings, 0 replies; 32+ messages in thread
From: Richard Henderson @ 2022-08-12 18:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent, iii, alex.bennee

We currently ignore PROT_EXEC on the initial lookup, and
defer raising the exception until cpu_ld*_code().
It makes more sense to raise the exception early, so pass
nofault=false to get_page_addr_code_hostp at the lookup sites.
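
As a rough illustration of the two behaviours, the sketch below models a
lookup with a nofault flag. It is an assumption-laden toy: the model_
names, the page layout and the use of longjmp to stand in for raising a
guest exception are all hypothetical and are not taken from QEMU.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model, not QEMU code. */
enum { MODEL_PROT_READ = 1, MODEL_PROT_EXEC = 4 };

static jmp_buf model_fault_env;   /* stands in for cpu->jmp_env */

/* Page 0 is executable, page 1 is read-only data, the rest is unmapped. */
static int model_page_prot(uint64_t pc)
{
    switch (pc >> 12) {
    case 0:  return MODEL_PROT_READ | MODEL_PROT_EXEC;
    case 1:  return MODEL_PROT_READ;
    default: return 0;
    }
}

/*
 * With nofault == true, report failure by returning -1 and let the
 * caller decide. With nofault == false, raise the fault right here
 * (modelled with longjmp) instead of waiting for a later code load.
 */
static int64_t model_get_page_addr_code(uint64_t pc, bool nofault)
{
    if (!(model_page_prot(pc) & MODEL_PROT_EXEC)) {
        if (nofault) {
            return -1;
        }
        longjmp(model_fault_env, 1);
    }
    return (int64_t)pc;
}

/* Returns true if looking up pc raised a fault; else stores the result. */
static bool model_lookup_faults(uint64_t pc, bool nofault, int64_t *result)
{
    assert(result != NULL);
    if (setjmp(model_fault_env) != 0) {
        return true;
    }
    *result = model_get_page_addr_code(pc, nofault);
    return false;
}
```

In this model, a lookup on the non-executable page with nofault set
quietly yields -1, while the same lookup with nofault clear takes the
fault path immediately.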

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cpu-exec.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 5278d1837b..6a3ca8224f 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -376,7 +376,7 @@ const void *HELPER(lookup_tb_ptr)(CPUArchState *env)
         cpu_loop_exit(cpu);
     }
 
-    phys_pc = get_page_addr_code(env, pc);
+    phys_pc = get_page_addr_code_hostp(env, pc, false, NULL);
     if (phys_pc == -1) {
         return tcg_code_gen_epilogue;
     }
@@ -509,7 +509,7 @@ void cpu_exec_step_atomic(CPUState *cpu)
          */
 
         mmap_lock();
-        phys_pc = get_page_addr_code_hostp(env, pc, true, &host_pc);
+        phys_pc = get_page_addr_code_hostp(env, pc, false, &host_pc);
         if (phys_pc == -1) {
             tb = NULL;
         } else {
@@ -983,7 +983,7 @@ int cpu_exec(CPUState *cpu)
 
             mmap_lock();
             phys_pc = get_page_addr_code_hostp(cpu->env_ptr, pc,
-                                               true, &host_pc);
+                                               false, &host_pc);
             if (phys_pc == -1) {
                 tb = NULL;
             } else {
-- 
2.34.1




* [PATCH for-7.2 17/21] accel/tcg: Introduce is_same_page()
  2022-08-12 18:07 [PATCH for-7.2 00/21] accel/tcg: minimize tlb lookups during translate + user-only PROT_EXEC fixes Richard Henderson
                   ` (15 preceding siblings ...)
  2022-08-12 18:08 ` [PATCH for-7.2 16/21] accel/tcg: Raise PROT_EXEC exception early Richard Henderson
@ 2022-08-12 18:08 ` Richard Henderson
  2022-08-12 18:08 ` [PATCH for-7.2 18/21] accel/tcg: Remove translator_ldsw Richard Henderson
                   ` (4 subsequent siblings)
  21 siblings, 0 replies; 32+ messages in thread
From: Richard Henderson @ 2022-08-12 18:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent, iii, alex.bennee

From: Ilya Leoshkevich <iii@linux.ibm.com>

Introduce a function that checks whether a given address is on the same
page as where disassembly started. Having it improves readability of
the following patches.
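
A standalone sketch of the check, for readers without the tree at hand.
It assumes 4 KiB pages and uses hypothetical model_ names; the real
helper works on DisasContextBase and TARGET_PAGE_MASK, whose value
depends on the target. The second function is an invented example of
how a translator might ask whether a whole instruction stays on the
starting page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed 4 KiB pages; the real mask is target-dependent. */
#define MODEL_PAGE_BITS 12
#define MODEL_PAGE_MASK (~(((uint64_t)1 << MODEL_PAGE_BITS) - 1))

/* XOR clears the bits both addresses share; the mask drops the offset. */
static bool model_is_same_page(uint64_t pc_first, uint64_t addr)
{
    return ((addr ^ pc_first) & MODEL_PAGE_MASK) == 0;
}

/* Hypothetical use: do the first and last byte of an insn stay on the page? */
static bool model_insn_fits_page(uint64_t pc_first, uint64_t pc, unsigned len)
{
    assert(len > 0);
    return model_is_same_page(pc_first, pc) &&
           model_is_same_page(pc_first, pc + len - 1);
}
```

For example, with a block starting at 0x1000, a 4-byte instruction at
0x1ffc fits on the page, while one at 0x1ffe spills onto the next page.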

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20220811095534.241224-3-iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[rth: Make the DisasContextBase parameter const.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/translator.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/exec/translator.h b/include/exec/translator.h
index 7db6845535..0d0bf3a31e 100644
--- a/include/exec/translator.h
+++ b/include/exec/translator.h
@@ -187,4 +187,14 @@ FOR_EACH_TRANSLATOR_LD(GEN_TRANSLATOR_LD)
 
 #undef GEN_TRANSLATOR_LD
 
+/*
+ * Return whether addr is on the same page as where disassembly started.
+ * Translators can use this to enforce the rule that only single-insn
+ * translation blocks are allowed to cross page boundaries.
+ */
+static inline bool is_same_page(const DisasContextBase *db, target_ulong addr)
+{
+    return ((addr ^ db->pc_first) & TARGET_PAGE_MASK) == 0;
+}
+
 #endif /* EXEC__TRANSLATOR_H */
-- 
2.34.1




* [PATCH for-7.2 18/21] accel/tcg: Remove translator_ldsw
  2022-08-12 18:07 [PATCH for-7.2 00/21] accel/tcg: minimize tlb lookups during translate + user-only PROT_EXEC fixes Richard Henderson
                   ` (16 preceding siblings ...)
  2022-08-12 18:08 ` [PATCH for-7.2 17/21] accel/tcg: Introduce is_same_page() Richard Henderson
@ 2022-08-12 18:08 ` Richard Henderson
  2022-08-12 18:08 ` [PATCH for-7.2 19/21] accel/tcg: Add pc and host_pc params to gen_intermediate_code Richard Henderson
                   ` (3 subsequent siblings)
  21 siblings, 0 replies; 32+ messages in thread
From: Richard Henderson @ 2022-08-12 18:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent, iii, alex.bennee

The only user, x86_ldsw_code, can easily use translator_lduw and
let its int16_t return type convert the result to signed.
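
The conversion relied on here can be sketched in isolation. The
model_ functions below are hypothetical stand-ins, not the translator
helpers, and the byte order is fixed to little-endian for simplicity.
Note one assumption: converting an out-of-range uint16_t to int16_t is
implementation-defined before C23; the sketch assumes the usual
two's-complement wraparound that GCC and Clang document.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an unsigned 16-bit little-endian code load. */
static uint16_t model_lduw(const uint8_t *code)
{
    return (uint16_t)(code[0] | (code[1] << 8));
}

/* The signed variant needs no loader of its own: convert on return. */
static int16_t model_ldsw(const uint8_t *code)
{
    assert(code != NULL);
    return (int16_t)model_lduw(code);
}
```

So the bytes fe ff load as 0xfffe unsigned and as -2 through the signed
wrapper, which is why a separate signed loader adds nothing.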

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/translator.h   | 1 -
 target/i386/tcg/translate.c | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/include/exec/translator.h b/include/exec/translator.h
index 0d0bf3a31e..45b9268ca4 100644
--- a/include/exec/translator.h
+++ b/include/exec/translator.h
@@ -178,7 +178,6 @@ bool translator_use_goto_tb(DisasContextBase *db, target_ulong dest);
 
 #define FOR_EACH_TRANSLATOR_LD(F)                                       \
     F(translator_ldub, uint8_t, cpu_ldub_code, /* no swap */)           \
-    F(translator_ldsw, int16_t, cpu_ldsw_code, bswap16)                 \
     F(translator_lduw, uint16_t, cpu_lduw_code, bswap16)                \
     F(translator_ldl, uint32_t, cpu_ldl_code, bswap32)                  \
     F(translator_ldq, uint64_t, cpu_ldq_code, bswap64)
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index b7972f0ff5..a23417d058 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2033,7 +2033,7 @@ static inline uint8_t x86_ldub_code(CPUX86State *env, DisasContext *s)
 
 static inline int16_t x86_ldsw_code(CPUX86State *env, DisasContext *s)
 {
-    return translator_ldsw(env, &s->base, advance_pc(env, s, 2));
+    return translator_lduw(env, &s->base, advance_pc(env, s, 2));
 }
 
 static inline uint16_t x86_lduw_code(CPUX86State *env, DisasContext *s)
-- 
2.34.1




* [PATCH for-7.2 19/21] accel/tcg: Add pc and host_pc params to gen_intermediate_code
  2022-08-12 18:07 [PATCH for-7.2 00/21] accel/tcg: minimize tlb lookups during translate + user-only PROT_EXEC fixes Richard Henderson
                   ` (17 preceding siblings ...)
  2022-08-12 18:08 ` [PATCH for-7.2 18/21] accel/tcg: Remove translator_ldsw Richard Henderson
@ 2022-08-12 18:08 ` Richard Henderson
  2022-08-12 18:08 ` [PATCH for-7.2 20/21] accel/tcg: Add fast path for translator_ld* Richard Henderson
                   ` (2 subsequent siblings)
  21 siblings, 0 replies; 32+ messages in thread
From: Richard Henderson @ 2022-08-12 18:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent, iii, alex.bennee

Pass pc and host_pc along to translator_loop -- pc may be used
instead of tb->pc, and host_pc is currently unused.  Adjust all
targets at once.
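
The shape of the plumbing, reduced to a toy: all types and model_ names
below are hypothetical and far smaller than the QEMU ones. Unlike the
patch, where host_pc is not yet consumed, the sketch stores it in the
context purely to make the forwarding visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily reduced stand-ins for the QEMU types. */
typedef struct { uint64_t pc; } ModelTB;
typedef struct {
    ModelTB *tb;
    uint64_t pc_first;
    uint64_t pc_next;
    void *host_pc;       /* kept only to show the plumbing */
} ModelDisasBase;

/* The generic loop seeds its context from the pc argument, not tb->pc. */
static void model_translator_loop(ModelTB *tb, uint64_t pc, void *host_pc,
                                  ModelDisasBase *db)
{
    assert(tb != NULL && db != NULL);
    db->tb = tb;
    db->pc_first = pc;
    db->pc_next = pc;
    db->host_pc = host_pc;
}

/* Each target's entry point only forwards the two new arguments. */
static void model_gen_intermediate_code(ModelTB *tb, uint64_t pc,
                                        void *host_pc, ModelDisasBase *db)
{
    model_translator_loop(tb, pc, host_pc, db);
}
```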

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/exec-all.h       |  1 -
 include/exec/translator.h     | 24 ++++++++++++++++++++----
 accel/tcg/translate-all.c     |  3 ++-
 accel/tcg/translator.c        |  9 +++++----
 target/alpha/translate.c      |  5 +++--
 target/arm/translate.c        |  5 +++--
 target/avr/translate.c        |  5 +++--
 target/cris/translate.c       |  5 +++--
 target/hexagon/translate.c    |  6 ++++--
 target/hppa/translate.c       |  5 +++--
 target/i386/tcg/translate.c   |  5 +++--
 target/loongarch/translate.c  |  6 ++++--
 target/m68k/translate.c       |  5 +++--
 target/microblaze/translate.c |  5 +++--
 target/mips/tcg/translate.c   |  5 +++--
 target/nios2/translate.c      |  5 +++--
 target/openrisc/translate.c   |  6 ++++--
 target/ppc/translate.c        |  5 +++--
 target/riscv/translate.c      |  5 +++--
 target/rx/translate.c         |  5 +++--
 target/s390x/tcg/translate.c  |  5 +++--
 target/sh4/translate.c        |  5 +++--
 target/sparc/translate.c      |  5 +++--
 target/tricore/translate.c    |  6 ++++--
 target/xtensa/translate.c     |  6 ++++--
 25 files changed, 95 insertions(+), 52 deletions(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 7a6dc44d86..4ad166966b 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -39,7 +39,6 @@ typedef ram_addr_t tb_page_addr_t;
 #define TB_PAGE_ADDR_FMT RAM_ADDR_FMT
 #endif
 
-void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb, int max_insns);
 void restore_state_to_opc(CPUArchState *env, TranslationBlock *tb,
                           target_ulong *data);
 
diff --git a/include/exec/translator.h b/include/exec/translator.h
index 45b9268ca4..69db0f5c21 100644
--- a/include/exec/translator.h
+++ b/include/exec/translator.h
@@ -26,6 +26,19 @@
 #include "exec/translate-all.h"
 #include "tcg/tcg.h"
 
+/**
+ * gen_intermediate_code
+ * @cpu: cpu context
+ * @tb: translation block
+ * @max_insns: max number of instructions to translate
+ * @pc: guest virtual program counter address
+ * @host_pc: host virtual address corresponding to @pc
+ *
+ * This function must be provided by the target, which should create
+ * the target-specific DisasContext, and then invoke translator_loop.
+ */
+void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb, int max_insns,
+                           target_ulong pc, void *host_pc);
 
 /**
  * DisasJumpType:
@@ -123,11 +136,13 @@ typedef struct TranslatorOps {
 
 /**
  * translator_loop:
- * @ops: Target-specific operations.
- * @db: Disassembly context.
  * @cpu: Target vCPU.
  * @tb: Translation block.
  * @max_insns: Maximum number of insns to translate.
+ * @pc: guest virtual program counter address
+ * @host_pc: host virtual address corresponding to @pc
+ * @ops: Target-specific operations.
+ * @db: Disassembly context.
  *
  * Generic translator loop.
  *
@@ -141,8 +156,9 @@ typedef struct TranslatorOps {
  * - When single-stepping is enabled (system-wide or on the current vCPU).
  * - When too many instructions have been translated.
  */
-void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
-                     CPUState *cpu, TranslationBlock *tb, int max_insns);
+void translator_loop(CPUState *cpu, TranslationBlock *tb, int max_insns,
+                     target_ulong pc, void *host_pc,
+                     const TranslatorOps *ops, DisasContextBase *db);
 
 void translator_loop_temp_check(DisasContextBase *db);
 
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 86e7644c1b..d52097ab2d 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -46,6 +46,7 @@
 
 #include "exec/cputlb.h"
 #include "exec/translate-all.h"
+#include "exec/translator.h"
 #include "qemu/bitmap.h"
 #include "qemu/qemu-print.h"
 #include "qemu/timer.h"
@@ -1390,7 +1391,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
     tcg_func_start(tcg_ctx);
 
     tcg_ctx->cpu = env_cpu(env);
-    gen_intermediate_code(cpu, tb, max_insns);
+    gen_intermediate_code(cpu, tb, max_insns, pc, host_pc);
     assert(tb->size != 0);
     tcg_ctx->cpu = NULL;
     max_insns = tb->icount;
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index fe7af9b943..3eef30d93a 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -51,16 +51,17 @@ static inline void translator_page_protect(DisasContextBase *dcbase,
 #endif
 }
 
-void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
-                     CPUState *cpu, TranslationBlock *tb, int max_insns)
+void translator_loop(CPUState *cpu, TranslationBlock *tb, int max_insns,
+                     target_ulong pc, void *host_pc,
+                     const TranslatorOps *ops, DisasContextBase *db)
 {
     uint32_t cflags = tb_cflags(tb);
     bool plugin_enabled;
 
     /* Initialize DisasContext */
     db->tb = tb;
-    db->pc_first = tb->pc;
-    db->pc_next = db->pc_first;
+    db->pc_first = pc;
+    db->pc_next = pc;
     db->is_jmp = DISAS_NEXT;
     db->num_insns = 0;
     db->max_insns = max_insns;
diff --git a/target/alpha/translate.c b/target/alpha/translate.c
index 9af1627079..6766350f56 100644
--- a/target/alpha/translate.c
+++ b/target/alpha/translate.c
@@ -3043,10 +3043,11 @@ static const TranslatorOps alpha_tr_ops = {
     .disas_log          = alpha_tr_disas_log,
 };
 
-void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb, int max_insns)
+void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb, int max_insns,
+                           target_ulong pc, void *host_pc)
 {
     DisasContext dc;
-    translator_loop(&alpha_tr_ops, &dc.base, cpu, tb, max_insns);
+    translator_loop(cpu, tb, max_insns, pc, host_pc, &alpha_tr_ops, &dc.base);
 }
 
 void restore_state_to_opc(CPUAlphaState *env, TranslationBlock *tb,
diff --git a/target/arm/translate.c b/target/arm/translate.c
index ad617b9948..9474e4b44b 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -9892,7 +9892,8 @@ static const TranslatorOps thumb_translator_ops = {
 };
 
 /* generate intermediate code for basic block 'tb'.  */
-void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb, int max_insns)
+void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb, int max_insns,
+                           target_ulong pc, void *host_pc)
 {
     DisasContext dc = { };
     const TranslatorOps *ops = &arm_translator_ops;
@@ -9907,7 +9908,7 @@ void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb, int max_insns)
     }
 #endif
 
-    translator_loop(ops, &dc.base, cpu, tb, max_insns);
+    translator_loop(cpu, tb, max_insns, pc, host_pc, ops, &dc.base);
 }
 
 void restore_state_to_opc(CPUARMState *env, TranslationBlock *tb,
diff --git a/target/avr/translate.c b/target/avr/translate.c
index dc9c3d6bcc..1da34da103 100644
--- a/target/avr/translate.c
+++ b/target/avr/translate.c
@@ -3031,10 +3031,11 @@ static const TranslatorOps avr_tr_ops = {
     .disas_log          = avr_tr_disas_log,
 };
 
-void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns)
+void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns,
+                           target_ulong pc, void *host_pc)
 {
     DisasContext dc = { };
-    translator_loop(&avr_tr_ops, &dc.base, cs, tb, max_insns);
+    translator_loop(cs, tb, max_insns, pc, host_pc, &avr_tr_ops, &dc.base);
 }
 
 void restore_state_to_opc(CPUAVRState *env, TranslationBlock *tb,
diff --git a/target/cris/translate.c b/target/cris/translate.c
index ac101344a3..73385b0b3c 100644
--- a/target/cris/translate.c
+++ b/target/cris/translate.c
@@ -3286,10 +3286,11 @@ static const TranslatorOps cris_tr_ops = {
     .disas_log          = cris_tr_disas_log,
 };
 
-void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns)
+void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns,
+                           target_ulong pc, void *host_pc)
 {
     DisasContext dc;
-    translator_loop(&cris_tr_ops, &dc.base, cs, tb, max_insns);
+    translator_loop(cs, tb, max_insns, pc, host_pc, &cris_tr_ops, &dc.base);
 }
 
 void cris_cpu_dump_state(CPUState *cs, FILE *f, int flags)
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index d4fc92f7e9..0e8a0772f7 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -850,11 +850,13 @@ static const TranslatorOps hexagon_tr_ops = {
     .disas_log          = hexagon_tr_disas_log,
 };
 
-void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns)
+void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns,
+                           target_ulong pc, void *host_pc)
 {
     DisasContext ctx;
 
-    translator_loop(&hexagon_tr_ops, &ctx.base, cs, tb, max_insns);
+    translator_loop(cs, tb, max_insns, pc, host_pc,
+                    &hexagon_tr_ops, &ctx.base);
 }
 
 #define NAME_LEN               64
diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index b8dbfee5e9..8b861957e0 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -4340,10 +4340,11 @@ static const TranslatorOps hppa_tr_ops = {
     .disas_log          = hppa_tr_disas_log,
 };
 
-void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns)
+void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns,
+                           target_ulong pc, void *host_pc)
 {
     DisasContext ctx;
-    translator_loop(&hppa_tr_ops, &ctx.base, cs, tb, max_insns);
+    translator_loop(cs, tb, max_insns, pc, host_pc, &hppa_tr_ops, &ctx.base);
 }
 
 void restore_state_to_opc(CPUHPPAState *env, TranslationBlock *tb,
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index a23417d058..4836c889e0 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -8708,11 +8708,12 @@ static const TranslatorOps i386_tr_ops = {
 };
 
 /* generate intermediate code for basic block 'tb'.  */
-void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb, int max_insns)
+void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb, int max_insns,
+                           target_ulong pc, void *host_pc)
 {
     DisasContext dc;
 
-    translator_loop(&i386_tr_ops, &dc.base, cpu, tb, max_insns);
+    translator_loop(cpu, tb, max_insns, pc, host_pc, &i386_tr_ops, &dc.base);
 }
 
 void restore_state_to_opc(CPUX86State *env, TranslationBlock *tb,
diff --git a/target/loongarch/translate.c b/target/loongarch/translate.c
index 51ba291430..95b37ea180 100644
--- a/target/loongarch/translate.c
+++ b/target/loongarch/translate.c
@@ -241,11 +241,13 @@ static const TranslatorOps loongarch_tr_ops = {
     .disas_log          = loongarch_tr_disas_log,
 };
 
-void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns)
+void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns,
+                           target_ulong pc, void *host_pc)
 {
     DisasContext ctx;
 
-    translator_loop(&loongarch_tr_ops, &ctx.base, cs, tb, max_insns);
+    translator_loop(cs, tb, max_insns, pc, host_pc,
+                    &loongarch_tr_ops, &ctx.base);
 }
 
 void loongarch_translate_init(void)
diff --git a/target/m68k/translate.c b/target/m68k/translate.c
index 8f3c298ad0..5098f7e570 100644
--- a/target/m68k/translate.c
+++ b/target/m68k/translate.c
@@ -6361,10 +6361,11 @@ static const TranslatorOps m68k_tr_ops = {
     .disas_log          = m68k_tr_disas_log,
 };
 
-void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb, int max_insns)
+void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb, int max_insns,
+                           target_ulong pc, void *host_pc)
 {
     DisasContext dc;
-    translator_loop(&m68k_tr_ops, &dc.base, cpu, tb, max_insns);
+    translator_loop(cpu, tb, max_insns, pc, host_pc, &m68k_tr_ops, &dc.base);
 }
 
 static double floatx80_to_double(CPUM68KState *env, uint16_t high, uint64_t low)
diff --git a/target/microblaze/translate.c b/target/microblaze/translate.c
index bf01384d33..c5546f93aa 100644
--- a/target/microblaze/translate.c
+++ b/target/microblaze/translate.c
@@ -1849,10 +1849,11 @@ static const TranslatorOps mb_tr_ops = {
     .disas_log          = mb_tr_disas_log,
 };
 
-void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb, int max_insns)
+void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb, int max_insns,
+                           target_ulong pc, void *host_pc)
 {
     DisasContext dc;
-    translator_loop(&mb_tr_ops, &dc.base, cpu, tb, max_insns);
+    translator_loop(cpu, tb, max_insns, pc, host_pc, &mb_tr_ops, &dc.base);
 }
 
 void mb_cpu_dump_state(CPUState *cs, FILE *f, int flags)
diff --git a/target/mips/tcg/translate.c b/target/mips/tcg/translate.c
index de1511baaf..0d936e2648 100644
--- a/target/mips/tcg/translate.c
+++ b/target/mips/tcg/translate.c
@@ -16155,11 +16155,12 @@ static const TranslatorOps mips_tr_ops = {
     .disas_log          = mips_tr_disas_log,
 };
 
-void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns)
+void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns,
+                           target_ulong pc, void *host_pc)
 {
     DisasContext ctx;
 
-    translator_loop(&mips_tr_ops, &ctx.base, cs, tb, max_insns);
+    translator_loop(cs, tb, max_insns, pc, host_pc, &mips_tr_ops, &ctx.base);
 }
 
 void mips_tcg_init(void)
diff --git a/target/nios2/translate.c b/target/nios2/translate.c
index 3a037a68cc..c588e8e885 100644
--- a/target/nios2/translate.c
+++ b/target/nios2/translate.c
@@ -1038,10 +1038,11 @@ static const TranslatorOps nios2_tr_ops = {
     .disas_log          = nios2_tr_disas_log,
 };
 
-void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns)
+void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns,
+                           target_ulong pc, void *host_pc)
 {
     DisasContext dc;
-    translator_loop(&nios2_tr_ops, &dc.base, cs, tb, max_insns);
+    translator_loop(cs, tb, max_insns, pc, host_pc, &nios2_tr_ops, &dc.base);
 }
 
 void nios2_cpu_dump_state(CPUState *cs, FILE *f, int flags)
diff --git a/target/openrisc/translate.c b/target/openrisc/translate.c
index 7b8ad43d5f..8154f9d744 100644
--- a/target/openrisc/translate.c
+++ b/target/openrisc/translate.c
@@ -1705,11 +1705,13 @@ static const TranslatorOps openrisc_tr_ops = {
     .disas_log          = openrisc_tr_disas_log,
 };
 
-void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns)
+void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns,
+                           target_ulong pc, void *host_pc)
 {
     DisasContext ctx;
 
-    translator_loop(&openrisc_tr_ops, &ctx.base, cs, tb, max_insns);
+    translator_loop(cs, tb, max_insns, pc, host_pc,
+                    &openrisc_tr_ops, &ctx.base);
 }
 
 void openrisc_cpu_dump_state(CPUState *cs, FILE *f, int flags)
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 388337f81b..000b1e518d 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -7719,11 +7719,12 @@ static const TranslatorOps ppc_tr_ops = {
     .disas_log          = ppc_tr_disas_log,
 };
 
-void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns)
+void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns,
+                           target_ulong pc, void *host_pc)
 {
     DisasContext ctx;
 
-    translator_loop(&ppc_tr_ops, &ctx.base, cs, tb, max_insns);
+    translator_loop(cs, tb, max_insns, pc, host_pc, &ppc_tr_ops, &ctx.base);
 }
 
 void restore_state_to_opc(CPUPPCState *env, TranslationBlock *tb,
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 63b04e8a94..38666ddc91 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -1196,11 +1196,12 @@ static const TranslatorOps riscv_tr_ops = {
     .disas_log          = riscv_tr_disas_log,
 };
 
-void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns)
+void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns,
+                           target_ulong pc, void *host_pc)
 {
     DisasContext ctx;
 
-    translator_loop(&riscv_tr_ops, &ctx.base, cs, tb, max_insns);
+    translator_loop(cs, tb, max_insns, pc, host_pc, &riscv_tr_ops, &ctx.base);
 }
 
 void riscv_translate_init(void)
diff --git a/target/rx/translate.c b/target/rx/translate.c
index 62aee66937..ea5653bc95 100644
--- a/target/rx/translate.c
+++ b/target/rx/translate.c
@@ -2363,11 +2363,12 @@ static const TranslatorOps rx_tr_ops = {
     .disas_log          = rx_tr_disas_log,
 };
 
-void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns)
+void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns,
+                           target_ulong pc, void *host_pc)
 {
     DisasContext dc;
 
-    translator_loop(&rx_tr_ops, &dc.base, cs, tb, max_insns);
+    translator_loop(cs, tb, max_insns, pc, host_pc, &rx_tr_ops, &dc.base);
 }
 
 void restore_state_to_opc(CPURXState *env, TranslationBlock *tb,
diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index e2ee005671..d4c0b9b3a2 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -6676,11 +6676,12 @@ static const TranslatorOps s390x_tr_ops = {
     .disas_log          = s390x_tr_disas_log,
 };
 
-void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns)
+void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns,
+                           target_ulong pc, void *host_pc)
 {
     DisasContext dc;
 
-    translator_loop(&s390x_tr_ops, &dc.base, cs, tb, max_insns);
+    translator_loop(cs, tb, max_insns, pc, host_pc, &s390x_tr_ops, &dc.base);
 }
 
 void restore_state_to_opc(CPUS390XState *env, TranslationBlock *tb,
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index f1b190e7cf..01056571c3 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -2368,11 +2368,12 @@ static const TranslatorOps sh4_tr_ops = {
     .disas_log          = sh4_tr_disas_log,
 };
 
-void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns)
+void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns,
+                           target_ulong pc, void *host_pc)
 {
     DisasContext ctx;
 
-    translator_loop(&sh4_tr_ops, &ctx.base, cs, tb, max_insns);
+    translator_loop(cs, tb, max_insns, pc, host_pc, &sh4_tr_ops, &ctx.base);
 }
 
 void restore_state_to_opc(CPUSH4State *env, TranslationBlock *tb,
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 2e28222d31..2cbbe2396a 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -5917,11 +5917,12 @@ static const TranslatorOps sparc_tr_ops = {
     .disas_log          = sparc_tr_disas_log,
 };
 
-void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns)
+void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns,
+                           target_ulong pc, void *host_pc)
 {
     DisasContext dc = {};
 
-    translator_loop(&sparc_tr_ops, &dc.base, cs, tb, max_insns);
+    translator_loop(cs, tb, max_insns, pc, host_pc, &sparc_tr_ops, &dc.base);
 }
 
 void sparc_tcg_init(void)
diff --git a/target/tricore/translate.c b/target/tricore/translate.c
index d170500fa5..a0558ead71 100644
--- a/target/tricore/translate.c
+++ b/target/tricore/translate.c
@@ -8878,10 +8878,12 @@ static const TranslatorOps tricore_tr_ops = {
 };
 
 
-void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns)
+void gen_intermediate_code(CPUState *cs, TranslationBlock *tb, int max_insns,
+                           target_ulong pc, void *host_pc)
 {
     DisasContext ctx;
-    translator_loop(&tricore_tr_ops, &ctx.base, cs, tb, max_insns);
+    translator_loop(cs, tb, max_insns, pc, host_pc,
+                    &tricore_tr_ops, &ctx.base);
 }
 
 void
diff --git a/target/xtensa/translate.c b/target/xtensa/translate.c
index 70e11eeb45..8b864ef925 100644
--- a/target/xtensa/translate.c
+++ b/target/xtensa/translate.c
@@ -1279,10 +1279,12 @@ static const TranslatorOps xtensa_translator_ops = {
     .disas_log          = xtensa_tr_disas_log,
 };
 
-void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb, int max_insns)
+void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb, int max_insns,
+                           target_ulong pc, void *host_pc)
 {
     DisasContext dc = {};
-    translator_loop(&xtensa_translator_ops, &dc.base, cpu, tb, max_insns);
+    translator_loop(cpu, tb, max_insns, pc, host_pc,
+                    &xtensa_translator_ops, &dc.base);
 }
 
 void xtensa_cpu_dump_state(CPUState *cs, FILE *f, int flags)
-- 
2.34.1




* [PATCH for-7.2 20/21] accel/tcg: Add fast path for translator_ld*
  2022-08-12 18:07 [PATCH for-7.2 00/21] accel/tcg: minimize tlb lookups during translate + user-only PROT_EXEC fixes Richard Henderson
                   ` (18 preceding siblings ...)
  2022-08-12 18:08 ` [PATCH for-7.2 19/21] accel/tcg: Add pc and host_pc params to gen_intermediate_code Richard Henderson
@ 2022-08-12 18:08 ` Richard Henderson
  2022-08-12 18:08 ` [PATCH for-7.2 21/21] accel/tcg: Use DisasContextBase in plugin_gen_tb_start Richard Henderson
  2022-08-16 23:12 ` [PATCH for-7.2 00/21] accel/tcg: minimize tlb lookups during translate + user-only PROT_EXEC fixes Ilya Leoshkevich
  21 siblings, 0 replies; 32+ messages in thread
From: Richard Henderson @ 2022-08-12 18:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent, iii, alex.bennee

Cache the translation from guest to host address, so we may
use direct loads when we hit on the primary translation page.

Look up the second translation page only once, during translation.
This obviates another lookup of the second page within tb_gen_code
after translation.

This also fixes a bug: plugin_insn_append should be passed the bytes
in their original memory order, not bswapped piece by piece.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
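A minimal sketch of the plugin byte-order point above; it is not taken
from the patch.  record_bytes() is a hypothetical stand-in for
plugin_insn_append(), and struct insn_bytes and swap16() are likewise
made up for illustration.  Recording the value after the swap reverses
each 16-bit piece relative to guest memory, while recording straight
from memory keeps the original order:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the plugin's per-insn byte buffer. */
struct insn_bytes {
    uint8_t data[8];
    size_t len;
};

/* Hypothetical stand-in for plugin_insn_append(). */
static void record_bytes(struct insn_bytes *ib, const void *from, size_t size)
{
    memcpy(ib->data + ib->len, from, size);
    ib->len += size;
}

static uint16_t swap16(uint16_t x)
{
    return (uint16_t)((x >> 8) | (x << 8));
}

/* Old shape: swap first, then record the swapped temporary. */
static uint16_t load_record_swapped(struct insn_bytes *ib, const uint8_t *mem)
{
    uint16_t ret;

    memcpy(&ret, mem, sizeof(ret));        /* host-order load */
    ret = swap16(ret);
    record_bytes(ib, &ret, sizeof(ret));   /* bytes reversed vs. memory */
    return ret;
}

/* New shape: record from memory, swap only the value handed back. */
static uint16_t load_record_direct(struct insn_bytes *ib, const uint8_t *mem)
{
    uint16_t ret;

    record_bytes(ib, mem, sizeof(ret));    /* bytes in memory order */
    memcpy(&ret, mem, sizeof(ret));
    return swap16(ret);
}
```

Both helpers return the same value to the translator; only the bytes
seen by the plugin differ.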
 include/exec/translator.h |  52 ++++++++++++------
 accel/tcg/translate-all.c |  22 +++-----
 accel/tcg/translator.c    | 111 +++++++++++++++++++++++++++++++-------
 3 files changed, 135 insertions(+), 50 deletions(-)

diff --git a/include/exec/translator.h b/include/exec/translator.h
index 69db0f5c21..177a001698 100644
--- a/include/exec/translator.h
+++ b/include/exec/translator.h
@@ -81,13 +81,14 @@ typedef enum DisasJumpType {
  * Architecture-agnostic disassembly context.
  */
 typedef struct DisasContextBase {
-    const TranslationBlock *tb;
+    TranslationBlock *tb;
     target_ulong pc_first;
     target_ulong pc_next;
     DisasJumpType is_jmp;
     int num_insns;
     int max_insns;
     bool singlestep_enabled;
+    void *host_addr[2];
 #ifdef CONFIG_USER_ONLY
     /*
      * Guest address of the last byte of the last protected page.
@@ -183,24 +184,43 @@ bool translator_use_goto_tb(DisasContextBase *db, target_ulong dest);
  * the relevant information at translation time.
  */
 
-#define GEN_TRANSLATOR_LD(fullname, type, load_fn, swap_fn)             \
-    type fullname ## _swap(CPUArchState *env, DisasContextBase *dcbase, \
-                           abi_ptr pc, bool do_swap);                   \
-    static inline type fullname(CPUArchState *env,                      \
-                                DisasContextBase *dcbase, abi_ptr pc)   \
-    {                                                                   \
-        return fullname ## _swap(env, dcbase, pc, false);               \
+uint8_t translator_ldub(CPUArchState *env, DisasContextBase *db, abi_ptr pc);
+uint16_t translator_lduw(CPUArchState *env, DisasContextBase *db, abi_ptr pc);
+uint32_t translator_ldl(CPUArchState *env, DisasContextBase *db, abi_ptr pc);
+uint64_t translator_ldq(CPUArchState *env, DisasContextBase *db, abi_ptr pc);
+
+static inline uint16_t
+translator_lduw_swap(CPUArchState *env, DisasContextBase *db,
+                     abi_ptr pc, bool do_swap)
+{
+    uint16_t ret = translator_lduw(env, db, pc);
+    if (do_swap) {
+        ret = bswap16(ret);
     }
+    return ret;
+}
 
-#define FOR_EACH_TRANSLATOR_LD(F)                                       \
-    F(translator_ldub, uint8_t, cpu_ldub_code, /* no swap */)           \
-    F(translator_lduw, uint16_t, cpu_lduw_code, bswap16)                \
-    F(translator_ldl, uint32_t, cpu_ldl_code, bswap32)                  \
-    F(translator_ldq, uint64_t, cpu_ldq_code, bswap64)
+static inline uint32_t
+translator_ldl_swap(CPUArchState *env, DisasContextBase *db,
+                    abi_ptr pc, bool do_swap)
+{
+    uint32_t ret = translator_ldl(env, db, pc);
+    if (do_swap) {
+        ret = bswap32(ret);
+    }
+    return ret;
+}
 
-FOR_EACH_TRANSLATOR_LD(GEN_TRANSLATOR_LD)
-
-#undef GEN_TRANSLATOR_LD
+static inline uint64_t
+translator_ldq_swap(CPUArchState *env, DisasContextBase *db,
+                    abi_ptr pc, bool do_swap)
+{
+    uint64_t ret = translator_ldq(env, db, pc);
+    if (do_swap) {
+        ret = bswap64(ret);
+    }
+    return ret;
+}
 
 /*
  * Return whether addr is on the same page as where disassembly started.
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index d52097ab2d..299b068f9c 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1333,8 +1333,6 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
 {
     CPUArchState *env = cpu->env_ptr;
     TranslationBlock *tb, *existing_tb;
-    tb_page_addr_t phys_page2;
-    target_ulong virt_page2;
     tcg_insn_unit *gen_code_buf;
     int gen_code_size, search_size, max_insns;
 #ifdef CONFIG_PROFILER
@@ -1374,6 +1372,8 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
     tb->flags = flags;
     tb->cflags = cflags;
     tb->trace_vcpu_dstate = *cpu->trace_dstate;
+    tb->page_addr[0] = phys_pc;
+    tb->page_addr[1] = -1;
     tcg_ctx->tb_cflags = cflags;
  tb_overflow:
 
@@ -1567,13 +1567,11 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
     }
 
     /*
-     * If the TB is not associated with a physical RAM page then
-     * it must be a temporary one-insn TB, and we have nothing to do
-     * except fill in the page_addr[] fields. Return early before
-     * attempting to link to other TBs or add to the lookup table.
+     * If the TB is not associated with a physical RAM page then it must be
+     * a temporary one-insn TB, and we have nothing left to do. Return early
+     * before attempting to link to other TBs or add to the lookup table.
      */
-    if (phys_pc == -1) {
-        tb->page_addr[0] = tb->page_addr[1] = -1;
+    if (tb->page_addr[0] == -1) {
         return tb;
     }
 
@@ -1584,17 +1582,11 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
      */
     tcg_tb_insert(tb);
 
-    /* check next page if needed */
-    virt_page2 = (pc + tb->size - 1) & TARGET_PAGE_MASK;
-    phys_page2 = -1;
-    if ((pc & TARGET_PAGE_MASK) != virt_page2) {
-        phys_page2 = get_page_addr_code(env, virt_page2);
-    }
     /*
      * No explicit memory barrier is required -- tb_link_page() makes the
      * TB visible in a consistent state.
      */
-    existing_tb = tb_link_page(tb, phys_pc, phys_page2);
+    existing_tb = tb_link_page(tb, tb->page_addr[0], tb->page_addr[1]);
     /* if the TB already exists, discard what we just translated */
     if (unlikely(existing_tb != tb)) {
         uintptr_t orig_aligned = (uintptr_t)gen_code_buf;
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 3eef30d93a..a693c17259 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -66,6 +66,8 @@ void translator_loop(CPUState *cpu, TranslationBlock *tb, int max_insns,
     db->num_insns = 0;
     db->max_insns = max_insns;
     db->singlestep_enabled = cflags & CF_SINGLE_STEP;
+    db->host_addr[0] = host_pc;
+    db->host_addr[1] = NULL;
     translator_page_protect(db, db->pc_next);
 
     ops->init_disas_context(db, cpu);
@@ -151,31 +153,102 @@ void translator_loop(CPUState *cpu, TranslationBlock *tb, int max_insns,
 #endif
 }
 
-static inline void translator_maybe_page_protect(DisasContextBase *dcbase,
-                                                 target_ulong pc, size_t len)
+static void *translator_access(CPUArchState *env, DisasContextBase *db,
+                               target_ulong pc, size_t len)
 {
+    void *host;
+    target_ulong base;
+    TranslationBlock *tb;
+
 #ifdef CONFIG_USER_ONLY
     target_ulong end = pc + len - 1;
-
-    if (end > dcbase->page_protect_end) {
-        translator_page_protect(dcbase, end);
+    if (end > db->page_protect_end) {
+        translator_page_protect(db, end);
     }
 #endif
-}
 
-#define GEN_TRANSLATOR_LD(fullname, type, load_fn, swap_fn)             \
-    type fullname ## _swap(CPUArchState *env, DisasContextBase *dcbase, \
-                           abi_ptr pc, bool do_swap)                    \
-    {                                                                   \
-        translator_maybe_page_protect(dcbase, pc, sizeof(type));        \
-        type ret = load_fn(env, pc);                                    \
-        if (do_swap) {                                                  \
-            ret = swap_fn(ret);                                         \
-        }                                                               \
-        plugin_insn_append(pc, &ret, sizeof(ret));                      \
-        return ret;                                                     \
+    tb = db->tb;
+    if (unlikely(tb->page_addr[0] == -1)) {
+        /* Use slow path if first page is MMIO. */
+        return NULL;
+    } else if (likely(is_same_page(db, pc + len - 1))) {
+        host = db->host_addr[0];
+        base = db->pc_first;
+    } else if (is_same_page(db, pc)) {
+        /* Use slow path when crossing pages. */
+        return NULL;
+    } else {
+        host = db->host_addr[1];
+        base = TARGET_PAGE_ALIGN(db->pc_first);
+        if (host == NULL) {
+            tb->page_addr[1] =
+                get_page_addr_code_hostp(env, base, false,
+                                         &db->host_addr[1]);
+            /* We cannot handle MMIO as second page. */
+            assert(tb->page_addr[1] != -1);
+            host = db->host_addr[1];
+        }
     }
 
-FOR_EACH_TRANSLATOR_LD(GEN_TRANSLATOR_LD)
+    tcg_debug_assert(pc >= base);
+    return host + (pc - base);
+}
 
-#undef GEN_TRANSLATOR_LD
+uint8_t translator_ldub(CPUArchState *env, DisasContextBase *db, abi_ptr pc)
+{
+    uint8_t ret;
+    void *p = translator_access(env, db, pc, sizeof(ret));
+
+    if (p) {
+        plugin_insn_append(pc, p, sizeof(ret));
+        return ldub_p(p);
+    }
+    ret = cpu_ldub_code(env, pc);
+    plugin_insn_append(pc, &ret, sizeof(ret));
+    return ret;
+}
+
+uint16_t translator_lduw(CPUArchState *env, DisasContextBase *db, abi_ptr pc)
+{
+    uint16_t ret, plug;
+    void *p = translator_access(env, db, pc, sizeof(ret));
+
+    if (p) {
+        plugin_insn_append(pc, p, sizeof(ret));
+        return lduw_p(p);
+    }
+    ret = cpu_lduw_code(env, pc);
+    plug = tswap16(ret);
+    plugin_insn_append(pc, &plug, sizeof(ret));
+    return ret;
+}
+
+uint32_t translator_ldl(CPUArchState *env, DisasContextBase *db, abi_ptr pc)
+{
+    uint32_t ret, plug;
+    void *p = translator_access(env, db, pc, sizeof(ret));
+
+    if (p) {
+        plugin_insn_append(pc, p, sizeof(ret));
+        return ldl_p(p);
+    }
+    ret = cpu_ldl_code(env, pc);
+    plug = tswap32(ret);
+    plugin_insn_append(pc, &plug, sizeof(ret));
+    return ret;
+}
+
+uint64_t translator_ldq(CPUArchState *env, DisasContextBase *db, abi_ptr pc)
+{
+    uint64_t ret, plug;
+    void *p = translator_access(env, db, pc, sizeof(ret));
+
+    if (p) {
+        plugin_insn_append(pc, p, sizeof(ret));
+        return ldq_p(p);
+    }
+    ret = cpu_ldq_code(env, pc);
+    plug = tswap64(ret);
+    plugin_insn_append(pc, &plug, sizeof(ret));
+    return ret;
+}
-- 
2.34.1




* [PATCH for-7.2 21/21] accel/tcg: Use DisasContextBase in plugin_gen_tb_start
  2022-08-12 18:07 [PATCH for-7.2 00/21] accel/tcg: minimize tlb lookups during translate + user-only PROT_EXEC fixes Richard Henderson
                   ` (19 preceding siblings ...)
  2022-08-12 18:08 ` [PATCH for-7.2 20/21] accel/tcg: Add fast path for translator_ld* Richard Henderson
@ 2022-08-12 18:08 ` Richard Henderson
  2022-08-16 23:12 ` [PATCH for-7.2 00/21] accel/tcg: minimize tlb lookups during translate + user-only PROT_EXEC fixes Ilya Leoshkevich
  21 siblings, 0 replies; 32+ messages in thread
From: Richard Henderson @ 2022-08-12 18:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: laurent, iii, alex.bennee

Use the pc coming from db->pc_first rather than the TB.

Use the cached host_addr rather than re-computing for the
first page.  We still need a separate lookup for the second
page because it won't be computed for DisasContextBase until
the translator actually performs a read from the page.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
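A rough sketch of the lookup order described above, under simplified
assumptions: one fixed 4 KiB page size, a toy guest RAM array, and
hypothetical names (struct tb_pages, slow_lookup, insn_haddr) that do
not exist in QEMU.  The first page reuses the cached host address; the
second page is resolved once, on first use.  The next-page base is
computed here as page start plus page size, whereas the patch uses
TARGET_PAGE_ALIGN(db->pc_first).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_PAGE_SIZE  4096u
#define SK_PAGE_MASK  (~(uint64_t)(SK_PAGE_SIZE - 1))
#define SK_GUEST_BASE 0x1000u

/* Toy guest RAM: guest addresses 0x1000..0x2fff map onto this array. */
static uint8_t fake_ram[2 * SK_PAGE_SIZE];

/* Hypothetical slow path standing in for get_page_addr_code_hostp(). */
static uint8_t *slow_lookup(uint64_t guest_addr)
{
    return fake_ram + (guest_addr - SK_GUEST_BASE);
}

/* Hypothetical, simplified mirror of the vaddr/haddr bookkeeping. */
struct tb_pages {
    uint64_t vaddr;    /* guest address of the first insn */
    uint8_t *haddr1;   /* cached host address for vaddr, NULL if not RAM */
    uint64_t vaddr2;   /* guest base of second page, UINT64_MAX if unset */
    uint8_t *haddr2;   /* host address of second page */
    int lookups;       /* slow second-page lookups performed so far */
};

static uint8_t *insn_haddr(struct tb_pages *p, uint64_t pc)
{
    if (p->haddr1 == NULL) {
        return NULL;                          /* not backed by RAM */
    }
    if (((pc ^ p->vaddr) & SK_PAGE_MASK) == 0) {
        return p->haddr1 + (pc - p->vaddr);   /* same page: cached */
    }
    if (p->vaddr2 == UINT64_MAX) {
        /* First touch of the second page: resolve it exactly once. */
        p->vaddr2 = (p->vaddr & SK_PAGE_MASK) + SK_PAGE_SIZE;
        p->haddr2 = slow_lookup(p->vaddr2);
        p->lookups++;
    }
    return p->haddr2 + (pc - p->vaddr2);
}
```

Calling insn_haddr() repeatedly for addresses on the second page only
increments lookups once.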
 include/exec/plugin-gen.h |  7 ++++---
 accel/tcg/plugin-gen.c    | 23 ++++++++++++-----------
 accel/tcg/translator.c    |  2 +-
 3 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/include/exec/plugin-gen.h b/include/exec/plugin-gen.h
index f92f169739..5004728c61 100644
--- a/include/exec/plugin-gen.h
+++ b/include/exec/plugin-gen.h
@@ -19,7 +19,8 @@ struct DisasContextBase;
 
 #ifdef CONFIG_PLUGIN
 
-bool plugin_gen_tb_start(CPUState *cpu, const TranslationBlock *tb, bool supress);
+bool plugin_gen_tb_start(CPUState *cpu, const struct DisasContextBase *db,
+                         bool supress);
 void plugin_gen_tb_end(CPUState *cpu);
 void plugin_gen_insn_start(CPUState *cpu, const struct DisasContextBase *db);
 void plugin_gen_insn_end(void);
@@ -48,8 +49,8 @@ static inline void plugin_insn_append(abi_ptr pc, const void *from, size_t size)
 
 #else /* !CONFIG_PLUGIN */
 
-static inline
-bool plugin_gen_tb_start(CPUState *cpu, const TranslationBlock *tb, bool supress)
+static inline bool
+plugin_gen_tb_start(CPUState *cpu, const struct DisasContextBase *db, bool sup)
 {
     return false;
 }
diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index 8377c15383..0f080386af 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -852,7 +852,8 @@ static void plugin_gen_inject(const struct qemu_plugin_tb *plugin_tb)
     pr_ops();
 }
 
-bool plugin_gen_tb_start(CPUState *cpu, const TranslationBlock *tb, bool mem_only)
+bool plugin_gen_tb_start(CPUState *cpu, const DisasContextBase *db,
+                         bool mem_only)
 {
     bool ret = false;
 
@@ -870,9 +871,9 @@ bool plugin_gen_tb_start(CPUState *cpu, const TranslationBlock *tb, bool mem_onl
 
         ret = true;
 
-        ptb->vaddr = tb->pc;
+        ptb->vaddr = db->pc_first;
         ptb->vaddr2 = -1;
-        get_page_addr_code_hostp(cpu->env_ptr, tb->pc, true, &ptb->haddr1);
+        ptb->haddr1 = db->host_addr[0];
         ptb->haddr2 = NULL;
         ptb->mem_only = mem_only;
 
@@ -898,16 +899,16 @@ void plugin_gen_insn_start(CPUState *cpu, const DisasContextBase *db)
      * Note that we skip this when haddr1 == NULL, e.g. when we're
      * fetching instructions from a region not backed by RAM.
      */
-    if (likely(ptb->haddr1 != NULL && ptb->vaddr2 == -1) &&
-        unlikely((db->pc_next & TARGET_PAGE_MASK) !=
-                 (db->pc_first & TARGET_PAGE_MASK))) {
-        get_page_addr_code_hostp(cpu->env_ptr, db->pc_next,
-                                 true, &ptb->haddr2);
-        ptb->vaddr2 = db->pc_next;
-    }
-    if (likely(ptb->vaddr2 == -1)) {
+    if (ptb->haddr1 == NULL) {
+        pinsn->haddr = NULL;
+    } else if (is_same_page(db, db->pc_next)) {
         pinsn->haddr = ptb->haddr1 + pinsn->vaddr - ptb->vaddr;
     } else {
+        if (ptb->vaddr2 == -1) {
+            ptb->vaddr2 = TARGET_PAGE_ALIGN(db->pc_first);
+            get_page_addr_code_hostp(cpu->env_ptr, ptb->vaddr2,
+                                     true, &ptb->haddr2);
+        }
         pinsn->haddr = ptb->haddr2 + pinsn->vaddr - ptb->vaddr2;
     }
 }
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index a693c17259..3e6fab482e 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -81,7 +81,7 @@ void translator_loop(CPUState *cpu, TranslationBlock *tb, int max_insns,
     ops->tb_start(db, cpu);
     tcg_debug_assert(db->is_jmp == DISAS_NEXT);  /* no early exit */
 
-    plugin_enabled = plugin_gen_tb_start(cpu, tb, cflags & CF_MEMI_ONLY);
+    plugin_enabled = plugin_gen_tb_start(cpu, db, cflags & CF_MEMI_ONLY);
 
     while (true) {
         db->num_insns++;
-- 
2.34.1




* Re: [PATCH for-7.2 00/21] accel/tcg: minimize tlb lookups during translate + user-only PROT_EXEC fixes
  2022-08-12 18:07 [PATCH for-7.2 00/21] accel/tcg: minimize tlb lookups during translate + user-only PROT_EXEC fixes Richard Henderson
                   ` (20 preceding siblings ...)
  2022-08-12 18:08 ` [PATCH for-7.2 21/21] accel/tcg: Use DisasContextBase in plugin_gen_tb_start Richard Henderson
@ 2022-08-16 23:12 ` Ilya Leoshkevich
  21 siblings, 0 replies; 32+ messages in thread
From: Ilya Leoshkevich @ 2022-08-16 23:12 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: laurent, alex.bennee

On Fri, 2022-08-12 at 11:07 -0700, Richard Henderson wrote:
> This is part of a larger body of work, but in the process of
> reorganizing I was reminded that PROT_EXEC wasn't being enforced
> properly for user-only.  As this has come up in the context of
> some of Ilya's patches, I thought I'd go ahead and post this part.
> 
> 
> r~
> 
> 
> Ilya Leoshkevich (1):
>   accel/tcg: Introduce is_same_page()
> 
> Richard Henderson (20):
>   linux-user/arm: Mark the commpage executable
>   linux-user/hppa: Allocate page zero as a commpage
>   linux-user/x86_64: Allocate vsyscall page as a commpage
>   linux-user: Honor PT_GNU_STACK
>   tests/tcg/i386: Move smc_code2 to an executable section
>   accel/tcg: Remove PageDesc code_bitmap
>   accel/tcg: Use bool for page_find_alloc
>   accel/tcg: Merge tb_htable_lookup into caller
>   accel/tcg: Move qemu_ram_addr_from_host_nofail to physmem.c
>   accel/tcg: Properly implement get_page_addr_code for user-only
>   accel/tcg: Use probe_access_internal for softmmu
>     get_page_addr_code_hostp
>   accel/tcg: Add nofault parameter to get_page_addr_code_hostp
>   accel/tcg: Unlock mmap_lock after longjmp
>   accel/tcg: Hoist get_page_addr_code out of tb_lookup
>   accel/tcg: Hoist get_page_addr_code out of tb_gen_code
>   accel/tcg: Raise PROT_EXEC exception early
>   accel/tcg: Remove translator_ldsw
>   accel/tcg: Add pc and host_pc params to gen_intermediate_code
>   accel/tcg: Add fast path for translator_ld*
>   accel/tcg: Use DisasContextBase in plugin_gen_tb_start
> 
>  accel/tcg/internal.h          |   7 +-
>  include/elf.h                 |   1 +
>  include/exec/cpu-common.h     |   1 +
>  include/exec/exec-all.h       |  87 +++++-----------
>  include/exec/plugin-gen.h     |   7 +-
>  include/exec/translator.h     |  85 ++++++++++++----
>  linux-user/arm/target_cpu.h   |   4 +-
>  linux-user/qemu.h             |   1 +
>  accel/tcg/cpu-exec.c          | 184 ++++++++++++++++++----------------
>  accel/tcg/cputlb.c            |  93 +++++------------
>  accel/tcg/plugin-gen.c        |  23 +++--
>  accel/tcg/translate-all.c     | 120 ++++------------------
>  accel/tcg/translator.c        | 122 +++++++++++++++++-----
>  accel/tcg/user-exec.c         |  15 +++
>  linux-user/elfload.c          |  80 ++++++++++++++-
>  softmmu/physmem.c             |  12 +++
>  target/alpha/translate.c      |   5 +-
>  target/arm/translate.c        |   5 +-
>  target/avr/translate.c        |   5 +-
>  target/cris/translate.c       |   5 +-
>  target/hexagon/translate.c    |   6 +-
>  target/hppa/translate.c       |   5 +-
>  target/i386/tcg/translate.c   |   7 +-
>  target/loongarch/translate.c  |   6 +-
>  target/m68k/translate.c       |   5 +-
>  target/microblaze/translate.c |   5 +-
>  target/mips/tcg/translate.c   |   5 +-
>  target/nios2/translate.c      |   5 +-
>  target/openrisc/translate.c   |   6 +-
>  target/ppc/translate.c        |   5 +-
>  target/riscv/translate.c      |   5 +-
>  target/rx/translate.c         |   5 +-
>  target/s390x/tcg/translate.c  |   5 +-
>  target/sh4/translate.c        |   5 +-
>  target/sparc/translate.c      |   5 +-
>  target/tricore/translate.c    |   6 +-
>  target/xtensa/translate.c     |   6 +-
>  tests/tcg/i386/test-i386.c    |   2 +-
>  38 files changed, 532 insertions(+), 424 deletions(-)
> 

Hi,

I need the following fixup to make my noexec tests pass with v1:

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 6a3ca8224f..cc6a43a3bc 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -386,6 +386,10 @@ const void *HELPER(lookup_tb_ptr)(CPUArchState *env)
         return tcg_code_gen_epilogue;
     }
 
+    if (tb->page_addr[1] != -1) {
+        get_page_addr_code_hostp(env, tb->page_addr[1], false, NULL);
+    }
+
     log_cpu_exec(pc, cpu, tb);
 
     return tb->tc.ptr;
@@ -997,6 +1001,9 @@ int cpu_exec(CPUState *cpu)
                  * for the fast lookup
                  */
                 qatomic_set(&cpu->tb_jmp_cache[tb_jmp_cache_hash_func(pc)], tb);
+            } else if (tb->page_addr[1] != -1) {
+                get_page_addr_code_hostp(cpu->env_ptr, tb->page_addr[1], false,
+                                         NULL);
             }
             mmap_unlock();

With v2, the exception after mprotect(PROT_NONE) is once again not
raised. I have not figured out what the problem is yet.
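For readers without the noexec tests at hand, here is a minimal sketch
of the simplest form of such a check, assuming a Linux host where an
instruction fetch from a PROT_NONE page raises SIGSEGV.  It is not the
test from this thread: the real case also executes the page first, so
that a translated TB already exists when the protection changes.  The
function name is made up.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_env;

static void on_segv(int sig)
{
    (void)sig;
    siglongjmp(fault_env, 1);   /* unwind out of the faulting call */
}

/*
 * Returns 1 if calling into a PROT_NONE page raised SIGSEGV,
 * 0 if the call returned normally, -1 on setup failure.
 */
static int call_into_prot_none_faults(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    struct sigaction sa, old;
    void (*volatile fn)(void);
    void *page;
    int result;

    page = mmap(NULL, len, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) {
        return -1;
    }
    if (mprotect(page, len, PROT_NONE) != 0) {
        munmap(page, len);
        return -1;
    }

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0) {
        munmap(page, len);
        return -1;
    }

    fn = (void (*)(void))page;
    if (sigsetjmp(fault_env, 1) == 0) {
        fn();                   /* expected to fault on the first fetch */
        result = 0;
    } else {
        result = 1;             /* arrived here via the SIGSEGV handler */
    }

    sigaction(SIGSEGV, &old, NULL);
    munmap(page, len);
    return result;
}
```

Under linux-user, a return value of 0 from a check of this kind would
mean the guest ran code from a page it is no longer allowed to execute.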

Also, wasmtime tests trigger this assertion:

static void pgb_dynamic(const char *image_name, long align)
{
    /*
     * The executable is dynamic and does not require a fixed address.
     * All we need is a commpage that satisfies align.
     * If we do not need a commpage, leave guest_base == 0.
     */
    if (HI_COMMPAGE) {
        uintptr_t addr, commpage;

        /* 64-bit hosts should have used reserved_va. */
        assert(sizeof(uintptr_t) == 4);
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I have not figured out why this is happening either.

Best regards,
Ilya



* Re: [PATCH for-7.2 14/21] accel/tcg: Hoist get_page_addr_code out of tb_lookup
  2022-08-12 18:07 ` [PATCH for-7.2 14/21] accel/tcg: Hoist get_page_addr_code out of tb_lookup Richard Henderson
@ 2022-08-16 23:43   ` Ilya Leoshkevich
  2022-08-17  1:42     ` Richard Henderson
  0 siblings, 1 reply; 32+ messages in thread
From: Ilya Leoshkevich @ 2022-08-16 23:43 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: laurent, alex.bennee

On Fri, 2022-08-12 at 11:07 -0700, Richard Henderson wrote:
> We will want to re-use the result of get_page_addr_code
> beyond the scope of tb_lookup.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  accel/tcg/cpu-exec.c | 34 ++++++++++++++++++++++++----------
>  1 file changed, 24 insertions(+), 10 deletions(-)
> 
> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> index a9b7053274..889355b341 100644
> --- a/accel/tcg/cpu-exec.c
> +++ b/accel/tcg/cpu-exec.c
> @@ -209,13 +209,12 @@ static bool tb_lookup_cmp(const void *p, const
> void *d)
>  }
>  
>  /* Might cause an exception, so have a longjmp destination ready */
> -static TranslationBlock *tb_lookup(CPUState *cpu, target_ulong pc,
> -                                   target_ulong cs_base,
> +static TranslationBlock *tb_lookup(CPUState *cpu, tb_page_addr_t
> phys_pc,
> +                                   target_ulong pc, target_ulong
> cs_base,
>                                     uint32_t flags, uint32_t cflags)
>  {
>      CPUArchState *env = cpu->env_ptr;
>      TranslationBlock *tb;
> -    tb_page_addr_t phys_pc;
>      struct tb_desc desc;
>      uint32_t jmp_hash, tb_hash;
>  
> @@ -240,11 +239,8 @@ static TranslationBlock *tb_lookup(CPUState
> *cpu, target_ulong pc,
>      desc.cflags = cflags;
>      desc.trace_vcpu_dstate = *cpu->trace_dstate;
>      desc.pc = pc;
> -    phys_pc = get_page_addr_code(desc.env, pc);
> -    if (phys_pc == -1) {
> -        return NULL;
> -    }
>      desc.phys_page1 = phys_pc & TARGET_PAGE_MASK;
> +
>      tb_hash = tb_hash_func(phys_pc, pc, flags, cflags, *cpu->trace_dstate);
>      tb = qht_lookup_custom(&tb_ctx.htable, &desc, tb_hash, tb_lookup_cmp);
>      if (tb == NULL) {
> @@ -371,6 +367,7 @@ const void *HELPER(lookup_tb_ptr)(CPUArchState
> *env)
>      TranslationBlock *tb;
>      target_ulong cs_base, pc;
>      uint32_t flags, cflags;
> +    tb_page_addr_t phys_pc;
>  
>      cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
>  
> @@ -379,7 +376,12 @@ const void *HELPER(lookup_tb_ptr)(CPUArchState
> *env)
>          cpu_loop_exit(cpu);
>      }
>  
> -    tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
> +    phys_pc = get_page_addr_code(env, pc);
> +    if (phys_pc == -1) {
> +        return tcg_code_gen_epilogue;
> +    }
> +
> +    tb = tb_lookup(cpu, phys_pc, pc, cs_base, flags, cflags);
>      if (tb == NULL) {
>          return tcg_code_gen_epilogue;
>      }
> @@ -482,6 +484,7 @@ void cpu_exec_step_atomic(CPUState *cpu)
>      TranslationBlock *tb;
>      target_ulong cs_base, pc;
>      uint32_t flags, cflags;
> +    tb_page_addr_t phys_pc;
>      int tb_exit;
>  
>      if (sigsetjmp(cpu->jmp_env, 0) == 0) {
> @@ -504,7 +507,12 @@ void cpu_exec_step_atomic(CPUState *cpu)
>           * Any breakpoint for this insn will have been recognized
> earlier.
>           */
>  
> -        tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
> +        phys_pc = get_page_addr_code(env, pc);
> +        if (phys_pc == -1) {
> +            tb = NULL;
> +        } else {
> +            tb = tb_lookup(cpu, phys_pc, pc, cs_base, flags,
> cflags);
> +        }
>          if (tb == NULL) {
>              mmap_lock();
>              tb = tb_gen_code(cpu, pc, cs_base, flags, cflags);
> @@ -949,6 +957,7 @@ int cpu_exec(CPUState *cpu)
>              TranslationBlock *tb;
>              target_ulong cs_base, pc;
>              uint32_t flags, cflags;
> +            tb_page_addr_t phys_pc;
>  
>              cpu_get_tb_cpu_state(cpu->env_ptr, &pc, &cs_base,
> &flags);
>  
> @@ -970,7 +979,12 @@ int cpu_exec(CPUState *cpu)
>                  break;
>              }
>  
> -            tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
> +            phys_pc = get_page_addr_code(cpu->env_ptr, pc);
> +            if (phys_pc == -1) {
> +                tb = NULL;
> +            } else {
> +                tb = tb_lookup(cpu, phys_pc, pc, cs_base, flags,
> cflags);
> +            }
>              if (tb == NULL) {
>                  mmap_lock();
>                  tb = tb_gen_code(cpu, pc, cs_base, flags, cflags);

This patch did not make it into v2, but having get_page_addr_code()
before tb_lookup() in helper_lookup_tb_ptr() helped raise the exception
when trying to execute a no-longer-executable TB.

Was it dropped for performance reasons?



* Re: [PATCH for-7.2 14/21] accel/tcg: Hoist get_page_addr_code out of tb_lookup
  2022-08-16 23:43   ` Ilya Leoshkevich
@ 2022-08-17  1:42     ` Richard Henderson
  2022-08-17 11:08       ` Ilya Leoshkevich
  0 siblings, 1 reply; 32+ messages in thread
From: Richard Henderson @ 2022-08-17  1:42 UTC (permalink / raw)
  To: Ilya Leoshkevich, qemu-devel; +Cc: laurent, alex.bennee

On 8/16/22 18:43, Ilya Leoshkevich wrote:
> On Fri, 2022-08-12 at 11:07 -0700, Richard Henderson wrote:
>> We will want to re-use the result of get_page_addr_code
>> beyond the scope of tb_lookup.
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>>   accel/tcg/cpu-exec.c | 34 ++++++++++++++++++++++++----------
>>   1 file changed, 24 insertions(+), 10 deletions(-)
>>
>> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
>> index a9b7053274..889355b341 100644
>> --- a/accel/tcg/cpu-exec.c
>> +++ b/accel/tcg/cpu-exec.c
>> @@ -209,13 +209,12 @@ static bool tb_lookup_cmp(const void *p, const
>> void *d)
>>   }
>>   
>>   /* Might cause an exception, so have a longjmp destination ready */
>> -static TranslationBlock *tb_lookup(CPUState *cpu, target_ulong pc,
>> -                                   target_ulong cs_base,
>> +static TranslationBlock *tb_lookup(CPUState *cpu, tb_page_addr_t
>> phys_pc,
>> +                                   target_ulong pc, target_ulong
>> cs_base,
>>                                      uint32_t flags, uint32_t cflags)
>>   {
>>       CPUArchState *env = cpu->env_ptr;
>>       TranslationBlock *tb;
>> -    tb_page_addr_t phys_pc;
>>       struct tb_desc desc;
>>       uint32_t jmp_hash, tb_hash;
>>   
>> @@ -240,11 +239,8 @@ static TranslationBlock *tb_lookup(CPUState
>> *cpu, target_ulong pc,
>>       desc.cflags = cflags;
>>       desc.trace_vcpu_dstate = *cpu->trace_dstate;
>>       desc.pc = pc;
>> -    phys_pc = get_page_addr_code(desc.env, pc);
>> -    if (phys_pc == -1) {
>> -        return NULL;
>> -    }
>>       desc.phys_page1 = phys_pc & TARGET_PAGE_MASK;
>> +
>>       tb_hash = tb_hash_func(phys_pc, pc, flags, cflags, *cpu->trace_dstate);
>>       tb = qht_lookup_custom(&tb_ctx.htable, &desc, tb_hash, tb_lookup_cmp);
>>       if (tb == NULL) {
>> @@ -371,6 +367,7 @@ const void *HELPER(lookup_tb_ptr)(CPUArchState
>> *env)
>>       TranslationBlock *tb;
>>       target_ulong cs_base, pc;
>>       uint32_t flags, cflags;
>> +    tb_page_addr_t phys_pc;
>>   
>>       cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
>>   
>> @@ -379,7 +376,12 @@ const void *HELPER(lookup_tb_ptr)(CPUArchState
>> *env)
>>           cpu_loop_exit(cpu);
>>       }
>>   
>> -    tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
>> +    phys_pc = get_page_addr_code(env, pc);
>> +    if (phys_pc == -1) {
>> +        return tcg_code_gen_epilogue;
>> +    }
>> +
>> +    tb = tb_lookup(cpu, phys_pc, pc, cs_base, flags, cflags);
>>       if (tb == NULL) {
>>           return tcg_code_gen_epilogue;
>>       }
>> @@ -482,6 +484,7 @@ void cpu_exec_step_atomic(CPUState *cpu)
>>       TranslationBlock *tb;
>>       target_ulong cs_base, pc;
>>       uint32_t flags, cflags;
>> +    tb_page_addr_t phys_pc;
>>       int tb_exit;
>>   
>>       if (sigsetjmp(cpu->jmp_env, 0) == 0) {
>> @@ -504,7 +507,12 @@ void cpu_exec_step_atomic(CPUState *cpu)
>>            * Any breakpoint for this insn will have been recognized
>> earlier.
>>            */
>>   
>> -        tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
>> +        phys_pc = get_page_addr_code(env, pc);
>> +        if (phys_pc == -1) {
>> +            tb = NULL;
>> +        } else {
>> +            tb = tb_lookup(cpu, phys_pc, pc, cs_base, flags,
>> cflags);
>> +        }
>>           if (tb == NULL) {
>>               mmap_lock();
>>               tb = tb_gen_code(cpu, pc, cs_base, flags, cflags);
>> @@ -949,6 +957,7 @@ int cpu_exec(CPUState *cpu)
>>               TranslationBlock *tb;
>>               target_ulong cs_base, pc;
>>               uint32_t flags, cflags;
>> +            tb_page_addr_t phys_pc;
>>   
>>               cpu_get_tb_cpu_state(cpu->env_ptr, &pc, &cs_base,
>> &flags);
>>   
>> @@ -970,7 +979,12 @@ int cpu_exec(CPUState *cpu)
>>                   break;
>>               }
>>   
>> -            tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
>> +            phys_pc = get_page_addr_code(cpu->env_ptr, pc);
>> +            if (phys_pc == -1) {
>> +                tb = NULL;
>> +            } else {
>> +                tb = tb_lookup(cpu, phys_pc, pc, cs_base, flags,
>> cflags);
>> +            }
>>               if (tb == NULL) {
>>                   mmap_lock();
>>                   tb = tb_gen_code(cpu, pc, cs_base, flags, cflags);
> 
> This patch did not make it into v2, but having get_page_addr_code()
> before tb_lookup() in helper_lookup_tb_ptr() helped raise the exception
> when trying to execute a no-longer-executable TB.
> 
> Was it dropped for performance reasons?

Ah, yes.  I dropped it because I ran into some regression, and started minimizing the 
tree.  Because of the extra lock that needed to be held (next patch, also dropped), I 
couldn't prove this actually helped.

I think the bit that's causing your user-only failure at the moment is the jump cache. 
This patch hoisted the page table check before the jmp_cache.  For system, cputlb.c takes 
care of flushing the jump cache with page table changes; we still don't have anything in 
user-only that takes care of that.
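
To illustrate the jump cache point, a toy model follows.  It is only a
sketch under heavy simplification (one page, one permission bit), and
every name in it is hypothetical rather than QEMU code.  A lookup that
consults the cache first keeps returning the old TB after the page
loses exec permission, unless the permission is checked first or the
cache is flushed on the change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define JC_SIZE 16

enum outcome { RUN_CACHED, TRANSLATE, FAULT };

struct toy_tb {
    uint64_t pc;
};

struct toy_cpu {
    struct toy_tb *jmp_cache[JC_SIZE];  /* per-CPU pc -> TB cache */
    bool page_exec;                     /* toy: one page, one exec bit */
};

static unsigned jc_hash(uint64_t pc)
{
    return (unsigned)((pc >> 2) % JC_SIZE);
}

static bool jc_hit(const struct toy_cpu *cpu, uint64_t pc)
{
    const struct toy_tb *tb = cpu->jmp_cache[jc_hash(pc)];
    return tb != NULL && tb->pc == pc;
}

/* Cache consulted first: a hit never looks at the permission. */
static enum outcome lookup_cache_first(const struct toy_cpu *cpu, uint64_t pc)
{
    if (jc_hit(cpu, pc)) {
        return RUN_CACHED;
    }
    return cpu->page_exec ? TRANSLATE : FAULT;
}

/* Permission checked first, as with the hoisted page lookup. */
static enum outcome lookup_check_first(const struct toy_cpu *cpu, uint64_t pc)
{
    if (!cpu->page_exec) {
        return FAULT;
    }
    return jc_hit(cpu, pc) ? RUN_CACHED : TRANSLATE;
}

/* Toy permission change; 'flush' models flushing the jump cache. */
static void toy_mprotect(struct toy_cpu *cpu, bool exec, bool flush)
{
    cpu->page_exec = exec;
    if (flush) {
        for (size_t i = 0; i < JC_SIZE; i++) {
            cpu->jmp_cache[i] = NULL;
        }
    }
}
```

Either remedy makes the stale entry unreachable; which one is cheaper
in practice is not something this sketch can answer.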


r~




* Re: [PATCH for-7.2 14/21] accel/tcg: Hoist get_page_addr_code out of tb_lookup
  2022-08-17  1:42     ` Richard Henderson
@ 2022-08-17 11:08       ` Ilya Leoshkevich
  2022-08-17 13:15         ` Richard Henderson
  2022-08-17 13:42         ` Richard Henderson
  0 siblings, 2 replies; 32+ messages in thread
From: Ilya Leoshkevich @ 2022-08-17 11:08 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: laurent, alex.bennee

On Tue, 2022-08-16 at 20:42 -0500, Richard Henderson wrote:
> On 8/16/22 18:43, Ilya Leoshkevich wrote:
> > On Fri, 2022-08-12 at 11:07 -0700, Richard Henderson wrote:
> > > We will want to re-use the result of get_page_addr_code
> > > beyond the scope of tb_lookup.
> > > 
> > > Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> > > ---
> > >   accel/tcg/cpu-exec.c | 34 ++++++++++++++++++++++++----------
> > >   1 file changed, 24 insertions(+), 10 deletions(-)
> > > 
> > > diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> > > index a9b7053274..889355b341 100644
> > > --- a/accel/tcg/cpu-exec.c
> > > +++ b/accel/tcg/cpu-exec.c
> > > @@ -209,13 +209,12 @@ static bool tb_lookup_cmp(const void *p, const void *d)
> > >   }
> > >   
> > >   /* Might cause an exception, so have a longjmp destination ready */
> > > -static TranslationBlock *tb_lookup(CPUState *cpu, target_ulong pc,
> > > -                                   target_ulong cs_base,
> > > +static TranslationBlock *tb_lookup(CPUState *cpu, tb_page_addr_t phys_pc,
> > > +                                   target_ulong pc, target_ulong cs_base,
> > >                                      uint32_t flags, uint32_t cflags)
> > >   {
> > >       CPUArchState *env = cpu->env_ptr;
> > >       TranslationBlock *tb;
> > > -    tb_page_addr_t phys_pc;
> > >       struct tb_desc desc;
> > >       uint32_t jmp_hash, tb_hash;
> > >   
> > > @@ -240,11 +239,8 @@ static TranslationBlock *tb_lookup(CPUState *cpu, target_ulong pc,
> > >       desc.cflags = cflags;
> > >       desc.trace_vcpu_dstate = *cpu->trace_dstate;
> > >       desc.pc = pc;
> > > -    phys_pc = get_page_addr_code(desc.env, pc);
> > > -    if (phys_pc == -1) {
> > > -        return NULL;
> > > -    }
> > >       desc.phys_page1 = phys_pc & TARGET_PAGE_MASK;
> > > +
> > >       tb_hash = tb_hash_func(phys_pc, pc, flags, cflags, *cpu->trace_dstate);
> > >       tb = qht_lookup_custom(&tb_ctx.htable, &desc, tb_hash, tb_lookup_cmp);
> > >       if (tb == NULL) {
> > > @@ -371,6 +367,7 @@ const void *HELPER(lookup_tb_ptr)(CPUArchState *env)
> > >       TranslationBlock *tb;
> > >       target_ulong cs_base, pc;
> > >       uint32_t flags, cflags;
> > > +    tb_page_addr_t phys_pc;
> > >   
> > >       cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
> > >   
> > > @@ -379,7 +376,12 @@ const void *HELPER(lookup_tb_ptr)(CPUArchState *env)
> > >           cpu_loop_exit(cpu);
> > >       }
> > >   
> > > -    tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
> > > +    phys_pc = get_page_addr_code(env, pc);
> > > +    if (phys_pc == -1) {
> > > +        return tcg_code_gen_epilogue;
> > > +    }
> > > +
> > > +    tb = tb_lookup(cpu, phys_pc, pc, cs_base, flags, cflags);
> > >       if (tb == NULL) {
> > >           return tcg_code_gen_epilogue;
> > >       }
> > > @@ -482,6 +484,7 @@ void cpu_exec_step_atomic(CPUState *cpu)
> > >       TranslationBlock *tb;
> > >       target_ulong cs_base, pc;
> > >       uint32_t flags, cflags;
> > > +    tb_page_addr_t phys_pc;
> > >       int tb_exit;
> > >   
> > >       if (sigsetjmp(cpu->jmp_env, 0) == 0) {
> > > @@ -504,7 +507,12 @@ void cpu_exec_step_atomic(CPUState *cpu)
> > >            * Any breakpoint for this insn will have been recognized earlier.
> > >            */
> > >   
> > > -        tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
> > > +        phys_pc = get_page_addr_code(env, pc);
> > > +        if (phys_pc == -1) {
> > > +            tb = NULL;
> > > +        } else {
> > > +            tb = tb_lookup(cpu, phys_pc, pc, cs_base, flags, cflags);
> > > +        }
> > >           if (tb == NULL) {
> > >               mmap_lock();
> > >               tb = tb_gen_code(cpu, pc, cs_base, flags, cflags);
> > > @@ -949,6 +957,7 @@ int cpu_exec(CPUState *cpu)
> > >               TranslationBlock *tb;
> > >               target_ulong cs_base, pc;
> > >               uint32_t flags, cflags;
> > > +            tb_page_addr_t phys_pc;
> > >   
> > >               cpu_get_tb_cpu_state(cpu->env_ptr, &pc, &cs_base, &flags);
> > >   
> > > @@ -970,7 +979,12 @@ int cpu_exec(CPUState *cpu)
> > >                   break;
> > >               }
> > >   
> > > -            tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
> > > +            phys_pc = get_page_addr_code(cpu->env_ptr, pc);
> > > +            if (phys_pc == -1) {
> > > +                tb = NULL;
> > > +            } else {
> > > +                tb = tb_lookup(cpu, phys_pc, pc, cs_base, flags, cflags);
> > > +            }
> > >               if (tb == NULL) {
> > >                   mmap_lock();
> > >                   tb = tb_gen_code(cpu, pc, cs_base, flags, cflags);
> > 
> > This patch did not make it into v2, but having get_page_addr_code()
> > before tb_lookup() in helper_lookup_tb_ptr() helped raise the exception
> > when trying to execute a no-longer-executable TB.
> > 
> > Was it dropped for performance reasons?
> 
> Ah, yes.  I dropped it because I ran into some regression, and started
> minimizing the tree.  Because of the extra lock that needed to be held
> (next patch, also dropped), I couldn't prove this actually helped.
> 
> I think the bit that's causing your user-only failure at the moment is
> the jump cache.  This patch hoisted the page table check before the
> jmp_cache.  For system, cputlb.c takes care of flushing the jump cache
> with page table changes; we still don't have anything in user-only that
> takes care of that.
> 
> 
> r~
> 

Would something like this be okay?

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 27435b97dbd..9421c84d991 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1152,6 +1152,27 @@ static inline void tb_jmp_unlink(TranslationBlock *dest)
     qemu_spin_unlock(&dest->jmp_lock);
 }
 
+static void cpu_tb_jmp_cache_remove(TranslationBlock *tb)
+{
+    CPUState *cpu;
+    uint32_t h;
+
+    /* remove the TB from the hash list */
+    if (TARGET_TB_PCREL) {
+        /* Any TB may be at any virtual address */
+        CPU_FOREACH(cpu) {
+            cpu_tb_jmp_cache_clear(cpu);
+        }
+    } else {
+        h = tb_jmp_cache_hash_func(tb_pc(tb));
+        CPU_FOREACH(cpu) {
+            if (qatomic_read(&cpu->tb_jmp_cache[h].tb) == tb) {
+                qatomic_set(&cpu->tb_jmp_cache[h].tb, NULL);
+            }
+        }
+    }
+}
+
 /*
  * In user-mode, call with mmap_lock held.
  * In !user-mode, if @rm_from_page_list is set, call with the TB's pages'
@@ -1159,7 +1180,6 @@ static inline void tb_jmp_unlink(TranslationBlock *dest)
  */
 static void do_tb_phys_invalidate(TranslationBlock *tb, bool rm_from_page_list)
 {
 {
-    CPUState *cpu;
     PageDesc *p;
     uint32_t h;
     tb_page_addr_t phys_pc;
@@ -1190,20 +1210,7 @@ static void do_tb_phys_invalidate(TranslationBlock *tb, bool rm_from_page_list)
         }
     }
 
-    /* remove the TB from the hash list */
-    if (TARGET_TB_PCREL) {
-        /* Any TB may be at any virtual address */
-        CPU_FOREACH(cpu) {
-            cpu_tb_jmp_cache_clear(cpu);
-        }
-    } else {
-        h = tb_jmp_cache_hash_func(tb_pc(tb));
-        CPU_FOREACH(cpu) {
-            if (qatomic_read(&cpu->tb_jmp_cache[h].tb) == tb) {
-                qatomic_set(&cpu->tb_jmp_cache[h].tb, NULL);
-            }
-        }
-    }
+    cpu_tb_jmp_cache_remove(tb);
 
     /* suppress this TB from the two jump lists */
     tb_remove_from_jmp_list(tb, 0);
@@ -2243,6 +2250,13 @@ void page_set_flags(target_ulong start, target_ulong end, int flags)
             (flags & PAGE_WRITE) &&
             p->first_tb) {
             tb_invalidate_phys_page(addr, 0);
+        } else {
+            TranslationBlock *tb;
+            int n;
+
+            PAGE_FOR_EACH_TB(p, tb, n) {
+                cpu_tb_jmp_cache_remove(tb);
+            }
         }
         if (reset_target_data) {
             g_free(p->target_data);




* Re: [PATCH for-7.2 14/21] accel/tcg: Hoist get_page_addr_code out of tb_lookup
  2022-08-17 11:08       ` Ilya Leoshkevich
@ 2022-08-17 13:15         ` Richard Henderson
  2022-08-17 13:27           ` Ilya Leoshkevich
  2022-08-17 13:42         ` Richard Henderson
  1 sibling, 1 reply; 32+ messages in thread
From: Richard Henderson @ 2022-08-17 13:15 UTC (permalink / raw)
  To: Ilya Leoshkevich, qemu-devel; +Cc: laurent, alex.bennee

On 8/17/22 06:08, Ilya Leoshkevich wrote:
> @@ -2243,6 +2250,13 @@ void page_set_flags(target_ulong start, target_ulong end, int flags)
>               (flags & PAGE_WRITE) &&
>               p->first_tb) {
>               tb_invalidate_phys_page(addr, 0);
> +        } else {
> +            TranslationBlock *tb;
> +            int n;
> +
> +            PAGE_FOR_EACH_TB(p, tb, n) {
> +                cpu_tb_jmp_cache_remove(tb);
> +            }
>           }

Here you would use tb_jmp_cache_clear_page(), which should be moved out of cputlb.c.


r~





* Re: [PATCH for-7.2 14/21] accel/tcg: Hoist get_page_addr_code out of tb_lookup
  2022-08-17 13:15         ` Richard Henderson
@ 2022-08-17 13:27           ` Ilya Leoshkevich
  2022-08-17 13:38             ` Richard Henderson
  0 siblings, 1 reply; 32+ messages in thread
From: Ilya Leoshkevich @ 2022-08-17 13:27 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: laurent, alex.bennee

On Wed, 2022-08-17 at 08:15 -0500, Richard Henderson wrote:
> On 8/17/22 06:08, Ilya Leoshkevich wrote:
> > @@ -2243,6 +2250,13 @@ void page_set_flags(target_ulong start, target_ulong end, int flags)
> >               (flags & PAGE_WRITE) &&
> >               p->first_tb) {
> >               tb_invalidate_phys_page(addr, 0);
> > +        } else {
> > +            TranslationBlock *tb;
> > +            int n;
> > +
> > +            PAGE_FOR_EACH_TB(p, tb, n) {
> > +                cpu_tb_jmp_cache_remove(tb);
> > +            }
> >           }
> 
> Here you would use tb_jmp_cache_clear_page(), which should be moved
> out of cputlb.c.

That was actually the first thing I tried.

Unfortunately tb_jmp_cache_clear_page() relies on
tb_jmp_cache_hash_func() returning the same top bits for addresses on
the same page.  This is not the case for qemu-user: there this property
was traded for better hashing with quite impressive performance
improvements (6f1653180f570).
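
The property at stake can be illustrated with two hash shapes.  Both
functions below are illustrative assumptions written for this sketch,
not copied from the QEMU tree, and all toy_* names and TOY_* constants
are hypothetical.  The first hash derives its upper index bits from the
page number alone, so every address on a page falls into one contiguous
group of slots that a per-page clear can wipe.  The second mixes the
in-page offset into every index bit, which spreads entries better but
removes that grouping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sizes: 4 KiB pages, 4096 cache slots, 64 slots per page group. */
#define TOY_PAGE_BITS     12
#define TOY_JC_BITS       12
#define TOY_JC_SIZE       (1u << TOY_JC_BITS)
#define TOY_JC_PAGE_BITS  (TOY_JC_BITS / 2)
#define TOY_JC_PAGE_SIZE  (1u << TOY_JC_PAGE_BITS)
#define TOY_JC_ADDR_MASK  (TOY_JC_PAGE_SIZE - 1)
#define TOY_JC_PAGE_MASK  (TOY_JC_SIZE - TOY_JC_PAGE_SIZE)

/* Upper index bits come only from the page number, lower bits from the offset. */
unsigned toy_hash_page_grouped(uint64_t pc)
{
    unsigned shift = TOY_PAGE_BITS - TOY_JC_PAGE_BITS;
    uint64_t tmp = pc ^ (pc >> shift);

    return (unsigned)(((tmp >> shift) & TOY_JC_PAGE_MASK)
                      | (tmp & TOY_JC_ADDR_MASK));
}

/* Every index bit depends on the in-page offset: no per-page grouping. */
unsigned toy_hash_flat(uint64_t pc)
{
    return (unsigned)((pc ^ (pc >> TOY_JC_BITS)) & (TOY_JC_SIZE - 1));
}

/* True when two cache indexes fall into the same page-sized group of slots. */
bool toy_same_group(unsigned h1, unsigned h2)
{
    assert(h1 < TOY_JC_SIZE && h2 < TOY_JC_SIZE);
    return (h1 & TOY_JC_PAGE_MASK) == (h2 & TOY_JC_PAGE_MASK);
}
```

With these definitions, 0x401000 and 0x401fc0 (same 4 KiB page) land in
the same group under toy_hash_page_grouped() but in different groups
under toy_hash_flat(), so clearing one group would miss entries in the
second case.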



* Re: [PATCH for-7.2 14/21] accel/tcg: Hoist get_page_addr_code out of tb_lookup
  2022-08-17 13:27           ` Ilya Leoshkevich
@ 2022-08-17 13:38             ` Richard Henderson
  2022-08-17 14:07               ` Ilya Leoshkevich
  0 siblings, 1 reply; 32+ messages in thread
From: Richard Henderson @ 2022-08-17 13:38 UTC (permalink / raw)
  To: Ilya Leoshkevich, qemu-devel; +Cc: laurent, alex.bennee

On 8/17/22 08:27, Ilya Leoshkevich wrote:
> On Wed, 2022-08-17 at 08:15 -0500, Richard Henderson wrote:
>> On 8/17/22 06:08, Ilya Leoshkevich wrote:
>>> @@ -2243,6 +2250,13 @@ void page_set_flags(target_ulong start, target_ulong end, int flags)
>>>                (flags & PAGE_WRITE) &&
>>>                p->first_tb) {
>>>                tb_invalidate_phys_page(addr, 0);
>>> +        } else {
>>> +            TranslationBlock *tb;
>>> +            int n;
>>> +
>>> +            PAGE_FOR_EACH_TB(p, tb, n) {
>>> +                cpu_tb_jmp_cache_remove(tb);
>>> +            }
>>>            }
>>
>> Here you would use tb_jmp_cache_clear_page(), which should be moved
>> out of cputlb.c.
> 
> That was actually the first thing I tried.
> 
> Unfortunately tb_jmp_cache_clear_page() relies on
> tb_jmp_cache_hash_func() returning the same top bits for addresses on
> the same page.  This is not the case for qemu-user: there this property
> was traded for better hashing with quite impressive performance
> improvements (6f1653180f570).

Oh my.  Well, we could

(1) revert that patch because the premise is wrong,
(2) go with your per-tb clearing,
(3) clear the whole thing with cpu_tb_jmp_cache_clear

Ideally we'd have some benchmark numbers to inform the choice...


r~



* Re: [PATCH for-7.2 14/21] accel/tcg: Hoist get_page_addr_code out of tb_lookup
  2022-08-17 11:08       ` Ilya Leoshkevich
  2022-08-17 13:15         ` Richard Henderson
@ 2022-08-17 13:42         ` Richard Henderson
  1 sibling, 0 replies; 32+ messages in thread
From: Richard Henderson @ 2022-08-17 13:42 UTC (permalink / raw)
  To: Ilya Leoshkevich, qemu-devel; +Cc: laurent, alex.bennee

On 8/17/22 06:08, Ilya Leoshkevich wrote:
> +static void cpu_tb_jmp_cache_remove(TranslationBlock *tb)
> +{
> +    CPUState *cpu;
> +    uint32_t h;
> +
> +    /* remove the TB from the hash list */
> +    if (TARGET_TB_PCREL) {
> +        /* Any TB may be at any virtual address */
> +        CPU_FOREACH(cpu) {
> +            cpu_tb_jmp_cache_clear(cpu);
> +        }

This comment is not currently true for user-only.  Although there's an outstanding bug 
report about our failure to manage virtual aliasing in user-only...

> +            PAGE_FOR_EACH_TB(p, tb, n) {
> +                cpu_tb_jmp_cache_remove(tb);
> +            }

You wouldn't want to call cpu_tb_jmp_cache_clear() 99 times for the 99 tb's on the page.

For user-only, I think mprotect is rare enough that just clearing the whole cache once is 
sufficient.


r~
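
The cost difference mentioned above can be made concrete with a small
counting model.  This is a sketch under stated assumptions, not QEMU
code: ToyCPU, toy_flush_per_tb() and toy_flush_once() are hypothetical
names, and the model covers only the case where removing a single block
falls back to clearing the whole per-CPU cache.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_JC_SIZE 64u   /* hypothetical cache size */

typedef struct ToyCPU {
    const void *jmp_cache[TOY_JC_SIZE];
    unsigned full_clears;             /* how many times the cache was wiped */
} ToyCPU;

/* Wipe one CPU's whole cache and count the wipe. */
static void toy_jmp_cache_clear(ToyCPU *cpu)
{
    memset(cpu->jmp_cache, 0, sizeof(cpu->jmp_cache));
    cpu->full_clears++;
}

/* Per-block removal that degenerates to a full clear for every block. */
void toy_flush_per_tb(ToyCPU *cpus, size_t ncpus, size_t ntbs_on_page)
{
    assert(cpus != NULL || ncpus == 0);
    for (size_t t = 0; t < ntbs_on_page; t++) {
        for (size_t c = 0; c < ncpus; c++) {
            toy_jmp_cache_clear(&cpus[c]);
        }
    }
}

/* One full clear per CPU, regardless of how many blocks sit on the page. */
void toy_flush_once(ToyCPU *cpus, size_t ncpus)
{
    assert(cpus != NULL || ncpus == 0);
    for (size_t c = 0; c < ncpus; c++) {
        toy_jmp_cache_clear(&cpus[c]);
    }
}
```

For a page holding 99 blocks, toy_flush_per_tb() wipes each CPU's cache
99 times, whereas toy_flush_once() wipes it once with the same end
state.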



* Re: [PATCH for-7.2 14/21] accel/tcg: Hoist get_page_addr_code out of tb_lookup
  2022-08-17 13:38             ` Richard Henderson
@ 2022-08-17 14:07               ` Ilya Leoshkevich
  2022-08-17 16:07                 ` Richard Henderson
  0 siblings, 1 reply; 32+ messages in thread
From: Ilya Leoshkevich @ 2022-08-17 14:07 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: laurent, alex.bennee

On Wed, 2022-08-17 at 08:38 -0500, Richard Henderson wrote:
> On 8/17/22 08:27, Ilya Leoshkevich wrote:
> > On Wed, 2022-08-17 at 08:15 -0500, Richard Henderson wrote:
> > > On 8/17/22 06:08, Ilya Leoshkevich wrote:
> > > > @@ -2243,6 +2250,13 @@ void page_set_flags(target_ulong start, target_ulong end, int flags)
> > > >                (flags & PAGE_WRITE) &&
> > > >                p->first_tb) {
> > > >                tb_invalidate_phys_page(addr, 0);
> > > > +        } else {
> > > > +            TranslationBlock *tb;
> > > > +            int n;
> > > > +
> > > > +            PAGE_FOR_EACH_TB(p, tb, n) {
> > > > +                cpu_tb_jmp_cache_remove(tb);
> > > > +            }
> > > >            }
> > > 
> > > > Here you would use tb_jmp_cache_clear_page(), which should be moved
> > > > out of cputlb.c.
> > > 
> > > That was actually the first thing I tried.
> > > 
> > > Unfortunately tb_jmp_cache_clear_page() relies on
> > > tb_jmp_cache_hash_func() returning the same top bits for addresses on
> > > the same page.  This is not the case for qemu-user: there this property
> > > was traded for better hashing with quite impressive performance
> > > improvements (6f1653180f570).
> 
> Oh my.  Well, we could
> 
> (1) revert that patch because the premise is wrong,
> (2) go with your per-tb clearing,
> (3) clear the whole thing with cpu_tb_jmp_cache_clear
> 
> Ideally we'd have some benchmark numbers to inform the choice...

FWIW 6f1653180f570 still looks useful.
Reverting it caused 620.omnetpp_s to take ~4% more time.
I ran the benchmark with reduced values in omnetpp.ini so as not to
wait forever, therefore the real figures might be closer to what the
commit message says. In any case this still shows that the patch has
measurable impact.



* Re: [PATCH for-7.2 14/21] accel/tcg: Hoist get_page_addr_code out of tb_lookup
  2022-08-17 14:07               ` Ilya Leoshkevich
@ 2022-08-17 16:07                 ` Richard Henderson
  0 siblings, 0 replies; 32+ messages in thread
From: Richard Henderson @ 2022-08-17 16:07 UTC (permalink / raw)
  To: Ilya Leoshkevich, qemu-devel; +Cc: laurent, alex.bennee

On 8/17/22 09:07, Ilya Leoshkevich wrote:
>> Oh my.  Well, we could
>>
>> (1) revert that patch because the premise is wrong,
>> (2) go with your per-tb clearing,
>> (3) clear the whole thing with cpu_tb_jmp_cache_clear
>>
>> Ideally we'd have some benchmark numbers to inform the choice...
> 
> FWIW 6f1653180f570 still looks useful.
> Reverting it caused 620.omnetpp_s to take ~4% more time.
> I ran the benchmark with reduced values in omnetpp.ini so as not to
> wait forever, therefore the real figures might be closer to what the
> commit message says. In any case this still shows that the patch has
> measurable impact.

Thanks for the testing.

I think option (3) will be best for user-only, because mprotect/munmap of existing code 
pages is rare -- usually only at process startup.


r~



end of thread, other threads:[~2022-08-17 16:26 UTC | newest]

Thread overview: 32+ messages
2022-08-12 18:07 [PATCH for-7.2 00/21] accel/tcg: minimize tlb lookups during translate + user-only PROT_EXEC fixes Richard Henderson
2022-08-12 18:07 ` [PATCH for-7.2 01/21] linux-user/arm: Mark the commpage executable Richard Henderson
2022-08-12 18:07 ` [PATCH for-7.2 02/21] linux-user/hppa: Allocate page zero as a commpage Richard Henderson
2022-08-12 18:07 ` [PATCH for-7.2 03/21] linux-user/x86_64: Allocate vsyscall page " Richard Henderson
2022-08-12 18:07 ` [PATCH for-7.2 04/21] linux-user: Honor PT_GNU_STACK Richard Henderson
2022-08-12 18:07 ` [PATCH for-7.2 05/21] tests/tcg/i386: Move smc_code2 to an executable section Richard Henderson
2022-08-12 18:07 ` [PATCH for-7.2 06/21] accel/tcg: Remove PageDesc code_bitmap Richard Henderson
2022-08-12 18:07 ` [PATCH for-7.2 07/21] accel/tcg: Use bool for page_find_alloc Richard Henderson
2022-08-12 18:07 ` [PATCH for-7.2 08/21] accel/tcg: Merge tb_htable_lookup into caller Richard Henderson
2022-08-12 18:07 ` [PATCH for-7.2 09/21] accel/tcg: Move qemu_ram_addr_from_host_nofail to physmem.c Richard Henderson
2022-08-12 18:07 ` [PATCH for-7.2 10/21] accel/tcg: Properly implement get_page_addr_code for user-only Richard Henderson
2022-08-12 18:07 ` [PATCH for-7.2 11/21] accel/tcg: Use probe_access_internal for softmmu get_page_addr_code_hostp Richard Henderson
2022-08-12 18:07 ` [PATCH for-7.2 12/21] accel/tcg: Add nofault parameter to get_page_addr_code_hostp Richard Henderson
2022-08-12 18:07 ` [PATCH for-7.2 13/21] accel/tcg: Unlock mmap_lock after longjmp Richard Henderson
2022-08-12 18:07 ` [PATCH for-7.2 14/21] accel/tcg: Hoist get_page_addr_code out of tb_lookup Richard Henderson
2022-08-16 23:43   ` Ilya Leoshkevich
2022-08-17  1:42     ` Richard Henderson
2022-08-17 11:08       ` Ilya Leoshkevich
2022-08-17 13:15         ` Richard Henderson
2022-08-17 13:27           ` Ilya Leoshkevich
2022-08-17 13:38             ` Richard Henderson
2022-08-17 14:07               ` Ilya Leoshkevich
2022-08-17 16:07                 ` Richard Henderson
2022-08-17 13:42         ` Richard Henderson
2022-08-12 18:08 ` [PATCH for-7.2 15/21] accel/tcg: Hoist get_page_addr_code out of tb_gen_code Richard Henderson
2022-08-12 18:08 ` [PATCH for-7.2 16/21] accel/tcg: Raise PROT_EXEC exception early Richard Henderson
2022-08-12 18:08 ` [PATCH for-7.2 17/21] accel/tcg: Introduce is_same_page() Richard Henderson
2022-08-12 18:08 ` [PATCH for-7.2 18/21] accel/tcg: Remove translator_ldsw Richard Henderson
2022-08-12 18:08 ` [PATCH for-7.2 19/21] accel/tcg: Add pc and host_pc params to gen_intermediate_code Richard Henderson
2022-08-12 18:08 ` [PATCH for-7.2 20/21] accel/tcg: Add fast path for translator_ld* Richard Henderson
2022-08-12 18:08 ` [PATCH for-7.2 21/21] accel/tcg: Use DisasContextBase in plugin_gen_tb_start Richard Henderson
2022-08-16 23:12 ` [PATCH for-7.2 00/21] accel/tcg: minimize tlb lookups during translate + user-only PROT_EXEC fixes Ilya Leoshkevich
