qemu-devel.nongnu.org archive mirror
* [PATCH 0/9] accel/tcg: Fix page_set_flags and related [#1528]
@ 2023-03-06  2:12 Richard Henderson
  2023-03-06  2:12 ` [PATCH 1/9] linux-user: Diagnose incorrect -R size Richard Henderson
                   ` (9 more replies)
  0 siblings, 10 replies; 25+ messages in thread
From: Richard Henderson @ 2023-03-06  2:12 UTC (permalink / raw)
  To: qemu-devel

The primary issue is that of overflow, where "end" for the last
page of the 32-bit address space overflows to 0.  The fix is to
use "last" instead, which can always be represented.

This requires that we adjust reserved_va as well, because of

-/*
- * There are a number of places where we assign reserved_va to a variable
- * of type abi_ulong and expect it to fit.  Avoid the last page.
- */
-#   define MAX_RESERVED_VA  (0xfffffffful & TARGET_PAGE_MASK)

and the related

-        /*
-         * reserved_va must be aligned with the host page size
-         * as it is used with mmap()
-         */
-        reserved_va = local_max_va & qemu_host_page_mask;

whereby we avoided the final (host | guest) page of the address space
because of said overflow.  With the change in representation, we can
always use UINT32_MAX as the end of the 32-bit address space.

This was observable as ppc64le (or any other 64k-page host) being
unable to load any arm32 binary, because the COMMPAGE goes at
0xffff0000, which runs afoul of the last-host-page problem above.

The issue is resolved in patch 4; the rest of the series cleans up
other interfaces with the same problem.  I'm not touching any
interfaces that use start+len instead of start+end.


r~


Richard Henderson (9):
  linux-user: Diagnose incorrect -R size
  linux-user: Rename max_reserved_va in main
  include/exec: Replace reserved_va with max_reserved_va
  accel/tcg: Pass last not end to page_set_flags
  accel/tcg: Pass last not end to page_reset_target_data
  accel/tcg: Pass last not end to PAGE_FOR_EACH_TB
  accel/tcg: Pass last not end to page_collection_lock
  accel/tcg: Pass last not end to tb_invalidate_phys_page_range__locked
  accel/tcg: Pass last not end to tb_invalidate_phys_range

 include/exec/cpu-all.h      | 19 ++++++--
 include/exec/exec-all.h     |  2 +-
 linux-user/arm/target_cpu.h |  2 +-
 accel/tcg/tb-maint.c        | 95 +++++++++++++++++++------------------
 accel/tcg/translate-all.c   |  2 +-
 accel/tcg/user-exec.c       | 25 +++++-----
 bsd-user/main.c             | 18 +++----
 bsd-user/mmap.c             | 18 +++----
 bsd-user/signal.c           |  4 +-
 linux-user/elfload.c        | 47 +++++++++---------
 linux-user/main.c           | 44 +++++++++--------
 linux-user/mmap.c           | 38 +++++++--------
 linux-user/signal.c         |  4 +-
 linux-user/syscall.c        |  4 +-
 softmmu/physmem.c           |  2 +-
 target/arm/cpu.c            |  2 +-
 16 files changed, 169 insertions(+), 157 deletions(-)

-- 
2.34.1



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/9] linux-user: Diagnose incorrect -R size
  2023-03-06  2:12 [PATCH 0/9] accel/tcg: Fix page_set_flags and related [#1528] Richard Henderson
@ 2023-03-06  2:12 ` Richard Henderson
  2023-03-06 12:56   ` Peter Maydell
  2023-03-06  2:13 ` [PATCH 2/9] linux-user: Rename max_reserved_va in main Richard Henderson
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2023-03-06  2:12 UTC (permalink / raw)
  To: qemu-devel

Zero is the value for 'off', and should not be accepted for -R.
We have been enforcing host page alignment for the non-R
fallback of MAX_RESERVED_VA, but failing to enforce it for -R.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/main.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/linux-user/main.c b/linux-user/main.c
index 4ff30ff980..f4dea25242 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -395,6 +395,16 @@ static void handle_arg_reserved_va(const char *arg)
         fprintf(stderr, "Unrecognised -R size suffix '%s'\n", p);
         exit(EXIT_FAILURE);
     }
+    if (reserved_va == 0) {
+        fprintf(stderr, "Invalid -R size value 0\n");
+        exit(EXIT_FAILURE);
+    }
+    /* Must be aligned with the host page size as it is used with mmap. */
+    if (reserved_va & qemu_host_page_mask) {
+        fprintf(stderr, "Invalid -R size value %lu: must be aligned mod %lu\n",
+		reserved_va, qemu_host_page_size);
+        exit(EXIT_FAILURE);
+    }
 }
 
 static void handle_arg_singlestep(const char *arg)
-- 
2.34.1




* [PATCH 2/9] linux-user: Rename max_reserved_va in main
  2023-03-06  2:12 [PATCH 0/9] accel/tcg: Fix page_set_flags and related [#1528] Richard Henderson
  2023-03-06  2:12 ` [PATCH 1/9] linux-user: Diagnose incorrect -R size Richard Henderson
@ 2023-03-06  2:13 ` Richard Henderson
  2023-03-06  7:33   ` Philippe Mathieu-Daudé
  2023-03-06  2:13 ` [PATCH 3/9] include/exec: Replace reserved_va with max_reserved_va Richard Henderson
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2023-03-06  2:13 UTC (permalink / raw)
  To: qemu-devel

Rename to local_max_va, to avoid a conflict with the next patch.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/main.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index f4dea25242..5fcaddffc2 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -680,7 +680,7 @@ int main(int argc, char **argv, char **envp)
     int i;
     int ret;
     int execfd;
-    unsigned long max_reserved_va;
+    unsigned long local_max_va;
     bool preserve_argv0;
 
     error_init(argv[0]);
@@ -786,9 +786,9 @@ int main(int argc, char **argv, char **envp)
      * still try it, if directed by the command-line option, but
      * not by default.
      */
-    max_reserved_va = MAX_RESERVED_VA(cpu);
+    local_max_va = MAX_RESERVED_VA(cpu);
     if (reserved_va != 0) {
-        if (max_reserved_va && reserved_va > max_reserved_va) {
+        if (local_max_va && reserved_va > local_max_va) {
             fprintf(stderr, "Reserved virtual address too big\n");
             exit(EXIT_FAILURE);
         }
@@ -797,7 +797,7 @@ int main(int argc, char **argv, char **envp)
          * reserved_va must be aligned with the host page size
          * as it is used with mmap()
          */
-        reserved_va = max_reserved_va & qemu_host_page_mask;
+        reserved_va = local_max_va & qemu_host_page_mask;
     }
 
     {
-- 
2.34.1




* [PATCH 3/9] include/exec: Replace reserved_va with max_reserved_va
  2023-03-06  2:12 [PATCH 0/9] accel/tcg: Fix page_set_flags and related [#1528] Richard Henderson
  2023-03-06  2:12 ` [PATCH 1/9] linux-user: Diagnose incorrect -R size Richard Henderson
  2023-03-06  2:13 ` [PATCH 2/9] linux-user: Rename max_reserved_va in main Richard Henderson
@ 2023-03-06  2:13 ` Richard Henderson
  2023-03-06  2:13 ` [PATCH 4/9] accel/tcg: Pass last not end to page_set_flags Richard Henderson
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2023-03-06  2:13 UTC (permalink / raw)
  To: qemu-devel

In addition to the rename, change the semantics to be the
last byte of the guest va, rather than the following byte.
This avoids some overflow conditions.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/cpu-all.h      | 15 ++++++++++++---
 linux-user/arm/target_cpu.h |  2 +-
 bsd-user/main.c             | 18 +++++++-----------
 bsd-user/mmap.c             | 12 ++++++------
 bsd-user/signal.c           |  4 ++--
 linux-user/elfload.c        | 36 ++++++++++++++++++------------------
 linux-user/main.c           | 36 ++++++++++++++++--------------------
 linux-user/mmap.c           | 20 ++++++++++----------
 linux-user/signal.c         |  4 ++--
 target/arm/cpu.c            |  2 +-
 10 files changed, 75 insertions(+), 74 deletions(-)

diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index 2eb1176538..7ef6b9a94d 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -152,12 +152,21 @@ static inline void tswap64s(uint64_t *s)
  */
 extern uintptr_t guest_base;
 extern bool have_guest_base;
-extern unsigned long reserved_va;
+
+/*
+ * If non-zero, the guest virtual address space is a contiguous subset
+ * of the host virtual address space, i.e. '-R reserved-va' is in effect
+ * either from the command-line or by default.  The value is the last
+ * byte of the guest address space e.g. UINT32_MAX.
+ *
+ * If zero, the host and guest virtual address spaces are intermingled.
+ */
+extern unsigned long max_reserved_va;
 
 /*
  * Limit the guest addresses as best we can.
  *
- * When not using -R reserved_va, we cannot really limit the guest
+ * When not using -R <reserved-va>, we cannot really limit the guest
  * to less address space than the host.  For 32-bit guests, this
  * acts as a sanity check that we're not giving the guest an address
  * that it cannot even represent.  For 64-bit guests... the address
@@ -171,7 +180,7 @@ extern unsigned long reserved_va;
 #define GUEST_ADDR_MAX_                                                 \
     ((MIN_CONST(TARGET_VIRT_ADDR_SPACE_BITS, TARGET_ABI_BITS) <= 32) ?  \
      UINT32_MAX : ~0ul)
-#define GUEST_ADDR_MAX    (reserved_va ? reserved_va - 1 : GUEST_ADDR_MAX_)
+#define GUEST_ADDR_MAX    (max_reserved_va ? : GUEST_ADDR_MAX_)
 
 #else
 
diff --git a/linux-user/arm/target_cpu.h b/linux-user/arm/target_cpu.h
index 89ba274cfc..f6383a7cd1 100644
--- a/linux-user/arm/target_cpu.h
+++ b/linux-user/arm/target_cpu.h
@@ -30,7 +30,7 @@ static inline unsigned long arm_max_reserved_va(CPUState *cs)
          * the high addresses.  Restrict linux-user to the
          * cached write-back RAM in the system map.
          */
-        return 0x80000000ul;
+        return 0x7ffffffful;
     } else {
         /*
          * We need to be able to map the commpage.
diff --git a/bsd-user/main.c b/bsd-user/main.c
index 41290e16f9..de413bd1d2 100644
--- a/bsd-user/main.c
+++ b/bsd-user/main.c
@@ -67,16 +67,12 @@ bool have_guest_base;
 # if HOST_LONG_BITS > TARGET_VIRT_ADDR_SPACE_BITS
 #  if TARGET_VIRT_ADDR_SPACE_BITS == 32 && \
       (TARGET_LONG_BITS == 32 || defined(TARGET_ABI32))
-/*
- * There are a number of places where we assign reserved_va to a variable
- * of type abi_ulong and expect it to fit.  Avoid the last page.
- */
-#   define MAX_RESERVED_VA  (0xfffffffful & TARGET_PAGE_MASK)
+#   define MAX_RESERVED_VA  0xfffffffful
 #  else
-#   define MAX_RESERVED_VA  (1ul << TARGET_VIRT_ADDR_SPACE_BITS)
+#   define MAX_RESERVED_VA  ((1ul << TARGET_VIRT_ADDR_SPACE_BITS) - 1)
 #  endif
 # else
-#  define MAX_RESERVED_VA  0
+#  define MAX_RESERVED_VA   (-1ul)
 # endif
 #endif
 
@@ -86,9 +82,9 @@ bool have_guest_base;
  * if directed by the command-line option, but not by default.
  */
 #if HOST_LONG_BITS == 64 && TARGET_VIRT_ADDR_SPACE_BITS <= 32
-unsigned long reserved_va = MAX_RESERVED_VA;
+unsigned long max_reserved_va = MAX_RESERVED_VA;
 #else
-unsigned long reserved_va;
+unsigned long max_reserved_va;
 #endif
 
 static const char *interp_prefix = CONFIG_QEMU_INTERP_PREFIX;
@@ -464,8 +460,8 @@ int main(int argc, char **argv)
     target_environ = envlist_to_environ(envlist, NULL);
     envlist_free(envlist);
 
-    if (reserved_va) {
-            mmap_next_start = reserved_va;
+    if (max_reserved_va) {
+        mmap_next_start = max_reserved_va;
     }
 
     {
diff --git a/bsd-user/mmap.c b/bsd-user/mmap.c
index d6c5a344c9..e9a330d599 100644
--- a/bsd-user/mmap.c
+++ b/bsd-user/mmap.c
@@ -227,14 +227,14 @@ static abi_ulong mmap_find_vma_reserved(abi_ulong start, abi_ulong size,
     int prot;
     int looped = 0;
 
-    if (size > reserved_va) {
+    if (size > max_reserved_va) {
         return (abi_ulong)-1;
     }
 
     size = HOST_PAGE_ALIGN(size) + alignment;
     end_addr = start + size;
-    if (end_addr > reserved_va) {
-        end_addr = reserved_va;
+    if (end_addr > max_reserved_va) {
+        end_addr = max_reserved_va + 1;
     }
     addr = end_addr - qemu_host_page_size;
 
@@ -243,7 +243,7 @@ static abi_ulong mmap_find_vma_reserved(abi_ulong start, abi_ulong size,
             if (looped) {
                 return (abi_ulong)-1;
             }
-            end_addr = reserved_va;
+            end_addr = max_reserved_va + 1;
             addr = end_addr - qemu_host_page_size;
             looped = 1;
             continue;
@@ -291,7 +291,7 @@ static abi_ulong mmap_find_vma_aligned(abi_ulong start, abi_ulong size,
 
     size = HOST_PAGE_ALIGN(size);
 
-    if (reserved_va) {
+    if (max_reserved_va) {
         return mmap_find_vma_reserved(start, size,
             (alignment != 0 ? 1 << alignment : 0));
     }
@@ -759,7 +759,7 @@ int target_munmap(abi_ulong start, abi_ulong len)
     ret = 0;
     /* unmap what we can */
     if (real_start < real_end) {
-        if (reserved_va) {
+        if (max_reserved_va) {
             mmap_reserve(real_start, real_end - real_start);
         } else {
             ret = munmap(g2h_untagged(real_start), real_end - real_start);
diff --git a/bsd-user/signal.c b/bsd-user/signal.c
index 58a5386395..be12568f6c 100644
--- a/bsd-user/signal.c
+++ b/bsd-user/signal.c
@@ -492,7 +492,7 @@ static void host_signal_handler(int host_sig, siginfo_t *info, void *puc)
 
         /*
          * Convert forcefully to guest address space: addresses outside
-         * reserved_va are still valid to report via SEGV_MAPERR.
+         * max_reserved_va are still valid to report via SEGV_MAPERR.
          */
         guest_addr = h2g_nocheck(host_addr);
 
@@ -512,7 +512,7 @@ static void host_signal_handler(int host_sig, siginfo_t *info, void *puc)
                 }
 
                 /*
-                 * With reserved_va, the whole address space is PROT_NONE,
+                 * With max_reserved_va, the whole address space is PROT_NONE,
                  * which means that we may get ACCERR when we want MAPERR.
                  */
                 if (page_get_flags(guest_addr) & PAGE_VALID) {
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 5928c14dfc..104c13ec77 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -204,10 +204,10 @@ static bool init_guest_commpage(void)
      * The vsyscall page is at a high negative address aka kernel space,
      * which means that we cannot actually allocate it with target_mmap.
      * We still should be able to use page_set_flags, unless the user
-     * has specified -R reserved_va, which would trigger an assert().
+     * has specified -R <reserved-va>, which would trigger an assert().
      */
-    if (reserved_va != 0 &&
-        TARGET_VSYSCALL_PAGE + TARGET_PAGE_SIZE >= reserved_va) {
+    if (max_reserved_va != 0 &&
+        TARGET_VSYSCALL_PAGE + TARGET_PAGE_SIZE - 1 >= max_reserved_va) {
         error_report("Cannot allocate vsyscall page");
         exit(EXIT_FAILURE);
     }
@@ -2484,11 +2484,11 @@ static void pgb_have_guest_base(const char *image_name, abi_ulong guest_loaddr,
     }
 
     /* Sanity check the guest binary. */
-    if (reserved_va) {
-        if (guest_hiaddr > reserved_va) {
+    if (max_reserved_va) {
+        if (guest_hiaddr - 1 > max_reserved_va) {
             error_report("%s: requires more than reserved virtual "
                          "address space (0x%" PRIx64 " > 0x%lx)",
-                         image_name, (uint64_t)guest_hiaddr, reserved_va);
+                         image_name, (uint64_t)guest_hiaddr - 1, max_reserved_va);
             exit(EXIT_FAILURE);
         }
     } else {
@@ -2503,16 +2503,16 @@ static void pgb_have_guest_base(const char *image_name, abi_ulong guest_loaddr,
     }
 
     /*
-     * Expand the allocation to the entire reserved_va.
+     * Expand the allocation to the entire max_reserved_va.
      * Exclude the mmap_min_addr hole.
      */
-    if (reserved_va) {
+    if (max_reserved_va) {
         guest_loaddr = (guest_base >= mmap_min_addr ? 0
                         : mmap_min_addr - guest_base);
-        guest_hiaddr = reserved_va;
+        guest_hiaddr = max_reserved_va + 1;
     }
 
-    /* Reserve the address space for the binary, or reserved_va. */
+    /* Reserve the address space for the binary, or max_reserved_va. */
     test = g2h_untagged(guest_loaddr);
     addr = mmap(test, guest_hiaddr - guest_loaddr, PROT_NONE, flags, -1, 0);
     if (test != addr) {
@@ -2716,7 +2716,7 @@ static void pgb_dynamic(const char *image_name, long align)
     if (HI_COMMPAGE) {
         uintptr_t addr, commpage;
 
-        /* 64-bit hosts should have used reserved_va. */
+        /* 64-bit hosts should have used max_reserved_va. */
         assert(sizeof(uintptr_t) == 4);
 
         /*
@@ -2736,15 +2736,15 @@ static void pgb_reserved_va(const char *image_name, abi_ulong guest_loaddr,
     int flags = MAP_ANONYMOUS | MAP_PRIVATE | MAP_NORESERVE;
     void *addr, *test;
 
-    if (guest_hiaddr > reserved_va) {
+    if (guest_hiaddr - 1 > max_reserved_va) {
         error_report("%s: requires more than reserved virtual "
                      "address space (0x%" PRIx64 " > 0x%lx)",
-                     image_name, (uint64_t)guest_hiaddr, reserved_va);
+                     image_name, (uint64_t)guest_hiaddr - 1, max_reserved_va);
         exit(EXIT_FAILURE);
     }
 
     /* Widen the "image" to the entire reserved address space. */
-    pgb_static(image_name, 0, reserved_va, align);
+    pgb_static(image_name, 0, max_reserved_va + 1, align);
 
     /* osdep.h defines this as 0 if it's missing */
     flags |= MAP_FIXED_NOREPLACE;
@@ -2752,17 +2752,17 @@ static void pgb_reserved_va(const char *image_name, abi_ulong guest_loaddr,
     /* Reserve the memory on the host. */
     assert(guest_base != 0);
     test = g2h_untagged(0);
-    addr = mmap(test, reserved_va, PROT_NONE, flags, -1, 0);
+    addr = mmap(test, max_reserved_va + 1, PROT_NONE, flags, -1, 0);
     if (addr == MAP_FAILED || addr != test) {
         error_report("Unable to reserve 0x%lx bytes of virtual address "
                      "space at %p (%s) for use as guest address space (check your "
                      "virtual memory ulimit setting, min_mmap_addr or reserve less "
-                     "using -R option)", reserved_va, test, strerror(errno));
+                     "using -R option)", max_reserved_va + 1, test, strerror(errno));
         exit(EXIT_FAILURE);
     }
 
     qemu_log_mask(CPU_LOG_PAGE, "%s: base @ %p for %lu bytes\n",
-                  __func__, addr, reserved_va);
+                  __func__, addr, max_reserved_va + 1);
 }
 
 void probe_guest_base(const char *image_name, abi_ulong guest_loaddr,
@@ -2773,7 +2773,7 @@ void probe_guest_base(const char *image_name, abi_ulong guest_loaddr,
 
     if (have_guest_base) {
         pgb_have_guest_base(image_name, guest_loaddr, guest_hiaddr, align);
-    } else if (reserved_va) {
+    } else if (max_reserved_va) {
         pgb_reserved_va(image_name, guest_loaddr, guest_hiaddr, align);
     } else if (guest_loaddr) {
         pgb_static(image_name, guest_loaddr, guest_hiaddr, align);
diff --git a/linux-user/main.c b/linux-user/main.c
index 5fcaddffc2..f5731a1157 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -107,18 +107,16 @@ static const char *last_log_filename;
 # if HOST_LONG_BITS > TARGET_VIRT_ADDR_SPACE_BITS
 #  if TARGET_VIRT_ADDR_SPACE_BITS == 32 && \
       (TARGET_LONG_BITS == 32 || defined(TARGET_ABI32))
-/* There are a number of places where we assign reserved_va to a variable
-   of type abi_ulong and expect it to fit.  Avoid the last page.  */
-#   define MAX_RESERVED_VA(CPU)  (0xfffffffful & TARGET_PAGE_MASK)
+#   define MAX_RESERVED_VA(CPU)  0xfffffffful
 #  else
-#   define MAX_RESERVED_VA(CPU)  (1ul << TARGET_VIRT_ADDR_SPACE_BITS)
+#   define MAX_RESERVED_VA(CPU)  ((1ul << TARGET_VIRT_ADDR_SPACE_BITS) - 1)
 #  endif
 # else
-#  define MAX_RESERVED_VA(CPU)  0
+#  define MAX_RESERVED_VA(CPU)   (-1ul)
 # endif
 #endif
 
-unsigned long reserved_va;
+unsigned long max_reserved_va;
 
 static void usage(int exitcode);
 
@@ -369,7 +367,8 @@ static void handle_arg_reserved_va(const char *arg)
 {
     char *p;
     int shift = 0;
-    reserved_va = strtoul(arg, &p, 0);
+
+    max_reserved_va = strtoul(arg, &p, 0);
     switch (*p) {
     case 'k':
     case 'K':
@@ -383,10 +382,10 @@ static void handle_arg_reserved_va(const char *arg)
         break;
     }
     if (shift) {
-        unsigned long unshifted = reserved_va;
+        unsigned long unshifted = max_reserved_va;
         p++;
-        reserved_va <<= shift;
-        if (reserved_va >> shift != unshifted) {
+        max_reserved_va <<= shift;
+        if (max_reserved_va >> shift != unshifted) {
             fprintf(stderr, "Reserved virtual address too big\n");
             exit(EXIT_FAILURE);
         }
@@ -395,16 +394,17 @@ static void handle_arg_reserved_va(const char *arg)
         fprintf(stderr, "Unrecognised -R size suffix '%s'\n", p);
         exit(EXIT_FAILURE);
     }
-    if (reserved_va == 0) {
+    if (max_reserved_va == 0) {
         fprintf(stderr, "Invalid -R size value 0\n");
         exit(EXIT_FAILURE);
     }
     /* Must be aligned with the host page size as it is used with mmap. */
-    if (reserved_va & qemu_host_page_mask) {
+    if (max_reserved_va & qemu_host_page_mask) {
         fprintf(stderr, "Invalid -R size value %lu: must be aligned mod %lu\n",
-		reserved_va, qemu_host_page_size);
+                max_reserved_va, qemu_host_page_size);
         exit(EXIT_FAILURE);
     }
+    max_reserved_va--;
 }
 
 static void handle_arg_singlestep(const char *arg)
@@ -787,17 +787,13 @@ int main(int argc, char **argv, char **envp)
      * not by default.
      */
     local_max_va = MAX_RESERVED_VA(cpu);
-    if (reserved_va != 0) {
-        if (local_max_va && reserved_va > local_max_va) {
+    if (max_reserved_va != 0) {
+        if (max_reserved_va > local_max_va) {
             fprintf(stderr, "Reserved virtual address too big\n");
             exit(EXIT_FAILURE);
         }
     } else if (HOST_LONG_BITS == 64 && TARGET_VIRT_ADDR_SPACE_BITS <= 32) {
-        /*
-         * reserved_va must be aligned with the host page size
-         * as it is used with mmap()
-         */
-        reserved_va = local_max_va & qemu_host_page_mask;
+        max_reserved_va = local_max_va;
     }
 
     {
diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index 28135c9e6a..547be8dff6 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -274,16 +274,16 @@ static abi_ulong mmap_find_vma_reserved(abi_ulong start, abi_ulong size,
     int prot;
     bool looped = false;
 
-    if (size > reserved_va) {
+    if (size > max_reserved_va) {
         return (abi_ulong)-1;
     }
 
     /* Note that start and size have already been aligned by mmap_find_vma. */
 
     end_addr = start + size;
-    if (start > reserved_va - size) {
+    if (start > max_reserved_va + 1 - size) {
         /* Start at the top of the address space.  */
-        end_addr = ((reserved_va - size) & -align) + size;
+        end_addr = ((max_reserved_va + 1 - size) & -align) + size;
         looped = true;
     }
 
@@ -297,7 +297,7 @@ static abi_ulong mmap_find_vma_reserved(abi_ulong start, abi_ulong size,
                 return (abi_ulong)-1;
             }
             /* Re-start at the top of the address space.  */
-            addr = end_addr = ((reserved_va - size) & -align) + size;
+            addr = end_addr = ((max_reserved_va + 1 - size) & -align) + size;
             looped = true;
         } else {
             prot = page_get_flags(addr);
@@ -339,7 +339,7 @@ abi_ulong mmap_find_vma(abi_ulong start, abi_ulong size, abi_ulong align)
 
     size = HOST_PAGE_ALIGN(size);
 
-    if (reserved_va) {
+    if (max_reserved_va) {
         return mmap_find_vma_reserved(start, size, align);
     }
 
@@ -755,7 +755,7 @@ int target_munmap(abi_ulong start, abi_ulong len)
     ret = 0;
     /* unmap what we can */
     if (real_start < real_end) {
-        if (reserved_va) {
+        if (max_reserved_va) {
             mmap_reserve(real_start, real_end - real_start);
         } else {
             ret = munmap(g2h_untagged(real_start), real_end - real_start);
@@ -791,7 +791,7 @@ abi_long target_mremap(abi_ulong old_addr, abi_ulong old_size,
         host_addr = mremap(g2h_untagged(old_addr), old_size, new_size,
                            flags, g2h_untagged(new_addr));
 
-        if (reserved_va && host_addr != MAP_FAILED) {
+        if (max_reserved_va && host_addr != MAP_FAILED) {
             /* If new and old addresses overlap then the above mremap will
                already have failed with EINVAL.  */
             mmap_reserve(old_addr, old_size);
@@ -808,13 +808,13 @@ abi_long target_mremap(abi_ulong old_addr, abi_ulong old_size,
             host_addr = mremap(g2h_untagged(old_addr), old_size, new_size,
                                flags | MREMAP_FIXED,
                                g2h_untagged(mmap_start));
-            if (reserved_va) {
+            if (max_reserved_va) {
                 mmap_reserve(old_addr, old_size);
             }
         }
     } else {
         int prot = 0;
-        if (reserved_va && old_size < new_size) {
+        if (max_reserved_va && old_size < new_size) {
             abi_ulong addr;
             for (addr = old_addr + old_size;
                  addr < old_addr + new_size;
@@ -834,7 +834,7 @@ abi_long target_mremap(abi_ulong old_addr, abi_ulong old_size,
                                        new_size, old_size, flags);
                     errno = ENOMEM;
                     host_addr = MAP_FAILED;
-                } else if (reserved_va && old_size > new_size) {
+                } else if (max_reserved_va && old_size > new_size) {
                     mmap_reserve(old_addr + old_size, old_size - new_size);
                 }
             }
diff --git a/linux-user/signal.c b/linux-user/signal.c
index 098f3a787d..f40b0a616a 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -808,7 +808,7 @@ static void host_signal_handler(int host_sig, siginfo_t *info, void *puc)
 
         /*
          * Convert forcefully to guest address space: addresses outside
-         * reserved_va are still valid to report via SEGV_MAPERR.
+         * max_reserved_va are still valid to report via SEGV_MAPERR.
          */
         guest_addr = h2g_nocheck(host_addr);
 
@@ -827,7 +827,7 @@ static void host_signal_handler(int host_sig, siginfo_t *info, void *puc)
                 }
 
                 /*
-                 * With reserved_va, the whole address space is PROT_NONE,
+                 * With max_reserved_va, the whole address space is PROT_NONE,
                  * which means that we may get ACCERR when we want MAPERR.
                  */
                 if (page_get_flags(guest_addr) & PAGE_VALID) {
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 5182ed0c91..b9cd0d3e33 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -264,7 +264,7 @@ static void arm_cpu_reset_hold(Object *obj)
             }
         }
         /*
-         * Enable 48-bit address space (TODO: take reserved_va into account).
+         * Enable 48-bit address space (TODO: take max_reserved_va into account).
          * Enable TBI0 but not TBI1.
          * Note that this must match useronly_clean_ptr.
          */
-- 
2.34.1




* [PATCH 4/9] accel/tcg: Pass last not end to page_set_flags
  2023-03-06  2:12 [PATCH 0/9] accel/tcg: Fix page_set_flags and related [#1528] Richard Henderson
                   ` (2 preceding siblings ...)
  2023-03-06  2:13 ` [PATCH 3/9] include/exec: Replace reserved_va with max_reserved_va Richard Henderson
@ 2023-03-06  2:13 ` Richard Henderson
  2023-03-06  7:49   ` Philippe Mathieu-Daudé
  2023-03-06  2:13 ` [PATCH 5/9] accel/tcg: Pass last not end to page_reset_target_data Richard Henderson
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2023-03-06  2:13 UTC (permalink / raw)
  To: qemu-devel

Pass the address of the last byte to be changed, rather than
the first address past the last byte.  This avoids overflow
when the last page of the address space is involved.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1528
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/cpu-all.h |  2 +-
 accel/tcg/user-exec.c  | 16 +++++++---------
 bsd-user/mmap.c        |  6 +++---
 linux-user/elfload.c   | 11 ++++++-----
 linux-user/mmap.c      | 16 ++++++++--------
 linux-user/syscall.c   |  4 ++--
 6 files changed, 27 insertions(+), 28 deletions(-)

diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index 7ef6b9a94d..748764459c 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -285,7 +285,7 @@ typedef int (*walk_memory_regions_fn)(void *, target_ulong,
 int walk_memory_regions(void *, walk_memory_regions_fn);
 
 int page_get_flags(target_ulong address);
-void page_set_flags(target_ulong start, target_ulong end, int flags);
+void page_set_flags(target_ulong start, target_ulong last, int flags);
 void page_reset_target_data(target_ulong start, target_ulong end);
 int page_check_range(target_ulong start, target_ulong len, int flags);
 
diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index 7b37fd229e..035f8096b2 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -480,24 +480,22 @@ static bool pageflags_set_clear(target_ulong start, target_ulong last,
  * The flag PAGE_WRITE_ORG is positioned automatically depending
  * on PAGE_WRITE.  The mmap_lock should already be held.
  */
-void page_set_flags(target_ulong start, target_ulong end, int flags)
+void page_set_flags(target_ulong start, target_ulong last, int flags)
 {
-    target_ulong last;
     bool reset = false;
     bool inval_tb = false;
 
     /* This function should never be called with addresses outside the
        guest address space.  If this assert fires, it probably indicates
        a missing call to h2g_valid.  */
-    assert(start < end);
-    assert(end - 1 <= GUEST_ADDR_MAX);
+    assert(start <= last);
+    assert(last <= GUEST_ADDR_MAX);
     /* Only set PAGE_ANON with new mappings. */
     assert(!(flags & PAGE_ANON) || (flags & PAGE_RESET));
     assert_memory_lock();
 
-    start = start & TARGET_PAGE_MASK;
-    end = TARGET_PAGE_ALIGN(end);
-    last = end - 1;
+    start &= TARGET_PAGE_MASK;
+    last |= ~TARGET_PAGE_MASK;
 
     if (!(flags & PAGE_VALID)) {
         flags = 0;
@@ -510,7 +508,7 @@ void page_set_flags(target_ulong start, target_ulong end, int flags)
     }
 
     if (!flags || reset) {
-        page_reset_target_data(start, end);
+        page_reset_target_data(start, last + 1);
         inval_tb |= pageflags_unset(start, last);
     }
     if (flags) {
@@ -518,7 +516,7 @@ void page_set_flags(target_ulong start, target_ulong end, int flags)
                                         ~(reset ? 0 : PAGE_STICKY));
     }
     if (inval_tb) {
-        tb_invalidate_phys_range(start, end);
+        tb_invalidate_phys_range(start, last + 1);
     }
 }
 
diff --git a/bsd-user/mmap.c b/bsd-user/mmap.c
index e9a330d599..301fc63817 100644
--- a/bsd-user/mmap.c
+++ b/bsd-user/mmap.c
@@ -118,7 +118,7 @@ int target_mprotect(abi_ulong start, abi_ulong len, int prot)
         if (ret != 0)
             goto error;
     }
-    page_set_flags(start, start + len, prot | PAGE_VALID);
+    page_set_flags(start, start + len - 1, prot | PAGE_VALID);
     mmap_unlock();
     return 0;
 error:
@@ -656,7 +656,7 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, int prot,
         }
     }
  the_end1:
-    page_set_flags(start, start + len, prot | PAGE_VALID);
+    page_set_flags(start, start + len - 1, prot | PAGE_VALID);
  the_end:
 #ifdef DEBUG_MMAP
     printf("ret=0x" TARGET_ABI_FMT_lx "\n", start);
@@ -767,7 +767,7 @@ int target_munmap(abi_ulong start, abi_ulong len)
     }
 
     if (ret == 0) {
-        page_set_flags(start, start + len, 0);
+        page_set_flags(start, start + len - 1, 0);
     }
     mmap_unlock();
     return ret;
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 104c13ec77..a3431d8d62 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -212,7 +212,7 @@ static bool init_guest_commpage(void)
         exit(EXIT_FAILURE);
     }
     page_set_flags(TARGET_VSYSCALL_PAGE,
-                   TARGET_VSYSCALL_PAGE + TARGET_PAGE_SIZE,
+                   TARGET_VSYSCALL_PAGE | ~TARGET_PAGE_MASK,
                    PAGE_EXEC | PAGE_VALID);
     return true;
 }
@@ -443,7 +443,7 @@ static bool init_guest_commpage(void)
         exit(EXIT_FAILURE);
     }
 
-    page_set_flags(commpage, commpage + qemu_host_page_size,
+    page_set_flags(commpage, commpage | ~qemu_host_page_mask,
                    PAGE_READ | PAGE_EXEC | PAGE_VALID);
     return true;
 }
@@ -1315,7 +1315,7 @@ static bool init_guest_commpage(void)
         exit(EXIT_FAILURE);
     }
 
-    page_set_flags(LO_COMMPAGE, LO_COMMPAGE + TARGET_PAGE_SIZE,
+    page_set_flags(LO_COMMPAGE, LO_COMMPAGE | ~TARGET_PAGE_MASK,
                    PAGE_READ | PAGE_EXEC | PAGE_VALID);
     return true;
 }
@@ -1727,7 +1727,7 @@ static bool init_guest_commpage(void)
      * and implement syscalls.  Here, simply mark the page executable.
      * Special case the entry points during translation (see do_page_zero).
      */
-    page_set_flags(LO_COMMPAGE, LO_COMMPAGE + TARGET_PAGE_SIZE,
+    page_set_flags(LO_COMMPAGE, LO_COMMPAGE | ~TARGET_PAGE_MASK,
                    PAGE_EXEC | PAGE_VALID);
     return true;
 }
@@ -2199,7 +2199,8 @@ static void zero_bss(abi_ulong elf_bss, abi_ulong last_bss, int prot)
 
     /* Ensure that the bss page(s) are valid */
     if ((page_get_flags(last_bss-1) & prot) != prot) {
-        page_set_flags(elf_bss & TARGET_PAGE_MASK, last_bss, prot | PAGE_VALID);
+        page_set_flags(elf_bss & TARGET_PAGE_MASK, last_bss - 1,
+                       prot | PAGE_VALID);
     }
 
     if (host_start < host_map_start) {
diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index 547be8dff6..9cf85f4090 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -181,7 +181,7 @@ int target_mprotect(abi_ulong start, abi_ulong len, int target_prot)
         }
     }
 
-    page_set_flags(start, start + len, page_flags);
+    page_set_flags(start, start + len - 1, page_flags);
     ret = 0;
 
 error:
@@ -640,15 +640,15 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, int target_prot,
     }
     page_flags |= PAGE_RESET;
     if (passthrough_start == passthrough_end) {
-        page_set_flags(start, start + len, page_flags);
+        page_set_flags(start, start + len - 1, page_flags);
     } else {
         if (start < passthrough_start) {
-            page_set_flags(start, passthrough_start, page_flags);
+            page_set_flags(start, passthrough_start - 1, page_flags);
         }
-        page_set_flags(passthrough_start, passthrough_end,
+        page_set_flags(passthrough_start, passthrough_end - 1,
                        page_flags | PAGE_PASSTHROUGH);
         if (passthrough_end < start + len) {
-            page_set_flags(passthrough_end, start + len, page_flags);
+            page_set_flags(passthrough_end, start + len - 1, page_flags);
         }
     }
  the_end:
@@ -763,7 +763,7 @@ int target_munmap(abi_ulong start, abi_ulong len)
     }
 
     if (ret == 0) {
-        page_set_flags(start, start + len, 0);
+        page_set_flags(start, start + len - 1, 0);
     }
     mmap_unlock();
     return ret;
@@ -849,8 +849,8 @@ abi_long target_mremap(abi_ulong old_addr, abi_ulong old_size,
     } else {
         new_addr = h2g(host_addr);
         prot = page_get_flags(old_addr);
-        page_set_flags(old_addr, old_addr + old_size, 0);
-        page_set_flags(new_addr, new_addr + new_size,
+        page_set_flags(old_addr, old_addr + old_size - 1, 0);
+        page_set_flags(new_addr, new_addr + new_size - 1,
                        prot | PAGE_VALID | PAGE_RESET);
     }
     mmap_unlock();
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index a6c426d73c..78e14ee875 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -4583,7 +4583,7 @@ static inline abi_ulong do_shmat(CPUArchState *cpu_env,
     }
     raddr=h2g((unsigned long)host_raddr);
 
-    page_set_flags(raddr, raddr + shm_info.shm_segsz,
+    page_set_flags(raddr, raddr + shm_info.shm_segsz - 1,
                    PAGE_VALID | PAGE_RESET | PAGE_READ |
                    (shmflg & SHM_RDONLY ? 0 : PAGE_WRITE));
 
@@ -4613,7 +4613,7 @@ static inline abi_long do_shmdt(abi_ulong shmaddr)
     for (i = 0; i < N_SHM_REGIONS; ++i) {
         if (shm_regions[i].in_use && shm_regions[i].start == shmaddr) {
             shm_regions[i].in_use = false;
-            page_set_flags(shmaddr, shmaddr + shm_regions[i].size, 0);
+            page_set_flags(shmaddr, shmaddr + shm_regions[i].size - 1, 0);
             break;
         }
     }
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 5/9] accel/tcg: Pass last not end to page_reset_target_data
  2023-03-06  2:12 [PATCH 0/9] accel/tcg: Fix page_set_flags and related [#1528] Richard Henderson
                   ` (3 preceding siblings ...)
  2023-03-06  2:13 ` [PATCH 4/9] accel/tcg: Pass last not end to page_set_flags Richard Henderson
@ 2023-03-06  2:13 ` Richard Henderson
  2023-03-06  7:50   ` Philippe Mathieu-Daudé
  2023-03-06  2:13 ` [PATCH 6/9] accel/tcg: Pass last not end to PAGE_FOR_EACH_TB Richard Henderson
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2023-03-06  2:13 UTC (permalink / raw)
  To: qemu-devel

Pass the address of the last byte to be changed, rather than
the first address past the last byte.  This avoids overflow
when the last page of the address space is involved.
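
As a minimal sketch (not QEMU code; the page size, mask, and 32-bit types below are illustrative), this is why an exclusive "end" cannot represent the final page while an inclusive "last" can:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only: a 4 KiB page on a 32-bit guest. */
#define TARGET_PAGE_BITS 12
#define TARGET_PAGE_SIZE ((uint32_t)1 << TARGET_PAGE_BITS)
#define TARGET_PAGE_MASK ((uint32_t)-1 << TARGET_PAGE_BITS)

/* Exclusive end of the page containing addr: wraps to 0 on the last page. */
static uint32_t page_end(uint32_t addr)
{
    return (addr & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
}

/* Inclusive last byte of the page containing addr: always representable. */
static uint32_t page_last(uint32_t addr)
{
    return addr | ~TARGET_PAGE_MASK;
}
```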

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/cpu-all.h |  2 +-
 accel/tcg/user-exec.c  | 11 +++++------
 linux-user/mmap.c      |  2 +-
 3 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index 748764459c..a8cb4c905d 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -286,7 +286,7 @@ int walk_memory_regions(void *, walk_memory_regions_fn);
 
 int page_get_flags(target_ulong address);
 void page_set_flags(target_ulong start, target_ulong last, int flags);
-void page_reset_target_data(target_ulong start, target_ulong end);
+void page_reset_target_data(target_ulong start, target_ulong last);
 int page_check_range(target_ulong start, target_ulong len, int flags);
 
 /**
diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index 035f8096b2..20b6fc2f6e 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -508,7 +508,7 @@ void page_set_flags(target_ulong start, target_ulong last, int flags)
     }
 
     if (!flags || reset) {
-        page_reset_target_data(start, last + 1);
+        page_reset_target_data(start, last);
         inval_tb |= pageflags_unset(start, last);
     }
     if (flags) {
@@ -814,15 +814,14 @@ typedef struct TargetPageDataNode {
 
 static IntervalTreeRoot targetdata_root;
 
-void page_reset_target_data(target_ulong start, target_ulong end)
+void page_reset_target_data(target_ulong start, target_ulong last)
 {
     IntervalTreeNode *n, *next;
-    target_ulong last;
 
     assert_memory_lock();
 
-    start = start & TARGET_PAGE_MASK;
-    last = TARGET_PAGE_ALIGN(end) - 1;
+    start &= TARGET_PAGE_MASK;
+    last |= ~TARGET_PAGE_MASK;
 
     for (n = interval_tree_iter_first(&targetdata_root, start, last),
          next = n ? interval_tree_iter_next(n, start, last) : NULL;
@@ -885,7 +884,7 @@ void *page_get_target_data(target_ulong address)
     return t->data[(page - region) >> TARGET_PAGE_BITS];
 }
 #else
-void page_reset_target_data(target_ulong start, target_ulong end) { }
+void page_reset_target_data(target_ulong start, target_ulong last) { }
 #endif /* TARGET_PAGE_DATA_SIZE */
 
 /* The softmmu versions of these helpers are in cputlb.c.  */
diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index 9cf85f4090..c153277afb 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -946,7 +946,7 @@ abi_long target_madvise(abi_ulong start, abi_ulong len_in, int advice)
         if (can_passthrough_madvise(start, end)) {
             ret = get_errno(madvise(g2h_untagged(start), len, advice));
             if ((advice == MADV_DONTNEED) && (ret == 0)) {
-                page_reset_target_data(start, start + len);
+                page_reset_target_data(start, start + len - 1);
             }
         }
     }
-- 
2.34.1




* [PATCH 6/9] accel/tcg: Pass last not end to PAGE_FOR_EACH_TB
  2023-03-06  2:12 [PATCH 0/9] accel/tcg: Fix page_set_flags and related [#1528] Richard Henderson
                   ` (4 preceding siblings ...)
  2023-03-06  2:13 ` [PATCH 5/9] accel/tcg: Pass last not end to page_reset_target_data Richard Henderson
@ 2023-03-06  2:13 ` Richard Henderson
  2023-03-06  7:52   ` Philippe Mathieu-Daudé
  2023-03-06  2:13 ` [PATCH 7/9] accel/tcg: Pass last not end to page_collection_lock Richard Henderson
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2023-03-06  2:13 UTC (permalink / raw)
  To: qemu-devel

Pass the address of the last byte to be changed, rather than
the first address past the last byte.  This avoids overflow
when the last page of the address space is involved.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/tb-maint.c | 28 ++++++++++++++++------------
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/accel/tcg/tb-maint.c b/accel/tcg/tb-maint.c
index efefa08ee1..745912e60a 100644
--- a/accel/tcg/tb-maint.c
+++ b/accel/tcg/tb-maint.c
@@ -125,29 +125,29 @@ static void tb_remove(TranslationBlock *tb)
 }
 
 /* TODO: For now, still shared with translate-all.c for system mode. */
-#define PAGE_FOR_EACH_TB(start, end, pagedesc, T, N)    \
-    for (T = foreach_tb_first(start, end),              \
-         N = foreach_tb_next(T, start, end);            \
+#define PAGE_FOR_EACH_TB(start, last, pagedesc, T, N)   \
+    for (T = foreach_tb_first(start, last),             \
+         N = foreach_tb_next(T, start, last);           \
          T != NULL;                                     \
-         T = N, N = foreach_tb_next(N, start, end))
+         T = N, N = foreach_tb_next(N, start, last))
 
 typedef TranslationBlock *PageForEachNext;
 
 static PageForEachNext foreach_tb_first(tb_page_addr_t start,
-                                        tb_page_addr_t end)
+                                        tb_page_addr_t last)
 {
-    IntervalTreeNode *n = interval_tree_iter_first(&tb_root, start, end - 1);
+    IntervalTreeNode *n = interval_tree_iter_first(&tb_root, start, last);
     return n ? container_of(n, TranslationBlock, itree) : NULL;
 }
 
 static PageForEachNext foreach_tb_next(PageForEachNext tb,
                                        tb_page_addr_t start,
-                                       tb_page_addr_t end)
+                                       tb_page_addr_t last)
 {
     IntervalTreeNode *n;
 
     if (tb) {
-        n = interval_tree_iter_next(&tb->itree, start, end - 1);
+        n = interval_tree_iter_next(&tb->itree, start, last);
         if (n) {
             return container_of(n, TranslationBlock, itree);
         }
@@ -318,7 +318,7 @@ struct page_collection {
 };
 
 typedef int PageForEachNext;
-#define PAGE_FOR_EACH_TB(start, end, pagedesc, tb, n) \
+#define PAGE_FOR_EACH_TB(start, last, pagedesc, tb, n) \
     TB_FOR_EACH_TAGGED((pagedesc)->first_tb, tb, n, page_next)
 
 #ifdef CONFIG_DEBUG_TCG
@@ -993,10 +993,11 @@ void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t end)
 {
     TranslationBlock *tb;
     PageForEachNext n;
+    tb_page_addr_t last = end - 1;
 
     assert_memory_lock();
 
-    PAGE_FOR_EACH_TB(start, end, unused, tb, n) {
+    PAGE_FOR_EACH_TB(start, last, unused, tb, n) {
         tb_phys_invalidate__locked(tb);
     }
 }
@@ -1028,6 +1029,7 @@ bool tb_invalidate_phys_page_unwind(tb_page_addr_t addr, uintptr_t pc)
     bool current_tb_modified;
     TranslationBlock *tb;
     PageForEachNext n;
+    tb_page_addr_t last;
 
     /*
      * Without precise smc semantics, or when outside of a TB,
@@ -1044,10 +1046,11 @@ bool tb_invalidate_phys_page_unwind(tb_page_addr_t addr, uintptr_t pc)
     assert_memory_lock();
     current_tb = tcg_tb_lookup(pc);
 
+    last = addr | ~TARGET_PAGE_MASK;
     addr &= TARGET_PAGE_MASK;
     current_tb_modified = false;
 
-    PAGE_FOR_EACH_TB(addr, addr + TARGET_PAGE_SIZE, unused, tb, n) {
+    PAGE_FOR_EACH_TB(addr, last, unused, tb, n) {
         if (current_tb == tb &&
             (tb_cflags(current_tb) & CF_COUNT_MASK) != 1) {
             /*
@@ -1089,12 +1092,13 @@ tb_invalidate_phys_page_range__locked(struct page_collection *pages,
     bool current_tb_modified = false;
     TranslationBlock *current_tb = retaddr ? tcg_tb_lookup(retaddr) : NULL;
 #endif /* TARGET_HAS_PRECISE_SMC */
+    tb_page_addr_t last G_GNUC_UNUSED = end - 1;
 
     /*
      * We remove all the TBs in the range [start, end[.
      * XXX: see if in some cases it could be faster to invalidate all the code
      */
-    PAGE_FOR_EACH_TB(start, end, p, tb, n) {
+    PAGE_FOR_EACH_TB(start, last, p, tb, n) {
         /* NOTE: this is subtle as a TB may span two physical pages */
         if (n == 0) {
             /* NOTE: tb_end may be after the end of the page, but
-- 
2.34.1




* [PATCH 7/9] accel/tcg: Pass last not end to page_collection_lock
  2023-03-06  2:12 [PATCH 0/9] accel/tcg: Fix page_set_flags and related [#1528] Richard Henderson
                   ` (5 preceding siblings ...)
  2023-03-06  2:13 ` [PATCH 6/9] accel/tcg: Pass last not end to PAGE_FOR_EACH_TB Richard Henderson
@ 2023-03-06  2:13 ` Richard Henderson
  2023-03-06  7:53   ` Philippe Mathieu-Daudé
  2023-03-06  2:13 ` [PATCH 8/9] accel/tcg: Pass last not end to tb_invalidate_phys_page_range__locked Richard Henderson
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2023-03-06  2:13 UTC (permalink / raw)
  To: qemu-devel

Pass the address of the last byte to be changed, rather than
the first address past the last byte.  This avoids overflow
when the last page of the address space is involved.

Fixes a bug in the loop comparison where "<= end" would lock
one more page than required.
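
To illustrate the off-by-one (a hypothetical sketch, not the QEMU code; page size is illustrative): with exclusive-end semantics, shifting both bounds and looping "index <= end" counts one extra page whenever end is page-aligned, while the inclusive-last form is exact.

```c
#include <assert.h>
#include <stdint.h>

#define TARGET_PAGE_BITS 12

/* Pages visited by the old loop over an exclusive range [start, end):
 * shifting end and using "<= end" overcounts when end is page-aligned. */
static uint32_t pages_locked_old(uint32_t start, uint32_t end)
{
    return (end >> TARGET_PAGE_BITS) - (start >> TARGET_PAGE_BITS) + 1;
}

/* Pages visited over an inclusive range [start, last]: exact. */
static uint32_t pages_locked_new(uint32_t start, uint32_t last)
{
    return (last >> TARGET_PAGE_BITS) - (start >> TARGET_PAGE_BITS) + 1;
}
```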

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/tb-maint.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/accel/tcg/tb-maint.c b/accel/tcg/tb-maint.c
index 745912e60a..c4e15c5591 100644
--- a/accel/tcg/tb-maint.c
+++ b/accel/tcg/tb-maint.c
@@ -509,20 +509,20 @@ static gint tb_page_addr_cmp(gconstpointer ap, gconstpointer bp, gpointer udata)
 }
 
 /*
- * Lock a range of pages ([@start,@end[) as well as the pages of all
+ * Lock a range of pages ([@start,@last]) as well as the pages of all
  * intersecting TBs.
  * Locking order: acquire locks in ascending order of page index.
  */
 static struct page_collection *page_collection_lock(tb_page_addr_t start,
-                                                    tb_page_addr_t end)
+                                                    tb_page_addr_t last)
 {
     struct page_collection *set = g_malloc(sizeof(*set));
     tb_page_addr_t index;
     PageDesc *pd;
 
     start >>= TARGET_PAGE_BITS;
-    end   >>= TARGET_PAGE_BITS;
-    g_assert(start <= end);
+    last >>= TARGET_PAGE_BITS;
+    g_assert(start <= last);
 
     set->tree = g_tree_new_full(tb_page_addr_cmp, NULL, NULL,
                                 page_entry_destroy);
@@ -532,7 +532,7 @@ static struct page_collection *page_collection_lock(tb_page_addr_t start,
  retry:
     g_tree_foreach(set->tree, page_entry_lock, NULL);
 
-    for (index = start; index <= end; index++) {
+    for (index = start; index <= last; index++) {
         TranslationBlock *tb;
         PageForEachNext n;
 
@@ -1152,7 +1152,7 @@ tb_invalidate_phys_page_range__locked(struct page_collection *pages,
 void tb_invalidate_phys_page(tb_page_addr_t addr)
 {
     struct page_collection *pages;
-    tb_page_addr_t start, end;
+    tb_page_addr_t start, last;
     PageDesc *p;
 
     p = page_find(addr >> TARGET_PAGE_BITS);
@@ -1161,9 +1161,9 @@ void tb_invalidate_phys_page(tb_page_addr_t addr)
     }
 
     start = addr & TARGET_PAGE_MASK;
-    end = start + TARGET_PAGE_SIZE;
-    pages = page_collection_lock(start, end);
-    tb_invalidate_phys_page_range__locked(pages, p, start, end, 0);
+    last = addr | ~TARGET_PAGE_MASK;
+    pages = page_collection_lock(start, last);
+    tb_invalidate_phys_page_range__locked(pages, p, start, last + 1, 0);
     page_collection_unlock(pages);
 }
 
@@ -1179,7 +1179,7 @@ void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t end)
     struct page_collection *pages;
     tb_page_addr_t next;
 
-    pages = page_collection_lock(start, end);
+    pages = page_collection_lock(start, end - 1);
     for (next = (start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
          start < end;
          start = next, next += TARGET_PAGE_SIZE) {
@@ -1224,7 +1224,7 @@ void tb_invalidate_phys_range_fast(ram_addr_t ram_addr,
 {
     struct page_collection *pages;
 
-    pages = page_collection_lock(ram_addr, ram_addr + size);
+    pages = page_collection_lock(ram_addr, ram_addr + size - 1);
     tb_invalidate_phys_page_fast__locked(pages, ram_addr, size, retaddr);
     page_collection_unlock(pages);
 }
-- 
2.34.1




* [PATCH 8/9] accel/tcg: Pass last not end to tb_invalidate_phys_page_range__locked
  2023-03-06  2:12 [PATCH 0/9] accel/tcg: Fix page_set_flags and related [#1528] Richard Henderson
                   ` (6 preceding siblings ...)
  2023-03-06  2:13 ` [PATCH 7/9] accel/tcg: Pass last not end to page_collection_lock Richard Henderson
@ 2023-03-06  2:13 ` Richard Henderson
  2023-03-06  2:13 ` [PATCH 9/9] accel/tcg: Pass last not end to tb_invalidate_phys_range Richard Henderson
  2023-03-07  3:19 ` [PATCH 0/9] accel/tcg: Fix page_set_flags and related [#1528] Joel Stanley
  9 siblings, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2023-03-06  2:13 UTC (permalink / raw)
  To: qemu-devel

Pass the address of the last byte to be changed, rather than
the first address past the last byte.  This avoids overflow
when the last page of the address space is involved.

Properly truncate tb_last to the end of the page; the old comment
claiming that tb_end past the end of the page is not a problem
no longer holds once overflow is considered.
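
With inclusive bounds the intersection test becomes "!(tb_last < start || tb_start > last)", which stays correct even for a range ending at the very top of the address space. A standalone sketch (not QEMU code) of that inclusive-interval overlap check:

```c
#include <assert.h>
#include <stdint.h>

/* Inclusive-interval overlap, as used for TB vs. invalidation range:
 * safe at the top of the address space because neither bound needs
 * to be one past the last byte. */
static int ranges_overlap(uint32_t a_first, uint32_t a_last,
                          uint32_t b_first, uint32_t b_last)
{
    return !(a_last < b_first || a_first > b_last);
}
```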

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/tb-maint.c | 26 ++++++++++++--------------
 1 file changed, 12 insertions(+), 14 deletions(-)

diff --git a/accel/tcg/tb-maint.c b/accel/tcg/tb-maint.c
index c4e15c5591..a93c4c3ef7 100644
--- a/accel/tcg/tb-maint.c
+++ b/accel/tcg/tb-maint.c
@@ -1082,35 +1082,33 @@ bool tb_invalidate_phys_page_unwind(tb_page_addr_t addr, uintptr_t pc)
 static void
 tb_invalidate_phys_page_range__locked(struct page_collection *pages,
                                       PageDesc *p, tb_page_addr_t start,
-                                      tb_page_addr_t end,
+                                      tb_page_addr_t last,
                                       uintptr_t retaddr)
 {
     TranslationBlock *tb;
-    tb_page_addr_t tb_start, tb_end;
     PageForEachNext n;
 #ifdef TARGET_HAS_PRECISE_SMC
     bool current_tb_modified = false;
     TranslationBlock *current_tb = retaddr ? tcg_tb_lookup(retaddr) : NULL;
 #endif /* TARGET_HAS_PRECISE_SMC */
-    tb_page_addr_t last G_GNUC_UNUSED = end - 1;
 
     /*
-     * We remove all the TBs in the range [start, end[.
+     * We remove all the TBs in the range [start, last].
      * XXX: see if in some cases it could be faster to invalidate all the code
      */
     PAGE_FOR_EACH_TB(start, last, p, tb, n) {
+        tb_page_addr_t tb_start, tb_last;
+
         /* NOTE: this is subtle as a TB may span two physical pages */
+        tb_start = tb_page_addr0(tb);
+        tb_last = tb_start + tb->size - 1;
         if (n == 0) {
-            /* NOTE: tb_end may be after the end of the page, but
-               it is not a problem */
-            tb_start = tb_page_addr0(tb);
-            tb_end = tb_start + tb->size;
+            tb_last = MIN(tb_last, tb_start | ~TARGET_PAGE_MASK);
         } else {
             tb_start = tb_page_addr1(tb);
-            tb_end = tb_start + ((tb_page_addr0(tb) + tb->size)
-                                 & ~TARGET_PAGE_MASK);
+            tb_last = tb_start + (tb_last & ~TARGET_PAGE_MASK);
         }
-        if (!(tb_end <= start || tb_start >= end)) {
+        if (!(tb_last < start || tb_start > last)) {
 #ifdef TARGET_HAS_PRECISE_SMC
             if (current_tb == tb &&
                 (tb_cflags(current_tb) & CF_COUNT_MASK) != 1) {
@@ -1163,7 +1161,7 @@ void tb_invalidate_phys_page(tb_page_addr_t addr)
     start = addr & TARGET_PAGE_MASK;
     last = addr | ~TARGET_PAGE_MASK;
     pages = page_collection_lock(start, last);
-    tb_invalidate_phys_page_range__locked(pages, p, start, last + 1, 0);
+    tb_invalidate_phys_page_range__locked(pages, p, start, last, 0);
     page_collection_unlock(pages);
 }
 
@@ -1190,7 +1188,7 @@ void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t end)
             continue;
         }
         assert_page_locked(pd);
-        tb_invalidate_phys_page_range__locked(pages, pd, start, bound, 0);
+        tb_invalidate_phys_page_range__locked(pages, pd, start, bound - 1, 0);
     }
     page_collection_unlock(pages);
 }
@@ -1210,7 +1208,7 @@ static void tb_invalidate_phys_page_fast__locked(struct page_collection *pages,
     }
 
     assert_page_locked(p);
-    tb_invalidate_phys_page_range__locked(pages, p, start, start + len, ra);
+    tb_invalidate_phys_page_range__locked(pages, p, start, start + len - 1, ra);
 }
 
 /*
-- 
2.34.1




* [PATCH 9/9] accel/tcg: Pass last not end to tb_invalidate_phys_range
  2023-03-06  2:12 [PATCH 0/9] accel/tcg: Fix page_set_flags and related [#1528] Richard Henderson
                   ` (7 preceding siblings ...)
  2023-03-06  2:13 ` [PATCH 8/9] accel/tcg: Pass last not end to tb_invalidate_phys_page_range__locked Richard Henderson
@ 2023-03-06  2:13 ` Richard Henderson
  2023-03-06  7:58   ` Philippe Mathieu-Daudé
  2023-03-07  3:19 ` [PATCH 0/9] accel/tcg: Fix page_set_flags and related [#1528] Joel Stanley
  9 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2023-03-06  2:13 UTC (permalink / raw)
  To: qemu-devel

Pass the address of the last byte to be changed, rather than
the first address past the last byte.  This avoids overflow
when the last page of the address space is involved.
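
The new per-page loop can be sketched as follows (a hypothetical illustration, not the QEMU code; page size, MIN macro, and walk_bytes are stand-ins): each page's bound is clamped to the inclusive last, and the loop terminates on the last page index without any exclusive bound that could wrap.

```c
#include <assert.h>
#include <stdint.h>

#define TARGET_PAGE_BITS 12
#define TARGET_PAGE_MASK ((uint32_t)-1 << TARGET_PAGE_BITS)
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Count bytes covered by walking [start, last] one page at a time,
 * clamping each per-page bound to last, as the new loop does. */
static uint64_t walk_bytes(uint32_t start, uint32_t last)
{
    uint64_t total = 0;
    uint32_t index_last = last >> TARGET_PAGE_BITS;

    for (uint32_t index = start >> TARGET_PAGE_BITS; ; index++) {
        uint32_t lo = index << TARGET_PAGE_BITS;
        uint32_t bound = MIN(lo | ~TARGET_PAGE_MASK, last);

        if (lo < start) {
            lo = start;  /* first page may start mid-page */
        }
        total += (uint64_t)(bound - lo) + 1;
        if (index == index_last) {
            break;       /* no "index <= end" bound that could wrap */
        }
    }
    return total;
}
```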

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/exec-all.h   |  2 +-
 accel/tcg/tb-maint.c      | 31 ++++++++++++++++---------------
 accel/tcg/translate-all.c |  2 +-
 accel/tcg/user-exec.c     |  2 +-
 softmmu/physmem.c         |  2 +-
 5 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index e09254333d..58d37276d9 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -679,7 +679,7 @@ void tb_invalidate_phys_addr(AddressSpace *as, hwaddr addr, MemTxAttrs attrs);
 #endif
 void tb_flush(CPUState *cpu);
 void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr);
-void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t end);
+void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t last);
 void tb_set_jmp_target(TranslationBlock *tb, int n, uintptr_t addr);
 
 /* GETPC is the true target of the return instruction that we'll execute.  */
diff --git a/accel/tcg/tb-maint.c b/accel/tcg/tb-maint.c
index a93c4c3ef7..19f88fd048 100644
--- a/accel/tcg/tb-maint.c
+++ b/accel/tcg/tb-maint.c
@@ -989,11 +989,10 @@ TranslationBlock *tb_link_page(TranslationBlock *tb, tb_page_addr_t phys_pc,
  * Called with mmap_lock held for user-mode emulation.
  * NOTE: this function must not be called while a TB is running.
  */
-void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t end)
+void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t last)
 {
     TranslationBlock *tb;
     PageForEachNext n;
-    tb_page_addr_t last = end - 1;
 
     assert_memory_lock();
 
@@ -1009,11 +1008,11 @@ void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t end)
  */
 void tb_invalidate_phys_page(tb_page_addr_t addr)
 {
-    tb_page_addr_t start, end;
+    tb_page_addr_t start, last;
 
     start = addr & TARGET_PAGE_MASK;
-    end = start + TARGET_PAGE_SIZE;
-    tb_invalidate_phys_range(start, end);
+    last = addr | ~TARGET_PAGE_MASK;
+    tb_invalidate_phys_range(start, last);
 }
 
 /*
@@ -1167,28 +1166,30 @@ void tb_invalidate_phys_page(tb_page_addr_t addr)
 
 /*
  * Invalidate all TBs which intersect with the target physical address range
- * [start;end[. NOTE: start and end may refer to *different* physical pages.
+ * [start;last]. NOTE: start and end may refer to *different* physical pages.
  * 'is_cpu_write_access' should be true if called from a real cpu write
  * access: the virtual CPU will exit the current TB if code is modified inside
  * this TB.
  */
-void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t end)
+void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t last)
 {
     struct page_collection *pages;
-    tb_page_addr_t next;
+    tb_page_addr_t index, index_last;
 
-    pages = page_collection_lock(start, end - 1);
-    for (next = (start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
-         start < end;
-         start = next, next += TARGET_PAGE_SIZE) {
-        PageDesc *pd = page_find(start >> TARGET_PAGE_BITS);
-        tb_page_addr_t bound = MIN(next, end);
+    pages = page_collection_lock(start, last);
+
+    index_last = last >> TARGET_PAGE_BITS;
+    for (index = start >> TARGET_PAGE_BITS; index <= index_last; index++) {
+        PageDesc *pd = page_find(index);
+        tb_page_addr_t bound;
 
         if (pd == NULL) {
             continue;
         }
         assert_page_locked(pd);
-        tb_invalidate_phys_page_range__locked(pages, pd, start, bound - 1, 0);
+        bound = (index << TARGET_PAGE_BITS) | ~TARGET_PAGE_MASK;
+        bound = MIN(bound, last);
+        tb_invalidate_phys_page_range__locked(pages, pd, start, bound, 0);
     }
     page_collection_unlock(pages);
 }
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 4b5abc0f44..4500d78a16 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -570,7 +570,7 @@ void tb_check_watchpoint(CPUState *cpu, uintptr_t retaddr)
         cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
         addr = get_page_addr_code(env, pc);
         if (addr != -1) {
-            tb_invalidate_phys_range(addr, addr + 1);
+            tb_invalidate_phys_range(addr, addr);
         }
     }
 }
diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index 20b6fc2f6e..a7e0c3e2f4 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -516,7 +516,7 @@ void page_set_flags(target_ulong start, target_ulong last, int flags)
                                         ~(reset ? 0 : PAGE_STICKY));
     }
     if (inval_tb) {
-        tb_invalidate_phys_range(start, last + 1);
+        tb_invalidate_phys_range(start, last);
     }
 }
 
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 47143edb4f..abebf5b963 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -2521,7 +2521,7 @@ static void invalidate_and_set_dirty(MemoryRegion *mr, hwaddr addr,
     }
     if (dirty_log_mask & (1 << DIRTY_MEMORY_CODE)) {
         assert(tcg_enabled());
-        tb_invalidate_phys_range(addr, addr + length);
+        tb_invalidate_phys_range(addr, addr + length - 1);
         dirty_log_mask &= ~(1 << DIRTY_MEMORY_CODE);
     }
     cpu_physical_memory_set_dirty_range(addr, length, dirty_log_mask);
-- 
2.34.1




* Re: [PATCH 2/9] linux-user: Rename max_reserved_va in main
  2023-03-06  2:13 ` [PATCH 2/9] linux-user: Rename max_reserved_va in main Richard Henderson
@ 2023-03-06  7:33   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 25+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-03-06  7:33 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 6/3/23 03:13, Richard Henderson wrote:
> Rename to local_max_va, to avoid a conflict with the next patch.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   linux-user/main.c | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>




* Re: [PATCH 4/9] accel/tcg: Pass last not end to page_set_flags
  2023-03-06  2:13 ` [PATCH 4/9] accel/tcg: Pass last not end to page_set_flags Richard Henderson
@ 2023-03-06  7:49   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 25+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-03-06  7:49 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 6/3/23 03:13, Richard Henderson wrote:
> Pass the address of the last byte to be changed, rather than
> the first address past the last byte.  This avoids overflow
> when the last page of the address space is involved.
> 
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1528
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   include/exec/cpu-all.h |  2 +-
>   accel/tcg/user-exec.c  | 16 +++++++---------
>   bsd-user/mmap.c        |  6 +++---
>   linux-user/elfload.c   | 11 ++++++-----
>   linux-user/mmap.c      | 16 ++++++++--------
>   linux-user/syscall.c   |  4 ++--
>   6 files changed, 27 insertions(+), 28 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>




* Re: [PATCH 5/9] accel/tcg: Pass last not end to page_reset_target_data
  2023-03-06  2:13 ` [PATCH 5/9] accel/tcg: Pass last not end to page_reset_target_data Richard Henderson
@ 2023-03-06  7:50   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 25+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-03-06  7:50 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 6/3/23 03:13, Richard Henderson wrote:
> Pass the address of the last byte to be changed, rather than
> the first address past the last byte.  This avoids overflow
> when the last page of the address space is involved.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   include/exec/cpu-all.h |  2 +-
>   accel/tcg/user-exec.c  | 11 +++++------
>   linux-user/mmap.c      |  2 +-
>   3 files changed, 7 insertions(+), 8 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>




* Re: [PATCH 6/9] accel/tcg: Pass last not end to PAGE_FOR_EACH_TB
  2023-03-06  2:13 ` [PATCH 6/9] accel/tcg: Pass last not end to PAGE_FOR_EACH_TB Richard Henderson
@ 2023-03-06  7:52   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 25+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-03-06  7:52 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 6/3/23 03:13, Richard Henderson wrote:
> Pass the address of the last byte to be changed, rather than
> the first address past the last byte.  This avoids overflow
> when the last page of the address space is involved.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   accel/tcg/tb-maint.c | 28 ++++++++++++++++------------
>   1 file changed, 16 insertions(+), 12 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>




* Re: [PATCH 7/9] accel/tcg: Pass last not end to page_collection_lock
  2023-03-06  2:13 ` [PATCH 7/9] accel/tcg: Pass last not end to page_collection_lock Richard Henderson
@ 2023-03-06  7:53   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 25+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-03-06  7:53 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 6/3/23 03:13, Richard Henderson wrote:
> Pass the address of the last byte to be changed, rather than
> the first address past the last byte.  This avoids overflow
> when the last page of the address space is involved.
> 
> Fixes a bug in the loop comparison where "<= end" would lock
> one more page than required.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   accel/tcg/tb-maint.c | 22 +++++++++++-----------
>   1 file changed, 11 insertions(+), 11 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>




* Re: [PATCH 9/9] accel/tcg: Pass last not end to tb_invalidate_phys_range
  2023-03-06  2:13 ` [PATCH 9/9] accel/tcg: Pass last not end to tb_invalidate_phys_range Richard Henderson
@ 2023-03-06  7:58   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 25+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-03-06  7:58 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

On 6/3/23 03:13, Richard Henderson wrote:
> Pass the address of the last byte to be changed, rather than
> the first address past the last byte.  This avoids overflow
> when the last page of the address space is involved.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   include/exec/exec-all.h   |  2 +-
>   accel/tcg/tb-maint.c      | 31 ++++++++++++++++---------------
>   accel/tcg/translate-all.c |  2 +-
>   accel/tcg/user-exec.c     |  2 +-
>   softmmu/physmem.c         |  2 +-
>   5 files changed, 20 insertions(+), 19 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>




* Re: [PATCH 1/9] linux-user: Diagnose incorrect -R size
  2023-03-06  2:12 ` [PATCH 1/9] linux-user: Diagnose incorrect -R size Richard Henderson
@ 2023-03-06 12:56   ` Peter Maydell
  2023-03-06 21:24     ` Richard Henderson
  0 siblings, 1 reply; 25+ messages in thread
From: Peter Maydell @ 2023-03-06 12:56 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel

On Mon, 6 Mar 2023 at 02:14, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Zero is the value for 'off', and should not be used with -R.
> We have been enforcing host page alignment for the non-R
> fallback of MAX_RESERVED_VA, but failing to enforce for -R.

I'm pretty sure we have users who specifically use "-R 0" to
ask for "definitely turn off any reserved VA".
Here's a random example from an old gcc bug report:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60681
and somebody using it via the environment variable:
https://www.openembedded.org/pipermail/openembedded-core/2015-January/100572.html

thanks
-- PMM



* Re: [PATCH 1/9] linux-user: Diagnose incorrect -R size
  2023-03-06 12:56   ` Peter Maydell
@ 2023-03-06 21:24     ` Richard Henderson
  2023-03-07 10:12       ` Peter Maydell
  0 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2023-03-06 21:24 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel

On 3/6/23 04:56, Peter Maydell wrote:
> On Mon, 6 Mar 2023 at 02:14, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>> Zero is the value for 'off', and should not be used with -R.
>> We have been enforcing host page alignment for the non-R
>> fallback of MAX_RESERVED_VA, but failing to enforce for -R.
> 
> I'm pretty sure we have users who specifically use "-R 0" to
> ask for "definitely turn off any reserved VA".
> Here's a random example from an old gcc bug report:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60681
> and somebody using it via the environment variable:
> https://www.openembedded.org/pipermail/openembedded-core/2015-January/100572.html

Odd.

Well, it won't actually have the effect of "definitely turn off"; it will merely
leave things at the default, which *will* enable reserved va for 32-bit guests
on 64-bit hosts.

The only remaining question is whether we diagnose this oddness or silently accept it.  It 
feels like someone playing with options they don't actually understand and an error is 
warranted.


r~



* Re: [PATCH 0/9] accel/tcg: Fix page_set_flags and related [#1528]
  2023-03-06  2:12 [PATCH 0/9] accel/tcg: Fix page_set_flags and related [#1528] Richard Henderson
                   ` (8 preceding siblings ...)
  2023-03-06  2:13 ` [PATCH 9/9] accel/tcg: Pass last not end to tb_invalidate_phys_range Richard Henderson
@ 2023-03-07  3:19 ` Joel Stanley
  2023-03-07 13:55   ` Ninad Palsule
  9 siblings, 1 reply; 25+ messages in thread
From: Joel Stanley @ 2023-03-07  3:19 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, ninad, Andrew Geissler

On Mon, 6 Mar 2023 at 02:14, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> The primary issue is that of overflow, where "end" for the last
> page of the 32-bit address space overflows to 0.  The fix is to
> use "last" instead, which can always be represented.
>
> This requires that we adjust reserved_va as well, because of
>
> -/*
> - * There are a number of places where we assign reserved_va to a variable
> - * of type abi_ulong and expect it to fit.  Avoid the last page.
> - */
> -#   define MAX_RESERVED_VA  (0xfffffffful & TARGET_PAGE_MASK)
>
> and the related
>
> -        /*
> -         * reserved_va must be aligned with the host page size
> -         * as it is used with mmap()
> -         */
> -        reserved_va = local_max_va & qemu_host_page_mask;
>
> whereby we avoided the final (host | guest) page of the address space
> because of said overflow.  With the change in representation, we can
> always use UINT32_MAX as the end of the 32-bit address space.
>
> This was observable on ppc64le (or any other 64k page host) not being
> able to load any arm32 binary, because the COMMPAGE goes at 0xffff0000,
> which violated that last host page problem above.
>
> The issue is resolved in patch 4, but the rest clean up other interfaces
> with the same issue.  I'm not touching any interfaces that use start+len
> instead of start+end.

Thanks for looking at this Richard. I gave it a spin on a ppc64le host
and it resolved the assert.

Tested-by: Joel Stanley <joel@jms.id.au>

Cheers,

Joel



* Re: [PATCH 1/9] linux-user: Diagnose incorrect -R size
  2023-03-06 21:24     ` Richard Henderson
@ 2023-03-07 10:12       ` Peter Maydell
  2023-03-07 10:17         ` Peter Maydell
  0 siblings, 1 reply; 25+ messages in thread
From: Peter Maydell @ 2023-03-07 10:12 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel

On Mon, 6 Mar 2023 at 21:24, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 3/6/23 04:56, Peter Maydell wrote:
> > On Mon, 6 Mar 2023 at 02:14, Richard Henderson
> > <richard.henderson@linaro.org> wrote:
> >>
> >> Zero is the value for 'off', and should not be used with -R.
> >> We have been enforcing host page alignment for the non-R
> >> fallback of MAX_RESERVED_VA, but failing to enforce for -R.
> >
> > I'm pretty sure we have users who specifically use "-R 0" to
> > ask for "definitely turn off any reserved VA".
> > Here's a random example from an old gcc bug report:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60681
> > and somebody using it via the environment variable:
> > https://www.openembedded.org/pipermail/openembedded-core/2015-January/100572.html
>
> Odd.
>
> Well, it won't actually have the effect of "definitely turn off", it will merely leave
> things as per the default, which *will* enable reserved va for 32-bit guests on 64-bit hosts.
>
> The only remaining question is whether we diagnose this oddness or silently accept it.  It
> feels like someone playing with options they don't actually understand and an error is
> warranted.

I'm pretty sure I've issued the advice "turn off the reserved
area stuff with -R 0" in the past, for working around various
QEMU bugs where it wasn't able to allocate the whole reserved
area it wanted to but the guest program didn't actually care.

thanks
-- PMM



* Re: [PATCH 1/9] linux-user: Diagnose incorrect -R size
  2023-03-07 10:12       ` Peter Maydell
@ 2023-03-07 10:17         ` Peter Maydell
  2023-03-17 14:46           ` Richard Henderson
  0 siblings, 1 reply; 25+ messages in thread
From: Peter Maydell @ 2023-03-07 10:17 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel

On Tue, 7 Mar 2023 at 10:12, Peter Maydell <peter.maydell@linaro.org> wrote:
>
> On Mon, 6 Mar 2023 at 21:24, Richard Henderson
> <richard.henderson@linaro.org> wrote:
> >
> > On 3/6/23 04:56, Peter Maydell wrote:
> > > On Mon, 6 Mar 2023 at 02:14, Richard Henderson
> > > <richard.henderson@linaro.org> wrote:
> > >>
> > >> Zero is the value for 'off', and should not be used with -R.
> > >> We have been enforcing host page alignment for the non-R
> > >> fallback of MAX_RESERVED_VA, but failing to enforce for -R.
> > >
> > > I'm pretty sure we have users who specifically use "-R 0" to
> > > ask for "definitely turn off any reserved VA".
> > > Here's a random example from an old gcc bug report:
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60681
> > > and somebody using it via the environment variable:
> > > https://www.openembedded.org/pipermail/openembedded-core/2015-January/100572.html
> >
> > Odd.
> >
> > Well, it won't actually have the effect of "definitely turn off", it will merely leave
> > things as per the default, which *will* enable reserved va for 32-bit guests on 64-bit hosts.
> >
> > The only remaining question is whether we diagnose this oddness or silently accept it.  It
> > feels like someone playing with options they don't actually understand and an error is
> > warranted.
>
> I'm pretty sure I've issued the advice "turn off the reserved
> area stuff with -R 0" in the past, for working around various
> QEMU bugs where it wasn't able to allocate the whole reserved
> area it wanted to but the guest program didn't actually care.

It looks like we (inadvertently) broke "-R 0 means turn off"
in 2019 with commit dc18baaef36d95e5; prior to that the
64-on-32 default was set by the initial value of the global
variable and could be overridden on the command line. After
that we ended up doing the default-value stuff after the
command line was parsed instead.

thanks
-- PMM



* Re: [PATCH 0/9] accel/tcg: Fix page_set_flags and related [#1528]
  2023-03-07  3:19 ` [PATCH 0/9] accel/tcg: Fix page_set_flags and related [#1528] Joel Stanley
@ 2023-03-07 13:55   ` Ninad Palsule
  0 siblings, 0 replies; 25+ messages in thread
From: Ninad Palsule @ 2023-03-07 13:55 UTC (permalink / raw)
  To: Joel Stanley, Richard Henderson; +Cc: qemu-devel, ninad, Andrew Geissler


On 3/6/23 9:19 PM, Joel Stanley wrote:
> On Mon, 6 Mar 2023 at 02:14, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>> The primary issue is that of overflow, where "end" for the last
>> page of the 32-bit address space overflows to 0.  The fix is to
>> use "last" instead, which can always be represented.
>>
>> This requires that we adjust reserved_va as well, because of
>>
>> -/*
>> - * There are a number of places where we assign reserved_va to a variable
>> - * of type abi_ulong and expect it to fit.  Avoid the last page.
>> - */
>> -#   define MAX_RESERVED_VA  (0xfffffffful & TARGET_PAGE_MASK)
>>
>> and the related
>>
>> -        /*
>> -         * reserved_va must be aligned with the host page size
>> -         * as it is used with mmap()
>> -         */
>> -        reserved_va = local_max_va & qemu_host_page_mask;
>>
>> whereby we avoided the final (host | guest) page of the address space
>> because of said overflow.  With the change in representation, we can
>> always use UINT32_MAX as the end of the 32-bit address space.
>>
>> This was observable on ppc64le (or any other 64k page host) not being
>> able to load any arm32 binary, because the COMMPAGE goes at 0xffff0000,
>> which violated that last host page problem above.
>>
>> The issue is resolved in patch 4, but the rest clean up other interfaces
>> with the same issue.  I'm not touching any interfaces that use start+len
>> instead of start+end.

Richard, I tested it on a ppc64le host and the fix is working.

Tested-by: Ninad Palsule <ninad@linux.ibm.com>

Thx,

Ninad Palsule




* Re: [PATCH 1/9] linux-user: Diagnose incorrect -R size
  2023-03-07 10:17         ` Peter Maydell
@ 2023-03-17 14:46           ` Richard Henderson
  2023-03-17 14:57             ` Peter Maydell
  0 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2023-03-17 14:46 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel

On 3/7/23 02:17, Peter Maydell wrote:
> It looks like we (inadvertently) broke "-R 0 means turn off"
> in 2019 with commit dc18baaef36d95e5; prior to that the
> 64-on-32 default was set by the initial value of the global
> variable and could be overridden on the command line. After
> that we ended up doing the default-value stuff after the
> command line was parsed instead.

(Not 64-on-32, but 32-on-64.)

I don't understand how 32-on-64 would ever work without reserved_va.  The host kernel 
would otherwise place mmap blocks anywhere it chooses, which may not be within 4GB of any 
given guest_base.

I would not characterize that patch as "inadvertently broke" but "fixed bug but didn't 
record that fact in the commit message".


r~




* Re: [PATCH 1/9] linux-user: Diagnose incorrect -R size
  2023-03-17 14:46           ` Richard Henderson
@ 2023-03-17 14:57             ` Peter Maydell
  2023-03-17 15:05               ` Peter Maydell
  0 siblings, 1 reply; 25+ messages in thread
From: Peter Maydell @ 2023-03-17 14:57 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel

On Fri, 17 Mar 2023 at 14:46, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 3/7/23 02:17, Peter Maydell wrote:
> > It looks like we (inadvertently) broke "-R 0 means turn off"
> > in 2019 with commit dc18baaef36d95e5; prior to that the
> > 64-on-32 default was set by the initial value of the global
> > variable and could be overridden on the command line. After
> > that we ended up doing the default-value stuff after the
> > command line was parsed instead.
>
> (Not 64-on-32, but 32-on-64.)
>
> I don't understand how 32-on-64 would ever work without reserved_va.  The host kernel
> would otherwise place mmap blocks anywhere it chooses, which may not be within 4GB of any
> given guest_base.

I think most of the use cases weren't doing mmap of any
kind. The gcc test suite is one example of that.

-- PMM



* Re: [PATCH 1/9] linux-user: Diagnose incorrect -R size
  2023-03-17 14:57             ` Peter Maydell
@ 2023-03-17 15:05               ` Peter Maydell
  0 siblings, 0 replies; 25+ messages in thread
From: Peter Maydell @ 2023-03-17 15:05 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel

On Fri, 17 Mar 2023 at 14:57, Peter Maydell <peter.maydell@linaro.org> wrote:
>
> On Fri, 17 Mar 2023 at 14:46, Richard Henderson
> <richard.henderson@linaro.org> wrote:
> >
> > On 3/7/23 02:17, Peter Maydell wrote:
> > > It looks like we (inadvertently) broke "-R 0 means turn off"
> > > in 2019 with commit dc18baaef36d95e5; prior to that the
> > > 64-on-32 default was set by the initial value of the global
> > > variable and could be overridden on the command line. After
> > > that we ended up doing the default-value stuff after the
> > > command line was parsed instead.
> >
> > (Not 64-on-32, but 32-on-64.)
> >
> > I don't understand how 32-on-64 would ever work without reserved_va.  The host kernel
> > would otherwise place mmap blocks anywhere it chooses, which may not be within 4GB of any
> > given guest_base.
>
> I think most of the use cases weren't doing mmap of any
> kind. The gcc test suite is one example of that.

...but in any case, looking at the linux-user/mmap.c
code it doesn't let the kernel give it any old host
address, even in the no-reserved_va code path:
mmap_find_vma() calls mmap() with a hint address it wants
the kernel to try, and it refuses to use addresses which
aren't reachable by the guest (as defined by h2g_valid()).
So as long as the guest program isn't a really heavy
mmap user it will be fine even with a 0 reserved_va.

thanks
-- PMM



end of thread, other threads:[~2023-03-17 15:06 UTC | newest]

Thread overview: 25+ messages
2023-03-06  2:12 [PATCH 0/9] accel/tcg: Fix page_set_flags and related [#1528] Richard Henderson
2023-03-06  2:12 ` [PATCH 1/9] linux-user: Diagnose incorrect -R size Richard Henderson
2023-03-06 12:56   ` Peter Maydell
2023-03-06 21:24     ` Richard Henderson
2023-03-07 10:12       ` Peter Maydell
2023-03-07 10:17         ` Peter Maydell
2023-03-17 14:46           ` Richard Henderson
2023-03-17 14:57             ` Peter Maydell
2023-03-17 15:05               ` Peter Maydell
2023-03-06  2:13 ` [PATCH 2/9] linux-user: Rename max_reserved_va in main Richard Henderson
2023-03-06  7:33   ` Philippe Mathieu-Daudé
2023-03-06  2:13 ` [PATCH 3/9] include/exec: Replace reserved_va with max_reserved_va Richard Henderson
2023-03-06  2:13 ` [PATCH 4/9] accel/tcg: Pass last not end to page_set_flags Richard Henderson
2023-03-06  7:49   ` Philippe Mathieu-Daudé
2023-03-06  2:13 ` [PATCH 5/9] accel/tcg: Pass last not end to page_reset_target_data Richard Henderson
2023-03-06  7:50   ` Philippe Mathieu-Daudé
2023-03-06  2:13 ` [PATCH 6/9] accel/tcg: Pass last not end to PAGE_FOR_EACH_TB Richard Henderson
2023-03-06  7:52   ` Philippe Mathieu-Daudé
2023-03-06  2:13 ` [PATCH 7/9] accel/tcg: Pass last not end to page_collection_lock Richard Henderson
2023-03-06  7:53   ` Philippe Mathieu-Daudé
2023-03-06  2:13 ` [PATCH 8/9] accel/tcg: Pass last not end to tb_invalidate_phys_page_range__locked Richard Henderson
2023-03-06  2:13 ` [PATCH 9/9] accel/tcg: Pass last not end to tb_invalidate_phys_range Richard Henderson
2023-03-06  7:58   ` Philippe Mathieu-Daudé
2023-03-07  3:19 ` [PATCH 0/9] accel/tcg: Fix page_set_flags and related [#1528] Joel Stanley
2023-03-07 13:55   ` Ninad Palsule
