qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PULL 00/15] tcg patch queue
@ 2021-06-04 20:11 Richard Henderson
  2021-06-05 15:11 ` Peter Maydell
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Henderson @ 2021-06-04 20:11 UTC (permalink / raw)
  To: qemu-devel

The following changes since commit 1cbd2d914939ee6028e9688d4ba859a528c28405:

  Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging (2021-06-04 13:38:49 +0100)

are available in the Git repository at:

  https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20210604

for you to fetch changes up to 0006039e29b9e6118beab300146f7c4931f7a217:

  tcg/arm: Implement TCG_TARGET_HAS_rotv_vec (2021-06-04 11:50:11 -0700)

----------------------------------------------------------------
Host vector support for arm neon.

----------------------------------------------------------------
Richard Henderson (15):
      tcg: Change parameters for tcg_target_const_match
      tcg/arm: Add host vector framework
      tcg/arm: Implement tcg_out_ld/st for vector types
      tcg/arm: Implement tcg_out_mov for vector types
      tcg/arm: Implement tcg_out_dup*_vec
      tcg/arm: Implement minimal vector operations
      tcg/arm: Implement andc, orc, abs, neg, not vector operations
      tcg/arm: Implement TCG_TARGET_HAS_shi_vec
      tcg/arm: Implement TCG_TARGET_HAS_mul_vec
      tcg/arm: Implement TCG_TARGET_HAS_sat_vec
      tcg/arm: Implement TCG_TARGET_HAS_minmax_vec
      tcg/arm: Implement TCG_TARGET_HAS_bitsel_vec
      tcg/arm: Implement TCG_TARGET_HAS_shv_vec
      tcg/arm: Implement TCG_TARGET_HAS_roti_vec
      tcg/arm: Implement TCG_TARGET_HAS_rotv_vec

 tcg/arm/tcg-target-con-set.h |  10 +
 tcg/arm/tcg-target-con-str.h |   3 +
 tcg/arm/tcg-target.h         |  52 ++-
 tcg/arm/tcg-target.opc.h     |  16 +
 tcg/tcg.c                    |   5 +-
 tcg/aarch64/tcg-target.c.inc |   5 +-
 tcg/arm/tcg-target.c.inc     | 956 +++++++++++++++++++++++++++++++++++++++++--
 tcg/i386/tcg-target.c.inc    |   4 +-
 tcg/mips/tcg-target.c.inc    |   5 +-
 tcg/ppc/tcg-target.c.inc     |   4 +-
 tcg/riscv/tcg-target.c.inc   |   4 +-
 tcg/s390/tcg-target.c.inc    |   5 +-
 tcg/sparc/tcg-target.c.inc   |   5 +-
 tcg/tci/tcg-target.c.inc     |   6 +-
 14 files changed, 1001 insertions(+), 79 deletions(-)
 create mode 100644 tcg/arm/tcg-target.opc.h


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PULL 00/15] tcg patch queue
  2021-06-04 20:11 Richard Henderson
@ 2021-06-05 15:11 ` Peter Maydell
  0 siblings, 0 replies; 27+ messages in thread
From: Peter Maydell @ 2021-06-05 15:11 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers

On Fri, 4 Jun 2021 at 21:14, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> The following changes since commit 1cbd2d914939ee6028e9688d4ba859a528c28405:
>
>   Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging (2021-06-04 13:38:49 +0100)
>
> are available in the Git repository at:
>
>   https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20210604
>
> for you to fetch changes up to 0006039e29b9e6118beab300146f7c4931f7a217:
>
>   tcg/arm: Implement TCG_TARGET_HAS_rotv_vec (2021-06-04 11:50:11 -0700)
>
> ----------------------------------------------------------------
> Host vector support for arm neon.
>
> ----------------------------------------------------------------
> Richard Henderson (15):
>       tcg: Change parameters for tcg_target_const_match
>       tcg/arm: Add host vector framework
>       tcg/arm: Implement tcg_out_ld/st for vector types
>       tcg/arm: Implement tcg_out_mov for vector types
>       tcg/arm: Implement tcg_out_dup*_vec
>       tcg/arm: Implement minimal vector operations
>       tcg/arm: Implement andc, orc, abs, neg, not vector operations
>       tcg/arm: Implement TCG_TARGET_HAS_shi_vec
>       tcg/arm: Implement TCG_TARGET_HAS_mul_vec
>       tcg/arm: Implement TCG_TARGET_HAS_sat_vec
>       tcg/arm: Implement TCG_TARGET_HAS_minmax_vec
>       tcg/arm: Implement TCG_TARGET_HAS_bitsel_vec
>       tcg/arm: Implement TCG_TARGET_HAS_shv_vec
>       tcg/arm: Implement TCG_TARGET_HAS_roti_vec
>       tcg/arm: Implement TCG_TARGET_HAS_rotv_vec


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/6.1
for any user-visible changes.

-- PMM


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PULL 00/15] tcg patch queue
@ 2021-10-13 18:22 Richard Henderson
  2021-10-13 19:55 ` Richard Henderson
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Henderson @ 2021-10-13 18:22 UTC (permalink / raw)
  To: qemu-devel

The following changes since commit ee26ce674a93c824713542cec3b6a9ca85459165:

  Merge remote-tracking branch 'remotes/jsnow/tags/python-pull-request' into staging (2021-10-12 16:08:33 -0700)

are available in the Git repository at:

  https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20211013

for you to fetch changes up to 76e366e728549b3324cc2dee6745d6a4f1af18e6:

  tcg: Canonicalize alignment flags in MemOp (2021-10-13 09:14:35 -0700)

----------------------------------------------------------------
Use MO_128 for 16-byte atomic memory operations.
Add cpu_ld/st_mmu memory primitives.
Move helper_ld/st memory helpers out of tcg.h.
Canonicalize alignment flags in MemOp.

----------------------------------------------------------------
BALATON Zoltan (1):
      memory: Log access direction for invalid accesses

Richard Henderson (14):
      target/arm: Use MO_128 for 16 byte atomics
      target/i386: Use MO_128 for 16 byte atomics
      target/ppc: Use MO_128 for 16 byte atomics
      target/s390x: Use MO_128 for 16 byte atomics
      target/hexagon: Implement cpu_mmu_index
      accel/tcg: Add cpu_{ld,st}*_mmu interfaces
      accel/tcg: Move cpu_atomic decls to exec/cpu_ldst.h
      target/mips: Use cpu_*_data_ra for msa load/store
      target/mips: Use 8-byte memory ops for msa load/store
      target/s390x: Use cpu_*_mmu instead of helper_*_mmu
      target/sparc: Use cpu_*_mmu instead of helper_*_mmu
      target/arm: Use cpu_*_mmu instead of helper_*_mmu
      tcg: Move helper_*_mmu decls to tcg/tcg-ldst.h
      tcg: Canonicalize alignment flags in MemOp

 docs/devel/loads-stores.rst   |  52 +++++-
 include/exec/cpu_ldst.h       | 332 ++++++++++++++++++-----------------
 include/tcg/tcg-ldst.h        |  74 ++++++++
 include/tcg/tcg.h             | 158 -----------------
 target/hexagon/cpu.h          |   9 +
 accel/tcg/cputlb.c            | 393 ++++++++++++++----------------------------
 accel/tcg/user-exec.c         | 385 +++++++++++++++++------------------------
 softmmu/memory.c              |  20 +--
 target/arm/helper-a64.c       |  61 ++-----
 target/arm/m_helper.c         |   6 +-
 target/i386/tcg/mem_helper.c  |   2 +-
 target/m68k/op_helper.c       |   1 -
 target/mips/tcg/msa_helper.c  | 389 ++++++++++-------------------------------
 target/ppc/mem_helper.c       |   1 -
 target/ppc/translate.c        |  12 +-
 target/s390x/tcg/mem_helper.c |  13 +-
 target/sparc/ldst_helper.c    |  14 +-
 tcg/tcg-op.c                  |   7 +-
 tcg/tcg.c                     |   1 +
 tcg/tci.c                     |   1 +
 accel/tcg/ldst_common.c.inc   | 307 +++++++++++++++++++++++++++++++++
 21 files changed, 1032 insertions(+), 1206 deletions(-)
 create mode 100644 include/tcg/tcg-ldst.h
 create mode 100644 accel/tcg/ldst_common.c.inc


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PULL 00/15] tcg patch queue
  2021-10-13 18:22 Richard Henderson
@ 2021-10-13 19:55 ` Richard Henderson
  0 siblings, 0 replies; 27+ messages in thread
From: Richard Henderson @ 2021-10-13 19:55 UTC (permalink / raw)
  To: qemu-devel

On 10/13/21 11:22 AM, Richard Henderson wrote:
> The following changes since commit ee26ce674a93c824713542cec3b6a9ca85459165:
> 
>    Merge remote-tracking branch 'remotes/jsnow/tags/python-pull-request' into staging (2021-10-12 16:08:33 -0700)
> 
> are available in the Git repository at:
> 
>    https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20211013
> 
> for you to fetch changes up to 76e366e728549b3324cc2dee6745d6a4f1af18e6:
> 
>    tcg: Canonicalize alignment flags in MemOp (2021-10-13 09:14:35 -0700)
> 
> ----------------------------------------------------------------
> Use MO_128 for 16-byte atomic memory operations.
> Add cpu_ld/st_mmu memory primitives.
> Move helper_ld/st memory helpers out of tcg.h.
> Canonicalize alignment flags in MemOp.
> 
> ----------------------------------------------------------------
> BALATON Zoltan (1):
>        memory: Log access direction for invalid accesses
> 
> Richard Henderson (14):
>        target/arm: Use MO_128 for 16 byte atomics
>        target/i386: Use MO_128 for 16 byte atomics
>        target/ppc: Use MO_128 for 16 byte atomics
>        target/s390x: Use MO_128 for 16 byte atomics
>        target/hexagon: Implement cpu_mmu_index
>        accel/tcg: Add cpu_{ld,st}*_mmu interfaces
>        accel/tcg: Move cpu_atomic decls to exec/cpu_ldst.h
>        target/mips: Use cpu_*_data_ra for msa load/store
>        target/mips: Use 8-byte memory ops for msa load/store
>        target/s390x: Use cpu_*_mmu instead of helper_*_mmu
>        target/sparc: Use cpu_*_mmu instead of helper_*_mmu
>        target/arm: Use cpu_*_mmu instead of helper_*_mmu
>        tcg: Move helper_*_mmu decls to tcg/tcg-ldst.h
>        tcg: Canonicalize alignment flags in MemOp
> 
>   docs/devel/loads-stores.rst   |  52 +++++-
>   include/exec/cpu_ldst.h       | 332 ++++++++++++++++++-----------------
>   include/tcg/tcg-ldst.h        |  74 ++++++++
>   include/tcg/tcg.h             | 158 -----------------
>   target/hexagon/cpu.h          |   9 +
>   accel/tcg/cputlb.c            | 393 ++++++++++++++----------------------------
>   accel/tcg/user-exec.c         | 385 +++++++++++++++++------------------------
>   softmmu/memory.c              |  20 +--
>   target/arm/helper-a64.c       |  61 ++-----
>   target/arm/m_helper.c         |   6 +-
>   target/i386/tcg/mem_helper.c  |   2 +-
>   target/m68k/op_helper.c       |   1 -
>   target/mips/tcg/msa_helper.c  | 389 ++++++++++-------------------------------
>   target/ppc/mem_helper.c       |   1 -
>   target/ppc/translate.c        |  12 +-
>   target/s390x/tcg/mem_helper.c |  13 +-
>   target/sparc/ldst_helper.c    |  14 +-
>   tcg/tcg-op.c                  |   7 +-
>   tcg/tcg.c                     |   1 +
>   tcg/tci.c                     |   1 +
>   accel/tcg/ldst_common.c.inc   | 307 +++++++++++++++++++++++++++++++++
>   21 files changed, 1032 insertions(+), 1206 deletions(-)
>   create mode 100644 include/tcg/tcg-ldst.h
>   create mode 100644 accel/tcg/ldst_common.c.inc

Applied, thanks.

r~



^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PULL 00/15] tcg patch queue
@ 2023-03-28 22:57 Richard Henderson
  2023-03-28 22:57 ` [PULL 01/15] util: import GTree as QTree Richard Henderson
                   ` (16 more replies)
  0 siblings, 17 replies; 27+ messages in thread
From: Richard Henderson @ 2023-03-28 22:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell

The following changes since commit d37158bb2425e7ebffb167d611be01f1e9e6c86f:

  Update version for v8.0.0-rc2 release (2023-03-28 20:43:21 +0100)

are available in the Git repository at:

  https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20230328

for you to fetch changes up to 87e303de70f93bf700f58412fb9b2c3ec918c4b5:

  softmmu: Restore use of CPU watchpoint for all accelerators (2023-03-28 15:24:06 -0700)

----------------------------------------------------------------
Use a local version of GTree [#285]
Fix page_set_flags vs the last page of the address space [#1528]
Re-enable gdbstub breakpoints under KVM

----------------------------------------------------------------
Emilio Cota (2):
      util: import GTree as QTree
      tcg: use QTree instead of GTree

Philippe Mathieu-Daudé (3):
      softmmu: Restrict cpu_check_watchpoint / address_matches to TCG accel
      softmmu/watchpoint: Add missing 'qemu/error-report.h' include
      softmmu: Restore use of CPU watchpoint for all accelerators

Richard Henderson (10):
      linux-user: Diagnose misaligned -R size
      accel/tcg: Pass last not end to page_set_flags
      accel/tcg: Pass last not end to page_reset_target_data
      accel/tcg: Pass last not end to PAGE_FOR_EACH_TB
      accel/tcg: Pass last not end to page_collection_lock
      accel/tcg: Pass last not end to tb_invalidate_phys_page_range__locked
      accel/tcg: Pass last not end to tb_invalidate_phys_range
      linux-user: Pass last not end to probe_guest_base
      include/exec: Change reserved_va semantics to last byte
      linux-user/arm: Take more care allocating commpage

 configure                     |   15 +
 meson.build                   |    4 +
 include/exec/cpu-all.h        |   15 +-
 include/exec/exec-all.h       |    2 +-
 include/hw/core/cpu.h         |   39 +-
 include/hw/core/tcg-cpu-ops.h |   43 ++
 include/qemu/qtree.h          |  201 ++++++
 linux-user/arm/target_cpu.h   |    2 +-
 linux-user/user-internals.h   |   12 +-
 accel/tcg/tb-maint.c          |  112 ++--
 accel/tcg/translate-all.c     |    2 +-
 accel/tcg/user-exec.c         |   25 +-
 bsd-user/main.c               |   10 +-
 bsd-user/mmap.c               |   10 +-
 linux-user/elfload.c          |   72 ++-
 linux-user/flatload.c         |    2 +-
 linux-user/main.c             |   31 +-
 linux-user/mmap.c             |   22 +-
 linux-user/syscall.c          |    4 +-
 softmmu/physmem.c             |    2 +-
 softmmu/watchpoint.c          |    5 +
 target/arm/tcg/mte_helper.c   |    1 +
 target/arm/tcg/sve_helper.c   |    1 +
 target/s390x/tcg/mem_helper.c |    1 +
 tcg/region.c                  |   19 +-
 tests/bench/qtree-bench.c     |  286 +++++++++
 tests/unit/test-qtree.c       |  333 ++++++++++
 util/qtree.c                  | 1390 +++++++++++++++++++++++++++++++++++++++++
 softmmu/meson.build           |    2 +-
 tests/bench/meson.build       |    4 +
 tests/unit/meson.build        |    1 +
 util/meson.build              |    1 +
 32 files changed, 2474 insertions(+), 195 deletions(-)
 create mode 100644 include/qemu/qtree.h
 create mode 100644 tests/bench/qtree-bench.c
 create mode 100644 tests/unit/test-qtree.c
 create mode 100644 util/qtree.c


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PULL 01/15] util: import GTree as QTree
  2023-03-28 22:57 [PULL 00/15] tcg patch queue Richard Henderson
@ 2023-03-28 22:57 ` Richard Henderson
  2023-03-28 22:57 ` [PULL 02/15] tcg: use QTree instead of GTree Richard Henderson
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 27+ messages in thread
From: Richard Henderson @ 2023-03-28 22:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, Emilio Cota

From: Emilio Cota <cota@braap.org>

The only reason to add this implementation is to control the memory allocator
used. Some users (e.g. TCG) cannot work reliably in multi-threaded
environments (e.g. forking in user-mode) with GTree's allocator, GSlice.
See https://gitlab.com/qemu-project/qemu/-/issues/285 for details.

Importing GTree is a temporary workaround until GTree migrates away
from GSlice.

This implementation is identical to that in glib v2.75.0, except that
we don't import recent additions to the API nor deprecated API calls,
none of which are used in QEMU.

I've imported tests from glib and added a benchmark just to
make sure that performance is similar. Note: it cannot be identical
because (1) we are not using GSlice, (2) we use different compilation flags
(e.g. -fPIC) and (3) we're linking statically.

$ cat /proc/cpuinfo| grep 'model name' | head -1
model name      : AMD Ryzen 7 PRO 5850U with Radeon Graphics
$ echo '0' | sudo tee /sys/devices/system/cpu/cpufreq/boost
$ tests/bench/qtree-bench

 Tree         Op      32            1024            4096          131072         1048576
------------------------------------------------------------------------------------------------
GTree     Lookup   83.23           43.08           25.31           19.40           16.22
QTree     Lookup  113.42 (1.36x)   53.83 (1.25x)   28.38 (1.12x)   17.64 (0.91x)   13.04 (0.80x)
GTree     Insert   44.23           29.37           25.83           19.49           17.03
QTree     Insert   46.87 (1.06x)   25.62 (0.87x)   24.29 (0.94x)   16.83 (0.86x)   12.97 (0.76x)
GTree     Remove   53.27           35.15           31.43           24.64           16.70
QTree     Remove   57.32 (1.08x)   41.76 (1.19x)   38.37 (1.22x)   29.30 (1.19x)   15.07 (0.90x)
GTree  RemoveAll  135.44          127.52          126.72          120.11           64.34
QTree  RemoveAll  127.15 (0.94x)  110.37 (0.87x)  107.97 (0.85x)   97.13 (0.81x)   55.10 (0.86x)
GTree   Traverse  277.71          276.09          272.78          246.72           98.47
QTree   Traverse  370.33 (1.33x)  411.97 (1.49x)  400.23 (1.47x)  262.82 (1.07x)   78.52 (0.80x)
------------------------------------------------------------------------------------------------

As a sanity check, the same benchmark when Glib's version
is >= $glib_dropped_gslice_version (i.e. QTree == GTree):

 Tree         Op      32            1024            4096          131072         1048576
------------------------------------------------------------------------------------------------
GTree     Lookup   82.72           43.09           24.18           19.73           16.09
QTree     Lookup   81.82 (0.99x)   43.10 (1.00x)   24.20 (1.00x)   19.76 (1.00x)   16.26 (1.01x)
GTree     Insert   45.07           29.62           26.34           19.90           17.18
QTree     Insert   45.72 (1.01x)   29.60 (1.00x)   26.38 (1.00x)   19.71 (0.99x)   17.20 (1.00x)
GTree     Remove   54.48           35.36           31.77           24.97           16.95
QTree     Remove   54.46 (1.00x)   35.32 (1.00x)   31.77 (1.00x)   24.91 (1.00x)   17.15 (1.01x)
GTree  RemoveAll  140.68          127.36          125.43          121.45           68.20
QTree  RemoveAll  140.65 (1.00x)  127.64 (1.00x)  125.01 (1.00x)  121.73 (1.00x)   67.06 (0.98x)
GTree   Traverse  278.68          276.05          266.75          251.65          104.93
QTree   Traverse  278.31 (1.00x)  275.78 (1.00x)  266.42 (1.00x)  247.89 (0.99x)  104.58 (1.00x)
------------------------------------------------------------------------------------------------

Signed-off-by: Emilio Cota <cota@braap.org>
Message-Id: <20230205163758.416992-2-cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 configure                 |   15 +
 meson.build               |    4 +
 include/qemu/qtree.h      |  201 ++++++
 tests/bench/qtree-bench.c |  286 ++++++++
 tests/unit/test-qtree.c   |  333 +++++++++
 util/qtree.c              | 1390 +++++++++++++++++++++++++++++++++++++
 tests/bench/meson.build   |    4 +
 tests/unit/meson.build    |    1 +
 util/meson.build          |    1 +
 9 files changed, 2235 insertions(+)
 create mode 100644 include/qemu/qtree.h
 create mode 100644 tests/bench/qtree-bench.c
 create mode 100644 tests/unit/test-qtree.c
 create mode 100644 util/qtree.c

diff --git a/configure b/configure
index 05bed4f4a1..800b5850f4 100755
--- a/configure
+++ b/configure
@@ -231,6 +231,7 @@ safe_stack=""
 use_containers="yes"
 gdb_bin=$(command -v "gdb-multiarch" || command -v "gdb")
 gdb_arches=""
+glib_has_gslice="no"
 
 if test -e "$source_path/.git"
 then
@@ -1494,6 +1495,17 @@ for i in $glib_modules; do
     fi
 done
 
+# Check whether glib has gslice, which we have to avoid for correctness.
+# TODO: remove this check and the corresponding workaround (qtree) when
+# the minimum supported glib is >= $glib_dropped_gslice_version.
+glib_dropped_gslice_version=2.75.3
+for i in $glib_modules; do
+    if ! $pkg_config --atleast-version=$glib_dropped_gslice_version $i; then
+        glib_has_gslice="yes"
+	break
+    fi
+done
+
 glib_bindir="$($pkg_config --variable=bindir glib-2.0)"
 if test -z "$glib_bindir" ; then
 	glib_bindir="$($pkg_config --variable=prefix glib-2.0)"/bin
@@ -2420,6 +2432,9 @@ echo "GLIB_CFLAGS=$glib_cflags" >> $config_host_mak
 echo "GLIB_LIBS=$glib_libs" >> $config_host_mak
 echo "GLIB_BINDIR=$glib_bindir" >> $config_host_mak
 echo "GLIB_VERSION=$($pkg_config --modversion glib-2.0)" >> $config_host_mak
+if test "$glib_has_gslice" = "yes" ; then
+    echo "HAVE_GLIB_WITH_SLICE_ALLOCATOR=y" >> $config_host_mak
+fi
 echo "QEMU_LDFLAGS=$QEMU_LDFLAGS" >> $config_host_mak
 echo "EXESUF=$EXESUF" >> $config_host_mak
 
diff --git a/meson.build b/meson.build
index 29f8644d6d..c44d05a13f 100644
--- a/meson.build
+++ b/meson.build
@@ -508,6 +508,10 @@ glib = declare_dependency(compile_args: config_host['GLIB_CFLAGS'].split(),
                           })
 # override glib dep with the configure results (for subprojects)
 meson.override_dependency('glib-2.0', glib)
+# pass down whether Glib has the slice allocator
+if config_host.has_key('HAVE_GLIB_WITH_SLICE_ALLOCATOR')
+  config_host_data.set('HAVE_GLIB_WITH_SLICE_ALLOCATOR', true)
+endif
 
 gio = not_found
 gdbus_codegen = not_found
diff --git a/include/qemu/qtree.h b/include/qemu/qtree.h
new file mode 100644
index 0000000000..69fe74b50d
--- /dev/null
+++ b/include/qemu/qtree.h
@@ -0,0 +1,201 @@
+/*
+ * GLIB - Library of useful routines for C programming
+ * Copyright (C) 1995-1997  Peter Mattis, Spencer Kimball and Josh MacDonald
+ *
+ * SPDX-License-Identifier: LGPL-2.1-or-later
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * Modified by the GLib Team and others 1997-2000.  See the AUTHORS
+ * file for a list of people on the GLib Team.  See the ChangeLog
+ * files for a list of changes.  These files are distributed with
+ * GLib at ftp://ftp.gtk.org/pub/gtk/.
+ */
+
+/*
+ * QTree is a partial import of Glib's GTree. The parts excluded correspond
+ * to API calls either deprecated (e.g. g_tree_traverse) or recently added
+ * (e.g. g_tree_search_node, added in 2.68); neither have callers in QEMU.
+ *
+ * The reason for this import is to allow us to control the memory allocator
+ * used by the tree implementation. Until Glib 2.75.3, GTree uses Glib's
+ * slice allocator, which causes problems when forking in user-mode;
+ * see https://gitlab.com/qemu-project/qemu/-/issues/285 and glib's
+ * "45b5a6c1e gslice: Remove slice allocator and use malloc() instead".
+ *
+ * TODO: remove QTree when QEMU's minimum Glib version is >= 2.75.3.
+ */
+
+#ifndef QEMU_QTREE_H
+#define QEMU_QTREE_H
+
+#include "qemu/osdep.h"
+
+#ifdef HAVE_GLIB_WITH_SLICE_ALLOCATOR
+
+typedef struct _QTree  QTree;
+
+typedef struct _QTreeNode QTreeNode;
+
+typedef gboolean (*QTraverseNodeFunc)(QTreeNode *node,
+                                      gpointer user_data);
+
+/*
+ * Balanced binary trees
+ */
+QTree *q_tree_new(GCompareFunc key_compare_func);
+QTree *q_tree_new_with_data(GCompareDataFunc key_compare_func,
+                            gpointer key_compare_data);
+QTree *q_tree_new_full(GCompareDataFunc key_compare_func,
+                       gpointer key_compare_data,
+                       GDestroyNotify key_destroy_func,
+                       GDestroyNotify value_destroy_func);
+QTree *q_tree_ref(QTree *tree);
+void q_tree_unref(QTree *tree);
+void q_tree_destroy(QTree *tree);
+void q_tree_insert(QTree *tree,
+                   gpointer key,
+                   gpointer value);
+void q_tree_replace(QTree *tree,
+                    gpointer key,
+                    gpointer value);
+gboolean q_tree_remove(QTree *tree,
+                       gconstpointer key);
+gboolean q_tree_steal(QTree *tree,
+                      gconstpointer key);
+gpointer q_tree_lookup(QTree *tree,
+                       gconstpointer key);
+gboolean q_tree_lookup_extended(QTree *tree,
+                                gconstpointer lookup_key,
+                                gpointer *orig_key,
+                                gpointer *value);
+void q_tree_foreach(QTree *tree,
+                    GTraverseFunc func,
+                    gpointer user_data);
+gpointer q_tree_search(QTree *tree,
+                       GCompareFunc search_func,
+                       gconstpointer user_data);
+gint q_tree_height(QTree *tree);
+gint q_tree_nnodes(QTree *tree);
+
+#else /* !HAVE_GLIB_WITH_SLICE_ALLOCATOR */
+
+typedef GTree QTree;
+typedef GTreeNode QTreeNode;
+typedef GTraverseNodeFunc QTraverseNodeFunc;
+
+static inline QTree *q_tree_new(GCompareFunc key_compare_func)
+{
+    return g_tree_new(key_compare_func);
+}
+
+static inline QTree *q_tree_new_with_data(GCompareDataFunc key_compare_func,
+                                          gpointer key_compare_data)
+{
+    return g_tree_new_with_data(key_compare_func, key_compare_data);
+}
+
+static inline QTree *q_tree_new_full(GCompareDataFunc key_compare_func,
+                                     gpointer key_compare_data,
+                                     GDestroyNotify key_destroy_func,
+                                     GDestroyNotify value_destroy_func)
+{
+    return g_tree_new_full(key_compare_func, key_compare_data,
+                           key_destroy_func, value_destroy_func);
+}
+
+static inline QTree *q_tree_ref(QTree *tree)
+{
+    return g_tree_ref(tree);
+}
+
+static inline void q_tree_unref(QTree *tree)
+{
+    g_tree_unref(tree);
+}
+
+static inline void q_tree_destroy(QTree *tree)
+{
+    g_tree_destroy(tree);
+}
+
+static inline void q_tree_insert(QTree *tree,
+                                 gpointer key,
+                                 gpointer value)
+{
+    g_tree_insert(tree, key, value);
+}
+
+static inline void q_tree_replace(QTree *tree,
+                                  gpointer key,
+                                  gpointer value)
+{
+    g_tree_replace(tree, key, value);
+}
+
+static inline gboolean q_tree_remove(QTree *tree,
+                                     gconstpointer key)
+{
+    return g_tree_remove(tree, key);
+}
+
+static inline gboolean q_tree_steal(QTree *tree,
+                                    gconstpointer key)
+{
+    return g_tree_steal(tree, key);
+}
+
+static inline gpointer q_tree_lookup(QTree *tree,
+                                     gconstpointer key)
+{
+    return g_tree_lookup(tree, key);
+}
+
+static inline gboolean q_tree_lookup_extended(QTree *tree,
+                                              gconstpointer lookup_key,
+                                              gpointer *orig_key,
+                                              gpointer *value)
+{
+    return g_tree_lookup_extended(tree, lookup_key, orig_key, value);
+}
+
+static inline void q_tree_foreach(QTree *tree,
+                                  GTraverseFunc func,
+                                  gpointer user_data)
+{
+    return g_tree_foreach(tree, func, user_data);
+}
+
+static inline gpointer q_tree_search(QTree *tree,
+                                     GCompareFunc search_func,
+                                     gconstpointer user_data)
+{
+    return g_tree_search(tree, search_func, user_data);
+}
+
+static inline gint q_tree_height(QTree *tree)
+{
+    return g_tree_height(tree);
+}
+
+static inline gint q_tree_nnodes(QTree *tree)
+{
+    return g_tree_nnodes(tree);
+}
+
+#endif /* HAVE_GLIB_WITH_SLICE_ALLOCATOR */
+
+#endif /* QEMU_QTREE_H */
diff --git a/tests/bench/qtree-bench.c b/tests/bench/qtree-bench.c
new file mode 100644
index 0000000000..f3d7edc76d
--- /dev/null
+++ b/tests/bench/qtree-bench.c
@@ -0,0 +1,286 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#include "qemu/osdep.h"
+#include "qemu/qtree.h"
+#include "qemu/timer.h"
+
+enum tree_op {
+    OP_LOOKUP,
+    OP_INSERT,
+    OP_REMOVE,
+    OP_REMOVE_ALL,
+    OP_TRAVERSE,
+};
+
+struct benchmark {
+    const char * const name;
+    enum tree_op op;
+    bool fill_on_init;
+};
+
+enum impl_type {
+    IMPL_GTREE,
+    IMPL_QTREE,
+};
+
+struct tree_implementation {
+    const char * const name;
+    enum impl_type type;
+};
+
+static const struct benchmark benchmarks[] = {
+    {
+        .name = "Lookup",
+        .op = OP_LOOKUP,
+        .fill_on_init = true,
+    },
+    {
+        .name = "Insert",
+        .op = OP_INSERT,
+        .fill_on_init = false,
+    },
+    {
+        .name = "Remove",
+        .op = OP_REMOVE,
+        .fill_on_init = true,
+    },
+    {
+        .name = "RemoveAll",
+        .op = OP_REMOVE_ALL,
+        .fill_on_init = true,
+    },
+    {
+        .name = "Traverse",
+        .op = OP_TRAVERSE,
+        .fill_on_init = true,
+    },
+};
+
+static const struct tree_implementation impls[] = {
+    {
+        .name = "GTree",
+        .type = IMPL_GTREE,
+    },
+    {
+        .name = "QTree",
+        .type = IMPL_QTREE,
+    },
+};
+
+static int compare_func(const void *ap, const void *bp)
+{
+    const size_t *a = ap;
+    const size_t *b = bp;
+
+    return *a - *b;
+}
+
+static void init_empty_tree_and_keys(enum impl_type impl,
+                                     void **ret_tree, size_t **ret_keys,
+                                     size_t n_elems)
+{
+    size_t *keys = g_malloc_n(n_elems, sizeof(*keys));
+    for (size_t i = 0; i < n_elems; i++) {
+        keys[i] = i;
+    }
+
+    void *tree;
+    switch (impl) {
+    case IMPL_GTREE:
+        tree = g_tree_new(compare_func);
+        break;
+    case IMPL_QTREE:
+        tree = q_tree_new(compare_func);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+
+    *ret_tree = tree;
+    *ret_keys = keys;
+}
+
+static gboolean traverse_func(gpointer key, gpointer value, gpointer data)
+{
+    return FALSE;
+}
+
+static inline void remove_all(void *tree, enum impl_type impl)
+{
+    switch (impl) {
+    case IMPL_GTREE:
+        g_tree_destroy(tree);
+        break;
+    case IMPL_QTREE:
+        q_tree_destroy(tree);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+}
+
+static int64_t run_benchmark(const struct benchmark *bench,
+                             enum impl_type impl,
+                             size_t n_elems)
+{
+    void *tree;
+    size_t *keys;
+
+    init_empty_tree_and_keys(impl, &tree, &keys, n_elems);
+    if (bench->fill_on_init) {
+        for (size_t i = 0; i < n_elems; i++) {
+            switch (impl) {
+            case IMPL_GTREE:
+                g_tree_insert(tree, &keys[i], &keys[i]);
+                break;
+            case IMPL_QTREE:
+                q_tree_insert(tree, &keys[i], &keys[i]);
+                break;
+            default:
+                g_assert_not_reached();
+            }
+        }
+    }
+
+    int64_t start_ns = get_clock();
+    switch (bench->op) {
+    case OP_LOOKUP:
+        for (size_t i = 0; i < n_elems; i++) {
+            void *value;
+            switch (impl) {
+            case IMPL_GTREE:
+                value = g_tree_lookup(tree, &keys[i]);
+                break;
+            case IMPL_QTREE:
+                value = q_tree_lookup(tree, &keys[i]);
+                break;
+            default:
+                g_assert_not_reached();
+            }
+            (void)value;
+        }
+        break;
+    case OP_INSERT:
+        for (size_t i = 0; i < n_elems; i++) {
+            switch (impl) {
+            case IMPL_GTREE:
+                g_tree_insert(tree, &keys[i], &keys[i]);
+                break;
+            case IMPL_QTREE:
+                q_tree_insert(tree, &keys[i], &keys[i]);
+                break;
+            default:
+                g_assert_not_reached();
+            }
+        }
+        break;
+    case OP_REMOVE:
+        for (size_t i = 0; i < n_elems; i++) {
+            switch (impl) {
+            case IMPL_GTREE:
+                g_tree_remove(tree, &keys[i]);
+                break;
+            case IMPL_QTREE:
+                q_tree_remove(tree, &keys[i]);
+                break;
+            default:
+                g_assert_not_reached();
+            }
+        }
+        break;
+    case OP_REMOVE_ALL:
+        remove_all(tree, impl);
+        break;
+    case OP_TRAVERSE:
+        switch (impl) {
+        case IMPL_GTREE:
+            g_tree_foreach(tree, traverse_func, NULL);
+            break;
+        case IMPL_QTREE:
+            q_tree_foreach(tree, traverse_func, NULL);
+            break;
+        default:
+            g_assert_not_reached();
+        }
+        break;
+    default:
+        g_assert_not_reached();
+    }
+    int64_t ns = get_clock() - start_ns;
+
+    if (bench->op != OP_REMOVE_ALL) {
+        remove_all(tree, impl);
+    }
+    g_free(keys);
+
+    return ns;
+}
+
+int main(int argc, char *argv[])
+{
+    size_t sizes[] = {
+        32,
+        1024,
+        1024 * 4,
+        1024 * 128,
+        1024 * 1024,
+    };
+
+    double res[ARRAY_SIZE(benchmarks)][ARRAY_SIZE(impls)][ARRAY_SIZE(sizes)];
+    for (int i = 0; i < ARRAY_SIZE(sizes); i++) {
+        size_t size = sizes[i];
+        for (int j = 0; j < ARRAY_SIZE(impls); j++) {
+            const struct tree_implementation *impl = &impls[j];
+            for (int k = 0; k < ARRAY_SIZE(benchmarks); k++) {
+                const struct benchmark *bench = &benchmarks[k];
+
+                /* warm-up run */
+                run_benchmark(bench, impl->type, size);
+
+                int64_t total_ns = 0;
+                int64_t n_runs = 0;
+                while (total_ns < 2e8 || n_runs < 5) {
+                    total_ns += run_benchmark(bench, impl->type, size);
+                    n_runs++;
+                }
+                double ns_per_run = (double)total_ns / n_runs;
+
+                /* Throughput, in Mops/s */
+                res[k][j][i] = size / ns_per_run * 1e3;
+            }
+        }
+    }
+
+    printf("# Results' breakdown: Tree, Op and #Elements. Units: Mops/s\n");
+    printf("%5s %10s ", "Tree", "Op");
+    for (int i = 0; i < ARRAY_SIZE(sizes); i++) {
+        printf("%7zu         ", sizes[i]);
+    }
+    printf("\n");
+    char separator[97];
+    for (int i = 0; i < ARRAY_SIZE(separator) - 1; i++) {
+        separator[i] = '-';
+    }
+    separator[ARRAY_SIZE(separator) - 1] = '\0';
+    printf("%s\n", separator);
+    for (int i = 0; i < ARRAY_SIZE(benchmarks); i++) {
+        for (int j = 0; j < ARRAY_SIZE(impls); j++) {
+            printf("%5s %10s ", impls[j].name, benchmarks[i].name);
+            for (int k = 0; k < ARRAY_SIZE(sizes); k++) {
+                printf("%7.2f ", res[i][j][k]);
+                if (j == 0) {
+                    printf("        ");
+                } else {
+                    if (res[i][0][k] != 0) {
+                        double speedup = res[i][j][k] / res[i][0][k];
+                        printf("(%4.2fx) ", speedup);
+                    } else {
+                        printf("(     ) ");
+                    }
+                }
+            }
+            printf("\n");
+        }
+    }
+    printf("%s\n", separator);
+    return 0;
+}
diff --git a/tests/unit/test-qtree.c b/tests/unit/test-qtree.c
new file mode 100644
index 0000000000..4d836d22c7
--- /dev/null
+++ b/tests/unit/test-qtree.c
@@ -0,0 +1,333 @@
+/*
+ * SPDX-License-Identifier: LGPL-2.1-or-later
+ *
+ * Tests for QTree.
+ * Original source: glib
+ *   https://gitlab.gnome.org/GNOME/glib/-/blob/main/glib/tests/tree.c
+ *   LGPL license.
+ *   Copyright (C) 1995-1997 Peter Mattis, Spencer Kimball and Josh MacDonald
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/qtree.h"
+
+static gint my_compare(gconstpointer a, gconstpointer b)
+{
+    const char *cha = a;
+    const char *chb = b;
+
+    return *cha - *chb;
+}
+
+static gint my_compare_with_data(gconstpointer a,
+                                 gconstpointer b,
+                                 gpointer user_data)
+{
+    const char *cha = a;
+    const char *chb = b;
+
+    /* just check that we got the right data */
+    g_assert(GPOINTER_TO_INT(user_data) == 123);
+
+    return *cha - *chb;
+}
+
+static gint my_search(gconstpointer a, gconstpointer b)
+{
+    return my_compare(b, a);
+}
+
+static gpointer destroyed_key;
+static gpointer destroyed_value;
+static guint destroyed_key_count;
+static guint destroyed_value_count;
+
+static void my_key_destroy(gpointer key)
+{
+    destroyed_key = key;
+    destroyed_key_count++;
+}
+
+static void my_value_destroy(gpointer value)
+{
+    destroyed_value = value;
+    destroyed_value_count++;
+}
+
+static gint my_traverse(gpointer key, gpointer value, gpointer data)
+{
+    char *ch = key;
+
+    g_assert((*ch) > 0);
+
+    if (*ch == 'd') {
+        return TRUE;
+    }
+
+    return FALSE;
+}
+
+char chars[] =
+    "0123456789"
+    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
+    "abcdefghijklmnopqrstuvwxyz";
+
+char chars2[] =
+    "0123456789"
+    "abcdefghijklmnopqrstuvwxyz";
+
+static gint check_order(gpointer key, gpointer value, gpointer data)
+{
+    char **p = data;
+    char *ch = key;
+
+    g_assert(**p == *ch);
+
+    (*p)++;
+
+    return FALSE;
+}
+
+static void test_tree_search(void)
+{
+    gint i;
+    QTree *tree;
+    gboolean removed;
+    gchar c;
+    gchar *p, *d;
+
+    tree = q_tree_new_with_data(my_compare_with_data, GINT_TO_POINTER(123));
+
+    for (i = 0; chars[i]; i++) {
+        q_tree_insert(tree, &chars[i], &chars[i]);
+    }
+
+    q_tree_foreach(tree, my_traverse, NULL);
+
+    g_assert(q_tree_nnodes(tree) == strlen(chars));
+    g_assert(q_tree_height(tree) == 6);
+
+    p = chars;
+    q_tree_foreach(tree, check_order, &p);
+
+    for (i = 0; i < 26; i++) {
+        removed = q_tree_remove(tree, &chars[i + 10]);
+        g_assert(removed);
+    }
+
+    c = '\0';
+    removed = q_tree_remove(tree, &c);
+    g_assert(!removed);
+
+    q_tree_foreach(tree, my_traverse, NULL);
+
+    g_assert(q_tree_nnodes(tree) == strlen(chars2));
+    g_assert(q_tree_height(tree) == 6);
+
+    p = chars2;
+    q_tree_foreach(tree, check_order, &p);
+
+    for (i = 25; i >= 0; i--) {
+        q_tree_insert(tree, &chars[i + 10], &chars[i + 10]);
+    }
+
+    p = chars;
+    q_tree_foreach(tree, check_order, &p);
+
+    c = '0';
+    p = q_tree_lookup(tree, &c);
+    g_assert(p && *p == c);
+    g_assert(q_tree_lookup_extended(tree, &c, (gpointer *)&d, (gpointer *)&p));
+    g_assert(c == *d && c == *p);
+
+    c = 'A';
+    p = q_tree_lookup(tree, &c);
+    g_assert(p && *p == c);
+
+    c = 'a';
+    p = q_tree_lookup(tree, &c);
+    g_assert(p && *p == c);
+
+    c = 'z';
+    p = q_tree_lookup(tree, &c);
+    g_assert(p && *p == c);
+
+    c = '!';
+    p = q_tree_lookup(tree, &c);
+    g_assert(p == NULL);
+
+    c = '=';
+    p = q_tree_lookup(tree, &c);
+    g_assert(p == NULL);
+
+    c = '|';
+    p = q_tree_lookup(tree, &c);
+    g_assert(p == NULL);
+
+    c = '0';
+    p = q_tree_search(tree, my_search, &c);
+    g_assert(p && *p == c);
+
+    c = 'A';
+    p = q_tree_search(tree, my_search, &c);
+    g_assert(p && *p == c);
+
+    c = 'a';
+    p = q_tree_search(tree, my_search, &c);
+    g_assert(p && *p == c);
+
+    c = 'z';
+    p = q_tree_search(tree, my_search, &c);
+    g_assert(p && *p == c);
+
+    c = '!';
+    p = q_tree_search(tree, my_search, &c);
+    g_assert(p == NULL);
+
+    c = '=';
+    p = q_tree_search(tree, my_search, &c);
+    g_assert(p == NULL);
+
+    c = '|';
+    p = q_tree_search(tree, my_search, &c);
+    g_assert(p == NULL);
+
+    q_tree_destroy(tree);
+}
+
+static void test_tree_remove(void)
+{
+    QTree *tree;
+    char c, d;
+    gint i;
+    gboolean removed;
+
+    tree = q_tree_new_full((GCompareDataFunc)my_compare, NULL,
+                           my_key_destroy,
+                           my_value_destroy);
+
+    for (i = 0; chars[i]; i++) {
+        q_tree_insert(tree, &chars[i], &chars[i]);
+    }
+
+    c = '0';
+    q_tree_insert(tree, &c, &c);
+    g_assert(destroyed_key == &c);
+    g_assert(destroyed_value == &chars[0]);
+    destroyed_key = NULL;
+    destroyed_value = NULL;
+
+    d = '1';
+    q_tree_replace(tree, &d, &d);
+    g_assert(destroyed_key == &chars[1]);
+    g_assert(destroyed_value == &chars[1]);
+    destroyed_key = NULL;
+    destroyed_value = NULL;
+
+    c = '2';
+    removed = q_tree_remove(tree, &c);
+    g_assert(removed);
+    g_assert(destroyed_key == &chars[2]);
+    g_assert(destroyed_value == &chars[2]);
+    destroyed_key = NULL;
+    destroyed_value = NULL;
+
+    c = '3';
+    removed = q_tree_steal(tree, &c);
+    g_assert(removed);
+    g_assert(destroyed_key == NULL);
+    g_assert(destroyed_value == NULL);
+
+    const gchar *remove = "omkjigfedba";
+    for (i = 0; remove[i]; i++) {
+        removed = q_tree_remove(tree, &remove[i]);
+        g_assert(removed);
+    }
+
+    q_tree_destroy(tree);
+}
+
+static void test_tree_destroy(void)
+{
+    QTree *tree;
+    gint i;
+
+    tree = q_tree_new(my_compare);
+
+    for (i = 0; chars[i]; i++) {
+        q_tree_insert(tree, &chars[i], &chars[i]);
+    }
+
+    g_assert(q_tree_nnodes(tree) == strlen(chars));
+
+    g_test_message("nnodes: %d", q_tree_nnodes(tree));
+    q_tree_ref(tree);
+    q_tree_destroy(tree);
+
+    g_test_message("nnodes: %d", q_tree_nnodes(tree));
+    g_assert(q_tree_nnodes(tree) == 0);
+
+    q_tree_unref(tree);
+}
+
+static void test_tree_insert(void)
+{
+    QTree *tree;
+    gchar *p;
+    gint i;
+    gchar *scrambled;
+
+    tree = q_tree_new(my_compare);
+
+    for (i = 0; chars[i]; i++) {
+        q_tree_insert(tree, &chars[i], &chars[i]);
+    }
+    p = chars;
+    q_tree_foreach(tree, check_order, &p);
+
+    q_tree_unref(tree);
+    tree = q_tree_new(my_compare);
+
+    for (i = strlen(chars) - 1; i >= 0; i--) {
+        q_tree_insert(tree, &chars[i], &chars[i]);
+    }
+    p = chars;
+    q_tree_foreach(tree, check_order, &p);
+
+    q_tree_unref(tree);
+    tree = q_tree_new(my_compare);
+
+    scrambled = g_strdup(chars);
+
+    for (i = 0; i < 30; i++) {
+        gchar tmp;
+        gint a, b;
+
+        a = g_random_int_range(0, strlen(scrambled));
+        b = g_random_int_range(0, strlen(scrambled));
+        tmp = scrambled[a];
+        scrambled[a] = scrambled[b];
+        scrambled[b] = tmp;
+    }
+
+    for (i = 0; scrambled[i]; i++) {
+        q_tree_insert(tree, &scrambled[i], &scrambled[i]);
+    }
+    p = chars;
+    q_tree_foreach(tree, check_order, &p);
+
+    g_free(scrambled);
+    q_tree_unref(tree);
+}
+
+int main(int argc, char *argv[])
+{
+    g_test_init(&argc, &argv, NULL);
+
+    g_test_add_func("/qtree/search", test_tree_search);
+    g_test_add_func("/qtree/remove", test_tree_remove);
+    g_test_add_func("/qtree/destroy", test_tree_destroy);
+    g_test_add_func("/qtree/insert", test_tree_insert);
+
+    return g_test_run();
+}
diff --git a/util/qtree.c b/util/qtree.c
new file mode 100644
index 0000000000..deb46c187f
--- /dev/null
+++ b/util/qtree.c
@@ -0,0 +1,1390 @@
+/*
+ * GLIB - Library of useful routines for C programming
+ * Copyright (C) 1995-1997  Peter Mattis, Spencer Kimball and Josh MacDonald
+ *
+ * SPDX-License-Identifier: LGPL-2.1-or-later
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * Modified by the GLib Team and others 1997-2000.  See the AUTHORS
+ * file for a list of people on the GLib Team.  See the ChangeLog
+ * files for a list of changes.  These files are distributed with
+ * GLib at ftp://ftp.gtk.org/pub/gtk/.
+ */
+
+/*
+ * MT safe
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/qtree.h"
+
+/**
+ * SECTION:trees-binary
+ * @title: Balanced Binary Trees
+ * @short_description: a sorted collection of key/value pairs optimized
+ *                     for searching and traversing in order
+ *
+ * The #QTree structure and its associated functions provide a sorted
+ * collection of key/value pairs optimized for searching and traversing
+ * in order. This means that most of the operations  (access, search,
+ * insertion, deletion, ...) on #QTree are O(log(n)) in average and O(n)
+ * in worst case for time complexity. But, note that maintaining a
+ * balanced sorted #QTree of n elements is done in time O(n log(n)).
+ *
+ * To create a new #QTree use q_tree_new().
+ *
+ * To insert a key/value pair into a #QTree use q_tree_insert()
+ * (O(n log(n))).
+ *
+ * To remove a key/value pair use q_tree_remove() (O(n log(n))).
+ *
+ * To look up the value corresponding to a given key, use
+ * q_tree_lookup() and q_tree_lookup_extended().
+ *
+ * To find out the number of nodes in a #QTree, use q_tree_nnodes(). To
+ * get the height of a #QTree, use q_tree_height().
+ *
+ * To traverse a #QTree, calling a function for each node visited in
+ * the traversal, use q_tree_foreach().
+ *
+ * To destroy a #QTree, use q_tree_destroy().
+ **/
+
+#define MAX_GTREE_HEIGHT 40
+
+/**
+ * QTree:
+ *
+ * The QTree struct is an opaque data structure representing a
+ * [balanced binary tree][glib-Balanced-Binary-Trees]. It should be
+ * accessed only by using the following functions.
+ */
+struct _QTree {
+    QTreeNode        *root;
+    GCompareDataFunc  key_compare;
+    GDestroyNotify    key_destroy_func;
+    GDestroyNotify    value_destroy_func;
+    gpointer          key_compare_data;
+    guint             nnodes;
+    gint              ref_count;
+};
+
+struct _QTreeNode {
+    gpointer   key;         /* key for this node */
+    gpointer   value;       /* value stored at this node */
+    QTreeNode *left;        /* left subtree */
+    QTreeNode *right;       /* right subtree */
+    gint8      balance;     /* height (right) - height (left) */
+    guint8     left_child;
+    guint8     right_child;
+};
+
+
+static QTreeNode *q_tree_node_new(gpointer       key,
+                                  gpointer       value);
+static QTreeNode *q_tree_insert_internal(QTree *tree,
+                                         gpointer key,
+                                         gpointer value,
+                                         gboolean replace);
+static gboolean   q_tree_remove_internal(QTree         *tree,
+                                         gconstpointer  key,
+                                         gboolean       steal);
+static QTreeNode *q_tree_node_balance(QTreeNode     *node);
+static QTreeNode *q_tree_find_node(QTree         *tree,
+                                   gconstpointer  key);
+static QTreeNode *q_tree_node_search(QTreeNode *node,
+                                     GCompareFunc search_func,
+                                     gconstpointer data);
+static QTreeNode *q_tree_node_rotate_left(QTreeNode     *node);
+static QTreeNode *q_tree_node_rotate_right(QTreeNode     *node);
+#ifdef Q_TREE_DEBUG
+static void       q_tree_node_check(QTreeNode     *node);
+#endif
+
+static QTreeNode*
+q_tree_node_new(gpointer key,
+                gpointer value)
+{
+    QTreeNode *node = g_new(QTreeNode, 1);
+
+    node->balance = 0;
+    node->left = NULL;
+    node->right = NULL;
+    node->left_child = FALSE;
+    node->right_child = FALSE;
+    node->key = key;
+    node->value = value;
+
+    return node;
+}
+
+/**
+ * q_tree_new:
+ * @key_compare_func: the function used to order the nodes in the #QTree.
+ *   It should return values similar to the standard strcmp() function -
+ *   0 if the two arguments are equal, a negative value if the first argument
+ *   comes before the second, or a positive value if the first argument comes
+ *   after the second.
+ *
+ * Creates a new #QTree.
+ *
+ * Returns: a newly allocated #QTree
+ */
+QTree *
+q_tree_new(GCompareFunc key_compare_func)
+{
+    g_return_val_if_fail(key_compare_func != NULL, NULL);
+
+    return q_tree_new_full((GCompareDataFunc) key_compare_func, NULL,
+                           NULL, NULL);
+}
+
+/**
+ * q_tree_new_with_data:
+ * @key_compare_func: qsort()-style comparison function
+ * @key_compare_data: data to pass to comparison function
+ *
+ * Creates a new #QTree with a comparison function that accepts user data.
+ * See q_tree_new() for more details.
+ *
+ * Returns: a newly allocated #QTree
+ */
+QTree *
+q_tree_new_with_data(GCompareDataFunc key_compare_func,
+                     gpointer         key_compare_data)
+{
+    g_return_val_if_fail(key_compare_func != NULL, NULL);
+
+    return q_tree_new_full(key_compare_func, key_compare_data,
+                           NULL, NULL);
+}
+
+/**
+ * q_tree_new_full:
+ * @key_compare_func: qsort()-style comparison function
+ * @key_compare_data: data to pass to comparison function
+ * @key_destroy_func: a function to free the memory allocated for the key
+ *   used when removing the entry from the #QTree or %NULL if you don't
+ *   want to supply such a function
+ * @value_destroy_func: a function to free the memory allocated for the
+ *   value used when removing the entry from the #QTree or %NULL if you
+ *   don't want to supply such a function
+ *
+ * Creates a new #QTree like q_tree_new() and allows to specify functions
+ * to free the memory allocated for the key and value that get called when
+ * removing the entry from the #QTree.
+ *
+ * Returns: a newly allocated #QTree
+ */
+QTree *
+q_tree_new_full(GCompareDataFunc key_compare_func,
+                gpointer         key_compare_data,
+                GDestroyNotify   key_destroy_func,
+                GDestroyNotify   value_destroy_func)
+{
+    QTree *tree;
+
+    g_return_val_if_fail(key_compare_func != NULL, NULL);
+
+    tree = g_new(QTree, 1);
+    tree->root               = NULL;
+    tree->key_compare        = key_compare_func;
+    tree->key_destroy_func   = key_destroy_func;
+    tree->value_destroy_func = value_destroy_func;
+    tree->key_compare_data   = key_compare_data;
+    tree->nnodes             = 0;
+    tree->ref_count          = 1;
+
+    return tree;
+}
+
+/**
+ * q_tree_node_first:
+ * @tree: a #QTree
+ *
+ * Returns the first in-order node of the tree, or %NULL
+ * for an empty tree.
+ *
+ * Returns: (nullable) (transfer none): the first node in the tree
+ *
+ * Since: 2.68 in GLib. Internal in Qtree, i.e. not in the public API.
+ */
+static QTreeNode *
+q_tree_node_first(QTree *tree)
+{
+    QTreeNode *tmp;
+
+    g_return_val_if_fail(tree != NULL, NULL);
+
+    if (!tree->root) {
+        return NULL;
+    }
+
+    tmp = tree->root;
+
+    while (tmp->left_child) {
+        tmp = tmp->left;
+    }
+
+    return tmp;
+}
+
+/**
+ * q_tree_node_previous
+ * @node: a #QTree node
+ *
+ * Returns the previous in-order node of the tree, or %NULL
+ * if the passed node was already the first one.
+ *
+ * Returns: (nullable) (transfer none): the previous node in the tree
+ *
+ * Since: 2.68 in GLib. Internal in Qtree, i.e. not in the public API.
+ */
+static QTreeNode *
+q_tree_node_previous(QTreeNode *node)
+{
+    QTreeNode *tmp;
+
+    g_return_val_if_fail(node != NULL, NULL);
+
+    tmp = node->left;
+
+    if (node->left_child) {
+        while (tmp->right_child) {
+            tmp = tmp->right;
+        }
+    }
+
+    return tmp;
+}
+
+/**
+ * q_tree_node_next
+ * @node: a #QTree node
+ *
+ * Returns the next in-order node of the tree, or %NULL
+ * if the passed node was already the last one.
+ *
+ * Returns: (nullable) (transfer none): the next node in the tree
+ *
+ * Since: 2.68 in GLib. Internal in Qtree, i.e. not in the public API.
+ */
+static QTreeNode *
+q_tree_node_next(QTreeNode *node)
+{
+    QTreeNode *tmp;
+
+    g_return_val_if_fail(node != NULL, NULL);
+
+    tmp = node->right;
+
+    if (node->right_child) {
+        while (tmp->left_child) {
+            tmp = tmp->left;
+        }
+    }
+
+    return tmp;
+}
+
+/**
+ * q_tree_remove_all:
+ * @tree: a #QTree
+ *
+ * Removes all nodes from a #QTree and destroys their keys and values,
+ * then resets the #QTree’s root to %NULL.
+ *
+ * Since: 2.70 in GLib. Internal in Qtree, i.e. not in the public API.
+ */
+static void
+q_tree_remove_all(QTree *tree)
+{
+    QTreeNode *node;
+    QTreeNode *next;
+
+    g_return_if_fail(tree != NULL);
+
+    node = q_tree_node_first(tree);
+
+    while (node) {
+        next = q_tree_node_next(node);
+
+        if (tree->key_destroy_func) {
+            tree->key_destroy_func(node->key);
+        }
+        if (tree->value_destroy_func) {
+            tree->value_destroy_func(node->value);
+        }
+        g_free(node);
+
+#ifdef Q_TREE_DEBUG
+        g_assert(tree->nnodes > 0);
+        tree->nnodes--;
+#endif
+
+        node = next;
+    }
+
+#ifdef Q_TREE_DEBUG
+    g_assert(tree->nnodes == 0);
+#endif
+
+    tree->root = NULL;
+#ifndef Q_TREE_DEBUG
+    tree->nnodes = 0;
+#endif
+}
+
+/**
+ * q_tree_ref:
+ * @tree: a #QTree
+ *
+ * Increments the reference count of @tree by one.
+ *
+ * It is safe to call this function from any thread.
+ *
+ * Returns: the passed in #QTree
+ *
+ * Since: 2.22
+ */
+QTree *
+q_tree_ref(QTree *tree)
+{
+    g_return_val_if_fail(tree != NULL, NULL);
+
+    g_atomic_int_inc(&tree->ref_count);
+
+    return tree;
+}
+
+/**
+ * q_tree_unref:
+ * @tree: a #QTree
+ *
+ * Decrements the reference count of @tree by one.
+ * If the reference count drops to 0, all keys and values will
+ * be destroyed (if destroy functions were specified) and all
+ * memory allocated by @tree will be released.
+ *
+ * It is safe to call this function from any thread.
+ *
+ * Since: 2.22
+ */
+void
+q_tree_unref(QTree *tree)
+{
+    g_return_if_fail(tree != NULL);
+
+    if (g_atomic_int_dec_and_test(&tree->ref_count)) {
+        q_tree_remove_all(tree);
+        g_free(tree);
+    }
+}
+
+/**
+ * q_tree_destroy:
+ * @tree: a #QTree
+ *
+ * Removes all keys and values from the #QTree and decreases its
+ * reference count by one. If keys and/or values are dynamically
+ * allocated, you should either free them first or create the #QTree
+ * using q_tree_new_full(). In the latter case the destroy functions
+ * you supplied will be called on all keys and values before destroying
+ * the #QTree.
+ */
+void
+q_tree_destroy(QTree *tree)
+{
+    g_return_if_fail(tree != NULL);
+
+    q_tree_remove_all(tree);
+    q_tree_unref(tree);
+}
+
+/**
+ * q_tree_insert_node:
+ * @tree: a #QTree
+ * @key: the key to insert
+ * @value: the value corresponding to the key
+ *
+ * Inserts a key/value pair into a #QTree.
+ *
+ * If the given key already exists in the #QTree its corresponding value
+ * is set to the new value. If you supplied a @value_destroy_func when
+ * creating the #QTree, the old value is freed using that function. If
+ * you supplied a @key_destroy_func when creating the #QTree, the passed
+ * key is freed using that function.
+ *
+ * The tree is automatically 'balanced' as new key/value pairs are added,
+ * so that the distance from the root to every leaf is as small as possible.
+ * The cost of maintaining a balanced tree while inserting new key/value
+ * result in a O(n log(n)) operation where most of the other operations
+ * are O(log(n)).
+ *
+ * Returns: (transfer none): the inserted (or set) node.
+ *
+ * Since: 2.68 in GLib. Internal in Qtree, i.e. not in the public API.
+ */
+static QTreeNode *
+q_tree_insert_node(QTree    *tree,
+                   gpointer  key,
+                   gpointer  value)
+{
+    QTreeNode *node;
+
+    g_return_val_if_fail(tree != NULL, NULL);
+
+    node = q_tree_insert_internal(tree, key, value, FALSE);
+
+#ifdef Q_TREE_DEBUG
+    q_tree_node_check(tree->root);
+#endif
+
+    return node;
+}
+
+/**
+ * q_tree_insert:
+ * @tree: a #QTree
+ * @key: the key to insert
+ * @value: the value corresponding to the key
+ *
+ * Inserts a key/value pair into a #QTree.
+ *
+ * Inserts a new key and value into a #QTree as q_tree_insert_node() does,
+ * only this function does not return the inserted or set node.
+ */
+void
+q_tree_insert(QTree    *tree,
+              gpointer  key,
+              gpointer  value)
+{
+    q_tree_insert_node(tree, key, value);
+}
+
+/**
+ * q_tree_replace_node:
+ * @tree: a #QTree
+ * @key: the key to insert
+ * @value: the value corresponding to the key
+ *
+ * Inserts a new key and value into a #QTree similar to q_tree_insert_node().
+ * The difference is that if the key already exists in the #QTree, it gets
+ * replaced by the new key. If you supplied a @value_destroy_func when
+ * creating the #QTree, the old value is freed using that function. If you
+ * supplied a @key_destroy_func when creating the #QTree, the old key is
+ * freed using that function.
+ *
+ * The tree is automatically 'balanced' as new key/value pairs are added,
+ * so that the distance from the root to every leaf is as small as possible.
+ *
+ * Returns: (transfer none): the inserted (or set) node.
+ *
+ * Since: 2.68 in GLib. Internal in Qtree, i.e. not in the public API.
+ */
+static QTreeNode *
+q_tree_replace_node(QTree    *tree,
+                    gpointer  key,
+                    gpointer  value)
+{
+    QTreeNode *node;
+
+    g_return_val_if_fail(tree != NULL, NULL);
+
+    node = q_tree_insert_internal(tree, key, value, TRUE);
+
+#ifdef Q_TREE_DEBUG
+    q_tree_node_check(tree->root);
+#endif
+
+    return node;
+}
+
+/**
+ * q_tree_replace:
+ * @tree: a #QTree
+ * @key: the key to insert
+ * @value: the value corresponding to the key
+ *
+ * Inserts a new key and value into a #QTree as q_tree_replace_node() does,
+ * only this function does not return the inserted or set node.
+ */
+void
+q_tree_replace(QTree    *tree,
+               gpointer  key,
+               gpointer  value)
+{
+    q_tree_replace_node(tree, key, value);
+}
+
+/* internal insert routine */
+static QTreeNode *
+q_tree_insert_internal(QTree    *tree,
+                       gpointer  key,
+                       gpointer  value,
+                       gboolean  replace)
+{
+    QTreeNode *node, *retnode;
+    QTreeNode *path[MAX_GTREE_HEIGHT];
+    int idx;
+
+    g_return_val_if_fail(tree != NULL, NULL);
+
+    if (!tree->root) {
+        tree->root = q_tree_node_new(key, value);
+        tree->nnodes++;
+        return tree->root;
+    }
+
+    idx = 0;
+    path[idx++] = NULL;
+    node = tree->root;
+
+    while (1) {
+        int cmp = tree->key_compare(key, node->key, tree->key_compare_data);
+
+        if (cmp == 0) {
+            if (tree->value_destroy_func) {
+                tree->value_destroy_func(node->value);
+            }
+
+            node->value = value;
+
+            if (replace) {
+                if (tree->key_destroy_func) {
+                    tree->key_destroy_func(node->key);
+                }
+
+                node->key = key;
+            } else {
+                /* free the passed key */
+                if (tree->key_destroy_func) {
+                    tree->key_destroy_func(key);
+                }
+            }
+
+            return node;
+        } else if (cmp < 0) {
+            if (node->left_child) {
+                path[idx++] = node;
+                node = node->left;
+            } else {
+                QTreeNode *child = q_tree_node_new(key, value);
+
+                child->left = node->left;
+                child->right = node;
+                node->left = child;
+                node->left_child = TRUE;
+                node->balance -= 1;
+
+                tree->nnodes++;
+
+                retnode = child;
+                break;
+            }
+        } else {
+            if (node->right_child) {
+                path[idx++] = node;
+                node = node->right;
+            } else {
+                QTreeNode *child = q_tree_node_new(key, value);
+
+                child->right = node->right;
+                child->left = node;
+                node->right = child;
+                node->right_child = TRUE;
+                node->balance += 1;
+
+                tree->nnodes++;
+
+                retnode = child;
+                break;
+            }
+        }
+    }
+
+    /*
+     * Restore balance. This is the goodness of a non-recursive
+     * implementation, when we are done with balancing we 'break'
+     * the loop and we are done.
+     */
+    while (1) {
+        QTreeNode *bparent = path[--idx];
+        gboolean left_node = (bparent && node == bparent->left);
+        g_assert(!bparent || bparent->left == node || bparent->right == node);
+
+        if (node->balance < -1 || node->balance > 1) {
+            node = q_tree_node_balance(node);
+            if (bparent == NULL) {
+                tree->root = node;
+            } else if (left_node) {
+                bparent->left = node;
+            } else {
+                bparent->right = node;
+            }
+        }
+
+        if (node->balance == 0 || bparent == NULL) {
+            break;
+        }
+
+        if (left_node) {
+            bparent->balance -= 1;
+        } else {
+            bparent->balance += 1;
+        }
+
+        node = bparent;
+    }
+
+    return retnode;
+}
+
+/**
+ * q_tree_remove:
+ * @tree: a #QTree
+ * @key: the key to remove
+ *
+ * Removes a key/value pair from a #QTree.
+ *
+ * If the #QTree was created using q_tree_new_full(), the key and value
+ * are freed using the supplied destroy functions, otherwise you have to
+ * make sure that any dynamically allocated values are freed yourself.
+ * If the key does not exist in the #QTree, the function does nothing.
+ *
+ * The cost of maintaining a balanced tree while removing a key/value
+ * result in a O(n log(n)) operation where most of the other operations
+ * are O(log(n)).
+ *
+ * Returns: %TRUE if the key was found (prior to 2.8, this function
+ *     returned nothing)
+ */
+gboolean
+q_tree_remove(QTree         *tree,
+              gconstpointer  key)
+{
+    gboolean removed;
+
+    g_return_val_if_fail(tree != NULL, FALSE);
+
+    removed = q_tree_remove_internal(tree, key, FALSE);
+
+#ifdef Q_TREE_DEBUG
+    q_tree_node_check(tree->root);
+#endif
+
+    return removed;
+}
+
+/**
+ * q_tree_steal:
+ * @tree: a #QTree
+ * @key: the key to remove
+ *
+ * Removes a key and its associated value from a #QTree without calling
+ * the key and value destroy functions.
+ *
+ * If the key does not exist in the #QTree, the function does nothing.
+ *
+ * Returns: %TRUE if the key was found (prior to 2.8, this function
+ *     returned nothing)
+ */
+gboolean
+q_tree_steal(QTree         *tree,
+             gconstpointer  key)
+{
+    gboolean removed;
+
+    g_return_val_if_fail(tree != NULL, FALSE);
+
+    removed = q_tree_remove_internal(tree, key, TRUE);
+
+#ifdef Q_TREE_DEBUG
+    q_tree_node_check(tree->root);
+#endif
+
+    return removed;
+}
+
+/* internal remove routine */
+static gboolean
+q_tree_remove_internal(QTree         *tree,
+                       gconstpointer  key,
+                       gboolean       steal)
+{
+    QTreeNode *node, *parent, *balance;
+    QTreeNode *path[MAX_GTREE_HEIGHT];
+    int idx;
+    gboolean left_node;
+
+    g_return_val_if_fail(tree != NULL, FALSE);
+
+    if (!tree->root) {
+        return FALSE;
+    }
+
+    idx = 0;
+    path[idx++] = NULL;
+    node = tree->root;
+
+    while (1) {
+        int cmp = tree->key_compare(key, node->key, tree->key_compare_data);
+
+        if (cmp == 0) {
+            break;
+        } else if (cmp < 0) {
+            if (!node->left_child) {
+                return FALSE;
+            }
+
+            path[idx++] = node;
+            node = node->left;
+        } else {
+            if (!node->right_child) {
+                return FALSE;
+            }
+
+            path[idx++] = node;
+            node = node->right;
+        }
+    }
+
+    /*
+     * The following code is almost equal to q_tree_remove_node,
+     * except that we do not have to call q_tree_node_parent.
+     */
+    balance = parent = path[--idx];
+    g_assert(!parent || parent->left == node || parent->right == node);
+    left_node = (parent && node == parent->left);
+
+    if (!node->left_child) {
+        if (!node->right_child) {
+            if (!parent) {
+                tree->root = NULL;
+            } else if (left_node) {
+                parent->left_child = FALSE;
+                parent->left = node->left;
+                parent->balance += 1;
+            } else {
+                parent->right_child = FALSE;
+                parent->right = node->right;
+                parent->balance -= 1;
+            }
+        } else {
+            /* node has a right child */
+            QTreeNode *tmp = q_tree_node_next(node);
+            tmp->left = node->left;
+
+            if (!parent) {
+                tree->root = node->right;
+            } else if (left_node) {
+                parent->left = node->right;
+                parent->balance += 1;
+            } else {
+                parent->right = node->right;
+                parent->balance -= 1;
+            }
+        }
+    } else {
+        /* node has a left child */
+        if (!node->right_child) {
+            QTreeNode *tmp = q_tree_node_previous(node);
+            tmp->right = node->right;
+
+            if (parent == NULL) {
+                tree->root = node->left;
+            } else if (left_node) {
+                parent->left = node->left;
+                parent->balance += 1;
+            } else {
+                parent->right = node->left;
+                parent->balance -= 1;
+            }
+        } else {
+            /* node has a both children (pant, pant!) */
+            QTreeNode *prev = node->left;
+            QTreeNode *next = node->right;
+            QTreeNode *nextp = node;
+            int old_idx = idx + 1;
+            idx++;
+
+            /* path[idx] == parent */
+            /* find the immediately next node (and its parent) */
+            while (next->left_child) {
+                path[++idx] = nextp = next;
+                next = next->left;
+            }
+
+            path[old_idx] = next;
+            balance = path[idx];
+
+            /* remove 'next' from the tree */
+            if (nextp != node) {
+                if (next->right_child) {
+                    nextp->left = next->right;
+                } else {
+                    nextp->left_child = FALSE;
+                }
+                nextp->balance += 1;
+
+                next->right_child = TRUE;
+                next->right = node->right;
+            } else {
+                node->balance -= 1;
+            }
+
+            /* set the prev to point to the right place */
+            while (prev->right_child) {
+                prev = prev->right;
+            }
+            prev->right = next;
+
+            /* prepare 'next' to replace 'node' */
+            next->left_child = TRUE;
+            next->left = node->left;
+            next->balance = node->balance;
+
+            if (!parent) {
+                tree->root = next;
+            } else if (left_node) {
+                parent->left = next;
+            } else {
+                parent->right = next;
+            }
+        }
+    }
+
+    /* restore balance */
+    if (balance) {
+        while (1) {
+            QTreeNode *bparent = path[--idx];
+            g_assert(!bparent ||
+                     bparent->left == balance ||
+                     bparent->right == balance);
+            left_node = (bparent && balance == bparent->left);
+
+            if (balance->balance < -1 || balance->balance > 1) {
+                balance = q_tree_node_balance(balance);
+                if (!bparent) {
+                    tree->root = balance;
+                } else if (left_node) {
+                    bparent->left = balance;
+                } else {
+                    bparent->right = balance;
+                }
+            }
+
+            if (balance->balance != 0 || !bparent) {
+                break;
+            }
+
+            if (left_node) {
+                bparent->balance += 1;
+            } else {
+                bparent->balance -= 1;
+            }
+
+            balance = bparent;
+        }
+    }
+
+    if (!steal) {
+        if (tree->key_destroy_func) {
+            tree->key_destroy_func(node->key);
+        }
+        if (tree->value_destroy_func) {
+            tree->value_destroy_func(node->value);
+        }
+    }
+
+    g_free(node);
+
+    tree->nnodes--;
+
+    return TRUE;
+}
+
+/**
+ * q_tree_lookup_node:
+ * @tree: a #QTree
+ * @key: the key to look up
+ *
+ * Gets the tree node corresponding to the given key. Since a #QTree is
+ * automatically balanced as key/value pairs are added, key lookup
+ * is O(log n) (where n is the number of key/value pairs in the tree).
+ *
+ * Returns: (nullable) (transfer none): the tree node corresponding to
+ *          the key, or %NULL if the key was not found
+ *
+ * Since: 2.68 in GLib. Internal in Qtree, i.e. not in the public API.
+ */
+static QTreeNode *
+q_tree_lookup_node(QTree         *tree,
+                   gconstpointer  key)
+{
+    g_return_val_if_fail(tree != NULL, NULL);
+
+    return q_tree_find_node(tree, key);
+}
+
+/**
+ * q_tree_lookup:
+ * @tree: a #QTree
+ * @key: the key to look up
+ *
+ * Gets the value corresponding to the given key. Since a #QTree is
+ * automatically balanced as key/value pairs are added, key lookup
+ * is O(log n) (where n is the number of key/value pairs in the tree).
+ *
+ * Returns: the value corresponding to the key, or %NULL
+ *     if the key was not found
+ */
+gpointer
+q_tree_lookup(QTree         *tree,
+              gconstpointer  key)
+{
+    QTreeNode *node;
+
+    node = q_tree_lookup_node(tree, key);
+
+    return node ? node->value : NULL;
+}
+
+/**
+ * q_tree_lookup_extended:
+ * @tree: a #QTree
+ * @lookup_key: the key to look up
+ * @orig_key: (out) (optional) (nullable): returns the original key
+ * @value: (out) (optional) (nullable): returns the value associated with
+ *         the key
+ *
+ * Looks up a key in the #QTree, returning the original key and the
+ * associated value. This is useful if you need to free the memory
+ * allocated for the original key, for example before calling
+ * q_tree_remove().
+ *
+ * Returns: %TRUE if the key was found in the #QTree
+ */
+gboolean
+q_tree_lookup_extended(QTree         *tree,
+                       gconstpointer  lookup_key,
+                       gpointer      *orig_key,
+                       gpointer      *value)
+{
+    QTreeNode *node;
+
+    g_return_val_if_fail(tree != NULL, FALSE);
+
+    node = q_tree_find_node(tree, lookup_key);
+
+    if (node) {
+        if (orig_key) {
+            *orig_key = node->key;
+        }
+        if (value) {
+            *value = node->value;
+        }
+        return TRUE;
+    } else {
+        return FALSE;
+    }
+}
+
+/**
+ * q_tree_foreach:
+ * @tree: a #QTree
+ * @func: the function to call for each node visited.
+ *     If this function returns %TRUE, the traversal is stopped.
+ * @user_data: user data to pass to the function
+ *
+ * Calls the given function for each of the key/value pairs in the #QTree.
+ * The function is passed the key and value of each pair, and the given
+ * @data parameter. The tree is traversed in sorted order.
+ *
+ * The tree may not be modified while iterating over it (you can't
+ * add/remove items). To remove all items matching a predicate, you need
+ * to add each item to a list in your #GTraverseFunc as you walk over
+ * the tree, then walk the list and remove each item.
+ */
+void
+q_tree_foreach(QTree         *tree,
+               GTraverseFunc  func,
+               gpointer       user_data)
+{
+    QTreeNode *node;
+
+    g_return_if_fail(tree != NULL);
+
+    if (!tree->root) {
+        return;
+    }
+
+    node = q_tree_node_first(tree);
+
+    while (node) {
+        if ((*func)(node->key, node->value, user_data)) {
+            break;
+        }
+
+        node = q_tree_node_next(node);
+    }
+}
+
+/**
+ * q_tree_search_node:
+ * @tree: a #QTree
+ * @search_func: a function used to search the #QTree
+ * @user_data: the data passed as the second argument to @search_func
+ *
+ * Searches a #QTree using @search_func.
+ *
+ * The @search_func is called with a pointer to the key of a key/value
+ * pair in the tree, and the passed in @user_data. If @search_func returns
+ * 0 for a key/value pair, then the corresponding node is returned as
+ * the result of q_tree_search(). If @search_func returns -1, searching
+ * will proceed among the key/value pairs that have a smaller key; if
+ * @search_func returns 1, searching will proceed among the key/value
+ * pairs that have a larger key.
+ *
+ * Returns: (nullable) (transfer none): the node corresponding to the
+ *          found key, or %NULL if the key was not found
+ *
+ * Since: 2.68 in GLib. Internal in Qtree, i.e. not in the public API.
+ */
+static QTreeNode *
+q_tree_search_node(QTree         *tree,
+                   GCompareFunc   search_func,
+                   gconstpointer  user_data)
+{
+    g_return_val_if_fail(tree != NULL, NULL);
+
+    if (!tree->root) {
+        return NULL;
+    }
+
+    return q_tree_node_search(tree->root, search_func, user_data);
+}
+
+/**
+ * q_tree_search:
+ * @tree: a #QTree
+ * @search_func: a function used to search the #QTree
+ * @user_data: the data passed as the second argument to @search_func
+ *
+ * Searches a #QTree using @search_func.
+ *
+ * The @search_func is called with a pointer to the key of a key/value
+ * pair in the tree, and the passed in @user_data. If @search_func returns
+ * 0 for a key/value pair, then the corresponding value is returned as
+ * the result of q_tree_search(). If @search_func returns -1, searching
+ * will proceed among the key/value pairs that have a smaller key; if
+ * @search_func returns 1, searching will proceed among the key/value
+ * pairs that have a larger key.
+ *
+ * Returns: the value corresponding to the found key, or %NULL
+ *     if the key was not found
+ */
+gpointer
+q_tree_search(QTree         *tree,
+              GCompareFunc   search_func,
+              gconstpointer  user_data)
+{
+    QTreeNode *node;
+
+    node = q_tree_search_node(tree, search_func, user_data);
+
+    return node ? node->value : NULL;
+}
+
+/**
+ * q_tree_height:
+ * @tree: a #QTree
+ *
+ * Gets the height of a #QTree.
+ *
+ * If the #QTree contains no nodes, the height is 0.
+ * If the #QTree contains only one root node the height is 1.
+ * If the root node has children the height is 2, etc.
+ *
+ * Returns: the height of @tree
+ */
+gint
+q_tree_height(QTree *tree)
+{
+    QTreeNode *node;
+    gint height;
+
+    g_return_val_if_fail(tree != NULL, 0);
+
+    if (!tree->root) {
+        return 0;
+    }
+
+    height = 0;
+    node = tree->root;
+
+    while (1) {
+        height += 1 + MAX(node->balance, 0);
+
+        if (!node->left_child) {
+            return height;
+        }
+
+        node = node->left;
+    }
+}
+
+/**
+ * q_tree_nnodes:
+ * @tree: a #QTree
+ *
+ * Gets the number of nodes in a #QTree.
+ *
+ * Returns: the number of nodes in @tree
+ */
+gint
+q_tree_nnodes(QTree *tree)
+{
+    g_return_val_if_fail(tree != NULL, 0);
+
+    return tree->nnodes;
+}
+
+static QTreeNode *
+q_tree_node_balance(QTreeNode *node)
+{
+    if (node->balance < -1) {
+        if (node->left->balance > 0) {
+            node->left = q_tree_node_rotate_left(node->left);
+        }
+        node = q_tree_node_rotate_right(node);
+    } else if (node->balance > 1) {
+        if (node->right->balance < 0) {
+            node->right = q_tree_node_rotate_right(node->right);
+        }
+        node = q_tree_node_rotate_left(node);
+    }
+
+    return node;
+}
+
+static QTreeNode *
+q_tree_find_node(QTree        *tree,
+                 gconstpointer key)
+{
+    QTreeNode *node;
+    gint cmp;
+
+    node = tree->root;
+    if (!node) {
+        return NULL;
+    }
+
+    while (1) {
+        cmp = tree->key_compare(key, node->key, tree->key_compare_data);
+        if (cmp == 0) {
+            return node;
+        } else if (cmp < 0) {
+            if (!node->left_child) {
+                return NULL;
+            }
+
+            node = node->left;
+        } else {
+            if (!node->right_child) {
+                return NULL;
+            }
+
+            node = node->right;
+        }
+    }
+}
+
+static QTreeNode *
+q_tree_node_search(QTreeNode     *node,
+                   GCompareFunc   search_func,
+                   gconstpointer  data)
+{
+    gint dir;
+
+    if (!node) {
+        return NULL;
+    }
+
+    while (1) {
+        dir = (*search_func)(node->key, data);
+        if (dir == 0) {
+            return node;
+        } else if (dir < 0) {
+            if (!node->left_child) {
+                return NULL;
+            }
+
+            node = node->left;
+        } else {
+            if (!node->right_child) {
+                return NULL;
+            }
+
+            node = node->right;
+        }
+    }
+}
+
+static QTreeNode *
+q_tree_node_rotate_left(QTreeNode *node)
+{
+    QTreeNode *right;
+    gint a_bal;
+    gint b_bal;
+
+    right = node->right;
+
+    if (right->left_child) {
+        node->right = right->left;
+    } else {
+        node->right_child = FALSE;
+        right->left_child = TRUE;
+    }
+    right->left = node;
+
+    a_bal = node->balance;
+    b_bal = right->balance;
+
+    if (b_bal <= 0) {
+        if (a_bal >= 1) {
+            right->balance = b_bal - 1;
+        } else {
+            right->balance = a_bal + b_bal - 2;
+        }
+        node->balance = a_bal - 1;
+    } else {
+        if (a_bal <= b_bal) {
+            right->balance = a_bal - 2;
+        } else {
+            right->balance = b_bal - 1;
+        }
+        node->balance = a_bal - b_bal - 1;
+    }
+
+    return right;
+}
+
+static QTreeNode *
+q_tree_node_rotate_right(QTreeNode *node)
+{
+    QTreeNode *left;
+    gint a_bal;
+    gint b_bal;
+
+    left = node->left;
+
+    if (left->right_child) {
+        node->left = left->right;
+    } else {
+        node->left_child = FALSE;
+        left->right_child = TRUE;
+    }
+    left->right = node;
+
+    a_bal = node->balance;
+    b_bal = left->balance;
+
+    if (b_bal <= 0) {
+        if (b_bal > a_bal) {
+            left->balance = b_bal + 1;
+        } else {
+            left->balance = a_bal + 2;
+        }
+        node->balance = a_bal - b_bal + 1;
+    } else {
+        if (a_bal <= -1) {
+            left->balance = b_bal + 1;
+        } else {
+            left->balance = a_bal + b_bal + 2;
+        }
+        node->balance = a_bal + 1;
+    }
+
+    return left;
+}
+
+#ifdef Q_TREE_DEBUG
+static gint
+q_tree_node_height(QTreeNode *node)
+{
+    gint left_height;
+    gint right_height;
+
+    if (node) {
+        left_height = 0;
+        right_height = 0;
+
+        if (node->left_child) {
+            left_height = q_tree_node_height(node->left);
+        }
+
+        if (node->right_child) {
+            right_height = q_tree_node_height(node->right);
+        }
+
+        return MAX(left_height, right_height) + 1;
+    }
+
+    return 0;
+}
+
+static void q_tree_node_check(QTreeNode *node)
+{
+    gint left_height;
+    gint right_height;
+    gint balance;
+    QTreeNode *tmp;
+
+    if (node) {
+        if (node->left_child) {
+            tmp = q_tree_node_previous(node);
+            g_assert(tmp->right == node);
+        }
+
+        if (node->right_child) {
+            tmp = q_tree_node_next(node);
+            g_assert(tmp->left == node);
+        }
+
+        left_height = 0;
+        right_height = 0;
+
+        if (node->left_child) {
+            left_height = q_tree_node_height(node->left);
+        }
+        if (node->right_child) {
+            right_height = q_tree_node_height(node->right);
+        }
+
+        balance = right_height - left_height;
+        g_assert(balance == node->balance);
+
+        if (node->left_child) {
+            q_tree_node_check(node->left);
+        }
+        if (node->right_child) {
+            q_tree_node_check(node->right);
+        }
+    }
+}
+#endif
diff --git a/tests/bench/meson.build b/tests/bench/meson.build
index 7477a1f401..4e6b469066 100644
--- a/tests/bench/meson.build
+++ b/tests/bench/meson.build
@@ -9,6 +9,10 @@ xbzrle_bench = executable('xbzrle-bench',
                        dependencies: [qemuutil,migration])
 endif
 
+qtree_bench = executable('qtree-bench',
+                         sources: 'qtree-bench.c',
+                         dependencies: [qemuutil])
+
 executable('atomic_add-bench',
            sources: files('atomic_add-bench.c'),
            dependencies: [qemuutil],
diff --git a/tests/unit/meson.build b/tests/unit/meson.build
index fa63cfe6ff..3bc78d8660 100644
--- a/tests/unit/meson.build
+++ b/tests/unit/meson.build
@@ -36,6 +36,7 @@ tests = {
   'test-rcu-slist': [],
   'test-qdist': [],
   'test-qht': [],
+  'test-qtree': [],
   'test-bitops': [],
   'test-bitcnt': [],
   'test-qgraph': ['../qtest/libqos/qgraph.c'],
diff --git a/util/meson.build b/util/meson.build
index 26c73e586b..3c2cfc6ede 100644
--- a/util/meson.build
+++ b/util/meson.build
@@ -26,6 +26,7 @@ util_ss.add(when: 'CONFIG_WIN32', if_true: files('oslib-win32.c'))
 util_ss.add(when: 'CONFIG_WIN32', if_true: files('qemu-thread-win32.c'))
 util_ss.add(when: 'CONFIG_WIN32', if_true: winmm)
 util_ss.add(when: 'CONFIG_WIN32', if_true: pathcch)
+util_ss.add(when: 'HAVE_GLIB_WITH_SLICE_ALLOCATOR', if_true: files('qtree.c'))
 util_ss.add(files('envlist.c', 'path.c', 'module.c'))
 util_ss.add(files('host-utils.c'))
 util_ss.add(files('bitmap.c', 'bitops.c'))
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PULL 02/15] tcg: use QTree instead of GTree
  2023-03-28 22:57 [PULL 00/15] tcg patch queue Richard Henderson
  2023-03-28 22:57 ` [PULL 01/15] util: import GTree as QTree Richard Henderson
@ 2023-03-28 22:57 ` Richard Henderson
  2023-03-28 22:57 ` [PULL 03/15] linux-user: Diagnose misaligned -R size Richard Henderson
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 27+ messages in thread
From: Richard Henderson @ 2023-03-28 22:57 UTC (permalink / raw)
  To: qemu-devel
  Cc: peter.maydell, Emilio Cota, Valentin David,
	Philippe Mathieu-Daudé

From: Emilio Cota <cota@braap.org>

qemu-user can hang in a multi-threaded fork. One common
reason is that when creating a TB, between fork and exec
we manipulate a GTree whose memory allocator (GSlice) is
not fork-safe.

Although POSIX does not mandate it, the system's allocator
(e.g. tcmalloc, libc malloc) is probably fork-safe.

Fix some of these hangs by using QTree, which uses the system's
allocator regardless of the Glib version that we used at
configuration time.

Tested with the test program in the original bug report, i.e.:
```

void garble() {
  int pid = fork();
  if (pid == 0) {
    exit(0);
  } else {
    int wstatus;
    waitpid(pid, &wstatus, 0);
  }
}

void supragarble(unsigned depth) {
  if (depth == 0)
    return ;

  std::thread a(supragarble, depth-1);
  std::thread b(supragarble, depth-1);
  garble();
  a.join();
  b.join();
}

int main() {
  supragarble(10);
}
```

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/285
Reported-by: Valentin David <me@valentindavid.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Emilio Cota <cota@braap.org>
Message-Id: <20230205163758.416992-3-cota@braap.org>
[rth: Add QEMU_DISABLE_CFI for all callback using functions.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/tb-maint.c | 17 +++++++++--------
 tcg/region.c         | 19 ++++++++++---------
 util/qtree.c         |  8 ++++----
 3 files changed, 23 insertions(+), 21 deletions(-)

diff --git a/accel/tcg/tb-maint.c b/accel/tcg/tb-maint.c
index 7246c1c46b..a173db17e6 100644
--- a/accel/tcg/tb-maint.c
+++ b/accel/tcg/tb-maint.c
@@ -19,6 +19,7 @@
 
 #include "qemu/osdep.h"
 #include "qemu/interval-tree.h"
+#include "qemu/qtree.h"
 #include "exec/cputlb.h"
 #include "exec/log.h"
 #include "exec/exec-all.h"
@@ -314,7 +315,7 @@ struct page_entry {
  * See also: page_collection_lock().
  */
 struct page_collection {
-    GTree *tree;
+    QTree *tree;
     struct page_entry *max;
 };
 
@@ -467,7 +468,7 @@ static bool page_trylock_add(struct page_collection *set, tb_page_addr_t addr)
     struct page_entry *pe;
     PageDesc *pd;
 
-    pe = g_tree_lookup(set->tree, &index);
+    pe = q_tree_lookup(set->tree, &index);
     if (pe) {
         return false;
     }
@@ -478,7 +479,7 @@ static bool page_trylock_add(struct page_collection *set, tb_page_addr_t addr)
     }
 
     pe = page_entry_new(pd, index);
-    g_tree_insert(set->tree, &pe->index, pe);
+    q_tree_insert(set->tree, &pe->index, pe);
 
     /*
      * If this is either (1) the first insertion or (2) a page whose index
@@ -525,13 +526,13 @@ static struct page_collection *page_collection_lock(tb_page_addr_t start,
     end   >>= TARGET_PAGE_BITS;
     g_assert(start <= end);
 
-    set->tree = g_tree_new_full(tb_page_addr_cmp, NULL, NULL,
+    set->tree = q_tree_new_full(tb_page_addr_cmp, NULL, NULL,
                                 page_entry_destroy);
     set->max = NULL;
     assert_no_pages_locked();
 
  retry:
-    g_tree_foreach(set->tree, page_entry_lock, NULL);
+    q_tree_foreach(set->tree, page_entry_lock, NULL);
 
     for (index = start; index <= end; index++) {
         TranslationBlock *tb;
@@ -542,7 +543,7 @@ static struct page_collection *page_collection_lock(tb_page_addr_t start,
             continue;
         }
         if (page_trylock_add(set, index << TARGET_PAGE_BITS)) {
-            g_tree_foreach(set->tree, page_entry_unlock, NULL);
+            q_tree_foreach(set->tree, page_entry_unlock, NULL);
             goto retry;
         }
         assert_page_locked(pd);
@@ -551,7 +552,7 @@ static struct page_collection *page_collection_lock(tb_page_addr_t start,
                 (tb_page_addr1(tb) != -1 &&
                  page_trylock_add(set, tb_page_addr1(tb)))) {
                 /* drop all locks, and reacquire in order */
-                g_tree_foreach(set->tree, page_entry_unlock, NULL);
+                q_tree_foreach(set->tree, page_entry_unlock, NULL);
                 goto retry;
             }
         }
@@ -562,7 +563,7 @@ static struct page_collection *page_collection_lock(tb_page_addr_t start,
 static void page_collection_unlock(struct page_collection *set)
 {
     /* entries are unlocked and freed via page_entry_destroy */
-    g_tree_destroy(set->tree);
+    q_tree_destroy(set->tree);
     g_free(set);
 }
 
diff --git a/tcg/region.c b/tcg/region.c
index 88d6bb273f..bef4c4756f 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -28,6 +28,7 @@
 #include "qemu/mprotect.h"
 #include "qemu/memalign.h"
 #include "qemu/cacheinfo.h"
+#include "qemu/qtree.h"
 #include "qapi/error.h"
 #include "exec/exec-all.h"
 #include "tcg/tcg.h"
@@ -36,7 +37,7 @@
 
 struct tcg_region_tree {
     QemuMutex lock;
-    GTree *tree;
+    QTree *tree;
     /* padding to avoid false sharing is computed at run-time */
 };
 
@@ -163,7 +164,7 @@ static void tcg_region_trees_init(void)
         struct tcg_region_tree *rt = region_trees + i * tree_size;
 
         qemu_mutex_init(&rt->lock);
-        rt->tree = g_tree_new_full(tb_tc_cmp, NULL, NULL, tb_destroy);
+        rt->tree = q_tree_new_full(tb_tc_cmp, NULL, NULL, tb_destroy);
     }
 }
 
@@ -202,7 +203,7 @@ void tcg_tb_insert(TranslationBlock *tb)
 
     g_assert(rt != NULL);
     qemu_mutex_lock(&rt->lock);
-    g_tree_insert(rt->tree, &tb->tc, tb);
+    q_tree_insert(rt->tree, &tb->tc, tb);
     qemu_mutex_unlock(&rt->lock);
 }
 
@@ -212,7 +213,7 @@ void tcg_tb_remove(TranslationBlock *tb)
 
     g_assert(rt != NULL);
     qemu_mutex_lock(&rt->lock);
-    g_tree_remove(rt->tree, &tb->tc);
+    q_tree_remove(rt->tree, &tb->tc);
     qemu_mutex_unlock(&rt->lock);
 }
 
@@ -232,7 +233,7 @@ TranslationBlock *tcg_tb_lookup(uintptr_t tc_ptr)
     }
 
     qemu_mutex_lock(&rt->lock);
-    tb = g_tree_lookup(rt->tree, &s);
+    tb = q_tree_lookup(rt->tree, &s);
     qemu_mutex_unlock(&rt->lock);
     return tb;
 }
@@ -267,7 +268,7 @@ void tcg_tb_foreach(GTraverseFunc func, gpointer user_data)
     for (i = 0; i < region.n; i++) {
         struct tcg_region_tree *rt = region_trees + i * tree_size;
 
-        g_tree_foreach(rt->tree, func, user_data);
+        q_tree_foreach(rt->tree, func, user_data);
     }
     tcg_region_tree_unlock_all();
 }
@@ -281,7 +282,7 @@ size_t tcg_nb_tbs(void)
     for (i = 0; i < region.n; i++) {
         struct tcg_region_tree *rt = region_trees + i * tree_size;
 
-        nb_tbs += g_tree_nnodes(rt->tree);
+        nb_tbs += q_tree_nnodes(rt->tree);
     }
     tcg_region_tree_unlock_all();
     return nb_tbs;
@@ -296,8 +297,8 @@ static void tcg_region_tree_reset_all(void)
         struct tcg_region_tree *rt = region_trees + i * tree_size;
 
         /* Increment the refcount first so that destroy acts as a reset */
-        g_tree_ref(rt->tree);
-        g_tree_destroy(rt->tree);
+        q_tree_ref(rt->tree);
+        q_tree_destroy(rt->tree);
     }
     tcg_region_tree_unlock_all();
 }
diff --git a/util/qtree.c b/util/qtree.c
index deb46c187f..31f0b46182 100644
--- a/util/qtree.c
+++ b/util/qtree.c
@@ -310,7 +310,7 @@ q_tree_node_next(QTreeNode *node)
  *
  * Since: 2.70 in GLib. Internal in Qtree, i.e. not in the public API.
  */
-static void
+static void QEMU_DISABLE_CFI
 q_tree_remove_all(QTree *tree)
 {
     QTreeNode *node;
@@ -532,7 +532,7 @@ q_tree_replace(QTree    *tree,
 }
 
 /* internal insert routine */
-static QTreeNode *
+static QTreeNode * QEMU_DISABLE_CFI
 q_tree_insert_internal(QTree    *tree,
                        gpointer  key,
                        gpointer  value,
@@ -721,7 +721,7 @@ q_tree_steal(QTree         *tree,
 }
 
 /* internal remove routine */
-static gboolean
+static gboolean QEMU_DISABLE_CFI
 q_tree_remove_internal(QTree         *tree,
                        gconstpointer  key,
                        gboolean       steal)
@@ -1182,7 +1182,7 @@ q_tree_node_balance(QTreeNode *node)
     return node;
 }
 
-static QTreeNode *
+static QTreeNode * QEMU_DISABLE_CFI
 q_tree_find_node(QTree        *tree,
                  gconstpointer key)
 {
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PULL 03/15] linux-user: Diagnose misaligned -R size
  2023-03-28 22:57 [PULL 00/15] tcg patch queue Richard Henderson
  2023-03-28 22:57 ` [PULL 01/15] util: import GTree as QTree Richard Henderson
  2023-03-28 22:57 ` [PULL 02/15] tcg: use QTree instead of GTree Richard Henderson
@ 2023-03-28 22:57 ` Richard Henderson
  2023-03-28 22:57 ` [PULL 04/15] accel/tcg: Pass last not end to page_set_flags Richard Henderson
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 27+ messages in thread
From: Richard Henderson @ 2023-03-28 22:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, Philippe Mathieu-Daudé

We have been enforcing host page alignment for the non-R
fallback of MAX_RESERVED_VA, but failing to enforce for -R.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/main.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/linux-user/main.c b/linux-user/main.c
index 4b18461969..39d9bd4d7a 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -793,6 +793,12 @@ int main(int argc, char **argv, char **envp)
      */
     max_reserved_va = MAX_RESERVED_VA(cpu);
     if (reserved_va != 0) {
+        if (reserved_va % qemu_host_page_size) {
+            char *s = size_to_str(qemu_host_page_size);
+            fprintf(stderr, "Reserved virtual address not aligned mod %s\n", s);
+            g_free(s);
+            exit(EXIT_FAILURE);
+        }
         if (max_reserved_va && reserved_va > max_reserved_va) {
             fprintf(stderr, "Reserved virtual address too big\n");
             exit(EXIT_FAILURE);
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PULL 04/15] accel/tcg: Pass last not end to page_set_flags
  2023-03-28 22:57 [PULL 00/15] tcg patch queue Richard Henderson
                   ` (2 preceding siblings ...)
  2023-03-28 22:57 ` [PULL 03/15] linux-user: Diagnose misaligned -R size Richard Henderson
@ 2023-03-28 22:57 ` Richard Henderson
  2023-03-28 22:57 ` [PULL 05/15] accel/tcg: Pass last not end to page_reset_target_data Richard Henderson
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 27+ messages in thread
From: Richard Henderson @ 2023-03-28 22:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, Philippe Mathieu-Daudé

Pass the address of the last byte to be changed, rather than
the first address past the last byte.  This avoids overflow
when the last page of the address space is involved.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1528
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/cpu-all.h |  2 +-
 accel/tcg/user-exec.c  | 16 +++++++---------
 bsd-user/mmap.c        |  6 +++---
 linux-user/elfload.c   | 11 ++++++-----
 linux-user/mmap.c      | 16 ++++++++--------
 linux-user/syscall.c   |  4 ++--
 6 files changed, 27 insertions(+), 28 deletions(-)

diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index 548be9c8ea..a2662b1e83 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -276,7 +276,7 @@ typedef int (*walk_memory_regions_fn)(void *, target_ulong,
 int walk_memory_regions(void *, walk_memory_regions_fn);
 
 int page_get_flags(target_ulong address);
-void page_set_flags(target_ulong start, target_ulong end, int flags);
+void page_set_flags(target_ulong start, target_ulong last, int flags);
 void page_reset_target_data(target_ulong start, target_ulong end);
 int page_check_range(target_ulong start, target_ulong len, int flags);
 
diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index 7b37fd229e..035f8096b2 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -480,24 +480,22 @@ static bool pageflags_set_clear(target_ulong start, target_ulong last,
  * The flag PAGE_WRITE_ORG is positioned automatically depending
  * on PAGE_WRITE.  The mmap_lock should already be held.
  */
-void page_set_flags(target_ulong start, target_ulong end, int flags)
+void page_set_flags(target_ulong start, target_ulong last, int flags)
 {
-    target_ulong last;
     bool reset = false;
     bool inval_tb = false;
 
     /* This function should never be called with addresses outside the
        guest address space.  If this assert fires, it probably indicates
        a missing call to h2g_valid.  */
-    assert(start < end);
-    assert(end - 1 <= GUEST_ADDR_MAX);
+    assert(start <= last);
+    assert(last <= GUEST_ADDR_MAX);
     /* Only set PAGE_ANON with new mappings. */
     assert(!(flags & PAGE_ANON) || (flags & PAGE_RESET));
     assert_memory_lock();
 
-    start = start & TARGET_PAGE_MASK;
-    end = TARGET_PAGE_ALIGN(end);
-    last = end - 1;
+    start &= TARGET_PAGE_MASK;
+    last |= ~TARGET_PAGE_MASK;
 
     if (!(flags & PAGE_VALID)) {
         flags = 0;
@@ -510,7 +508,7 @@ void page_set_flags(target_ulong start, target_ulong end, int flags)
     }
 
     if (!flags || reset) {
-        page_reset_target_data(start, end);
+        page_reset_target_data(start, last + 1);
         inval_tb |= pageflags_unset(start, last);
     }
     if (flags) {
@@ -518,7 +516,7 @@ void page_set_flags(target_ulong start, target_ulong end, int flags)
                                         ~(reset ? 0 : PAGE_STICKY));
     }
     if (inval_tb) {
-        tb_invalidate_phys_range(start, end);
+        tb_invalidate_phys_range(start, last + 1);
     }
 }
 
diff --git a/bsd-user/mmap.c b/bsd-user/mmap.c
index d6c5a344c9..696057551a 100644
--- a/bsd-user/mmap.c
+++ b/bsd-user/mmap.c
@@ -118,7 +118,7 @@ int target_mprotect(abi_ulong start, abi_ulong len, int prot)
         if (ret != 0)
             goto error;
     }
-    page_set_flags(start, start + len, prot | PAGE_VALID);
+    page_set_flags(start, start + len - 1, prot | PAGE_VALID);
     mmap_unlock();
     return 0;
 error:
@@ -656,7 +656,7 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, int prot,
         }
     }
  the_end1:
-    page_set_flags(start, start + len, prot | PAGE_VALID);
+    page_set_flags(start, start + len - 1, prot | PAGE_VALID);
  the_end:
 #ifdef DEBUG_MMAP
     printf("ret=0x" TARGET_ABI_FMT_lx "\n", start);
@@ -767,7 +767,7 @@ int target_munmap(abi_ulong start, abi_ulong len)
     }
 
     if (ret == 0) {
-        page_set_flags(start, start + len, 0);
+        page_set_flags(start, start + len - 1, 0);
     }
     mmap_unlock();
     return ret;
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 1dbc1f0f9b..fa4cc41567 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -213,7 +213,7 @@ static bool init_guest_commpage(void)
         exit(EXIT_FAILURE);
     }
     page_set_flags(TARGET_VSYSCALL_PAGE,
-                   TARGET_VSYSCALL_PAGE + TARGET_PAGE_SIZE,
+                   TARGET_VSYSCALL_PAGE | ~TARGET_PAGE_MASK,
                    PAGE_EXEC | PAGE_VALID);
     return true;
 }
@@ -444,7 +444,7 @@ static bool init_guest_commpage(void)
         exit(EXIT_FAILURE);
     }
 
-    page_set_flags(commpage, commpage + qemu_host_page_size,
+    page_set_flags(commpage, commpage | ~qemu_host_page_mask,
                    PAGE_READ | PAGE_EXEC | PAGE_VALID);
     return true;
 }
@@ -1316,7 +1316,7 @@ static bool init_guest_commpage(void)
         exit(EXIT_FAILURE);
     }
 
-    page_set_flags(LO_COMMPAGE, LO_COMMPAGE + TARGET_PAGE_SIZE,
+    page_set_flags(LO_COMMPAGE, LO_COMMPAGE | ~TARGET_PAGE_MASK,
                    PAGE_READ | PAGE_EXEC | PAGE_VALID);
     return true;
 }
@@ -1728,7 +1728,7 @@ static bool init_guest_commpage(void)
      * and implement syscalls.  Here, simply mark the page executable.
      * Special case the entry points during translation (see do_page_zero).
      */
-    page_set_flags(LO_COMMPAGE, LO_COMMPAGE + TARGET_PAGE_SIZE,
+    page_set_flags(LO_COMMPAGE, LO_COMMPAGE | ~TARGET_PAGE_MASK,
                    PAGE_EXEC | PAGE_VALID);
     return true;
 }
@@ -2209,7 +2209,8 @@ static void zero_bss(abi_ulong elf_bss, abi_ulong last_bss, int prot)
 
     /* Ensure that the bss page(s) are valid */
     if ((page_get_flags(last_bss-1) & prot) != prot) {
-        page_set_flags(elf_bss & TARGET_PAGE_MASK, last_bss, prot | PAGE_VALID);
+        page_set_flags(elf_bss & TARGET_PAGE_MASK, last_bss - 1,
+                       prot | PAGE_VALID);
     }
 
     if (host_start < host_map_start) {
diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index 28135c9e6a..1d07ff5d2c 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -181,7 +181,7 @@ int target_mprotect(abi_ulong start, abi_ulong len, int target_prot)
         }
     }
 
-    page_set_flags(start, start + len, page_flags);
+    page_set_flags(start, start + len - 1, page_flags);
     ret = 0;
 
 error:
@@ -640,15 +640,15 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, int target_prot,
     }
     page_flags |= PAGE_RESET;
     if (passthrough_start == passthrough_end) {
-        page_set_flags(start, start + len, page_flags);
+        page_set_flags(start, start + len - 1, page_flags);
     } else {
         if (start < passthrough_start) {
-            page_set_flags(start, passthrough_start, page_flags);
+            page_set_flags(start, passthrough_start - 1, page_flags);
         }
-        page_set_flags(passthrough_start, passthrough_end,
+        page_set_flags(passthrough_start, passthrough_end - 1,
                        page_flags | PAGE_PASSTHROUGH);
         if (passthrough_end < start + len) {
-            page_set_flags(passthrough_end, start + len, page_flags);
+            page_set_flags(passthrough_end, start + len - 1, page_flags);
         }
     }
  the_end:
@@ -763,7 +763,7 @@ int target_munmap(abi_ulong start, abi_ulong len)
     }
 
     if (ret == 0) {
-        page_set_flags(start, start + len, 0);
+        page_set_flags(start, start + len - 1, 0);
     }
     mmap_unlock();
     return ret;
@@ -849,8 +849,8 @@ abi_long target_mremap(abi_ulong old_addr, abi_ulong old_size,
     } else {
         new_addr = h2g(host_addr);
         prot = page_get_flags(old_addr);
-        page_set_flags(old_addr, old_addr + old_size, 0);
-        page_set_flags(new_addr, new_addr + new_size,
+        page_set_flags(old_addr, old_addr + old_size - 1, 0);
+        page_set_flags(new_addr, new_addr + new_size - 1,
                        prot | PAGE_VALID | PAGE_RESET);
     }
     mmap_unlock();
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 27871641f4..69f740ff98 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -4595,7 +4595,7 @@ static inline abi_ulong do_shmat(CPUArchState *cpu_env,
     }
     raddr=h2g((unsigned long)host_raddr);
 
-    page_set_flags(raddr, raddr + shm_info.shm_segsz,
+    page_set_flags(raddr, raddr + shm_info.shm_segsz - 1,
                    PAGE_VALID | PAGE_RESET | PAGE_READ |
                    (shmflg & SHM_RDONLY ? 0 : PAGE_WRITE));
 
@@ -4625,7 +4625,7 @@ static inline abi_long do_shmdt(abi_ulong shmaddr)
     for (i = 0; i < N_SHM_REGIONS; ++i) {
         if (shm_regions[i].in_use && shm_regions[i].start == shmaddr) {
             shm_regions[i].in_use = false;
-            page_set_flags(shmaddr, shmaddr + shm_regions[i].size, 0);
+            page_set_flags(shmaddr, shmaddr + shm_regions[i].size - 1, 0);
             break;
         }
     }
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PULL 05/15] accel/tcg: Pass last not end to page_reset_target_data
  2023-03-28 22:57 [PULL 00/15] tcg patch queue Richard Henderson
                   ` (3 preceding siblings ...)
  2023-03-28 22:57 ` [PULL 04/15] accel/tcg: Pass last not end to page_set_flags Richard Henderson
@ 2023-03-28 22:57 ` Richard Henderson
  2023-03-28 22:57 ` [PULL 06/15] accel/tcg: Pass last not end to PAGE_FOR_EACH_TB Richard Henderson
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 27+ messages in thread
From: Richard Henderson @ 2023-03-28 22:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, Philippe Mathieu-Daudé

Pass the address of the last byte to be changed, rather than
the first address past the last byte.  This avoids overflow
when the last page of the address space is involved.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/cpu-all.h |  2 +-
 accel/tcg/user-exec.c  | 11 +++++------
 linux-user/mmap.c      |  2 +-
 3 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index a2662b1e83..64cb62dc54 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -277,7 +277,7 @@ int walk_memory_regions(void *, walk_memory_regions_fn);
 
 int page_get_flags(target_ulong address);
 void page_set_flags(target_ulong start, target_ulong last, int flags);
-void page_reset_target_data(target_ulong start, target_ulong end);
+void page_reset_target_data(target_ulong start, target_ulong last);
 int page_check_range(target_ulong start, target_ulong len, int flags);
 
 /**
diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index 035f8096b2..20b6fc2f6e 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -508,7 +508,7 @@ void page_set_flags(target_ulong start, target_ulong last, int flags)
     }
 
     if (!flags || reset) {
-        page_reset_target_data(start, last + 1);
+        page_reset_target_data(start, last);
         inval_tb |= pageflags_unset(start, last);
     }
     if (flags) {
@@ -814,15 +814,14 @@ typedef struct TargetPageDataNode {
 
 static IntervalTreeRoot targetdata_root;
 
-void page_reset_target_data(target_ulong start, target_ulong end)
+void page_reset_target_data(target_ulong start, target_ulong last)
 {
     IntervalTreeNode *n, *next;
-    target_ulong last;
 
     assert_memory_lock();
 
-    start = start & TARGET_PAGE_MASK;
-    last = TARGET_PAGE_ALIGN(end) - 1;
+    start &= TARGET_PAGE_MASK;
+    last |= ~TARGET_PAGE_MASK;
 
     for (n = interval_tree_iter_first(&targetdata_root, start, last),
          next = n ? interval_tree_iter_next(n, start, last) : NULL;
@@ -885,7 +884,7 @@ void *page_get_target_data(target_ulong address)
     return t->data[(page - region) >> TARGET_PAGE_BITS];
 }
 #else
-void page_reset_target_data(target_ulong start, target_ulong end) { }
+void page_reset_target_data(target_ulong start, target_ulong last) { }
 #endif /* TARGET_PAGE_DATA_SIZE */
 
 /* The softmmu versions of these helpers are in cputlb.c.  */
diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index 1d07ff5d2c..995146f60d 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -946,7 +946,7 @@ abi_long target_madvise(abi_ulong start, abi_ulong len_in, int advice)
         if (can_passthrough_madvise(start, end)) {
             ret = get_errno(madvise(g2h_untagged(start), len, advice));
             if ((advice == MADV_DONTNEED) && (ret == 0)) {
-                page_reset_target_data(start, start + len);
+                page_reset_target_data(start, start + len - 1);
             }
         }
     }
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PULL 06/15] accel/tcg: Pass last not end to PAGE_FOR_EACH_TB
  2023-03-28 22:57 [PULL 00/15] tcg patch queue Richard Henderson
                   ` (4 preceding siblings ...)
  2023-03-28 22:57 ` [PULL 05/15] accel/tcg: Pass last not end to page_reset_target_data Richard Henderson
@ 2023-03-28 22:57 ` Richard Henderson
  2023-03-28 22:57 ` [PULL 07/15] accel/tcg: Pass last not end to page_collection_lock Richard Henderson
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 27+ messages in thread
From: Richard Henderson @ 2023-03-28 22:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, Philippe Mathieu-Daudé

Pass the address of the last byte to be changed, rather than
the first address past the last byte.  This avoids overflow
when the last page of the address space is involved.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/tb-maint.c | 28 ++++++++++++++++------------
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/accel/tcg/tb-maint.c b/accel/tcg/tb-maint.c
index a173db17e6..04d2751bb6 100644
--- a/accel/tcg/tb-maint.c
+++ b/accel/tcg/tb-maint.c
@@ -127,29 +127,29 @@ static void tb_remove(TranslationBlock *tb)
 }
 
 /* TODO: For now, still shared with translate-all.c for system mode. */
-#define PAGE_FOR_EACH_TB(start, end, pagedesc, T, N)    \
-    for (T = foreach_tb_first(start, end),              \
-         N = foreach_tb_next(T, start, end);            \
+#define PAGE_FOR_EACH_TB(start, last, pagedesc, T, N)   \
+    for (T = foreach_tb_first(start, last),             \
+         N = foreach_tb_next(T, start, last);           \
          T != NULL;                                     \
-         T = N, N = foreach_tb_next(N, start, end))
+         T = N, N = foreach_tb_next(N, start, last))
 
 typedef TranslationBlock *PageForEachNext;
 
 static PageForEachNext foreach_tb_first(tb_page_addr_t start,
-                                        tb_page_addr_t end)
+                                        tb_page_addr_t last)
 {
-    IntervalTreeNode *n = interval_tree_iter_first(&tb_root, start, end - 1);
+    IntervalTreeNode *n = interval_tree_iter_first(&tb_root, start, last);
     return n ? container_of(n, TranslationBlock, itree) : NULL;
 }
 
 static PageForEachNext foreach_tb_next(PageForEachNext tb,
                                        tb_page_addr_t start,
-                                       tb_page_addr_t end)
+                                       tb_page_addr_t last)
 {
     IntervalTreeNode *n;
 
     if (tb) {
-        n = interval_tree_iter_next(&tb->itree, start, end - 1);
+        n = interval_tree_iter_next(&tb->itree, start, last);
         if (n) {
             return container_of(n, TranslationBlock, itree);
         }
@@ -320,7 +320,7 @@ struct page_collection {
 };
 
 typedef int PageForEachNext;
-#define PAGE_FOR_EACH_TB(start, end, pagedesc, tb, n) \
+#define PAGE_FOR_EACH_TB(start, last, pagedesc, tb, n) \
     TB_FOR_EACH_TAGGED((pagedesc)->first_tb, tb, n, page_next)
 
 #ifdef CONFIG_DEBUG_TCG
@@ -995,10 +995,11 @@ void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t end)
 {
     TranslationBlock *tb;
     PageForEachNext n;
+    tb_page_addr_t last = end - 1;
 
     assert_memory_lock();
 
-    PAGE_FOR_EACH_TB(start, end, unused, tb, n) {
+    PAGE_FOR_EACH_TB(start, last, unused, tb, n) {
         tb_phys_invalidate__locked(tb);
     }
 }
@@ -1030,6 +1031,7 @@ bool tb_invalidate_phys_page_unwind(tb_page_addr_t addr, uintptr_t pc)
     bool current_tb_modified;
     TranslationBlock *tb;
     PageForEachNext n;
+    tb_page_addr_t last;
 
     /*
      * Without precise smc semantics, or when outside of a TB,
@@ -1046,10 +1048,11 @@ bool tb_invalidate_phys_page_unwind(tb_page_addr_t addr, uintptr_t pc)
     assert_memory_lock();
     current_tb = tcg_tb_lookup(pc);
 
+    last = addr | ~TARGET_PAGE_MASK;
     addr &= TARGET_PAGE_MASK;
     current_tb_modified = false;
 
-    PAGE_FOR_EACH_TB(addr, addr + TARGET_PAGE_SIZE, unused, tb, n) {
+    PAGE_FOR_EACH_TB(addr, last, unused, tb, n) {
         if (current_tb == tb &&
             (tb_cflags(current_tb) & CF_COUNT_MASK) != 1) {
             /*
@@ -1091,12 +1094,13 @@ tb_invalidate_phys_page_range__locked(struct page_collection *pages,
     bool current_tb_modified = false;
     TranslationBlock *current_tb = retaddr ? tcg_tb_lookup(retaddr) : NULL;
 #endif /* TARGET_HAS_PRECISE_SMC */
+    tb_page_addr_t last G_GNUC_UNUSED = end - 1;
 
     /*
      * We remove all the TBs in the range [start, end[.
      * XXX: see if in some cases it could be faster to invalidate all the code
      */
-    PAGE_FOR_EACH_TB(start, end, p, tb, n) {
+    PAGE_FOR_EACH_TB(start, last, p, tb, n) {
         /* NOTE: this is subtle as a TB may span two physical pages */
         if (n == 0) {
             /* NOTE: tb_end may be after the end of the page, but
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PULL 07/15] accel/tcg: Pass last not end to page_collection_lock
  2023-03-28 22:57 [PULL 00/15] tcg patch queue Richard Henderson
                   ` (5 preceding siblings ...)
  2023-03-28 22:57 ` [PULL 06/15] accel/tcg: Pass last not end to PAGE_FOR_EACH_TB Richard Henderson
@ 2023-03-28 22:57 ` Richard Henderson
  2023-03-28 22:57 ` [PULL 08/15] accel/tcg: Pass last not end to tb_invalidate_phys_page_range__locked Richard Henderson
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 27+ messages in thread
From: Richard Henderson @ 2023-03-28 22:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, Philippe Mathieu-Daudé

Pass the address of the last byte to be changed, rather than
the first address past the last byte.  This avoids overflow
when the last page of the address space is involved.

Fixes a bug in the loop comparision where "<= end" would lock
one more page than required.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/tb-maint.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/accel/tcg/tb-maint.c b/accel/tcg/tb-maint.c
index 04d2751bb6..57da2feb2f 100644
--- a/accel/tcg/tb-maint.c
+++ b/accel/tcg/tb-maint.c
@@ -511,20 +511,20 @@ static gint tb_page_addr_cmp(gconstpointer ap, gconstpointer bp, gpointer udata)
 }
 
 /*
- * Lock a range of pages ([@start,@end[) as well as the pages of all
+ * Lock a range of pages ([@start,@last]) as well as the pages of all
  * intersecting TBs.
  * Locking order: acquire locks in ascending order of page index.
  */
 static struct page_collection *page_collection_lock(tb_page_addr_t start,
-                                                    tb_page_addr_t end)
+                                                    tb_page_addr_t last)
 {
     struct page_collection *set = g_malloc(sizeof(*set));
     tb_page_addr_t index;
     PageDesc *pd;
 
     start >>= TARGET_PAGE_BITS;
-    end   >>= TARGET_PAGE_BITS;
-    g_assert(start <= end);
+    last >>= TARGET_PAGE_BITS;
+    g_assert(start <= last);
 
     set->tree = q_tree_new_full(tb_page_addr_cmp, NULL, NULL,
                                 page_entry_destroy);
@@ -534,7 +534,7 @@ static struct page_collection *page_collection_lock(tb_page_addr_t start,
  retry:
     q_tree_foreach(set->tree, page_entry_lock, NULL);
 
-    for (index = start; index <= end; index++) {
+    for (index = start; index <= last; index++) {
         TranslationBlock *tb;
         PageForEachNext n;
 
@@ -1154,7 +1154,7 @@ tb_invalidate_phys_page_range__locked(struct page_collection *pages,
 void tb_invalidate_phys_page(tb_page_addr_t addr)
 {
     struct page_collection *pages;
-    tb_page_addr_t start, end;
+    tb_page_addr_t start, last;
     PageDesc *p;
 
     p = page_find(addr >> TARGET_PAGE_BITS);
@@ -1163,9 +1163,9 @@ void tb_invalidate_phys_page(tb_page_addr_t addr)
     }
 
     start = addr & TARGET_PAGE_MASK;
-    end = start + TARGET_PAGE_SIZE;
-    pages = page_collection_lock(start, end);
-    tb_invalidate_phys_page_range__locked(pages, p, start, end, 0);
+    last = addr | ~TARGET_PAGE_MASK;
+    pages = page_collection_lock(start, last);
+    tb_invalidate_phys_page_range__locked(pages, p, start, last + 1, 0);
     page_collection_unlock(pages);
 }
 
@@ -1181,7 +1181,7 @@ void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t end)
     struct page_collection *pages;
     tb_page_addr_t next;
 
-    pages = page_collection_lock(start, end);
+    pages = page_collection_lock(start, end - 1);
     for (next = (start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
          start < end;
          start = next, next += TARGET_PAGE_SIZE) {
@@ -1226,7 +1226,7 @@ void tb_invalidate_phys_range_fast(ram_addr_t ram_addr,
 {
     struct page_collection *pages;
 
-    pages = page_collection_lock(ram_addr, ram_addr + size);
+    pages = page_collection_lock(ram_addr, ram_addr + size - 1);
     tb_invalidate_phys_page_fast__locked(pages, ram_addr, size, retaddr);
     page_collection_unlock(pages);
 }
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PULL 08/15] accel/tcg: Pass last not end to tb_invalidate_phys_page_range__locked
  2023-03-28 22:57 [PULL 00/15] tcg patch queue Richard Henderson
                   ` (6 preceding siblings ...)
  2023-03-28 22:57 ` [PULL 07/15] accel/tcg: Pass last not end to page_collection_lock Richard Henderson
@ 2023-03-28 22:57 ` Richard Henderson
  2023-03-28 22:58 ` [PULL 09/15] accel/tcg: Pass last not end to tb_invalidate_phys_range Richard Henderson
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 27+ messages in thread
From: Richard Henderson @ 2023-03-28 22:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, Philippe Mathieu-Daudé

Pass the address of the last byte to be changed, rather than
the first address past the last byte.  This avoids overflow
when the last page of the address space is involved.

Properly truncate tb_last to the end of the page; the comment about
tb_end being past the end of the page being ok is not correct,
considering overflow.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/tb-maint.c | 26 ++++++++++++--------------
 1 file changed, 12 insertions(+), 14 deletions(-)

diff --git a/accel/tcg/tb-maint.c b/accel/tcg/tb-maint.c
index 57da2feb2f..74823ba464 100644
--- a/accel/tcg/tb-maint.c
+++ b/accel/tcg/tb-maint.c
@@ -1084,35 +1084,33 @@ bool tb_invalidate_phys_page_unwind(tb_page_addr_t addr, uintptr_t pc)
 static void
 tb_invalidate_phys_page_range__locked(struct page_collection *pages,
                                       PageDesc *p, tb_page_addr_t start,
-                                      tb_page_addr_t end,
+                                      tb_page_addr_t last,
                                       uintptr_t retaddr)
 {
     TranslationBlock *tb;
-    tb_page_addr_t tb_start, tb_end;
     PageForEachNext n;
 #ifdef TARGET_HAS_PRECISE_SMC
     bool current_tb_modified = false;
     TranslationBlock *current_tb = retaddr ? tcg_tb_lookup(retaddr) : NULL;
 #endif /* TARGET_HAS_PRECISE_SMC */
-    tb_page_addr_t last G_GNUC_UNUSED = end - 1;
 
     /*
-     * We remove all the TBs in the range [start, end[.
+     * We remove all the TBs in the range [start, last].
      * XXX: see if in some cases it could be faster to invalidate all the code
      */
     PAGE_FOR_EACH_TB(start, last, p, tb, n) {
+        tb_page_addr_t tb_start, tb_last;
+
         /* NOTE: this is subtle as a TB may span two physical pages */
+        tb_start = tb_page_addr0(tb);
+        tb_last = tb_start + tb->size - 1;
         if (n == 0) {
-            /* NOTE: tb_end may be after the end of the page, but
-               it is not a problem */
-            tb_start = tb_page_addr0(tb);
-            tb_end = tb_start + tb->size;
+            tb_last = MIN(tb_last, tb_start | ~TARGET_PAGE_MASK);
         } else {
             tb_start = tb_page_addr1(tb);
-            tb_end = tb_start + ((tb_page_addr0(tb) + tb->size)
-                                 & ~TARGET_PAGE_MASK);
+            tb_last = tb_start + (tb_last & ~TARGET_PAGE_MASK);
         }
-        if (!(tb_end <= start || tb_start >= end)) {
+        if (!(tb_last < start || tb_start > last)) {
 #ifdef TARGET_HAS_PRECISE_SMC
             if (current_tb == tb &&
                 (tb_cflags(current_tb) & CF_COUNT_MASK) != 1) {
@@ -1165,7 +1163,7 @@ void tb_invalidate_phys_page(tb_page_addr_t addr)
     start = addr & TARGET_PAGE_MASK;
     last = addr | ~TARGET_PAGE_MASK;
     pages = page_collection_lock(start, last);
-    tb_invalidate_phys_page_range__locked(pages, p, start, last + 1, 0);
+    tb_invalidate_phys_page_range__locked(pages, p, start, last, 0);
     page_collection_unlock(pages);
 }
 
@@ -1192,7 +1190,7 @@ void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t end)
             continue;
         }
         assert_page_locked(pd);
-        tb_invalidate_phys_page_range__locked(pages, pd, start, bound, 0);
+        tb_invalidate_phys_page_range__locked(pages, pd, start, bound - 1, 0);
     }
     page_collection_unlock(pages);
 }
@@ -1212,7 +1210,7 @@ static void tb_invalidate_phys_page_fast__locked(struct page_collection *pages,
     }
 
     assert_page_locked(p);
-    tb_invalidate_phys_page_range__locked(pages, p, start, start + len, ra);
+    tb_invalidate_phys_page_range__locked(pages, p, start, start + len - 1, ra);
 }
 
 /*
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PULL 09/15] accel/tcg: Pass last not end to tb_invalidate_phys_range
  2023-03-28 22:57 [PULL 00/15] tcg patch queue Richard Henderson
                   ` (7 preceding siblings ...)
  2023-03-28 22:57 ` [PULL 08/15] accel/tcg: Pass last not end to tb_invalidate_phys_page_range__locked Richard Henderson
@ 2023-03-28 22:58 ` Richard Henderson
  2023-03-28 22:58 ` [PULL 10/15] linux-user: Pass last not end to probe_guest_base Richard Henderson
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 27+ messages in thread
From: Richard Henderson @ 2023-03-28 22:58 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, Philippe Mathieu-Daudé

Pass the address of the last byte to be changed, rather than
the first address past the last byte.  This avoids overflow
when the last page of the address space is involved.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/exec-all.h   |  2 +-
 accel/tcg/tb-maint.c      | 31 ++++++++++++++++---------------
 accel/tcg/translate-all.c |  2 +-
 accel/tcg/user-exec.c     |  2 +-
 softmmu/physmem.c         |  2 +-
 5 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index ad9eb6067b..ecded1f112 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -678,7 +678,7 @@ void tb_invalidate_phys_addr(target_ulong addr);
 void tb_invalidate_phys_addr(AddressSpace *as, hwaddr addr, MemTxAttrs attrs);
 #endif
 void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr);
-void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t end);
+void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t last);
 void tb_set_jmp_target(TranslationBlock *tb, int n, uintptr_t addr);
 
 /* GETPC is the true target of the return instruction that we'll execute.  */
diff --git a/accel/tcg/tb-maint.c b/accel/tcg/tb-maint.c
index 74823ba464..cb1f806f00 100644
--- a/accel/tcg/tb-maint.c
+++ b/accel/tcg/tb-maint.c
@@ -991,11 +991,10 @@ TranslationBlock *tb_link_page(TranslationBlock *tb, tb_page_addr_t phys_pc,
  * Called with mmap_lock held for user-mode emulation.
  * NOTE: this function must not be called while a TB is running.
  */
-void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t end)
+void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t last)
 {
     TranslationBlock *tb;
     PageForEachNext n;
-    tb_page_addr_t last = end - 1;
 
     assert_memory_lock();
 
@@ -1011,11 +1010,11 @@ void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t end)
  */
 void tb_invalidate_phys_page(tb_page_addr_t addr)
 {
-    tb_page_addr_t start, end;
+    tb_page_addr_t start, last;
 
     start = addr & TARGET_PAGE_MASK;
-    end = start + TARGET_PAGE_SIZE;
-    tb_invalidate_phys_range(start, end);
+    last = addr | ~TARGET_PAGE_MASK;
+    tb_invalidate_phys_range(start, last);
 }
 
 /*
@@ -1169,28 +1168,30 @@ void tb_invalidate_phys_page(tb_page_addr_t addr)
 
 /*
  * Invalidate all TBs which intersect with the target physical address range
- * [start;end[. NOTE: start and end may refer to *different* physical pages.
+ * [start;last]. NOTE: start and end may refer to *different* physical pages.
  * 'is_cpu_write_access' should be true if called from a real cpu write
  * access: the virtual CPU will exit the current TB if code is modified inside
  * this TB.
  */
-void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t end)
+void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t last)
 {
     struct page_collection *pages;
-    tb_page_addr_t next;
+    tb_page_addr_t index, index_last;
 
-    pages = page_collection_lock(start, end - 1);
-    for (next = (start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
-         start < end;
-         start = next, next += TARGET_PAGE_SIZE) {
-        PageDesc *pd = page_find(start >> TARGET_PAGE_BITS);
-        tb_page_addr_t bound = MIN(next, end);
+    pages = page_collection_lock(start, last);
+
+    index_last = last >> TARGET_PAGE_BITS;
+    for (index = start >> TARGET_PAGE_BITS; index <= index_last; index++) {
+        PageDesc *pd = page_find(index);
+        tb_page_addr_t bound;
 
         if (pd == NULL) {
             continue;
         }
         assert_page_locked(pd);
-        tb_invalidate_phys_page_range__locked(pages, pd, start, bound - 1, 0);
+        bound = (index << TARGET_PAGE_BITS) | ~TARGET_PAGE_MASK;
+        bound = MIN(bound, last);
+        tb_invalidate_phys_page_range__locked(pages, pd, start, bound, 0);
     }
     page_collection_unlock(pages);
 }
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 74deb18bd0..5b13281119 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -572,7 +572,7 @@ void tb_check_watchpoint(CPUState *cpu, uintptr_t retaddr)
         cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
         addr = get_page_addr_code(env, pc);
         if (addr != -1) {
-            tb_invalidate_phys_range(addr, addr + 1);
+            tb_invalidate_phys_range(addr, addr);
         }
     }
 }
diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index 20b6fc2f6e..a7e0c3e2f4 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -516,7 +516,7 @@ void page_set_flags(target_ulong start, target_ulong last, int flags)
                                         ~(reset ? 0 : PAGE_STICKY));
     }
     if (inval_tb) {
-        tb_invalidate_phys_range(start, last + 1);
+        tb_invalidate_phys_range(start, last);
     }
 }
 
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index e35061bba4..0e0182d9f2 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -2527,7 +2527,7 @@ static void invalidate_and_set_dirty(MemoryRegion *mr, hwaddr addr,
     }
     if (dirty_log_mask & (1 << DIRTY_MEMORY_CODE)) {
         assert(tcg_enabled());
-        tb_invalidate_phys_range(addr, addr + length);
+        tb_invalidate_phys_range(addr, addr + length - 1);
         dirty_log_mask &= ~(1 << DIRTY_MEMORY_CODE);
     }
     cpu_physical_memory_set_dirty_range(addr, length, dirty_log_mask);
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PULL 10/15] linux-user: Pass last not end to probe_guest_base
  2023-03-28 22:57 [PULL 00/15] tcg patch queue Richard Henderson
                   ` (8 preceding siblings ...)
  2023-03-28 22:58 ` [PULL 09/15] accel/tcg: Pass last not end to tb_invalidate_phys_range Richard Henderson
@ 2023-03-28 22:58 ` Richard Henderson
  2023-03-28 22:58 ` [PULL 11/15] include/exec: Change reserved_va semantics to last byte Richard Henderson
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 27+ messages in thread
From: Richard Henderson @ 2023-03-28 22:58 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, Philippe Mathieu-Daudé

Pass the address of the last byte of the image, rather than
the first address past the last byte.  This avoids overflow
when the last page of the address space is involved.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/user-internals.h | 12 ++++++------
 linux-user/elfload.c        | 24 ++++++++++++------------
 linux-user/flatload.c       |  2 +-
 3 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/linux-user/user-internals.h b/linux-user/user-internals.h
index 9333db4f51..c63ef45fc7 100644
--- a/linux-user/user-internals.h
+++ b/linux-user/user-internals.h
@@ -76,19 +76,19 @@ void fork_end(int child);
 /**
  * probe_guest_base:
  * @image_name: the executable being loaded
- * @loaddr: the lowest fixed address in the executable
- * @hiaddr: the highest fixed address in the executable
+ * @loaddr: the lowest fixed address within the executable
+ * @hiaddr: the highest fixed address within the executable
  *
  * Creates the initial guest address space in the host memory space.
  *
- * If @loaddr == 0, then no address in the executable is fixed,
- * i.e. it is fully relocatable.  In that case @hiaddr is the size
- * of the executable.
+ * If @loaddr == 0, then no address in the executable is fixed, i.e.
+ * it is fully relocatable.  In that case @hiaddr is the size of the
+ * executable minus one.
  *
  * This function will not return if a valid value for guest_base
  * cannot be chosen.  On return, the executable loader can expect
  *
- *    target_mmap(loaddr, hiaddr - loaddr, ...)
+ *    target_mmap(loaddr, hiaddr - loaddr + 1, ...)
  *
  * to succeed.
  */
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index fa4cc41567..dfae967908 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -2504,7 +2504,7 @@ static void pgb_have_guest_base(const char *image_name, abi_ulong guest_loaddr,
         if (guest_hiaddr > reserved_va) {
             error_report("%s: requires more than reserved virtual "
                          "address space (0x%" PRIx64 " > 0x%lx)",
-                         image_name, (uint64_t)guest_hiaddr, reserved_va);
+                         image_name, (uint64_t)guest_hiaddr + 1, reserved_va);
             exit(EXIT_FAILURE);
         }
     } else {
@@ -2512,7 +2512,7 @@ static void pgb_have_guest_base(const char *image_name, abi_ulong guest_loaddr,
         if ((guest_hiaddr - guest_base) > ~(uintptr_t)0) {
             error_report("%s: requires more virtual address space "
                          "than the host can provide (0x%" PRIx64 ")",
-                         image_name, (uint64_t)guest_hiaddr - guest_base);
+                         image_name, (uint64_t)guest_hiaddr + 1 - guest_base);
             exit(EXIT_FAILURE);
         }
 #endif
@@ -2525,18 +2525,18 @@ static void pgb_have_guest_base(const char *image_name, abi_ulong guest_loaddr,
     if (reserved_va) {
         guest_loaddr = (guest_base >= mmap_min_addr ? 0
                         : mmap_min_addr - guest_base);
-        guest_hiaddr = reserved_va;
+        guest_hiaddr = reserved_va - 1;
     }
 
     /* Reserve the address space for the binary, or reserved_va. */
     test = g2h_untagged(guest_loaddr);
-    addr = mmap(test, guest_hiaddr - guest_loaddr, PROT_NONE, flags, -1, 0);
+    addr = mmap(test, guest_hiaddr - guest_loaddr + 1, PROT_NONE, flags, -1, 0);
     if (test != addr) {
         pgb_fail_in_use(image_name);
     }
     qemu_log_mask(CPU_LOG_PAGE,
-                  "%s: base @ %p for " TARGET_ABI_FMT_ld " bytes\n",
-                  __func__, addr, guest_hiaddr - guest_loaddr);
+                  "%s: base @ %p for %" PRIu64 " bytes\n",
+                  __func__, addr, (uint64_t)guest_hiaddr - guest_loaddr + 1);
 }
 
 /**
@@ -2680,7 +2680,7 @@ static void pgb_static(const char *image_name, abi_ulong orig_loaddr,
     if (hiaddr != orig_hiaddr) {
         error_report("%s: requires virtual address space that the "
                      "host cannot provide (0x%" PRIx64 ")",
-                     image_name, (uint64_t)orig_hiaddr);
+                     image_name, (uint64_t)orig_hiaddr + 1);
         exit(EXIT_FAILURE);
     }
 
@@ -2694,7 +2694,7 @@ static void pgb_static(const char *image_name, abi_ulong orig_loaddr,
          * arithmetic wraps around.
          */
         if (sizeof(uintptr_t) == 8 || loaddr >= 0x80000000u) {
-            hiaddr = (uintptr_t) 4 << 30;
+            hiaddr = UINT32_MAX;
         } else {
             offset = -(HI_COMMPAGE & -align);
         }
@@ -2702,7 +2702,7 @@ static void pgb_static(const char *image_name, abi_ulong orig_loaddr,
         loaddr = MIN(loaddr, LO_COMMPAGE & -align);
     }
 
-    addr = pgb_find_hole(loaddr, hiaddr - loaddr, align, offset);
+    addr = pgb_find_hole(loaddr, hiaddr - loaddr + 1, align, offset);
     if (addr == -1) {
         /*
          * If HI_COMMPAGE, there *might* be a non-consecutive allocation
@@ -2755,7 +2755,7 @@ static void pgb_reserved_va(const char *image_name, abi_ulong guest_loaddr,
     if (guest_hiaddr > reserved_va) {
         error_report("%s: requires more than reserved virtual "
                      "address space (0x%" PRIx64 " > 0x%lx)",
-                     image_name, (uint64_t)guest_hiaddr, reserved_va);
+                     image_name, (uint64_t)guest_hiaddr + 1, reserved_va);
         exit(EXIT_FAILURE);
     }
 
@@ -3021,7 +3021,7 @@ static void load_elf_image(const char *image_name, int image_fd,
             if (a < loaddr) {
                 loaddr = a;
             }
-            a = eppnt->p_vaddr + eppnt->p_memsz;
+            a = eppnt->p_vaddr + eppnt->p_memsz - 1;
             if (a > hiaddr) {
                 hiaddr = a;
             }
@@ -3112,7 +3112,7 @@ static void load_elf_image(const char *image_name, int image_fd,
      * In both cases, we will overwrite pages in this range with mappings
      * from the executable.
      */
-    load_addr = target_mmap(loaddr, hiaddr - loaddr, PROT_NONE,
+    load_addr = target_mmap(loaddr, (size_t)hiaddr - loaddr + 1, PROT_NONE,
                             MAP_PRIVATE | MAP_ANON | MAP_NORESERVE |
                             (ehdr->e_type == ET_EXEC ? MAP_FIXED : 0),
                             -1, 0);
diff --git a/linux-user/flatload.c b/linux-user/flatload.c
index e99570ca18..5efec2630e 100644
--- a/linux-user/flatload.c
+++ b/linux-user/flatload.c
@@ -448,7 +448,7 @@ static int load_flat_file(struct linux_binprm * bprm,
      * Allocate the address space.
      */
     probe_guest_base(bprm->filename, 0,
-                     text_len + data_len + extra + indx_len);
+                     text_len + data_len + extra + indx_len - 1);
 
     /*
      * there are a couple of cases here,  the separate code/data
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PULL 11/15] include/exec: Change reserved_va semantics to last byte
  2023-03-28 22:57 [PULL 00/15] tcg patch queue Richard Henderson
                   ` (9 preceding siblings ...)
  2023-03-28 22:58 ` [PULL 10/15] linux-user: Pass last not end to probe_guest_base Richard Henderson
@ 2023-03-28 22:58 ` Richard Henderson
  2023-05-11 11:48   ` Laurent Vivier
  2023-03-28 22:58 ` [PULL 12/15] linux-user/arm: Take more care allocating commpage Richard Henderson
                   ` (5 subsequent siblings)
  16 siblings, 1 reply; 27+ messages in thread
From: Richard Henderson @ 2023-03-28 22:58 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, Philippe Mathieu-Daudé

Change the semantics to be the last byte of the guest va, rather
than the following byte.  This avoids some overflow conditions.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/cpu-all.h      | 11 ++++++++++-
 linux-user/arm/target_cpu.h |  2 +-
 bsd-user/main.c             | 10 +++-------
 bsd-user/mmap.c             |  4 ++--
 linux-user/elfload.c        | 14 +++++++-------
 linux-user/main.c           | 27 +++++++++++++--------------
 linux-user/mmap.c           |  4 ++--
 7 files changed, 38 insertions(+), 34 deletions(-)

diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index 64cb62dc54..090922e4a8 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -152,6 +152,15 @@ static inline void tswap64s(uint64_t *s)
  */
 extern uintptr_t guest_base;
 extern bool have_guest_base;
+
+/*
+ * If non-zero, the guest virtual address space is a contiguous subset
+ * of the host virtual address space, i.e. '-R reserved_va' is in effect
+ * either from the command-line or by default.  The value is the last
+ * byte of the guest address space e.g. UINT32_MAX.
+ *
+ * If zero, the host and guest virtual address spaces are intermingled.
+ */
 extern unsigned long reserved_va;
 
 /*
@@ -171,7 +180,7 @@ extern unsigned long reserved_va;
 #define GUEST_ADDR_MAX_                                                 \
     ((MIN_CONST(TARGET_VIRT_ADDR_SPACE_BITS, TARGET_ABI_BITS) <= 32) ?  \
      UINT32_MAX : ~0ul)
-#define GUEST_ADDR_MAX    (reserved_va ? reserved_va - 1 : GUEST_ADDR_MAX_)
+#define GUEST_ADDR_MAX    (reserved_va ? : GUEST_ADDR_MAX_)
 
 #else
 
diff --git a/linux-user/arm/target_cpu.h b/linux-user/arm/target_cpu.h
index 89ba274cfc..f6383a7cd1 100644
--- a/linux-user/arm/target_cpu.h
+++ b/linux-user/arm/target_cpu.h
@@ -30,7 +30,7 @@ static inline unsigned long arm_max_reserved_va(CPUState *cs)
          * the high addresses.  Restrict linux-user to the
          * cached write-back RAM in the system map.
          */
-        return 0x80000000ul;
+        return 0x7ffffffful;
     } else {
         /*
          * We need to be able to map the commpage.
diff --git a/bsd-user/main.c b/bsd-user/main.c
index 89f225dead..babc3b009b 100644
--- a/bsd-user/main.c
+++ b/bsd-user/main.c
@@ -68,13 +68,9 @@ bool have_guest_base;
 # if HOST_LONG_BITS > TARGET_VIRT_ADDR_SPACE_BITS
 #  if TARGET_VIRT_ADDR_SPACE_BITS == 32 && \
       (TARGET_LONG_BITS == 32 || defined(TARGET_ABI32))
-/*
- * There are a number of places where we assign reserved_va to a variable
- * of type abi_ulong and expect it to fit.  Avoid the last page.
- */
-#   define MAX_RESERVED_VA  (0xfffffffful & TARGET_PAGE_MASK)
+#   define MAX_RESERVED_VA  0xfffffffful
 #  else
-#   define MAX_RESERVED_VA  (1ul << TARGET_VIRT_ADDR_SPACE_BITS)
+#   define MAX_RESERVED_VA  ((1ul << TARGET_VIRT_ADDR_SPACE_BITS) - 1)
 #  endif
 # else
 #  define MAX_RESERVED_VA  0
@@ -466,7 +462,7 @@ int main(int argc, char **argv)
     envlist_free(envlist);
 
     if (reserved_va) {
-            mmap_next_start = reserved_va;
+        mmap_next_start = reserved_va + 1;
     }
 
     {
diff --git a/bsd-user/mmap.c b/bsd-user/mmap.c
index 696057551a..565b9f97ed 100644
--- a/bsd-user/mmap.c
+++ b/bsd-user/mmap.c
@@ -234,7 +234,7 @@ static abi_ulong mmap_find_vma_reserved(abi_ulong start, abi_ulong size,
     size = HOST_PAGE_ALIGN(size) + alignment;
     end_addr = start + size;
     if (end_addr > reserved_va) {
-        end_addr = reserved_va;
+        end_addr = reserved_va + 1;
     }
     addr = end_addr - qemu_host_page_size;
 
@@ -243,7 +243,7 @@ static abi_ulong mmap_find_vma_reserved(abi_ulong start, abi_ulong size,
             if (looped) {
                 return (abi_ulong)-1;
             }
-            end_addr = reserved_va;
+            end_addr = reserved_va + 1;
             addr = end_addr - qemu_host_page_size;
             looped = 1;
             continue;
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index dfae967908..f1370a7a8b 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -208,7 +208,7 @@ static bool init_guest_commpage(void)
      * has specified -R reserved_va, which would trigger an assert().
      */
     if (reserved_va != 0 &&
-        TARGET_VSYSCALL_PAGE + TARGET_PAGE_SIZE >= reserved_va) {
+        TARGET_VSYSCALL_PAGE + TARGET_PAGE_SIZE - 1 > reserved_va) {
         error_report("Cannot allocate vsyscall page");
         exit(EXIT_FAILURE);
     }
@@ -2504,7 +2504,7 @@ static void pgb_have_guest_base(const char *image_name, abi_ulong guest_loaddr,
         if (guest_hiaddr > reserved_va) {
             error_report("%s: requires more than reserved virtual "
                          "address space (0x%" PRIx64 " > 0x%lx)",
-                         image_name, (uint64_t)guest_hiaddr + 1, reserved_va);
+                         image_name, (uint64_t)guest_hiaddr, reserved_va);
             exit(EXIT_FAILURE);
         }
     } else {
@@ -2525,7 +2525,7 @@ static void pgb_have_guest_base(const char *image_name, abi_ulong guest_loaddr,
     if (reserved_va) {
         guest_loaddr = (guest_base >= mmap_min_addr ? 0
                         : mmap_min_addr - guest_base);
-        guest_hiaddr = reserved_va - 1;
+        guest_hiaddr = reserved_va;
     }
 
     /* Reserve the address space for the binary, or reserved_va. */
@@ -2755,7 +2755,7 @@ static void pgb_reserved_va(const char *image_name, abi_ulong guest_loaddr,
     if (guest_hiaddr > reserved_va) {
         error_report("%s: requires more than reserved virtual "
                      "address space (0x%" PRIx64 " > 0x%lx)",
-                     image_name, (uint64_t)guest_hiaddr + 1, reserved_va);
+                     image_name, (uint64_t)guest_hiaddr, reserved_va);
         exit(EXIT_FAILURE);
     }
 
@@ -2768,17 +2768,17 @@ static void pgb_reserved_va(const char *image_name, abi_ulong guest_loaddr,
     /* Reserve the memory on the host. */
     assert(guest_base != 0);
     test = g2h_untagged(0);
-    addr = mmap(test, reserved_va, PROT_NONE, flags, -1, 0);
+    addr = mmap(test, reserved_va + 1, PROT_NONE, flags, -1, 0);
     if (addr == MAP_FAILED || addr != test) {
         error_report("Unable to reserve 0x%lx bytes of virtual address "
                      "space at %p (%s) for use as guest address space (check your "
                      "virtual memory ulimit setting, min_mmap_addr or reserve less "
-                     "using -R option)", reserved_va, test, strerror(errno));
+                     "using -R option)", reserved_va + 1, test, strerror(errno));
         exit(EXIT_FAILURE);
     }
 
     qemu_log_mask(CPU_LOG_PAGE, "%s: base @ %p for %lu bytes\n",
-                  __func__, addr, reserved_va);
+                  __func__, addr, reserved_va + 1);
 }
 
 void probe_guest_base(const char *image_name, abi_ulong guest_loaddr,
diff --git a/linux-user/main.c b/linux-user/main.c
index 39d9bd4d7a..fe03293516 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -109,11 +109,9 @@ static const char *last_log_filename;
 # if HOST_LONG_BITS > TARGET_VIRT_ADDR_SPACE_BITS
 #  if TARGET_VIRT_ADDR_SPACE_BITS == 32 && \
       (TARGET_LONG_BITS == 32 || defined(TARGET_ABI32))
-/* There are a number of places where we assign reserved_va to a variable
-   of type abi_ulong and expect it to fit.  Avoid the last page.  */
-#   define MAX_RESERVED_VA(CPU)  (0xfffffffful & TARGET_PAGE_MASK)
+#   define MAX_RESERVED_VA(CPU)  0xfffffffful
 #  else
-#   define MAX_RESERVED_VA(CPU)  (1ul << TARGET_VIRT_ADDR_SPACE_BITS)
+#   define MAX_RESERVED_VA(CPU)  ((1ul << TARGET_VIRT_ADDR_SPACE_BITS) - 1)
 #  endif
 # else
 #  define MAX_RESERVED_VA(CPU)  0
@@ -379,7 +377,9 @@ static void handle_arg_reserved_va(const char *arg)
 {
     char *p;
     int shift = 0;
-    reserved_va = strtoul(arg, &p, 0);
+    unsigned long val;
+
+    val = strtoul(arg, &p, 0);
     switch (*p) {
     case 'k':
     case 'K':
@@ -393,10 +393,10 @@ static void handle_arg_reserved_va(const char *arg)
         break;
     }
     if (shift) {
-        unsigned long unshifted = reserved_va;
+        unsigned long unshifted = val;
         p++;
-        reserved_va <<= shift;
-        if (reserved_va >> shift != unshifted) {
+        val <<= shift;
+        if (val >> shift != unshifted) {
             fprintf(stderr, "Reserved virtual address too big\n");
             exit(EXIT_FAILURE);
         }
@@ -405,6 +405,8 @@ static void handle_arg_reserved_va(const char *arg)
         fprintf(stderr, "Unrecognised -R size suffix '%s'\n", p);
         exit(EXIT_FAILURE);
     }
+    /* The representation is size - 1, with 0 remaining "default". */
+    reserved_va = val ? val - 1 : 0;
 }
 
 static void handle_arg_singlestep(const char *arg)
@@ -793,7 +795,7 @@ int main(int argc, char **argv, char **envp)
      */
     max_reserved_va = MAX_RESERVED_VA(cpu);
     if (reserved_va != 0) {
-        if (reserved_va % qemu_host_page_size) {
+        if ((reserved_va + 1) % qemu_host_page_size) {
             char *s = size_to_str(qemu_host_page_size);
             fprintf(stderr, "Reserved virtual address not aligned mod %s\n", s);
             g_free(s);
@@ -804,11 +806,8 @@ int main(int argc, char **argv, char **envp)
             exit(EXIT_FAILURE);
         }
     } else if (HOST_LONG_BITS == 64 && TARGET_VIRT_ADDR_SPACE_BITS <= 32) {
-        /*
-         * reserved_va must be aligned with the host page size
-         * as it is used with mmap()
-         */
-        reserved_va = max_reserved_va & qemu_host_page_mask;
+        /* MAX_RESERVED_VA + 1 is a large power of 2, so is aligned. */
+        reserved_va = max_reserved_va;
     }
 
     {
diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index 995146f60d..0aa8ae7356 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -283,7 +283,7 @@ static abi_ulong mmap_find_vma_reserved(abi_ulong start, abi_ulong size,
     end_addr = start + size;
     if (start > reserved_va - size) {
         /* Start at the top of the address space.  */
-        end_addr = ((reserved_va - size) & -align) + size;
+        end_addr = ((reserved_va + 1 - size) & -align) + size;
         looped = true;
     }
 
@@ -297,7 +297,7 @@ static abi_ulong mmap_find_vma_reserved(abi_ulong start, abi_ulong size,
                 return (abi_ulong)-1;
             }
             /* Re-start at the top of the address space.  */
-            addr = end_addr = ((reserved_va - size) & -align) + size;
+            addr = end_addr = ((reserved_va + 1 - size) & -align) + size;
             looped = true;
         } else {
             prot = page_get_flags(addr);
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PULL 12/15] linux-user/arm: Take more care allocating commpage
  2023-03-28 22:57 [PULL 00/15] tcg patch queue Richard Henderson
                   ` (10 preceding siblings ...)
  2023-03-28 22:58 ` [PULL 11/15] include/exec: Change reserved_va semantics to last byte Richard Henderson
@ 2023-03-28 22:58 ` Richard Henderson
  2023-03-28 22:58 ` [PULL 13/15] softmmu: Restrict cpu_check_watchpoint / address_matches to TCG accel Richard Henderson
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 27+ messages in thread
From: Richard Henderson @ 2023-03-28 22:58 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, Philippe Mathieu-Daudé

User setting of -R reserved_va can lead to an assertion
failure in page_set_flags.  Sanity check the value of
reserved_va and print an error message instead.  Do not
allocate a commpage at all for m-profile cpus.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/elfload.c | 37 +++++++++++++++++++++++++++----------
 1 file changed, 27 insertions(+), 10 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index f1370a7a8b..b96b3e566b 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -423,12 +423,32 @@ enum {
 
 static bool init_guest_commpage(void)
 {
-    abi_ptr commpage = HI_COMMPAGE & -qemu_host_page_size;
-    void *want = g2h_untagged(commpage);
-    void *addr = mmap(want, qemu_host_page_size, PROT_READ | PROT_WRITE,
-                      MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0);
+    ARMCPU *cpu = ARM_CPU(thread_cpu);
+    abi_ptr want = HI_COMMPAGE & TARGET_PAGE_MASK;
+    abi_ptr addr;
 
-    if (addr == MAP_FAILED) {
+    /*
+     * M-profile allocates maximum of 2GB address space, so can never
+     * allocate the commpage.  Skip it.
+     */
+    if (arm_feature(&cpu->env, ARM_FEATURE_M)) {
+        return true;
+    }
+
+    /*
+     * If reserved_va does not cover the commpage, we get an assert
+     * in page_set_flags.  Produce an intelligent error instead.
+     */
+    if (reserved_va != 0 && want + TARGET_PAGE_SIZE - 1 > reserved_va) {
+        error_report("Allocating guest commpage: -R 0x%" PRIx64 " too small",
+                     (uint64_t)reserved_va + 1);
+        exit(EXIT_FAILURE);
+    }
+
+    addr = target_mmap(want, TARGET_PAGE_SIZE, PROT_READ | PROT_WRITE,
+                       MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0);
+
+    if (addr == -1) {
         perror("Allocating guest commpage");
         exit(EXIT_FAILURE);
     }
@@ -437,15 +457,12 @@ static bool init_guest_commpage(void)
     }
 
     /* Set kernel helper versions; rest of page is 0.  */
-    __put_user(5, (uint32_t *)g2h_untagged(0xffff0ffcu));
+    put_user_u32(5, 0xffff0ffcu);
 
-    if (mprotect(addr, qemu_host_page_size, PROT_READ)) {
+    if (target_mprotect(addr, qemu_host_page_size, PROT_READ | PROT_EXEC)) {
         perror("Protecting guest commpage");
         exit(EXIT_FAILURE);
     }
-
-    page_set_flags(commpage, commpage | ~qemu_host_page_mask,
-                   PAGE_READ | PAGE_EXEC | PAGE_VALID);
     return true;
 }
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PULL 13/15] softmmu: Restrict cpu_check_watchpoint / address_matches to TCG accel
  2023-03-28 22:57 [PULL 00/15] tcg patch queue Richard Henderson
                   ` (11 preceding siblings ...)
  2023-03-28 22:58 ` [PULL 12/15] linux-user/arm: Take more care allocating commpage Richard Henderson
@ 2023-03-28 22:58 ` Richard Henderson
  2023-03-28 22:58 ` [PULL 14/15] softmmu/watchpoint: Add missing 'qemu/error-report.h' include Richard Henderson
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 27+ messages in thread
From: Richard Henderson @ 2023-03-28 22:58 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, Philippe Mathieu-Daudé

From: Philippe Mathieu-Daudé <philmd@linaro.org>

Both cpu_check_watchpoint() and cpu_watchpoint_address_matches()
are specific to TCG system emulation. Declare them in "tcg-cpu-ops.h"
to be sure accessing them from non-TCG code is a compilation error.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230328173117.15226-2-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/hw/core/cpu.h         | 37 ------------------------------
 include/hw/core/tcg-cpu-ops.h | 43 +++++++++++++++++++++++++++++++++++
 target/arm/tcg/mte_helper.c   |  1 +
 target/arm/tcg/sve_helper.c   |  1 +
 target/s390x/tcg/mem_helper.c |  1 +
 5 files changed, 46 insertions(+), 37 deletions(-)

diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 821e937020..ce312745d5 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -970,17 +970,6 @@ static inline void cpu_watchpoint_remove_by_ref(CPUState *cpu,
 static inline void cpu_watchpoint_remove_all(CPUState *cpu, int mask)
 {
 }
-
-static inline void cpu_check_watchpoint(CPUState *cpu, vaddr addr, vaddr len,
-                                        MemTxAttrs atr, int fl, uintptr_t ra)
-{
-}
-
-static inline int cpu_watchpoint_address_matches(CPUState *cpu,
-                                                 vaddr addr, vaddr len)
-{
-    return 0;
-}
 #else
 int cpu_watchpoint_insert(CPUState *cpu, vaddr addr, vaddr len,
                           int flags, CPUWatchpoint **watchpoint);
@@ -988,32 +977,6 @@ int cpu_watchpoint_remove(CPUState *cpu, vaddr addr,
                           vaddr len, int flags);
 void cpu_watchpoint_remove_by_ref(CPUState *cpu, CPUWatchpoint *watchpoint);
 void cpu_watchpoint_remove_all(CPUState *cpu, int mask);
-
-/**
- * cpu_check_watchpoint:
- * @cpu: cpu context
- * @addr: guest virtual address
- * @len: access length
- * @attrs: memory access attributes
- * @flags: watchpoint access type
- * @ra: unwind return address
- *
- * Check for a watchpoint hit in [addr, addr+len) of the type
- * specified by @flags.  Exit via exception with a hit.
- */
-void cpu_check_watchpoint(CPUState *cpu, vaddr addr, vaddr len,
-                          MemTxAttrs attrs, int flags, uintptr_t ra);
-
-/**
- * cpu_watchpoint_address_matches:
- * @cpu: cpu context
- * @addr: guest virtual address
- * @len: access length
- *
- * Return the watchpoint flags that apply to [addr, addr+len).
- * If no watchpoint is registered for the range, the result is 0.
- */
-int cpu_watchpoint_address_matches(CPUState *cpu, vaddr addr, vaddr len);
 #endif
 
 /**
diff --git a/include/hw/core/tcg-cpu-ops.h b/include/hw/core/tcg-cpu-ops.h
index 20e3c0ffbb..0ae08df47e 100644
--- a/include/hw/core/tcg-cpu-ops.h
+++ b/include/hw/core/tcg-cpu-ops.h
@@ -175,4 +175,47 @@ struct TCGCPUOps {
 
 };
 
+#if defined(CONFIG_USER_ONLY)
+
+static inline void cpu_check_watchpoint(CPUState *cpu, vaddr addr, vaddr len,
+                                        MemTxAttrs atr, int fl, uintptr_t ra)
+{
+}
+
+static inline int cpu_watchpoint_address_matches(CPUState *cpu,
+                                                 vaddr addr, vaddr len)
+{
+    return 0;
+}
+
+#else
+
+/**
+ * cpu_check_watchpoint:
+ * @cpu: cpu context
+ * @addr: guest virtual address
+ * @len: access length
+ * @attrs: memory access attributes
+ * @flags: watchpoint access type
+ * @ra: unwind return address
+ *
+ * Check for a watchpoint hit in [addr, addr+len) of the type
+ * specified by @flags.  Exit via exception with a hit.
+ */
+void cpu_check_watchpoint(CPUState *cpu, vaddr addr, vaddr len,
+                          MemTxAttrs attrs, int flags, uintptr_t ra);
+
+/**
+ * cpu_watchpoint_address_matches:
+ * @cpu: cpu context
+ * @addr: guest virtual address
+ * @len: access length
+ *
+ * Return the watchpoint flags that apply to [addr, addr+len).
+ * If no watchpoint is registered for the range, the result is 0.
+ */
+int cpu_watchpoint_address_matches(CPUState *cpu, vaddr addr, vaddr len);
+
+#endif
+
 #endif /* TCG_CPU_OPS_H */
diff --git a/target/arm/tcg/mte_helper.c b/target/arm/tcg/mte_helper.c
index fee3c7eb96..a4f3f92bc0 100644
--- a/target/arm/tcg/mte_helper.c
+++ b/target/arm/tcg/mte_helper.c
@@ -25,6 +25,7 @@
 #include "exec/ram_addr.h"
 #include "exec/cpu_ldst.h"
 #include "exec/helper-proto.h"
+#include "hw/core/tcg-cpu-ops.h"
 #include "qapi/error.h"
 #include "qemu/guest-random.h"
 
diff --git a/target/arm/tcg/sve_helper.c b/target/arm/tcg/sve_helper.c
index 9a8951afa4..ccf5e5beca 100644
--- a/target/arm/tcg/sve_helper.c
+++ b/target/arm/tcg/sve_helper.c
@@ -27,6 +27,7 @@
 #include "tcg/tcg.h"
 #include "vec_internal.h"
 #include "sve_ldst_internal.h"
+#include "hw/core/tcg-cpu-ops.h"
 
 
 /* Return a value for NZCV as per the ARM PredTest pseudofunction.
diff --git a/target/s390x/tcg/mem_helper.c b/target/s390x/tcg/mem_helper.c
index b93dbd3dad..8b58b8d88d 100644
--- a/target/s390x/tcg/mem_helper.c
+++ b/target/s390x/tcg/mem_helper.c
@@ -26,6 +26,7 @@
 #include "exec/helper-proto.h"
 #include "exec/exec-all.h"
 #include "exec/cpu_ldst.h"
+#include "hw/core/tcg-cpu-ops.h"
 #include "qemu/int128.h"
 #include "qemu/atomic128.h"
 #include "trace.h"
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PULL 14/15] softmmu/watchpoint: Add missing 'qemu/error-report.h' include
  2023-03-28 22:57 [PULL 00/15] tcg patch queue Richard Henderson
                   ` (12 preceding siblings ...)
  2023-03-28 22:58 ` [PULL 13/15] softmmu: Restrict cpu_check_watchpoint / address_matches to TCG accel Richard Henderson
@ 2023-03-28 22:58 ` Richard Henderson
  2023-03-28 22:58 ` [PULL 15/15] softmmu: Restore use of CPU watchpoint for all accelerators Richard Henderson
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 27+ messages in thread
From: Richard Henderson @ 2023-03-28 22:58 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, Philippe Mathieu-Daudé

From: Philippe Mathieu-Daudé <philmd@linaro.org>

cpu_watchpoint_insert() calls error_report() which is declared
in "qemu/error-report.h". When moving this code in commit 2609ec2868
("softmmu: Extract watchpoint API from physmem.c") we neglected to
include this header. This works so far because it is indirectly
included by TCG headers -> "qemu/plugin.h" -> "qemu/error-report.h".

Currently cpu_watchpoint_insert() is only built with the TCG
accelerator. When building it with other ones (or without TCG)
we get:

  softmmu/watchpoint.c:38:9: error: implicit declaration of function 'error_report' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
        error_report("tried to set invalid watchpoint at %"
        ^

Include "qemu/error-report.h" in order to fix this for non-TCG
builds.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230328173117.15226-3-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 softmmu/watchpoint.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/softmmu/watchpoint.c b/softmmu/watchpoint.c
index ad58736787..9d6ae68499 100644
--- a/softmmu/watchpoint.c
+++ b/softmmu/watchpoint.c
@@ -19,6 +19,7 @@
 
 #include "qemu/osdep.h"
 #include "qemu/main-loop.h"
+#include "qemu/error-report.h"
 #include "exec/exec-all.h"
 #include "exec/translate-all.h"
 #include "sysemu/tcg.h"
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PULL 15/15] softmmu: Restore use of CPU watchpoint for all accelerators
  2023-03-28 22:57 [PULL 00/15] tcg patch queue Richard Henderson
                   ` (13 preceding siblings ...)
  2023-03-28 22:58 ` [PULL 14/15] softmmu/watchpoint: Add missing 'qemu/error-report.h' include Richard Henderson
@ 2023-03-28 22:58 ` Richard Henderson
  2023-03-29 13:01 ` [PULL 00/15] tcg patch queue Peter Maydell
  2023-03-30 10:37 ` Joel Stanley
  16 siblings, 0 replies; 27+ messages in thread
From: Richard Henderson @ 2023-03-28 22:58 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, Philippe Mathieu-Daudé

From: Philippe Mathieu-Daudé <philmd@linaro.org>

CPU watchpoints can be use by non-TCG accelerators.

KVM uses them:

  $ git grep CPUWatchpoint|fgrep kvm
  target/arm/kvm64.c:1558:        CPUWatchpoint *wp = find_hw_watchpoint(cs, debug_exit->far);
  target/i386/kvm/kvm.c:5216:static CPUWatchpoint hw_watchpoint;
  target/ppc/kvm.c:443:static CPUWatchpoint hw_watchpoint;
  target/s390x/kvm/kvm.c:139:static CPUWatchpoint hw_watchpoint;

See for example commit e4482ab7e3 ("target-arm: kvm - add support
for HW assisted debug"):

     This adds basic support for HW assisted debug. The ioctl interface
     to KVM allows us to pass an implementation defined number of break
     and watch point registers. [...]

This partially reverts commit 2609ec2868e6c286e755a73b4504714a0296a.

Fixes: 2609ec2868 ("softmmu: Extract watchpoint API from physmem.c")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230328173117.15226-4-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/hw/core/cpu.h | 2 +-
 softmmu/watchpoint.c  | 4 ++++
 softmmu/meson.build   | 2 +-
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index ce312745d5..397fd3ac68 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -949,7 +949,7 @@ static inline bool cpu_breakpoint_test(CPUState *cpu, vaddr pc, int mask)
     return false;
 }
 
-#if !defined(CONFIG_TCG) || defined(CONFIG_USER_ONLY)
+#if defined(CONFIG_USER_ONLY)
 static inline int cpu_watchpoint_insert(CPUState *cpu, vaddr addr, vaddr len,
                                         int flags, CPUWatchpoint **watchpoint)
 {
diff --git a/softmmu/watchpoint.c b/softmmu/watchpoint.c
index 9d6ae68499..5350163385 100644
--- a/softmmu/watchpoint.c
+++ b/softmmu/watchpoint.c
@@ -104,6 +104,8 @@ void cpu_watchpoint_remove_all(CPUState *cpu, int mask)
     }
 }
 
+#ifdef CONFIG_TCG
+
 /*
  * Return true if this watchpoint address matches the specified
  * access (ie the address range covered by the watchpoint overlaps
@@ -220,3 +222,5 @@ void cpu_check_watchpoint(CPUState *cpu, vaddr addr, vaddr len,
         }
     }
 }
+
+#endif /* CONFIG_TCG */
diff --git a/softmmu/meson.build b/softmmu/meson.build
index 0180577517..1a7c7ac089 100644
--- a/softmmu/meson.build
+++ b/softmmu/meson.build
@@ -5,11 +5,11 @@ specific_ss.add(when: 'CONFIG_SOFTMMU', if_true: [files(
   'physmem.c',
   'qtest.c',
   'dirtylimit.c',
+  'watchpoint.c',
 )])
 
 specific_ss.add(when: ['CONFIG_SOFTMMU', 'CONFIG_TCG'], if_true: [files(
   'icount.c',
-  'watchpoint.c',
 )])
 
 softmmu_ss.add(files(
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PULL 00/15] tcg patch queue
  2023-03-28 22:57 [PULL 00/15] tcg patch queue Richard Henderson
                   ` (14 preceding siblings ...)
  2023-03-28 22:58 ` [PULL 15/15] softmmu: Restore use of CPU watchpoint for all accelerators Richard Henderson
@ 2023-03-29 13:01 ` Peter Maydell
  2023-03-30 10:37 ` Joel Stanley
  16 siblings, 0 replies; 27+ messages in thread
From: Peter Maydell @ 2023-03-29 13:01 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel

On Tue, 28 Mar 2023 at 23:58, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> The following changes since commit d37158bb2425e7ebffb167d611be01f1e9e6c86f:
>
>   Update version for v8.0.0-rc2 release (2023-03-28 20:43:21 +0100)
>
> are available in the Git repository at:
>
>   https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20230328
>
> for you to fetch changes up to 87e303de70f93bf700f58412fb9b2c3ec918c4b5:
>
>   softmmu: Restore use of CPU watchpoint for all accelerators (2023-03-28 15:24:06 -0700)
>
> ----------------------------------------------------------------
> Use a local version of GTree [#285]
> Fix page_set_flags vs the last page of the address space [#1528]
> Re-enable gdbstub breakpoints under KVM
>


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/8.0
for any user-visible changes.

-- PMM


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PULL 00/15] tcg patch queue
  2023-03-28 22:57 [PULL 00/15] tcg patch queue Richard Henderson
                   ` (15 preceding siblings ...)
  2023-03-29 13:01 ` [PULL 00/15] tcg patch queue Peter Maydell
@ 2023-03-30 10:37 ` Joel Stanley
  2023-03-31 17:20   ` Richard Henderson
  16 siblings, 1 reply; 27+ messages in thread
From: Joel Stanley @ 2023-03-30 10:37 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, peter.maydell

On Tue, 28 Mar 2023 at 22:59, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> The following changes since commit d37158bb2425e7ebffb167d611be01f1e9e6c86f:
>
>   Update version for v8.0.0-rc2 release (2023-03-28 20:43:21 +0100)
>
> are available in the Git repository at:
>
>   https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20230328
>
> for you to fetch changes up to 87e303de70f93bf700f58412fb9b2c3ec918c4b5:
>
>   softmmu: Restore use of CPU watchpoint for all accelerators (2023-03-28 15:24:06 -0700)
>
> ----------------------------------------------------------------
> Use a local version of GTree [#285]
> Fix page_set_flags vs the last page of the address space [#1528]
> Re-enable gdbstub breakpoints under KVM
>
> ----------------------------------------------------------------
> Emilio Cota (2):
>       util: import GTree as QTree
>       tcg: use QTree instead of GTree
>
> Philippe Mathieu-Daudé (3):
>       softmmu: Restrict cpu_check_watchpoint / address_matches to TCG accel
>       softmmu/watchpoint: Add missing 'qemu/error-report.h' include
>       softmmu: Restore use of CPU watchpoint for all accelerators
>
> Richard Henderson (10):
>       linux-user: Diagnose misaligned -R size
>       accel/tcg: Pass last not end to page_set_flags
>       accel/tcg: Pass last not end to page_reset_target_data
>       accel/tcg: Pass last not end to PAGE_FOR_EACH_TB
>       accel/tcg: Pass last not end to page_collection_lock
>       accel/tcg: Pass last not end to tb_invalidate_phys_page_range__locked
>       accel/tcg: Pass last not end to tb_invalidate_phys_range
>       linux-user: Pass last not end to probe_guest_base
>       include/exec: Change reserved_va semantics to last byte
>       linux-user/arm: Take more care allocating commpage

Thanks for getting these fixes merged.

This last one (4f5c67f8df7f26e559509c68c45e652709edd23f) causes a
regression for me. On ppc64le, qemu-arm now segfaults. If I revert
this one I can run executables without the assert.

The segfault looks like this:

#0  0x00000001001e44fc in stl_he_p (v=5, ptr=0x240450ffc) at
/home/joel/qemu/include/qemu/bswap.h:260
#1  stl_le_p (v=5, ptr=0x240450ffc) at /home/joel/qemu/include/qemu/bswap.h:302
#2  init_guest_commpage () at ../linux-user/elfload.c:460
#3  probe_guest_base (image_name=image_name@entry=0x1003c72e0
<real_exec_path> "/home/joel/hello",
    guest_loaddr=guest_loaddr@entry=65536,
guest_hiaddr=guest_hiaddr@entry=17411743) at
../linux-user/elfload.c:2818
#4  0x00000001001e50d4 in load_elf_image (image_name=0x1003c72e0
<real_exec_path> "/home/joel/hello",
    image_fd=<optimised out>, info=info@entry=0x7fffffffe7e8,
pinterp_name=pinterp_name@entry=0x7fffffffe558,
    bprm_buf=bprm_buf@entry=0x7fffffffe8d0 "\177ELF\001\001\001") at
../linux-user/elfload.c:3108
#5  0x00000001001e5434 in load_elf_binary (bprm=0x7fffffffe8d0,
info=0x7fffffffe7e8) at ../linux-user/elfload.c:3548
#6  0x00000001001e85bc in loader_exec (fdexec=<optimised out>,
filename=<optimised out>, argv=<optimised out>,
    envp=<optimised out>, regs=0x7fffffffe888, infop=0x7fffffffe7e8,
bprm=0x7fffffffe8d0) at ../linux-user/linuxload.c:155
#7  0x0000000100046c7c in main (argc=<optimised out>,
argv=0x7ffffffff1c8, envp=<optimised out>) at ../linux-user/main.c:892

Cheers,

Joel


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PULL 00/15] tcg patch queue
  2023-03-30 10:37 ` Joel Stanley
@ 2023-03-31 17:20   ` Richard Henderson
  0 siblings, 0 replies; 27+ messages in thread
From: Richard Henderson @ 2023-03-31 17:20 UTC (permalink / raw)
  To: Joel Stanley; +Cc: qemu-devel, peter.maydell

On 3/30/23 03:37, Joel Stanley wrote:
> On Tue, 28 Mar 2023 at 22:59, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>> The following changes since commit d37158bb2425e7ebffb167d611be01f1e9e6c86f:
>>
>>    Update version for v8.0.0-rc2 release (2023-03-28 20:43:21 +0100)
>>
>> are available in the Git repository at:
>>
>>    https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20230328
>>
>> for you to fetch changes up to 87e303de70f93bf700f58412fb9b2c3ec918c4b5:
>>
>>    softmmu: Restore use of CPU watchpoint for all accelerators (2023-03-28 15:24:06 -0700)
>>
>> ----------------------------------------------------------------
>> Use a local version of GTree [#285]
>> Fix page_set_flags vs the last page of the address space [#1528]
>> Re-enable gdbstub breakpoints under KVM
>>
>> ----------------------------------------------------------------
>> Emilio Cota (2):
>>        util: import GTree as QTree
>>        tcg: use QTree instead of GTree
>>
>> Philippe Mathieu-Daudé (3):
>>        softmmu: Restrict cpu_check_watchpoint / address_matches to TCG accel
>>        softmmu/watchpoint: Add missing 'qemu/error-report.h' include
>>        softmmu: Restore use of CPU watchpoint for all accelerators
>>
>> Richard Henderson (10):
>>        linux-user: Diagnose misaligned -R size
>>        accel/tcg: Pass last not end to page_set_flags
>>        accel/tcg: Pass last not end to page_reset_target_data
>>        accel/tcg: Pass last not end to PAGE_FOR_EACH_TB
>>        accel/tcg: Pass last not end to page_collection_lock
>>        accel/tcg: Pass last not end to tb_invalidate_phys_page_range__locked
>>        accel/tcg: Pass last not end to tb_invalidate_phys_range
>>        linux-user: Pass last not end to probe_guest_base
>>        include/exec: Change reserved_va semantics to last byte
>>        linux-user/arm: Take more care allocating commpage
> 
> Thanks for getting these fixes merged.
> 
> This last one (4f5c67f8df7f26e559509c68c45e652709edd23f) causes a
> regression for me. On ppc64le, qemu-arm now segfaults. If I revert
> this one I can run executables without the assert.
> 
> The segfault looks like this:
> 
> #0  0x00000001001e44fc in stl_he_p (v=5, ptr=0x240450ffc) at
> /home/joel/qemu/include/qemu/bswap.h:260
> #1  stl_le_p (v=5, ptr=0x240450ffc) at /home/joel/qemu/include/qemu/bswap.h:302
> #2  init_guest_commpage () at ../linux-user/elfload.c:460
> #3  probe_guest_base (image_name=image_name@entry=0x1003c72e0
> <real_exec_path> "/home/joel/hello",
>      guest_loaddr=guest_loaddr@entry=65536,
> guest_hiaddr=guest_hiaddr@entry=17411743) at
> ../linux-user/elfload.c:2818
> #4  0x00000001001e50d4 in load_elf_image (image_name=0x1003c72e0
> <real_exec_path> "/home/joel/hello",
>      image_fd=<optimised out>, info=info@entry=0x7fffffffe7e8,
> pinterp_name=pinterp_name@entry=0x7fffffffe558,
>      bprm_buf=bprm_buf@entry=0x7fffffffe8d0 "\177ELF\001\001\001") at
> ../linux-user/elfload.c:3108
> #5  0x00000001001e5434 in load_elf_binary (bprm=0x7fffffffe8d0,
> info=0x7fffffffe7e8) at ../linux-user/elfload.c:3548
> #6  0x00000001001e85bc in loader_exec (fdexec=<optimised out>,
> filename=<optimised out>, argv=<optimised out>,
>      envp=<optimised out>, regs=0x7fffffffe888, infop=0x7fffffffe7e8,
> bprm=0x7fffffffe8d0) at ../linux-user/linuxload.c:155
> #7  0x0000000100046c7c in main (argc=<optimised out>,
> argv=0x7ffffffff1c8, envp=<optimised out>) at ../linux-user/main.c:892

Gah!  I've exposed the same sort of overflow conditions within target_mmap and friends.  I 
think the only short-term solution for 8.0 is to revert the last patch.


r~



^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PULL 00/15] tcg patch queue
@ 2023-04-23 10:19 Richard Henderson
  2023-04-23 14:55 ` Richard Henderson
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Henderson @ 2023-04-23 10:19 UTC (permalink / raw)
  To: qemu-devel

Merge the first set of reviewed patches from my queue.

r~

The following changes since commit 6dd06214892d71cbbdd25daed7693e58afcb1093:

  Merge tag 'pull-hex-20230421' of https://github.com/quic/qemu into staging (2023-04-22 08:31:38 +0100)

are available in the Git repository at:

  https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20230423

for you to fetch changes up to 3ea9be33400f14305565a9a094cb6031c07183d5:

  tcg/riscv: Conditionalize tcg_out_exts_i32_i64 (2023-04-23 08:46:45 +0100)

----------------------------------------------------------------
tcg cleanups:
  - Remove tcg_abort()
  - Split out extensions as known backend interfaces
  - Put the separate extensions together as tcg_out_movext
  - Introduce tcg_out_xchg as a backend interface
  - Clear TCGLabelQemuLdst on allocation
  - Avoid redundant extensions for riscv

----------------------------------------------------------------
Richard Henderson (15):
      tcg: Replace if + tcg_abort with tcg_debug_assert
      tcg: Replace tcg_abort with g_assert_not_reached
      tcg: Split out tcg_out_ext8s
      tcg: Split out tcg_out_ext8u
      tcg: Split out tcg_out_ext16s
      tcg: Split out tcg_out_ext16u
      tcg: Split out tcg_out_ext32s
      tcg: Split out tcg_out_ext32u
      tcg: Split out tcg_out_exts_i32_i64
      tcg: Split out tcg_out_extu_i32_i64
      tcg: Split out tcg_out_extrl_i64_i32
      tcg: Introduce tcg_out_movext
      tcg: Introduce tcg_out_xchg
      tcg: Clear TCGLabelQemuLdst on allocation
      tcg/riscv: Conditionalize tcg_out_exts_i32_i64

 include/tcg/tcg.h                |   6 --
 target/i386/tcg/translate.c      |  20 +++---
 target/s390x/tcg/translate.c     |   4 +-
 tcg/optimize.c                   |  10 ++-
 tcg/tcg.c                        | 135 +++++++++++++++++++++++++++++++++++----
 tcg/aarch64/tcg-target.c.inc     | 106 +++++++++++++++++++-----------
 tcg/arm/tcg-target.c.inc         |  93 +++++++++++++++++----------
 tcg/i386/tcg-target.c.inc        | 129 ++++++++++++++++++-------------------
 tcg/loongarch64/tcg-target.c.inc | 123 +++++++++++++----------------------
 tcg/mips/tcg-target.c.inc        |  94 +++++++++++++++++++--------
 tcg/ppc/tcg-target.c.inc         | 119 ++++++++++++++++++----------------
 tcg/riscv/tcg-target.c.inc       |  83 +++++++++++-------------
 tcg/s390x/tcg-target.c.inc       | 128 +++++++++++++++++--------------------
 tcg/sparc64/tcg-target.c.inc     | 117 +++++++++++++++++++++------------
 tcg/tcg-ldst.c.inc               |   1 +
 tcg/tci/tcg-target.c.inc         | 116 ++++++++++++++++++++++++++++++---
 16 files changed, 786 insertions(+), 498 deletions(-)


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PULL 00/15] tcg patch queue
  2023-04-23 10:19 Richard Henderson
@ 2023-04-23 14:55 ` Richard Henderson
  0 siblings, 0 replies; 27+ messages in thread
From: Richard Henderson @ 2023-04-23 14:55 UTC (permalink / raw)
  To: qemu-devel

On 4/23/23 11:19, Richard Henderson wrote:
> Merge the first set of reviewed patches from my queue.
> 
> r~
> 
> The following changes since commit 6dd06214892d71cbbdd25daed7693e58afcb1093:
> 
>    Merge tag 'pull-hex-20230421' ofhttps://github.com/quic/qemu  into staging (2023-04-22 08:31:38 +0100)
> 
> are available in the Git repository at:
> 
>    https://gitlab.com/rth7680/qemu.git  tags/pull-tcg-20230423
> 
> for you to fetch changes up to 3ea9be33400f14305565a9a094cb6031c07183d5:
> 
>    tcg/riscv: Conditionalize tcg_out_exts_i32_i64 (2023-04-23 08:46:45 +0100)
> 
> ----------------------------------------------------------------
> tcg cleanups:
>    - Remove tcg_abort()
>    - Split out extensions as known backend interfaces
>    - Put the separate extensions together as tcg_out_movext
>    - Introduce tcg_out_xchg as a backend interface
>    - Clear TCGLabelQemuLdst on allocation
>    - Avoid redundant extensions for riscv

Applied, thanks.  Please update https://wiki.qemu.org/ChangeLog/8.1 as appropriate.


r~



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PULL 11/15] include/exec: Change reserved_va semantics to last byte
  2023-03-28 22:58 ` [PULL 11/15] include/exec: Change reserved_va semantics to last byte Richard Henderson
@ 2023-05-11 11:48   ` Laurent Vivier
  2023-05-11 13:24     ` Richard Henderson
  0 siblings, 1 reply; 27+ messages in thread
From: Laurent Vivier @ 2023-05-11 11:48 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: peter.maydell, Philippe Mathieu-Daudé

On 3/29/23 00:58, Richard Henderson wrote:
> Change the semantics to be the last byte of the guest va, rather
> than the following byte.  This avoids some overflow conditions.
> 
> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   include/exec/cpu-all.h      | 11 ++++++++++-
>   linux-user/arm/target_cpu.h |  2 +-
>   bsd-user/main.c             | 10 +++-------
>   bsd-user/mmap.c             |  4 ++--
>   linux-user/elfload.c        | 14 +++++++-------
>   linux-user/main.c           | 27 +++++++++++++--------------
>   linux-user/mmap.c           |  4 ++--
>   7 files changed, 38 insertions(+), 34 deletions(-)
> 
> diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
> index 64cb62dc54..090922e4a8 100644
> --- a/include/exec/cpu-all.h
> +++ b/include/exec/cpu-all.h
> @@ -152,6 +152,15 @@ static inline void tswap64s(uint64_t *s)
>    */
>   extern uintptr_t guest_base;
>   extern bool have_guest_base;
> +
> +/*
> + * If non-zero, the guest virtual address space is a contiguous subset
> + * of the host virtual address space, i.e. '-R reserved_va' is in effect
> + * either from the command-line or by default.  The value is the last
> + * byte of the guest address space e.g. UINT32_MAX.
> + *
> + * If zero, the host and guest virtual address spaces are intermingled.
> + */
>   extern unsigned long reserved_va;
>   
>   /*
> @@ -171,7 +180,7 @@ extern unsigned long reserved_va;
>   #define GUEST_ADDR_MAX_                                                 \
>       ((MIN_CONST(TARGET_VIRT_ADDR_SPACE_BITS, TARGET_ABI_BITS) <= 32) ?  \
>        UINT32_MAX : ~0ul)
> -#define GUEST_ADDR_MAX    (reserved_va ? reserved_va - 1 : GUEST_ADDR_MAX_)
> +#define GUEST_ADDR_MAX    (reserved_va ? : GUEST_ADDR_MAX_)
>   
>   #else
>   
> diff --git a/linux-user/arm/target_cpu.h b/linux-user/arm/target_cpu.h
> index 89ba274cfc..f6383a7cd1 100644
> --- a/linux-user/arm/target_cpu.h
> +++ b/linux-user/arm/target_cpu.h
> @@ -30,7 +30,7 @@ static inline unsigned long arm_max_reserved_va(CPUState *cs)
>            * the high addresses.  Restrict linux-user to the
>            * cached write-back RAM in the system map.
>            */
> -        return 0x80000000ul;
> +        return 0x7ffffffful;
>       } else {
>           /*
>            * We need to be able to map the commpage.
> diff --git a/bsd-user/main.c b/bsd-user/main.c
> index 89f225dead..babc3b009b 100644
> --- a/bsd-user/main.c
> +++ b/bsd-user/main.c
> @@ -68,13 +68,9 @@ bool have_guest_base;
>   # if HOST_LONG_BITS > TARGET_VIRT_ADDR_SPACE_BITS
>   #  if TARGET_VIRT_ADDR_SPACE_BITS == 32 && \
>         (TARGET_LONG_BITS == 32 || defined(TARGET_ABI32))
> -/*
> - * There are a number of places where we assign reserved_va to a variable
> - * of type abi_ulong and expect it to fit.  Avoid the last page.
> - */
> -#   define MAX_RESERVED_VA  (0xfffffffful & TARGET_PAGE_MASK)
> +#   define MAX_RESERVED_VA  0xfffffffful
>   #  else
> -#   define MAX_RESERVED_VA  (1ul << TARGET_VIRT_ADDR_SPACE_BITS)
> +#   define MAX_RESERVED_VA  ((1ul << TARGET_VIRT_ADDR_SPACE_BITS) - 1)
>   #  endif
>   # else
>   #  define MAX_RESERVED_VA  0
> @@ -466,7 +462,7 @@ int main(int argc, char **argv)
>       envlist_free(envlist);
>   
>       if (reserved_va) {
> -            mmap_next_start = reserved_va;
> +        mmap_next_start = reserved_va + 1;
>       }
>   
>       {
> diff --git a/bsd-user/mmap.c b/bsd-user/mmap.c
> index 696057551a..565b9f97ed 100644
> --- a/bsd-user/mmap.c
> +++ b/bsd-user/mmap.c
> @@ -234,7 +234,7 @@ static abi_ulong mmap_find_vma_reserved(abi_ulong start, abi_ulong size,
>       size = HOST_PAGE_ALIGN(size) + alignment;
>       end_addr = start + size;
>       if (end_addr > reserved_va) {
> -        end_addr = reserved_va;
> +        end_addr = reserved_va + 1;
>       }
>       addr = end_addr - qemu_host_page_size;
>   
> @@ -243,7 +243,7 @@ static abi_ulong mmap_find_vma_reserved(abi_ulong start, abi_ulong size,
>               if (looped) {
>                   return (abi_ulong)-1;
>               }
> -            end_addr = reserved_va;
> +            end_addr = reserved_va + 1;
>               addr = end_addr - qemu_host_page_size;
>               looped = 1;
>               continue;
> diff --git a/linux-user/elfload.c b/linux-user/elfload.c
> index dfae967908..f1370a7a8b 100644
> --- a/linux-user/elfload.c
> +++ b/linux-user/elfload.c
> @@ -208,7 +208,7 @@ static bool init_guest_commpage(void)
>        * has specified -R reserved_va, which would trigger an assert().
>        */
>       if (reserved_va != 0 &&
> -        TARGET_VSYSCALL_PAGE + TARGET_PAGE_SIZE >= reserved_va) {
> +        TARGET_VSYSCALL_PAGE + TARGET_PAGE_SIZE - 1 > reserved_va) {
>           error_report("Cannot allocate vsyscall page");
>           exit(EXIT_FAILURE);
>       }
> @@ -2504,7 +2504,7 @@ static void pgb_have_guest_base(const char *image_name, abi_ulong guest_loaddr,
>           if (guest_hiaddr > reserved_va) {
>               error_report("%s: requires more than reserved virtual "
>                            "address space (0x%" PRIx64 " > 0x%lx)",
> -                         image_name, (uint64_t)guest_hiaddr + 1, reserved_va);
> +                         image_name, (uint64_t)guest_hiaddr, reserved_va);
>               exit(EXIT_FAILURE);
>           }
>       } else {
> @@ -2525,7 +2525,7 @@ static void pgb_have_guest_base(const char *image_name, abi_ulong guest_loaddr,
>       if (reserved_va) {
>           guest_loaddr = (guest_base >= mmap_min_addr ? 0
>                           : mmap_min_addr - guest_base);
> -        guest_hiaddr = reserved_va - 1;
> +        guest_hiaddr = reserved_va;
>       }
>   
>       /* Reserve the address space for the binary, or reserved_va. */
> @@ -2755,7 +2755,7 @@ static void pgb_reserved_va(const char *image_name, abi_ulong guest_loaddr,
>       if (guest_hiaddr > reserved_va) {
>           error_report("%s: requires more than reserved virtual "
>                        "address space (0x%" PRIx64 " > 0x%lx)",
> -                     image_name, (uint64_t)guest_hiaddr + 1, reserved_va);
> +                     image_name, (uint64_t)guest_hiaddr, reserved_va);
>           exit(EXIT_FAILURE);
>       }
>   
> @@ -2768,17 +2768,17 @@ static void pgb_reserved_va(const char *image_name, abi_ulong guest_loaddr,
>       /* Reserve the memory on the host. */
>       assert(guest_base != 0);
>       test = g2h_untagged(0);
> -    addr = mmap(test, reserved_va, PROT_NONE, flags, -1, 0);
> +    addr = mmap(test, reserved_va + 1, PROT_NONE, flags, -1, 0);
>       if (addr == MAP_FAILED || addr != test) {
>           error_report("Unable to reserve 0x%lx bytes of virtual address "
>                        "space at %p (%s) for use as guest address space (check your "
>                        "virtual memory ulimit setting, min_mmap_addr or reserve less "
> -                     "using -R option)", reserved_va, test, strerror(errno));
> +                     "using -R option)", reserved_va + 1, test, strerror(errno));
>           exit(EXIT_FAILURE);
>       }
>   
>       qemu_log_mask(CPU_LOG_PAGE, "%s: base @ %p for %lu bytes\n",
> -                  __func__, addr, reserved_va);
> +                  __func__, addr, reserved_va + 1);
>   }
>   
>   void probe_guest_base(const char *image_name, abi_ulong guest_loaddr,
> diff --git a/linux-user/main.c b/linux-user/main.c
> index 39d9bd4d7a..fe03293516 100644
> --- a/linux-user/main.c
> +++ b/linux-user/main.c
> @@ -109,11 +109,9 @@ static const char *last_log_filename;
>   # if HOST_LONG_BITS > TARGET_VIRT_ADDR_SPACE_BITS
>   #  if TARGET_VIRT_ADDR_SPACE_BITS == 32 && \
>         (TARGET_LONG_BITS == 32 || defined(TARGET_ABI32))
> -/* There are a number of places where we assign reserved_va to a variable
> -   of type abi_ulong and expect it to fit.  Avoid the last page.  */
> -#   define MAX_RESERVED_VA(CPU)  (0xfffffffful & TARGET_PAGE_MASK)
> +#   define MAX_RESERVED_VA(CPU)  0xfffffffful
>   #  else
> -#   define MAX_RESERVED_VA(CPU)  (1ul << TARGET_VIRT_ADDR_SPACE_BITS)
> +#   define MAX_RESERVED_VA(CPU)  ((1ul << TARGET_VIRT_ADDR_SPACE_BITS) - 1)
>   #  endif
>   # else
>   #  define MAX_RESERVED_VA(CPU)  0
> @@ -379,7 +377,9 @@ static void handle_arg_reserved_va(const char *arg)
>   {
>       char *p;
>       int shift = 0;
> -    reserved_va = strtoul(arg, &p, 0);
> +    unsigned long val;
> +
> +    val = strtoul(arg, &p, 0);
>       switch (*p) {
>       case 'k':
>       case 'K':
> @@ -393,10 +393,10 @@ static void handle_arg_reserved_va(const char *arg)
>           break;
>       }
>       if (shift) {
> -        unsigned long unshifted = reserved_va;
> +        unsigned long unshifted = val;
>           p++;
> -        reserved_va <<= shift;
> -        if (reserved_va >> shift != unshifted) {
> +        val <<= shift;
> +        if (val >> shift != unshifted) {
>               fprintf(stderr, "Reserved virtual address too big\n");
>               exit(EXIT_FAILURE);
>           }
> @@ -405,6 +405,8 @@ static void handle_arg_reserved_va(const char *arg)
>           fprintf(stderr, "Unrecognised -R size suffix '%s'\n", p);
>           exit(EXIT_FAILURE);
>       }
> +    /* The representation is size - 1, with 0 remaining "default". */
> +    reserved_va = val ? val - 1 : 0;
>   }
>   
>   static void handle_arg_singlestep(const char *arg)
> @@ -793,7 +795,7 @@ int main(int argc, char **argv, char **envp)
>        */
>       max_reserved_va = MAX_RESERVED_VA(cpu);
>       if (reserved_va != 0) {
> -        if (reserved_va % qemu_host_page_size) {
> +        if ((reserved_va + 1) % qemu_host_page_size) {
>               char *s = size_to_str(qemu_host_page_size);
>               fprintf(stderr, "Reserved virtual address not aligned mod %s\n", s);
>               g_free(s);
> @@ -804,11 +806,8 @@ int main(int argc, char **argv, char **envp)
>               exit(EXIT_FAILURE);
>           }
>       } else if (HOST_LONG_BITS == 64 && TARGET_VIRT_ADDR_SPACE_BITS <= 32) {
> -        /*
> -         * reserved_va must be aligned with the host page size
> -         * as it is used with mmap()
> -         */
> -        reserved_va = max_reserved_va & qemu_host_page_mask;
> +        /* MAX_RESERVED_VA + 1 is a large power of 2, so is aligned. */
> +        reserved_va = max_reserved_va;
>       }
>   
>       {
> diff --git a/linux-user/mmap.c b/linux-user/mmap.c
> index 995146f60d..0aa8ae7356 100644
> --- a/linux-user/mmap.c
> +++ b/linux-user/mmap.c
> @@ -283,7 +283,7 @@ static abi_ulong mmap_find_vma_reserved(abi_ulong start, abi_ulong size,
>       end_addr = start + size;
>       if (start > reserved_va - size) {
>           /* Start at the top of the address space.  */
> -        end_addr = ((reserved_va - size) & -align) + size;
> +        end_addr = ((reserved_va + 1 - size) & -align) + size;
>           looped = true;
>       }
>   
> @@ -297,7 +297,7 @@ static abi_ulong mmap_find_vma_reserved(abi_ulong start, abi_ulong size,
>                   return (abi_ulong)-1;
>               }
>               /* Re-start at the top of the address space.  */
> -            addr = end_addr = ((reserved_va - size) & -align) + size;
> +            addr = end_addr = ((reserved_va + 1 - size) & -align) + size;
>               looped = true;
>           } else {
>               prot = page_get_flags(addr);

This patch breaks something.

In LTP (20230127), fcntl36 fails now (all archs):

sudo unshare --time --ipc --uts --pid --fork --kill-child --mount --mount-proc --root 
chroot/m68k/sid
# /opt/ltp/testcases/bin/fcntl36

tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
fcntl36.c:288: TINFO: OFD read lock vs OFD write lock
tst_kernel.c:87: TINFO: uname.machine=m68k kernel is 32bit
fcntl36.c:366: TPASS: Access between threads synchronized
fcntl36.c:288: TINFO: OFD write lock vs POSIX write lock
fcntl36.c:318: TBROK: pthread_create(0x40800330,(nil),0x80004328,0x40800068) failed: 
EAGAIN/EWOULDBLOCK

Expected result:

tst_test.c:1250: TINFO: Timeout per run is 0h 05m 00s
fcntl36.c:289: TINFO: OFD read lock vs OFD write lock
fcntl36.c:367: TPASS: Access between threads synchronized
fcntl36.c:289: TINFO: OFD write lock vs POSIX write lock
fcntl36.c:367: TPASS: Access between threads synchronized
fcntl36.c:289: TINFO: OFD read lock vs POSIX write lock
fcntl36.c:367: TPASS: Access between threads synchronized
fcntl36.c:289: TINFO: OFD write lock vs POSIX read lock
fcntl36.c:367: TPASS: Access between threads synchronized
fcntl36.c:289: TINFO: OFD write lock vs OFD write lock
fcntl36.c:367: TPASS: Access between threads synchronized
fcntl36.c:289: TINFO: OFD r/w lock vs POSIX write lock
fcntl36.c:367: TPASS: Access between threads synchronized
fcntl36.c:289: TINFO: OFD r/w lock vs POSIX read lock
fcntl36.c:367: TPASS: Access between threads synchronized

Thanks,
Laurent



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PULL 11/15] include/exec: Change reserved_va semantics to last byte
  2023-05-11 11:48   ` Laurent Vivier
@ 2023-05-11 13:24     ` Richard Henderson
  0 siblings, 0 replies; 27+ messages in thread
From: Richard Henderson @ 2023-05-11 13:24 UTC (permalink / raw)
  To: Laurent Vivier, qemu-devel; +Cc: peter.maydell, Philippe Mathieu-Daudé

On 5/11/23 12:48, Laurent Vivier wrote:
> This patch breaks something.
> 
> In LTP (20230127), fcntl36 fails now (all archs):
> 
> sudo unshare --time --ipc --uts --pid --fork --kill-child --mount --mount-proc --root 
> chroot/m68k/sid
> # /opt/ltp/testcases/bin/fcntl36
> 
> tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
> fcntl36.c:288: TINFO: OFD read lock vs OFD write lock
> tst_kernel.c:87: TINFO: uname.machine=m68k kernel is 32bit
> fcntl36.c:366: TPASS: Access between threads synchronized
> fcntl36.c:288: TINFO: OFD write lock vs POSIX write lock
> fcntl36.c:318: TBROK: pthread_create(0x40800330,(nil),0x80004328,0x40800068) failed: 
> EAGAIN/EWOULDBLOCK

Any idea where the failure is coming from?
I don't see a matching syscall in -strace...


r~


^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2023-05-11 13:25 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-03-28 22:57 [PULL 00/15] tcg patch queue Richard Henderson
2023-03-28 22:57 ` [PULL 01/15] util: import GTree as QTree Richard Henderson
2023-03-28 22:57 ` [PULL 02/15] tcg: use QTree instead of GTree Richard Henderson
2023-03-28 22:57 ` [PULL 03/15] linux-user: Diagnose misaligned -R size Richard Henderson
2023-03-28 22:57 ` [PULL 04/15] accel/tcg: Pass last not end to page_set_flags Richard Henderson
2023-03-28 22:57 ` [PULL 05/15] accel/tcg: Pass last not end to page_reset_target_data Richard Henderson
2023-03-28 22:57 ` [PULL 06/15] accel/tcg: Pass last not end to PAGE_FOR_EACH_TB Richard Henderson
2023-03-28 22:57 ` [PULL 07/15] accel/tcg: Pass last not end to page_collection_lock Richard Henderson
2023-03-28 22:57 ` [PULL 08/15] accel/tcg: Pass last not end to tb_invalidate_phys_page_range__locked Richard Henderson
2023-03-28 22:58 ` [PULL 09/15] accel/tcg: Pass last not end to tb_invalidate_phys_range Richard Henderson
2023-03-28 22:58 ` [PULL 10/15] linux-user: Pass last not end to probe_guest_base Richard Henderson
2023-03-28 22:58 ` [PULL 11/15] include/exec: Change reserved_va semantics to last byte Richard Henderson
2023-05-11 11:48   ` Laurent Vivier
2023-05-11 13:24     ` Richard Henderson
2023-03-28 22:58 ` [PULL 12/15] linux-user/arm: Take more care allocating commpage Richard Henderson
2023-03-28 22:58 ` [PULL 13/15] softmmu: Restrict cpu_check_watchpoint / address_matches to TCG accel Richard Henderson
2023-03-28 22:58 ` [PULL 14/15] softmmu/watchpoint: Add missing 'qemu/error-report.h' include Richard Henderson
2023-03-28 22:58 ` [PULL 15/15] softmmu: Restore use of CPU watchpoint for all accelerators Richard Henderson
2023-03-29 13:01 ` [PULL 00/15] tcg patch queue Peter Maydell
2023-03-30 10:37 ` Joel Stanley
2023-03-31 17:20   ` Richard Henderson
  -- strict thread matches above, loose matches on Subject: below --
2023-04-23 10:19 Richard Henderson
2023-04-23 14:55 ` Richard Henderson
2021-10-13 18:22 Richard Henderson
2021-10-13 19:55 ` Richard Henderson
2021-06-04 20:11 Richard Henderson
2021-06-05 15:11 ` Peter Maydell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).