qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v12 0/6] Specifying cache topology on ARM
@ 2025-06-04 13:34 Alireza Sanaee via
  2025-06-04 13:34 ` [PATCH v12 1/6] target/arm/tcg: increase cache level for cpu=max Alireza Sanaee via
                   ` (5 more replies)
  0 siblings, 6 replies; 14+ messages in thread
From: Alireza Sanaee via @ 2025-06-04 13:34 UTC (permalink / raw)
  To: mst
  Cc: anisinha, armbru, berrange, dapeng1.mi, eric.auger, farman,
	gustavo.romero, imammedo, jiangkunkun, jonathan.cameron, linuxarm,
	mtosatti, peter.maydell, philmd, qemu-arm, qemu-devel,
	richard.henderson, shameerali.kolothum.thodi, shannon.zhaosl,
	yangyicong, zhao1.liu, maobibo

Specifying the cache layout in virtual machines is useful for
applications and operating systems to fetch accurate information about
the cache structure and make appropriate adjustments. Enforcing correct
sharing information can lead to better optimizations. Patches that allow
for an interface to express caches was landed in the prior cycles. This
patchset uses the interface as a foundation.  Thus, the device tree and
ACPI/PPTT table, and device tree are populated based on
user-provided information and CPU topology.

Example:


+----------------+                            +----------------+
|    Socket 0    |                            |    Socket 1    |
|    (L3 Cache)  |                            |    (L3 Cache)  |
+--------+-------+                            +--------+-------+
         |                                             |
+--------+--------+                            +--------+--------+
|   Cluster 0     |                            |   Cluster 0     |
|   (L2 Cache)    |                            |   (L2 Cache)    |
+--------+--------+                            +--------+--------+
         |                                             |
+--------+--------+  +--------+--------+    +--------+--------+  +--------+----+
|   Core 0         | |   Core 1        |    |   Core 0        |  |   Core 1    |
|   (L1i, L1d)     | |   (L1i, L1d)    |    |   (L1i, L1d)    |  |   (L1i, L1d)|
+--------+--------+  +--------+--------+    +--------+--------+  +--------+----+
         |                   |                       |                   |
+--------+              +--------+              +--------+          +--------+
|Thread 0|              |Thread 1|              |Thread 1|          |Thread 0|
+--------+              +--------+              +--------+          +--------+
|Thread 1|              |Thread 0|              |Thread 0|          |Thread 1|
+--------+              +--------+              +--------+          +--------+


The following command will represent the system relying on **ACPI PPTT tables**.

./qemu-system-aarch64 \
 -machine virt,smp-cache.0.cache=l1i,smp-cache.0.topology=core,smp-cache.1.cache=l1d,smp-cache.1.topology=core,smp-cache.2.cache=l2,smp-cache.2.topology=cluseter,smp-cache.3.cache=l3,smp-cache.3.topology=socket \
 -cpu max \
 -m 2048 \
 -smp sockets=2,clusters=1,cores=2,threads=2 \
 -kernel ./Image.gz \
 -append "console=ttyAMA0 root=/dev/ram rdinit=/init acpi=force" \
 -initrd rootfs.cpio.gz \
 -bios ./edk2-aarch64-code.fd \
 -nographic

The following command will represent the system relying on **the device tree**.

./qemu-system-aarch64 \
 -machine virt,acpi=off,smp-cache.0.cache=l1i,smp-cache.0.topology=core,smp-cache.1.cache=l1d,smp-cache.1.topology=core,smp-cache.2.cache=l2,smp-cache.2.topology=cluseter,smp-cache.3.cache=l3,smp-cache.3.topology=socket \
 -cpu max \
 -m 2048 \
 -smp sockets=2,clusters=1,cores=2,threads=2 \
 -kernel ./Image.gz \
 -append "console=ttyAMA0 root=/dev/ram rdinit=/init acpi=off" \
 -initrd rootfs.cpio.gz \
 -nographic

Failure cases:
    1) There are scenarios where caches exist in systems' registers but
    left unspecified by users. In this case qemu returns failure.

    2) SMT threads cannot share caches which is not very common. More
    discussions here [1].

Currently only three levels of caches are supported to be specified from
the command line. However, increasing the value does not require
significant changes. Further, this patch assumes l2 and l3 unified
caches and does not allow l(2/3)(i/d). The level terminology is
thread/core/cluster/socket right now. Hierarchy assumed in this patch:
Socket level = Cluster level + 1 = Core level + 2 = Thread level + 3;

TODO:
  1) Making the code to work with arbitrary levels
  2) Separated data and instruction cache at L2 and L3.
  3) Additional cache controls.  e.g. size of L3 may not want to just
  match the underlying system, because only some of the associated host
  CPUs may be bound to this VM.

[1] https://lore.kernel.org/devicetree-spec/20250203120527.3534-1-alireza.sanaee@huawei.com/

Change Log:
  v11->v12:
   * Patch #4 couldn't not merge properly as the main file diverged. Now it is fixed (hopefully).
   * Loonarch build_pptt function updated.
   * Rebased on 09be8a511a2e278b45729d7b065d30c68dd699d0.

  v10->v11:
   * Fix some coding style issues.
   * Rename some variables.

  v9->v10:
   * PPTT rev down to 2.

  v8->v9:
   * rebase to 10
   * Fixed a bug in device-tree generation related to a scenario when
        caches are shared at core in higher levels than 1.
  v7->v8:
   * rebase: Merge tag 'pull-nbd-2024-08-26' of https://repo.or.cz/qemu/ericb into staging
   * I mis-included a file in patch #4 and I removed it in this one.

  v6->v7:
   * Intel stuff got pulled up, so rebase.
   * added some discussions on device tree.

  v5->v6:
   * Minor bug fix.
   * rebase based on new Intel patchset.
     - https://lore.kernel.org/qemu-devel/20250110145115.1574345-1-zhao1.liu@intel.com/

  v4->v5:
    * Added Reviewed-by tags.
    * Applied some comments.

  v3->v4:
    * Device tree added.

Depends-on: Building PPTT with root node and identical implementation flag
Depends-on: Msg-id: 20250604115233.1234-1-alireza.sanaee@huawei.com

Alireza Sanaee (6):
  target/arm/tcg: increase cache level for cpu=max
  arm/virt.c: add cache hierarchy to device tree
  bios-tables-test: prepare to change ARM ACPI virt PPTT
  hw/acpi: add cache hierarchy to pptt table
  tests/qtest/bios-table-test: testing new ARM ACPI PPTT topology
  Update the ACPI tables based on new aml-build.c

 hw/acpi/aml-build.c                        | 222 ++++++++++++-
 hw/arm/virt-acpi-build.c                   |   8 +-
 hw/arm/virt.c                              | 359 ++++++++++++++++++++-
 hw/cpu/core.c                              |  92 ++++++
 hw/loongarch/virt-acpi-build.c             |   2 +-
 include/hw/acpi/aml-build.h                |   4 +-
 include/hw/arm/virt.h                      |   7 +-
 include/hw/cpu/core.h                      |  27 ++
 target/arm/tcg/cpu64.c                     |  13 +
 tests/data/acpi/aarch64/virt/PPTT.topology | Bin 356 -> 540 bytes
 tests/qtest/bios-tables-test.c             |   4 +
 11 files changed, 725 insertions(+), 13 deletions(-)

-- 
2.43.0



^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2025-06-11 11:42 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-04 13:34 [PATCH v12 0/6] Specifying cache topology on ARM Alireza Sanaee via
2025-06-04 13:34 ` [PATCH v12 1/6] target/arm/tcg: increase cache level for cpu=max Alireza Sanaee via
2025-06-04 13:34 ` [PATCH v12 2/6] arm/virt.c: add cache hierarchy to device tree Alireza Sanaee via
2025-06-05  9:44   ` Zhao Liu
2025-06-05  9:40     ` Alireza Sanaee via
2025-06-05 13:58       ` Zhao Liu
2025-06-11 11:38     ` Alireza Sanaee via
2025-06-04 13:34 ` [PATCH v12 3/6] bios-tables-test: prepare to change ARM ACPI virt PPTT Alireza Sanaee via
2025-06-05  8:29   ` Zhao Liu
2025-06-04 13:34 ` [PATCH v12 4/6] hw/acpi: add cache hierarchy to pptt table Alireza Sanaee via
2025-06-04 13:34 ` [PATCH v12 5/6] tests/qtest/bios-table-test: testing new ARM ACPI PPTT topology Alireza Sanaee via
2025-06-05  9:46   ` Zhao Liu
2025-06-04 13:34 ` [PATCH v12 6/6] Update the ACPI tables based on new aml-build.c Alireza Sanaee via
2025-06-05  9:47   ` Zhao Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).