qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [RFC 0/8] Introduce SMP Cache Topology
@ 2024-02-20  9:24 Zhao Liu
  2024-02-20  9:24 ` [RFC 1/8] hw/core: Rename CpuTopology to CPUTopology Zhao Liu
                   ` (8 more replies)
  0 siblings, 9 replies; 26+ messages in thread
From: Zhao Liu @ 2024-02-20  9:24 UTC (permalink / raw)
  To: Daniel P . Berrangé, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Markus Armbruster,
	Marcelo Tosatti, Alex Bennée, Peter Maydell,
	Jonathan Cameron, Sia Jee Heng
  Cc: qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Yongwei Ma, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

Hi list,

This's our proposal for supporting (SMP) cache topology in -smp as
the following example:

-smp 32,sockets=2,dies=2,modules=2,cores=2,threads=2,maxcpus=32,\
     l1d-cache=core,l1i-cache=core,l2-cache=core,l3-cache=die

With the new cache topology options ("l1d-cache", "l1i-cache",
"l2-cache" and "l3-cache"), we could adjust the cache topology via -smp.

This patch set is rebased on our i386 module series:
https://lore.kernel.org/qemu-devel/20240131101350.109512-1-zhao1.liu@linux.intel.com/

Since the ARM [1] and RISC-V [2] folks have similar needs for the cache
topology, I also cc'd the ARM and RISC-V folks and lists.


Welcome your feedback!


Introduction
============

Background
----------

Intel client platforms (ADL/RPL/MTL) and E core server platforms (SRF)
share the L2 cache domain among multiple E cores (in the same module).

Thus we need a way to adjust the cache topology so that users could
create the cache topology for Guest that is nearly identical to Host.

This is necessary in cases where there are bound vCPUs, especially
considering that Guest scheduling often takes into account the cache
topology as well (e.g. Linux cluster aware scheduling, i.e. L2 cache
scheduling).

Previously, we introduced a x86 specific option to adjust the cache
topology:

-cpu x-l2-cache-topo=[core|module] [3]

However, considering the needs of other arches, we re-implemented the
generic cache topology (aslo in response to Michael's [4] and Daniel's
comment [5]) in this series.


Cache Topology Representation
-----------------------------

We consider to define the cache topology based on CPU topology level for
two reasons:

1. In practice, a cache will always be bound to the CPU container -
   "CPU container" indicates to a set of CPUs that refer to a certain
   level of CPU topology - where the cache is either private in that
   CPU container or shared among multiple containers.

2. The x86's cache-related CPUIDs encode cache topology based on APIC
   ID's CPU topology layout. And the ACPI PPTT table that ARM/RISCV
   relies on also requires CPU containers (CPU topology) to help
   indicate the private shared hierarchy of the cache.

Therefore, for SMP systems, it is natural to use the CPU topology
hierarchy directly in QEMU to define the cache topology.

And currently, separated L1 cache (L1 data cache and L1 instruction
cache) with unified higher-level caches (e.g., unified L2 and L3
caches), is the most common cache architectures.

Thus, we define the topology for L1 D-cache, L1 I-cache, L2 cache and L3
cache in MachineState as the basic cache topology support:

typedef struct CacheTopology {
    CPUTopoLevel l1d;
    CPUTopoLevel l1i;
    CPUTopoLevel l2;
    CPUTopoLevel l3;
} CacheTopology;

Machines may also only support a subset of the cache topology
to be configured in -smp by setting the SMP property of MachineClass:

typedef struct {
    ...
    bool l1_separated_cache_supported;
    bool l2_unified_cache_supported;
    bool l3_unified_cache_supported;
} SMPCompatProps;


Cache Topology Configuration in -smp
------------------------------------

Further, we add new parameters to -smp:
* l1d-cache=level
* l1i-cache=level
* l2-cache=level
* l3-cache=level

These cache topology parameters accept the strings of CPU topology
levels (such as "drawer", "book", "socket", "die", "cluster", "module",
"core" or "thread"). Exactly which topology level strings could be
accepted as the parameter depends on the machine's support for the
corresponding CPU topology level.

Unsupported cache topology parameters will be omitted, and
correspondingly, the target CPU's cache topology will use the its
default cache topology setting.

In this series, we add the cache topology support in -smp for x86 PC
machine.

The following example defines a 3-level cache topology hierarchy (L1
D-cache per core, L1 I-cache per core, L2 cache per core and L3 cache per
die) for PC machine.

-smp 32,sockets=2,dies=2,modules=2,cores=2,threads=2,maxcpus=32,\
     l1d-cache=core,l1i-cache=core,l2-cache=core,l3-cache=die


Reference
---------

[1]: [ARM] Jonathan's proposal to adjust cache topology:
     https://lore.kernel.org/qemu-devel/20230808115713.2613-2-Jonathan.Cameron@huawei.com/
[2]: [RISC-V] Discussion between JeeHeng and Jonathan about cache
     topology:
     https://lore.kernel.org/qemu-devel/20240131155336.000068d1@Huawei.com/
[3]: Previous x86 specific cache topology option:
     https://lore.kernel.org/qemu-devel/20230914072159.1177582-22-zhao1.liu@linux.intel.com/
[4]: Michael's comment about generic cache topology support:
     https://lore.kernel.org/qemu-devel/20231003085516-mutt-send-email-mst@kernel.org/
[5]: Daniel's question about how x86 support L2 cache domain (cluster)
     configuration:
     https://lore.kernel.org/qemu-devel/ZcUG0Uc8KylEQhUW@redhat.com/

Thanks and Best Regards,
Zhao

---
Zhao Liu (8):
  hw/core: Rename CpuTopology to CPUTopology
  hw/core: Move CPU topology enumeration into arch-agnostic file
  hw/core: Define cache topology for machine
  hw/core: Add cache topology options in -smp
  i386/cpu: Support thread and module level cache topology
  i386/cpu: Update cache topology with machine's configuration
  i386/pc: Support cache topology in -smp for PC machine
  qemu-options: Add the cache topology description of -smp

 MAINTAINERS                     |   2 +
 hw/core/cpu-topology.c          |  56 ++++++++++++++
 hw/core/machine-smp.c           | 128 ++++++++++++++++++++++++++++++++
 hw/core/machine.c               |   9 +++
 hw/core/meson.build             |   1 +
 hw/i386/pc.c                    |   3 +
 hw/s390x/cpu-topology.c         |   6 +-
 include/hw/boards.h             |  33 +++++++-
 include/hw/core/cpu-topology.h  |  40 ++++++++++
 include/hw/i386/topology.h      |  18 +----
 include/hw/s390x/cpu-topology.h |   6 +-
 qapi/machine.json               |  14 +++-
 qemu-options.hx                 |  54 ++++++++++++--
 system/vl.c                     |  15 ++++
 target/i386/cpu.c               |  55 ++++++++++----
 target/i386/cpu.h               |   2 +-
 tests/unit/meson.build          |   3 +-
 tests/unit/test-smp-parse.c     |  14 ++--
 18 files changed, 399 insertions(+), 60 deletions(-)
 create mode 100644 hw/core/cpu-topology.c
 create mode 100644 include/hw/core/cpu-topology.h

-- 
2.34.1



^ permalink raw reply	[flat|nested] 26+ messages in thread

* [RFC 1/8] hw/core: Rename CpuTopology to CPUTopology
  2024-02-20  9:24 [RFC 0/8] Introduce SMP Cache Topology Zhao Liu
@ 2024-02-20  9:24 ` Zhao Liu
  2024-02-20  9:24 ` [RFC 2/8] hw/core: Move CPU topology enumeration into arch-agnostic file Zhao Liu
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 26+ messages in thread
From: Zhao Liu @ 2024-02-20  9:24 UTC (permalink / raw)
  To: Daniel P . Berrangé, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Markus Armbruster,
	Marcelo Tosatti, Alex Bennée, Peter Maydell,
	Jonathan Cameron, Sia Jee Heng
  Cc: qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Yongwei Ma, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

Use CPUTopology to honor the generic style of CPU capitalization
abbreviations.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 hw/s390x/cpu-topology.c         |  6 +++---
 include/hw/boards.h             |  8 ++++----
 include/hw/s390x/cpu-topology.h |  6 +++---
 tests/unit/test-smp-parse.c     | 14 +++++++-------
 4 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/hw/s390x/cpu-topology.c b/hw/s390x/cpu-topology.c
index f16bdf65faa0..016f6c1c15ac 100644
--- a/hw/s390x/cpu-topology.c
+++ b/hw/s390x/cpu-topology.c
@@ -86,7 +86,7 @@ bool s390_has_topology(void)
  */
 static void s390_topology_init(MachineState *ms)
 {
-    CpuTopology *smp = &ms->smp;
+    CPUTopology *smp = &ms->smp;
 
     s390_topology.cores_per_socket = g_new0(uint8_t, smp->sockets *
                                             smp->books * smp->drawers);
@@ -181,7 +181,7 @@ void s390_topology_reset(void)
  */
 static bool s390_topology_cpu_default(S390CPU *cpu, Error **errp)
 {
-    CpuTopology *smp = &current_machine->smp;
+    CPUTopology *smp = &current_machine->smp;
     CPUS390XState *env = &cpu->env;
 
     /* All geometry topology attributes must be set or all unset */
@@ -234,7 +234,7 @@ static bool s390_topology_check(uint16_t socket_id, uint16_t book_id,
                                 uint16_t drawer_id, uint16_t entitlement,
                                 bool dedicated, Error **errp)
 {
-    CpuTopology *smp = &current_machine->smp;
+    CPUTopology *smp = &current_machine->smp;
 
     if (socket_id >= smp->sockets) {
         error_setg(errp, "Unavailable socket: %d", socket_id);
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 78dea50054a1..e63dec919da2 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -333,7 +333,7 @@ typedef struct DeviceMemoryState {
 } DeviceMemoryState;
 
 /**
- * CpuTopology:
+ * CPUTopology:
  * @cpus: the number of present logical processors on the machine
  * @drawers: the number of drawers on the machine
  * @books: the number of books in one drawer
@@ -345,7 +345,7 @@ typedef struct DeviceMemoryState {
  * @threads: the number of threads in one core
  * @max_cpus: the maximum number of logical processors on the machine
  */
-typedef struct CpuTopology {
+typedef struct CPUTopology {
     unsigned int cpus;
     unsigned int drawers;
     unsigned int books;
@@ -356,7 +356,7 @@ typedef struct CpuTopology {
     unsigned int cores;
     unsigned int threads;
     unsigned int max_cpus;
-} CpuTopology;
+} CPUTopology;
 
 /**
  * MachineState:
@@ -407,7 +407,7 @@ struct MachineState {
     const char *cpu_type;
     AccelState *accelerator;
     CPUArchIdList *possible_cpus;
-    CpuTopology smp;
+    CPUTopology smp;
     struct NVDIMMState *nvdimms_state;
     struct NumaState *numa_state;
 };
diff --git a/include/hw/s390x/cpu-topology.h b/include/hw/s390x/cpu-topology.h
index c064f427e948..ff09c57a4428 100644
--- a/include/hw/s390x/cpu-topology.h
+++ b/include/hw/s390x/cpu-topology.h
@@ -63,17 +63,17 @@ static inline void s390_topology_reset(void)
 
 extern S390Topology s390_topology;
 
-static inline int s390_std_socket(int n, CpuTopology *smp)
+static inline int s390_std_socket(int n, CPUTopology *smp)
 {
     return (n / smp->cores) % smp->sockets;
 }
 
-static inline int s390_std_book(int n, CpuTopology *smp)
+static inline int s390_std_book(int n, CPUTopology *smp)
 {
     return (n / (smp->cores * smp->sockets)) % smp->books;
 }
 
-static inline int s390_std_drawer(int n, CpuTopology *smp)
+static inline int s390_std_drawer(int n, CPUTopology *smp)
 {
     return (n / (smp->cores * smp->sockets * smp->books)) % smp->drawers;
 }
diff --git a/tests/unit/test-smp-parse.c b/tests/unit/test-smp-parse.c
index 24972666a74d..f660d6b0df45 100644
--- a/tests/unit/test-smp-parse.c
+++ b/tests/unit/test-smp-parse.c
@@ -86,8 +86,8 @@
  */
 typedef struct SMPTestData {
     SMPConfiguration config;
-    CpuTopology expect_prefer_sockets;
-    CpuTopology expect_prefer_cores;
+    CPUTopology expect_prefer_sockets;
+    CPUTopology expect_prefer_cores;
     const char *expect_error;
 } SMPTestData;
 
@@ -395,7 +395,7 @@ static char *smp_config_to_string(const SMPConfiguration *config)
 }
 
 /* Use the different calculation than machine_topo_get_threads_per_socket(). */
-static unsigned int cpu_topology_get_threads_per_socket(const CpuTopology *topo)
+static unsigned int cpu_topology_get_threads_per_socket(const CPUTopology *topo)
 {
     /* Check the divisor to avoid invalid topology examples causing SIGFPE. */
     if (!topo->sockets) {
@@ -406,7 +406,7 @@ static unsigned int cpu_topology_get_threads_per_socket(const CpuTopology *topo)
 }
 
 /* Use the different calculation than machine_topo_get_cores_per_socket(). */
-static unsigned int cpu_topology_get_cores_per_socket(const CpuTopology *topo)
+static unsigned int cpu_topology_get_cores_per_socket(const CPUTopology *topo)
 {
     /* Check the divisor to avoid invalid topology examples causing SIGFPE. */
     if (!topo->threads) {
@@ -416,12 +416,12 @@ static unsigned int cpu_topology_get_cores_per_socket(const CpuTopology *topo)
     }
 }
 
-static char *cpu_topology_to_string(const CpuTopology *topo,
+static char *cpu_topology_to_string(const CPUTopology *topo,
                                     unsigned int threads_per_socket,
                                     unsigned int cores_per_socket)
 {
     return g_strdup_printf(
-        "(CpuTopology) {\n"
+        "(CPUTopology) {\n"
         "    .cpus               = %u,\n"
         "    .sockets            = %u,\n"
         "    .dies               = %u,\n"
@@ -438,7 +438,7 @@ static char *cpu_topology_to_string(const CpuTopology *topo,
 }
 
 static void check_parse(MachineState *ms, const SMPConfiguration *config,
-                        const CpuTopology *expect_topo, const char *expect_err,
+                        const CPUTopology *expect_topo, const char *expect_err,
                         bool is_valid)
 {
     g_autofree char *config_str = smp_config_to_string(config);
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC 2/8] hw/core: Move CPU topology enumeration into arch-agnostic file
  2024-02-20  9:24 [RFC 0/8] Introduce SMP Cache Topology Zhao Liu
  2024-02-20  9:24 ` [RFC 1/8] hw/core: Rename CpuTopology to CPUTopology Zhao Liu
@ 2024-02-20  9:24 ` Zhao Liu
  2024-02-28  9:53   ` JeeHeng Sia
  2024-02-20  9:24 ` [RFC 3/8] hw/core: Define cache topology for machine Zhao Liu
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Zhao Liu @ 2024-02-20  9:24 UTC (permalink / raw)
  To: Daniel P . Berrangé, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Markus Armbruster,
	Marcelo Tosatti, Alex Bennée, Peter Maydell,
	Jonathan Cameron, Sia Jee Heng
  Cc: qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Yongwei Ma, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

Cache topology needs to be defined based on CPU topology levels. Thus,
move CPU topology enumeration into a common header.

To match the general topology naming style, rename CPU_TOPO_LEVEL_SMT
and CPU_TOPO_LEVEL_PACKAGE to CPU_TOPO_LEVEL_THREAD and
CPU_TOPO_LEVEL_SOCKET.

Also, enumerate additional topology levels for non-i386 arches, and add
helpers for topology enumeration and string conversion.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 MAINTAINERS                    |  2 ++
 hw/core/cpu-topology.c         | 56 ++++++++++++++++++++++++++++++++++
 hw/core/meson.build            |  1 +
 include/hw/core/cpu-topology.h | 40 ++++++++++++++++++++++++
 include/hw/i386/topology.h     | 18 +----------
 target/i386/cpu.c              | 24 +++++++--------
 target/i386/cpu.h              |  2 +-
 tests/unit/meson.build         |  3 +-
 8 files changed, 115 insertions(+), 31 deletions(-)
 create mode 100644 hw/core/cpu-topology.c
 create mode 100644 include/hw/core/cpu-topology.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 7d61fb93194b..4b1cce938915 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1871,6 +1871,7 @@ R: Yanan Wang <wangyanan55@huawei.com>
 S: Supported
 F: hw/core/cpu-common.c
 F: hw/core/cpu-sysemu.c
+F: hw/core/cpu-topology.c
 F: hw/core/machine-qmp-cmds.c
 F: hw/core/machine.c
 F: hw/core/machine-smp.c
@@ -1882,6 +1883,7 @@ F: qapi/machine-common.json
 F: qapi/machine-target.json
 F: include/hw/boards.h
 F: include/hw/core/cpu.h
+F: include/hw/core/cpu-topology.h
 F: include/hw/cpu/cluster.h
 F: include/sysemu/numa.h
 F: tests/unit/test-smp-parse.c
diff --git a/hw/core/cpu-topology.c b/hw/core/cpu-topology.c
new file mode 100644
index 000000000000..ca1361d13c16
--- /dev/null
+++ b/hw/core/cpu-topology.c
@@ -0,0 +1,56 @@
+/*
+ * QEMU CPU Topology Representation
+ *
+ * Copyright (c) 2024 Intel Corporation
+ * Author: Zhao Liu <zhao1.liu@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License,
+ * or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/core/cpu-topology.h"
+
+typedef struct CPUTopoInfo {
+    const char *name;
+} CPUTopoInfo;
+
+CPUTopoInfo cpu_topo_descriptors[] = {
+    [CPU_TOPO_LEVEL_INVALID] = { .name = "invalid", },
+    [CPU_TOPO_LEVEL_THREAD]  = { .name = "thread",  },
+    [CPU_TOPO_LEVEL_CORE]    = { .name = "core",    },
+    [CPU_TOPO_LEVEL_MODULE]  = { .name = "module",  },
+    [CPU_TOPO_LEVEL_CLUSTER] = { .name = "cluster", },
+    [CPU_TOPO_LEVEL_DIE]     = { .name = "die",     },
+    [CPU_TOPO_LEVEL_SOCKET]  = { .name = "socket",  },
+    [CPU_TOPO_LEVEL_BOOK]    = { .name = "book",    },
+    [CPU_TOPO_LEVEL_DRAWER]  = { .name = "drawer",  },
+    [CPU_TOPO_LEVEL_MAX]     = { .name = NULL,      },
+};
+
+const char *cpu_topo_to_string(CPUTopoLevel topo)
+{
+    return cpu_topo_descriptors[topo].name;
+}
+
+CPUTopoLevel string_to_cpu_topo(char *str)
+{
+    for (int i = 0; i < ARRAY_SIZE(cpu_topo_descriptors); i++) {
+        CPUTopoInfo *info = &cpu_topo_descriptors[i];
+
+        if (!strcmp(info->name, str)) {
+            return (CPUTopoLevel)i;
+        }
+    }
+    return CPU_TOPO_LEVEL_MAX;
+}
diff --git a/hw/core/meson.build b/hw/core/meson.build
index 67dad04de559..3b1d5ffab3e3 100644
--- a/hw/core/meson.build
+++ b/hw/core/meson.build
@@ -23,6 +23,7 @@ else
 endif
 
 common_ss.add(files('cpu-common.c'))
+common_ss.add(files('cpu-topology.c'))
 common_ss.add(files('machine-smp.c'))
 system_ss.add(when: 'CONFIG_FITLOADER', if_true: files('loader-fit.c'))
 system_ss.add(when: 'CONFIG_GENERIC_LOADER', if_true: files('generic-loader.c'))
diff --git a/include/hw/core/cpu-topology.h b/include/hw/core/cpu-topology.h
new file mode 100644
index 000000000000..cc6ca186ce3f
--- /dev/null
+++ b/include/hw/core/cpu-topology.h
@@ -0,0 +1,40 @@
+/*
+ * QEMU CPU Topology Representation Header
+ *
+ * Copyright (c) 2024 Intel Corporation
+ * Author: Zhao Liu <zhao1.liu@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License,
+ * or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef CPU_TOPOLOGY_H
+#define CPU_TOPOLOGY_H
+
+typedef enum CPUTopoLevel {
+    CPU_TOPO_LEVEL_INVALID,
+    CPU_TOPO_LEVEL_THREAD,
+    CPU_TOPO_LEVEL_CORE,
+    CPU_TOPO_LEVEL_MODULE,
+    CPU_TOPO_LEVEL_CLUSTER,
+    CPU_TOPO_LEVEL_DIE,
+    CPU_TOPO_LEVEL_SOCKET,
+    CPU_TOPO_LEVEL_BOOK,
+    CPU_TOPO_LEVEL_DRAWER,
+    CPU_TOPO_LEVEL_MAX,
+} CPUTopoLevel;
+
+const char *cpu_topo_to_string(CPUTopoLevel topo);
+CPUTopoLevel string_to_cpu_topo(char *str);
+
+#endif /* CPU_TOPOLOGY_H */
diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index dff49fce1154..c6ff75f23991 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -39,7 +39,7 @@
  *  CPUID Fn8000_0008_ECX[ApicIdCoreIdSize[3:0]] is set to apicid_core_width().
  */
 
-
+#include "hw/core/cpu-topology.h"
 #include "qemu/bitops.h"
 
 /*
@@ -62,22 +62,6 @@ typedef struct X86CPUTopoInfo {
     unsigned threads_per_core;
 } X86CPUTopoInfo;
 
-/*
- * CPUTopoLevel is the general i386 topology hierarchical representation,
- * ordered by increasing hierarchical relationship.
- * Its enumeration value is not bound to the type value of Intel (CPUID[0x1F])
- * or AMD (CPUID[0x80000026]).
- */
-enum CPUTopoLevel {
-    CPU_TOPO_LEVEL_INVALID,
-    CPU_TOPO_LEVEL_SMT,
-    CPU_TOPO_LEVEL_CORE,
-    CPU_TOPO_LEVEL_MODULE,
-    CPU_TOPO_LEVEL_DIE,
-    CPU_TOPO_LEVEL_PACKAGE,
-    CPU_TOPO_LEVEL_MAX,
-};
-
 /* Return the bit width needed for 'count' IDs */
 static unsigned apicid_bitwidth_for_count(unsigned count)
 {
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index ac0a10abd45f..725d7e70182d 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -247,7 +247,7 @@ static uint32_t max_thread_ids_for_cache(X86CPUTopoInfo *topo_info,
     case CPU_TOPO_LEVEL_DIE:
         num_ids = 1 << apicid_die_offset(topo_info);
         break;
-    case CPU_TOPO_LEVEL_PACKAGE:
+    case CPU_TOPO_LEVEL_SOCKET:
         num_ids = 1 << apicid_pkg_offset(topo_info);
         break;
     default:
@@ -304,7 +304,7 @@ static uint32_t num_threads_by_topo_level(X86CPUTopoInfo *topo_info,
                                           enum CPUTopoLevel topo_level)
 {
     switch (topo_level) {
-    case CPU_TOPO_LEVEL_SMT:
+    case CPU_TOPO_LEVEL_THREAD:
         return 1;
     case CPU_TOPO_LEVEL_CORE:
         return topo_info->threads_per_core;
@@ -313,7 +313,7 @@ static uint32_t num_threads_by_topo_level(X86CPUTopoInfo *topo_info,
     case CPU_TOPO_LEVEL_DIE:
         return topo_info->threads_per_core * topo_info->cores_per_module *
                topo_info->modules_per_die;
-    case CPU_TOPO_LEVEL_PACKAGE:
+    case CPU_TOPO_LEVEL_SOCKET:
         return topo_info->threads_per_core * topo_info->cores_per_module *
                topo_info->modules_per_die * topo_info->dies_per_pkg;
     default:
@@ -326,7 +326,7 @@ static uint32_t apicid_offset_by_topo_level(X86CPUTopoInfo *topo_info,
                                             enum CPUTopoLevel topo_level)
 {
     switch (topo_level) {
-    case CPU_TOPO_LEVEL_SMT:
+    case CPU_TOPO_LEVEL_THREAD:
         return 0;
     case CPU_TOPO_LEVEL_CORE:
         return apicid_core_offset(topo_info);
@@ -334,7 +334,7 @@ static uint32_t apicid_offset_by_topo_level(X86CPUTopoInfo *topo_info,
         return apicid_module_offset(topo_info);
     case CPU_TOPO_LEVEL_DIE:
         return apicid_die_offset(topo_info);
-    case CPU_TOPO_LEVEL_PACKAGE:
+    case CPU_TOPO_LEVEL_SOCKET:
         return apicid_pkg_offset(topo_info);
     default:
         g_assert_not_reached();
@@ -347,7 +347,7 @@ static uint32_t cpuid1f_topo_type(enum CPUTopoLevel topo_level)
     switch (topo_level) {
     case CPU_TOPO_LEVEL_INVALID:
         return CPUID_1F_ECX_TOPO_LEVEL_INVALID;
-    case CPU_TOPO_LEVEL_SMT:
+    case CPU_TOPO_LEVEL_THREAD:
         return CPUID_1F_ECX_TOPO_LEVEL_SMT;
     case CPU_TOPO_LEVEL_CORE:
         return CPUID_1F_ECX_TOPO_LEVEL_CORE;
@@ -380,7 +380,7 @@ static void encode_topo_cpuid1f(CPUX86State *env, uint32_t count,
     level = CPU_TOPO_LEVEL_INVALID;
     for (int i = 0; i <= count; i++) {
         level = find_next_bit(env->avail_cpu_topo,
-                              CPU_TOPO_LEVEL_PACKAGE,
+                              CPU_TOPO_LEVEL_SOCKET,
                               level + 1);
 
         /*
@@ -388,7 +388,7 @@ static void encode_topo_cpuid1f(CPUX86State *env, uint32_t count,
          * and it just encode the invalid level (all fields are 0)
          * into the last subleaf of 0x1f.
          */
-        if (level == CPU_TOPO_LEVEL_PACKAGE) {
+        if (level == CPU_TOPO_LEVEL_SOCKET) {
             level = CPU_TOPO_LEVEL_INVALID;
             break;
         }
@@ -401,7 +401,7 @@ static void encode_topo_cpuid1f(CPUX86State *env, uint32_t count,
         unsigned long next_level;
 
         next_level = find_next_bit(env->avail_cpu_topo,
-                                   CPU_TOPO_LEVEL_PACKAGE,
+                                   CPU_TOPO_LEVEL_SOCKET,
                                    level + 1);
         num_threads_next_level = num_threads_by_topo_level(topo_info,
                                                            next_level);
@@ -6290,7 +6290,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
 
                     /* Share the cache at package level. */
                     *eax |= max_thread_ids_for_cache(&topo_info,
-                                CPU_TOPO_LEVEL_PACKAGE) << 14;
+                                CPU_TOPO_LEVEL_SOCKET) << 14;
                 }
             }
         } else if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
@@ -7756,9 +7756,9 @@ static void x86_cpu_init_default_topo(X86CPU *cpu)
     env->nr_dies = 1;
 
     /* SMT, core and package levels are set by default. */
-    set_bit(CPU_TOPO_LEVEL_SMT, env->avail_cpu_topo);
+    set_bit(CPU_TOPO_LEVEL_THREAD, env->avail_cpu_topo);
     set_bit(CPU_TOPO_LEVEL_CORE, env->avail_cpu_topo);
-    set_bit(CPU_TOPO_LEVEL_PACKAGE, env->avail_cpu_topo);
+    set_bit(CPU_TOPO_LEVEL_SOCKET, env->avail_cpu_topo);
 }
 
 static void x86_cpu_initfn(Object *obj)
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 4b4cc70c8859..fcbf278b49e6 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1596,7 +1596,7 @@ typedef struct CPUCacheInfo {
      * Used to encode CPUID[4].EAX[bits 25:14] or
      * CPUID[0x8000001D].EAX[bits 25:14].
      */
-    enum CPUTopoLevel share_level;
+    CPUTopoLevel share_level;
 } CPUCacheInfo;
 
 
diff --git a/tests/unit/meson.build b/tests/unit/meson.build
index cae925c13259..4fe0aaff3a5e 100644
--- a/tests/unit/meson.build
+++ b/tests/unit/meson.build
@@ -138,7 +138,8 @@ if have_system
     'test-util-sockets': ['socket-helpers.c'],
     'test-base64': [],
     'test-bufferiszero': [],
-    'test-smp-parse': [qom, meson.project_source_root() / 'hw/core/machine-smp.c'],
+    'test-smp-parse': [qom, meson.project_source_root() / 'hw/core/machine-smp.c',
+                       meson.project_source_root() / 'hw/core/cpu-topology.c'],
     'test-vmstate': [migration, io],
     'test-yank': ['socket-helpers.c', qom, io, chardev]
   }
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC 3/8] hw/core: Define cache topology for machine
  2024-02-20  9:24 [RFC 0/8] Introduce SMP Cache Topology Zhao Liu
  2024-02-20  9:24 ` [RFC 1/8] hw/core: Rename CpuTopology to CPUTopology Zhao Liu
  2024-02-20  9:24 ` [RFC 2/8] hw/core: Move CPU topology enumeration into arch-agnostic file Zhao Liu
@ 2024-02-20  9:24 ` Zhao Liu
  2024-02-20  9:25 ` [RFC 4/8] hw/core: Add cache topology options in -smp Zhao Liu
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 26+ messages in thread
From: Zhao Liu @ 2024-02-20  9:24 UTC (permalink / raw)
  To: Daniel P . Berrangé, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Markus Armbruster,
	Marcelo Tosatti, Alex Bennée, Peter Maydell,
	Jonathan Cameron, Sia Jee Heng
  Cc: qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Yongwei Ma, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

Define the cache topology based on CPU topology level for two reasons:

1. In practice, a cache will always be bound to the CPU container
   (either private in the CPU container or shared among multiple
   containers), and CPU container is often expressed in terms of CPU
   topology level.
2. The x86's cache-related CPUIDs encode cache topology based on APIC
   ID's CPU topology layout. And the ACPI PPTT table that ARM/RISCV
   relies on also requires CPU containers to help indicate the private
   shared hierarchy of the cache. Therefore, for SMP systems, it is
   natural to use the CPU topology hierarchy directly in QEMU to define
   the cache topology.

Currently, separated L1 cache (L1 data cache and L1 instruction cache)
with unified higher-level cache (e.g., unified L2 and L3 caches), is the
most common cache architectures.

Therefore, define the topology for L1 D-cache, L1 I-cache, L2 cache and
L3 cache in machine as the basic cache topology support.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 hw/core/machine.c   |  5 +++++
 include/hw/boards.h | 25 +++++++++++++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index b3199c710194..426f71770a84 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -1163,6 +1163,11 @@ static void machine_initfn(Object *obj)
     ms->smp.cores = 1;
     ms->smp.threads = 1;
 
+    ms->smp_cache.l1d = CPU_TOPO_LEVEL_INVALID;
+    ms->smp_cache.l1i = CPU_TOPO_LEVEL_INVALID;
+    ms->smp_cache.l2 = CPU_TOPO_LEVEL_INVALID;
+    ms->smp_cache.l3 = CPU_TOPO_LEVEL_INVALID;
+
     machine_copy_boot_config(ms, &(BootConfiguration){ 0 });
 }
 
diff --git a/include/hw/boards.h b/include/hw/boards.h
index e63dec919da2..8558b88aea52 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -10,6 +10,7 @@
 #include "qemu/module.h"
 #include "qom/object.h"
 #include "hw/core/cpu.h"
+#include "hw/core/cpu-topology.h"
 
 #define TYPE_MACHINE_SUFFIX "-machine"
 
@@ -144,6 +145,12 @@ typedef struct {
  * @books_supported - whether books are supported by the machine
  * @drawers_supported - whether drawers are supported by the machine
  * @modules_supported - whether modules are supported by the machine
+ * @l1_separated_cache_supported - whether l1 data and instruction cache
+ *                                 topology are supported by the machine
+ * @l2_unified_cache_supported - whether l2 unified cache topology are
+ *                               supported by the machine
+ * @l3_unified_cache_supported - whether l3 unified cache topology are
+ *                               supported by the machine
  */
 typedef struct {
     bool prefer_sockets;
@@ -153,6 +160,9 @@ typedef struct {
     bool books_supported;
     bool drawers_supported;
     bool modules_supported;
+    bool l1_separated_cache_supported;
+    bool l2_unified_cache_supported;
+    bool l3_unified_cache_supported;
 } SMPCompatProps;
 
 /**
@@ -358,6 +368,20 @@ typedef struct CPUTopology {
     unsigned int max_cpus;
 } CPUTopology;
 
+/**
+ * CPUTopology:
+ * @l1d: the CPU topology hierarchy the L1 data cache is shared at.
+ * @l1i: the CPU topology hierarchy the L1 instruction cache is shared at.
+ * @l2: the CPU topology hierarchy the L2 (unified) cache is shared at.
+ * @l3: the CPU topology hierarchy the L3 (unified) cache is shared at.
+ */
+typedef struct CacheTopology {
+    CPUTopoLevel l1d;
+    CPUTopoLevel l1i;
+    CPUTopoLevel l2;
+    CPUTopoLevel l3;
+} CacheTopology;
+
 /**
  * MachineState:
  */
@@ -408,6 +432,7 @@ struct MachineState {
     AccelState *accelerator;
     CPUArchIdList *possible_cpus;
     CPUTopology smp;
+    CacheTopology smp_cache;
     struct NVDIMMState *nvdimms_state;
     struct NumaState *numa_state;
 };
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC 4/8] hw/core: Add cache topology options in -smp
  2024-02-20  9:24 [RFC 0/8] Introduce SMP Cache Topology Zhao Liu
                   ` (2 preceding siblings ...)
  2024-02-20  9:24 ` [RFC 3/8] hw/core: Define cache topology for machine Zhao Liu
@ 2024-02-20  9:25 ` Zhao Liu
  2024-02-21 12:46   ` Markus Armbruster
                     ` (2 more replies)
  2024-02-20  9:25 ` [RFC 5/8] i386/cpu: Support thread and module level cache topology Zhao Liu
                   ` (4 subsequent siblings)
  8 siblings, 3 replies; 26+ messages in thread
From: Zhao Liu @ 2024-02-20  9:25 UTC (permalink / raw)
  To: Daniel P . Berrangé, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Markus Armbruster,
	Marcelo Tosatti, Alex Bennée, Peter Maydell,
	Jonathan Cameron, Sia Jee Heng
  Cc: qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Yongwei Ma, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

Add "l1d-cache", "l1i-cache". "l2-cache", and "l3-cache" options in
-smp to define the cache topology for SMP system.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 hw/core/machine-smp.c | 128 ++++++++++++++++++++++++++++++++++++++++++
 hw/core/machine.c     |   4 ++
 qapi/machine.json     |  14 ++++-
 system/vl.c           |  15 +++++
 4 files changed, 160 insertions(+), 1 deletion(-)

diff --git a/hw/core/machine-smp.c b/hw/core/machine-smp.c
index 8a8296b0d05b..2cbd19f4aa57 100644
--- a/hw/core/machine-smp.c
+++ b/hw/core/machine-smp.c
@@ -61,6 +61,132 @@ static char *cpu_hierarchy_to_string(MachineState *ms)
     return g_string_free(s, false);
 }
 
+static bool machine_check_topo_support(MachineState *ms,
+                                       CPUTopoLevel topo)
+{
+    MachineClass *mc = MACHINE_GET_CLASS(ms);
+
+    if (topo == CPU_TOPO_LEVEL_MODULE && !mc->smp_props.modules_supported) {
+        return false;
+    }
+
+    if (topo == CPU_TOPO_LEVEL_CLUSTER && !mc->smp_props.clusters_supported) {
+        return false;
+    }
+
+    if (topo == CPU_TOPO_LEVEL_DIE && !mc->smp_props.dies_supported) {
+        return false;
+    }
+
+    if (topo == CPU_TOPO_LEVEL_BOOK && !mc->smp_props.books_supported) {
+        return false;
+    }
+
+    if (topo == CPU_TOPO_LEVEL_DRAWER && !mc->smp_props.drawers_supported) {
+        return false;
+    }
+
+    return true;
+}
+
+static int smp_cache_string_to_topology(MachineState *ms,
+                                        char *topo_str,
+                                        CPUTopoLevel *topo,
+                                        Error **errp)
+{
+    *topo = string_to_cpu_topo(topo_str);
+
+    if (*topo == CPU_TOPO_LEVEL_MAX || *topo == CPU_TOPO_LEVEL_INVALID) {
+        error_setg(errp, "Invalid cache topology level: %s. The cache "
+                   "topology should match the CPU topology level", topo_str);
+        return -1;
+    }
+
+    if (!machine_check_topo_support(ms, *topo)) {
+        error_setg(errp, "Invalid cache topology level: %s. The topology "
+                   "level is not supported by this machine", topo_str);
+        return -1;
+    }
+
+    return 0;
+}
+
+static void machine_parse_smp_cache_config(MachineState *ms,
+                                           const SMPConfiguration *config,
+                                           Error **errp)
+{
+    MachineClass *mc = MACHINE_GET_CLASS(ms);
+
+    if (config->l1d_cache) {
+        if (!mc->smp_props.l1_separated_cache_supported) {
+            error_setg(errp, "L1 D-cache topology not "
+                       "supported by this machine");
+            return;
+        }
+
+        if (smp_cache_string_to_topology(ms, config->l1d_cache,
+            &ms->smp_cache.l1d, errp)) {
+            return;
+        }
+    }
+
+    if (config->l1i_cache) {
+        if (!mc->smp_props.l1_separated_cache_supported) {
+            error_setg(errp, "L1 I-cache topology not "
+                       "supported by this machine");
+            return;
+        }
+
+        if (smp_cache_string_to_topology(ms, config->l1i_cache,
+            &ms->smp_cache.l1i, errp)) {
+            return;
+        }
+    }
+
+    if (config->l2_cache) {
+        if (!mc->smp_props.l2_unified_cache_supported) {
+            error_setg(errp, "L2 cache topology not "
+                       "supported by this machine");
+            return;
+        }
+
+        if (smp_cache_string_to_topology(ms, config->l2_cache,
+            &ms->smp_cache.l2, errp)) {
+            return;
+        }
+
+        if (ms->smp_cache.l1d > ms->smp_cache.l2 ||
+            ms->smp_cache.l1i > ms->smp_cache.l2) {
+            error_setg(errp, "Invalid L2 cache topology. "
+                       "L2 cache topology level should not be "
+                       "lower than L1 D-cache/L1 I-cache");
+            return;
+        }
+    }
+
+    if (config->l3_cache) {
+        if (!mc->smp_props.l2_unified_cache_supported) {
+            error_setg(errp, "L3 cache topology not "
+                       "supported by this machine");
+            return;
+        }
+
+        if (smp_cache_string_to_topology(ms, config->l3_cache,
+            &ms->smp_cache.l3, errp)) {
+            return;
+        }
+
+        if (ms->smp_cache.l1d > ms->smp_cache.l3 ||
+            ms->smp_cache.l1i > ms->smp_cache.l3 ||
+            ms->smp_cache.l2 > ms->smp_cache.l3) {
+            error_setg(errp, "Invalid L3 cache topology. "
+                       "L3 cache topology level should not be "
+                       "lower than L1 D-cache/L1 I-cache/L2 cache");
+            return;
+        }
+    }
+}
+
 /*
  * machine_parse_smp_config: Generic function used to parse the given
  *                           SMP configuration
@@ -249,6 +375,8 @@ void machine_parse_smp_config(MachineState *ms,
                    mc->name, mc->max_cpus);
         return;
     }
+
+    machine_parse_smp_cache_config(ms, config, errp);
 }
 
 unsigned int machine_topo_get_cores_per_socket(const MachineState *ms)
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 426f71770a84..cb5173927b0d 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -886,6 +886,10 @@ static void machine_get_smp(Object *obj, Visitor *v, const char *name,
         .has_cores = true, .cores = ms->smp.cores,
         .has_threads = true, .threads = ms->smp.threads,
         .has_maxcpus = true, .maxcpus = ms->smp.max_cpus,
+        .l1d_cache = g_strdup(cpu_topo_to_string(ms->smp_cache.l1d)),
+        .l1i_cache = g_strdup(cpu_topo_to_string(ms->smp_cache.l1i)),
+        .l2_cache = g_strdup(cpu_topo_to_string(ms->smp_cache.l2)),
+        .l3_cache = g_strdup(cpu_topo_to_string(ms->smp_cache.l3)),
     };
 
     if (!visit_type_SMPConfiguration(v, name, &config, &error_abort)) {
diff --git a/qapi/machine.json b/qapi/machine.json
index d0e7f1f615f3..0a923ac38803 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -1650,6 +1650,14 @@
 #
 # @threads: number of threads per core
 #
+# @l1d-cache: topology hierarchy of L1 data cache (since 9.0)
+#
+# @l1i-cache: topology hierarchy of L1 instruction cache (since 9.0)
+#
+# @l2-cache: topology hierarchy of L2 unified cache (since 9.0)
+#
+# @l3-cache: topology hierarchy of L3 unified cache (since 9.0)
+#
 # Since: 6.1
 ##
 { 'struct': 'SMPConfiguration', 'data': {
@@ -1662,7 +1670,11 @@
      '*modules': 'int',
      '*cores': 'int',
      '*threads': 'int',
-     '*maxcpus': 'int' } }
+     '*maxcpus': 'int',
+     '*l1d-cache': 'str',
+     '*l1i-cache': 'str',
+     '*l2-cache': 'str',
+     '*l3-cache': 'str' } }
 
 ##
 # @x-query-irq:
diff --git a/system/vl.c b/system/vl.c
index a82555ae1558..ac95e5ddb656 100644
--- a/system/vl.c
+++ b/system/vl.c
@@ -741,6 +741,9 @@ static QemuOptsList qemu_smp_opts = {
         }, {
             .name = "clusters",
             .type = QEMU_OPT_NUMBER,
+        }, {
+            .name = "modules",
+            .type = QEMU_OPT_NUMBER,
         }, {
             .name = "cores",
             .type = QEMU_OPT_NUMBER,
@@ -750,6 +753,18 @@ static QemuOptsList qemu_smp_opts = {
         }, {
             .name = "maxcpus",
             .type = QEMU_OPT_NUMBER,
+        }, {
+            .name = "l1d-cache",
+            .type = QEMU_OPT_STRING,
+        }, {
+            .name = "l1i-cache",
+            .type = QEMU_OPT_STRING,
+        }, {
+            .name = "l2-cache",
+            .type = QEMU_OPT_STRING,
+        }, {
+            .name = "l3-cache",
+            .type = QEMU_OPT_STRING,
         },
         { /*End of list */ }
     },
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC 5/8] i386/cpu: Support thread and module level cache topology
  2024-02-20  9:24 [RFC 0/8] Introduce SMP Cache Topology Zhao Liu
                   ` (3 preceding siblings ...)
  2024-02-20  9:25 ` [RFC 4/8] hw/core: Add cache topology options in -smp Zhao Liu
@ 2024-02-20  9:25 ` Zhao Liu
  2024-02-20  9:25 ` [RFC 6/8] i386/cpu: Update cache topology with machine's configuration Zhao Liu
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 26+ messages in thread
From: Zhao Liu @ 2024-02-20  9:25 UTC (permalink / raw)
  To: Daniel P . Berrangé, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Markus Armbruster,
	Marcelo Tosatti, Alex Bennée, Peter Maydell,
	Jonathan Cameron, Sia Jee Heng
  Cc: qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Yongwei Ma, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

Allows cache to be defined at the thread and module level. This
increases flexibility for x86 users to customize their cache topology.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 target/i386/cpu.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 725d7e70182d..d7cb0f1e49b4 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -241,9 +241,15 @@ static uint32_t max_thread_ids_for_cache(X86CPUTopoInfo *topo_info,
     uint32_t num_ids = 0;
 
     switch (share_level) {
+    case CPU_TOPO_LEVEL_THREAD:
+        num_ids = 1;
+        break;
     case CPU_TOPO_LEVEL_CORE:
         num_ids = 1 << apicid_core_offset(topo_info);
         break;
+    case CPU_TOPO_LEVEL_MODULE:
+        num_ids = 1 << apicid_module_offset(topo_info);
+        break;
     case CPU_TOPO_LEVEL_DIE:
         num_ids = 1 << apicid_die_offset(topo_info);
         break;
@@ -251,10 +257,6 @@ static uint32_t max_thread_ids_for_cache(X86CPUTopoInfo *topo_info,
         num_ids = 1 << apicid_pkg_offset(topo_info);
         break;
     default:
-        /*
-         * Currently there is no use case for SMT and MODULE, so use
-         * assert directly to facilitate debugging.
-         */
         g_assert_not_reached();
     }
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC 6/8] i386/cpu: Update cache topology with machine's configuration
  2024-02-20  9:24 [RFC 0/8] Introduce SMP Cache Topology Zhao Liu
                   ` (4 preceding siblings ...)
  2024-02-20  9:25 ` [RFC 5/8] i386/cpu: Support thread and module level cache topology Zhao Liu
@ 2024-02-20  9:25 ` Zhao Liu
  2024-02-28  9:45   ` JeeHeng Sia
  2024-02-20  9:25 ` [RFC 7/8] i386/pc: Support cache topology in -smp for PC machine Zhao Liu
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Zhao Liu @ 2024-02-20  9:25 UTC (permalink / raw)
  To: Daniel P . Berrangé, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Markus Armbruster,
	Marcelo Tosatti, Alex Bennée, Peter Maydell,
	Jonathan Cameron, Sia Jee Heng
  Cc: qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Yongwei Ma, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

User will configure SMP cache topology via -smp.

For this case, update the x86 CPUs' cache topology with user's
configuration in MachineState.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 target/i386/cpu.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index d7cb0f1e49b4..4b5c551fe7f0 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -7582,6 +7582,27 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
 
 #ifndef CONFIG_USER_ONLY
     MachineState *ms = MACHINE(qdev_get_machine());
+
+    if (ms->smp_cache.l1d != CPU_TOPO_LEVEL_INVALID) {
+        env->cache_info_cpuid4.l1d_cache->share_level = ms->smp_cache.l1d;
+        env->cache_info_amd.l1d_cache->share_level = ms->smp_cache.l1d;
+    }
+
+    if (ms->smp_cache.l1i != CPU_TOPO_LEVEL_INVALID) {
+        env->cache_info_cpuid4.l1i_cache->share_level = ms->smp_cache.l1i;
+        env->cache_info_amd.l1i_cache->share_level = ms->smp_cache.l1i;
+    }
+
+    if (ms->smp_cache.l2 != CPU_TOPO_LEVEL_INVALID) {
+        env->cache_info_cpuid4.l2_cache->share_level = ms->smp_cache.l2;
+        env->cache_info_amd.l2_cache->share_level = ms->smp_cache.l2;
+    }
+
+    if (ms->smp_cache.l3 != CPU_TOPO_LEVEL_INVALID) {
+        env->cache_info_cpuid4.l3_cache->share_level = ms->smp_cache.l3;
+        env->cache_info_amd.l3_cache->share_level = ms->smp_cache.l3;
+    }
+
     qemu_register_reset(x86_cpu_machine_reset_cb, cpu);
 
     if (cpu->env.features[FEAT_1_EDX] & CPUID_APIC || ms->smp.cpus > 1) {
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC 7/8] i386/pc: Support cache topology in -smp for PC machine
  2024-02-20  9:24 [RFC 0/8] Introduce SMP Cache Topology Zhao Liu
                   ` (5 preceding siblings ...)
  2024-02-20  9:25 ` [RFC 6/8] i386/cpu: Update cache topology with machine's configuration Zhao Liu
@ 2024-02-20  9:25 ` Zhao Liu
  2024-02-20  9:25 ` [RFC 8/8] qemu-options: Add the cache topology description of -smp Zhao Liu
  2024-02-20 20:07 ` [RFC 0/8] Introduce SMP Cache Topology Philippe Mathieu-Daudé
  8 siblings, 0 replies; 26+ messages in thread
From: Zhao Liu @ 2024-02-20  9:25 UTC (permalink / raw)
  To: Daniel P . Berrangé, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Markus Armbruster,
	Marcelo Tosatti, Alex Bennée, Peter Maydell,
	Jonathan Cameron, Sia Jee Heng
  Cc: qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Yongwei Ma, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 hw/i386/pc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 25124a077eea..76148c3337cf 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1848,6 +1848,9 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
     mc->nvdimm_supported = true;
     mc->smp_props.dies_supported = true;
     mc->smp_props.modules_supported = true;
+    mc->smp_props.l1_separated_cache_supported = true;
+    mc->smp_props.l2_unified_cache_supported = true;
+    mc->smp_props.l3_unified_cache_supported = true;
     mc->default_ram_id = "pc.ram";
     pcmc->default_smbios_ep_type = SMBIOS_ENTRY_POINT_TYPE_64;
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC 8/8] qemu-options: Add the cache topology description of -smp
  2024-02-20  9:24 [RFC 0/8] Introduce SMP Cache Topology Zhao Liu
                   ` (6 preceding siblings ...)
  2024-02-20  9:25 ` [RFC 7/8] i386/pc: Support cache topology in -smp for PC machine Zhao Liu
@ 2024-02-20  9:25 ` Zhao Liu
  2024-02-26 15:47   ` Jonathan Cameron via
  2024-02-20 20:07 ` [RFC 0/8] Introduce SMP Cache Topology Philippe Mathieu-Daudé
  8 siblings, 1 reply; 26+ messages in thread
From: Zhao Liu @ 2024-02-20  9:25 UTC (permalink / raw)
  To: Daniel P . Berrangé, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Markus Armbruster,
	Marcelo Tosatti, Alex Bennée, Peter Maydell,
	Jonathan Cameron, Sia Jee Heng
  Cc: qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Yongwei Ma, Zhao Liu

From: Zhao Liu <zhao1.liu@intel.com>

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 qemu-options.hx | 54 ++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 47 insertions(+), 7 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 70eaf3256685..85c78c99a3b0 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -281,7 +281,9 @@ ERST
 
 DEF("smp", HAS_ARG, QEMU_OPTION_smp,
     "-smp [[cpus=]n][,maxcpus=maxcpus][,drawers=drawers][,books=books][,sockets=sockets]\n"
-    "               [,dies=dies][,clusters=clusters][,cores=cores][,threads=threads]\n"
+    "               [,dies=dies][,clusters=clusters][,modules=modules][,cores=cores]\n"
+    "               [,threads=threads][,l1d-cache=level][,l1i-cache=level][,l2-cache=level]\n"
+    "               [,l3-cache=level]\n"
     "                set the number of initial CPUs to 'n' [default=1]\n"
     "                maxcpus= maximum number of total CPUs, including\n"
     "                offline CPUs for hotplug, etc\n"
@@ -290,9 +292,14 @@ DEF("smp", HAS_ARG, QEMU_OPTION_smp,
     "                sockets= number of sockets in one book\n"
     "                dies= number of dies in one socket\n"
     "                clusters= number of clusters in one die\n"
-    "                cores= number of cores in one cluster\n"
+    "                modules= number of modules in one cluster\n"
+    "                cores= number of cores in one module\n"
     "                threads= number of threads in one core\n"
-    "Note: Different machines may have different subsets of the CPU topology\n"
+    "                l1d-cache= topology level of L1 D-cache\n"
+    "                l1i-cache= topology level of L1 I-cache\n"
+    "                l2-cache= topology level of L2 cache\n"
+    "                l3-cache= topology level of L3 cache\n"
+    "Note: Different machines may have different subsets of the CPU and cache topology\n"
     "      parameters supported, so the actual meaning of the supported parameters\n"
     "      will vary accordingly. For example, for a machine type that supports a\n"
     "      three-level CPU hierarchy of sockets/cores/threads, the parameters will\n"
@@ -306,7 +313,7 @@ DEF("smp", HAS_ARG, QEMU_OPTION_smp,
     "      must be set as 1 in the purpose of correct parsing.\n",
     QEMU_ARCH_ALL)
 SRST
-``-smp [[cpus=]n][,maxcpus=maxcpus][,sockets=sockets][,dies=dies][,clusters=clusters][,cores=cores][,threads=threads]``
+``-smp [[cpus=]n][,maxcpus=maxcpus][,drawers=drawers][,books=books][,sockets=sockets][,dies=dies][,clusters=clusters][,modules=modules][,cores=cores][,threads=threads][,l1d-cache=level][,l1i-cache=level][,l2-cache=level][,l3-cache=level]``
     Simulate a SMP system with '\ ``n``\ ' CPUs initially present on
     the machine type board. On boards supporting CPU hotplug, the optional
     '\ ``maxcpus``\ ' parameter can be set to enable further CPUs to be
@@ -320,15 +327,34 @@ SRST
     Both parameters are subject to an upper limit that is determined by
     the specific machine type chosen.
 
+    CPU topology parameters include '\ ``drawers``\ ', '\ ``books``\ ',
+    '\ ``sockets``\ ', '\ ``dies``\ ', '\ ``clusters``\ ', '\ ``modules``\ ',
+    '\ ``cores``\ ' and '\ ``threads``\ '. These CPU parameters accept only
+    integers and are used to specify the number of specific topology domains
+    under the corresponding topology level.
+
     To control reporting of CPU topology information, values of the topology
     parameters can be specified. Machines may only support a subset of the
-    parameters and different machines may have different subsets supported
-    which vary depending on capacity of the corresponding CPU targets. So
-    for a particular machine type board, an expected topology hierarchy can
+    CPU topology parameters and different machines may have different subsets
+    supported which vary depending on capacity of the corresponding CPU targets.
+    So for a particular machine type board, an expected topology hierarchy can
     be defined through the supported sub-option. Unsupported parameters can
     also be provided in addition to the sub-option, but their values must be
     set as 1 in the purpose of correct parsing.
 
+    Cache topology parameters include '\ ``l1d-cache``\ ', '\ ``l1i-cache``\ ',
+    '\ ``l2-cache``\ ' and '\ ``l3-cache``\ '. These cache topology parameters
+    accept the strings of CPU topology levels (such as '\ ``drawer``\ ', '\ ``book``\ ',
+    '\ ``socket``\ ', '\ ``die``\ ', '\ ``cluster``\ ', '\ ``module``\ ',
+    '\ ``core``\ ' or '\ ``thread``\ '). Exactly which topology level strings
+    could be accepted as the parameter depends on the machine's support for the
+    corresponding CPU topology level.
+
+    Machines may also only support a subset of the cache topology parameters.
+    Unsupported cache topology parameters will be omitted, and correspondingly,
+    the target CPU's cache topology will use the its default cache topology
+    setting.
+
     Either the initial CPU count, or at least one of the topology parameters
     must be specified. The specified parameters must be greater than zero,
     explicit configuration like "cpus=0" is not allowed. Values for any
@@ -354,6 +380,20 @@ SRST
 
         -smp 32,sockets=2,dies=2,modules=2,cores=2,threads=2,maxcpus=32
 
+    The following sub-option defines a CPU topology hierarchy (2 sockets
+    totally on the machine, 2 dies per socket, 2 modules per die, 2 cores per
+    module, 2 threads per core) with 3-level cache topology hierarchy (L1
+    D-cache per core, L1 I-cache per core, L2 cache per core and L3 cache per
+    die) for PC machines which support sockets/dies/modules/cores/threads.
+    Some members of the CPU topology option can be omitted but their values
+    will be automatically computed. Some members of the cache topology
+    option can also be omitted and target CPU will use the default topology.:
+
+    ::
+
+        -smp 32,sockets=2,dies=2,modules=2,cores=2,threads=2,maxcpus=32,\
+             l1d-cache=core,l1i-cache=core,l2-cache=core,l3-cache=die
+
     The following sub-option defines a CPU topology hierarchy (2 sockets
     totally on the machine, 2 clusters per socket, 2 cores per cluster,
     2 threads per core) for ARM virt machines which support sockets/clusters
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [RFC 0/8] Introduce SMP Cache Topology
  2024-02-20  9:24 [RFC 0/8] Introduce SMP Cache Topology Zhao Liu
                   ` (7 preceding siblings ...)
  2024-02-20  9:25 ` [RFC 8/8] qemu-options: Add the cache topology description of -smp Zhao Liu
@ 2024-02-20 20:07 ` Philippe Mathieu-Daudé
  8 siblings, 0 replies; 26+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-02-20 20:07 UTC (permalink / raw)
  To: Zhao Liu, Daniel P . Berrangé, Eduardo Habkost,
	Marcel Apfelbaum, Yanan Wang, Michael S . Tsirkin, Paolo Bonzini,
	Richard Henderson, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Alex Bennée, Peter Maydell, Jonathan Cameron, Sia Jee Heng,
	Igor Mammedov
  Cc: qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Yongwei Ma, Zhao Liu

+Igor

On 20/2/24 10:24, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
> 
> Hi list,
> 
> This's our proposal for supporting (SMP) cache topology in -smp as
> the following example:
> 
> -smp 32,sockets=2,dies=2,modules=2,cores=2,threads=2,maxcpus=32,\
>       l1d-cache=core,l1i-cache=core,l2-cache=core,l3-cache=die
> 
> With the new cache topology options ("l1d-cache", "l1i-cache",
> "l2-cache" and "l3-cache"), we could adjust the cache topology via -smp.
> 
> This patch set is rebased on our i386 module series:
> https://lore.kernel.org/qemu-devel/20240131101350.109512-1-zhao1.liu@linux.intel.com/
> 
> Since the ARM [1] and RISC-V [2] folks have similar needs for the cache
> topology, I also cc'd the ARM and RISC-V folks and lists.
> 
> 
> Welcome your feedback!
> 
> 
> Introduction
> ============
> 
> Background
> ----------
> 
> Intel client platforms (ADL/RPL/MTL) and E core server platforms (SRF)
> share the L2 cache domain among multiple E cores (in the same module).
> 
> Thus we need a way to adjust the cache topology so that users could
> create the cache topology for Guest that is nearly identical to Host.
> 
> This is necessary in cases where there are bound vCPUs, especially
> considering that Guest scheduling often takes into account the cache
> topology as well (e.g. Linux cluster aware scheduling, i.e. L2 cache
> scheduling).
> 
> Previously, we introduced a x86 specific option to adjust the cache
> topology:
> 
> -cpu x-l2-cache-topo=[core|module] [3]
> 
> However, considering the needs of other arches, we re-implemented the
> generic cache topology (aslo in response to Michael's [4] and Daniel's
> comment [5]) in this series.
> 
> 
> Cache Topology Representation
> -----------------------------
> 
> We consider to define the cache topology based on CPU topology level for
> two reasons:
> 
> 1. In practice, a cache will always be bound to the CPU container -
>     "CPU container" indicates to a set of CPUs that refer to a certain
>     level of CPU topology - where the cache is either private in that
>     CPU container or shared among multiple containers.
> 
> 2. The x86's cache-related CPUIDs encode cache topology based on APIC
>     ID's CPU topology layout. And the ACPI PPTT table that ARM/RISCV
>     relies on also requires CPU containers (CPU topology) to help
>     indicate the private shared hierarchy of the cache.
> 
> Therefore, for SMP systems, it is natural to use the CPU topology
> hierarchy directly in QEMU to define the cache topology.
> 
> And currently, separated L1 cache (L1 data cache and L1 instruction
> cache) with unified higher-level caches (e.g., unified L2 and L3
> caches), is the most common cache architectures.
> 
> Thus, we define the topology for L1 D-cache, L1 I-cache, L2 cache and L3
> cache in MachineState as the basic cache topology support:
> 
> typedef struct CacheTopology {
>      CPUTopoLevel l1d;
>      CPUTopoLevel l1i;
>      CPUTopoLevel l2;
>      CPUTopoLevel l3;
> } CacheTopology;
> 
> Machines may also only support a subset of the cache topology
> to be configured in -smp by setting the SMP property of MachineClass:
> 
> typedef struct {
>      ...
>      bool l1_separated_cache_supported;
>      bool l2_unified_cache_supported;
>      bool l3_unified_cache_supported;
> } SMPCompatProps;
> 
> 
> Cache Topology Configuration in -smp
> ------------------------------------
> 
> Further, we add new parameters to -smp:
> * l1d-cache=level
> * l1i-cache=level
> * l2-cache=level
> * l3-cache=level
> 
> These cache topology parameters accept the strings of CPU topology
> levels (such as "drawer", "book", "socket", "die", "cluster", "module",
> "core" or "thread"). Exactly which topology level strings could be
> accepted as the parameter depends on the machine's support for the
> corresponding CPU topology level.
> 
> Unsupported cache topology parameters will be omitted, and
> correspondingly, the target CPU's cache topology will use the its
> default cache topology setting.
> 
> In this series, we add the cache topology support in -smp for x86 PC
> machine.
> 
> The following example defines a 3-level cache topology hierarchy (L1
> D-cache per core, L1 I-cache per core, L2 cache per core and L3 cache per
> die) for PC machine.
> 
> -smp 32,sockets=2,dies=2,modules=2,cores=2,threads=2,maxcpus=32,\
>       l1d-cache=core,l1i-cache=core,l2-cache=core,l3-cache=die
> 
> 
> Reference
> ---------
> 
> [1]: [ARM] Jonathan's proposal to adjust cache topology:
>       https://lore.kernel.org/qemu-devel/20230808115713.2613-2-Jonathan.Cameron@huawei.com/
> [2]: [RISC-V] Discussion between JeeHeng and Jonathan about cache
>       topology:
>       https://lore.kernel.org/qemu-devel/20240131155336.000068d1@Huawei.com/
> [3]: Previous x86 specific cache topology option:
>       https://lore.kernel.org/qemu-devel/20230914072159.1177582-22-zhao1.liu@linux.intel.com/
> [4]: Michael's comment about generic cache topology support:
>       https://lore.kernel.org/qemu-devel/20231003085516-mutt-send-email-mst@kernel.org/
> [5]: Daniel's question about how x86 support L2 cache domain (cluster)
>       configuration:
>       https://lore.kernel.org/qemu-devel/ZcUG0Uc8KylEQhUW@redhat.com/
> 
> Thanks and Best Regards,
> Zhao
> 
> ---
> Zhao Liu (8):
>    hw/core: Rename CpuTopology to CPUTopology
>    hw/core: Move CPU topology enumeration into arch-agnostic file
>    hw/core: Define cache topology for machine
>    hw/core: Add cache topology options in -smp
>    i386/cpu: Support thread and module level cache topology
>    i386/cpu: Update cache topology with machine's configuration
>    i386/pc: Support cache topology in -smp for PC machine
>    qemu-options: Add the cache topology description of -smp
> 
>   MAINTAINERS                     |   2 +
>   hw/core/cpu-topology.c          |  56 ++++++++++++++
>   hw/core/machine-smp.c           | 128 ++++++++++++++++++++++++++++++++
>   hw/core/machine.c               |   9 +++
>   hw/core/meson.build             |   1 +
>   hw/i386/pc.c                    |   3 +
>   hw/s390x/cpu-topology.c         |   6 +-
>   include/hw/boards.h             |  33 +++++++-
>   include/hw/core/cpu-topology.h  |  40 ++++++++++
>   include/hw/i386/topology.h      |  18 +----
>   include/hw/s390x/cpu-topology.h |   6 +-
>   qapi/machine.json               |  14 +++-
>   qemu-options.hx                 |  54 ++++++++++++--
>   system/vl.c                     |  15 ++++
>   target/i386/cpu.c               |  55 ++++++++++----
>   target/i386/cpu.h               |   2 +-
>   tests/unit/meson.build          |   3 +-
>   tests/unit/test-smp-parse.c     |  14 ++--
>   18 files changed, 399 insertions(+), 60 deletions(-)
>   create mode 100644 hw/core/cpu-topology.c
>   create mode 100644 include/hw/core/cpu-topology.h
> 



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC 4/8] hw/core: Add cache topology options in -smp
  2024-02-20  9:25 ` [RFC 4/8] hw/core: Add cache topology options in -smp Zhao Liu
@ 2024-02-21 12:46   ` Markus Armbruster
  2024-02-21 15:17     ` Zhao Liu
  2024-02-26 15:39   ` Jonathan Cameron via
  2024-02-28  5:38   ` JeeHeng Sia
  2 siblings, 1 reply; 26+ messages in thread
From: Markus Armbruster @ 2024-02-21 12:46 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Daniel P . Berrangé, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Marcelo Tosatti,
	Alex Bennée, Peter Maydell, Jonathan Cameron, Sia Jee Heng,
	qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Yongwei Ma, Zhao Liu

Zhao Liu <zhao1.liu@linux.intel.com> writes:

> From: Zhao Liu <zhao1.liu@intel.com>
>
> Add "l1d-cache", "l1i-cache". "l2-cache", and "l3-cache" options in
> -smp to define the cache topology for SMP system.
>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>

[...]

> diff --git a/qapi/machine.json b/qapi/machine.json
> index d0e7f1f615f3..0a923ac38803 100644
> --- a/qapi/machine.json
> +++ b/qapi/machine.json
> @@ -1650,6 +1650,14 @@
>  #
>  # @threads: number of threads per core
>  #
> +# @l1d-cache: topology hierarchy of L1 data cache (since 9.0)
> +#
> +# @l1i-cache: topology hierarchy of L1 instruction cache (since 9.0)
> +#
> +# @l2-cache: topology hierarchy of L2 unified cache (since 9.0)
> +#
> +# @l3-cache: topology hierarchy of L3 unified cache (since 9.0)
> +#

Too terse, just like my review ;-P

>  # Since: 6.1
>  ##
>  { 'struct': 'SMPConfiguration', 'data': {
> @@ -1662,7 +1670,11 @@
>       '*modules': 'int',
>       '*cores': 'int',
>       '*threads': 'int',
> -     '*maxcpus': 'int' } }
> +     '*maxcpus': 'int',
> +     '*l1d-cache': 'str',
> +     '*l1i-cache': 'str',
> +     '*l2-cache': 'str',
> +     '*l3-cache': 'str' } }
>  
>  ##
>  # @x-query-irq:

[...]



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC 4/8] hw/core: Add cache topology options in -smp
  2024-02-21 12:46   ` Markus Armbruster
@ 2024-02-21 15:17     ` Zhao Liu
  0 siblings, 0 replies; 26+ messages in thread
From: Zhao Liu @ 2024-02-21 15:17 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Daniel P . Berrangé, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Marcelo Tosatti,
	Alex Bennée, Peter Maydell, Jonathan Cameron, Sia Jee Heng,
	qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Yongwei Ma, Zhao Liu

On Wed, Feb 21, 2024 at 01:46:21PM +0100, Markus Armbruster wrote:
> Date: Wed, 21 Feb 2024 13:46:21 +0100
> From: Markus Armbruster <armbru@redhat.com>
> Subject: Re: [RFC 4/8] hw/core: Add cache topology options in -smp
> 
> Zhao Liu <zhao1.liu@linux.intel.com> writes:
> 
> > From: Zhao Liu <zhao1.liu@intel.com>
> >
> > Add "l1d-cache", "l1i-cache". "l2-cache", and "l3-cache" options in
> > -smp to define the cache topology for SMP system.
> >
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> 
> [...]
> 
> > diff --git a/qapi/machine.json b/qapi/machine.json
> > index d0e7f1f615f3..0a923ac38803 100644
> > --- a/qapi/machine.json
> > +++ b/qapi/machine.json
> > @@ -1650,6 +1650,14 @@
> >  #
> >  # @threads: number of threads per core
> >  #
> > +# @l1d-cache: topology hierarchy of L1 data cache (since 9.0)
> > +#
> > +# @l1i-cache: topology hierarchy of L1 instruction cache (since 9.0)
> > +#
> > +# @l2-cache: topology hierarchy of L2 unified cache (since 9.0)
> > +#
> > +# @l3-cache: topology hierarchy of L3 unified cache (since 9.0)
> > +#
> 
> Too terse, just like my review ;-P

;-) Yes, I'll add more information to improve the readability of the
code and comments.

Thanks,
Zhao

> 
> >  # Since: 6.1
> >  ##
> >  { 'struct': 'SMPConfiguration', 'data': {
> > @@ -1662,7 +1670,11 @@
> >       '*modules': 'int',
> >       '*cores': 'int',
> >       '*threads': 'int',
> > -     '*maxcpus': 'int' } }
> > +     '*maxcpus': 'int',
> > +     '*l1d-cache': 'str',
> > +     '*l1i-cache': 'str',
> > +     '*l2-cache': 'str',
> > +     '*l3-cache': 'str' } }
> >  
> >  ##
> >  # @x-query-irq:
> 
> [...]
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC 4/8] hw/core: Add cache topology options in -smp
  2024-02-20  9:25 ` [RFC 4/8] hw/core: Add cache topology options in -smp Zhao Liu
  2024-02-21 12:46   ` Markus Armbruster
@ 2024-02-26 15:39   ` Jonathan Cameron via
  2024-02-27  9:20     ` Zhao Liu
  2024-02-28  5:38   ` JeeHeng Sia
  2 siblings, 1 reply; 26+ messages in thread
From: Jonathan Cameron via @ 2024-02-26 15:39 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Daniel P . Berrangé, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Markus Armbruster,
	Marcelo Tosatti, Alex Bennée, Peter Maydell, Sia Jee Heng,
	qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Yongwei Ma, Zhao Liu

On Tue, 20 Feb 2024 17:25:00 +0800
Zhao Liu <zhao1.liu@linux.intel.com> wrote:

> From: Zhao Liu <zhao1.liu@intel.com>
> 
> Add "l1d-cache", "l1i-cache". "l2-cache", and "l3-cache" options in
> -smp to define the cache topology for SMP system.
> 
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Hi Zhao Liu

I like the scheme.  Strikes a good balance between complexity of description
and systems that actually exist. Sure there are systems with more cache
levels etc but they are rare and support can be easily added later
if people want to model them.

A few minor comments inline.

Jonathan
> ---
>  hw/core/machine-smp.c | 128 ++++++++++++++++++++++++++++++++++++++++++
>  hw/core/machine.c     |   4 ++
>  qapi/machine.json     |  14 ++++-
>  system/vl.c           |  15 +++++
>  4 files changed, 160 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/core/machine-smp.c b/hw/core/machine-smp.c
> index 8a8296b0d05b..2cbd19f4aa57 100644
> --- a/hw/core/machine-smp.c
> +++ b/hw/core/machine-smp.c
> @@ -61,6 +61,132 @@ static char *cpu_hierarchy_to_string(MachineState *ms)
>      return g_string_free(s, false);
>  }
>  
> +static bool machine_check_topo_support(MachineState *ms,
> +                                       CPUTopoLevel topo)
> +{
> +    MachineClass *mc = MACHINE_GET_CLASS(ms);
> +
> +    if (topo == CPU_TOPO_LEVEL_MODULE && !mc->smp_props.modules_supported) {
> +        return false;
> +    }
> +
> +    if (topo == CPU_TOPO_LEVEL_CLUSTER && !mc->smp_props.clusters_supported) {
> +        return false;
> +    }
> +
> +    if (topo == CPU_TOPO_LEVEL_DIE && !mc->smp_props.dies_supported) {
> +        return false;
> +    }
> +
> +    if (topo == CPU_TOPO_LEVEL_BOOK && !mc->smp_props.books_supported) {
> +        return false;
> +    }
> +
> +    if (topo == CPU_TOPO_LEVEL_DRAWER && !mc->smp_props.drawers_supported) {
> +        return false;
> +    }
> +
> +    return true;
> +}
> +
> +static int smp_cache_string_to_topology(MachineState *ms,

Not a good name for a function that does rather more than that.

> +                                        char *topo_str,
> +                                        CPUTopoLevel *topo,
> +                                        Error **errp)
> +{
> +    *topo = string_to_cpu_topo(topo_str);
> +
> +    if (*topo == CPU_TOPO_LEVEL_MAX || *topo == CPU_TOPO_LEVEL_INVALID) {
> +        error_setg(errp, "Invalid cache topology level: %s. The cache "
> +                   "topology should match the CPU topology level", topo_str);
> +        return -1;
> +    }
> +
> +    if (!machine_check_topo_support(ms, *topo)) {
> +        error_setg(errp, "Invalid cache topology level: %s. The topology "
> +                   "level is not supported by this machine", topo_str);
> +        return -1;
> +    }
> +
> +    return 0;
> +}
> +
> +static void machine_parse_smp_cache_config(MachineState *ms,
> +                                           const SMPConfiguration *config,
> +                                           Error **errp)
> +{
> +    MachineClass *mc = MACHINE_GET_CLASS(ms);
> +
> +    if (config->l1d_cache) {
> +        if (!mc->smp_props.l1_separated_cache_supported) {
> +            error_setg(errp, "L1 D-cache topology not "
> +                       "supported by this machine");
> +            return;
> +        }
> +
> +        if (smp_cache_string_to_topology(ms, config->l1d_cache,
> +            &ms->smp_cache.l1d, errp)) {

Indent is to wrong opening bracket.
Same for other cases.


> +            return;
> +        }
> +    }

> +}
> +
>  /*
>   * machine_parse_smp_config: Generic function used to parse the given
>   *                           SMP configuration
> @@ -249,6 +375,8 @@ void machine_parse_smp_config(MachineState *ms,
>                     mc->name, mc->max_cpus);
>          return;
>      }
> +
> +    machine_parse_smp_cache_config(ms, config, errp);
>  }
>  
>  unsigned int machine_topo_get_cores_per_socket(const MachineState *ms)



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC 8/8] qemu-options: Add the cache topology description of -smp
  2024-02-20  9:25 ` [RFC 8/8] qemu-options: Add the cache topology description of -smp Zhao Liu
@ 2024-02-26 15:47   ` Jonathan Cameron via
  2024-02-27 16:17     ` Zhao Liu
  0 siblings, 1 reply; 26+ messages in thread
From: Jonathan Cameron via @ 2024-02-26 15:47 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Daniel P . Berrangé, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Markus Armbruster,
	Marcelo Tosatti, Alex Bennée, Peter Maydell, Sia Jee Heng,
	qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Yongwei Ma, Zhao Liu

On Tue, 20 Feb 2024 17:25:04 +0800
Zhao Liu <zhao1.liu@linux.intel.com> wrote:

> From: Zhao Liu <zhao1.liu@intel.com>
> 
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>

Hi,

A trivial comment, but also a possibly more significant one about
whether the defaults are correctly verified.

Jonathan
> ---
>  qemu-options.hx | 54 ++++++++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 47 insertions(+), 7 deletions(-)
> 
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 70eaf3256685..85c78c99a3b0 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -281,7 +281,9 @@ ERST
>  
>  DEF("smp", HAS_ARG, QEMU_OPTION_smp,
>      "-smp [[cpus=]n][,maxcpus=maxcpus][,drawers=drawers][,books=books][,sockets=sockets]\n"
> -    "               [,dies=dies][,clusters=clusters][,cores=cores][,threads=threads]\n"
> +    "               [,dies=dies][,clusters=clusters][,modules=modules][,cores=cores]\n"
> +    "               [,threads=threads][,l1d-cache=level][,l1i-cache=level][,l2-cache=level]\n"
burns more characters but I'd go with
l1d->cache=topo_level

As level for a cache has a totally different meaning!

> +    "               [,l3-cache=level]\n"
>      "                set the number of initial CPUs to 'n' [default=1]\n"
>      "                maxcpus= maximum number of total CPUs, including\n"
>      "                offline CPUs for hotplug, etc\n"
> @@ -290,9 +292,14 @@ DEF("smp", HAS_ARG, QEMU_OPTION_smp,
>      "                sockets= number of sockets in one book\n"
>      "                dies= number of dies in one socket\n"
>      "                clusters= number of clusters in one die\n"
> -    "                cores= number of cores in one cluster\n"
> +    "                modules= number of modules in one cluster\n"
> +    "                cores= number of cores in one module\n"
>      "                threads= number of threads in one core\n"
> -    "Note: Different machines may have different subsets of the CPU topology\n"
> +    "                l1d-cache= topology level of L1 D-cache\n"
> +    "                l1i-cache= topology level of L1 I-cache\n"
> +    "                l2-cache= topology level of L2 cache\n"
> +    "                l3-cache= topology level of L3 cache\n"
> +    "Note: Different machines may have different subsets of the CPU and cache topology\n"

>  
>          -smp 32,sockets=2,dies=2,modules=2,cores=2,threads=2,maxcpus=32
>  
> +    The following sub-option defines a CPU topology hierarchy (2 sockets
> +    totally on the machine, 2 dies per socket, 2 modules per die, 2 cores per
> +    module, 2 threads per core) with 3-level cache topology hierarchy (L1
> +    D-cache per core, L1 I-cache per core, L2 cache per core and L3 cache per
> +    die) for PC machines which support sockets/dies/modules/cores/threads.
> +    Some members of the CPU topology option can be omitted but their values
> +    will be automatically computed. Some members of the cache topology
> +    option can also be omitted and target CPU will use the default topology.:

Given the default could be inconsistent I wonder if we should 'push' levels
up.  So if L2 not defined it is set either to default of equal to max of
l1i and l1d level. L3 either default or same level as l2.

Won't always correspond to a sensible system so maybe just rejecting
cases where default isn't possible is the best plan.  However I don't
see that verification as the checks on higher levels are gated on them
being specified.

> +
> +    ::
> +
> +        -smp 32,sockets=2,dies=2,modules=2,cores=2,threads=2,maxcpus=32,\
> +             l1d-cache=core,l1i-cache=core,l2-cache=core,l3-cache=die
> +
>      The following sub-option defines a CPU topology hierarchy (2 sockets
>      totally on the machine, 2 clusters per socket, 2 cores per cluster,
>      2 threads per core) for ARM virt machines which support sockets/clusters



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC 4/8] hw/core: Add cache topology options in -smp
  2024-02-27  9:20     ` Zhao Liu
@ 2024-02-27  9:12       ` Daniel P. Berrangé
  2024-02-27 10:35         ` Zhao Liu
  2024-02-27 10:51       ` Jonathan Cameron via
  1 sibling, 1 reply; 26+ messages in thread
From: Daniel P. Berrangé @ 2024-02-27  9:12 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Jonathan Cameron, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daud�, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Markus Armbruster,
	Marcelo Tosatti, Alex Benn�e, Peter Maydell, Sia Jee Heng,
	qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Yongwei Ma, Zhao Liu

On Tue, Feb 27, 2024 at 05:20:25PM +0800, Zhao Liu wrote:
> Hi Jonathan,
> 
> > Hi Zhao Liu
> > 
> > I like the scheme.  Strikes a good balance between complexity of description
> > and systems that actually exist. Sure there are systems with more cache
> > levels etc but they are rare and support can be easily added later
> > if people want to model them.
> 
> Thanks for your support!
> 
> [snip]
> 
> > > +static int smp_cache_string_to_topology(MachineState *ms,
> > 
> > Not a good name for a function that does rather more than that.
> 
> What about "smp_cache_get_valid_topology()"?
> 
> > 
> > > +                                        char *topo_str,
> > > +                                        CPUTopoLevel *topo,
> > > +                                        Error **errp)
> > > +{
> > > +    *topo = string_to_cpu_topo(topo_str);
> > > +
> > > +    if (*topo == CPU_TOPO_LEVEL_MAX || *topo == CPU_TOPO_LEVEL_INVALID) {
> > > +        error_setg(errp, "Invalid cache topology level: %s. The cache "
> > > +                   "topology should match the CPU topology level", topo_str);
> > > +        return -1;
> > > +    }
> > > +
> > > +    if (!machine_check_topo_support(ms, *topo)) {
> > > +        error_setg(errp, "Invalid cache topology level: %s. The topology "
> > > +                   "level is not supported by this machine", topo_str);
> > > +        return -1;
> > > +    }
> > > +
> > > +    return 0;
> > > +}
> > > +
> > > +static void machine_parse_smp_cache_config(MachineState *ms,
> > > +                                           const SMPConfiguration *config,
> > > +                                           Error **errp)
> > > +{
> > > +    MachineClass *mc = MACHINE_GET_CLASS(ms);
> > > +
> > > +    if (config->l1d_cache) {
> > > +        if (!mc->smp_props.l1_separated_cache_supported) {
> > > +            error_setg(errp, "L1 D-cache topology not "
> > > +                       "supported by this machine");
> > > +            return;
> > > +        }
> > > +
> > > +        if (smp_cache_string_to_topology(ms, config->l1d_cache,
> > > +            &ms->smp_cache.l1d, errp)) {
> > 
> > Indent is to wrong opening bracket.
> > Same for other cases.
> 
> Could you please educate me about the correct style here?
> I'm unsure if it should be indented by 4 spaces.

It needs to look like this:

        if (smp_cache_string_to_topology(ms, config->l1d_cache,
                                         &ms->smp_cache.l1d, errp)) {

so func parameters are aligned to the function calls' opening bracket,
not the 'if' statement's opening bracket.


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC 4/8] hw/core: Add cache topology options in -smp
  2024-02-26 15:39   ` Jonathan Cameron via
@ 2024-02-27  9:20     ` Zhao Liu
  2024-02-27  9:12       ` Daniel P. Berrangé
  2024-02-27 10:51       ` Jonathan Cameron via
  0 siblings, 2 replies; 26+ messages in thread
From: Zhao Liu @ 2024-02-27  9:20 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Daniel P . Berrang�, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daud�, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Markus Armbruster,
	Marcelo Tosatti, Alex Benn�e, Peter Maydell, Sia Jee Heng,
	qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Yongwei Ma, Zhao Liu

Hi Jonathan,

> Hi Zhao Liu
> 
> I like the scheme.  Strikes a good balance between complexity of description
> and systems that actually exist. Sure there are systems with more cache
> levels etc but they are rare and support can be easily added later
> if people want to model them.

Thanks for your support!

[snip]

> > +static int smp_cache_string_to_topology(MachineState *ms,
> 
> Not a good name for a function that does rather more than that.

What about "smp_cache_get_valid_topology()"?

> 
> > +                                        char *topo_str,
> > +                                        CPUTopoLevel *topo,
> > +                                        Error **errp)
> > +{
> > +    *topo = string_to_cpu_topo(topo_str);
> > +
> > +    if (*topo == CPU_TOPO_LEVEL_MAX || *topo == CPU_TOPO_LEVEL_INVALID) {
> > +        error_setg(errp, "Invalid cache topology level: %s. The cache "
> > +                   "topology should match the CPU topology level", topo_str);
> > +        return -1;
> > +    }
> > +
> > +    if (!machine_check_topo_support(ms, *topo)) {
> > +        error_setg(errp, "Invalid cache topology level: %s. The topology "
> > +                   "level is not supported by this machine", topo_str);
> > +        return -1;
> > +    }
> > +
> > +    return 0;
> > +}
> > +
> > +static void machine_parse_smp_cache_config(MachineState *ms,
> > +                                           const SMPConfiguration *config,
> > +                                           Error **errp)
> > +{
> > +    MachineClass *mc = MACHINE_GET_CLASS(ms);
> > +
> > +    if (config->l1d_cache) {
> > +        if (!mc->smp_props.l1_separated_cache_supported) {
> > +            error_setg(errp, "L1 D-cache topology not "
> > +                       "supported by this machine");
> > +            return;
> > +        }
> > +
> > +        if (smp_cache_string_to_topology(ms, config->l1d_cache,
> > +            &ms->smp_cache.l1d, errp)) {
> 
> Indent is to wrong opening bracket.
> Same for other cases.

Could you please educate me about the correct style here?
I'm unsure if it should be indented by 4 spaces.

Thanks,
Zhao



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC 4/8] hw/core: Add cache topology options in -smp
  2024-02-27  9:12       ` Daniel P. Berrangé
@ 2024-02-27 10:35         ` Zhao Liu
  0 siblings, 0 replies; 26+ messages in thread
From: Zhao Liu @ 2024-02-27 10:35 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Jonathan Cameron, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daud�, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Markus Armbruster,
	Marcelo Tosatti, Alex Benn�e, Peter Maydell, Sia Jee Heng,
	qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Yongwei Ma, Zhao Liu

> > > > +        if (smp_cache_string_to_topology(ms, config->l1d_cache,
> > > > +            &ms->smp_cache.l1d, errp)) {
> > > 
> > > Indent is to wrong opening bracket.
> > > Same for other cases.
> > 
> > Could you please educate me about the correct style here?
> > I'm unsure if it should be indented by 4 spaces.
> 
> It needs to look like this:
> 
>         if (smp_cache_string_to_topology(ms, config->l1d_cache,
>                                          &ms->smp_cache.l1d, errp)) {
> 
> so func parameters are aligned to the function calls' opening bracket,
> not the 'if' statement's opening bracket.
>

Thanks for your explaination!

Regards,
Zhao



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC 4/8] hw/core: Add cache topology options in -smp
  2024-02-27  9:20     ` Zhao Liu
  2024-02-27  9:12       ` Daniel P. Berrangé
@ 2024-02-27 10:51       ` Jonathan Cameron via
  2024-02-27 15:55         ` Zhao Liu
  1 sibling, 1 reply; 26+ messages in thread
From: Jonathan Cameron via @ 2024-02-27 10:51 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Daniel P . Berrang�, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daud�, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Markus Armbruster,
	Marcelo Tosatti, Alex Benn�e, Peter Maydell, Sia Jee Heng,
	qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Yongwei Ma, Zhao Liu

On Tue, 27 Feb 2024 17:20:25 +0800
Zhao Liu <zhao1.liu@linux.intel.com> wrote:

> Hi Jonathan,
> 
> > Hi Zhao Liu
> > 
> > I like the scheme.  Strikes a good balance between complexity of description
> > and systems that actually exist. Sure there are systems with more cache
> > levels etc but they are rare and support can be easily added later
> > if people want to model them.  
> 
> Thanks for your support!
> 
> [snip]
> 
> > > +static int smp_cache_string_to_topology(MachineState *ms,  
> > 
> > Not a good name for a function that does rather more than that.  
> 
> What about "smp_cache_get_valid_topology()"?

Looking again, could we return the CPUTopoLevel? I think returning
CPU_TOPO_LEVEL_INVALID will replace -1/0 returns and this can just
be smp_cache_string_to_topology() as you have it in this version.

The check on the return value becomes a little more more complex
and I think you want to squash CPU_TOPO_LEVEL_MAX down so we only
have one invalid value to check at callers..  E.g.

static CPUTopoLevel smp_cache_string_to_topolgy(MachineState *ms,
                                                char *top_str,
                                                Error **errp)
{
    CPUTopoLevel topo = string_to_cpu_topo(topo_str);

    if (topo == CPU_TOPO_LEVEL_MAX || topo == CPU_TOP?O_LEVEL_INVALID) {
        return CPU_TOPO_LEVEL_INVALID;
    }

    if (!machine_check_topo_support(ms, topo) {
        error_setg(errp,
                   "Invalid cache topology level: %s. "
                   "The cache topology should match the CPU topology level",
//Break string like this to make it as grep-able as possible!
                   topo_str);
        return CPU_TOPO_LEVEL_INVALID;
    }

    return topo;

}                


The checks then become != CPU_TOPO_LEVEL_INVALID at each callsite.

Jonathan




^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC 4/8] hw/core: Add cache topology options in -smp
  2024-02-27 10:51       ` Jonathan Cameron via
@ 2024-02-27 15:55         ` Zhao Liu
  0 siblings, 0 replies; 26+ messages in thread
From: Zhao Liu @ 2024-02-27 15:55 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Daniel P . Berrang�, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daud�, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Markus Armbruster,
	Marcelo Tosatti, Alex Benn�e, Peter Maydell, Sia Jee Heng,
	qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Yongwei Ma, Zhao Liu

On Tue, Feb 27, 2024 at 10:51:45AM +0000, Jonathan Cameron wrote:
> Date: Tue, 27 Feb 2024 10:51:45 +0000
> From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
> Subject: Re: [RFC 4/8] hw/core: Add cache topology options in -smp
> X-Mailer: Claws Mail 4.1.0 (GTK 3.24.33; x86_64-w64-mingw32)
> 
> On Tue, 27 Feb 2024 17:20:25 +0800
> Zhao Liu <zhao1.liu@linux.intel.com> wrote:
> 
> > Hi Jonathan,
> > 
> > > Hi Zhao Liu
> > > 
> > > I like the scheme.  Strikes a good balance between complexity of description
> > > and systems that actually exist. Sure there are systems with more cache
> > > levels etc but they are rare and support can be easily added later
> > > if people want to model them.  
> > 
> > Thanks for your support!
> > 
> > [snip]
> > 
> > > > +static int smp_cache_string_to_topology(MachineState *ms,  
> > > 
> > > Not a good name for a function that does rather more than that.  
> > 
> > What about "smp_cache_get_valid_topology()"?
> 
> Looking again, could we return the CPUTopoLevel? I think returning
> CPU_TOPO_LEVEL_INVALID will replace -1/0 returns and this can just
> be smp_cache_string_to_topology() as you have it in this version.
> 
> The check on the return value becomes a little more more complex
> and I think you want to squash CPU_TOPO_LEVEL_MAX down so we only
> have one invalid value to check at callers..  E.g.
> 
> static CPUTopoLevel smp_cache_string_to_topolgy(MachineState *ms,
>                                                 char *top_str,
>                                                 Error **errp)
> {
>     CPUTopoLevel topo = string_to_cpu_topo(topo_str);
> 
>     if (topo == CPU_TOPO_LEVEL_MAX || topo == CPU_TOP?O_LEVEL_INVALID) {
>         return CPU_TOPO_LEVEL_INVALID;
>     }
> 
>     if (!machine_check_topo_support(ms, topo) {
>         error_setg(errp,
>                    "Invalid cache topology level: %s. "
>                    "The cache topology should match the CPU topology level",
> //Break string like this to make it as grep-able as possible!
>                    topo_str);
>         return CPU_TOPO_LEVEL_INVALID;
>     }
> 
>     return topo;
> 
> }                
> 
> 
> The checks then become != CPU_TOPO_LEVEL_INVALID at each callsite.
>

Good idea! This makes the code clearer. Let me try this way. Thanks!



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC 8/8] qemu-options: Add the cache topology description of -smp
  2024-02-26 15:47   ` Jonathan Cameron via
@ 2024-02-27 16:17     ` Zhao Liu
  0 siblings, 0 replies; 26+ messages in thread
From: Zhao Liu @ 2024-02-27 16:17 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Daniel P . Berrang�, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daud�, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Markus Armbruster,
	Marcelo Tosatti, Alex Benn�e, Peter Maydell, Sia Jee Heng,
	qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Yongwei Ma, Zhao Liu

Hi Jonathan,

On Mon, Feb 26, 2024 at 03:47:34PM +0000, Jonathan Cameron wrote:
> Date: Mon, 26 Feb 2024 15:47:34 +0000
> From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
> Subject: Re: [RFC 8/8] qemu-options: Add the cache topology description of
>  -smp
> X-Mailer: Claws Mail 4.1.0 (GTK 3.24.33; x86_64-w64-mingw32)
> 
> On Tue, 20 Feb 2024 17:25:04 +0800
> Zhao Liu <zhao1.liu@linux.intel.com> wrote:
> 
> > From: Zhao Liu <zhao1.liu@intel.com>
> > 
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> 
> Hi,
> 
> A trivial comment, but also a possibly more significant one about
> whether the defaults are correctly verified.
> 
> Jonathan
> > ---
> >  qemu-options.hx | 54 ++++++++++++++++++++++++++++++++++++++++++-------
> >  1 file changed, 47 insertions(+), 7 deletions(-)
> > 
> > diff --git a/qemu-options.hx b/qemu-options.hx
> > index 70eaf3256685..85c78c99a3b0 100644
> > --- a/qemu-options.hx
> > +++ b/qemu-options.hx
> > @@ -281,7 +281,9 @@ ERST
> >  
> >  DEF("smp", HAS_ARG, QEMU_OPTION_smp,
> >      "-smp [[cpus=]n][,maxcpus=maxcpus][,drawers=drawers][,books=books][,sockets=sockets]\n"
> > -    "               [,dies=dies][,clusters=clusters][,cores=cores][,threads=threads]\n"
> > +    "               [,dies=dies][,clusters=clusters][,modules=modules][,cores=cores]\n"
> > +    "               [,threads=threads][,l1d-cache=level][,l1i-cache=level][,l2-cache=level]\n"
> burns more characters but I'd go with
> l1d->cache=topo_level
> 
> As level for a cache has a totally different meaning!

Yes, good catch! Thanks.

> 
> > +    "               [,l3-cache=level]\n"
> >      "                set the number of initial CPUs to 'n' [default=1]\n"
> >      "                maxcpus= maximum number of total CPUs, including\n"
> >      "                offline CPUs for hotplug, etc\n"
> > @@ -290,9 +292,14 @@ DEF("smp", HAS_ARG, QEMU_OPTION_smp,
> >      "                sockets= number of sockets in one book\n"
> >      "                dies= number of dies in one socket\n"
> >      "                clusters= number of clusters in one die\n"
> > -    "                cores= number of cores in one cluster\n"
> > +    "                modules= number of modules in one cluster\n"
> > +    "                cores= number of cores in one module\n"
> >      "                threads= number of threads in one core\n"
> > -    "Note: Different machines may have different subsets of the CPU topology\n"
> > +    "                l1d-cache= topology level of L1 D-cache\n"
> > +    "                l1i-cache= topology level of L1 I-cache\n"
> > +    "                l2-cache= topology level of L2 cache\n"
> > +    "                l3-cache= topology level of L3 cache\n"
> > +    "Note: Different machines may have different subsets of the CPU and cache topology\n"
> 
> >  
> >          -smp 32,sockets=2,dies=2,modules=2,cores=2,threads=2,maxcpus=32
> >  
> > +    The following sub-option defines a CPU topology hierarchy (2 sockets
> > +    totally on the machine, 2 dies per socket, 2 modules per die, 2 cores per
> > +    module, 2 threads per core) with 3-level cache topology hierarchy (L1
> > +    D-cache per core, L1 I-cache per core, L2 cache per core and L3 cache per
> > +    die) for PC machines which support sockets/dies/modules/cores/threads.
> > +    Some members of the CPU topology option can be omitted but their values
> > +    will be automatically computed. Some members of the cache topology
> > +    option can also be omitted and target CPU will use the default topology.:
> 
> Given the default could be inconsistent I wonder if we should 'push' levels
> up.  So if L2 not defined it is set either to default of equal to max of
> l1i and l1d level. L3 either default or same level as l2.

HMM, IIUC, I think there may be the case:

User sets L2 cache as per core and omits L3 cache. In this case, if L3
is per core (as L2) by default, how could we identify if that per core
L3 is the default or from user? We need to identify this becase x86's L3
is shared at die by default and L2 is shared at core level for current
CPU models.

To resolve this issue, we can add the status field in SMPCompatProps,
e.g., has_l3_cache, just like current SMPCompatProps.has_clusters, to
explicitly indicate that the L3 cache topo is set by user.

Then other caches also need the similar fields...It doesn't look as
simple as the current default invalid topology level.
 
> Won't always correspond to a sensible system so maybe just rejecting
> cases where default isn't possible is the best plan.  However I don't
> see that verification as the checks on higher levels are gated on them
> being specified.
> 
> > +
> > +    ::
> > +
> > +        -smp 32,sockets=2,dies=2,modules=2,cores=2,threads=2,maxcpus=32,\
> > +             l1d-cache=core,l1i-cache=core,l2-cache=core,l3-cache=die
> > +
> >      The following sub-option defines a CPU topology hierarchy (2 sockets
> >      totally on the machine, 2 clusters per socket, 2 cores per cluster,
> >      2 threads per core) for ARM virt machines which support sockets/clusters
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [RFC 4/8] hw/core: Add cache topology options in -smp
  2024-02-20  9:25 ` [RFC 4/8] hw/core: Add cache topology options in -smp Zhao Liu
  2024-02-21 12:46   ` Markus Armbruster
  2024-02-26 15:39   ` Jonathan Cameron via
@ 2024-02-28  5:38   ` JeeHeng Sia
  2024-02-29  7:04     ` Zhao Liu
  2 siblings, 1 reply; 26+ messages in thread
From: JeeHeng Sia @ 2024-02-28  5:38 UTC (permalink / raw)
  To: Zhao Liu, Daniel P . Berrangé, Eduardo Habkost,
	Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
	Michael S . Tsirkin, Paolo Bonzini, Richard Henderson, Eric Blake,
	Markus Armbruster, Marcelo Tosatti, Alex Bennée,
	Peter Maydell, Jonathan Cameron
  Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org, qemu-riscv@nongnu.org,
	qemu-arm@nongnu.org, Zhenyu Wang, Dapeng Mi, Yongwei Ma, Zhao Liu



> -----Original Message-----
> From: Zhao Liu <zhao1.liu@linux.intel.com>
> Sent: Tuesday, February 20, 2024 5:25 PM
> To: Daniel P . Berrangé <berrange@redhat.com>; Eduardo Habkost <eduardo@habkost.net>; Marcel Apfelbaum
> <marcel.apfelbaum@gmail.com>; Philippe Mathieu-Daudé <philmd@linaro.org>; Yanan Wang <wangyanan55@huawei.com>;
> Michael S . Tsirkin <mst@redhat.com>; Paolo Bonzini <pbonzini@redhat.com>; Richard Henderson <richard.henderson@linaro.org>;
> Eric Blake <eblake@redhat.com>; Markus Armbruster <armbru@redhat.com>; Marcelo Tosatti <mtosatti@redhat.com>; Alex Bennée
> <alex.bennee@linaro.org>; Peter Maydell <peter.maydell@linaro.org>; Jonathan Cameron <Jonathan.Cameron@huawei.com>;
> JeeHeng Sia <jeeheng.sia@starfivetech.com>
> Cc: qemu-devel@nongnu.org; kvm@vger.kernel.org; qemu-riscv@nongnu.org; qemu-arm@nongnu.org; Zhenyu Wang
> <zhenyu.z.wang@intel.com>; Dapeng Mi <dapeng1.mi@linux.intel.com>; Yongwei Ma <yongwei.ma@intel.com>; Zhao Liu
> <zhao1.liu@intel.com>
> Subject: [RFC 4/8] hw/core: Add cache topology options in -smp
> 
> From: Zhao Liu <zhao1.liu@intel.com>
> 
> Add "l1d-cache", "l1i-cache". "l2-cache", and "l3-cache" options in
> -smp to define the cache topology for SMP system.
> 
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
>  hw/core/machine-smp.c | 128 ++++++++++++++++++++++++++++++++++++++++++
>  hw/core/machine.c     |   4 ++
>  qapi/machine.json     |  14 ++++-
>  system/vl.c           |  15 +++++
>  4 files changed, 160 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/core/machine-smp.c b/hw/core/machine-smp.c
> index 8a8296b0d05b..2cbd19f4aa57 100644
> --- a/hw/core/machine-smp.c
> +++ b/hw/core/machine-smp.c
> @@ -61,6 +61,132 @@ static char *cpu_hierarchy_to_string(MachineState *ms)
>      return g_string_free(s, false);
>  }
> 
> +static bool machine_check_topo_support(MachineState *ms,
> +                                       CPUTopoLevel topo)
> +{
> +    MachineClass *mc = MACHINE_GET_CLASS(ms);
> +
> +    if (topo == CPU_TOPO_LEVEL_MODULE && !mc->smp_props.modules_supported) {
> +        return false;
> +    }
> +
> +    if (topo == CPU_TOPO_LEVEL_CLUSTER && !mc->smp_props.clusters_supported) {
> +        return false;
> +    }
> +
> +    if (topo == CPU_TOPO_LEVEL_DIE && !mc->smp_props.dies_supported) {
> +        return false;
> +    }
> +
> +    if (topo == CPU_TOPO_LEVEL_BOOK && !mc->smp_props.books_supported) {
> +        return false;
> +    }
> +
> +    if (topo == CPU_TOPO_LEVEL_DRAWER && !mc->smp_props.drawers_supported) {
> +        return false;
> +    }
> +
> +    return true;
> +}
> +
> +static int smp_cache_string_to_topology(MachineState *ms,
> +                                        char *topo_str,
> +                                        CPUTopoLevel *topo,
> +                                        Error **errp)
> +{
> +    *topo = string_to_cpu_topo(topo_str);
> +
> +    if (*topo == CPU_TOPO_LEVEL_MAX || *topo == CPU_TOPO_LEVEL_INVALID) {
> +        error_setg(errp, "Invalid cache topology level: %s. The cache "
> +                   "topology should match the CPU topology level", topo_str);
> +        return -1;
> +    }
> +
> +    if (!machine_check_topo_support(ms, *topo)) {
> +        error_setg(errp, "Invalid cache topology level: %s. The topology "
> +                   "level is not supported by this machine", topo_str);
> +        return -1;
> +    }
> +
> +    return 0;
> +}
> +
> +static void machine_parse_smp_cache_config(MachineState *ms,
> +                                           const SMPConfiguration *config,
> +                                           Error **errp)
> +{
> +    MachineClass *mc = MACHINE_GET_CLASS(ms);
> +
> +    if (config->l1d_cache) {
> +        if (!mc->smp_props.l1_separated_cache_supported) {
> +            error_setg(errp, "L1 D-cache topology not "
> +                       "supported by this machine");
> +            return;
> +        }
> +
> +        if (smp_cache_string_to_topology(ms, config->l1d_cache,
> +            &ms->smp_cache.l1d, errp)) {
> +            return;
> +        }
> +    }
> +
> +    if (config->l1i_cache) {
> +        if (!mc->smp_props.l1_separated_cache_supported) {
> +            error_setg(errp, "L1 I-cache topology not "
> +                       "supported by this machine");
> +            return;
> +        }
> +
> +        if (smp_cache_string_to_topology(ms, config->l1i_cache,
> +            &ms->smp_cache.l1i, errp)) {
> +            return;
> +        }
> +    }
> +
> +    if (config->l2_cache) {
> +        if (!mc->smp_props.l2_unified_cache_supported) {
> +            error_setg(errp, "L2 cache topology not "
> +                       "supported by this machine");
> +            return;
> +        }
> +
> +        if (smp_cache_string_to_topology(ms, config->l2_cache,
> +            &ms->smp_cache.l2, errp)) {
> +            return;
> +        }
> +
> +        if (ms->smp_cache.l1d > ms->smp_cache.l2 ||
> +            ms->smp_cache.l1i > ms->smp_cache.l2) {
> +            error_setg(errp, "Invalid L2 cache topology. "
> +                       "L2 cache topology level should not be "
> +                       "lower than L1 D-cache/L1 I-cache");
> +            return;
> +        }
> +    }
> +
> +    if (config->l3_cache) {
> +        if (!mc->smp_props.l2_unified_cache_supported) {
> +            error_setg(errp, "L3 cache topology not "
> +                       "supported by this machine");
> +            return;
> +        }
> +
> +        if (smp_cache_string_to_topology(ms, config->l3_cache,
> +            &ms->smp_cache.l3, errp)) {
> +            return;
> +        }
> +
> +        if (ms->smp_cache.l1d > ms->smp_cache.l3 ||
> +            ms->smp_cache.l1i > ms->smp_cache.l3 ||
> +            ms->smp_cache.l2 > ms->smp_cache.l3) {
> +            error_setg(errp, "Invalid L3 cache topology. "
> +                       "L3 cache topology level should not be "
> +                       "lower than L1 D-cache/L1 I-cache/L2 cache");
> +            return;
> +        }
> +    }
> +}
> +
>  /*
>   * machine_parse_smp_config: Generic function used to parse the given
>   *                           SMP configuration
> @@ -249,6 +375,8 @@ void machine_parse_smp_config(MachineState *ms,
>                     mc->name, mc->max_cpus);
>          return;
>      }
> +
> +    machine_parse_smp_cache_config(ms, config, errp);
>  }
> 
>  unsigned int machine_topo_get_cores_per_socket(const MachineState *ms)
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 426f71770a84..cb5173927b0d 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -886,6 +886,10 @@ static void machine_get_smp(Object *obj, Visitor *v, const char *name,
>          .has_cores = true, .cores = ms->smp.cores,
>          .has_threads = true, .threads = ms->smp.threads,
>          .has_maxcpus = true, .maxcpus = ms->smp.max_cpus,
> +        .l1d_cache = g_strdup(cpu_topo_to_string(ms->smp_cache.l1d)),
> +        .l1i_cache = g_strdup(cpu_topo_to_string(ms->smp_cache.l1i)),
> +        .l2_cache = g_strdup(cpu_topo_to_string(ms->smp_cache.l2)),
> +        .l3_cache = g_strdup(cpu_topo_to_string(ms->smp_cache.l3)),
Let's standardize the code by adding the 'has_' prefix.
>      };
> 
>      if (!visit_type_SMPConfiguration(v, name, &config, &error_abort)) {
> diff --git a/qapi/machine.json b/qapi/machine.json
> index d0e7f1f615f3..0a923ac38803 100644
> --- a/qapi/machine.json
> +++ b/qapi/machine.json
> @@ -1650,6 +1650,14 @@
>  #
>  # @threads: number of threads per core
>  #
> +# @l1d-cache: topology hierarchy of L1 data cache (since 9.0)
> +#
> +# @l1i-cache: topology hierarchy of L1 instruction cache (since 9.0)
> +#
> +# @l2-cache: topology hierarchy of L2 unified cache (since 9.0)
> +#
> +# @l3-cache: topology hierarchy of L3 unified cache (since 9.0)
> +#
>  # Since: 6.1
>  ##
>  { 'struct': 'SMPConfiguration', 'data': {
> @@ -1662,7 +1670,11 @@
>       '*modules': 'int',
>       '*cores': 'int',
>       '*threads': 'int',
> -     '*maxcpus': 'int' } }
> +     '*maxcpus': 'int',
> +     '*l1d-cache': 'str',
> +     '*l1i-cache': 'str',
> +     '*l2-cache': 'str',
> +     '*l3-cache': 'str' } }
> 
>  ##
>  # @x-query-irq:
> diff --git a/system/vl.c b/system/vl.c
> index a82555ae1558..ac95e5ddb656 100644
> --- a/system/vl.c
> +++ b/system/vl.c
> @@ -741,6 +741,9 @@ static QemuOptsList qemu_smp_opts = {
>          }, {
>              .name = "clusters",
>              .type = QEMU_OPT_NUMBER,
> +        }, {
> +            .name = "modules",
> +            .type = QEMU_OPT_NUMBER,
>          }, {
>              .name = "cores",
>              .type = QEMU_OPT_NUMBER,
> @@ -750,6 +753,18 @@ static QemuOptsList qemu_smp_opts = {
>          }, {
>              .name = "maxcpus",
>              .type = QEMU_OPT_NUMBER,
> +        }, {
> +            .name = "l1d-cache",
> +            .type = QEMU_OPT_STRING,
> +        }, {
> +            .name = "l1i-cache",
> +            .type = QEMU_OPT_STRING,
> +        }, {
> +            .name = "l2-cache",
> +            .type = QEMU_OPT_STRING,
> +        }, {
> +            .name = "l3-cache",
> +            .type = QEMU_OPT_STRING,
>          },
>          { /*End of list */ }
>      },
> --
> 2.34.1



^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [RFC 6/8] i386/cpu: Update cache topology with machine's configuration
  2024-02-20  9:25 ` [RFC 6/8] i386/cpu: Update cache topology with machine's configuration Zhao Liu
@ 2024-02-28  9:45   ` JeeHeng Sia
  2024-02-29  7:19     ` Zhao Liu
  0 siblings, 1 reply; 26+ messages in thread
From: JeeHeng Sia @ 2024-02-28  9:45 UTC (permalink / raw)
  To: Zhao Liu, Daniel P . Berrangé, Eduardo Habkost,
	Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
	Michael S . Tsirkin, Paolo Bonzini, Richard Henderson, Eric Blake,
	Markus Armbruster, Marcelo Tosatti, Alex Bennée,
	Peter Maydell, Jonathan Cameron
  Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org, qemu-riscv@nongnu.org,
	qemu-arm@nongnu.org, Zhenyu Wang, Dapeng Mi, Yongwei Ma, Zhao Liu



> -----Original Message-----
> From: Zhao Liu <zhao1.liu@linux.intel.com>
> Sent: Tuesday, February 20, 2024 5:25 PM
> To: Daniel P . Berrangé <berrange@redhat.com>; Eduardo Habkost <eduardo@habkost.net>; Marcel Apfelbaum
> <marcel.apfelbaum@gmail.com>; Philippe Mathieu-Daudé <philmd@linaro.org>; Yanan Wang <wangyanan55@huawei.com>;
> Michael S . Tsirkin <mst@redhat.com>; Paolo Bonzini <pbonzini@redhat.com>; Richard Henderson <richard.henderson@linaro.org>;
> Eric Blake <eblake@redhat.com>; Markus Armbruster <armbru@redhat.com>; Marcelo Tosatti <mtosatti@redhat.com>; Alex Bennée
> <alex.bennee@linaro.org>; Peter Maydell <peter.maydell@linaro.org>; Jonathan Cameron <Jonathan.Cameron@huawei.com>;
> JeeHeng Sia <jeeheng.sia@starfivetech.com>
> Cc: qemu-devel@nongnu.org; kvm@vger.kernel.org; qemu-riscv@nongnu.org; qemu-arm@nongnu.org; Zhenyu Wang
> <zhenyu.z.wang@intel.com>; Dapeng Mi <dapeng1.mi@linux.intel.com>; Yongwei Ma <yongwei.ma@intel.com>; Zhao Liu
> <zhao1.liu@intel.com>
> Subject: [RFC 6/8] i386/cpu: Update cache topology with machine's configuration
> 
> From: Zhao Liu <zhao1.liu@intel.com>
> 
> User will configure SMP cache topology via -smp.
> 
> For this case, update the x86 CPUs' cache topology with user's
> configuration in MachineState.
> 
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
>  target/i386/cpu.c | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index d7cb0f1e49b4..4b5c551fe7f0 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -7582,6 +7582,27 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
> 
>  #ifndef CONFIG_USER_ONLY
>      MachineState *ms = MACHINE(qdev_get_machine());
> +
> +    if (ms->smp_cache.l1d != CPU_TOPO_LEVEL_INVALID) {
> +        env->cache_info_cpuid4.l1d_cache->share_level = ms->smp_cache.l1d;
> +        env->cache_info_amd.l1d_cache->share_level = ms->smp_cache.l1d;
> +    }
> +
> +    if (ms->smp_cache.l1i != CPU_TOPO_LEVEL_INVALID) {
> +        env->cache_info_cpuid4.l1i_cache->share_level = ms->smp_cache.l1i;
> +        env->cache_info_amd.l1i_cache->share_level = ms->smp_cache.l1i;
> +    }
> +
> +    if (ms->smp_cache.l2 != CPU_TOPO_LEVEL_INVALID) {
> +        env->cache_info_cpuid4.l2_cache->share_level = ms->smp_cache.l2;
> +        env->cache_info_amd.l2_cache->share_level = ms->smp_cache.l2;
> +    }
> +
> +    if (ms->smp_cache.l3 != CPU_TOPO_LEVEL_INVALID) {
> +        env->cache_info_cpuid4.l3_cache->share_level = ms->smp_cache.l3;
> +        env->cache_info_amd.l3_cache->share_level = ms->smp_cache.l3;
> +    }
> +
I think this block of code can be further optimized. Maybe we can create
a function called updateCacheShareLevel() that takes a cache pointer and
a share level as arguments. This function encapsulates the common
pattern of updating cache share levels for different caches. You can define
it like this:
void updateCacheShareLevel(XxxCacheInfo *cache, int shareLevel) {
    if (shareLevel != CPU_TOPO_LEVEL_INVALID) {
        cache->share_level = shareLevel;
    }
}
>      qemu_register_reset(x86_cpu_machine_reset_cb, cpu);
> 
>      if (cpu->env.features[FEAT_1_EDX] & CPUID_APIC || ms->smp.cpus > 1) {
> --
> 2.34.1



^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [RFC 2/8] hw/core: Move CPU topology enumeration into arch-agnostic file
  2024-02-20  9:24 ` [RFC 2/8] hw/core: Move CPU topology enumeration into arch-agnostic file Zhao Liu
@ 2024-02-28  9:53   ` JeeHeng Sia
  2024-02-29  4:46     ` Zhao Liu
  0 siblings, 1 reply; 26+ messages in thread
From: JeeHeng Sia @ 2024-02-28  9:53 UTC (permalink / raw)
  To: Zhao Liu, Daniel P . Berrangé, Eduardo Habkost,
	Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
	Michael S . Tsirkin, Paolo Bonzini, Richard Henderson, Eric Blake,
	Markus Armbruster, Marcelo Tosatti, Alex Bennée,
	Peter Maydell, Jonathan Cameron
  Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org, qemu-riscv@nongnu.org,
	qemu-arm@nongnu.org, Zhenyu Wang, Dapeng Mi, Yongwei Ma, Zhao Liu



> -----Original Message-----
> From: Zhao Liu <zhao1.liu@linux.intel.com>
> Sent: Tuesday, February 20, 2024 5:25 PM
> To: Daniel P . Berrangé <berrange@redhat.com>; Eduardo Habkost <eduardo@habkost.net>; Marcel Apfelbaum
> <marcel.apfelbaum@gmail.com>; Philippe Mathieu-Daudé <philmd@linaro.org>; Yanan Wang <wangyanan55@huawei.com>;
> Michael S . Tsirkin <mst@redhat.com>; Paolo Bonzini <pbonzini@redhat.com>; Richard Henderson <richard.henderson@linaro.org>;
> Eric Blake <eblake@redhat.com>; Markus Armbruster <armbru@redhat.com>; Marcelo Tosatti <mtosatti@redhat.com>; Alex Bennée
> <alex.bennee@linaro.org>; Peter Maydell <peter.maydell@linaro.org>; Jonathan Cameron <Jonathan.Cameron@huawei.com>;
> JeeHeng Sia <jeeheng.sia@starfivetech.com>
> Cc: qemu-devel@nongnu.org; kvm@vger.kernel.org; qemu-riscv@nongnu.org; qemu-arm@nongnu.org; Zhenyu Wang
> <zhenyu.z.wang@intel.com>; Dapeng Mi <dapeng1.mi@linux.intel.com>; Yongwei Ma <yongwei.ma@intel.com>; Zhao Liu
> <zhao1.liu@intel.com>
> Subject: [RFC 2/8] hw/core: Move CPU topology enumeration into arch-agnostic file
> 
> From: Zhao Liu <zhao1.liu@intel.com>
> 
> Cache topology needs to be defined based on CPU topology levels. Thus,
> move CPU topology enumeration into a common header.
> 
> To match the general topology naming style, rename CPU_TOPO_LEVEL_SMT
> and CPU_TOPO_LEVEL_PACKAGE to CPU_TOPO_LEVEL_THREAD and
> CPU_TOPO_LEVEL_SOCKET.
> 
> Also, enumerate additional topology levels for non-i386 arches, and add
> helpers for topology enumeration and string conversion.
> 
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
>  MAINTAINERS                    |  2 ++
>  hw/core/cpu-topology.c         | 56 ++++++++++++++++++++++++++++++++++
>  hw/core/meson.build            |  1 +
>  include/hw/core/cpu-topology.h | 40 ++++++++++++++++++++++++
>  include/hw/i386/topology.h     | 18 +----------
>  target/i386/cpu.c              | 24 +++++++--------
>  target/i386/cpu.h              |  2 +-
>  tests/unit/meson.build         |  3 +-
>  8 files changed, 115 insertions(+), 31 deletions(-)
>  create mode 100644 hw/core/cpu-topology.c
>  create mode 100644 include/hw/core/cpu-topology.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 7d61fb93194b..4b1cce938915 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1871,6 +1871,7 @@ R: Yanan Wang <wangyanan55@huawei.com>
>  S: Supported
>  F: hw/core/cpu-common.c
>  F: hw/core/cpu-sysemu.c
> +F: hw/core/cpu-topology.c
>  F: hw/core/machine-qmp-cmds.c
>  F: hw/core/machine.c
>  F: hw/core/machine-smp.c
> @@ -1882,6 +1883,7 @@ F: qapi/machine-common.json
>  F: qapi/machine-target.json
>  F: include/hw/boards.h
>  F: include/hw/core/cpu.h
> +F: include/hw/core/cpu-topology.h
>  F: include/hw/cpu/cluster.h
>  F: include/sysemu/numa.h
>  F: tests/unit/test-smp-parse.c
> diff --git a/hw/core/cpu-topology.c b/hw/core/cpu-topology.c
> new file mode 100644
> index 000000000000..ca1361d13c16
> --- /dev/null
> +++ b/hw/core/cpu-topology.c
> @@ -0,0 +1,56 @@
> +/*
> + * QEMU CPU Topology Representation
> + *
> + * Copyright (c) 2024 Intel Corporation
> + * Author: Zhao Liu <zhao1.liu@intel.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License,
> + * or (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "hw/core/cpu-topology.h"
> +
> +typedef struct CPUTopoInfo {
> +    const char *name;
> +} CPUTopoInfo;
> +
> +CPUTopoInfo cpu_topo_descriptors[] = {
> +    [CPU_TOPO_LEVEL_INVALID] = { .name = "invalid", },
> +    [CPU_TOPO_LEVEL_THREAD]  = { .name = "thread",  },
> +    [CPU_TOPO_LEVEL_CORE]    = { .name = "core",    },
> +    [CPU_TOPO_LEVEL_MODULE]  = { .name = "module",  },
> +    [CPU_TOPO_LEVEL_CLUSTER] = { .name = "cluster", },
> +    [CPU_TOPO_LEVEL_DIE]     = { .name = "die",     },
> +    [CPU_TOPO_LEVEL_SOCKET]  = { .name = "socket",  },
> +    [CPU_TOPO_LEVEL_BOOK]    = { .name = "book",    },
> +    [CPU_TOPO_LEVEL_DRAWER]  = { .name = "drawer",  },
> +    [CPU_TOPO_LEVEL_MAX]     = { .name = NULL,      },
> +};
> +
> +const char *cpu_topo_to_string(CPUTopoLevel topo)
> +{
> +    return cpu_topo_descriptors[topo].name;
> +}
> +
> +CPUTopoLevel string_to_cpu_topo(char *str)
Can use const char *str.
> +{
> +    for (int i = 0; i < ARRAY_SIZE(cpu_topo_descriptors); i++) {
> +        CPUTopoInfo *info = &cpu_topo_descriptors[i];
> +
> +        if (!strcmp(info->name, str)) {
Suggest to use strncmp instead.
> +            return (CPUTopoLevel)i;
> +        }
> +    }
> +    return CPU_TOPO_LEVEL_MAX;
> +}
> diff --git a/hw/core/meson.build b/hw/core/meson.build
> index 67dad04de559..3b1d5ffab3e3 100644
> --- a/hw/core/meson.build
> +++ b/hw/core/meson.build
> @@ -23,6 +23,7 @@ else
>  endif
> 
>  common_ss.add(files('cpu-common.c'))
> +common_ss.add(files('cpu-topology.c'))
>  common_ss.add(files('machine-smp.c'))
>  system_ss.add(when: 'CONFIG_FITLOADER', if_true: files('loader-fit.c'))
>  system_ss.add(when: 'CONFIG_GENERIC_LOADER', if_true: files('generic-loader.c'))
> diff --git a/include/hw/core/cpu-topology.h b/include/hw/core/cpu-topology.h
> new file mode 100644
> index 000000000000..cc6ca186ce3f
> --- /dev/null
> +++ b/include/hw/core/cpu-topology.h
> @@ -0,0 +1,40 @@
> +/*
> + * QEMU CPU Topology Representation Header
> + *
> + * Copyright (c) 2024 Intel Corporation
> + * Author: Zhao Liu <zhao1.liu@intel.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License,
> + * or (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef CPU_TOPOLOGY_H
> +#define CPU_TOPOLOGY_H
> +
> +typedef enum CPUTopoLevel {
> +    CPU_TOPO_LEVEL_INVALID,
> +    CPU_TOPO_LEVEL_THREAD,
> +    CPU_TOPO_LEVEL_CORE,
> +    CPU_TOPO_LEVEL_MODULE,
> +    CPU_TOPO_LEVEL_CLUSTER,
> +    CPU_TOPO_LEVEL_DIE,
> +    CPU_TOPO_LEVEL_SOCKET,
> +    CPU_TOPO_LEVEL_BOOK,
> +    CPU_TOPO_LEVEL_DRAWER,
> +    CPU_TOPO_LEVEL_MAX,
> +} CPUTopoLevel;
> +
> +const char *cpu_topo_to_string(CPUTopoLevel topo);
> +CPUTopoLevel string_to_cpu_topo(char *str);
> +
> +#endif /* CPU_TOPOLOGY_H */
> diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> index dff49fce1154..c6ff75f23991 100644
> --- a/include/hw/i386/topology.h
> +++ b/include/hw/i386/topology.h
> @@ -39,7 +39,7 @@
>   *  CPUID Fn8000_0008_ECX[ApicIdCoreIdSize[3:0]] is set to apicid_core_width().
>   */
> 
> -
> +#include "hw/core/cpu-topology.h"
>  #include "qemu/bitops.h"
> 
>  /*
> @@ -62,22 +62,6 @@ typedef struct X86CPUTopoInfo {
>      unsigned threads_per_core;
>  } X86CPUTopoInfo;
> 
> -/*
> - * CPUTopoLevel is the general i386 topology hierarchical representation,
> - * ordered by increasing hierarchical relationship.
> - * Its enumeration value is not bound to the type value of Intel (CPUID[0x1F])
> - * or AMD (CPUID[0x80000026]).
> - */
> -enum CPUTopoLevel {
> -    CPU_TOPO_LEVEL_INVALID,
> -    CPU_TOPO_LEVEL_SMT,
> -    CPU_TOPO_LEVEL_CORE,
> -    CPU_TOPO_LEVEL_MODULE,
> -    CPU_TOPO_LEVEL_DIE,
> -    CPU_TOPO_LEVEL_PACKAGE,
> -    CPU_TOPO_LEVEL_MAX,
> -};
> -
>  /* Return the bit width needed for 'count' IDs */
>  static unsigned apicid_bitwidth_for_count(unsigned count)
>  {
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index ac0a10abd45f..725d7e70182d 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -247,7 +247,7 @@ static uint32_t max_thread_ids_for_cache(X86CPUTopoInfo *topo_info,
>      case CPU_TOPO_LEVEL_DIE:
>          num_ids = 1 << apicid_die_offset(topo_info);
>          break;
> -    case CPU_TOPO_LEVEL_PACKAGE:
> +    case CPU_TOPO_LEVEL_SOCKET:
>          num_ids = 1 << apicid_pkg_offset(topo_info);
>          break;
>      default:
> @@ -304,7 +304,7 @@ static uint32_t num_threads_by_topo_level(X86CPUTopoInfo *topo_info,
>                                            enum CPUTopoLevel topo_level)
>  {
>      switch (topo_level) {
> -    case CPU_TOPO_LEVEL_SMT:
> +    case CPU_TOPO_LEVEL_THREAD:
>          return 1;
Just wondering why 'return 1' is used directly for the thread, but not
for the rest?
>      case CPU_TOPO_LEVEL_CORE:
>          return topo_info->threads_per_core;
> @@ -313,7 +313,7 @@ static uint32_t num_threads_by_topo_level(X86CPUTopoInfo *topo_info,
>      case CPU_TOPO_LEVEL_DIE:
>          return topo_info->threads_per_core * topo_info->cores_per_module *
>                 topo_info->modules_per_die;
> -    case CPU_TOPO_LEVEL_PACKAGE:
> +    case CPU_TOPO_LEVEL_SOCKET:
>          return topo_info->threads_per_core * topo_info->cores_per_module *
>                 topo_info->modules_per_die * topo_info->dies_per_pkg;
>      default:
> @@ -326,7 +326,7 @@ static uint32_t apicid_offset_by_topo_level(X86CPUTopoInfo *topo_info,
>                                              enum CPUTopoLevel topo_level)
>  {
>      switch (topo_level) {
> -    case CPU_TOPO_LEVEL_SMT:
> +    case CPU_TOPO_LEVEL_THREAD:
>          return 0;
>      case CPU_TOPO_LEVEL_CORE:
>          return apicid_core_offset(topo_info);
> @@ -334,7 +334,7 @@ static uint32_t apicid_offset_by_topo_level(X86CPUTopoInfo *topo_info,
>          return apicid_module_offset(topo_info);
>      case CPU_TOPO_LEVEL_DIE:
>          return apicid_die_offset(topo_info);
> -    case CPU_TOPO_LEVEL_PACKAGE:
> +    case CPU_TOPO_LEVEL_SOCKET:
>          return apicid_pkg_offset(topo_info);
>      default:
>          g_assert_not_reached();
> @@ -347,7 +347,7 @@ static uint32_t cpuid1f_topo_type(enum CPUTopoLevel topo_level)
>      switch (topo_level) {
>      case CPU_TOPO_LEVEL_INVALID:
>          return CPUID_1F_ECX_TOPO_LEVEL_INVALID;
> -    case CPU_TOPO_LEVEL_SMT:
> +    case CPU_TOPO_LEVEL_THREAD:
>          return CPUID_1F_ECX_TOPO_LEVEL_SMT;
>      case CPU_TOPO_LEVEL_CORE:
>          return CPUID_1F_ECX_TOPO_LEVEL_CORE;
> @@ -380,7 +380,7 @@ static void encode_topo_cpuid1f(CPUX86State *env, uint32_t count,
>      level = CPU_TOPO_LEVEL_INVALID;
>      for (int i = 0; i <= count; i++) {
>          level = find_next_bit(env->avail_cpu_topo,
> -                              CPU_TOPO_LEVEL_PACKAGE,
> +                              CPU_TOPO_LEVEL_SOCKET,
>                                level + 1);
> 
>          /*
> @@ -388,7 +388,7 @@ static void encode_topo_cpuid1f(CPUX86State *env, uint32_t count,
>           * and it just encode the invalid level (all fields are 0)
>           * into the last subleaf of 0x1f.
>           */
> -        if (level == CPU_TOPO_LEVEL_PACKAGE) {
> +        if (level == CPU_TOPO_LEVEL_SOCKET) {
>              level = CPU_TOPO_LEVEL_INVALID;
>              break;
>          }
> @@ -401,7 +401,7 @@ static void encode_topo_cpuid1f(CPUX86State *env, uint32_t count,
>          unsigned long next_level;
> 
>          next_level = find_next_bit(env->avail_cpu_topo,
> -                                   CPU_TOPO_LEVEL_PACKAGE,
> +                                   CPU_TOPO_LEVEL_SOCKET,
>                                     level + 1);
>          num_threads_next_level = num_threads_by_topo_level(topo_info,
>                                                             next_level);
> @@ -6290,7 +6290,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> 
>                      /* Share the cache at package level. */
>                      *eax |= max_thread_ids_for_cache(&topo_info,
> -                                CPU_TOPO_LEVEL_PACKAGE) << 14;
> +                                CPU_TOPO_LEVEL_SOCKET) << 14;
>                  }
>              }
>          } else if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
> @@ -7756,9 +7756,9 @@ static void x86_cpu_init_default_topo(X86CPU *cpu)
>      env->nr_dies = 1;
> 
>      /* SMT, core and package levels are set by default. */
> -    set_bit(CPU_TOPO_LEVEL_SMT, env->avail_cpu_topo);
> +    set_bit(CPU_TOPO_LEVEL_THREAD, env->avail_cpu_topo);
>      set_bit(CPU_TOPO_LEVEL_CORE, env->avail_cpu_topo);
> -    set_bit(CPU_TOPO_LEVEL_PACKAGE, env->avail_cpu_topo);
> +    set_bit(CPU_TOPO_LEVEL_SOCKET, env->avail_cpu_topo);
>  }
> 
>  static void x86_cpu_initfn(Object *obj)
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index 4b4cc70c8859..fcbf278b49e6 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -1596,7 +1596,7 @@ typedef struct CPUCacheInfo {
>       * Used to encode CPUID[4].EAX[bits 25:14] or
>       * CPUID[0x8000001D].EAX[bits 25:14].
>       */
> -    enum CPUTopoLevel share_level;
> +    CPUTopoLevel share_level;
>  } CPUCacheInfo;
> 
> 
> diff --git a/tests/unit/meson.build b/tests/unit/meson.build
> index cae925c13259..4fe0aaff3a5e 100644
> --- a/tests/unit/meson.build
> +++ b/tests/unit/meson.build
> @@ -138,7 +138,8 @@ if have_system
>      'test-util-sockets': ['socket-helpers.c'],
>      'test-base64': [],
>      'test-bufferiszero': [],
> -    'test-smp-parse': [qom, meson.project_source_root() / 'hw/core/machine-smp.c'],
> +    'test-smp-parse': [qom, meson.project_source_root() / 'hw/core/machine-smp.c',
> +                       meson.project_source_root() / 'hw/core/cpu-topology.c'],
>      'test-vmstate': [migration, io],
>      'test-yank': ['socket-helpers.c', qom, io, chardev]
>    }
> --
> 2.34.1



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC 2/8] hw/core: Move CPU topology enumeration into arch-agnostic file
  2024-02-28  9:53   ` JeeHeng Sia
@ 2024-02-29  4:46     ` Zhao Liu
  0 siblings, 0 replies; 26+ messages in thread
From: Zhao Liu @ 2024-02-29  4:46 UTC (permalink / raw)
  To: JeeHeng Sia
  Cc: Daniel P . Berrang�, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daud�, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Markus Armbruster,
	Marcelo Tosatti, Alex Benn�e, Peter Maydell,
	Jonathan Cameron, qemu-devel@nongnu.org, kvm@vger.kernel.org,
	qemu-riscv@nongnu.org, qemu-arm@nongnu.org, Zhenyu Wang,
	Dapeng Mi, Yongwei Ma, Zhao Liu

Hi JeeHeng,

> > +const char *cpu_topo_to_string(CPUTopoLevel topo)
> > +{
> > +    return cpu_topo_descriptors[topo].name;
> > +}
> > +
> > +CPUTopoLevel string_to_cpu_topo(char *str)
>
> Can use const char *str.

Okay, I'll.

> > +{
> > +    for (int i = 0; i < ARRAY_SIZE(cpu_topo_descriptors); i++) {
> > +        CPUTopoInfo *info = &cpu_topo_descriptors[i];
> > +
> > +        if (!strcmp(info->name, str)) {
>
> Suggest to use strncmp instead.

Thanks! I tries "l1i-cache=coree", and it causes Segmentation fault.
Will fix.

> > +            return (CPUTopoLevel)i;
> > +        }
> > +    }
> > +    return CPU_TOPO_LEVEL_MAX;
> > +}

> > @@ -304,7 +304,7 @@ static uint32_t num_threads_by_topo_level(X86CPUTopoInfo *topo_info,
> >                                            enum CPUTopoLevel topo_level)
> >  {
> >      switch (topo_level) {
> > -    case CPU_TOPO_LEVEL_SMT:
> > +    case CPU_TOPO_LEVEL_THREAD:
> >          return 1;
> Just wondering why 'return 1' is used directly for the thread, but not
> for the rest?

This helper returens how many threads in one topology domain/container
at this level.

For thread level, it calculates how many threads are in one thread
domain, so it returns 1 directly.

Thanks,
Zhao



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC 4/8] hw/core: Add cache topology options in -smp
  2024-02-28  5:38   ` JeeHeng Sia
@ 2024-02-29  7:04     ` Zhao Liu
  0 siblings, 0 replies; 26+ messages in thread
From: Zhao Liu @ 2024-02-29  7:04 UTC (permalink / raw)
  To: JeeHeng Sia
  Cc: Daniel P . Berrang�, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daud�, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Markus Armbruster,
	Marcelo Tosatti, Alex Benn�e, Peter Maydell,
	Jonathan Cameron, qemu-devel@nongnu.org, kvm@vger.kernel.org,
	qemu-riscv@nongnu.org, qemu-arm@nongnu.org, Zhenyu Wang,
	Dapeng Mi, Yongwei Ma, Zhao Liu

Hi JeeHeng,

> > diff --git a/hw/core/machine.c b/hw/core/machine.c
> > index 426f71770a84..cb5173927b0d 100644
> > --- a/hw/core/machine.c
> > +++ b/hw/core/machine.c
> > @@ -886,6 +886,10 @@ static void machine_get_smp(Object *obj, Visitor *v, const char *name,
> >          .has_cores = true, .cores = ms->smp.cores,
> >          .has_threads = true, .threads = ms->smp.threads,
> >          .has_maxcpus = true, .maxcpus = ms->smp.max_cpus,
> > +        .l1d_cache = g_strdup(cpu_topo_to_string(ms->smp_cache.l1d)),
> > +        .l1i_cache = g_strdup(cpu_topo_to_string(ms->smp_cache.l1i)),
> > +        .l2_cache = g_strdup(cpu_topo_to_string(ms->smp_cache.l2)),
> > +        .l3_cache = g_strdup(cpu_topo_to_string(ms->smp_cache.l3)),
>
> Let's standardize the code by adding the 'has_' prefix.

SMPConfiguration is automatically generated in the compilation, and its
prototype is defined in qapi/machine.json like the following code:


> > diff --git a/qapi/machine.json b/qapi/machine.json
> > index d0e7f1f615f3..0a923ac38803 100644
> > --- a/qapi/machine.json
> > +++ b/qapi/machine.json
> > @@ -1650,6 +1650,14 @@
> >  #
> >  # @threads: number of threads per core
> >  #
> > +# @l1d-cache: topology hierarchy of L1 data cache (since 9.0)
> > +#
> > +# @l1i-cache: topology hierarchy of L1 instruction cache (since 9.0)
> > +#
> > +# @l2-cache: topology hierarchy of L2 unified cache (since 9.0)
> > +#
> > +# @l3-cache: topology hierarchy of L3 unified cache (since 9.0)
> > +#
> >  # Since: 6.1
> >  ##
> >  { 'struct': 'SMPConfiguration', 'data': {
> > @@ -1662,7 +1670,11 @@
> >       '*modules': 'int',
> >       '*cores': 'int',
> >       '*threads': 'int',
> > -     '*maxcpus': 'int' } }
> > +     '*maxcpus': 'int',
> > +     '*l1d-cache': 'str',
> > +     '*l1i-cache': 'str',
> > +     '*l2-cache': 'str',
> > +     '*l3-cache': 'str' } }
> >

The gnerated complete structure is (will in build/qapi/qapi-types-machine.h):

struct SMPConfiguration {
    bool has_cpus;
    int64_t cpus;
    bool has_drawers;
    int64_t drawers;
    bool has_books;
    int64_t books;
    bool has_sockets;
    int64_t sockets;
    bool has_dies;
    int64_t dies;
    bool has_clusters;
    int64_t clusters;
    bool has_modules;
    int64_t modules;
    bool has_cores;
    int64_t cores;
    bool has_threads;
    int64_t threads;
    bool has_maxcpus;
    int64_t maxcpus;
    char *l1d_cache;
    char *l1i_cache;
    char *l2_cache;
    char *l3_cache;
};

The int member defined in qapi/machine.json will get their corresponding
status fields as has_* to indicate if user sets those int fields.

For str type, the status field is not needed since NULL is enough to
indicate no user sets that.

Thanks,
Zhao



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC 6/8] i386/cpu: Update cache topology with machine's configuration
  2024-02-28  9:45   ` JeeHeng Sia
@ 2024-02-29  7:19     ` Zhao Liu
  0 siblings, 0 replies; 26+ messages in thread
From: Zhao Liu @ 2024-02-29  7:19 UTC (permalink / raw)
  To: JeeHeng Sia
  Cc: Daniel P . Berrang�, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daud�, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Markus Armbruster,
	Marcelo Tosatti, Alex Benn�e, Peter Maydell,
	Jonathan Cameron, qemu-devel@nongnu.org, kvm@vger.kernel.org,
	qemu-riscv@nongnu.org, qemu-arm@nongnu.org, Zhenyu Wang,
	Dapeng Mi, Yongwei Ma, Zhao Liu

Hi JeeHeng,

> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index d7cb0f1e49b4..4b5c551fe7f0 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -7582,6 +7582,27 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
> > 
> >  #ifndef CONFIG_USER_ONLY
> >      MachineState *ms = MACHINE(qdev_get_machine());
> > +
> > +    if (ms->smp_cache.l1d != CPU_TOPO_LEVEL_INVALID) {
> > +        env->cache_info_cpuid4.l1d_cache->share_level = ms->smp_cache.l1d;
> > +        env->cache_info_amd.l1d_cache->share_level = ms->smp_cache.l1d;
> > +    }
> > +
> > +    if (ms->smp_cache.l1i != CPU_TOPO_LEVEL_INVALID) {
> > +        env->cache_info_cpuid4.l1i_cache->share_level = ms->smp_cache.l1i;
> > +        env->cache_info_amd.l1i_cache->share_level = ms->smp_cache.l1i;
> > +    }
> > +
> > +    if (ms->smp_cache.l2 != CPU_TOPO_LEVEL_INVALID) {
> > +        env->cache_info_cpuid4.l2_cache->share_level = ms->smp_cache.l2;
> > +        env->cache_info_amd.l2_cache->share_level = ms->smp_cache.l2;
> > +    }
> > +
> > +    if (ms->smp_cache.l3 != CPU_TOPO_LEVEL_INVALID) {
> > +        env->cache_info_cpuid4.l3_cache->share_level = ms->smp_cache.l3;
> > +        env->cache_info_amd.l3_cache->share_level = ms->smp_cache.l3;
> > +    }
> > +
>
> I think this block of code can be further optimized. Maybe we can create
> a function called updateCacheShareLevel() that takes a cache pointer and
> a share level as arguments. This function encapsulates the common
> pattern of updating cache share levels for different caches. You can define
> it like this:
> void updateCacheShareLevel(XxxCacheInfo *cache, int shareLevel) {
>     if (shareLevel != CPU_TOPO_LEVEL_INVALID) {
>         cache->share_level = shareLevel;
>     }
> }
>

Good idea! Will try this way.

Thanks,
Zhao



^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2024-02-29  7:06 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-02-20  9:24 [RFC 0/8] Introduce SMP Cache Topology Zhao Liu
2024-02-20  9:24 ` [RFC 1/8] hw/core: Rename CpuTopology to CPUTopology Zhao Liu
2024-02-20  9:24 ` [RFC 2/8] hw/core: Move CPU topology enumeration into arch-agnostic file Zhao Liu
2024-02-28  9:53   ` JeeHeng Sia
2024-02-29  4:46     ` Zhao Liu
2024-02-20  9:24 ` [RFC 3/8] hw/core: Define cache topology for machine Zhao Liu
2024-02-20  9:25 ` [RFC 4/8] hw/core: Add cache topology options in -smp Zhao Liu
2024-02-21 12:46   ` Markus Armbruster
2024-02-21 15:17     ` Zhao Liu
2024-02-26 15:39   ` Jonathan Cameron via
2024-02-27  9:20     ` Zhao Liu
2024-02-27  9:12       ` Daniel P. Berrangé
2024-02-27 10:35         ` Zhao Liu
2024-02-27 10:51       ` Jonathan Cameron via
2024-02-27 15:55         ` Zhao Liu
2024-02-28  5:38   ` JeeHeng Sia
2024-02-29  7:04     ` Zhao Liu
2024-02-20  9:25 ` [RFC 5/8] i386/cpu: Support thread and module level cache topology Zhao Liu
2024-02-20  9:25 ` [RFC 6/8] i386/cpu: Update cache topology with machine's configuration Zhao Liu
2024-02-28  9:45   ` JeeHeng Sia
2024-02-29  7:19     ` Zhao Liu
2024-02-20  9:25 ` [RFC 7/8] i386/pc: Support cache topology in -smp for PC machine Zhao Liu
2024-02-20  9:25 ` [RFC 8/8] qemu-options: Add the cache topology description of -smp Zhao Liu
2024-02-26 15:47   ` Jonathan Cameron via
2024-02-27 16:17     ` Zhao Liu
2024-02-20 20:07 ` [RFC 0/8] Introduce SMP Cache Topology Philippe Mathieu-Daudé

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).