[PATCH v3 0/7] Introduce SMP Cache Topology

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v3 0/7] Introduce SMP Cache Topology
@ 2024-10-12 10:44 Zhao Liu
  2024-10-12 10:44 ` [PATCH v3 1/7] hw/core: Make CPU topology enumeration arch-agnostic Zhao Liu
                   ` (8 more replies)
  0 siblings, 9 replies; 23+ messages in thread
From: Zhao Liu @ 2024-10-12 10:44 UTC (permalink / raw)
  To: Daniel P . Berrangé, Igor Mammedov, Eduardo Habkost,
	Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
	Michael S . Tsirkin, Paolo Bonzini, Richard Henderson, Eric Blake,
	Markus Armbruster, Marcelo Tosatti, Alex Bennée,
	Peter Maydell, Jonathan Cameron, Sia Jee Heng, Alireza Sanaee
  Cc: qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Zhao Liu

Hi all,

Compared with v2 [1], the changes in v3 are quite minor, and most of
patches (except for patch 1 and 7) have received Jonathan’s R/b (thanks
Jonathan!).

Meanwhile, ARM side has also worked a lot on the smp-cache based on
this series [2], so I think we are very close to the final merge, to
catch up with this cycle. :)

This series is based on the commit 7e3b6d8063f2 ("Merge tag 'crypto-
fixes-pull-request' of https://gitlab.com/berrange/qemu into staging").

Background
==========

The x86 and ARM (RISCV) need to allow user to configure cache properties
(current only topology):
 * For x86, the default cache topology model (of max/host CPU) does not
   always match the Host's real physical cache topology. Performance can
   increase when the configured virtual topology is closer to the
   physical topology than a default topology would be.
 * For ARM, QEMU can't get the cache topology information from the CPU
   registers, then user configuration is necessary. Additionally, the
   cache information is also needed for MPAM emulation (for TCG) to
   build the right PPTT. (Originally from Jonathan)

About smp-cache
===============

The API design has been discussed heavily in [3].

Now, smp-cache is implemented as a array integrated in -machine. Though
-machine currently can't support JSON format, this is the one of the
directions of future.

An example is as follows:

smp_cache=smp-cache.0.cache=l1i,smp-cache.0.topology=core,smp-cache.1.cache=l1d,smp-cache.1.topology=core,smp-cache.2.cache=l2,smp-cache.2.topology=module,smp-cache.3.cache=l3,smp-cache.3.topology=die

"cache" specifies the cache that the properties will be applied on. This
field is the combination of cache level and cache type. Now it supports
"l1d" (L1 data cache), "l1i" (L1 instruction cache), "l2" (L2 unified
cache) and "l3" (L3 unified cache).

"topology" field accepts CPU topology levels including "thread", "core",
"module", "cluster", "die", "socket", "book", "drawer" and a special
value "default".

The "default" is introduced to make it easier for libvirt to set a
default parameter value without having to care about the specific
machine (because currently there is no proper way for machine to
expose supported topology levels and caches).

If "default" is set, then the cache topology will follow the
architecture's default cache topology model. If other CPU topology level
is set, the cache will be shared at corresponding CPU topology level.

Welcome your feedback and review!

[1]: Patch v2: https://lore.kernel.org/qemu-devel/20240908125920.1160236-1-zhao1.liu@intel.com/
[2]: ARM smp-cache: https://lore.kernel.org/qemu-devel/20241010111822.345-1-alireza.sanaee@huawei.com/
[3]: API disscussion: https://lore.kernel.org/qemu-devel/8734ndj33j.fsf@pond.sub.org/

Thanks and Best Regards,
Zhao
---
Changelog:

Main changes since Patch v2:
 * Updated version of new QAPI structures to v9.2. (Jonathan)
 * Merged the QAPI change and smp-cache property support of machine
   into one commit. (Jonathan)
 * Picked Alireza's patch to add a has_caches flag.
 * Polished english and coding style. (Jonathan)

Main changes since Patch v1:
 * Dropped handwriten smp-cache object and integrated cache properties
   list into MachineState and used -machine to configure SMP cache
   properties. (Markus)
 * Dropped prefix of CpuTopologyLevel enumeration. (Markus)
 * Rename CPU_TOPO_LEVEL_* to CPU_TOPOLOGY_LEVEL_* to match the QAPI's
   generated code. (Markus)
 * Renamed SMPCacheProperty/SMPCacheProperties (QAPI structures) to
   SmpCacheProperties/SmpCachePropertiesWrapper. (Markus)
 * Renamed SMPCacheName (QAPI structure) to SmpCacheLevelAndType and
   dropped prefix. (Markus)
 * Renamed 'name' field in SmpCacheProperties to 'cache', since the
   type and level of the cache in SMP system could be able to specify
   all of these kinds of cache explicitly enough.
 * Renamed 'topo' field in SmpCacheProperties to 'topology'. (Markus)
 * Returned error information when user repeats setting cache
   properties. (Markus)
 * Renamed SmpCacheLevelAndType to CacheLevelAndType, since this
   representation is general across SMP or hybrid system.
 * Dropped machine_check_smp_cache_support() and did the check when
   -machine parses smp-cache in machine_parse_smp_cache().

Main changes since RFC v2:
 * Dropped cpu-topology.h and cpu-topology.c since QAPI has the helper
   (CpuTopologyLevel_str) to convert enum to string. (Markus)
 * Fixed text format in machine.json (CpuTopologyLevel naming, 2 spaces
   between sentences). (Markus)
 * Added a new level "default" to de-compatibilize some arch-specific
   topo settings. (Daniel)
 * Moved CpuTopologyLevel to qapi/machine-common.json, at where the
   cache enumeration and smp-cache object would be added.
   - If smp-cache object is defined in qapi/machine.json, storage-daemon
     will complain about the qmp cmds in qapi/machine.json during
     compiling.
 * Referred to Daniel's suggestion to introduce cache JSON list, though
   as a standalone object since -smp/-machine can't support JSON.
 * Linked machine's smp_cache to smp-cache object instead of a builtin
   structure. This is to get around the fact that the keyval format of
   -machine can't support JSON.
 * Wrapped the cache topology level access into a helper.
 * Split as a separate commit to just include compatibility checking and
   topology checking.
 * Allow setting "default" topology level even though the cache
   isn't supported by machine. (Daniel)
 * Rewrote the document of smp-cache object.

Main changes since RFC v1:
 * Split CpuTopology renaimg out of this RFC.
 * Use QAPI to enumerate CPU topology levels.
 * Drop string_to_cpu_topo() since QAPI will help to parse the topo
   levels.
 * Set has_*_cache field in machine_get_smp(). (JeeHeng)
 * Use "*_cache=topo_level" as -smp example as the original "level"
   term for a cache has a totally different meaning. (Jonathan)
---
Alireza Sanaee (1):
  i386/cpu: add has_caches flag to check smp_cache configuration

Zhao Liu (6):
  hw/core: Make CPU topology enumeration arch-agnostic
  qapi/qom: Define cache enumeration and properties for machine
  hw/core: Check smp cache topology support for machine
  i386/cpu: Support thread and module level cache topology
  i386/cpu: Update cache topology with machine's configuration
  i386/pc: Support cache topology in -machine for PC machine

 hw/core/machine-smp.c      | 117 +++++++++++++++++++++++
 hw/core/machine.c          |  44 +++++++++
 hw/i386/pc.c               |   4 +
 hw/i386/x86-common.c       |   4 +-
 include/hw/boards.h        |  16 ++++
 include/hw/i386/topology.h |  22 +----
 qapi/machine-common.json   |  96 ++++++++++++++++++-
 qemu-options.hx            |  26 ++++-
 target/i386/cpu.c          | 190 ++++++++++++++++++++++---------------
 target/i386/cpu.h          |   4 +-
 10 files changed, 423 insertions(+), 100 deletions(-)

-- 
2.34.1

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v3 1/7] hw/core: Make CPU topology enumeration arch-agnostic
  2024-10-12 10:44 [PATCH v3 0/7] Introduce SMP Cache Topology Zhao Liu
@ 2024-10-12 10:44 ` Zhao Liu
       [not found]   ` <20241017095227.00006d85@Huawei.com>
                     ` (2 more replies)
  2024-10-12 10:44 ` [PATCH v3 2/7] qapi/qom: Define cache enumeration and properties for machine Zhao Liu
                   ` (7 subsequent siblings)
  8 siblings, 3 replies; 23+ messages in thread
From: Zhao Liu @ 2024-10-12 10:44 UTC (permalink / raw)
  To: Daniel P . Berrangé, Igor Mammedov, Eduardo Habkost,
	Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
	Michael S . Tsirkin, Paolo Bonzini, Richard Henderson, Eric Blake,
	Markus Armbruster, Marcelo Tosatti, Alex Bennée,
	Peter Maydell, Jonathan Cameron, Sia Jee Heng, Alireza Sanaee
  Cc: qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Zhao Liu, Yongwei Ma

Cache topology needs to be defined based on CPU topology levels. Thus,
define CPU topology enumeration in qapi/machine.json to make it generic
for all architectures.

To match the general topology naming style, rename CPU_TOPO_LEVEL_* to
CPU_TOPOLOGY_LEVEL_*, and rename SMT and package levels to thread and
socket.

Also, enumerate additional topology levels for non-i386 arches, and add
a CPU_TOPOLOGY_LEVEL_DEFAULT to help future smp-cache object to work
with compatibility requirement of arch-specific cache topology models.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
---
Changes since Patch v2:
 * Updated version of new QAPI structures to v9.2. (Jonathan)

Changes since Patch v1:
 * Dropped prefix of CpuTopologyLevel enumeration. (Markus)
 * Rename CPU_TOPO_LEVEL_* to CPU_TOPOLOGY_LEVEL_* to match the QAPI's
   generated code. (Markus)

Changes since RFC v2:
 * Dropped cpu-topology.h and cpu-topology.c since QAPI has the helper
   (CpuTopologyLevel_str) to convert enum to string. (Markus)
 * Fixed text format in machine.json (CpuTopologyLevel naming, 2 spaces
   between sentences). (Markus)
 * Added a new level "default" to de-compatibilize some arch-specific
   topo settings. (Daniel)
 * Moved CpuTopologyLevel to qapi/machine-common.json, at where the
   cache enumeration and smp-cache object would be added.
   - If smp-cache object is defined in qapi/machine.json, storage-daemon
     will complain about the qmp cmds in qapi/machine.json during
     compiling.

Changes since RFC v1:
 * Used QAPI to enumerate CPU topology levels.
 * Dropped string_to_cpu_topo() since QAPI will help to parse the topo
   levels.
---
 hw/i386/x86-common.c       |   4 +-
 include/hw/i386/topology.h |  22 +-----
 qapi/machine-common.json   |  46 +++++++++++-
 target/i386/cpu.c          | 144 ++++++++++++++++++-------------------
 target/i386/cpu.h          |   4 +-
 5 files changed, 124 insertions(+), 96 deletions(-)

diff --git a/hw/i386/x86-common.c b/hw/i386/x86-common.c
index 992ea1f25e94..b21d2ab97349 100644
--- a/hw/i386/x86-common.c
+++ b/hw/i386/x86-common.c
@@ -273,12 +273,12 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
 
     if (ms->smp.modules > 1) {
         env->nr_modules = ms->smp.modules;
-        set_bit(CPU_TOPO_LEVEL_MODULE, env->avail_cpu_topo);
+        set_bit(CPU_TOPOLOGY_LEVEL_MODULE, env->avail_cpu_topo);
     }
 
     if (ms->smp.dies > 1) {
         env->nr_dies = ms->smp.dies;
-        set_bit(CPU_TOPO_LEVEL_DIE, env->avail_cpu_topo);
+        set_bit(CPU_TOPOLOGY_LEVEL_DIE, env->avail_cpu_topo);
     }
 
     /*
diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index dff49fce1154..bf740383038b 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -39,7 +39,7 @@
  *  CPUID Fn8000_0008_ECX[ApicIdCoreIdSize[3:0]] is set to apicid_core_width().
  */
 
-
+#include "qapi/qapi-types-machine-common.h"
 #include "qemu/bitops.h"
 
 /*
@@ -62,22 +62,6 @@ typedef struct X86CPUTopoInfo {
     unsigned threads_per_core;
 } X86CPUTopoInfo;
 
-/*
- * CPUTopoLevel is the general i386 topology hierarchical representation,
- * ordered by increasing hierarchical relationship.
- * Its enumeration value is not bound to the type value of Intel (CPUID[0x1F])
- * or AMD (CPUID[0x80000026]).
- */
-enum CPUTopoLevel {
-    CPU_TOPO_LEVEL_INVALID,
-    CPU_TOPO_LEVEL_SMT,
-    CPU_TOPO_LEVEL_CORE,
-    CPU_TOPO_LEVEL_MODULE,
-    CPU_TOPO_LEVEL_DIE,
-    CPU_TOPO_LEVEL_PACKAGE,
-    CPU_TOPO_LEVEL_MAX,
-};
-
 /* Return the bit width needed for 'count' IDs */
 static unsigned apicid_bitwidth_for_count(unsigned count)
 {
@@ -212,8 +196,8 @@ static inline apic_id_t x86_apicid_from_cpu_idx(X86CPUTopoInfo *topo_info,
  */
 static inline bool x86_has_extended_topo(unsigned long *topo_bitmap)
 {
-    return test_bit(CPU_TOPO_LEVEL_MODULE, topo_bitmap) ||
-           test_bit(CPU_TOPO_LEVEL_DIE, topo_bitmap);
+    return test_bit(CPU_TOPOLOGY_LEVEL_MODULE, topo_bitmap) ||
+           test_bit(CPU_TOPOLOGY_LEVEL_DIE, topo_bitmap);
 }
 
 #endif /* HW_I386_TOPOLOGY_H */
diff --git a/qapi/machine-common.json b/qapi/machine-common.json
index b64e4895cfd7..db3e499fb382 100644
--- a/qapi/machine-common.json
+++ b/qapi/machine-common.json
@@ -5,7 +5,7 @@
 # See the COPYING file in the top-level directory.
 
 ##
-# = Machines S390 data types
+# = Common machine types
 ##
 
 ##
@@ -18,3 +18,47 @@
 ##
 { 'enum': 'S390CpuEntitlement',
   'data': [ 'auto', 'low', 'medium', 'high' ] }
+
+##
+# @CpuTopologyLevel:
+#
+# An enumeration of CPU topology levels.
+#
+# @invalid: Invalid topology level.
+#
+# @thread: thread level, which would also be called SMT level or
+#     logical processor level.  The @threads option in
+#     SMPConfiguration is used to configure the topology of this
+#     level.
+#
+# @core: core level.  The @cores option in SMPConfiguration is used
+#     to configure the topology of this level.
+#
+# @module: module level.  The @modules option in SMPConfiguration is
+#     used to configure the topology of this level.
+#
+# @cluster: cluster level.  The @clusters option in SMPConfiguration
+#     is used to configure the topology of this level.
+#
+# @die: die level.  The @dies option in SMPConfiguration is used to
+#     configure the topology of this level.
+#
+# @socket: socket level, which would also be called package level.
+#     The @sockets option in SMPConfiguration is used to configure
+#     the topology of this level.
+#
+# @book: book level.  The @books option in SMPConfiguration is used
+#     to configure the topology of this level.
+#
+# @drawer: drawer level.  The @drawers option in SMPConfiguration is
+#     used to configure the topology of this level.
+#
+# @default: default level.  Some architectures will have default
+#     topology settings (e.g., cache topology), and this special
+#     level means following the architecture-specific settings.
+#
+# Since: 9.2
+##
+{ 'enum': 'CpuTopologyLevel',
+  'data': [ 'invalid', 'thread', 'core', 'module', 'cluster',
+            'die', 'socket', 'book', 'drawer', 'default' ] }
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index ff227a8c5c87..c84e32c9c303 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -236,23 +236,23 @@ static uint8_t cpuid2_cache_descriptor(CPUCacheInfo *cache)
                        0 /* Invalid value */)
 
 static uint32_t max_thread_ids_for_cache(X86CPUTopoInfo *topo_info,
-                                         enum CPUTopoLevel share_level)
+                                         enum CpuTopologyLevel share_level)
 {
     uint32_t num_ids = 0;
 
     switch (share_level) {
-    case CPU_TOPO_LEVEL_CORE:
+    case CPU_TOPOLOGY_LEVEL_CORE:
         num_ids = 1 << apicid_core_offset(topo_info);
         break;
-    case CPU_TOPO_LEVEL_DIE:
+    case CPU_TOPOLOGY_LEVEL_DIE:
         num_ids = 1 << apicid_die_offset(topo_info);
         break;
-    case CPU_TOPO_LEVEL_PACKAGE:
+    case CPU_TOPOLOGY_LEVEL_SOCKET:
         num_ids = 1 << apicid_pkg_offset(topo_info);
         break;
     default:
         /*
-         * Currently there is no use case for SMT and MODULE, so use
+         * Currently there is no use case for THREAD and MODULE, so use
          * assert directly to facilitate debugging.
          */
         g_assert_not_reached();
@@ -301,19 +301,19 @@ static void encode_cache_cpuid4(CPUCacheInfo *cache,
 }
 
 static uint32_t num_threads_by_topo_level(X86CPUTopoInfo *topo_info,
-                                          enum CPUTopoLevel topo_level)
+                                          enum CpuTopologyLevel topo_level)
 {
     switch (topo_level) {
-    case CPU_TOPO_LEVEL_SMT:
+    case CPU_TOPOLOGY_LEVEL_THREAD:
         return 1;
-    case CPU_TOPO_LEVEL_CORE:
+    case CPU_TOPOLOGY_LEVEL_CORE:
         return topo_info->threads_per_core;
-    case CPU_TOPO_LEVEL_MODULE:
+    case CPU_TOPOLOGY_LEVEL_MODULE:
         return topo_info->threads_per_core * topo_info->cores_per_module;
-    case CPU_TOPO_LEVEL_DIE:
+    case CPU_TOPOLOGY_LEVEL_DIE:
         return topo_info->threads_per_core * topo_info->cores_per_module *
                topo_info->modules_per_die;
-    case CPU_TOPO_LEVEL_PACKAGE:
+    case CPU_TOPOLOGY_LEVEL_SOCKET:
         return topo_info->threads_per_core * topo_info->cores_per_module *
                topo_info->modules_per_die * topo_info->dies_per_pkg;
     default:
@@ -323,18 +323,18 @@ static uint32_t num_threads_by_topo_level(X86CPUTopoInfo *topo_info,
 }
 
 static uint32_t apicid_offset_by_topo_level(X86CPUTopoInfo *topo_info,
-                                            enum CPUTopoLevel topo_level)
+                                            enum CpuTopologyLevel topo_level)
 {
     switch (topo_level) {
-    case CPU_TOPO_LEVEL_SMT:
+    case CPU_TOPOLOGY_LEVEL_THREAD:
         return 0;
-    case CPU_TOPO_LEVEL_CORE:
+    case CPU_TOPOLOGY_LEVEL_CORE:
         return apicid_core_offset(topo_info);
-    case CPU_TOPO_LEVEL_MODULE:
+    case CPU_TOPOLOGY_LEVEL_MODULE:
         return apicid_module_offset(topo_info);
-    case CPU_TOPO_LEVEL_DIE:
+    case CPU_TOPOLOGY_LEVEL_DIE:
         return apicid_die_offset(topo_info);
-    case CPU_TOPO_LEVEL_PACKAGE:
+    case CPU_TOPOLOGY_LEVEL_SOCKET:
         return apicid_pkg_offset(topo_info);
     default:
         g_assert_not_reached();
@@ -342,18 +342,18 @@ static uint32_t apicid_offset_by_topo_level(X86CPUTopoInfo *topo_info,
     return 0;
 }
 
-static uint32_t cpuid1f_topo_type(enum CPUTopoLevel topo_level)
+static uint32_t cpuid1f_topo_type(enum CpuTopologyLevel topo_level)
 {
     switch (topo_level) {
-    case CPU_TOPO_LEVEL_INVALID:
+    case CPU_TOPOLOGY_LEVEL_INVALID:
         return CPUID_1F_ECX_TOPO_LEVEL_INVALID;
-    case CPU_TOPO_LEVEL_SMT:
+    case CPU_TOPOLOGY_LEVEL_THREAD:
         return CPUID_1F_ECX_TOPO_LEVEL_SMT;
-    case CPU_TOPO_LEVEL_CORE:
+    case CPU_TOPOLOGY_LEVEL_CORE:
         return CPUID_1F_ECX_TOPO_LEVEL_CORE;
-    case CPU_TOPO_LEVEL_MODULE:
+    case CPU_TOPOLOGY_LEVEL_MODULE:
         return CPUID_1F_ECX_TOPO_LEVEL_MODULE;
-    case CPU_TOPO_LEVEL_DIE:
+    case CPU_TOPOLOGY_LEVEL_DIE:
         return CPUID_1F_ECX_TOPO_LEVEL_DIE;
     default:
         /* Other types are not supported in QEMU. */
@@ -371,16 +371,16 @@ static void encode_topo_cpuid1f(CPUX86State *env, uint32_t count,
     unsigned long level, next_level;
     uint32_t num_threads_next_level, offset_next_level;
 
-    assert(count + 1 < CPU_TOPO_LEVEL_MAX);
+    assert(count + 1 < CPU_TOPOLOGY_LEVEL__MAX);
 
     /*
      * Find the No.(count + 1) topology level in avail_cpu_topo bitmap.
-     * The search starts from bit 1 (CPU_TOPO_LEVEL_INVALID + 1).
+     * The search starts from bit 1 (CPU_TOPOLOGY_LEVEL_INVALID + 1).
      */
-    level = CPU_TOPO_LEVEL_INVALID;
+    level = CPU_TOPOLOGY_LEVEL_INVALID;
     for (int i = 0; i <= count; i++) {
         level = find_next_bit(env->avail_cpu_topo,
-                              CPU_TOPO_LEVEL_PACKAGE,
+                              CPU_TOPOLOGY_LEVEL_SOCKET,
                               level + 1);
 
         /*
@@ -388,18 +388,18 @@ static void encode_topo_cpuid1f(CPUX86State *env, uint32_t count,
          * and it just encodes the invalid level (all fields are 0)
          * into the last subleaf of 0x1f.
          */
-        if (level == CPU_TOPO_LEVEL_PACKAGE) {
-            level = CPU_TOPO_LEVEL_INVALID;
+        if (level == CPU_TOPOLOGY_LEVEL_SOCKET) {
+            level = CPU_TOPOLOGY_LEVEL_INVALID;
             break;
         }
     }
 
-    if (level == CPU_TOPO_LEVEL_INVALID) {
+    if (level == CPU_TOPOLOGY_LEVEL_INVALID) {
         num_threads_next_level = 0;
         offset_next_level = 0;
     } else {
         next_level = find_next_bit(env->avail_cpu_topo,
-                                   CPU_TOPO_LEVEL_PACKAGE,
+                                   CPU_TOPOLOGY_LEVEL_SOCKET,
                                    level + 1);
         num_threads_next_level = num_threads_by_topo_level(topo_info,
                                                            next_level);
@@ -575,7 +575,7 @@ static CPUCacheInfo legacy_l1d_cache = {
     .sets = 64,
     .partitions = 1,
     .no_invd_sharing = true,
-    .share_level = CPU_TOPO_LEVEL_CORE,
+    .share_level = CPU_TOPOLOGY_LEVEL_CORE,
 };
 
 /*FIXME: CPUID leaf 0x80000005 is inconsistent with leaves 2 & 4 */
@@ -590,7 +590,7 @@ static CPUCacheInfo legacy_l1d_cache_amd = {
     .partitions = 1,
     .lines_per_tag = 1,
     .no_invd_sharing = true,
-    .share_level = CPU_TOPO_LEVEL_CORE,
+    .share_level = CPU_TOPOLOGY_LEVEL_CORE,
 };
 
 /* L1 instruction cache: */
@@ -604,7 +604,7 @@ static CPUCacheInfo legacy_l1i_cache = {
     .sets = 64,
     .partitions = 1,
     .no_invd_sharing = true,
-    .share_level = CPU_TOPO_LEVEL_CORE,
+    .share_level = CPU_TOPOLOGY_LEVEL_CORE,
 };
 
 /*FIXME: CPUID leaf 0x80000005 is inconsistent with leaves 2 & 4 */
@@ -619,7 +619,7 @@ static CPUCacheInfo legacy_l1i_cache_amd = {
     .partitions = 1,
     .lines_per_tag = 1,
     .no_invd_sharing = true,
-    .share_level = CPU_TOPO_LEVEL_CORE,
+    .share_level = CPU_TOPOLOGY_LEVEL_CORE,
 };
 
 /* Level 2 unified cache: */
@@ -633,7 +633,7 @@ static CPUCacheInfo legacy_l2_cache = {
     .sets = 4096,
     .partitions = 1,
     .no_invd_sharing = true,
-    .share_level = CPU_TOPO_LEVEL_CORE,
+    .share_level = CPU_TOPOLOGY_LEVEL_CORE,
 };
 
 /*FIXME: CPUID leaf 2 descriptor is inconsistent with CPUID leaf 4 */
@@ -643,7 +643,7 @@ static CPUCacheInfo legacy_l2_cache_cpuid2 = {
     .size = 2 * MiB,
     .line_size = 64,
     .associativity = 8,
-    .share_level = CPU_TOPO_LEVEL_INVALID,
+    .share_level = CPU_TOPOLOGY_LEVEL_INVALID,
 };
 
 
@@ -657,7 +657,7 @@ static CPUCacheInfo legacy_l2_cache_amd = {
     .associativity = 16,
     .sets = 512,
     .partitions = 1,
-    .share_level = CPU_TOPO_LEVEL_CORE,
+    .share_level = CPU_TOPOLOGY_LEVEL_CORE,
 };
 
 /* Level 3 unified cache: */
@@ -673,7 +673,7 @@ static CPUCacheInfo legacy_l3_cache = {
     .self_init = true,
     .inclusive = true,
     .complex_indexing = true,
-    .share_level = CPU_TOPO_LEVEL_DIE,
+    .share_level = CPU_TOPOLOGY_LEVEL_DIE,
 };
 
 /* TLB definitions: */
@@ -2017,7 +2017,7 @@ static const CPUCaches epyc_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
-        .share_level = CPU_TOPO_LEVEL_CORE,
+        .share_level = CPU_TOPOLOGY_LEVEL_CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -2030,7 +2030,7 @@ static const CPUCaches epyc_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
-        .share_level = CPU_TOPO_LEVEL_CORE,
+        .share_level = CPU_TOPOLOGY_LEVEL_CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2041,7 +2041,7 @@ static const CPUCaches epyc_cache_info = {
         .partitions = 1,
         .sets = 1024,
         .lines_per_tag = 1,
-        .share_level = CPU_TOPO_LEVEL_CORE,
+        .share_level = CPU_TOPOLOGY_LEVEL_CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2055,7 +2055,7 @@ static const CPUCaches epyc_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = true,
-        .share_level = CPU_TOPO_LEVEL_DIE,
+        .share_level = CPU_TOPOLOGY_LEVEL_DIE,
     },
 };
 
@@ -2071,7 +2071,7 @@ static CPUCaches epyc_v4_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
-        .share_level = CPU_TOPO_LEVEL_CORE,
+        .share_level = CPU_TOPOLOGY_LEVEL_CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -2084,7 +2084,7 @@ static CPUCaches epyc_v4_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
-        .share_level = CPU_TOPO_LEVEL_CORE,
+        .share_level = CPU_TOPOLOGY_LEVEL_CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2095,7 +2095,7 @@ static CPUCaches epyc_v4_cache_info = {
         .partitions = 1,
         .sets = 1024,
         .lines_per_tag = 1,
-        .share_level = CPU_TOPO_LEVEL_CORE,
+        .share_level = CPU_TOPOLOGY_LEVEL_CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2109,7 +2109,7 @@ static CPUCaches epyc_v4_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = false,
-        .share_level = CPU_TOPO_LEVEL_DIE,
+        .share_level = CPU_TOPOLOGY_LEVEL_DIE,
     },
 };
 
@@ -2125,7 +2125,7 @@ static const CPUCaches epyc_rome_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
-        .share_level = CPU_TOPO_LEVEL_CORE,
+        .share_level = CPU_TOPOLOGY_LEVEL_CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -2138,7 +2138,7 @@ static const CPUCaches epyc_rome_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
-        .share_level = CPU_TOPO_LEVEL_CORE,
+        .share_level = CPU_TOPOLOGY_LEVEL_CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2149,7 +2149,7 @@ static const CPUCaches epyc_rome_cache_info = {
         .partitions = 1,
         .sets = 1024,
         .lines_per_tag = 1,
-        .share_level = CPU_TOPO_LEVEL_CORE,
+        .share_level = CPU_TOPOLOGY_LEVEL_CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2163,7 +2163,7 @@ static const CPUCaches epyc_rome_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = true,
-        .share_level = CPU_TOPO_LEVEL_DIE,
+        .share_level = CPU_TOPOLOGY_LEVEL_DIE,
     },
 };
 
@@ -2179,7 +2179,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
-        .share_level = CPU_TOPO_LEVEL_CORE,
+        .share_level = CPU_TOPOLOGY_LEVEL_CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -2192,7 +2192,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
-        .share_level = CPU_TOPO_LEVEL_CORE,
+        .share_level = CPU_TOPOLOGY_LEVEL_CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2203,7 +2203,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
         .partitions = 1,
         .sets = 1024,
         .lines_per_tag = 1,
-        .share_level = CPU_TOPO_LEVEL_CORE,
+        .share_level = CPU_TOPOLOGY_LEVEL_CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2217,7 +2217,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = false,
-        .share_level = CPU_TOPO_LEVEL_DIE,
+        .share_level = CPU_TOPOLOGY_LEVEL_DIE,
     },
 };
 
@@ -2233,7 +2233,7 @@ static const CPUCaches epyc_milan_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
-        .share_level = CPU_TOPO_LEVEL_CORE,
+        .share_level = CPU_TOPOLOGY_LEVEL_CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -2246,7 +2246,7 @@ static const CPUCaches epyc_milan_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
-        .share_level = CPU_TOPO_LEVEL_CORE,
+        .share_level = CPU_TOPOLOGY_LEVEL_CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2257,7 +2257,7 @@ static const CPUCaches epyc_milan_cache_info = {
         .partitions = 1,
         .sets = 1024,
         .lines_per_tag = 1,
-        .share_level = CPU_TOPO_LEVEL_CORE,
+        .share_level = CPU_TOPOLOGY_LEVEL_CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2271,7 +2271,7 @@ static const CPUCaches epyc_milan_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = true,
-        .share_level = CPU_TOPO_LEVEL_DIE,
+        .share_level = CPU_TOPOLOGY_LEVEL_DIE,
     },
 };
 
@@ -2287,7 +2287,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
-        .share_level = CPU_TOPO_LEVEL_CORE,
+        .share_level = CPU_TOPOLOGY_LEVEL_CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -2300,7 +2300,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
-        .share_level = CPU_TOPO_LEVEL_CORE,
+        .share_level = CPU_TOPOLOGY_LEVEL_CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2311,7 +2311,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
         .partitions = 1,
         .sets = 1024,
         .lines_per_tag = 1,
-        .share_level = CPU_TOPO_LEVEL_CORE,
+        .share_level = CPU_TOPOLOGY_LEVEL_CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2325,7 +2325,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = false,
-        .share_level = CPU_TOPO_LEVEL_DIE,
+        .share_level = CPU_TOPOLOGY_LEVEL_DIE,
     },
 };
 
@@ -2341,7 +2341,7 @@ static const CPUCaches epyc_genoa_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
-        .share_level = CPU_TOPO_LEVEL_CORE,
+        .share_level = CPU_TOPOLOGY_LEVEL_CORE,
     },
     .l1i_cache = &(CPUCacheInfo) {
         .type = INSTRUCTION_CACHE,
@@ -2354,7 +2354,7 @@ static const CPUCaches epyc_genoa_cache_info = {
         .lines_per_tag = 1,
         .self_init = 1,
         .no_invd_sharing = true,
-        .share_level = CPU_TOPO_LEVEL_CORE,
+        .share_level = CPU_TOPOLOGY_LEVEL_CORE,
     },
     .l2_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2365,7 +2365,7 @@ static const CPUCaches epyc_genoa_cache_info = {
         .partitions = 1,
         .sets = 2048,
         .lines_per_tag = 1,
-        .share_level = CPU_TOPO_LEVEL_CORE,
+        .share_level = CPU_TOPOLOGY_LEVEL_CORE,
     },
     .l3_cache = &(CPUCacheInfo) {
         .type = UNIFIED_CACHE,
@@ -2379,7 +2379,7 @@ static const CPUCaches epyc_genoa_cache_info = {
         .self_init = true,
         .inclusive = true,
         .complex_indexing = false,
-        .share_level = CPU_TOPO_LEVEL_DIE,
+        .share_level = CPU_TOPOLOGY_LEVEL_DIE,
     },
 };
 
@@ -6507,7 +6507,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
 
                     /* Share the cache at package level. */
                     *eax |= max_thread_ids_for_cache(&topo_info,
-                                CPU_TOPO_LEVEL_PACKAGE) << 14;
+                                CPU_TOPOLOGY_LEVEL_SOCKET) << 14;
                 }
             }
         } else if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
@@ -7992,10 +7992,10 @@ static void x86_cpu_init_default_topo(X86CPU *cpu)
     env->nr_modules = 1;
     env->nr_dies = 1;
 
-    /* SMT, core and package levels are set by default. */
-    set_bit(CPU_TOPO_LEVEL_SMT, env->avail_cpu_topo);
-    set_bit(CPU_TOPO_LEVEL_CORE, env->avail_cpu_topo);
-    set_bit(CPU_TOPO_LEVEL_PACKAGE, env->avail_cpu_topo);
+    /* thread, core and socket levels are set by default. */
+    set_bit(CPU_TOPOLOGY_LEVEL_THREAD, env->avail_cpu_topo);
+    set_bit(CPU_TOPOLOGY_LEVEL_CORE, env->avail_cpu_topo);
+    set_bit(CPU_TOPOLOGY_LEVEL_SOCKET, env->avail_cpu_topo);
 }
 
 static void x86_cpu_initfn(Object *obj)
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 9c39384ac0aa..34289994082d 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1660,7 +1660,7 @@ typedef struct CPUCacheInfo {
      * Used to encode CPUID[4].EAX[bits 25:14] or
      * CPUID[0x8000001D].EAX[bits 25:14].
      */
-    enum CPUTopoLevel share_level;
+    CpuTopologyLevel share_level;
 } CPUCacheInfo;
 
 
@@ -1990,7 +1990,7 @@ typedef struct CPUArchState {
     unsigned nr_modules;
 
     /* Bitmap of available CPU topology levels for this CPU. */
-    DECLARE_BITMAP(avail_cpu_topo, CPU_TOPO_LEVEL_MAX);
+    DECLARE_BITMAP(avail_cpu_topo, CPU_TOPOLOGY_LEVEL__MAX);
 } CPUX86State;
 
 struct kvm_msrs;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 2/7] qapi/qom: Define cache enumeration and properties for machine
  2024-10-12 10:44 [PATCH v3 0/7] Introduce SMP Cache Topology Zhao Liu
  2024-10-12 10:44 ` [PATCH v3 1/7] hw/core: Make CPU topology enumeration arch-agnostic Zhao Liu
@ 2024-10-12 10:44 ` Zhao Liu
  2024-10-12 10:44 ` [PATCH v3 3/7] hw/core: Check smp cache topology support " Zhao Liu
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 23+ messages in thread
From: Zhao Liu @ 2024-10-12 10:44 UTC (permalink / raw)
  To: Daniel P . Berrangé, Igor Mammedov, Eduardo Habkost,
	Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
	Michael S . Tsirkin, Paolo Bonzini, Richard Henderson, Eric Blake,
	Markus Armbruster, Marcelo Tosatti, Alex Bennée,
	Peter Maydell, Jonathan Cameron, Sia Jee Heng, Alireza Sanaee
  Cc: qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Zhao Liu, Yongwei Ma

The x86 and ARM need to allow user to configure cache properties
(current only topology):
 * For x86, the default cache topology model (of max/host CPU) does not
   always match the Host's real physical cache topology. Performance can
   increase when the configured virtual topology is closer to the
   physical topology than a default topology would be.
 * For ARM, QEMU can't get the cache topology information from the CPU
   registers, then user configuration is necessary. Additionally, the
   cache information is also needed for MPAM emulation (for TCG) to
   build the right PPTT.

Define smp-cache related enumeration and properties in QAPI, so that
user could configure cache properties for SMP system through -machine in
the subsequent patch.

Cache enumeration (CacheLevelAndType) is implemented as the combination
of cache level (level 1/2/3) and cache type (data/instruction/unified).

Currently, separated L1 cache (L1 data cache and L1 instruction cache)
with unified higher-level cache (e.g., unified L2 and L3 caches), is the
most common cache architectures.

Therefore, enumerate the L1 D-cache, L1 I-cache, L2 cache and L3 cache
with smp-cache object to add the basic cache topology support. Other
kinds of caches (e.g., L1 unified or L2/L3 separated caches) can be
added directly into CacheLevelAndType if necessary.

Cache properties (SmpCacheProperties) currently only contains cache
topology information, and other cache properties can be added in it
if necessary.

Note, define cache topology based on CPU topology level with two
reasons:

 1. In practice, a cache will always be bound to the CPU container
    (either private in the CPU container or shared among multiple
    containers), and CPU container is often expressed in terms of CPU
    topology level.
 2. The x86's cache-related CPUIDs encode cache topology based on APIC
    ID's CPU topology layout. And the ACPI PPTT table that ARM/RISCV
    relies on also requires CPU containers to help indicate the private
    shared hierarchy of the cache. Therefore, for SMP systems, it is
    natural to use the CPU topology hierarchy directly in QEMU to define
    the cache topology.

With smp-cache QAPI support, add smp cache topology for machine by
parsing the smp-cache object list.

Also add a helper to access cache topology level of machine.

Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
Suggested by credit:
 * Referred to Daniel's suggestion to introduce cache object list.
---
Changes since Patch v2:
 * Updated version of new QAPI structures to v9.2. (Jonathan)
 * Merged the folloup commit ("hw/core: Add smp cache topology for
   machine") into this patch as the use case of smp-cache QAPI.
   (Jonathan)
 * Optimized the error_message line breaks to be more grep-friendly.
   (Jonathan)
 * Reduced unnecessary "else" in machine_parse_smp_cache.
   (Jonathan)

Changes since Patch v1:
 * Renamed SMPCacheProperty/SMPCacheProperties (QAPI structures) to
   SmpCacheProperties/SmpCachePropertiesWrapper. (Markus)
 * Renamed SMPCacheName (QAPI structure) to SmpCacheLevelAndType and
   dropped prefix. (Markus)
 * Renamed 'name' field in SmpCacheProperties to 'cache', since the
   type and level of the cache in SMP system could be able to specify
   all of these kinds of cache explicitly enough.
 * Renamed 'topo' field in SmpCacheProperties to 'topology'. (Markus)
 * Returned error information when user repeats setting cache
   properties. (Markus)
 * Renamed SmpCacheLevelAndType to CacheLevelAndType, since this
   representation is general across SMP or hybrid system.
 * Dropped handwriten smp-cache object and integrated cache pproperties
   list into MachineState (in next patch). (Markus)
 * Added the reason why x86 and ARM need to configure cache
   information. (Markus and Jonathan)

Changes since RFC v2:
 * New commit to implement cache list with JSON format instead of
   multiple sub-options in -smp.
---
 hw/core/machine-smp.c    | 40 ++++++++++++++++++++++++++++++++
 hw/core/machine.c        | 44 +++++++++++++++++++++++++++++++++++
 include/hw/boards.h      | 10 ++++++++
 qapi/machine-common.json | 50 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 144 insertions(+)

diff --git a/hw/core/machine-smp.c b/hw/core/machine-smp.c
index 5d8d7edcbd3f..1ce7be902e6e 100644
--- a/hw/core/machine-smp.c
+++ b/hw/core/machine-smp.c
@@ -261,6 +261,40 @@ void machine_parse_smp_config(MachineState *ms,
     }
 }
 
+bool machine_parse_smp_cache(MachineState *ms,
+                             const SmpCachePropertiesList *caches,
+                             Error **errp)
+{
+    const SmpCachePropertiesList *node;
+    DECLARE_BITMAP(caches_bitmap, CACHE_LEVEL_AND_TYPE__MAX);
+
+    for (node = caches; node; node = node->next) {
+        /* Prohibit users from setting the cache topology level to invalid. */
+        if (node->value->topology == CPU_TOPOLOGY_LEVEL_INVALID) {
+            error_setg(errp,
+                       "Invalid cache topology level: %s. "
+                       "The topology should match valid CPU topology level",
+                       CpuTopologyLevel_str(node->value->topology));
+            return false;
+        }
+
+        /* Prohibit users from repeating settings. */
+        if (test_bit(node->value->cache, caches_bitmap)) {
+            error_setg(errp,
+                       "Invalid cache properties: %s. "
+                       "The cache properties are duplicated",
+                       CacheLevelAndType_str(node->value->cache));
+            return false;
+        }
+
+        ms->smp_cache.props[node->value->cache].topology =
+            node->value->topology;
+        set_bit(node->value->cache, caches_bitmap);
+    }
+
+    return true;
+}
+
 unsigned int machine_topo_get_cores_per_socket(const MachineState *ms)
 {
     return ms->smp.cores * ms->smp.modules * ms->smp.clusters * ms->smp.dies;
@@ -270,3 +304,9 @@ unsigned int machine_topo_get_threads_per_socket(const MachineState *ms)
 {
     return ms->smp.threads * machine_topo_get_cores_per_socket(ms);
 }
+
+CpuTopologyLevel machine_get_cache_topo_level(const MachineState *ms,
+                                              CacheLevelAndType cache)
+{
+    return ms->smp_cache.props[cache].topology;
+}
diff --git a/hw/core/machine.c b/hw/core/machine.c
index adaba17ebac1..518beb9f883a 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -932,6 +932,40 @@ static void machine_set_smp(Object *obj, Visitor *v, const char *name,
     machine_parse_smp_config(ms, config, errp);
 }
 
+static void machine_get_smp_cache(Object *obj, Visitor *v, const char *name,
+                                  void *opaque, Error **errp)
+{
+    MachineState *ms = MACHINE(obj);
+    SmpCache *cache = &ms->smp_cache;
+    SmpCachePropertiesList *head = NULL;
+    SmpCachePropertiesList **tail = &head;
+
+    for (int i = 0; i < CACHE_LEVEL_AND_TYPE__MAX; i++) {
+        SmpCacheProperties *node = g_new(SmpCacheProperties, 1);
+
+        node->cache = cache->props[i].cache;
+        node->topology = cache->props[i].topology;
+        QAPI_LIST_APPEND(tail, node);
+    }
+
+    visit_type_SmpCachePropertiesList(v, name, &head, errp);
+    qapi_free_SmpCachePropertiesList(head);
+}
+
+static void machine_set_smp_cache(Object *obj, Visitor *v, const char *name,
+                                  void *opaque, Error **errp)
+{
+    MachineState *ms = MACHINE(obj);
+    SmpCachePropertiesList *caches;
+
+    if (!visit_type_SmpCachePropertiesList(v, name, &caches, errp)) {
+        return;
+    }
+
+    machine_parse_smp_cache(ms, caches, errp);
+    qapi_free_SmpCachePropertiesList(caches);
+}
+
 static void machine_get_boot(Object *obj, Visitor *v, const char *name,
                             void *opaque, Error **errp)
 {
@@ -1057,6 +1091,11 @@ static void machine_class_init(ObjectClass *oc, void *data)
     object_class_property_set_description(oc, "smp",
         "CPU topology");
 
+    object_class_property_add(oc, "smp-cache", "SmpCachePropertiesWrapper",
+        machine_get_smp_cache, machine_set_smp_cache, NULL, NULL);
+    object_class_property_set_description(oc, "smp-cache",
+        "Cache properties list for SMP machine");
+
     object_class_property_add(oc, "phandle-start", "int",
         machine_get_phandle_start, machine_set_phandle_start,
         NULL, NULL);
@@ -1195,6 +1234,11 @@ static void machine_initfn(Object *obj)
     ms->smp.cores = 1;
     ms->smp.threads = 1;
 
+    for (int i = 0; i < CACHE_LEVEL_AND_TYPE__MAX; i++) {
+        ms->smp_cache.props[i].cache = (CacheLevelAndType)i;
+        ms->smp_cache.props[i].topology = CPU_TOPOLOGY_LEVEL_DEFAULT;
+    }
+
     machine_copy_boot_config(ms, &(BootConfiguration){ 0 });
 }
 
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 5966069baab3..0729066e353a 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -44,8 +44,13 @@ void machine_set_cpu_numa_node(MachineState *machine,
                                Error **errp);
 void machine_parse_smp_config(MachineState *ms,
                               const SMPConfiguration *config, Error **errp);
+bool machine_parse_smp_cache(MachineState *ms,
+                             const SmpCachePropertiesList *caches,
+                             Error **errp);
 unsigned int machine_topo_get_cores_per_socket(const MachineState *ms);
 unsigned int machine_topo_get_threads_per_socket(const MachineState *ms);
+CpuTopologyLevel machine_get_cache_topo_level(const MachineState *ms,
+                                              CacheLevelAndType cache);
 void machine_memory_devices_init(MachineState *ms, hwaddr base, uint64_t size);
 
 /**
@@ -369,6 +374,10 @@ typedef struct CpuTopology {
     unsigned int max_cpus;
 } CpuTopology;
 
+typedef struct SmpCache {
+    SmpCacheProperties props[CACHE_LEVEL_AND_TYPE__MAX];
+} SmpCache;
+
 /**
  * MachineState:
  */
@@ -419,6 +428,7 @@ struct MachineState {
     AccelState *accelerator;
     CPUArchIdList *possible_cpus;
     CpuTopology smp;
+    SmpCache smp_cache;
     struct NVDIMMState *nvdimms_state;
     struct NumaState *numa_state;
 };
diff --git a/qapi/machine-common.json b/qapi/machine-common.json
index db3e499fb382..b780440c7ec6 100644
--- a/qapi/machine-common.json
+++ b/qapi/machine-common.json
@@ -62,3 +62,53 @@
 { 'enum': 'CpuTopologyLevel',
   'data': [ 'invalid', 'thread', 'core', 'module', 'cluster',
             'die', 'socket', 'book', 'drawer', 'default' ] }
+
+##
+# @CacheLevelAndType:
+#
+# Caches a system may have.  The enumeration value here is the
+# combination of cache level and cache type.
+#
+# @l1d: L1 data cache.
+#
+# @l1i: L1 instruction cache.
+#
+# @l2: L2 (unified) cache.
+#
+# @l3: L3 (unified) cache
+#
+# Since: 9.2
+##
+{ 'enum': 'CacheLevelAndType',
+  'data': [ 'l1d', 'l1i', 'l2', 'l3' ] }
+
+##
+# @SmpCacheProperties:
+#
+# Cache information for SMP system.
+#
+# @cache: Cache name, which is the combination of cache level
+#     and cache type.
+#
+# @topology: Cache topology level.  It accepts the CPU topology
+#     enumeration as the parameter, i.e., CPUs in the same
+#     topology container share the same cache.
+#
+# Since: 9.2
+##
+{ 'struct': 'SmpCacheProperties',
+  'data': {
+  'cache': 'CacheLevelAndType',
+  'topology': 'CpuTopologyLevel' } }
+
+##
+# @SmpCachePropertiesWrapper:
+#
+# List wrapper of SmpCacheProperties.
+#
+# @caches: the list of SmpCacheProperties.
+#
+# Since 9.2
+##
+{ 'struct': 'SmpCachePropertiesWrapper',
+  'data': { 'caches': ['SmpCacheProperties'] } }
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 3/7] hw/core: Check smp cache topology support for machine
  2024-10-12 10:44 [PATCH v3 0/7] Introduce SMP Cache Topology Zhao Liu
  2024-10-12 10:44 ` [PATCH v3 1/7] hw/core: Make CPU topology enumeration arch-agnostic Zhao Liu
  2024-10-12 10:44 ` [PATCH v3 2/7] qapi/qom: Define cache enumeration and properties for machine Zhao Liu
@ 2024-10-12 10:44 ` Zhao Liu
  2024-10-12 10:44 ` [PATCH v3 4/7] i386/cpu: Support thread and module level cache topology Zhao Liu
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 23+ messages in thread
From: Zhao Liu @ 2024-10-12 10:44 UTC (permalink / raw)
  To: Daniel P . Berrangé, Igor Mammedov, Eduardo Habkost,
	Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
	Michael S . Tsirkin, Paolo Bonzini, Richard Henderson, Eric Blake,
	Markus Armbruster, Marcelo Tosatti, Alex Bennée,
	Peter Maydell, Jonathan Cameron, Sia Jee Heng, Alireza Sanaee
  Cc: qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Zhao Liu, Yongwei Ma

Add cache_supported flags in SMPCompatProps to allow machines to
configure various caches support.

And check the compatibility of the cache properties with the
machine support in machine_parse_smp_cache().

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
Changes since Patch v2:
 * Polished the comment of "default" level check. (Jonathan)
 * Reduced short line wrap. (Jonathan)

Changes since Patch v1:
 * Dropped machine_check_smp_cache_support() and did the check when
   -machine parses smp-cache in machine_parse_smp_cache().

Changes since RFC v2:
 * Split as a separate commit to just include compatibility checking and
   topology checking.
 * Allow setting "default" topology level even though the cache
   isn't supported by machine. (Daniel)
---
 hw/core/machine-smp.c | 75 +++++++++++++++++++++++++++++++++++++++++++
 include/hw/boards.h   |  3 ++
 2 files changed, 78 insertions(+)

diff --git a/hw/core/machine-smp.c b/hw/core/machine-smp.c
index 1ce7be902e6e..f3edbded2e7b 100644
--- a/hw/core/machine-smp.c
+++ b/hw/core/machine-smp.c
@@ -261,10 +261,47 @@ void machine_parse_smp_config(MachineState *ms,
     }
 }
 
+static bool machine_check_topo_support(MachineState *ms,
+                                       CpuTopologyLevel topo,
+                                       Error **errp)
+{
+    MachineClass *mc = MACHINE_GET_CLASS(ms);
+
+    if ((topo == CPU_TOPOLOGY_LEVEL_MODULE && !mc->smp_props.modules_supported) ||
+        (topo == CPU_TOPOLOGY_LEVEL_CLUSTER && !mc->smp_props.clusters_supported) ||
+        (topo == CPU_TOPOLOGY_LEVEL_DIE && !mc->smp_props.dies_supported) ||
+        (topo == CPU_TOPOLOGY_LEVEL_BOOK && !mc->smp_props.books_supported) ||
+        (topo == CPU_TOPOLOGY_LEVEL_DRAWER && !mc->smp_props.drawers_supported)) {
+        error_setg(errp,
+                   "Invalid topology level: %s. "
+                   "The topology level is not supported by this machine",
+                   CpuTopologyLevel_str(topo));
+        return false;
+    }
+
+    return true;
+}
+
+/*
+ * When both cache1 and cache2 are configured with specific topology levels
+ * (not default level), is cache1's topology level higher than cache2?
+ */
+static bool smp_cache_topo_cmp(const SmpCache *smp_cache,
+                               CacheLevelAndType cache1,
+                               CacheLevelAndType cache2)
+{
+    if (smp_cache->props[cache1].topology != CPU_TOPOLOGY_LEVEL_DEFAULT &&
+        smp_cache->props[cache1].topology > smp_cache->props[cache2].topology) {
+        return true;
+    }
+    return false;
+}
+
 bool machine_parse_smp_cache(MachineState *ms,
                              const SmpCachePropertiesList *caches,
                              Error **errp)
 {
+    MachineClass *mc = MACHINE_GET_CLASS(ms);
     const SmpCachePropertiesList *node;
     DECLARE_BITMAP(caches_bitmap, CACHE_LEVEL_AND_TYPE__MAX);
 
@@ -292,6 +329,44 @@ bool machine_parse_smp_cache(MachineState *ms,
         set_bit(node->value->cache, caches_bitmap);
     }
 
+    for (int i = 0; i < CACHE_LEVEL_AND_TYPE__MAX; i++) {
+        const SmpCacheProperties *props = &ms->smp_cache.props[i];
+
+        /*
+         * Reject non "default" topology level if the cache isn't
+         * supported by the machine.
+         */
+        if (props->topology != CPU_TOPOLOGY_LEVEL_DEFAULT &&
+            !mc->smp_props.cache_supported[props->cache]) {
+            error_setg(errp,
+                       "%s cache topology not supported by this machine",
+                       CacheLevelAndType_str(node->value->cache));
+            return false;
+        }
+
+        if (!machine_check_topo_support(ms, props->topology, errp)) {
+            return false;
+        }
+    }
+
+    if (smp_cache_topo_cmp(&ms->smp_cache, CACHE_LEVEL_AND_TYPE_L1D,
+                           CACHE_LEVEL_AND_TYPE_L2) ||
+        smp_cache_topo_cmp(&ms->smp_cache, CACHE_LEVEL_AND_TYPE_L1I,
+                           CACHE_LEVEL_AND_TYPE_L2)) {
+        error_setg(errp,
+                   "Invalid smp cache topology. "
+                   "L2 cache topology level shouldn't be lower than L1 cache");
+        return false;
+    }
+
+    if (smp_cache_topo_cmp(&ms->smp_cache, CACHE_LEVEL_AND_TYPE_L2,
+                           CACHE_LEVEL_AND_TYPE_L3)) {
+        error_setg(errp,
+                   "Invalid smp cache topology. "
+                   "L3 cache topology level shouldn't be lower than L2 cache");
+        return false;
+    }
+
     return true;
 }
 
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 0729066e353a..e4a1035e3fa1 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -151,6 +151,8 @@ typedef struct {
  * @books_supported - whether books are supported by the machine
  * @drawers_supported - whether drawers are supported by the machine
  * @modules_supported - whether modules are supported by the machine
+ * @cache_supported - whether cache (l1d, l1i, l2 and l3) configuration are
+ *                    supported by the machine
  */
 typedef struct {
     bool prefer_sockets;
@@ -160,6 +162,7 @@ typedef struct {
     bool books_supported;
     bool drawers_supported;
     bool modules_supported;
+    bool cache_supported[CACHE_LEVEL_AND_TYPE__MAX];
 } SMPCompatProps;
 
 /**
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 4/7] i386/cpu: Support thread and module level cache topology
  2024-10-12 10:44 [PATCH v3 0/7] Introduce SMP Cache Topology Zhao Liu
                   ` (2 preceding siblings ...)
  2024-10-12 10:44 ` [PATCH v3 3/7] hw/core: Check smp cache topology support " Zhao Liu
@ 2024-10-12 10:44 ` Zhao Liu
  2024-10-12 10:44 ` [PATCH v3 5/7] i386/cpu: Update cache topology with machine's configuration Zhao Liu
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 23+ messages in thread
From: Zhao Liu @ 2024-10-12 10:44 UTC (permalink / raw)
  To: Daniel P . Berrangé, Igor Mammedov, Eduardo Habkost,
	Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
	Michael S . Tsirkin, Paolo Bonzini, Richard Henderson, Eric Blake,
	Markus Armbruster, Marcelo Tosatti, Alex Bennée,
	Peter Maydell, Jonathan Cameron, Sia Jee Heng, Alireza Sanaee
  Cc: qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Zhao Liu, Yongwei Ma

Allow cache to be defined at the thread and module level. This
increases flexibility for x86 users to customize their cache topology.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 target/i386/cpu.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index c84e32c9c303..202bfe5086be 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -241,9 +241,15 @@ static uint32_t max_thread_ids_for_cache(X86CPUTopoInfo *topo_info,
     uint32_t num_ids = 0;
 
     switch (share_level) {
+    case CPU_TOPOLOGY_LEVEL_THREAD:
+        num_ids = 1;
+        break;
     case CPU_TOPOLOGY_LEVEL_CORE:
         num_ids = 1 << apicid_core_offset(topo_info);
         break;
+    case CPU_TOPOLOGY_LEVEL_MODULE:
+        num_ids = 1 << apicid_module_offset(topo_info);
+        break;
     case CPU_TOPOLOGY_LEVEL_DIE:
         num_ids = 1 << apicid_die_offset(topo_info);
         break;
@@ -251,10 +257,6 @@ static uint32_t max_thread_ids_for_cache(X86CPUTopoInfo *topo_info,
         num_ids = 1 << apicid_pkg_offset(topo_info);
         break;
     default:
-        /*
-         * Currently there is no use case for THREAD and MODULE, so use
-         * assert directly to facilitate debugging.
-         */
         g_assert_not_reached();
     }
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 5/7] i386/cpu: Update cache topology with machine's configuration
  2024-10-12 10:44 [PATCH v3 0/7] Introduce SMP Cache Topology Zhao Liu
                   ` (3 preceding siblings ...)
  2024-10-12 10:44 ` [PATCH v3 4/7] i386/cpu: Support thread and module level cache topology Zhao Liu
@ 2024-10-12 10:44 ` Zhao Liu
  2024-10-12 10:44 ` [PATCH v3 6/7] i386/pc: Support cache topology in -machine for PC machine Zhao Liu
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 23+ messages in thread
From: Zhao Liu @ 2024-10-12 10:44 UTC (permalink / raw)
  To: Daniel P . Berrangé, Igor Mammedov, Eduardo Habkost,
	Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
	Michael S . Tsirkin, Paolo Bonzini, Richard Henderson, Eric Blake,
	Markus Armbruster, Marcelo Tosatti, Alex Bennée,
	Peter Maydell, Jonathan Cameron, Sia Jee Heng, Alireza Sanaee
  Cc: qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Zhao Liu, Yongwei Ma

User will configure smp cache topology via -machine smp-cache.

For this case, update the x86 CPUs' cache topology with user's
configuration in MachineState.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
Changes since RFC v2:
 * Used smp_cache array to override cache topology.
 * Wrapped the updating into a function.
---
 target/i386/cpu.c | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 202bfe5086be..c8a04faf3764 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -7597,6 +7597,38 @@ static void x86_cpu_hyperv_realize(X86CPU *cpu)
     cpu->hyperv_limits[2] = 0;
 }
 
+#ifndef CONFIG_USER_ONLY
+static void x86_cpu_update_smp_cache_topo(MachineState *ms, X86CPU *cpu)
+{
+    CPUX86State *env = &cpu->env;
+    CpuTopologyLevel level;
+
+    level = machine_get_cache_topo_level(ms, CACHE_LEVEL_AND_TYPE_L1D);
+    if (level != CPU_TOPOLOGY_LEVEL_DEFAULT) {
+        env->cache_info_cpuid4.l1d_cache->share_level = level;
+        env->cache_info_amd.l1d_cache->share_level = level;
+    }
+
+    level = machine_get_cache_topo_level(ms, CACHE_LEVEL_AND_TYPE_L1I);
+    if (level != CPU_TOPOLOGY_LEVEL_DEFAULT) {
+        env->cache_info_cpuid4.l1i_cache->share_level = level;
+        env->cache_info_amd.l1i_cache->share_level = level;
+    }
+
+    level = machine_get_cache_topo_level(ms, CACHE_LEVEL_AND_TYPE_L2);
+    if (level != CPU_TOPOLOGY_LEVEL_DEFAULT) {
+        env->cache_info_cpuid4.l2_cache->share_level = level;
+        env->cache_info_amd.l2_cache->share_level = level;
+    }
+
+    level = machine_get_cache_topo_level(ms, CACHE_LEVEL_AND_TYPE_L3);
+    if (level != CPU_TOPOLOGY_LEVEL_DEFAULT) {
+        env->cache_info_cpuid4.l3_cache->share_level = level;
+        env->cache_info_amd.l3_cache->share_level = level;
+    }
+}
+#endif
+
 static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
 {
     CPUState *cs = CPU(dev);
@@ -7821,6 +7853,13 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
 
 #ifndef CONFIG_USER_ONLY
     MachineState *ms = MACHINE(qdev_get_machine());
+
+    /*
+     * TODO: Add a SMPCompatProps.has_caches flag to avoid useless Updates
+     * if user didn't set smp_cache.
+     */
+    x86_cpu_update_smp_cache_topo(ms, cpu);
+
     qemu_register_reset(x86_cpu_machine_reset_cb, cpu);
 
     if (cpu->env.features[FEAT_1_EDX] & CPUID_APIC || ms->smp.cpus > 1) {
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 6/7] i386/pc: Support cache topology in -machine for PC machine
  2024-10-12 10:44 [PATCH v3 0/7] Introduce SMP Cache Topology Zhao Liu
                   ` (4 preceding siblings ...)
  2024-10-12 10:44 ` [PATCH v3 5/7] i386/cpu: Update cache topology with machine's configuration Zhao Liu
@ 2024-10-12 10:44 ` Zhao Liu
  2024-10-17 15:27   ` Daniel P. Berrangé
  2024-10-12 10:44 ` [PATCH v3 7/7] i386/cpu: add has_caches flag to check smp_cache configuration Zhao Liu
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 23+ messages in thread
From: Zhao Liu @ 2024-10-12 10:44 UTC (permalink / raw)
  To: Daniel P . Berrangé, Igor Mammedov, Eduardo Habkost,
	Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
	Michael S . Tsirkin, Paolo Bonzini, Richard Henderson, Eric Blake,
	Markus Armbruster, Marcelo Tosatti, Alex Bennée,
	Peter Maydell, Jonathan Cameron, Sia Jee Heng, Alireza Sanaee
  Cc: qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Zhao Liu, Yongwei Ma

Allow user to configure l1d, l1i, l2 and l3 cache topologies for PC
machine.

Additionally, add the document of "-machine smp-cache" in
qemu-options.hx.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
Changes since Patch v2:
 * Polished the document. (Jonathan)

Changes since Patch v1:
 * Merged document into this patch. (Markus)

Changes since RFC v2:
 * Used cache_supported array.
---
 hw/i386/pc.c    |  4 ++++
 qemu-options.hx | 26 +++++++++++++++++++++++++-
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 2047633e4cf7..8aea2308dcb9 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1791,6 +1791,10 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
     mc->nvdimm_supported = true;
     mc->smp_props.dies_supported = true;
     mc->smp_props.modules_supported = true;
+    mc->smp_props.cache_supported[CACHE_LEVEL_AND_TYPE_L1D] = true;
+    mc->smp_props.cache_supported[CACHE_LEVEL_AND_TYPE_L1I] = true;
+    mc->smp_props.cache_supported[CACHE_LEVEL_AND_TYPE_L2] = true;
+    mc->smp_props.cache_supported[CACHE_LEVEL_AND_TYPE_L3] = true;
     mc->default_ram_id = "pc.ram";
     pcmc->default_smbios_ep_type = SMBIOS_ENTRY_POINT_TYPE_AUTO;
 
diff --git a/qemu-options.hx b/qemu-options.hx
index d5afefe5b63c..8a0cd7393f34 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -39,7 +39,8 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
     "                memory-encryption=@var{} memory encryption object to use (default=none)\n"
     "                hmat=on|off controls ACPI HMAT support (default=off)\n"
     "                memory-backend='backend-id' specifies explicitly provided backend for main RAM (default=none)\n"
-    "                cxl-fmw.0.targets.0=firsttarget,cxl-fmw.0.targets.1=secondtarget,cxl-fmw.0.size=size[,cxl-fmw.0.interleave-granularity=granularity]\n",
+    "                cxl-fmw.0.targets.0=firsttarget,cxl-fmw.0.targets.1=secondtarget,cxl-fmw.0.size=size[,cxl-fmw.0.interleave-granularity=granularity]\n"
+    "                smp-cache.0.cache=cachename,smp-cache.0.topology=topologylevel\n",
     QEMU_ARCH_ALL)
 SRST
 ``-machine [type=]name[,prop=value[,...]]``
@@ -159,6 +160,29 @@ SRST
         ::
 
             -machine cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.targets.1=cxl.1,cxl-fmw.0.size=128G,cxl-fmw.0.interleave-granularity=512
+
+    ``smp-cache.0.cache=cachename,smp-cache.0.topology=topologylevel``
+        Define cache properties for SMP system.
+
+        ``cache=cachename`` specifies the cache that the properties will be
+        applied on. This field is the combination of cache level and cache
+        type. It supports ``l1d`` (L1 data cache), ``l1i`` (L1 instruction
+        cache), ``l2`` (L2 unified cache) and ``l3`` (L3 unified cache).
+
+        ``topology=topologylevel`` sets the cache topology level. It accepts
+        CPU topology levels including ``thread``, ``core``, ``module``,
+        ``cluster``, ``die``, ``socket``, ``book``, ``drawer`` and a special
+        value ``default``. If ``default`` is set, then the cache topology will
+        follow the architecture's default cache topology model. If another
+        topology level is set, the cache will be shared at corresponding CPU
+        topology level. For example, ``topology=core`` makes the cache shared
+        by all threads within a core.
+
+        Example:
+
+        ::
+
+            -machine smp-cache.0.cache=l1d,smp-cache.0.topology=core,smp-cache.1.cache=l1i,smp-cache.1.topology=core
 ERST
 
 DEF("M", HAS_ARG, QEMU_OPTION_M,
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 7/7] i386/cpu: add has_caches flag to check smp_cache configuration
  2024-10-12 10:44 [PATCH v3 0/7] Introduce SMP Cache Topology Zhao Liu
                   ` (5 preceding siblings ...)
  2024-10-12 10:44 ` [PATCH v3 6/7] i386/pc: Support cache topology in -machine for PC machine Zhao Liu
@ 2024-10-12 10:44 ` Zhao Liu
  2024-10-17 13:16   ` Jonathan Cameron via
  2024-10-17 13:19 ` [PATCH v3 0/7] Introduce SMP Cache Topology Jonathan Cameron via
       [not found] ` <20241017141402.0000135b@Huawei.com>
  8 siblings, 1 reply; 23+ messages in thread
From: Zhao Liu @ 2024-10-12 10:44 UTC (permalink / raw)
  To: Daniel P . Berrangé, Igor Mammedov, Eduardo Habkost,
	Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
	Michael S . Tsirkin, Paolo Bonzini, Richard Henderson, Eric Blake,
	Markus Armbruster, Marcelo Tosatti, Alex Bennée,
	Peter Maydell, Jonathan Cameron, Sia Jee Heng, Alireza Sanaee
  Cc: qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Zhao Liu

From: Alireza Sanaee <alireza.sanaee@huawei.com>

Add has_caches flag to SMPCompatProps, which helps in avoiding
extra checks for every single layer of caches in x86 (and ARM in
future).

Signed-off-by: Alireza Sanaee <alireza.sanaee@huawei.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Note: Picked from Alireza's series with the changes:
 * Moved the flag to SMPCompatProps with a new name "has_caches".
   This way, it remains consistent with the function and style of
   "has_clusters" in SMPCompatProps.
 * Dropped my previous TODO with the new flag.
---
Changes since Patch v2:
 * Picked a new patch frome Alireza's ARM smp-cache series.
---
 hw/core/machine-smp.c | 2 ++
 include/hw/boards.h   | 3 +++
 target/i386/cpu.c     | 9 ++++-----
 3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/hw/core/machine-smp.c b/hw/core/machine-smp.c
index f3edbded2e7b..16e456678cb6 100644
--- a/hw/core/machine-smp.c
+++ b/hw/core/machine-smp.c
@@ -367,6 +367,8 @@ bool machine_parse_smp_cache(MachineState *ms,
         return false;
     }
 
+    mc->smp_props.has_caches = true;
+
     return true;
 }
 
diff --git a/include/hw/boards.h b/include/hw/boards.h
index e4a1035e3fa1..af62b09c89d1 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -153,6 +153,8 @@ typedef struct {
  * @modules_supported - whether modules are supported by the machine
  * @cache_supported - whether cache (l1d, l1i, l2 and l3) configuration are
  *                    supported by the machine
+ * @has_caches - whether cache properties are explicitly specified in the
+ *               user provided smp-cache configuration
  */
 typedef struct {
     bool prefer_sockets;
@@ -163,6 +165,7 @@ typedef struct {
     bool drawers_supported;
     bool modules_supported;
     bool cache_supported[CACHE_LEVEL_AND_TYPE__MAX];
+    bool has_caches;
 } SMPCompatProps;
 
 /**
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index c8a04faf3764..6f711e98b527 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -7853,12 +7853,11 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
 
 #ifndef CONFIG_USER_ONLY
     MachineState *ms = MACHINE(qdev_get_machine());
+    MachineClass *mc = MACHINE_GET_CLASS(ms);
 
-    /*
-     * TODO: Add a SMPCompatProps.has_caches flag to avoid useless Updates
-     * if user didn't set smp_cache.
-     */
-    x86_cpu_update_smp_cache_topo(ms, cpu);
+    if (mc->smp_props.has_caches) {
+        x86_cpu_update_smp_cache_topo(ms, cpu);
+    }
 
     qemu_register_reset(x86_cpu_machine_reset_cb, cpu);
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 7/7] i386/cpu: add has_caches flag to check smp_cache configuration
  2024-10-12 10:44 ` [PATCH v3 7/7] i386/cpu: add has_caches flag to check smp_cache configuration Zhao Liu
@ 2024-10-17 13:16   ` Jonathan Cameron via
  0 siblings, 0 replies; 23+ messages in thread
From: Jonathan Cameron via @ 2024-10-17 13:16 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Daniel P Berrangé, Igor Mammedov, Eduardo Habkost,
	Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
	Michael S.Tsirkin , Paolo Bonzini, Richard Henderson, Eric Blake,
	Markus Armbruster, Marcelo Tosatti, Alex Bennée,
	Peter Maydell, Sia Jee Heng, Alireza Sanaee, qemu-devel, kvm,
	qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi

RESEND as rejected by server (header issue, hopefully fixed)

On Sat, 12 Oct 2024 18:44:29 +0800
Zhao Liu <zhao1.liu@intel.com> wrote:

> From: Alireza Sanaee <alireza.sanaee@huawei.com>
> 
> Add has_caches flag to SMPCompatProps, which helps in avoiding
> extra checks for every single layer of caches in x86 (and ARM in
> future).
> 
> Signed-off-by: Alireza Sanaee <alireza.sanaee@huawei.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
LGTM
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
> Note: Picked from Alireza's series with the changes:
>  * Moved the flag to SMPCompatProps with a new name "has_caches".
>    This way, it remains consistent with the function and style of
>    "has_clusters" in SMPCompatProps.
>  * Dropped my previous TODO with the new flag.
> ---
> Changes since Patch v2:
>  * Picked a new patch frome Alireza's ARM smp-cache series.
> ---
>  hw/core/machine-smp.c | 2 ++
>  include/hw/boards.h   | 3 +++
>  target/i386/cpu.c     | 9 ++++-----
>  3 files changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/core/machine-smp.c b/hw/core/machine-smp.c
> index f3edbded2e7b..16e456678cb6 100644
> --- a/hw/core/machine-smp.c
> +++ b/hw/core/machine-smp.c
> @@ -367,6 +367,8 @@ bool machine_parse_smp_cache(MachineState *ms,
>          return false;
>      }
>  
> +    mc->smp_props.has_caches = true;
> +
>      return true;
>  }
>  
> diff --git a/include/hw/boards.h b/include/hw/boards.h
> index e4a1035e3fa1..af62b09c89d1 100644
> --- a/include/hw/boards.h
> +++ b/include/hw/boards.h
> @@ -153,6 +153,8 @@ typedef struct {
>   * @modules_supported - whether modules are supported by the machine
>   * @cache_supported - whether cache (l1d, l1i, l2 and l3) configuration are
>   *                    supported by the machine
> + * @has_caches - whether cache properties are explicitly specified in the
> + *               user provided smp-cache configuration
>   */
>  typedef struct {
>      bool prefer_sockets;
> @@ -163,6 +165,7 @@ typedef struct {
>      bool drawers_supported;
>      bool modules_supported;
>      bool cache_supported[CACHE_LEVEL_AND_TYPE__MAX];
> +    bool has_caches;
>  } SMPCompatProps;
>  
>  /**
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index c8a04faf3764..6f711e98b527 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -7853,12 +7853,11 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
>  
>  #ifndef CONFIG_USER_ONLY
>      MachineState *ms = MACHINE(qdev_get_machine());
> +    MachineClass *mc = MACHINE_GET_CLASS(ms);
>  
> -    /*
> -     * TODO: Add a SMPCompatProps.has_caches flag to avoid useless Updates
> -     * if user didn't set smp_cache.
> -     */
> -    x86_cpu_update_smp_cache_topo(ms, cpu);
> +    if (mc->smp_props.has_caches) {
> +        x86_cpu_update_smp_cache_topo(ms, cpu);
> +    }
>  
>      qemu_register_reset(x86_cpu_machine_reset_cb, cpu);
>  



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 0/7] Introduce SMP Cache Topology
  2024-10-12 10:44 [PATCH v3 0/7] Introduce SMP Cache Topology Zhao Liu
                   ` (6 preceding siblings ...)
  2024-10-12 10:44 ` [PATCH v3 7/7] i386/cpu: add has_caches flag to check smp_cache configuration Zhao Liu
@ 2024-10-17 13:19 ` Jonathan Cameron via
       [not found] ` <20241017141402.0000135b@Huawei.com>
  8 siblings, 0 replies; 23+ messages in thread
From: Jonathan Cameron via @ 2024-10-17 13:19 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Daniel P Berrangé, Igor Mammedov, Eduardo Habkost,
	Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
	Michael S.Tsirkin , Paolo Bonzini, Richard Henderson, Eric Blake,
	Markus Armbruster, Marcelo Tosatti, Alex Bennée,
	Peter Maydell, Sia Jee Heng, Alireza Sanaee, qemu-devel, kvm,
	qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi

RESEND (again, sorry) as didn't reach list.
Issue some stray " in various email addresses.

On Sat, 12 Oct 2024 18:44:22 +0800
Zhao Liu <zhao1.liu@intel.com> wrote:

> Hi all,
> 
> Compared with v2 [1], the changes in v3 are quite minor, and most of
> patches (except for patch 1 and 7) have received Jonathan’s R/b (thanks
> Jonathan!).
> 
> Meanwhile, ARM side has also worked a lot on the smp-cache based on
> this series [2], so I think we are very close to the final merge, to
> catch up with this cycle. :)

This would finally solve a long standing missing control for our
virtualization usecases (TCG and MPAM stuff is an added bonus),
so I'm very keen in this making 9.2 (and maybe even the ARM part
of things happen to move fast enough). Ali is out this week,
but should be back sometime next week. Looks like rebase of his
ARM patches on this should be simple!

I think this set mostly needs a QAPI review (perhaps from Markus?)

> 
> This series is based on the commit 7e3b6d8063f2 ("Merge tag 'crypto-
> fixes-pull-request' of https://gitlab.com/berrange/qemu into staging").
> 
> Background
> ==========
> 
> The x86 and ARM (RISCV) need to allow user to configure cache properties
*laughs*. I definitely going to start emailing ARM folk with
ARM (RISCV)  
:)  


> (current only topology):
>  * For x86, the default cache topology model (of max/host CPU) does not
>    always match the Host's real physical cache topology. Performance can
>    increase when the configured virtual topology is closer to the
>    physical topology than a default topology would be.
>  * For ARM, QEMU can't get the cache topology information from the CPU
>    registers, then user configuration is necessary. Additionally, the
>    cache information is also needed for MPAM emulation (for TCG) to
>    build the right PPTT. (Originally from Jonathan)


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 1/7] hw/core: Make CPU topology enumeration arch-agnostic
       [not found]   ` <20241017095227.00006d85@Huawei.com>
@ 2024-10-17 13:20     ` Jonathan Cameron via
  2024-10-17 14:51       ` Zhao Liu
  0 siblings, 1 reply; 23+ messages in thread
From: Jonathan Cameron via @ 2024-10-17 13:20 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Daniel P . Berrangé, Igor Mammedov, Eduardo Habkost,
	Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
	Michael S.Tsirkin , Paolo Bonzini, Richard Henderson, Eric Blake,
	Markus Armbruster, Marcelo Tosatti, Alex Bennée,
	Peter Maydell, Sia Jee Heng, Alireza Sanaee, qemu-devel, kvm,
	qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi, Yongwei Ma

RESEND (sorry for noise).
Quotes in email address issue meant the server bounced it.

On Thu, 17 Oct 2024 09:52:27 +0100
Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:

> On Sat, 12 Oct 2024 18:44:23 +0800
> Zhao Liu <zhao1.liu@intel.com> wrote:
> 
> > Cache topology needs to be defined based on CPU topology levels. Thus,
> > define CPU topology enumeration in qapi/machine.json to make it generic
> > for all architectures.
> > 
> > To match the general topology naming style, rename CPU_TOPO_LEVEL_* to
> > CPU_TOPOLOGY_LEVEL_*, and rename SMT and package levels to thread and
> > socket.
> > 
> > Also, enumerate additional topology levels for non-i386 arches, and add
> > a CPU_TOPOLOGY_LEVEL_DEFAULT to help future smp-cache object to work
> > with compatibility requirement of arch-specific cache topology models.
> > 
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > Tested-by: Yongwei Ma <yongwei.ma@intel.com>  
> LGTM with one incredibly trivial comment inline.
> 
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> >  #endif /* HW_I386_TOPOLOGY_H */
> > diff --git a/qapi/machine-common.json b/qapi/machine-common.json
> > index b64e4895cfd7..db3e499fb382 100644
> > --- a/qapi/machine-common.json
> > +++ b/qapi/machine-common.json
> > @@ -5,7 +5,7 @@
> >  # See the COPYING file in the top-level directory.
> >  
> >  ##
> > -# = Machines S390 data types
> > +# = Common machine types
> >  ##
> >  
> >  ##
> > @@ -18,3 +18,47 @@
> >  ##
> >  { 'enum': 'S390CpuEntitlement',
> >    'data': [ 'auto', 'low', 'medium', 'high' ] }
> > +
> > +##
> > +# @CpuTopologyLevel:
> > +#
> > +# An enumeration of CPU topology levels.
> > +#
> > +# @invalid: Invalid topology level.  
> 
> Really trivial but why a capital I on Invalid here but not the
> t of thread below?
> 
> > +#
> > +# @thread: thread level, which would also be called SMT level or
> > +#     logical processor level.  The @threads option in
> > +#     SMPConfiguration is used to configure the topology of this
> > +#     level.  
> 



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 1/7] hw/core: Make CPU topology enumeration arch-agnostic
  2024-10-17 13:20     ` Jonathan Cameron via
@ 2024-10-17 14:51       ` Zhao Liu
  0 siblings, 0 replies; 23+ messages in thread
From: Zhao Liu @ 2024-10-17 14:51 UTC (permalink / raw)
  To: Jonathan Cameron, Markus Armbruster
  Cc: Daniel P . Berrang�, Igor Mammedov, Eduardo Habkost,
	Marcel Apfelbaum, Philippe Mathieu-Daud�, Yanan Wang,
	Michael S.Tsirkin , Paolo Bonzini, Richard Henderson, Eric Blake,
	Marcelo Tosatti, Alex Benn�e, Peter Maydell, Sia Jee Heng,
	Alireza Sanaee, qemu-devel, kvm, qemu-riscv, qemu-arm,
	Zhenyu Wang, Dapeng Mi, Yongwei Ma, Zhao Liu

(Ping Markus)

> > > +
> > > +##
> > > +# @CpuTopologyLevel:
> > > +#
> > > +# An enumeration of CPU topology levels.
> > > +#
> > > +# @invalid: Invalid topology level.  
> > 
> > Really trivial but why a capital I on Invalid here but not the
> > t of thread below?

Oops, thank you! It should be "invalid".

If Markus doesn't veto this version :), I'll standardize the case issues
in this file later. Some cases are uppercase, while others are lowercase.

> > > +#
> > > +# @thread: thread level, which would also be called SMT level or
> > > +#     logical processor level.  The @threads option in
> > > +#     SMPConfiguration is used to configure the topology of this
> > > +#     level.  
> > 
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 0/7] Introduce SMP Cache Topology
       [not found] ` <20241017141402.0000135b@Huawei.com>
@ 2024-10-17 15:01   ` Zhao Liu
  0 siblings, 0 replies; 23+ messages in thread
From: Zhao Liu @ 2024-10-17 15:01 UTC (permalink / raw)
  To: Jonathan Cameron, Daniel P . Berrangé, Paolo Bonzini,
	Markus Armbruster
  Cc: Igor Mammedov, Eduardo Habkost, Marcel Apfelbaum,
	Philippe =?ISO-8859-1?Q?Ma?= =?ISO-8859-1?Q?thieu-Daud=E9?=,
	Yanan Wang, Michael S.Tsirkin, Richard Henderson, Eric Blake,
	Marcelo Tosatti, Alex =?ISO-8859-1?Q?Benn=E9e?=, Peter Maydell,
	Sia Jee Heng, Alireza Sanaee, qemu-devel, kvm, qemu-riscv,
	qemu-arm, Zhenyu Wang, Dapeng Mi, Zhao Liu

(Cc and gentlely ping QOM & QAPI maintainers :) )

> > Meanwhile, ARM side has also worked a lot on the smp-cache based on
> > this series [2], so I think we are very close to the final merge, to
> > catch up with this cycle. :)
> 
> This would finally solve a long standing missing control for our
> virtualization usecases (TCG and MPAM stuff is an added bonus),
> so I'm very keen in this making 9.2 (and maybe even the ARM part
> of things happen to move fast enough). Ali is out this week,
> but should be back sometime next week. Looks like rebase of his
> ARM patches on this should be simple!
> 
> I think this set mostly needs a QAPI review (perhaps from Markus?)

Michael mentioned this series also need QOM maintainer's review. So I
pinged maintainers at the beginning of this reply.

> > 
> > This series is based on the commit 7e3b6d8063f2 ("Merge tag 'crypto-
> > fixes-pull-request' of https://gitlab.com/berrange/qemu into staging").
> > 
> > Background
> > ==========
> > 
> > The x86 and ARM (RISCV) need to allow user to configure cache properties
> *laughs*. I definitely going to start emailing ARM folk with
> ARM (RISCV)  
> :)  

:) I remembered you discussing cache topology with Sia (from RISCV).

Thanks,
Zhao




^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 6/7] i386/pc: Support cache topology in -machine for PC machine
  2024-10-12 10:44 ` [PATCH v3 6/7] i386/pc: Support cache topology in -machine for PC machine Zhao Liu
@ 2024-10-17 15:27   ` Daniel P. Berrangé
  2024-10-18  3:57     ` Zhao Liu
  0 siblings, 1 reply; 23+ messages in thread
From: Daniel P. Berrangé @ 2024-10-17 15:27 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Igor Mammedov, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Markus Armbruster,
	Marcelo Tosatti, Alex Bennée, Peter Maydell,
	Jonathan Cameron, Sia Jee Heng, Alireza Sanaee, qemu-devel, kvm,
	qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi, Yongwei Ma

On Sat, Oct 12, 2024 at 06:44:28PM +0800, Zhao Liu wrote:
> Allow user to configure l1d, l1i, l2 and l3 cache topologies for PC
> machine.
> 
> Additionally, add the document of "-machine smp-cache" in
> qemu-options.hx.
> 
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> Tested-by: Yongwei Ma <yongwei.ma@intel.com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
> Changes since Patch v2:
>  * Polished the document. (Jonathan)
> 
> Changes since Patch v1:
>  * Merged document into this patch. (Markus)
> 
> Changes since RFC v2:
>  * Used cache_supported array.
> ---
>  hw/i386/pc.c    |  4 ++++
>  qemu-options.hx | 26 +++++++++++++++++++++++++-
>  2 files changed, 29 insertions(+), 1 deletion(-)
> 
>              -machine cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.targets.1=cxl.1,cxl-fmw.0.size=128G,cxl-fmw.0.interleave-granularity=512
> +
> +    ``smp-cache.0.cache=cachename,smp-cache.0.topology=topologylevel``
> +        Define cache properties for SMP system.
> +
> +        ``cache=cachename`` specifies the cache that the properties will be
> +        applied on. This field is the combination of cache level and cache
> +        type. It supports ``l1d`` (L1 data cache), ``l1i`` (L1 instruction
> +        cache), ``l2`` (L2 unified cache) and ``l3`` (L3 unified cache).
> +
> +        ``topology=topologylevel`` sets the cache topology level. It accepts
> +        CPU topology levels including ``thread``, ``core``, ``module``,
> +        ``cluster``, ``die``, ``socket``, ``book``, ``drawer`` and a special
> +        value ``default``. If ``default`` is set, then the cache topology will
> +        follow the architecture's default cache topology model. If another
> +        topology level is set, the cache will be shared at corresponding CPU
> +        topology level. For example, ``topology=core`` makes the cache shared
> +        by all threads within a core.
> +
> +        Example:
> +
> +        ::
> +
> +            -machine smp-cache.0.cache=l1d,smp-cache.0.topology=core,smp-cache.1.cache=l1i,smp-cache.1.topology=core

There are 4 cache types, l1d, l1i, l2, l3.

In this example you've only set properties for l1d, l1i caches.

What does this mean for l2 / l3 caches ?

Are they reported as not existing, or are they to be reported at
some built-in default topology level. If the latter, how does the
user know what that built-in default is, and avoid nonsense like
l1d being at socket level, and l3 being at the core level ? Can
we explicitly disable a l2/l3 cache, or must it always exists ?

The QAPI has an "invalid" topology level. You've not documented
that as permitted here, but the qapi parser will happily allow
it. What semantics will that have ? 

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 1/7] hw/core: Make CPU topology enumeration arch-agnostic
  2024-10-12 10:44 ` [PATCH v3 1/7] hw/core: Make CPU topology enumeration arch-agnostic Zhao Liu
       [not found]   ` <20241017095227.00006d85@Huawei.com>
@ 2024-10-17 15:30   ` Daniel P. Berrangé
  2024-10-18  2:36     ` Zhao Liu
  2024-10-17 16:19   ` Marcin Juszkiewicz
  2 siblings, 1 reply; 23+ messages in thread
From: Daniel P. Berrangé @ 2024-10-17 15:30 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Igor Mammedov, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Markus Armbruster,
	Marcelo Tosatti, Alex Bennée, Peter Maydell,
	Jonathan Cameron, Sia Jee Heng, Alireza Sanaee, qemu-devel, kvm,
	qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi, Yongwei Ma

On Sat, Oct 12, 2024 at 06:44:23PM +0800, Zhao Liu wrote:
> Cache topology needs to be defined based on CPU topology levels. Thus,
> define CPU topology enumeration in qapi/machine.json to make it generic
> for all architectures.
> 
> To match the general topology naming style, rename CPU_TOPO_LEVEL_* to
> CPU_TOPOLOGY_LEVEL_*, and rename SMT and package levels to thread and
> socket.
> 
> Also, enumerate additional topology levels for non-i386 arches, and add
> a CPU_TOPOLOGY_LEVEL_DEFAULT to help future smp-cache object to work
> with compatibility requirement of arch-specific cache topology models.
> 
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> Tested-by: Yongwei Ma <yongwei.ma@intel.com>
> ---
> Changes since Patch v2:
>  * Updated version of new QAPI structures to v9.2. (Jonathan)
> 
> Changes since Patch v1:
>  * Dropped prefix of CpuTopologyLevel enumeration. (Markus)
>  * Rename CPU_TOPO_LEVEL_* to CPU_TOPOLOGY_LEVEL_* to match the QAPI's
>    generated code. (Markus)
> 
> Changes since RFC v2:
>  * Dropped cpu-topology.h and cpu-topology.c since QAPI has the helper
>    (CpuTopologyLevel_str) to convert enum to string. (Markus)
>  * Fixed text format in machine.json (CpuTopologyLevel naming, 2 spaces
>    between sentences). (Markus)
>  * Added a new level "default" to de-compatibilize some arch-specific
>    topo settings. (Daniel)
>  * Moved CpuTopologyLevel to qapi/machine-common.json, at where the
>    cache enumeration and smp-cache object would be added.
>    - If smp-cache object is defined in qapi/machine.json, storage-daemon
>      will complain about the qmp cmds in qapi/machine.json during
>      compiling.
> 
> Changes since RFC v1:
>  * Used QAPI to enumerate CPU topology levels.
>  * Dropped string_to_cpu_topo() since QAPI will help to parse the topo
>    levels.
> ---
>  hw/i386/x86-common.c       |   4 +-
>  include/hw/i386/topology.h |  22 +-----
>  qapi/machine-common.json   |  46 +++++++++++-
>  target/i386/cpu.c          | 144 ++++++++++++++++++-------------------
>  target/i386/cpu.h          |   4 +-
>  5 files changed, 124 insertions(+), 96 deletions(-)
> 

> diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> index dff49fce1154..bf740383038b 100644
> --- a/include/hw/i386/topology.h
> +++ b/include/hw/i386/topology.h
> @@ -39,7 +39,7 @@

>      unsigned threads_per_core;
>  } X86CPUTopoInfo;
>  
> -/*
> - * CPUTopoLevel is the general i386 topology hierarchical representation,
> - * ordered by increasing hierarchical relationship.
> - * Its enumeration value is not bound to the type value of Intel (CPUID[0x1F])
> - * or AMD (CPUID[0x80000026]).
> - */
> -enum CPUTopoLevel {
> -    CPU_TOPO_LEVEL_INVALID,
> -    CPU_TOPO_LEVEL_SMT,
> -    CPU_TOPO_LEVEL_CORE,
> -    CPU_TOPO_LEVEL_MODULE,
> -    CPU_TOPO_LEVEL_DIE,
> -    CPU_TOPO_LEVEL_PACKAGE,
> -    CPU_TOPO_LEVEL_MAX,
> -};
> -

snip

> @@ -18,3 +18,47 @@
>  ##
>  { 'enum': 'S390CpuEntitlement',
>    'data': [ 'auto', 'low', 'medium', 'high' ] }
> +
> +##
> +# @CpuTopologyLevel:
> +#
> +# An enumeration of CPU topology levels.
> +#
> +# @invalid: Invalid topology level.

Previously all topology levels were internal to QEMU, and IIUC
this CPU_TOPO_LEVEL_INVALID appears to have been a special
value to indicate  the cache was absent ?

Now we're exposing this directly to the user as a settable
option. We need to explain what effect setting 'invalid'
has on the CPU cache config.

> +#
> +# @thread: thread level, which would also be called SMT level or
> +#     logical processor level.  The @threads option in
> +#     SMPConfiguration is used to configure the topology of this
> +#     level.
> +#
> +# @core: core level.  The @cores option in SMPConfiguration is used
> +#     to configure the topology of this level.
> +#
> +# @module: module level.  The @modules option in SMPConfiguration is
> +#     used to configure the topology of this level.
> +#
> +# @cluster: cluster level.  The @clusters option in SMPConfiguration
> +#     is used to configure the topology of this level.
> +#
> +# @die: die level.  The @dies option in SMPConfiguration is used to
> +#     configure the topology of this level.
> +#
> +# @socket: socket level, which would also be called package level.
> +#     The @sockets option in SMPConfiguration is used to configure
> +#     the topology of this level.
> +#
> +# @book: book level.  The @books option in SMPConfiguration is used
> +#     to configure the topology of this level.
> +#
> +# @drawer: drawer level.  The @drawers option in SMPConfiguration is
> +#     used to configure the topology of this level.
> +#
> +# @default: default level.  Some architectures will have default
> +#     topology settings (e.g., cache topology), and this special
> +#     level means following the architecture-specific settings.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 1/7] hw/core: Make CPU topology enumeration arch-agnostic
  2024-10-12 10:44 ` [PATCH v3 1/7] hw/core: Make CPU topology enumeration arch-agnostic Zhao Liu
       [not found]   ` <20241017095227.00006d85@Huawei.com>
  2024-10-17 15:30   ` Daniel P. Berrangé
@ 2024-10-17 16:19   ` Marcin Juszkiewicz
  2024-10-18  4:26     ` Zhao Liu
  2 siblings, 1 reply; 23+ messages in thread
From: Marcin Juszkiewicz @ 2024-10-17 16:19 UTC (permalink / raw)
  To: Zhao Liu, Daniel P . Berrangé, Igor Mammedov,
	Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Michael S . Tsirkin, Paolo Bonzini, Richard Henderson,
	Eric Blake, Markus Armbruster, Marcelo Tosatti, Alex Bennée,
	Peter Maydell, Jonathan Cameron, Sia Jee Heng, Alireza Sanaee
  Cc: qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Yongwei Ma

W dniu 12.10.2024 o 12:44, Zhao Liu pisze:
> Cache topology needs to be defined based on CPU topology levels. Thus,
> define CPU topology enumeration in qapi/machine.json to make it generic
> for all architectures.

I have a question: how to create other than default cache topology in C 
source?

If I would like to change default cache structure for sbsa-ref then how 
would I do it?

QEMU has powerful set of command line options. But it is hard to convert 
set of cli options into C code.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 1/7] hw/core: Make CPU topology enumeration arch-agnostic
  2024-10-17 15:30   ` Daniel P. Berrangé
@ 2024-10-18  2:36     ` Zhao Liu
  2024-10-18  7:55       ` Daniel P. Berrangé
  0 siblings, 1 reply; 23+ messages in thread
From: Zhao Liu @ 2024-10-18  2:36 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Igor Mammedov, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Markus Armbruster,
	Marcelo Tosatti, Alex Bennée, Peter Maydell,
	Jonathan Cameron, Sia Jee Heng, Alireza Sanaee, qemu-devel, kvm,
	qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi, Zhao Liu

Hi Daniel,

> > -/*
> > - * CPUTopoLevel is the general i386 topology hierarchical representation,
> > - * ordered by increasing hierarchical relationship.
> > - * Its enumeration value is not bound to the type value of Intel (CPUID[0x1F])
> > - * or AMD (CPUID[0x80000026]).
> > - */
> > -enum CPUTopoLevel {
> > -    CPU_TOPO_LEVEL_INVALID,
> > -    CPU_TOPO_LEVEL_SMT,
> > -    CPU_TOPO_LEVEL_CORE,
> > -    CPU_TOPO_LEVEL_MODULE,
> > -    CPU_TOPO_LEVEL_DIE,
> > -    CPU_TOPO_LEVEL_PACKAGE,
> > -    CPU_TOPO_LEVEL_MAX,
> > -};
> > -
> 
> snip
> 
> > @@ -18,3 +18,47 @@
> >  ##
> >  { 'enum': 'S390CpuEntitlement',
> >    'data': [ 'auto', 'low', 'medium', 'high' ] }
> > +
> > +##
> > +# @CpuTopologyLevel:
> > +#
> > +# An enumeration of CPU topology levels.
> > +#
> > +# @invalid: Invalid topology level.
> 
> Previously all topology levels were internal to QEMU, and IIUC
> this CPU_TOPO_LEVEL_INVALID appears to have been a special
> value to indicate  the cache was absent ?

Now I haven't support this logic.
x86 CPU has a "l3-cache" property, and maybe that property can be
implemented or replaced by the "invalid" level support you mentioned.

> Now we're exposing this directly to the user as a settable
> option. We need to explain what effect setting 'invalid'
> has on the CPU cache config.

If user set "invalid", QEMU will report the error message:

qemu-system-x86_64: Invalid cache topology level: invalid. The topology should match valid CPU topology level

Do you think this error message is sufficient?

Thanks,
Zhao



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 6/7] i386/pc: Support cache topology in -machine for PC machine
  2024-10-17 15:27   ` Daniel P. Berrangé
@ 2024-10-18  3:57     ` Zhao Liu
  2024-10-18  7:58       ` Daniel P. Berrangé
  0 siblings, 1 reply; 23+ messages in thread
From: Zhao Liu @ 2024-10-18  3:57 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Igor Mammedov, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Markus Armbruster,
	Marcelo Tosatti, Alex Bennée, Peter Maydell,
	Jonathan Cameron, Sia Jee Heng, Alireza Sanaee, qemu-devel, kvm,
	qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi, Zhao Liu

Hi Daniel,

> > +    ``smp-cache.0.cache=cachename,smp-cache.0.topology=topologylevel``
> > +        Define cache properties for SMP system.
> > +
> > +        ``cache=cachename`` specifies the cache that the properties will be
> > +        applied on. This field is the combination of cache level and cache
> > +        type. It supports ``l1d`` (L1 data cache), ``l1i`` (L1 instruction
> > +        cache), ``l2`` (L2 unified cache) and ``l3`` (L3 unified cache).
> > +
> > +        ``topology=topologylevel`` sets the cache topology level. It accepts
> > +        CPU topology levels including ``thread``, ``core``, ``module``,
> > +        ``cluster``, ``die``, ``socket``, ``book``, ``drawer`` and a special
> > +        value ``default``. If ``default`` is set, then the cache topology will
> > +        follow the architecture's default cache topology model. If another
> > +        topology level is set, the cache will be shared at corresponding CPU
> > +        topology level. For example, ``topology=core`` makes the cache shared
> > +        by all threads within a core.
> > +
> > +        Example:
> > +
> > +        ::
> > +
> > +            -machine smp-cache.0.cache=l1d,smp-cache.0.topology=core,smp-cache.1.cache=l1i,smp-cache.1.topology=core
> 
> There are 4 cache types, l1d, l1i, l2, l3.
> 
> In this example you've only set properties for l1d, l1i caches.
> 
> What does this mean for l2 / l3 caches ?

Omitting "cache" will default to using the "default" level.

I think I should add the above description to the documentation.

> Are they reported as not existing, or are they to be reported at
> some built-in default topology level.

It's the latter.

If a machine doesn't support l2/l3, then QEMU will also report the error
like:

qemu-system-*: l2 cache topology not supported by this machine

> If the latter, how does the user know what that built-in default is, 

Currently, the default cache model for x86 is L1 per core, L2 per core,
and L3 per die. Similar to the topology levels, there is still no way to
expose this to users. I can descript default cache model in doc.

But I feel like we're back to the situation we discussed earlier:
"default" CPU topology support should be related to the CPU model, but
in practice, QEMU supports it at the machine level. The cache topology
depends on CPU topology support and can only continue to be added on top
of the machine.

So do you think we can add topology and cache information in CpuModelInfo
so that query-cpu-model-expansion can expose default CPU/cache topology
information to users?

This way, users can customize CPU/cache topology in -smp and
-machine smp-cache. Although the QMP command is targeted at the CPU model
while our CLI is at the machine level, at least we can expose the
information to users.

If you agree to expose the default topology/cache info in
query-cpu-model-expansion, can I work on this in a separate series? :)

> and avoid nonsense like l1d being at socket level, and l3 being at the
> core level ?

Thank you! I ensured l1 must be lower than l2 and l2 must be lower than
l3, but missed l1 must be lower than l3.

The level check is incomplete due to the influence of "default." I think
the check should be placed after the arch CPU sets the cache model,
so that "default" is consumed.

> Can we explicitly disable a l2/l3 cache, or must it always exists ?

Now we can't disable it through -machine smp-cache (while x86 CPU support
l3-cache=off), but as you mentioned, I can try using "invalid" to support
this scenario, which would be more general. Similarly, if you agree, I
can also add this support in a separate series.

> The QAPI has an "invalid" topology level. You've not documented
> that as permitted here, but the qapi parser will happily allow
> it. What semantics will that have ? 

Because the current this patch throws an error for "invalid", so I didn't
included it in the doc.

Thanks,
Zhao

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 1/7] hw/core: Make CPU topology enumeration arch-agnostic
  2024-10-17 16:19   ` Marcin Juszkiewicz
@ 2024-10-18  4:26     ` Zhao Liu
  0 siblings, 0 replies; 23+ messages in thread
From: Zhao Liu @ 2024-10-18  4:26 UTC (permalink / raw)
  To: Marcin Juszkiewicz
  Cc: Daniel P . Berrangé, Igor Mammedov, Eduardo Habkost,
	Marcel Apfelbaum, Philippe Mathieu-Daudé, Yanan Wang,
	Michael S . Tsirkin, Paolo Bonzini, Richard Henderson, Eric Blake,
	Markus Armbruster, Marcelo Tosatti, Alex Bennée,
	Peter Maydell, Jonathan Cameron, Sia Jee Heng, Alireza Sanaee,
	qemu-devel, kvm, qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi,
	Zhao Liu

Hi Marcin,

On Thu, Oct 17, 2024 at 06:19:59PM +0200, Marcin Juszkiewicz wrote:
> Date: Thu, 17 Oct 2024 18:19:59 +0200
> From: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
> Subject: Re: [PATCH v3 1/7] hw/core: Make CPU topology enumeration
>  arch-agnostic
> 
> W dniu 12.10.2024 o 12:44, Zhao Liu pisze:
> > Cache topology needs to be defined based on CPU topology levels. Thus,
> > define CPU topology enumeration in qapi/machine.json to make it generic
> > for all architectures.
> 
> I have a question: how to create other than default cache topology in C
> source?

What does "C source" mean? Does it refer to the C code for sbsa-ref?

There's the ARM change to support cache topology for virt machine:

https://lore.kernel.org/qemu-devel/20241010111822.345-5-alireza.sanaee@huawei.com/

If you're looking to store cache information for some common purposes,
you could also define a cache model structure similar to how it's done
for x86:

static const CPUCaches epyc_cache_info = {
    .l1d_cache = &(CPUCacheInfo) {
        .type = DATA_CACHE,
        .level = 1,
        .size = 32 * KiB,
        .line_size = 64,
        .associativity = 8,
        .partitions = 1,
        .sets = 64,
        .lines_per_tag = 1,
        .self_init = 1,
        .no_invd_sharing = true,
        .share_level = CPU_TOPOLOGY_LEVEL_CORE,
    },
    .l1i_cache = &(CPUCacheInfo) {
        .type = INSTRUCTION_CACHE,
        .level = 1,
        .size = 64 * KiB,
        .line_size = 64,
        .associativity = 4,
        .partitions = 1,
        .sets = 256,
        .lines_per_tag = 1,
        .self_init = 1,
        .no_invd_sharing = true,
        .share_level = CPU_TOPOLOGY_LEVEL_CORE,
    },
    .l2_cache = &(CPUCacheInfo) {
        .type = UNIFIED_CACHE,
        .level = 2,
        .size = 512 * KiB,
        .line_size = 64,
        .associativity = 8,
        .partitions = 1,
        .sets = 1024,
        .lines_per_tag = 1,
        .share_level = CPU_TOPOLOGY_LEVEL_CORE,
    },
    .l3_cache = &(CPUCacheInfo) {
        .type = UNIFIED_CACHE,
        .level = 3,
        .size = 8 * MiB,
        .line_size = 64,
        .associativity = 16,
        .partitions = 1,
        .sets = 8192,
        .lines_per_tag = 1,
        .self_init = true,
        .inclusive = true,
        .complex_indexing = true,
        .share_level = CPU_TOPOLOGY_LEVEL_DIE,
    },
};

> If I would like to change default cache structure for sbsa-ref then how
> would I do it?

I'm not very familiar with sbsa-ref. How is the cache model defined? Does
it use ACPI PPTT like the virt machine? If so, you can refer to the virt
machine series link I provided above.

> QEMU has powerful set of command line options. But it is hard to convert set
> of cli options into C code.

The CLI is currently quite complex, as different machine configurations
may vary. But don't worry. The general steps for enabling smp-cache here
are:

1. Set cache levels support in sbsa_ref_class_init(). You can refer my
   patch 6, to set ture for which cache level you need.
2. Then, the cli can support "-machine smp-cache" for sbsa-ref machine.
   You can refer the doc in my patch 6 to get the correct format.
3. Next, the MachineState will store the user's cache topology in "smp_cache".
   You can refer my patch 5 to get cache topology level from machine.
4. Finally, it's architecture-specific code, depending on whether you
   want to express cache information in the same pptt table as virt
   machine.

Regards,
Zhao




^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 1/7] hw/core: Make CPU topology enumeration arch-agnostic
  2024-10-18  2:36     ` Zhao Liu
@ 2024-10-18  7:55       ` Daniel P. Berrangé
  2024-10-18  9:01         ` Zhao Liu
  0 siblings, 1 reply; 23+ messages in thread
From: Daniel P. Berrangé @ 2024-10-18  7:55 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Igor Mammedov, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Markus Armbruster,
	Marcelo Tosatti, Alex Bennée, Peter Maydell,
	Jonathan Cameron, Sia Jee Heng, Alireza Sanaee, qemu-devel, kvm,
	qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi

On Fri, Oct 18, 2024 at 10:36:30AM +0800, Zhao Liu wrote:
> Hi Daniel,
> 
> > > -/*
> > > - * CPUTopoLevel is the general i386 topology hierarchical representation,
> > > - * ordered by increasing hierarchical relationship.
> > > - * Its enumeration value is not bound to the type value of Intel (CPUID[0x1F])
> > > - * or AMD (CPUID[0x80000026]).
> > > - */
> > > -enum CPUTopoLevel {
> > > -    CPU_TOPO_LEVEL_INVALID,
> > > -    CPU_TOPO_LEVEL_SMT,
> > > -    CPU_TOPO_LEVEL_CORE,
> > > -    CPU_TOPO_LEVEL_MODULE,
> > > -    CPU_TOPO_LEVEL_DIE,
> > > -    CPU_TOPO_LEVEL_PACKAGE,
> > > -    CPU_TOPO_LEVEL_MAX,
> > > -};
> > > -
> > 
> > snip
> > 
> > > @@ -18,3 +18,47 @@
> > >  ##
> > >  { 'enum': 'S390CpuEntitlement',
> > >    'data': [ 'auto', 'low', 'medium', 'high' ] }
> > > +
> > > +##
> > > +# @CpuTopologyLevel:
> > > +#
> > > +# An enumeration of CPU topology levels.
> > > +#
> > > +# @invalid: Invalid topology level.
> > 
> > Previously all topology levels were internal to QEMU, and IIUC
> > this CPU_TOPO_LEVEL_INVALID appears to have been a special
> > value to indicate  the cache was absent ?
> 
> Now I haven't support this logic.
> x86 CPU has a "l3-cache" property, and maybe that property can be
> implemented or replaced by the "invalid" level support you mentioned.
> 
> > Now we're exposing this directly to the user as a settable
> > option. We need to explain what effect setting 'invalid'
> > has on the CPU cache config.
> 
> If user set "invalid", QEMU will report the error message:
> 
> qemu-system-x86_64: Invalid cache topology level: invalid. The topology should match valid CPU topology level
> 
> Do you think this error message is sufficient?

If the user cannot set 'invalid' as an input, and no QEMU interface
will emit, then ideally we would not define 'invalid' in the QAPI
schema at all.

This woudl require us to have some internal only way to record
"invalid", separately from the topology level, or with a magic
internal only constant that doesn't clash with the public enum
constants. I guess the latter would be less work e.g. we could
"abuse" the 'MAX' constant value

   #define CPU_TOPOLOGY_LEVEL_INVALID CPU_TOPOLOGY_LEVEL_MAX

or separate it with a negative value

   #define CPU_TOPOLOGY_LEVEL_INVALID -1


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 6/7] i386/pc: Support cache topology in -machine for PC machine
  2024-10-18  3:57     ` Zhao Liu
@ 2024-10-18  7:58       ` Daniel P. Berrangé
  2024-10-18  9:03         ` Zhao Liu
  0 siblings, 1 reply; 23+ messages in thread
From: Daniel P. Berrangé @ 2024-10-18  7:58 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Igor Mammedov, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Markus Armbruster,
	Marcelo Tosatti, Alex Bennée, Peter Maydell,
	Jonathan Cameron, Sia Jee Heng, Alireza Sanaee, qemu-devel, kvm,
	qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi

On Fri, Oct 18, 2024 at 11:57:36AM +0800, Zhao Liu wrote:
> Hi Daniel,
> 
> > > +    ``smp-cache.0.cache=cachename,smp-cache.0.topology=topologylevel``
> > > +        Define cache properties for SMP system.
> > > +
> > > +        ``cache=cachename`` specifies the cache that the properties will be
> > > +        applied on. This field is the combination of cache level and cache
> > > +        type. It supports ``l1d`` (L1 data cache), ``l1i`` (L1 instruction
> > > +        cache), ``l2`` (L2 unified cache) and ``l3`` (L3 unified cache).
> > > +
> > > +        ``topology=topologylevel`` sets the cache topology level. It accepts
> > > +        CPU topology levels including ``thread``, ``core``, ``module``,
> > > +        ``cluster``, ``die``, ``socket``, ``book``, ``drawer`` and a special
> > > +        value ``default``. If ``default`` is set, then the cache topology will
> > > +        follow the architecture's default cache topology model. If another
> > > +        topology level is set, the cache will be shared at corresponding CPU
> > > +        topology level. For example, ``topology=core`` makes the cache shared
> > > +        by all threads within a core.
> > > +
> > > +        Example:
> > > +
> > > +        ::
> > > +
> > > +            -machine smp-cache.0.cache=l1d,smp-cache.0.topology=core,smp-cache.1.cache=l1i,smp-cache.1.topology=core
> > 
> > There are 4 cache types, l1d, l1i, l2, l3.
> > 
> > In this example you've only set properties for l1d, l1i caches.
> > 
> > What does this mean for l2 / l3 caches ?
> 
> Omitting "cache" will default to using the "default" level.
> 
> I think I should add the above description to the documentation.
> 
> > Are they reported as not existing, or are they to be reported at
> > some built-in default topology level.
> 
> It's the latter.
> 
> If a machine doesn't support l2/l3, then QEMU will also report the error
> like:
> 
> qemu-system-*: l2 cache topology not supported by this machine

Ok, that's good.

> > If the latter, how does the user know what that built-in default is, 
> 
> Currently, the default cache model for x86 is L1 per core, L2 per core,
> and L3 per die. Similar to the topology levels, there is still no way to
> expose this to users. I can descript default cache model in doc.
> 
> But I feel like we're back to the situation we discussed earlier:
> "default" CPU topology support should be related to the CPU model, but
> in practice, QEMU supports it at the machine level. The cache topology
> depends on CPU topology support and can only continue to be added on top
> of the machine.
> 
> So do you think we can add topology and cache information in CpuModelInfo
> so that query-cpu-model-expansion can expose default CPU/cache topology
> information to users?
> 
> This way, users can customize CPU/cache topology in -smp and
> -machine smp-cache. Although the QMP command is targeted at the CPU model
> while our CLI is at the machine level, at least we can expose the
> information to users.
> 
> If you agree to expose the default topology/cache info in
> query-cpu-model-expansion, can I work on this in a separate series? :)

Yeah, lets worry about that another day.

It it sufficient to just encourage users to always specify
the full set of caches.

> > Can we explicitly disable a l2/l3 cache, or must it always exists ?
> 
> Now we can't disable it through -machine smp-cache (while x86 CPU support
> l3-cache=off), but as you mentioned, I can try using "invalid" to support
> this scenario, which would be more general. Similarly, if you agree, I
> can also add this support in a separate series.

If we decide to offer a way to disable caches, probably better to have
a name like 'disabled' for such a setting, and yes, we don't need todo
that now.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 1/7] hw/core: Make CPU topology enumeration arch-agnostic
  2024-10-18  7:55       ` Daniel P. Berrangé
@ 2024-10-18  9:01         ` Zhao Liu
  0 siblings, 0 replies; 23+ messages in thread
From: Zhao Liu @ 2024-10-18  9:01 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Igor Mammedov, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Markus Armbruster,
	Marcelo Tosatti, Alex Bennée, Peter Maydell,
	Jonathan Cameron, Sia Jee Heng, Alireza Sanaee, qemu-devel, kvm,
	qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi, Zhao Liu

On Fri, Oct 18, 2024 at 08:55:35AM +0100, Daniel P. Berrangé wrote:
> Date: Fri, 18 Oct 2024 08:55:35 +0100
> From: "Daniel P. Berrangé" <berrange@redhat.com>
> Subject: Re: [PATCH v3 1/7] hw/core: Make CPU topology enumeration
>  arch-agnostic
> 
> On Fri, Oct 18, 2024 at 10:36:30AM +0800, Zhao Liu wrote:
> > Hi Daniel,
> > 
> > > > -/*
> > > > - * CPUTopoLevel is the general i386 topology hierarchical representation,
> > > > - * ordered by increasing hierarchical relationship.
> > > > - * Its enumeration value is not bound to the type value of Intel (CPUID[0x1F])
> > > > - * or AMD (CPUID[0x80000026]).
> > > > - */
> > > > -enum CPUTopoLevel {
> > > > -    CPU_TOPO_LEVEL_INVALID,
> > > > -    CPU_TOPO_LEVEL_SMT,
> > > > -    CPU_TOPO_LEVEL_CORE,
> > > > -    CPU_TOPO_LEVEL_MODULE,
> > > > -    CPU_TOPO_LEVEL_DIE,
> > > > -    CPU_TOPO_LEVEL_PACKAGE,
> > > > -    CPU_TOPO_LEVEL_MAX,
> > > > -};
> > > > -
> > > 
> > > snip
> > > 
> > > > @@ -18,3 +18,47 @@
> > > >  ##
> > > >  { 'enum': 'S390CpuEntitlement',
> > > >    'data': [ 'auto', 'low', 'medium', 'high' ] }
> > > > +
> > > > +##
> > > > +# @CpuTopologyLevel:
> > > > +#
> > > > +# An enumeration of CPU topology levels.
> > > > +#
> > > > +# @invalid: Invalid topology level.
> > > 
> > > Previously all topology levels were internal to QEMU, and IIUC
> > > this CPU_TOPO_LEVEL_INVALID appears to have been a special
> > > value to indicate  the cache was absent ?
> > 
> > Now I haven't support this logic.
> > x86 CPU has a "l3-cache" property, and maybe that property can be
> > implemented or replaced by the "invalid" level support you mentioned.
> > 
> > > Now we're exposing this directly to the user as a settable
> > > option. We need to explain what effect setting 'invalid'
> > > has on the CPU cache config.
> > 
> > If user set "invalid", QEMU will report the error message:
> > 
> > qemu-system-x86_64: Invalid cache topology level: invalid. The topology should match valid CPU topology level
> > 
> > Do you think this error message is sufficient?
> 
> If the user cannot set 'invalid' as an input, and no QEMU interface
> will emit, then ideally we would not define 'invalid' in the QAPI
> schema at all.
> 
> This woudl require us to have some internal only way to record
> "invalid", separately from the topology level, or with a magic
> internal only constant that doesn't clash with the public enum
> constants. I guess the latter would be less work e.g. we could
> "abuse" the 'MAX' constant value
> 
>    #define CPU_TOPOLOGY_LEVEL_INVALID CPU_TOPOLOGY_LEVEL_MAX
> 
> or separate it with a negative value
> 
>    #define CPU_TOPOLOGY_LEVEL_INVALID -1
>

This's a clever idea. Thank you!

Rgards,
Zhao



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 6/7] i386/pc: Support cache topology in -machine for PC machine
  2024-10-18  7:58       ` Daniel P. Berrangé
@ 2024-10-18  9:03         ` Zhao Liu
  0 siblings, 0 replies; 23+ messages in thread
From: Zhao Liu @ 2024-10-18  9:03 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Igor Mammedov, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé, Yanan Wang, Michael S . Tsirkin,
	Paolo Bonzini, Richard Henderson, Eric Blake, Markus Armbruster,
	Marcelo Tosatti, Alex Bennée, Peter Maydell,
	Jonathan Cameron, Sia Jee Heng, Alireza Sanaee, qemu-devel, kvm,
	qemu-riscv, qemu-arm, Zhenyu Wang, Dapeng Mi, Zhao Liu

On Fri, Oct 18, 2024 at 08:58:03AM +0100, Daniel P. Berrangé wrote:
> Date: Fri, 18 Oct 2024 08:58:03 +0100
> From: "Daniel P. Berrangé" <berrange@redhat.com>
> Subject: Re: [PATCH v3 6/7] i386/pc: Support cache topology in -machine for
>  PC machine
> 
> On Fri, Oct 18, 2024 at 11:57:36AM +0800, Zhao Liu wrote:
> > Hi Daniel,
> > 
> > > > +    ``smp-cache.0.cache=cachename,smp-cache.0.topology=topologylevel``
> > > > +        Define cache properties for SMP system.
> > > > +
> > > > +        ``cache=cachename`` specifies the cache that the properties will be
> > > > +        applied on. This field is the combination of cache level and cache
> > > > +        type. It supports ``l1d`` (L1 data cache), ``l1i`` (L1 instruction
> > > > +        cache), ``l2`` (L2 unified cache) and ``l3`` (L3 unified cache).
> > > > +
> > > > +        ``topology=topologylevel`` sets the cache topology level. It accepts
> > > > +        CPU topology levels including ``thread``, ``core``, ``module``,
> > > > +        ``cluster``, ``die``, ``socket``, ``book``, ``drawer`` and a special
> > > > +        value ``default``. If ``default`` is set, then the cache topology will
> > > > +        follow the architecture's default cache topology model. If another
> > > > +        topology level is set, the cache will be shared at corresponding CPU
> > > > +        topology level. For example, ``topology=core`` makes the cache shared
> > > > +        by all threads within a core.
> > > > +
> > > > +        Example:
> > > > +
> > > > +        ::
> > > > +
> > > > +            -machine smp-cache.0.cache=l1d,smp-cache.0.topology=core,smp-cache.1.cache=l1i,smp-cache.1.topology=core
> > > 
> > > There are 4 cache types, l1d, l1i, l2, l3.
> > > 
> > > In this example you've only set properties for l1d, l1i caches.
> > > 
> > > What does this mean for l2 / l3 caches ?
> > 
> > Omitting "cache" will default to using the "default" level.
> > 
> > I think I should add the above description to the documentation.
> > 
> > > Are they reported as not existing, or are they to be reported at
> > > some built-in default topology level.
> > 
> > It's the latter.
> > 
> > If a machine doesn't support l2/l3, then QEMU will also report the error
> > like:
> > 
> > qemu-system-*: l2 cache topology not supported by this machine
> 
> Ok, that's good.
> 
> > > If the latter, how does the user know what that built-in default is, 
> > 
> > Currently, the default cache model for x86 is L1 per core, L2 per core,
> > and L3 per die. Similar to the topology levels, there is still no way to
> > expose this to users. I can descript default cache model in doc.
> > 
> > But I feel like we're back to the situation we discussed earlier:
> > "default" CPU topology support should be related to the CPU model, but
> > in practice, QEMU supports it at the machine level. The cache topology
> > depends on CPU topology support and can only continue to be added on top
> > of the machine.
> > 
> > So do you think we can add topology and cache information in CpuModelInfo
> > so that query-cpu-model-expansion can expose default CPU/cache topology
> > information to users?
> > 
> > This way, users can customize CPU/cache topology in -smp and
> > -machine smp-cache. Although the QMP command is targeted at the CPU model
> > while our CLI is at the machine level, at least we can expose the
> > information to users.
> > 
> > If you agree to expose the default topology/cache info in
> > query-cpu-model-expansion, can I work on this in a separate series? :)
> 
> Yeah, lets worry about that another day.
> 
> It it sufficient to just encourage users to always specify
> the full set of caches.

Thanks!

> > > Can we explicitly disable a l2/l3 cache, or must it always exists ?
> > 
> > Now we can't disable it through -machine smp-cache (while x86 CPU support
> > l3-cache=off), but as you mentioned, I can try using "invalid" to support
> > this scenario, which would be more general. Similarly, if you agree, I
> > can also add this support in a separate series.
> 
> If we decide to offer a way to disable caches, probably better to have
> a name like 'disabled' for such a setting, and yes, we don't need todo
> that now.

Yes, "disabled" is better.

Regards,
Zhao

 


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2024-10-18  8:48 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-12 10:44 [PATCH v3 0/7] Introduce SMP Cache Topology Zhao Liu
2024-10-12 10:44 ` [PATCH v3 1/7] hw/core: Make CPU topology enumeration arch-agnostic Zhao Liu
     [not found]   ` <20241017095227.00006d85@Huawei.com>
2024-10-17 13:20     ` Jonathan Cameron via
2024-10-17 14:51       ` Zhao Liu
2024-10-17 15:30   ` Daniel P. Berrangé
2024-10-18  2:36     ` Zhao Liu
2024-10-18  7:55       ` Daniel P. Berrangé
2024-10-18  9:01         ` Zhao Liu
2024-10-17 16:19   ` Marcin Juszkiewicz
2024-10-18  4:26     ` Zhao Liu
2024-10-12 10:44 ` [PATCH v3 2/7] qapi/qom: Define cache enumeration and properties for machine Zhao Liu
2024-10-12 10:44 ` [PATCH v3 3/7] hw/core: Check smp cache topology support " Zhao Liu
2024-10-12 10:44 ` [PATCH v3 4/7] i386/cpu: Support thread and module level cache topology Zhao Liu
2024-10-12 10:44 ` [PATCH v3 5/7] i386/cpu: Update cache topology with machine's configuration Zhao Liu
2024-10-12 10:44 ` [PATCH v3 6/7] i386/pc: Support cache topology in -machine for PC machine Zhao Liu
2024-10-17 15:27   ` Daniel P. Berrangé
2024-10-18  3:57     ` Zhao Liu
2024-10-18  7:58       ` Daniel P. Berrangé
2024-10-18  9:03         ` Zhao Liu
2024-10-12 10:44 ` [PATCH v3 7/7] i386/cpu: add has_caches flag to check smp_cache configuration Zhao Liu
2024-10-17 13:16   ` Jonathan Cameron via
2024-10-17 13:19 ` [PATCH v3 0/7] Introduce SMP Cache Topology Jonathan Cameron via
     [not found] ` <20241017141402.0000135b@Huawei.com>
2024-10-17 15:01   ` Zhao Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).