* [PATCH 0/8] i386/cpu: Intel cache model & topo CPUID enhancement
@ 2025-06-26 8:30 Zhao Liu
2025-06-26 8:30 ` [PATCH 1/8] i386/cpu: Introduce cache model for SierraForest Zhao Liu
` (7 more replies)
0 siblings, 8 replies; 26+ messages in thread
From: Zhao Liu @ 2025-06-26 8:30 UTC (permalink / raw)
To: Paolo Bonzini, Daniel P . Berrangé, Igor Mammedov,
Eduardo Habkost
Cc: Ewan Hai, Jason Zeng, Xiaoyao Li, Tao Su, Yi Lai, Dapeng Mi,
Tejus GK, Manish Mishra, qemu-devel, Zhao Liu
Hi,
Since the last RFC, this is now the formal patch series, which adds cache
models for Intel CPUs (and Ewan's YongFeng).
This series is based on another series dedicated to cleaning up the
legacy cache models:
https://lore.kernel.org/qemu-devel/20250620092734.1576677-1-zhao1.liu@intel.com/
And this series focuses only on improvements to the named CPU models:
* Add cache model for Intel CPUs (and YongFeng).
* Enable 0x1f CPUID leaf for specific Intel CPUs, which already have
this leaf on host by default.
Change Log
==========
Changes since RFC (20250423114702.1529340-1-zhao1.liu@intel.com):
* Split CPUID fixes into another series.
* Since TDX was merged, rebase and rename 0x1f property to
"x-force-cpuid-0x1f". (Igor)
* Include the cache model for YongFeng from Ewan.
Intel Cache Model
=================
AMD CPU models have supported cache models for a long time, a feature
that started from Eduardo's idea [1].
Unfortunately, Intel CPU models do not have this yet, and I have received
feedback about it (from Tejus on the mailing list [2] and at KVM Forum,
and from Jason).
Additionally, now that QEMU's cache model has a clearly defined cache
topology, outdated cache models can easily raise more questions. For
example, the default legacy cache model's L3 is per die, but SPR's
real L3 is per socket, so users may question how the L3 topology changes
when multiple dies are created (discussed with Daniel in [3]).
So, in this series, I have added cache models for SRF, GNR, and SPR
(because these are the only machines I can find at the moment :-) ).
Note that the cache models are based on the Scalable Performance (SP)
version, and the Xeon Advanced Performance (AP) version may have
different cache sizes. However, SP is sufficient as the default cache
model baseline. In the future, I will consider adding additional
parameters to "smp-cache" to adjust cache sizes for different needs
(see the sketch after the references below).
[1]: https://lore.kernel.org/qemu-devel/20180320175427.GU3417@localhost.localdomain/
[2]: https://lore.kernel.org/qemu-devel/6766AC1F-96D1-41F0-AAEB-CE4158662A51@nutanix.com/
[3]: https://lore.kernel.org/qemu-devel/ZkTrsDdyGRFzVULG@redhat.com/
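As a rough sketch of that direction (hedged: the size knob does not exist
yet, so any size parameter here would be purely hypothetical), the new
versioned models from this series can already be combined with the existing
"smp-cache" machine property to pick the cache model and override the cache
topology, e.g. something like:

    -cpu SierraForest-v3 \
    -machine q35,smp-cache.0.cache=l3,smp-cache.0.topology=die

A future size parameter would then just be one more entry next to "cache"
and "topology" in each smp-cache element.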
0x1f CPUID by default (for some CPUs)
=====================================
Once the cache model can be clearly defined, another issue is the
topology.
Currently, the cache topology is actually tied to the CPU topology.
However, on recent Intel CPUs (starting from Cascade Lake-AP, the 2nd
generation Xeon Scalable [4]), CPU topology information is primarily
expressed via the 0x1f leaf.
Due to compatibility issues and historical reasons, the Guest's 0x1f
is not unconditionally exposed.
The discrepancy between having 0x1f on the Host but not on the Guest
does indeed cause problems (as Manish mentioned in [5]).
Manish and Xiaoyao (for TDX) both attempted to enable 0x1f by default
for Intel CPUs [6] [7], but following Igor's suggestion, it is more
appropriate to enable it by default only for certain CPU models [8].
So, since I am updating the CPU models anyway, I think it is time to
revisit the community's idea.
I enable the 0x1f leaf for SRF, GNR and SPR by default for better
emulation of the real silicon.
[4]: https://lore.kernel.org/qemu-devel/ZpoWskY4XE%2F98jss@intel.com/
[5]: https://lore.kernel.org/qemu-devel/PH0PR02MB738410511BF51B12DB09BE6CF6AC2@PH0PR02MB7384.namprd02.prod.outlook.com/
[6]: https://lore.kernel.org/qemu-devel/20240722101859.47408-1-manish.mishra@nutanix.com/
[7]: https://lore.kernel.org/qemu-devel/20240813033145.279307-1-xiaoyao.li@intel.com/
[8]: https://lore.kernel.org/qemu-devel/20240723170321.0ef780c5@imammedo.users.ipa.redhat.com/
[9]: https://lore.kernel.org/qemu-devel/20250401130205.2198253-34-xiaoyao.li@intel.com/
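For completeness, here is a minimal, hedged sketch (not part of this
series) of how a guest typically walks leaf 0x1f, which is the behavior
these patches aim to match; it only relies on the standard GCC/clang
<cpuid.h> helpers:

    #include <stdio.h>
    #include <cpuid.h>

    int main(void)
    {
        unsigned int eax, ebx, ecx, edx;

        /* Leaf 0x1f is only meaningful if the max basic leaf reaches it. */
        if (__get_cpuid_max(0, NULL) < 0x1f) {
            printf("CPUID leaf 0x1f is not exposed\n");
            return 1;
        }

        for (unsigned int subleaf = 0; ; subleaf++) {
            __cpuid_count(0x1f, subleaf, eax, ebx, ecx, edx);

            unsigned int level_type = (ecx >> 8) & 0xff;   /* 0 = invalid */
            if (!level_type) {
                break;
            }
            /* EAX[4:0]: APIC ID shift; EBX[15:0]: logical CPUs at level. */
            printf("subleaf %u: level type %u, shift %u, %u logical CPUs\n",
                   subleaf, level_type, eax & 0x1f, ebx & 0xffff);
        }
        return 0;
    }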
Thanks and Best Regards,
Zhao
---
Ewan Hai (1):
i386/cpu: Introduce cache model for YongFeng
Manish Mishra (1):
i386/cpu: Add a "x-force-cpuid-0x1f" property
Zhao Liu (6):
i386/cpu: Introduce cache model for SierraForest
i386/cpu: Introduce cache model for GraniteRapids
i386/cpu: Introduce cache model for SapphireRapids
i386/cpu: Enable 0x1f leaf for SierraForest by default
i386/cpu: Enable 0x1f leaf for GraniteRapids by default
i386/cpu: Enable 0x1f leaf for SapphireRapids by default
target/i386/cpu.c | 402 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 402 insertions(+)
--
2.34.1
^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH 1/8] i386/cpu: Introduce cache model for SierraForest
2025-06-26 8:30 [PATCH 0/8] i386/cpu: Intel cache model & topo CPUID enhancement Zhao Liu
@ 2025-06-26 8:30 ` Zhao Liu
2025-07-04 3:33 ` Mi, Dapeng
2025-07-07 0:57 ` Tao Su
2025-06-26 8:30 ` [PATCH 2/8] i386/cpu: Introduce cache model for GraniteRapids Zhao Liu
` (6 subsequent siblings)
7 siblings, 2 replies; 26+ messages in thread
From: Zhao Liu @ 2025-06-26 8:30 UTC (permalink / raw)
To: Paolo Bonzini, Daniel P . Berrangé, Igor Mammedov,
Eduardo Habkost
Cc: Ewan Hai, Jason Zeng, Xiaoyao Li, Tao Su, Yi Lai, Dapeng Mi,
Tejus GK, Manish Mishra, qemu-devel, Zhao Liu
Add the cache model to SierraForest (v3) to better emulate its
environment.
The cache model is based on SierraForest-SP (Scalable Performance):
--- cache 0 ---
cache type = data cache (1)
cache level = 0x1 (1)
self-initializing cache level = true
fully associative cache = false
maximum IDs for CPUs sharing cache = 0x0 (0)
maximum IDs for cores in pkg = 0x3f (63)
system coherency line size = 0x40 (64)
physical line partitions = 0x1 (1)
ways of associativity = 0x8 (8)
number of sets = 0x40 (64)
WBINVD/INVD acts on lower caches = false
inclusive to lower caches = false
complex cache indexing = false
number of sets (s) = 64
(size synth) = 32768 (32 KB)
--- cache 1 ---
cache type = instruction cache (2)
cache level = 0x1 (1)
self-initializing cache level = true
fully associative cache = false
maximum IDs for CPUs sharing cache = 0x0 (0)
maximum IDs for cores in pkg = 0x3f (63)
system coherency line size = 0x40 (64)
physical line partitions = 0x1 (1)
ways of associativity = 0x8 (8)
number of sets = 0x80 (128)
WBINVD/INVD acts on lower caches = false
inclusive to lower caches = false
complex cache indexing = false
number of sets (s) = 128
(size synth) = 65536 (64 KB)
--- cache 2 ---
cache type = unified cache (3)
cache level = 0x2 (2)
self-initializing cache level = true
fully associative cache = false
maximum IDs for CPUs sharing cache = 0x7 (7)
maximum IDs for cores in pkg = 0x3f (63)
system coherency line size = 0x40 (64)
physical line partitions = 0x1 (1)
ways of associativity = 0x10 (16)
number of sets = 0x1000 (4096)
WBINVD/INVD acts on lower caches = false
inclusive to lower caches = false
complex cache indexing = false
number of sets (s) = 4096
(size synth) = 4194304 (4 MB)
--- cache 3 ---
cache type = unified cache (3)
cache level = 0x3 (3)
self-initializing cache level = true
fully associative cache = false
maximum IDs for CPUs sharing cache = 0x1ff (511)
maximum IDs for cores in pkg = 0x3f (63)
system coherency line size = 0x40 (64)
physical line partitions = 0x1 (1)
ways of associativity = 0xc (12)
number of sets = 0x24000 (147456)
WBINVD/INVD acts on lower caches = false
inclusive to lower caches = false
complex cache indexing = true
number of sets (s) = 147456
(size synth) = 113246208 (108 MB)
--- cache 4 ---
cache type = no more caches (0)
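(For reference, a quick sanity check rather than anything normative: with
the raw CPUID 0x4 "+1" encodings already decoded as above, the synthesized
sizes follow

    size = line_size * partitions * ways * sets

e.g. L1d: 64 * 1 * 8 * 64 = 32 KiB, L2: 64 * 1 * 16 * 4096 = 4 MiB and
L3: 64 * 1 * 12 * 147456 = 108 MiB, which are exactly the .size values in
xeon_srf_cache_info below. The same relation holds for the GNR, SPR and
YongFeng dumps in the later patches.)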
Suggested-by: Tejus GK <tejus.gk@nutanix.com>
Suggested-by: Jason Zeng <jason.zeng@intel.com>
Suggested-by: "Daniel P . Berrangé" <berrange@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
target/i386/cpu.c | 96 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 96 insertions(+)
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 28e5b7859fef..fcaa2625b023 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -2883,6 +2883,97 @@ static const CPUCaches epyc_turin_cache_info = {
.no_invd_sharing = true,
.complex_indexing = false,
.share_level = CPU_TOPOLOGY_LEVEL_DIE,
+ }
+};
+
+static const CPUCaches xeon_srf_cache_info = {
+ .l1d_cache = &(CPUCacheInfo) {
+ /* CPUID 0x4.0x0.EAX */
+ .type = DATA_CACHE,
+ .level = 1,
+ .self_init = true,
+
+ /* CPUID 0x4.0x0.EBX */
+ .line_size = 64,
+ .partitions = 1,
+ .associativity = 8,
+
+ /* CPUID 0x4.0x0.ECX */
+ .sets = 64,
+
+ /* CPUID 0x4.0x0.EDX */
+ .no_invd_sharing = false,
+ .inclusive = false,
+ .complex_indexing = false,
+
+ .size = 32 * KiB,
+ .share_level = CPU_TOPOLOGY_LEVEL_CORE,
+ },
+ .l1i_cache = &(CPUCacheInfo) {
+ /* CPUID 0x4.0x1.EAX */
+ .type = INSTRUCTION_CACHE,
+ .level = 1,
+ .self_init = true,
+
+ /* CPUID 0x4.0x1.EBX */
+ .line_size = 64,
+ .partitions = 1,
+ .associativity = 8,
+
+ /* CPUID 0x4.0x1.ECX */
+ .sets = 128,
+
+ /* CPUID 0x4.0x1.EDX */
+ .no_invd_sharing = false,
+ .inclusive = false,
+ .complex_indexing = false,
+
+ .size = 64 * KiB,
+ .share_level = CPU_TOPOLOGY_LEVEL_CORE,
+ },
+ .l2_cache = &(CPUCacheInfo) {
+ /* CPUID 0x4.0x2.EAX */
+ .type = UNIFIED_CACHE,
+ .level = 2,
+ .self_init = true,
+
+ /* CPUID 0x4.0x2.EBX */
+ .line_size = 64,
+ .partitions = 1,
+ .associativity = 16,
+
+ /* CPUID 0x4.0x2.ECX */
+ .sets = 4096,
+
+ /* CPUID 0x4.0x2.EDX */
+ .no_invd_sharing = false,
+ .inclusive = false,
+ .complex_indexing = false,
+
+ .size = 4 * MiB,
+ .share_level = CPU_TOPOLOGY_LEVEL_MODULE,
+ },
+ .l3_cache = &(CPUCacheInfo) {
+ /* CPUID 0x4.0x3.EAX */
+ .type = UNIFIED_CACHE,
+ .level = 3,
+ .self_init = true,
+
+ /* CPUID 0x4.0x3.EBX */
+ .line_size = 64,
+ .partitions = 1,
+ .associativity = 12,
+
+ /* CPUID 0x4.0x3.ECX */
+ .sets = 147456,
+
+ /* CPUID 0x4.0x3.EDX */
+ .no_invd_sharing = false,
+ .inclusive = false,
+ .complex_indexing = true,
+
+ .size = 108 * MiB,
+ .share_level = CPU_TOPOLOGY_LEVEL_SOCKET,
},
};
@@ -5008,6 +5099,11 @@ static const X86CPUDefinition builtin_x86_defs[] = {
{ /* end of list */ }
}
},
+ {
+ .version = 3,
+ .note = "with srf-sp cache model",
+ .cache_info = &xeon_srf_cache_info,
+ },
{ /* end of list */ },
},
},
--
2.34.1
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH 2/8] i386/cpu: Introduce cache model for GraniteRapids
2025-06-26 8:30 [PATCH 0/8] i386/cpu: Intel cache model & topo CPUID enhancement Zhao Liu
2025-06-26 8:30 ` [PATCH 1/8] i386/cpu: Introduce cache model for SierraForest Zhao Liu
@ 2025-06-26 8:30 ` Zhao Liu
2025-07-04 3:34 ` Mi, Dapeng
2025-07-07 0:58 ` Tao Su
2025-06-26 8:31 ` [PATCH 3/8] i386/cpu: Introduce cache model for SapphireRapids Zhao Liu
` (5 subsequent siblings)
7 siblings, 2 replies; 26+ messages in thread
From: Zhao Liu @ 2025-06-26 8:30 UTC (permalink / raw)
To: Paolo Bonzini, Daniel P . Berrangé, Igor Mammedov,
Eduardo Habkost
Cc: Ewan Hai, Jason Zeng, Xiaoyao Li, Tao Su, Yi Lai, Dapeng Mi,
Tejus GK, Manish Mishra, qemu-devel, Zhao Liu
Add the cache model to GraniteRapids (v3) to better emulate its
environment.
The cache model is based on GraniteRapids-SP (Scalable Performance):
--- cache 0 ---
cache type = data cache (1)
cache level = 0x1 (1)
self-initializing cache level = true
fully associative cache = false
maximum IDs for CPUs sharing cache = 0x1 (1)
maximum IDs for cores in pkg = 0x3f (63)
system coherency line size = 0x40 (64)
physical line partitions = 0x1 (1)
ways of associativity = 0xc (12)
number of sets = 0x40 (64)
WBINVD/INVD acts on lower caches = false
inclusive to lower caches = false
complex cache indexing = false
number of sets (s) = 64
(size synth) = 49152 (48 KB)
--- cache 1 ---
cache type = instruction cache (2)
cache level = 0x1 (1)
self-initializing cache level = true
fully associative cache = false
maximum IDs for CPUs sharing cache = 0x1 (1)
maximum IDs for cores in pkg = 0x3f (63)
system coherency line size = 0x40 (64)
physical line partitions = 0x1 (1)
ways of associativity = 0x10 (16)
number of sets = 0x40 (64)
WBINVD/INVD acts on lower caches = false
inclusive to lower caches = false
complex cache indexing = false
number of sets (s) = 64
(size synth) = 65536 (64 KB)
--- cache 2 ---
cache type = unified cache (3)
cache level = 0x2 (2)
self-initializing cache level = true
fully associative cache = false
maximum IDs for CPUs sharing cache = 0x1 (1)
maximum IDs for cores in pkg = 0x3f (63)
system coherency line size = 0x40 (64)
physical line partitions = 0x1 (1)
ways of associativity = 0x10 (16)
number of sets = 0x800 (2048)
WBINVD/INVD acts on lower caches = false
inclusive to lower caches = false
complex cache indexing = false
number of sets (s) = 2048
(size synth) = 2097152 (2 MB)
--- cache 3 ---
cache type = unified cache (3)
cache level = 0x3 (3)
self-initializing cache level = true
fully associative cache = false
maximum IDs for CPUs sharing cache = 0xff (255)
maximum IDs for cores in pkg = 0x3f (63)
system coherency line size = 0x40 (64)
physical line partitions = 0x1 (1)
ways of associativity = 0x10 (16)
number of sets = 0x48000 (294912)
WBINVD/INVD acts on lower caches = false
inclusive to lower caches = false
complex cache indexing = true
number of sets (s) = 294912
(size synth) = 301989888 (288 MB)
--- cache 4 ---
cache type = no more caches (0)
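(A brief, hedged note on how the sharing scopes below follow from this
dump: "maximum IDs for CPUs sharing cache" is the maximum number of
addressable logical-CPU IDs sharing the cache, minus one. For L1/L2 the
value 0x1 allows 2 IDs, i.e. one SMT core, hence CPU_TOPOLOGY_LEVEL_CORE;
for the L3 the value 0xff allows 256 IDs, enough for the whole package on
this SKU, hence CPU_TOPOLOGY_LEVEL_SOCKET.)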
Suggested-by: Tejus GK <tejus.gk@nutanix.com>
Suggested-by: Jason Zeng <jason.zeng@intel.com>
Suggested-by: "Daniel P . Berrangé" <berrange@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
target/i386/cpu.c | 96 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 96 insertions(+)
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index fcaa2625b023..b40f1a5b6648 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -2886,6 +2886,97 @@ static const CPUCaches epyc_turin_cache_info = {
}
};
+static const CPUCaches xeon_gnr_cache_info = {
+ .l1d_cache = &(CPUCacheInfo) {
+ /* CPUID 0x4.0x0.EAX */
+ .type = DATA_CACHE,
+ .level = 1,
+ .self_init = true,
+
+ /* CPUID 0x4.0x0.EBX */
+ .line_size = 64,
+ .partitions = 1,
+ .associativity = 12,
+
+ /* CPUID 0x4.0x0.ECX */
+ .sets = 64,
+
+ /* CPUID 0x4.0x0.EDX */
+ .no_invd_sharing = false,
+ .inclusive = false,
+ .complex_indexing = false,
+
+ .size = 48 * KiB,
+ .share_level = CPU_TOPOLOGY_LEVEL_CORE,
+ },
+ .l1i_cache = &(CPUCacheInfo) {
+ /* CPUID 0x4.0x1.EAX */
+ .type = INSTRUCTION_CACHE,
+ .level = 1,
+ .self_init = true,
+
+ /* CPUID 0x4.0x1.EBX */
+ .line_size = 64,
+ .partitions = 1,
+ .associativity = 16,
+
+ /* CPUID 0x4.0x1.ECX */
+ .sets = 64,
+
+ /* CPUID 0x4.0x1.EDX */
+ .no_invd_sharing = false,
+ .inclusive = false,
+ .complex_indexing = false,
+
+ .size = 64 * KiB,
+ .share_level = CPU_TOPOLOGY_LEVEL_CORE,
+ },
+ .l2_cache = &(CPUCacheInfo) {
+ /* CPUID 0x4.0x2.EAX */
+ .type = UNIFIED_CACHE,
+ .level = 2,
+ .self_init = true,
+
+ /* CPUID 0x4.0x2.EBX */
+ .line_size = 64,
+ .partitions = 1,
+ .associativity = 16,
+
+ /* CPUID 0x4.0x2.ECX */
+ .sets = 2048,
+
+ /* CPUID 0x4.0x2.EDX */
+ .no_invd_sharing = false,
+ .inclusive = false,
+ .complex_indexing = false,
+
+ .size = 2 * MiB,
+ .share_level = CPU_TOPOLOGY_LEVEL_CORE,
+ },
+ .l3_cache = &(CPUCacheInfo) {
+ /* CPUID 0x4.0x3.EAX */
+ .type = UNIFIED_CACHE,
+ .level = 3,
+ .self_init = true,
+
+ /* CPUID 0x4.0x3.EBX */
+ .line_size = 64,
+ .partitions = 1,
+ .associativity = 16,
+
+ /* CPUID 0x4.0x3.ECX */
+ .sets = 294912,
+
+ /* CPUID 0x4.0x3.EDX */
+ .no_invd_sharing = false,
+ .inclusive = false,
+ .complex_indexing = true,
+
+ .size = 288 * MiB,
+ .share_level = CPU_TOPOLOGY_LEVEL_SOCKET,
+ },
+};
+
static const CPUCaches xeon_srf_cache_info = {
.l1d_cache = &(CPUCacheInfo) {
/* CPUID 0x4.0x0.EAX */
@@ -4954,6 +5045,11 @@ static const X86CPUDefinition builtin_x86_defs[] = {
{ /* end of list */ }
}
},
+ {
+ .version = 3,
+ .note = "with gnr-sp cache model",
+ .cache_info = &xeon_gnr_cache_info,
+ },
{ /* end of list */ },
},
},
--
2.34.1
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH 3/8] i386/cpu: Introduce cache model for SapphireRapids
2025-06-26 8:30 [PATCH 0/8] i386/cpu: Intel cache model & topo CPUID enhancement Zhao Liu
2025-06-26 8:30 ` [PATCH 1/8] i386/cpu: Introduce cache model for SierraForest Zhao Liu
2025-06-26 8:30 ` [PATCH 2/8] i386/cpu: Introduce cache model for GraniteRapids Zhao Liu
@ 2025-06-26 8:31 ` Zhao Liu
2025-07-04 3:35 ` Mi, Dapeng
2025-07-07 0:58 ` Tao Su
2025-06-26 8:31 ` [PATCH 4/8] i386/cpu: Introduce cache model for YongFeng Zhao Liu
` (4 subsequent siblings)
7 siblings, 2 replies; 26+ messages in thread
From: Zhao Liu @ 2025-06-26 8:31 UTC (permalink / raw)
To: Paolo Bonzini, Daniel P . Berrangé, Igor Mammedov,
Eduardo Habkost
Cc: Ewan Hai, Jason Zeng, Xiaoyao Li, Tao Su, Yi Lai, Dapeng Mi,
Tejus GK, Manish Mishra, qemu-devel, Zhao Liu
Add the cache model to SapphireRapids (v4) to better emulate its
environment.
The cache model is based on SapphireRapids-SP (Scalable Performance):
--- cache 0 ---
cache type = data cache (1)
cache level = 0x1 (1)
self-initializing cache level = true
fully associative cache = false
maximum IDs for CPUs sharing cache = 0x1 (1)
maximum IDs for cores in pkg = 0x3f (63)
system coherency line size = 0x40 (64)
physical line partitions = 0x1 (1)
ways of associativity = 0xc (12)
number of sets = 0x40 (64)
WBINVD/INVD acts on lower caches = false
inclusive to lower caches = false
complex cache indexing = false
number of sets (s) = 64
(size synth) = 49152 (48 KB)
--- cache 1 ---
cache type = instruction cache (2)
cache level = 0x1 (1)
self-initializing cache level = true
fully associative cache = false
maximum IDs for CPUs sharing cache = 0x1 (1)
maximum IDs for cores in pkg = 0x3f (63)
system coherency line size = 0x40 (64)
physical line partitions = 0x1 (1)
ways of associativity = 0x8 (8)
number of sets = 0x40 (64)
WBINVD/INVD acts on lower caches = false
inclusive to lower caches = false
complex cache indexing = false
number of sets (s) = 64
(size synth) = 32768 (32 KB)
--- cache 2 ---
cache type = unified cache (3)
cache level = 0x2 (2)
self-initializing cache level = true
fully associative cache = false
maximum IDs for CPUs sharing cache = 0x1 (1)
maximum IDs for cores in pkg = 0x3f (63)
system coherency line size = 0x40 (64)
physical line partitions = 0x1 (1)
ways of associativity = 0x10 (16)
number of sets = 0x800 (2048)
WBINVD/INVD acts on lower caches = false
inclusive to lower caches = false
complex cache indexing = false
number of sets (s) = 2048
(size synth) = 2097152 (2 MB)
--- cache 3 ---
cache type = unified cache (3)
cache level = 0x3 (3)
self-initializing cache level = true
fully associative cache = false
maximum IDs for CPUs sharing cache = 0x7f (127)
maximum IDs for cores in pkg = 0x3f (63)
system coherency line size = 0x40 (64)
physical line partitions = 0x1 (1)
ways of associativity = 0xf (15)
number of sets = 0x10000 (65536)
WBINVD/INVD acts on lower caches = false
inclusive to lower caches = false
complex cache indexing = true
number of sets (s) = 65536
(size synth) = 62914560 (60 MB)
--- cache 4 ---
cache type = no more caches (0)
Suggested-by: Tejus GK <tejus.gk@nutanix.com>
Suggested-by: Jason Zeng <jason.zeng@intel.com>
Suggested-by: "Daniel P . Berrangé" <berrange@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
target/i386/cpu.c | 96 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 96 insertions(+)
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index b40f1a5b6648..a7f2e5dd3fcb 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -2886,6 +2886,97 @@ static const CPUCaches epyc_turin_cache_info = {
}
};
+static const CPUCaches xeon_spr_cache_info = {
+ .l1d_cache = &(CPUCacheInfo) {
+ /* CPUID 0x4.0x0.EAX */
+ .type = DATA_CACHE,
+ .level = 1,
+ .self_init = true,
+
+ /* CPUID 0x4.0x0.EBX */
+ .line_size = 64,
+ .partitions = 1,
+ .associativity = 12,
+
+ /* CPUID 0x4.0x0.ECX */
+ .sets = 64,
+
+ /* CPUID 0x4.0x0.EDX */
+ .no_invd_sharing = false,
+ .inclusive = false,
+ .complex_indexing = false,
+
+ .size = 48 * KiB,
+ .share_level = CPU_TOPOLOGY_LEVEL_CORE,
+ },
+ .l1i_cache = &(CPUCacheInfo) {
+ /* CPUID 0x4.0x1.EAX */
+ .type = INSTRUCTION_CACHE,
+ .level = 1,
+ .self_init = true,
+
+ /* CPUID 0x4.0x1.EBX */
+ .line_size = 64,
+ .partitions = 1,
+ .associativity = 8,
+
+ /* CPUID 0x4.0x1.ECX */
+ .sets = 64,
+
+ /* CPUID 0x4.0x1.EDX */
+ .no_invd_sharing = false,
+ .inclusive = false,
+ .complex_indexing = false,
+
+ .size = 32 * KiB,
+ .share_level = CPU_TOPOLOGY_LEVEL_CORE,
+ },
+ .l2_cache = &(CPUCacheInfo) {
+ /* CPUID 0x4.0x2.EAX */
+ .type = UNIFIED_CACHE,
+ .level = 2,
+ .self_init = true,
+
+ /* CPUID 0x4.0x2.EBX */
+ .line_size = 64,
+ .partitions = 1,
+ .associativity = 16,
+
+ /* CPUID 0x4.0x2.ECX */
+ .sets = 2048,
+
+ /* CPUID 0x4.0x2.EDX */
+ .no_invd_sharing = false,
+ .inclusive = false,
+ .complex_indexing = false,
+
+ .size = 2 * MiB,
+ .share_level = CPU_TOPOLOGY_LEVEL_CORE,
+ },
+ .l3_cache = &(CPUCacheInfo) {
+ /* CPUID 0x4.0x3.EAX */
+ .type = UNIFIED_CACHE,
+ .level = 3,
+ .self_init = true,
+
+ /* CPUID 0x4.0x3.EBX */
+ .line_size = 64,
+ .partitions = 1,
+ .associativity = 15,
+
+ /* CPUID 0x4.0x3.ECX */
+ .sets = 65536,
+
+ /* CPUID 0x4.0x3.EDX */
+ .no_invd_sharing = false,
+ .inclusive = false,
+ .complex_indexing = true,
+
+ .size = 60 * MiB,
+ .share_level = CPU_TOPOLOGY_LEVEL_SOCKET,
+ },
+};
+
static const CPUCaches xeon_gnr_cache_info = {
.l1d_cache = &(CPUCacheInfo) {
/* CPUID 0x4.0x0.EAX */
@@ -4892,6 +4983,11 @@ static const X86CPUDefinition builtin_x86_defs[] = {
{ /* end of list */ }
}
},
+ {
+ .version = 4,
+ .note = "with spr-sp cache model",
+ .cache_info = &xeon_spr_cache_info,
+ },
{ /* end of list */ }
}
},
--
2.34.1
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH 4/8] i386/cpu: Introduce cache model for YongFeng
2025-06-26 8:30 [PATCH 0/8] i386/cpu: Intel cache model & topo CPUID enhancement Zhao Liu
` (2 preceding siblings ...)
2025-06-26 8:31 ` [PATCH 3/8] i386/cpu: Introduce cache model for SapphireRapids Zhao Liu
@ 2025-06-26 8:31 ` Zhao Liu
2025-06-29 9:47 ` Ewan Hai
2025-06-26 8:31 ` [PATCH 5/8] i386/cpu: Add a "x-force-cpuid-0x1f" property Zhao Liu
` (3 subsequent siblings)
7 siblings, 1 reply; 26+ messages in thread
From: Zhao Liu @ 2025-06-26 8:31 UTC (permalink / raw)
To: Paolo Bonzini, Daniel P . Berrangé, Igor Mammedov,
Eduardo Habkost
Cc: Ewan Hai, Jason Zeng, Xiaoyao Li, Tao Su, Yi Lai, Dapeng Mi,
Tejus GK, Manish Mishra, qemu-devel, Zhao Liu
From: Ewan Hai <ewanhai-oc@zhaoxin.com>
Add the cache model to YongFeng (v3) to better emulate its
environment.
Note that although YongFeng v2 was added after v10.0, it was also
backported to v10.0.2. Therefore, a new version (v3) is needed to avoid
conflicts.
The cache model is as follows:
--- cache 0 ---
cache type = data cache (1)
cache level = 0x1 (1)
self-initializing cache level = true
fully associative cache = false
maximum IDs for CPUs sharing cache = 0x0 (0)
maximum IDs for cores in pkg = 0x0 (0)
system coherency line size = 0x40 (64)
physical line partitions = 0x1 (1)
ways of associativity = 0x8 (8)
number of sets = 0x40 (64)
WBINVD/INVD acts on lower caches = false
inclusive to lower caches = false
complex cache indexing = false
number of sets (s) = 64
(size synth) = 32768 (32 KB)
--- cache 1 ---
cache type = instruction cache (2)
cache level = 0x1 (1)
self-initializing cache level = true
fully associative cache = false
maximum IDs for CPUs sharing cache = 0x0 (0)
maximum IDs for cores in pkg = 0x0 (0)
system coherency line size = 0x40 (64)
physical line partitions = 0x1 (1)
ways of associativity = 0x10 (16)
number of sets = 0x40 (64)
WBINVD/INVD acts on lower caches = false
inclusive to lower caches = false
complex cache indexing = false
number of sets (s) = 64
(size synth) = 65536 (64 KB)
--- cache 2 ---
cache type = unified cache (3)
cache level = 0x2 (2)
self-initializing cache level = true
fully associative cache = false
maximum IDs for CPUs sharing cache = 0x0 (0)
maximum IDs for cores in pkg = 0x0 (0)
system coherency line size = 0x40 (64)
physical line partitions = 0x1 (1)
ways of associativity = 0x8 (8)
number of sets = 0x200 (512)
WBINVD/INVD acts on lower caches = false
inclusive to lower caches = true
complex cache indexing = false
number of sets (s) = 512
(size synth) = 262144 (256 KB)
--- cache 3 ---
cache type = unified cache (3)
cache level = 0x3 (3)
self-initializing cache level = true
fully associative cache = false
maximum IDs for CPUs sharing cache = 0x0 (0)
maximum IDs for cores in pkg = 0x0 (0)
system coherency line size = 0x40 (64)
physical line partitions = 0x1 (1)
ways of associativity = 0x10 (16)
number of sets = 0x2000 (8192)
WBINVD/INVD acts on lower caches = true
inclusive to lower caches = true
complex cache indexing = false
number of sets (s) = 8192
(size synth) = 8388608 (8 MB)
--- cache 4 ---
cache type = no more caches (0)
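(A hedged cross-check against the AMD-style leaves this model also fills
in: using the leaf 0x80000005 packing in target/i386/cpu.c, also quoted
later in this thread, i.e. (size_KiB << 24) | (assoc << 16) |
(lines_per_tag << 8) | line_size, the L1d above encodes to 0x20080140 and
the L1i to 0x40100140, in ECX and EDX respectively.)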
Signed-off-by: Ewan Hai <ewanhai-oc@zhaoxin.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Changes to the original code:
* Rearrange cache model fields to make them easier to check.
* Add an explanation of why v3 is needed.
* Drop the lines_per_tag field for L2 & L3.
---
target/i386/cpu.c | 104 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 104 insertions(+)
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index a7f2e5dd3fcb..08c84ba90f52 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -3159,6 +3159,105 @@ static const CPUCaches xeon_srf_cache_info = {
},
};
+static const CPUCaches yongfeng_cache_info = {
+ .l1d_cache = &(CPUCacheInfo) {
+ /* CPUID 0x4.0x0.EAX */
+ .type = DATA_CACHE,
+ .level = 1,
+ .self_init = true,
+
+ /* CPUID 0x4.0x0.EBX */
+ .line_size = 64,
+ .partitions = 1,
+ .associativity = 8,
+
+ /* CPUID 0x4.0x0.ECX */
+ .sets = 64,
+
+ /* CPUID 0x4.0x0.EDX */
+ .no_invd_sharing = false,
+ .inclusive = false,
+ .complex_indexing = false,
+
+ /* CPUID 0x80000005.ECX */
+ .lines_per_tag = 1,
+ .size = 32 * KiB,
+
+ .share_level = CPU_TOPOLOGY_LEVEL_CORE,
+ },
+ .l1i_cache = &(CPUCacheInfo) {
+ /* CPUID 0x4.0x1.EAX */
+ .type = INSTRUCTION_CACHE,
+ .level = 1,
+ .self_init = true,
+
+ /* CPUID 0x4.0x1.EBX */
+ .line_size = 64,
+ .partitions = 1,
+ .associativity = 16,
+
+ /* CPUID 0x4.0x1.ECX */
+ .sets = 64,
+
+ /* CPUID 0x4.0x1.EDX */
+ .no_invd_sharing = false,
+ .inclusive = false,
+ .complex_indexing = false,
+
+ /* CPUID 0x80000005.EDX */
+ .lines_per_tag = 1,
+ .size = 64 * KiB,
+
+ .share_level = CPU_TOPOLOGY_LEVEL_CORE,
+ },
+ .l2_cache = &(CPUCacheInfo) {
+ /* CPUID 0x4.0x2.EAX */
+ .type = UNIFIED_CACHE,
+ .level = 2,
+ .self_init = true,
+
+ /* CPUID 0x4.0x2.EBX */
+ .line_size = 64,
+ .partitions = 1,
+ .associativity = 8,
+
+ /* CPUID 0x4.0x2.ECX */
+ .sets = 512,
+
+ /* CPUID 0x4.0x2.EDX */
+ .no_invd_sharing = false,
+ .inclusive = true,
+ .complex_indexing = false,
+
+ /* CPUID 0x80000006.ECX */
+ .size = 256 * KiB,
+
+ .share_level = CPU_TOPOLOGY_LEVEL_CORE,
+ },
+ .l3_cache = &(CPUCacheInfo) {
+ /* CPUID 0x4.0x3.EAX */
+ .type = UNIFIED_CACHE,
+ .level = 3,
+ .self_init = true,
+
+ /* CPUID 0x4.0x3.EBX */
+ .line_size = 64,
+ .partitions = 1,
+ .associativity = 16,
+
+ /* CPUID 0x4.0x3.ECX */
+ .sets = 8192,
+
+ /* CPUID 0x4.0x3.EDX */
+ .no_invd_sharing = true,
+ .inclusive = true,
+ .complex_indexing = false,
+
+ .size = 8 * MiB,
+ .share_level = CPU_TOPOLOGY_LEVEL_DIE,
+ },
+};
+
/* The following VMX features are not supported by KVM and are left out in the
* CPU definitions:
*
@@ -6438,6 +6537,11 @@ static const X86CPUDefinition builtin_x86_defs[] = {
{ /* end of list */ }
}
},
+ {
+ .version = 3,
+ .note = "with the cache info",
+ .cache_info = &yongfeng_cache_info
+ },
{ /* end of list */ }
}
},
--
2.34.1
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH 5/8] i386/cpu: Add a "x-force-cpuid-0x1f" property
2025-06-26 8:30 [PATCH 0/8] i386/cpu: Intel cache model & topo CPUID enhancement Zhao Liu
` (3 preceding siblings ...)
2025-06-26 8:31 ` [PATCH 4/8] i386/cpu: Introduce cache model for YongFeng Zhao Liu
@ 2025-06-26 8:31 ` Zhao Liu
2025-06-26 12:07 ` Ewan Hai
2025-07-04 3:38 ` Mi, Dapeng
2025-06-26 8:31 ` [PATCH 6/8] i386/cpu: Enable 0x1f leaf for SierraForest by default Zhao Liu
` (2 subsequent siblings)
7 siblings, 2 replies; 26+ messages in thread
From: Zhao Liu @ 2025-06-26 8:31 UTC (permalink / raw)
To: Paolo Bonzini, Daniel P . Berrangé, Igor Mammedov,
Eduardo Habkost
Cc: Ewan Hai, Jason Zeng, Xiaoyao Li, Tao Su, Yi Lai, Dapeng Mi,
Tejus GK, Manish Mishra, qemu-devel, Zhao Liu
From: Manish Mishra <manish.mishra@nutanix.com>
Add a "x-force-cpuid-0x1f" property so that CPU models can enable it and
have 0x1f CPUID leaf natually as the Host CPU.
The advantage is that when the CPU model's cache model is already
consistent with the Host CPU (for example, SRF defaults to L2 per
module & L3 per package), 0x1f can better help users identify the
topology in the VM.
Adding 0x1f for specific CPU models should not cause any trouble in
principle. This property is only enabled for CPU models that already
have the 0x1f leaf on the Host, so software that runs normally on the
Host won't encounter issues in a Guest with the corresponding CPU
model. Conversely, some software that relies on checking 0x1f might
have problems in the Guest due to the lack of 0x1f [*]. In
summary, adding 0x1f is also intended to further emulate the Host CPU
environment.
[*]: https://lore.kernel.org/qemu-devel/PH0PR02MB738410511BF51B12DB09BE6CF6AC2@PH0PR02MB7384.namprd02.prod.outlook.com/
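As a usage sketch (the property name is the one introduced here, combined
with standard "-cpu" property syntax), the leaf can also be forced on
manually for a model that does not enable it by default:

    -cpu SierraForest,x-force-cpuid-0x1f=on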
Signed-off-by: Manish Mishra <manish.mishra@nutanix.com>
Co-authored-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
[Integrated and rebased 2 previous patches (ordered by post time)]
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Note:
This patch integrates the ideas from 2 previously posted patches (ordered
by post time) [1] [2], following the s-o-b policy of "Re-starting
abandoned work" in docs/devel/code-provenance.rst.
[1]: From Manish: https://lore.kernel.org/qemu-devel/20240722101859.47408-1-manish.mishra@nutanix.com/
[2]: From Xiaoyao: https://lore.kernel.org/qemu-devel/20240813033145.279307-1-xiaoyao.li@intel.com/
---
Changes since RFC:
* Rebase and rename the property as "x-force-cpuid-0x1f". (Igor)
---
target/i386/cpu.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 08c84ba90f52..ee36f7ee2ccc 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -9934,6 +9934,7 @@ static const Property x86_cpu_properties[] = {
DEFINE_PROP_BOOL("x-intel-pt-auto-level", X86CPU, intel_pt_auto_level,
true),
DEFINE_PROP_BOOL("x-l1-cache-per-thread", X86CPU, l1_cache_per_core, true),
+ DEFINE_PROP_BOOL("x-force-cpuid-0x1f", X86CPU, force_cpuid_0x1f, false),
};
#ifndef CONFIG_USER_ONLY
--
2.34.1
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH 6/8] i386/cpu: Enable 0x1f leaf for SierraForest by default
2025-06-26 8:30 [PATCH 0/8] i386/cpu: Intel cache model & topo CPUID enhancement Zhao Liu
` (4 preceding siblings ...)
2025-06-26 8:31 ` [PATCH 5/8] i386/cpu: Add a "x-force-cpuid-0x1f" property Zhao Liu
@ 2025-06-26 8:31 ` Zhao Liu
2025-07-04 3:45 ` Mi, Dapeng
2025-06-26 8:31 ` [PATCH 7/8] i386/cpu: Enable 0x1f leaf for GraniteRapids " Zhao Liu
2025-06-26 8:31 ` [PATCH 8/8] i386/cpu: Enable 0x1f leaf for SapphireRapids " Zhao Liu
7 siblings, 1 reply; 26+ messages in thread
From: Zhao Liu @ 2025-06-26 8:31 UTC (permalink / raw)
To: Paolo Bonzini, Daniel P . Berrangé, Igor Mammedov,
Eduardo Habkost
Cc: Ewan Hai, Jason Zeng, Xiaoyao Li, Tao Su, Yi Lai, Dapeng Mi,
Tejus GK, Manish Mishra, qemu-devel, Zhao Liu
The Host SierraForest CPU has the 0x1f leaf by default, so enable it for
the Guest CPU by default as well.
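One hedged way to verify this from inside the guest (assuming the cpuid(1)
tool that produced the dumps earlier in this series) is to run something
like:

    cpuid -1 -l 0x1f -r

and check that the subleaves report a non-zero level type in ECX[15:8].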
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Changes since RFC:
* Rename the property to "x-force-cpuid-0x1f". (Igor)
---
target/i386/cpu.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index ee36f7ee2ccc..70f8fc37f8e0 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5392,8 +5392,11 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.version = 3,
- .note = "with srf-sp cache model",
+ .note = "with srf-sp cache model and 0x1f leaf",
.cache_info = &xeon_srf_cache_info,
+ .props = (PropValue[]) {
+ { "x-force-cpuid-0x1f", "on" },
+ }
},
{ /* end of list */ },
},
--
2.34.1
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH 7/8] i386/cpu: Enable 0x1f leaf for GraniteRapids by default
2025-06-26 8:30 [PATCH 0/8] i386/cpu: Intel cache model & topo CPUID enhancement Zhao Liu
` (5 preceding siblings ...)
2025-06-26 8:31 ` [PATCH 6/8] i386/cpu: Enable 0x1f leaf for SierraForest by default Zhao Liu
@ 2025-06-26 8:31 ` Zhao Liu
2025-07-04 3:47 ` Mi, Dapeng
2025-06-26 8:31 ` [PATCH 8/8] i386/cpu: Enable 0x1f leaf for SapphireRapids " Zhao Liu
7 siblings, 1 reply; 26+ messages in thread
From: Zhao Liu @ 2025-06-26 8:31 UTC (permalink / raw)
To: Paolo Bonzini, Daniel P . Berrangé, Igor Mammedov,
Eduardo Habkost
Cc: Ewan Hai, Jason Zeng, Xiaoyao Li, Tao Su, Yi Lai, Dapeng Mi,
Tejus GK, Manish Mishra, qemu-devel, Zhao Liu
The Host GraniteRapids CPU has the 0x1f leaf by default, so enable it for
the Guest CPU by default as well.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Changes since RFC:
* Rename the property to "x-force-cpuid-0x1f". (Igor)
---
target/i386/cpu.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 70f8fc37f8e0..acf7e0de184d 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5242,8 +5242,11 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.version = 3,
- .note = "with gnr-sp cache model",
+ .note = "with gnr-sp cache model and 0x1f leaf",
.cache_info = &xeon_gnr_cache_info,
+ .props = (PropValue[]) {
+ { "x-force-cpuid-0x1f", "on" },
+ }
},
{ /* end of list */ },
},
--
2.34.1
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH 8/8] i386/cpu: Enable 0x1f leaf for SapphireRapids by default
2025-06-26 8:30 [PATCH 0/8] i386/cpu: Intel cache model & topo CPUID enhancement Zhao Liu
` (6 preceding siblings ...)
2025-06-26 8:31 ` [PATCH 7/8] i386/cpu: Enable 0x1f leaf for GraniteRapids " Zhao Liu
@ 2025-06-26 8:31 ` Zhao Liu
2025-07-04 3:48 ` Mi, Dapeng
7 siblings, 1 reply; 26+ messages in thread
From: Zhao Liu @ 2025-06-26 8:31 UTC (permalink / raw)
To: Paolo Bonzini, Daniel P . Berrangé, Igor Mammedov,
Eduardo Habkost
Cc: Ewan Hai, Jason Zeng, Xiaoyao Li, Tao Su, Yi Lai, Dapeng Mi,
Tejus GK, Manish Mishra, qemu-devel, Zhao Liu
The Host SapphireRapids CPU has the 0x1f leaf by default, so enable it for
the Guest CPU by default as well.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Changes since RFC:
* Rename the property to "x-force-cpuid-0x1f". (Igor)
---
target/i386/cpu.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index acf7e0de184d..c7f157a0f71c 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5084,8 +5084,11 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.version = 4,
- .note = "with spr-sp cache model",
+ .note = "with spr-sp cache model and 0x1f leaf",
.cache_info = &xeon_spr_cache_info,
+ .props = (PropValue[]) {
+ { "x-force-cpuid-0x1f", "on" },
+ }
},
{ /* end of list */ }
}
--
2.34.1
^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH 5/8] i386/cpu: Add a "x-force-cpuid-0x1f" property
2025-06-26 8:31 ` [PATCH 5/8] i386/cpu: Add a "x-force-cpuid-0x1f" property Zhao Liu
@ 2025-06-26 12:07 ` Ewan Hai
2025-06-27 3:05 ` Zhao Liu
2025-07-04 3:38 ` Mi, Dapeng
1 sibling, 1 reply; 26+ messages in thread
From: Ewan Hai @ 2025-06-26 12:07 UTC (permalink / raw)
To: Zhao Liu, Paolo Bonzini, Daniel P . Berrangé, Igor Mammedov,
Eduardo Habkost
Cc: Jason Zeng, Xiaoyao Li, Tao Su, Yi Lai, Dapeng Mi, Tejus GK,
Manish Mishra, qemu-devel
On 6/26/25 4:31 PM, Zhao Liu wrote:
>
>
> From: Manish Mishra <manish.mishra@nutanix.com>
>
> Add a "x-force-cpuid-0x1f" property so that CPU models can enable it and
> have 0x1f CPUID leaf natually as the Host CPU.
>
> The advantage is that when the CPU model's cache model is already
> consistent with the Host CPU, for example, SRF defaults to l2 per
> module & l3 per package, 0x1f can better help users identify the
> topology in the VM.
>
> Adding 0x1f for specific CPU models should not cause any trouble in
> principle. This property is only enabled for CPU models that already
> have 0x1f leaf on the Host, so software that originally runs normally on
> the Host won't encounter issues in the Guest with corresponding CPU
> model. Conversely, some software that relies on checking 0x1f might
> have problems in the Guest due to the lack of 0x1f [*]. In
> summary, adding 0x1f is also intended to further emulate the Host CPU
> environment.
>
> [*]: https://lore.kernel.org/qemu-devel/PH0PR02MB738410511BF51B12DB09BE6CF6AC2@PH0PR02MB7384.namprd02.prod.outlook.com/
>
> Signed-off-by: Manish Mishra <manish.mishra@nutanix.com>
> Co-authored-by: Xiaoyao Li <xiaoyao.li@intel.com>
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> [Integrated and rebased 2 previous patches (ordered by post time)]
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
> Note:
> This patch integrates the idea from 2 previous posted patches (ordered
> by post time)[1] [2], following the s-o-b policy of "Re-starting
> abandoned work" in docs/devel/code-provenance.rst.
>
> [1]: From Manish: https://lore.kernel.org/qemu-devel/20240722101859.47408-1-manish.mishra@nutanix.com/
> [2]: From Xiaoyao: https://lore.kernel.org/qemu-devel/20240813033145.279307-1-xiaoyao.li@intel.com/
> ---
> Changes since RFC:
> * Rebase and rename the property as "x-force-cpuid-0x1f". (Igor)
> ---
> target/i386/cpu.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 08c84ba90f52..ee36f7ee2ccc 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -9934,6 +9934,7 @@ static const Property x86_cpu_properties[] = {
> DEFINE_PROP_BOOL("x-intel-pt-auto-level", X86CPU, intel_pt_auto_level,
> true),
> DEFINE_PROP_BOOL("x-l1-cache-per-thread", X86CPU, l1_cache_per_core, true),
> + DEFINE_PROP_BOOL("x-force-cpuid-0x1f", X86CPU, force_cpuid_0x1f, false),
> };
After applying these patches to QEMU mainline at commit 6e1571533fd9:
$ git am
patches-from-https://lore.kernel.org/qemu-devel/20250620092734.1576677-1-zhao1.liu@intel.com/
$ git am
patches-from-https://lore.kernel.org/all/20250626083105.2581859-6-zhao1.liu@intel.com/
and configure && make qemu with:
$ ./configure --target-list=x86_64-softmmu --enable-debug --enable-kvm
--enable-sdl --enable-gtk --enable-spice --prefix=/usr --enable-libusb
--enable-usb-redir --enable-trace-backends=simple && make -j32
I ran into this build error:
target/i386/cpu.c:9942:52: error: 'X86CPU' {aka 'struct ArchCPU'} has no member
named 'force_cpuid_0x1f' ; did you mean 'enable_cpuid_0x1f' ?
I haven't debugged it yet; since it seems like a simple mistake, asking you
directly might be quicker.
>
> #ifndef CONFIG_USER_ONLY
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 5/8] i386/cpu: Add a "x-force-cpuid-0x1f" property
2025-06-26 12:07 ` Ewan Hai
@ 2025-06-27 3:05 ` Zhao Liu
2025-06-27 6:48 ` Ewan Hai
0 siblings, 1 reply; 26+ messages in thread
From: Zhao Liu @ 2025-06-27 3:05 UTC (permalink / raw)
To: Ewan Hai
Cc: Paolo Bonzini, Daniel P . Berrangé, Igor Mammedov,
Eduardo Habkost, Jason Zeng, Xiaoyao Li, Tao Su, Yi Lai,
Dapeng Mi, Tejus GK, Manish Mishra, qemu-devel
> After applying these patches to QEMU mainline at commit 6e1571533fd9:
Ah, I forgot I've rebased these patches... You can rebase all the
patches on the latest master branch.
Or, you can try this repo - I just created it to make it easier for you:
https://gitlab.com/zhao.liu/qemu/-/tree/cache-model-v2.6-rebase-06-23-2025
Thanks,
Zhao
> $ git am patches-from-https://lore.kernel.org/qemu-devel/20250620092734.1576677-1-zhao1.liu@intel.com/
> $ git am patches-from-https://lore.kernel.org/all/20250626083105.2581859-6-zhao1.liu@intel.com/
>
> and configure && make qemu with:
>
> $ ./configure --target-list=x86_64-softmmu --enable-debug --enable-kvm
> --enable-sdl --enable-gtk --enable-spice --prefix=/usr --enable-libusb
> --enable-usb-redir --enable-trace-backends=simple && make -j32
>
> I ran into this build error:
>
> target/i386/cpu.c:9942:52: error: 'X86CPU' {aka 'struct ArchCPU'} has no
> member named 'force_cpuid_0x1f' ; did you mean 'enable_cpuid_0x1f' ?
>
> I haven't debug it yet, because it seems like a simple mistake, asking you
> directly might be quicker.
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 5/8] i386/cpu: Add a "x-force-cpuid-0x1f" property
2025-06-27 3:05 ` Zhao Liu
@ 2025-06-27 6:48 ` Ewan Hai
2025-06-27 10:00 ` Zhao Liu
0 siblings, 1 reply; 26+ messages in thread
From: Ewan Hai @ 2025-06-27 6:48 UTC (permalink / raw)
To: Zhao Liu
Cc: Paolo Bonzini, Daniel P. Berrangé, Igor Mammedov,
Eduardo Habkost, Jason Zeng, Xiaoyao Li, Tao Su, Yi Lai,
Dapeng Mi, Tejus GK, Manish Mishra, qemu-devel
On 6/27/25 11:05 AM, Zhao Liu wrote:
>
>
>> After applying these patches to QEMU mainline at commit 6e1571533fd9:
>
> Ah, I forgot I've rebased these patches...Now you can rebase all the
> patches at the latest master branch.
>
> Or, you can try this repo - I just created it to make it easier for you:
>
> https://gitlab.com/zhao.liu/qemu/-/tree/cache-model-v2.6-rebase-06-23-2025
>
I cloned the repo and then ran:
$ git am 20250620_zhao1_liu_i386_cpu_unify_the_cache_model_in_x86cpustate.mbx
The *.mbx was obtained with the b4 tool.
That applied several patches successfully, but on patch #11 I got this error:
error: patch failed: target/i386/cpu.c:7482
error: target/i386/cpu.c: patch does not apply
Patch failed at 0011 i386/cpu: Select legacy cache model based on vendor in
CPUID 0x2
hint: Use 'git am --show-current-patch=diff' to see the failed patch
This error also occurred on QEMU master when I ran 'git am
20250620_zhao1_liu_i386_cpu_unify_the_cache_model_in_x86cpustate.mbx'.
Have you run into this before, or did I miss any steps?
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 5/8] i386/cpu: Add a "x-force-cpuid-0x1f" property
2025-06-27 6:48 ` Ewan Hai
@ 2025-06-27 10:00 ` Zhao Liu
0 siblings, 0 replies; 26+ messages in thread
From: Zhao Liu @ 2025-06-27 10:00 UTC (permalink / raw)
To: Ewan Hai
Cc: Paolo Bonzini, Daniel P. Berrangé, Igor Mammedov,
Eduardo Habkost, Jason Zeng, Xiaoyao Li, Tao Su, Yi Lai,
Dapeng Mi, Tejus GK, Manish Mishra, qemu-devel
On Fri, Jun 27, 2025 at 02:48:05PM +0800, Ewan Hai wrote:
> Date: Fri, 27 Jun 2025 14:48:05 +0800
> From: Ewan Hai <ewanhai-oc@zhaoxin.com>
> Subject: Re: [PATCH 5/8] i386/cpu: Add a "x-force-cpuid-0x1f" property
>
>
>
> On 6/27/25 11:05 AM, Zhao Liu wrote:
> >
> >
> > > After applying these patches to QEMU mainline at commit 6e1571533fd9:
> >
> > Ah, I forgot I've rebased these patches...Now you can rebase all the
> > patches at the latest master branch.
> >
> > Or, you can try this repo - I just created it to make it easier for you:
> >
> > https://gitlab.com/zhao.liu/qemu/-/tree/cache-model-v2.6-rebase-06-23-2025
> >
>
> I cloned the repo and then ran:
>
> $ git am 20250620_zhao1_liu_i386_cpu_unify_the_cache_model_in_x86cpustate.mbx
>
Hi Ewan,
there is no need to apply any patches on the branch "cache-model-v2.6-rebase-06-23-2025",
since it already contains all my patches.
You can check the git log.
Thanks,
Zhao
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 4/8] i386/cpu: Introduce cache model for YongFeng
2025-06-26 8:31 ` [PATCH 4/8] i386/cpu: Introduce cache model for YongFeng Zhao Liu
@ 2025-06-29 9:47 ` Ewan Hai
2025-07-02 6:35 ` Zhao Liu
0 siblings, 1 reply; 26+ messages in thread
From: Ewan Hai @ 2025-06-29 9:47 UTC (permalink / raw)
To: Zhao Liu, Paolo Bonzini, Daniel P . Berrangé, Igor Mammedov,
Eduardo Habkost
Cc: Jason Zeng, Xiaoyao Li, Tao Su, Yi Lai, Dapeng Mi, Tejus GK,
Manish Mishra, qemu-devel
On 6/26/25 4:31 PM, Zhao Liu wrote:
>
>
> From: Ewan Hai <ewanhai-oc@zhaoxin.com>
>
> Add the cache model to YongFeng (v3) to better emulate its
> environment.
>
> Note, although YongFeng v2 was added after v10.0, it was also back
> ported to v10.0.2. Therefore, the new version (v3) is needed to avoid
> conflict.
>
> The cache model is as follows:
>
> --- cache 0 ---
> cache type = data cache (1)
> cache level = 0x1 (1)
> self-initializing cache level = true
> fully associative cache = false
> maximum IDs for CPUs sharing cache = 0x0 (0)
> maximum IDs for cores in pkg = 0x0 (0)
> system coherency line size = 0x40 (64)
> physical line partitions = 0x1 (1)
> ways of associativity = 0x8 (8)
> number of sets = 0x40 (64)
> WBINVD/INVD acts on lower caches = false
> inclusive to lower caches = false
> complex cache indexing = false
> number of sets (s) = 64
> (size synth) = 32768 (32 KB)
> --- cache 1 ---
> cache type = instruction cache (2)
> cache level = 0x1 (1)
> self-initializing cache level = true
> fully associative cache = false
> maximum IDs for CPUs sharing cache = 0x0 (0)
> maximum IDs for cores in pkg = 0x0 (0)
> system coherency line size = 0x40 (64)
> physical line partitions = 0x1 (1)
> ways of associativity = 0x10 (16)
> number of sets = 0x40 (64)
> WBINVD/INVD acts on lower caches = false
> inclusive to lower caches = false
> complex cache indexing = false
> number of sets (s) = 64
> (size synth) = 65536 (64 KB)
> --- cache 2 ---
> cache type = unified cache (3)
> cache level = 0x2 (2)
> self-initializing cache level = true
> fully associative cache = false
> maximum IDs for CPUs sharing cache = 0x0 (0)
> maximum IDs for cores in pkg = 0x0 (0)
> system coherency line size = 0x40 (64)
> physical line partitions = 0x1 (1)
> ways of associativity = 0x8 (8)
> number of sets = 0x200 (512)
> WBINVD/INVD acts on lower caches = false
> inclusive to lower caches = true
> complex cache indexing = false
> number of sets (s) = 512
> (size synth) = 262144 (256 KB)
> --- cache 3 ---
> cache type = unified cache (3)
> cache level = 0x3 (3)
> self-initializing cache level = true
> fully associative cache = false
> maximum IDs for CPUs sharing cache = 0x0 (0)
> maximum IDs for cores in pkg = 0x0 (0)
> system coherency line size = 0x40 (64)
> physical line partitions = 0x1 (1)
> ways of associativity = 0x10 (16)
> number of sets = 0x2000 (8192)
> WBINVD/INVD acts on lower caches = true
> inclusive to lower caches = true
> complex cache indexing = false
> number of sets (s) = 8192
> (size synth) = 8388608 (8 MB)
> --- cache 4 ---
> cache type = no more caches (0)
>
> Signed-off-by: Ewan Hai <ewanhai-oc@zhaoxin.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
> Changes on the original codes:
> * Rearrange cache model fields to make them easier to check.
> * And add explanation of why v3 is needed.
> * Drop lines_per_tag field for L2 & L3.
> ---
> target/i386/cpu.c | 104 ++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 104 insertions(+)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index a7f2e5dd3fcb..08c84ba90f52 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -3159,6 +3159,105 @@ static const CPUCaches xeon_srf_cache_info = {
> },
> };
>
> +static const CPUCaches yongfeng_cache_info = {
> + .l1d_cache = &(CPUCacheInfo) {
> + /* CPUID 0x4.0x0.EAX */
> + .type = DATA_CACHE,
> + .level = 1,
> + .self_init = true,
> +
> + /* CPUID 0x4.0x0.EBX */
> + .line_size = 64,
> + .partitions = 1,
> + .associativity = 8,
> +
> + /* CPUID 0x4.0x0.ECX */
> + .sets = 64,
> +
> + /* CPUID 0x4.0x0.EDX */
> + .no_invd_sharing = false,
> + .inclusive = false,
> + .complex_indexing = false,
> +
> + /* CPUID 0x80000005.ECX */
> + .lines_per_tag = 1,
> + .size = 32 * KiB,
> +
> + .share_level = CPU_TOPOLOGY_LEVEL_CORE,
> + },
> + .l1i_cache = &(CPUCacheInfo) {
> + /* CPUID 0x4.0x1.EAX */
> + .type = INSTRUCTION_CACHE,
> + .level = 1,
> + .self_init = true,
> +
> + /* CPUID 0x4.0x1.EBX */
> + .line_size = 64,
> + .partitions = 1,
> + .associativity = 16,
> +
> + /* CPUID 0x4.0x1.ECX */
> + .sets = 64,
> +
> + /* CPUID 0x4.0x1.EDX */
> + .no_invd_sharing = false,
> + .inclusive = false,
> + .complex_indexing = false,
> +
> + /* CPUID 0x80000005.EDX */
> + .lines_per_tag = 1,
> + .size = 64 * KiB,
> +
> + .share_level = CPU_TOPOLOGY_LEVEL_CORE,
> + },
> + .l2_cache = &(CPUCacheInfo) {
> + /* CPUID 0x4.0x2.EAX */
> + .type = UNIFIED_CACHE,
> + .level = 2,
> + .self_init = true,
> +
> + /* CPUID 0x4.0x2.EBX */
> + .line_size = 64,
> + .partitions = 1,
> + .associativity = 8,
> +
> + /* CPUID 0x4.0x2.ECX */
> + .sets = 512,
> +
> + /* CPUID 0x4.0x2.EDX */
> + .no_invd_sharing = false,
> + .inclusive = true,
> + .complex_indexing = false,
> +
> + /* CPUID 0x80000006.ECX */
> + .size = 256 * KiB,
> +
> + .share_level = CPU_TOPOLOGY_LEVEL_CORE,
> + },
> + .l3_cache = &(CPUCacheInfo) {
> + /* CPUID 0x4.0x3.EAX */
> + .type = UNIFIED_CACHE,
> + .level = 3,
> + .self_init = true,
> +
> + /* CPUID 0x4.0x3.EBX */
> + .line_size = 64,
> + .partitions = 1,
> + .associativity = 16,
> +
> + /* CPUID 0x4.0x3.ECX */
> + .sets = 8192,
> +
> + /* CPUID 0x4.0x3.EDX */
> + .no_invd_sharing = true,
> + .inclusive = true,
> + .complex_indexing = false,
> +
> + .size = 8 * MiB,
> + .share_level = CPU_TOPOLOGY_LEVEL_DIE,
> + },
> +};
> +
> /* The following VMX features are not supported by KVM and are left out in the
> * CPU definitions:
> *
> @@ -6438,6 +6537,11 @@ static const X86CPUDefinition builtin_x86_defs[] = {
> { /* end of list */ }
> }
> },
> + {
> + .version = 3,
> + .note = "with the cache info",
I realize that my previous use of "cache info" was not precise; "cache model" is
more appropriate. Please help me adjust accordingly, thank you.
> + .cache_info = &yongfeng_cache_info
> + },
> { /* end of list */ }
> }
> },
> --
> 2.34.1
>
Hi Zhao,
I tested the patchsets you provided on different hosts, and here are the results:
1. On an Intel host with KVM enabled
The CPUID leaves 0x2 and 0x4 reported inside the YongFeng-V3 VM match our
expected cache details exactly. However, CPUID leaf 0x80000005 returns all
zeros. This is because when KVM is in use, QEMU uses the host's vendor for the
IS_INTEL_CPU(env), IS_ZHAOXIN_CPU(env), and IS_AMD_CPU(env) checks. Given that
behavior, a zeroed 0x80000005 leaf in the guest is expected and, to me,
acceptable. What are your thoughts?
2. On a YongFeng host (with or without KVM)
The CPUID leaves 0x2, 0x4, and 0x80000006 inside the VM all return the values we
want, and the L1D/L1I cache info in leaf 0x80000005 is also correct.
3. TLB info in leaf 0x80000005
On both Intel and YongFeng hosts, the L1 TLB fields in leaf 0x80000005 remain
constant, as we discussed. As you mentioned before, "we can wait and see what
maintainers think" about this.
In summary, both patchsets look good for Zhaoxin support; I don't see any issues
so far.
Btw, the YongFeng host also supports 0x1f; does YongFeng need to turn on
"x-force-cpuid-0x1f" by default? I think maybe yes.
Best regards,
Ewan
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 4/8] i386/cpu: Introduce cache model for YongFeng
2025-06-29 9:47 ` Ewan Hai
@ 2025-07-02 6:35 ` Zhao Liu
2025-07-02 9:35 ` Ewan Hai
0 siblings, 1 reply; 26+ messages in thread
From: Zhao Liu @ 2025-07-02 6:35 UTC (permalink / raw)
To: Ewan Hai
Cc: Paolo Bonzini, Daniel P . Berrangé, Igor Mammedov,
Eduardo Habkost, Jason Zeng, Xiaoyao Li, Tao Su, Yi Lai,
Dapeng Mi, Tejus GK, Manish Mishra, qemu-devel
> > + {
> > + .version = 3,
> > + .note = "with the cache info",
>
> I realize that my previous use of "cache info" was not precise; "cache
> model" is more appropriate. Please help me adjust accordingly, thank you.
No problem, will fix.
> > + .cache_info = &yongfeng_cache_info
> > + },
> > { /* end of list */ }
> > }
> > },
> > --
> > 2.34.1
> >
>
> Hi Zhao,
>
> I tested the patchsets you provided on different hosts, and here are the results:
>
> 1. On an Intel host with KVM enabled
> The CPUID leaves 0x2 and 0x4 reported inside the YongFeng-V3 VM match our
> expected cache details exactly. However, CPUID leaf 0x80000005 returns all
> zeros. This is because when KVM is in use, QEMU uses the host's vendor for
> the IS_INTEL_CPU(env), IS_ZHAOXIN_CPU(env), and IS_AMD_CPU(env) checks.
This is a bug:
https://lore.kernel.org/qemu-devel/d429b6f5-b59c-4884-b18f-8db71cb8dc7b@oracle.com/
And we expect we can change the vendor with KVM.
> Given that behavior, a zeroed 0x80000005 leaf in the guest is expected and,
> to me, acceptable. What are your thoughts?
Well, (with this bug) since the VM has the "Intel" vendor, this is correct.
> 2. On a YongFeng host (with or without KVM)
> The CPUID leaves 0x2, 0x4, and 0x80000006 inside the VM all return the
> values we want, and the L1D/L1I cache info in leaf 0x80000005 is also
> correct.
Nice!
> 3. TLB info in leaf 0x80000005
> On both Intel and YongFeng hosts, the L1 TLB fields in leaf 0x80000005
> remain constant, as we discussed. As you mentioned before, "we can wait and
> see what maintainers think" about this.
Yes. I suppose Zhaoxin also uses 0x2 to present TLB info like Intel does.
To support TLB, I feel like there is still some work to be done, and it
depends on whether it's worth it...
> In summary, both patchsets look good for Zhaoxin support, I don't see any
> issues so far.
Thanks!
> Btw, YongFeng host also support 0x1F, does YongFeng need to turn on
> "x-force-cpuid-0x1f" default ? I think maybe yes.
OK, will add it.
BTW... my colleague reported a bug that Intel/Zhaoxin CPUs with the cache
model hit an assertion failure on the v10.0 or older machine types.
So I think it's necessary to drop all the assert() checks on
lines_per_tag directly:
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 18bb0e9cf9f6..f73943a46945 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -491,7 +491,6 @@ static void encode_topo_cpuid1f(CPUX86State *env, uint32_t count,
static uint32_t encode_cache_cpuid80000005(CPUCacheInfo *cache)
{
assert(cache->size % 1024 == 0);
- assert(cache->lines_per_tag > 0);
assert(cache->associativity > 0);
assert(cache->line_size > 0);
return ((cache->size / 1024) << 24) | (cache->associativity << 16) |
@@ -520,13 +519,10 @@ static uint32_t encode_cache_cpuid80000005(CPUCacheInfo *cache)
*/
static void encode_cache_cpuid80000006(CPUCacheInfo *l2,
CPUCacheInfo *l3,
- uint32_t *ecx, uint32_t *edx,
- bool lines_per_tag_supported)
+ uint32_t *ecx, uint32_t *edx)
{
assert(l2->size % 1024 == 0);
assert(l2->associativity > 0);
- assert(lines_per_tag_supported ?
- l2->lines_per_tag > 0 : l2->lines_per_tag == 0);
*ecx = ((l2->size / 1024) << 16) |
(X86_ENC_ASSOC(l2->associativity) << 12) |
(l2->lines_per_tag << 8) | (l2->line_size);
@@ -535,8 +531,6 @@ static void encode_cache_cpuid80000006(CPUCacheInfo *l2,
if (l3) {
assert(l3->size % (512 * 1024) == 0);
assert(l3->associativity > 0);
- assert(lines_per_tag_supported ?
- l3->lines_per_tag > 0 : l3->lines_per_tag == 0);
assert(l3->line_size > 0);
*edx = ((l3->size / (512 * 1024)) << 18) |
(X86_ENC_ASSOC(l3->associativity) << 12) |
@@ -8353,7 +8347,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
(IS_INTEL_CPU(env) || IS_ZHAOXIN_CPU(env))) {
*eax = *ebx = 0;
encode_cache_cpuid80000006(caches->l2_cache,
- NULL, ecx, edx, false);
+ NULL, ecx, edx);
break;
}
@@ -8369,7 +8363,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
encode_cache_cpuid80000006(caches->l2_cache,
cpu->enable_l3_cache ?
caches->l3_cache : NULL,
- ecx, edx, true);
+ ecx, edx);
break;
}
case 0x80000007:
^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH 4/8] i386/cpu: Introduce cache model for YongFeng
2025-07-02 6:35 ` Zhao Liu
@ 2025-07-02 9:35 ` Ewan Hai
0 siblings, 0 replies; 26+ messages in thread
From: Ewan Hai @ 2025-07-02 9:35 UTC (permalink / raw)
To: Zhao Liu
Cc: Paolo Bonzini, Daniel P. Berrangé, Igor Mammedov,
Eduardo Habkost, Jason Zeng, Xiaoyao Li, Tao Su, Yi Lai,
Dapeng Mi, Tejus GK, Manish Mishra, qemu-devel
On 7/2/25 2:35 PM, Zhao Liu wrote:
>
>
>>> + {
>>> + .version = 3,
>>> + .note = "with the cache info",
>>
>> I realize that my previous use of "cache info" was not precise; "cache
>> model" is more appropriate. Please help me adjust accordingly, thank you.
>
> Nope, will fix.
Thank you!
>
>>> + .cache_info = &yongfeng_cache_info
>>> + },
>>> { /* end of list */ }
>>> }
>>> },
>>> --
>>> 2.34.1
>>>
>>
>> Hi Zhao,
>>
>> I tested the patchsets you provided on different hosts, and here are the results:
>>
>> 1. On an Intel host with KVM enabled
>> The CPUID leaves 0x2 and 0x4 reported inside the YongFeng-V3 VM match our
>> expected cache details exactly. However, CPUID leaf 0x80000005 returns all
>> zeros. This is because when KVM is in use, QEMU uses the host's vendor for
>> the IS_INTEL_CPU(env), IS_ZHAOXIN_CPU(env), and IS_AMD_CPU(env) checks.
>
> This is a bug:
>
> https://lore.kernel.org/qemu-devel/d429b6f5-b59c-4884-b18f-8db71cb8dc7b@oracle.com/
>
> And we expect to be able to change the vendor with KVM.
Oh, I've reviewed your discussion on this bug, and it looks like it will be
resolved soon.
>
>> Given that behavior, a zeroed 0x80000005 leaf in the guest is expected and,
>> to me, acceptable. What are your thoughts?
>
> Well, (with this bug) since the VM has the "Intel" vendor, this is correct.
>
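For reference, the "vendor" being checked is just the 12-byte string
returned in EBX/EDX/ECX of CPUID leaf 0; a standalone sketch of reading
it (not QEMU's IS_*_CPU() macros) looks like this:

#include <cpuid.h>
#include <stdio.h>
#include <string.h>

/*
 * Read the CPUID leaf 0 vendor string. Under KVM the value QEMU keys
 * its vendor checks on ends up following the host, which is why the
 * 0x80000005 branch taken can differ from the configured CPU model.
 */
int main(void)
{
    unsigned int eax, ebx, ecx, edx;
    char vendor[13];

    __get_cpuid(0, &eax, &ebx, &ecx, &edx);
    memcpy(vendor + 0, &ebx, 4);
    memcpy(vendor + 4, &edx, 4);
    memcpy(vendor + 8, &ecx, 4);
    vendor[12] = '\0';

    printf("vendor: %s (%s)\n", vendor,
           strcmp(vendor, "GenuineIntel") == 0 ? "Intel" : "non-Intel");
    return 0;
}
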
>> 2. On a YongFeng host (with or without KVM)
>> The CPUID leaves 0x2, 0x4, and 0x80000006 inside the VM all return the
>> values we want, and the L1D/L1I cache info in leaf 0x80000005 is also
>> correct.
>
> Nice!
>
>> 3. TLB info in leaf 0x80000005
>> On both Intel and YongFeng hosts, the L1 TLB fields in leaf 0x80000005
>> remain constant, as we discussed. As you mentioned before, "we can wait and
>> see what maintainers think" about this.
>
> Yes. I suppose Zhaoxin also uses leaf 0x2 to report TLB info, like Intel does.
Yeah. Same behaviour.
> To support TLB, I feel there is still some work to be done, and it
> depends on whether it's worth it...
>
The community will make the final decision; let's wait and see.
>> In summary, both patchsets look good for Zhaoxin support; I don't see any
>> issues so far.
>
> Thanks!
>
>> Btw, the YongFeng host also supports 0x1F; does YongFeng need to turn on
>> "x-force-cpuid-0x1f" by default? I think maybe yes.
>
> OK, will add it.
Thanks a lot!
>
> BTW, my colleague reported a bug: Intel/Zhaoxin CPUs with a cache
> model hit an assertion failure on v10.0 or older machine types.
>
> So I think it's necessary to drop all the assert() checks on
> lines_per_tag directly:
I'm not sure in which scenarios this assertion failure occurs, so I can't offer
any ideas on a solution…
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 18bb0e9cf9f6..f73943a46945 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -491,7 +491,6 @@ static void encode_topo_cpuid1f(CPUX86State *env, uint32_t count,
> static uint32_t encode_cache_cpuid80000005(CPUCacheInfo *cache)
> {
> assert(cache->size % 1024 == 0);
> - assert(cache->lines_per_tag > 0);
> assert(cache->associativity > 0);
> assert(cache->line_size > 0);
> return ((cache->size / 1024) << 24) | (cache->associativity << 16) |
> @@ -520,13 +519,10 @@ static uint32_t encode_cache_cpuid80000005(CPUCacheInfo *cache)
> */
> static void encode_cache_cpuid80000006(CPUCacheInfo *l2,
> CPUCacheInfo *l3,
> - uint32_t *ecx, uint32_t *edx,
> - bool lines_per_tag_supported)
> + uint32_t *ecx, uint32_t *edx)
> {
> assert(l2->size % 1024 == 0);
> assert(l2->associativity > 0);
> - assert(lines_per_tag_supported ?
> - l2->lines_per_tag > 0 : l2->lines_per_tag == 0);
> *ecx = ((l2->size / 1024) << 16) |
> (X86_ENC_ASSOC(l2->associativity) << 12) |
> (l2->lines_per_tag << 8) | (l2->line_size);
> @@ -535,8 +531,6 @@ static void encode_cache_cpuid80000006(CPUCacheInfo *l2,
> if (l3) {
> assert(l3->size % (512 * 1024) == 0);
> assert(l3->associativity > 0);
> - assert(lines_per_tag_supported ?
> - l3->lines_per_tag > 0 : l3->lines_per_tag == 0);
> assert(l3->line_size > 0);
> *edx = ((l3->size / (512 * 1024)) << 18) |
> (X86_ENC_ASSOC(l3->associativity) << 12) |
> @@ -8353,7 +8347,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> (IS_INTEL_CPU(env) || IS_ZHAOXIN_CPU(env))) {
> *eax = *ebx = 0;
> encode_cache_cpuid80000006(caches->l2_cache,
> - NULL, ecx, edx, false);
> + NULL, ecx, edx);
> break;
> }
>
> @@ -8369,7 +8363,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> encode_cache_cpuid80000006(caches->l2_cache,
> cpu->enable_l3_cache ?
> caches->l3_cache : NULL,
> - ecx, edx, true);
> + ecx, edx);
> break;
> }
> case 0x80000007:
>
>
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 1/8] i386/cpu: Introduce cache model for SierraForest
2025-06-26 8:30 ` [PATCH 1/8] i386/cpu: Introduce cache model for SierraForest Zhao Liu
@ 2025-07-04 3:33 ` Mi, Dapeng
2025-07-07 0:57 ` Tao Su
1 sibling, 0 replies; 26+ messages in thread
From: Mi, Dapeng @ 2025-07-04 3:33 UTC (permalink / raw)
To: Zhao Liu, Paolo Bonzini, Daniel P . Berrangé, Igor Mammedov,
Eduardo Habkost
Cc: Ewan Hai, Jason Zeng, Xiaoyao Li, Tao Su, Yi Lai, Dapeng Mi,
Tejus GK, Manish Mishra, qemu-devel
On 6/26/2025 4:30 PM, Zhao Liu wrote:
> Add the cache model to SierraForest (v3) to better emulate its
> environment.
>
> The cache model is based on SierraForest-SP (Scalable Performance):
>
> --- cache 0 ---
> cache type = data cache (1)
> cache level = 0x1 (1)
> self-initializing cache level = true
> fully associative cache = false
> maximum IDs for CPUs sharing cache = 0x0 (0)
> maximum IDs for cores in pkg = 0x3f (63)
> system coherency line size = 0x40 (64)
> physical line partitions = 0x1 (1)
> ways of associativity = 0x8 (8)
> number of sets = 0x40 (64)
> WBINVD/INVD acts on lower caches = false
> inclusive to lower caches = false
> complex cache indexing = false
> number of sets (s) = 64
> (size synth) = 32768 (32 KB)
> --- cache 1 ---
> cache type = instruction cache (2)
> cache level = 0x1 (1)
> self-initializing cache level = true
> fully associative cache = false
> maximum IDs for CPUs sharing cache = 0x0 (0)
> maximum IDs for cores in pkg = 0x3f (63)
> system coherency line size = 0x40 (64)
> physical line partitions = 0x1 (1)
> ways of associativity = 0x8 (8)
> number of sets = 0x80 (128)
> WBINVD/INVD acts on lower caches = false
> inclusive to lower caches = false
> complex cache indexing = false
> number of sets (s) = 128
> (size synth) = 65536 (64 KB)
> --- cache 2 ---
> cache type = unified cache (3)
> cache level = 0x2 (2)
> self-initializing cache level = true
> fully associative cache = false
> maximum IDs for CPUs sharing cache = 0x7 (7)
> maximum IDs for cores in pkg = 0x3f (63)
> system coherency line size = 0x40 (64)
> physical line partitions = 0x1 (1)
> ways of associativity = 0x10 (16)
> number of sets = 0x1000 (4096)
> WBINVD/INVD acts on lower caches = false
> inclusive to lower caches = false
> complex cache indexing = false
> number of sets (s) = 4096
> (size synth) = 4194304 (4 MB)
> --- cache 3 ---
> cache type = unified cache (3)
> cache level = 0x3 (3)
> self-initializing cache level = true
> fully associative cache = false
> maximum IDs for CPUs sharing cache = 0x1ff (511)
> maximum IDs for cores in pkg = 0x3f (63)
> system coherency line size = 0x40 (64)
> physical line partitions = 0x1 (1)
> ways of associativity = 0xc (12)
> number of sets = 0x24000 (147456)
> WBINVD/INVD acts on lower caches = false
> inclusive to lower caches = false
> complex cache indexing = true
> number of sets (s) = 147456
> (size synth) = 113246208 (108 MB)
> --- cache 4 ---
> cache type = no more caches (0)
>
> Suggested-by: Tejus GK <tejus.gk@nutanix.com>
> Suggested-by: Jason Zeng <jason.zeng@intel.com>
> Suggested-by: "Daniel P . Berrangé" <berrange@redhat.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
> target/i386/cpu.c | 96 +++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 96 insertions(+)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 28e5b7859fef..fcaa2625b023 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -2883,6 +2883,97 @@ static const CPUCaches epyc_turin_cache_info = {
> .no_invd_sharing = true,
> .complex_indexing = false,
> .share_level = CPU_TOPOLOGY_LEVEL_DIE,
> + }
> +};
> +
> +static const CPUCaches xeon_srf_cache_info = {
> + .l1d_cache = &(CPUCacheInfo) {
> + /* CPUID 0x4.0x0.EAX */
> + .type = DATA_CACHE,
> + .level = 1,
> + .self_init = true,
> +
> + /* CPUID 0x4.0x0.EBX */
> + .line_size = 64,
> + .partitions = 1,
> + .associativity = 8,
> +
> + /* CPUID 0x4.0x0.ECX */
> + .sets = 64,
> +
> + /* CPUID 0x4.0x0.EDX */
> + .no_invd_sharing = false,
> + .inclusive = false,
> + .complex_indexing = false,
> +
> + .size = 32 * KiB,
> + .share_level = CPU_TOPOLOGY_LEVEL_CORE,
> + },
> + .l1i_cache = &(CPUCacheInfo) {
> + /* CPUID 0x4.0x1.EAX */
> + .type = INSTRUCTION_CACHE,
> + .level = 1,
> + .self_init = true,
> +
> + /* CPUID 0x4.0x1.EBX */
> + .line_size = 64,
> + .partitions = 1,
> + .associativity = 8,
> +
> + /* CPUID 0x4.0x1.ECX */
> + .sets = 128,
> +
> + /* CPUID 0x4.0x1.EDX */
> + .no_invd_sharing = false,
> + .inclusive = false,
> + .complex_indexing = false,
> +
> + .size = 64 * KiB,
> + .share_level = CPU_TOPOLOGY_LEVEL_CORE,
> + },
> + .l2_cache = &(CPUCacheInfo) {
> + /* CPUID 0x4.0x2.EAX */
> + .type = UNIFIED_CACHE,
> + .level = 2,
> + .self_init = true,
> +
> + /* CPUID 0x4.0x2.EBX */
> + .line_size = 64,
> + .partitions = 1,
> + .associativity = 16,
> +
> + /* CPUID 0x4.0x2.ECX */
> + .sets = 4096,
> +
> + /* CPUID 0x4.0x2.EDX */
> + .no_invd_sharing = false,
> + .inclusive = false,
> + .complex_indexing = false,
> +
> + .size = 4 * MiB,
> + .share_level = CPU_TOPOLOGY_LEVEL_MODULE,
> + },
> + .l3_cache = &(CPUCacheInfo) {
> + /* CPUID 0x4.0x3.EAX */
> + .type = UNIFIED_CACHE,
> + .level = 3,
> + .self_init = true,
> +
> + /* CPUID 0x4.0x3.EBX */
> + .line_size = 64,
> + .partitions = 1,
> + .associativity = 12,
> +
> + /* CPUID 0x4.0x3.ECX */
> + .sets = 147456,
> +
> + /* CPUID 0x4.0x3.EDX */
> + .no_invd_sharing = false,
> + .inclusive = false,
> + .complex_indexing = true,
> +
> + .size = 108 * MiB,
> + .share_level = CPU_TOPOLOGY_LEVEL_SOCKET,
> },
> };
>
> @@ -5008,6 +5099,11 @@ static const X86CPUDefinition builtin_x86_defs[] = {
> { /* end of list */ }
> }
> },
> + {
> + .version = 3,
> + .note = "with srf-sp cache model",
> + .cache_info = &xeon_srf_cache_info,
> + },
> { /* end of list */ },
> },
> },
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
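As a quick cross-check of the numbers in the commit message, the
synthesized sizes follow directly from sets x ways x line size x
partitions; a small standalone program using the SRF-SP values from the
CPUID 0x4 dump above (the struct is a local stand-in, not QEMU's
CPUCacheInfo):

#include <stdio.h>

/* Local stand-in carrying only the geometry fields from the dump. */
struct cache_geom {
    const char *name;
    unsigned long long sets, ways, line_size, partitions, expected;
};

int main(void)
{
    /* SierraForest-SP values quoted in the commit message above. */
    const struct cache_geom srf[] = {
        { "L1d", 64,     8,  64, 1, 32768 },
        { "L1i", 128,    8,  64, 1, 65536 },
        { "L2",  4096,   16, 64, 1, 4194304 },
        { "L3",  147456, 12, 64, 1, 113246208 },
    };

    for (unsigned i = 0; i < sizeof(srf) / sizeof(srf[0]); i++) {
        unsigned long long size = srf[i].sets * srf[i].ways *
                                  srf[i].line_size * srf[i].partitions;
        printf("%-3s: %llu bytes (%s)\n", srf[i].name, size,
               size == srf[i].expected ? "matches" : "MISMATCH");
    }
    return 0;
}
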
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 2/8] i386/cpu: Introduce cache model for GraniteRapids
2025-06-26 8:30 ` [PATCH 2/8] i386/cpu: Introduce cache model for GraniteRapids Zhao Liu
@ 2025-07-04 3:34 ` Mi, Dapeng
2025-07-07 0:58 ` Tao Su
1 sibling, 0 replies; 26+ messages in thread
From: Mi, Dapeng @ 2025-07-04 3:34 UTC (permalink / raw)
To: Zhao Liu, Paolo Bonzini, Daniel P . Berrangé, Igor Mammedov,
Eduardo Habkost
Cc: Ewan Hai, Jason Zeng, Xiaoyao Li, Tao Su, Yi Lai, Dapeng Mi,
Tejus GK, Manish Mishra, qemu-devel
On 6/26/2025 4:30 PM, Zhao Liu wrote:
> Add the cache model to GraniteRapids (v3) to better emulate its
> environment.
>
> The cache model is based on GraniteRapids-SP (Scalable Performance):
>
> --- cache 0 ---
> cache type = data cache (1)
> cache level = 0x1 (1)
> self-initializing cache level = true
> fully associative cache = false
> maximum IDs for CPUs sharing cache = 0x1 (1)
> maximum IDs for cores in pkg = 0x3f (63)
> system coherency line size = 0x40 (64)
> physical line partitions = 0x1 (1)
> ways of associativity = 0xc (12)
> number of sets = 0x40 (64)
> WBINVD/INVD acts on lower caches = false
> inclusive to lower caches = false
> complex cache indexing = false
> number of sets (s) = 64
> (size synth) = 49152 (48 KB)
> --- cache 1 ---
> cache type = instruction cache (2)
> cache level = 0x1 (1)
> self-initializing cache level = true
> fully associative cache = false
> maximum IDs for CPUs sharing cache = 0x1 (1)
> maximum IDs for cores in pkg = 0x3f (63)
> system coherency line size = 0x40 (64)
> physical line partitions = 0x1 (1)
> ways of associativity = 0x10 (16)
> number of sets = 0x40 (64)
> WBINVD/INVD acts on lower caches = false
> inclusive to lower caches = false
> complex cache indexing = false
> number of sets (s) = 64
> (size synth) = 65536 (64 KB)
> --- cache 2 ---
> cache type = unified cache (3)
> cache level = 0x2 (2)
> self-initializing cache level = true
> fully associative cache = false
> maximum IDs for CPUs sharing cache = 0x1 (1)
> maximum IDs for cores in pkg = 0x3f (63)
> system coherency line size = 0x40 (64)
> physical line partitions = 0x1 (1)
> ways of associativity = 0x10 (16)
> number of sets = 0x800 (2048)
> WBINVD/INVD acts on lower caches = false
> inclusive to lower caches = false
> complex cache indexing = false
> number of sets (s) = 2048
> (size synth) = 2097152 (2 MB)
> --- cache 3 ---
> cache type = unified cache (3)
> cache level = 0x3 (3)
> self-initializing cache level = true
> fully associative cache = false
> maximum IDs for CPUs sharing cache = 0xff (255)
> maximum IDs for cores in pkg = 0x3f (63)
> system coherency line size = 0x40 (64)
> physical line partitions = 0x1 (1)
> ways of associativity = 0x10 (16)
> number of sets = 0x48000 (294912)
> WBINVD/INVD acts on lower caches = false
> inclusive to lower caches = false
> complex cache indexing = true
> number of sets (s) = 294912
> (size synth) = 301989888 (288 MB)
> --- cache 4 ---
> cache type = no more caches (0)
>
> Suggested-by: Tejus GK <tejus.gk@nutanix.com>
> Suggested-by: Jason Zeng <jason.zeng@intel.com>
> Suggested-by: "Daniel P . Berrangé" <berrange@redhat.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
> target/i386/cpu.c | 96 +++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 96 insertions(+)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index fcaa2625b023..b40f1a5b6648 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -2886,6 +2886,97 @@ static const CPUCaches epyc_turin_cache_info = {
> }
> };
>
> +static const CPUCaches xeon_gnr_cache_info = {
> + .l1d_cache = &(CPUCacheInfo) {
> + /* CPUID 0x4.0x0.EAX */
> + .type = DATA_CACHE,
> + .level = 1,
> + .self_init = true,
> +
> + /* CPUID 0x4.0x0.EBX */
> + .line_size = 64,
> + .partitions = 1,
> + .associativity = 12,
> +
> + /* CPUID 0x4.0x0.ECX */
> + .sets = 64,
> +
> + /* CPUID 0x4.0x0.EDX */
> + .no_invd_sharing = false,
> + .inclusive = false,
> + .complex_indexing = false,
> +
> + .size = 48 * KiB,
> + .share_level = CPU_TOPOLOGY_LEVEL_CORE,
> + },
> + .l1i_cache = &(CPUCacheInfo) {
> + /* CPUID 0x4.0x1.EAX */
> + .type = INSTRUCTION_CACHE,
> + .level = 1,
> + .self_init = true,
> +
> + /* CPUID 0x4.0x1.EBX */
> + .line_size = 64,
> + .partitions = 1,
> + .associativity = 16,
> +
> + /* CPUID 0x4.0x1.ECX */
> + .sets = 64,
> +
> + /* CPUID 0x4.0x1.EDX */
> + .no_invd_sharing = false,
> + .inclusive = false,
> + .complex_indexing = false,
> +
> + .size = 64 * KiB,
> + .share_level = CPU_TOPOLOGY_LEVEL_CORE,
> + },
> + .l2_cache = &(CPUCacheInfo) {
> + /* CPUID 0x4.0x2.EAX */
> + .type = UNIFIED_CACHE,
> + .level = 2,
> + .self_init = true,
> +
> + /* CPUID 0x4.0x2.EBX */
> + .line_size = 64,
> + .partitions = 1,
> + .associativity = 16,
> +
> + /* CPUID 0x4.0x2.ECX */
> + .sets = 2048,
> +
> + /* CPUID 0x4.0x2.EDX */
> + .no_invd_sharing = false,
> + .inclusive = false,
> + .complex_indexing = false,
> +
> + .size = 2 * MiB,
> + .share_level = CPU_TOPOLOGY_LEVEL_CORE,
> + },
> + .l3_cache = &(CPUCacheInfo) {
> + /* CPUID 0x4.0x3.EAX */
> + .type = UNIFIED_CACHE,
> + .level = 3,
> + .self_init = true,
> +
> + /* CPUID 0x4.0x3.EBX */
> + .line_size = 64,
> + .partitions = 1,
> + .associativity = 16,
> +
> + /* CPUID 0x4.0x3.ECX */
> + .sets = 294912,
> +
> + /* CPUID 0x4.0x3.EDX */
> + .no_invd_sharing = false,
> + .inclusive = false,
> + .complex_indexing = true,
> +
> + .size = 288 * MiB,
> + .share_level = CPU_TOPOLOGY_LEVEL_SOCKET,
> + },
> +};
> +
> static const CPUCaches xeon_srf_cache_info = {
> .l1d_cache = &(CPUCacheInfo) {
> /* CPUID 0x4.0x0.EAX */
> @@ -4954,6 +5045,11 @@ static const X86CPUDefinition builtin_x86_defs[] = {
> { /* end of list */ }
> }
> },
> + {
> + .version = 3,
> + .note = "with gnr-sp cache model",
> + .cache_info = &xeon_gnr_cache_info,
> + },
> { /* end of list */ },
> },
> },
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 3/8] i386/cpu: Introduce cache model for SapphireRapids
2025-06-26 8:31 ` [PATCH 3/8] i386/cpu: Introduce cache model for SapphireRapids Zhao Liu
@ 2025-07-04 3:35 ` Mi, Dapeng
2025-07-07 0:58 ` Tao Su
1 sibling, 0 replies; 26+ messages in thread
From: Mi, Dapeng @ 2025-07-04 3:35 UTC (permalink / raw)
To: Zhao Liu, Paolo Bonzini, Daniel P . Berrangé, Igor Mammedov,
Eduardo Habkost
Cc: Ewan Hai, Jason Zeng, Xiaoyao Li, Tao Su, Yi Lai, Dapeng Mi,
Tejus GK, Manish Mishra, qemu-devel
On 6/26/2025 4:31 PM, Zhao Liu wrote:
> Add the cache model to SapphireRapids (v4) to better emulate its
> environment.
>
> The cache model is based on SapphireRapids-SP (Scalable Performance):
>
> --- cache 0 ---
> cache type = data cache (1)
> cache level = 0x1 (1)
> self-initializing cache level = true
> fully associative cache = false
> maximum IDs for CPUs sharing cache = 0x1 (1)
> maximum IDs for cores in pkg = 0x3f (63)
> system coherency line size = 0x40 (64)
> physical line partitions = 0x1 (1)
> ways of associativity = 0xc (12)
> number of sets = 0x40 (64)
> WBINVD/INVD acts on lower caches = false
> inclusive to lower caches = false
> complex cache indexing = false
> number of sets (s) = 64
> (size synth) = 49152 (48 KB)
> --- cache 1 ---
> cache type = instruction cache (2)
> cache level = 0x1 (1)
> self-initializing cache level = true
> fully associative cache = false
> maximum IDs for CPUs sharing cache = 0x1 (1)
> maximum IDs for cores in pkg = 0x3f (63)
> system coherency line size = 0x40 (64)
> physical line partitions = 0x1 (1)
> ways of associativity = 0x8 (8)
> number of sets = 0x40 (64)
> WBINVD/INVD acts on lower caches = false
> inclusive to lower caches = false
> complex cache indexing = false
> number of sets (s) = 64
> (size synth) = 32768 (32 KB)
> --- cache 2 ---
> cache type = unified cache (3)
> cache level = 0x2 (2)
> self-initializing cache level = true
> fully associative cache = false
> maximum IDs for CPUs sharing cache = 0x1 (1)
> maximum IDs for cores in pkg = 0x3f (63)
> system coherency line size = 0x40 (64)
> physical line partitions = 0x1 (1)
> ways of associativity = 0x10 (16)
> number of sets = 0x800 (2048)
> WBINVD/INVD acts on lower caches = false
> inclusive to lower caches = false
> complex cache indexing = false
> number of sets (s) = 2048
> (size synth) = 2097152 (2 MB)
> --- cache 3 ---
> cache type = unified cache (3)
> cache level = 0x3 (3)
> self-initializing cache level = true
> fully associative cache = false
> maximum IDs for CPUs sharing cache = 0x7f (127)
> maximum IDs for cores in pkg = 0x3f (63)
> system coherency line size = 0x40 (64)
> physical line partitions = 0x1 (1)
> ways of associativity = 0xf (15)
> number of sets = 0x10000 (65536)
> WBINVD/INVD acts on lower caches = false
> inclusive to lower caches = false
> complex cache indexing = true
> number of sets (s) = 65536
> (size synth) = 62914560 (60 MB)
> --- cache 4 ---
> cache type = no more caches (0)
>
> Suggested-by: Tejus GK <tejus.gk@nutanix.com>
> Suggested-by: Jason Zeng <jason.zeng@intel.com>
> Suggested-by: "Daniel P . Berrangé" <berrange@redhat.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
> target/i386/cpu.c | 96 +++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 96 insertions(+)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index b40f1a5b6648..a7f2e5dd3fcb 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -2886,6 +2886,97 @@ static const CPUCaches epyc_turin_cache_info = {
> }
> };
>
> +static const CPUCaches xeon_spr_cache_info = {
> + .l1d_cache = &(CPUCacheInfo) {
> + /* CPUID 0x4.0x0.EAX */
> + .type = DATA_CACHE,
> + .level = 1,
> + .self_init = true,
> +
> + /* CPUID 0x4.0x0.EBX */
> + .line_size = 64,
> + .partitions = 1,
> + .associativity = 12,
> +
> + /* CPUID 0x4.0x0.ECX */
> + .sets = 64,
> +
> + /* CPUID 0x4.0x0.EDX */
> + .no_invd_sharing = false,
> + .inclusive = false,
> + .complex_indexing = false,
> +
> + .size = 48 * KiB,
> + .share_level = CPU_TOPOLOGY_LEVEL_CORE,
> + },
> + .l1i_cache = &(CPUCacheInfo) {
> + /* CPUID 0x4.0x1.EAX */
> + .type = INSTRUCTION_CACHE,
> + .level = 1,
> + .self_init = true,
> +
> + /* CPUID 0x4.0x1.EBX */
> + .line_size = 64,
> + .partitions = 1,
> + .associativity = 8,
> +
> + /* CPUID 0x4.0x1.ECX */
> + .sets = 64,
> +
> + /* CPUID 0x4.0x1.EDX */
> + .no_invd_sharing = false,
> + .inclusive = false,
> + .complex_indexing = false,
> +
> + .size = 32 * KiB,
> + .share_level = CPU_TOPOLOGY_LEVEL_CORE,
> + },
> + .l2_cache = &(CPUCacheInfo) {
> + /* CPUID 0x4.0x2.EAX */
> + .type = UNIFIED_CACHE,
> + .level = 2,
> + .self_init = true,
> +
> + /* CPUID 0x4.0x2.EBX */
> + .line_size = 64,
> + .partitions = 1,
> + .associativity = 16,
> +
> + /* CPUID 0x4.0x2.ECX */
> + .sets = 2048,
> +
> + /* CPUID 0x4.0x2.EDX */
> + .no_invd_sharing = false,
> + .inclusive = false,
> + .complex_indexing = false,
> +
> + .size = 2 * MiB,
> + .share_level = CPU_TOPOLOGY_LEVEL_CORE,
> + },
> + .l3_cache = &(CPUCacheInfo) {
> + /* CPUID 0x4.0x3.EAX */
> + .type = UNIFIED_CACHE,
> + .level = 3,
> + .self_init = true,
> +
> + /* CPUID 0x4.0x3.EBX */
> + .line_size = 64,
> + .partitions = 1,
> + .associativity = 15,
> +
> + /* CPUID 0x4.0x3.ECX */
> + .sets = 65536,
> +
> + /* CPUID 0x4.0x3.EDX */
> + .no_invd_sharing = false,
> + .inclusive = false,
> + .complex_indexing = true,
> +
> + .size = 60 * MiB,
> + .share_level = CPU_TOPOLOGY_LEVEL_SOCKET,
> + },
> +};
> +
> static const CPUCaches xeon_gnr_cache_info = {
> .l1d_cache = &(CPUCacheInfo) {
> /* CPUID 0x4.0x0.EAX */
> @@ -4892,6 +4983,11 @@ static const X86CPUDefinition builtin_x86_defs[] = {
> { /* end of list */ }
> }
> },
> + {
> + .version = 4,
> + .note = "with spr-sp cache model",
> + .cache_info = &xeon_spr_cache_info,
> + },
> { /* end of list */ }
> }
> },
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 5/8] i386/cpu: Add a "x-force-cpuid-0x1f" property
2025-06-26 8:31 ` [PATCH 5/8] i386/cpu: Add a "x-force-cpuid-0x1f" property Zhao Liu
2025-06-26 12:07 ` Ewan Hai
@ 2025-07-04 3:38 ` Mi, Dapeng
1 sibling, 0 replies; 26+ messages in thread
From: Mi, Dapeng @ 2025-07-04 3:38 UTC (permalink / raw)
To: Zhao Liu, Paolo Bonzini, Daniel P . Berrangé, Igor Mammedov,
Eduardo Habkost
Cc: Ewan Hai, Jason Zeng, Xiaoyao Li, Tao Su, Yi Lai, Dapeng Mi,
Tejus GK, Manish Mishra, qemu-devel
On 6/26/2025 4:31 PM, Zhao Liu wrote:
> From: Manish Mishra <manish.mishra@nutanix.com>
>
> Add a "x-force-cpuid-0x1f" property so that CPU models can enable it and
> have 0x1f CPUID leaf natually as the Host CPU.
>
> The advantage is that when the CPU model's cache model is already
> consistent with the Host CPU (for example, SRF defaults to L2 per
> module & L3 per package), 0x1f can better help users identify the
> topology in the VM.
>
> Adding 0x1f for specific CPU models should not cause any trouble in
> principle. This property is only enabled for CPU models that already
> have the 0x1f leaf on the Host, so software that originally runs normally
> on the Host won't encounter issues in the Guest with the corresponding
> CPU model. Conversely, some software that relies on checking 0x1f might
> have problems in the Guest due to the lack of 0x1f [*]. In
> summary, adding 0x1f is also intended to further emulate the Host CPU
> environment.
>
> [*]: https://lore.kernel.org/qemu-devel/PH0PR02MB738410511BF51B12DB09BE6CF6AC2@PH0PR02MB7384.namprd02.prod.outlook.com/
>
> Signed-off-by: Manish Mishra <manish.mishra@nutanix.com>
> Co-authored-by: Xiaoyao Li <xiaoyao.li@intel.com>
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> [Integrated and rebased 2 previous patches (ordered by post time)]
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
> Note:
> This patch integrates the idea from 2 previously posted patches (ordered
> by post time)[1] [2], following the s-o-b policy of "Re-starting
> abandoned work" in docs/devel/code-provenance.rst.
>
> [1]: From Manish: https://lore.kernel.org/qemu-devel/20240722101859.47408-1-manish.mishra@nutanix.com/
> [2]: From Xiaoyao: https://lore.kernel.org/qemu-devel/20240813033145.279307-1-xiaoyao.li@intel.com/
> ---
> Changes since RFC:
> * Rebase and rename the property as "x-force-cpuid-0x1f". (Igor)
> ---
> target/i386/cpu.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 08c84ba90f52..ee36f7ee2ccc 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -9934,6 +9934,7 @@ static const Property x86_cpu_properties[] = {
> DEFINE_PROP_BOOL("x-intel-pt-auto-level", X86CPU, intel_pt_auto_level,
> true),
> DEFINE_PROP_BOOL("x-l1-cache-per-thread", X86CPU, l1_cache_per_core, true),
> + DEFINE_PROP_BOOL("x-force-cpuid-0x1f", X86CPU, force_cpuid_0x1f, false),
> };
>
> #ifndef CONFIG_USER_ONLY
LGTM.
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 6/8] i386/cpu: Enable 0x1f leaf for SierraForest by default
2025-06-26 8:31 ` [PATCH 6/8] i386/cpu: Enable 0x1f leaf for SierraForest by default Zhao Liu
@ 2025-07-04 3:45 ` Mi, Dapeng
0 siblings, 0 replies; 26+ messages in thread
From: Mi, Dapeng @ 2025-07-04 3:45 UTC (permalink / raw)
To: Zhao Liu, Paolo Bonzini, Daniel P . Berrangé, Igor Mammedov,
Eduardo Habkost
Cc: Ewan Hai, Jason Zeng, Xiaoyao Li, Tao Su, Yi Lai, Dapeng Mi,
Tejus GK, Manish Mishra, qemu-devel
On 6/26/2025 4:31 PM, Zhao Liu wrote:
> The Host SierraForest CPU has the 0x1f leaf by default, so enable it for
> the Guest CPU by default as well.
>
> Suggested-by: Igor Mammedov <imammedo@redhat.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
> Changes since RFC:
> * Rename the property to "x-force-cpuid-0x1f". (Igor)
> ---
> target/i386/cpu.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index ee36f7ee2ccc..70f8fc37f8e0 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -5392,8 +5392,11 @@ static const X86CPUDefinition builtin_x86_defs[] = {
> },
> {
> .version = 3,
> - .note = "with srf-sp cache model",
> + .note = "with srf-sp cache model and 0x1f leaf",
Since 0x1f contains multiple sub-leaves, it would be better to change the
wording to "0x1f leaves".
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> .cache_info = &xeon_srf_cache_info,
> + .props = (PropValue[]) {
> + { "x-force-cpuid-0x1f", "on" },
> + }
> },
> { /* end of list */ },
> },
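Once the leaf is exposed, the guest-side view is easy to sanity-check;
a minimal sketch (illustrative only, using GCC's <cpuid.h>) that walks
the 0x1f sub-leaves until the level type reads 0:

#include <cpuid.h>
#include <stdio.h>

/*
 * CPUID leaf 0x1f: for each sub-leaf, ECX[15:8] is the level type
 * (0 means invalid/no more levels), EAX[4:0] the x2APIC ID shift for
 * that level, EBX[15:0] the number of logical processors at it.
 */
int main(void)
{
    unsigned int eax, ebx, ecx, edx;

    for (unsigned int subleaf = 0; ; subleaf++) {
        if (!__get_cpuid_count(0x1f, subleaf, &eax, &ebx, &ecx, &edx)) {
            puts("leaf 0x1f not available");
            return 1;
        }
        unsigned int level_type = (ecx >> 8) & 0xff;
        if (level_type == 0) {
            break;                  /* end of the topology levels */
        }
        printf("sub-leaf %u: level type %u, shift %u, logical CPUs %u\n",
               subleaf, level_type, eax & 0x1f, ebx & 0xffff);
    }
    return 0;
}
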
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 7/8] i386/cpu: Enable 0x1f leaf for GraniteRapids by default
2025-06-26 8:31 ` [PATCH 7/8] i386/cpu: Enable 0x1f leaf for GraniteRapids " Zhao Liu
@ 2025-07-04 3:47 ` Mi, Dapeng
0 siblings, 0 replies; 26+ messages in thread
From: Mi, Dapeng @ 2025-07-04 3:47 UTC (permalink / raw)
To: Zhao Liu, Paolo Bonzini, Daniel P . Berrangé, Igor Mammedov,
Eduardo Habkost
Cc: Ewan Hai, Jason Zeng, Xiaoyao Li, Tao Su, Yi Lai, Dapeng Mi,
Tejus GK, Manish Mishra, qemu-devel
On 6/26/2025 4:31 PM, Zhao Liu wrote:
> The Host GraniteRapids CPU has the 0x1f leaf by default, so enable it for
> the Guest CPU by default as well.
>
> Suggested-by: Igor Mammedov <imammedo@redhat.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
> Changes since RFC:
> * Rename the property to "x-force-cpuid-0x1f". (Igor)
> ---
> target/i386/cpu.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 70f8fc37f8e0..acf7e0de184d 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -5242,8 +5242,11 @@ static const X86CPUDefinition builtin_x86_defs[] = {
> },
> {
> .version = 3,
> - .note = "with gnr-sp cache model",
> + .note = "with gnr-sp cache model and 0x1f leaf",
ditto.
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> .cache_info = &xeon_gnr_cache_info,
> + .props = (PropValue[]) {
> + { "x-force-cpuid-0x1f", "on" },
> + }
> },
> { /* end of list */ },
> },
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 8/8] i386/cpu: Enable 0x1f leaf for SapphireRapids by default
2025-06-26 8:31 ` [PATCH 8/8] i386/cpu: Enable 0x1f leaf for SapphireRapids " Zhao Liu
@ 2025-07-04 3:48 ` Mi, Dapeng
0 siblings, 0 replies; 26+ messages in thread
From: Mi, Dapeng @ 2025-07-04 3:48 UTC (permalink / raw)
To: Zhao Liu, Paolo Bonzini, Daniel P . Berrangé, Igor Mammedov,
Eduardo Habkost
Cc: Ewan Hai, Jason Zeng, Xiaoyao Li, Tao Su, Yi Lai, Dapeng Mi,
Tejus GK, Manish Mishra, qemu-devel
On 6/26/2025 4:31 PM, Zhao Liu wrote:
> The Host SapphireRapids CPU has the 0x1f leaf by default, so enable it for
> the Guest CPU by default as well.
>
> Suggested-by: Igor Mammedov <imammedo@redhat.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
> Changes since RFC:
> * Rename the property to "x-force-cpuid-0x1f". (Igor)
> ---
> target/i386/cpu.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index acf7e0de184d..c7f157a0f71c 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -5084,8 +5084,11 @@ static const X86CPUDefinition builtin_x86_defs[] = {
> },
> {
> .version = 4,
> - .note = "with spr-sp cache model",
> + .note = "with spr-sp cache model and 0x1f leaf",
ditto.
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> .cache_info = &xeon_spr_cache_info,
> + .props = (PropValue[]) {
> + { "x-force-cpuid-0x1f", "on" },
> + }
> },
> { /* end of list */ }
> }
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 1/8] i386/cpu: Introduce cache model for SierraForest
2025-06-26 8:30 ` [PATCH 1/8] i386/cpu: Introduce cache model for SierraForest Zhao Liu
2025-07-04 3:33 ` Mi, Dapeng
@ 2025-07-07 0:57 ` Tao Su
1 sibling, 0 replies; 26+ messages in thread
From: Tao Su @ 2025-07-07 0:57 UTC (permalink / raw)
To: Zhao Liu
Cc: Paolo Bonzini, Daniel P . Berrangé, Igor Mammedov,
Eduardo Habkost, Ewan Hai, Jason Zeng, Xiaoyao Li, Tao Su, Yi Lai,
Dapeng Mi, Tejus GK, Manish Mishra, qemu-devel
On Thu, Jun 26, 2025 at 04:30:58PM +0800, Zhao Liu wrote:
> Add the cache model to SierraForest (v3) to better emulate its
> environment.
>
> The cache model is based on SierraForest-SP (Scalable Performance):
>
> --- cache 0 ---
> cache type = data cache (1)
> cache level = 0x1 (1)
> self-initializing cache level = true
> fully associative cache = false
> maximum IDs for CPUs sharing cache = 0x0 (0)
> maximum IDs for cores in pkg = 0x3f (63)
> system coherency line size = 0x40 (64)
> physical line partitions = 0x1 (1)
> ways of associativity = 0x8 (8)
> number of sets = 0x40 (64)
> WBINVD/INVD acts on lower caches = false
> inclusive to lower caches = false
> complex cache indexing = false
> number of sets (s) = 64
> (size synth) = 32768 (32 KB)
> --- cache 1 ---
> cache type = instruction cache (2)
> cache level = 0x1 (1)
> self-initializing cache level = true
> fully associative cache = false
> maximum IDs for CPUs sharing cache = 0x0 (0)
> maximum IDs for cores in pkg = 0x3f (63)
> system coherency line size = 0x40 (64)
> physical line partitions = 0x1 (1)
> ways of associativity = 0x8 (8)
> number of sets = 0x80 (128)
> WBINVD/INVD acts on lower caches = false
> inclusive to lower caches = false
> complex cache indexing = false
> number of sets (s) = 128
> (size synth) = 65536 (64 KB)
> --- cache 2 ---
> cache type = unified cache (3)
> cache level = 0x2 (2)
> self-initializing cache level = true
> fully associative cache = false
> maximum IDs for CPUs sharing cache = 0x7 (7)
> maximum IDs for cores in pkg = 0x3f (63)
> system coherency line size = 0x40 (64)
> physical line partitions = 0x1 (1)
> ways of associativity = 0x10 (16)
> number of sets = 0x1000 (4096)
> WBINVD/INVD acts on lower caches = false
> inclusive to lower caches = false
> complex cache indexing = false
> number of sets (s) = 4096
> (size synth) = 4194304 (4 MB)
> --- cache 3 ---
> cache type = unified cache (3)
> cache level = 0x3 (3)
> self-initializing cache level = true
> fully associative cache = false
> maximum IDs for CPUs sharing cache = 0x1ff (511)
> maximum IDs for cores in pkg = 0x3f (63)
> system coherency line size = 0x40 (64)
> physical line partitions = 0x1 (1)
> ways of associativity = 0xc (12)
> number of sets = 0x24000 (147456)
> WBINVD/INVD acts on lower caches = false
> inclusive to lower caches = false
> complex cache indexing = true
> number of sets (s) = 147456
> (size synth) = 113246208 (108 MB)
> --- cache 4 ---
> cache type = no more caches (0)
>
> Suggested-by: Tejus GK <tejus.gk@nutanix.com>
> Suggested-by: Jason Zeng <jason.zeng@intel.com>
> Suggested-by: "Daniel P . Berrangé" <berrange@redhat.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
> target/i386/cpu.c | 96 +++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 96 insertions(+)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 28e5b7859fef..fcaa2625b023 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -2883,6 +2883,97 @@ static const CPUCaches epyc_turin_cache_info = {
> .no_invd_sharing = true,
> .complex_indexing = false,
> .share_level = CPU_TOPOLOGY_LEVEL_DIE,
> + }
> +};
> +
> +static const CPUCaches xeon_srf_cache_info = {
> + .l1d_cache = &(CPUCacheInfo) {
> + /* CPUID 0x4.0x0.EAX */
> + .type = DATA_CACHE,
> + .level = 1,
> + .self_init = true,
> +
> + /* CPUID 0x4.0x0.EBX */
> + .line_size = 64,
> + .partitions = 1,
> + .associativity = 8,
> +
> + /* CPUID 0x4.0x0.ECX */
> + .sets = 64,
> +
> + /* CPUID 0x4.0x0.EDX */
> + .no_invd_sharing = false,
> + .inclusive = false,
> + .complex_indexing = false,
> +
> + .size = 32 * KiB,
> + .share_level = CPU_TOPOLOGY_LEVEL_CORE,
> + },
> + .l1i_cache = &(CPUCacheInfo) {
> + /* CPUID 0x4.0x1.EAX */
> + .type = INSTRUCTION_CACHE,
> + .level = 1,
> + .self_init = true,
> +
> + /* CPUID 0x4.0x1.EBX */
> + .line_size = 64,
> + .partitions = 1,
> + .associativity = 8,
> +
> + /* CPUID 0x4.0x1.ECX */
> + .sets = 128,
> +
> + /* CPUID 0x4.0x1.EDX */
> + .no_invd_sharing = false,
> + .inclusive = false,
> + .complex_indexing = false,
> +
> + .size = 64 * KiB,
> + .share_level = CPU_TOPOLOGY_LEVEL_CORE,
> + },
> + .l2_cache = &(CPUCacheInfo) {
> + /* CPUID 0x4.0x2.EAX */
> + .type = UNIFIED_CACHE,
> + .level = 2,
> + .self_init = true,
> +
> + /* CPUID 0x4.0x2.EBX */
> + .line_size = 64,
> + .partitions = 1,
> + .associativity = 16,
> +
> + /* CPUID 0x4.0x2.ECX */
> + .sets = 4096,
> +
> + /* CPUID 0x4.0x2.EDX */
> + .no_invd_sharing = false,
> + .inclusive = false,
> + .complex_indexing = false,
> +
> + .size = 4 * MiB,
> + .share_level = CPU_TOPOLOGY_LEVEL_MODULE,
> + },
> + .l3_cache = &(CPUCacheInfo) {
> + /* CPUID 0x4.0x3.EAX */
> + .type = UNIFIED_CACHE,
> + .level = 3,
> + .self_init = true,
> +
> + /* CPUID 0x4.0x3.EBX */
> + .line_size = 64,
> + .partitions = 1,
> + .associativity = 12,
> +
> + /* CPUID 0x4.0x3.ECX */
> + .sets = 147456,
> +
> + /* CPUID 0x4.0x3.EDX */
> + .no_invd_sharing = false,
> + .inclusive = false,
> + .complex_indexing = true,
> +
> + .size = 108 * MiB,
> + .share_level = CPU_TOPOLOGY_LEVEL_SOCKET,
> },
> };
>
> @@ -5008,6 +5099,11 @@ static const X86CPUDefinition builtin_x86_defs[] = {
> { /* end of list */ }
> }
> },
> + {
> + .version = 3,
> + .note = "with srf-sp cache model",
> + .cache_info = &xeon_srf_cache_info,
> + },
> { /* end of list */ },
> },
> },
Reviewed-by: Tao Su <tao1.su@linux.intel.com>
> --
> 2.34.1
>
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 2/8] i386/cpu: Introduce cache model for GraniteRapids
2025-06-26 8:30 ` [PATCH 2/8] i386/cpu: Introduce cache model for GraniteRapids Zhao Liu
2025-07-04 3:34 ` Mi, Dapeng
@ 2025-07-07 0:58 ` Tao Su
1 sibling, 0 replies; 26+ messages in thread
From: Tao Su @ 2025-07-07 0:58 UTC (permalink / raw)
To: Zhao Liu
Cc: Paolo Bonzini, Daniel P . Berrangé, Igor Mammedov,
Eduardo Habkost, Ewan Hai, Jason Zeng, Xiaoyao Li, Tao Su, Yi Lai,
Dapeng Mi, Tejus GK, Manish Mishra, qemu-devel
On Thu, Jun 26, 2025 at 04:30:59PM +0800, Zhao Liu wrote:
> Add the cache model to GraniteRapids (v3) to better emulate its
> environment.
>
> The cache model is based on GraniteRapids-SP (Scalable Performance):
>
> --- cache 0 ---
> cache type = data cache (1)
> cache level = 0x1 (1)
> self-initializing cache level = true
> fully associative cache = false
> maximum IDs for CPUs sharing cache = 0x1 (1)
> maximum IDs for cores in pkg = 0x3f (63)
> system coherency line size = 0x40 (64)
> physical line partitions = 0x1 (1)
> ways of associativity = 0xc (12)
> number of sets = 0x40 (64)
> WBINVD/INVD acts on lower caches = false
> inclusive to lower caches = false
> complex cache indexing = false
> number of sets (s) = 64
> (size synth) = 49152 (48 KB)
> --- cache 1 ---
> cache type = instruction cache (2)
> cache level = 0x1 (1)
> self-initializing cache level = true
> fully associative cache = false
> maximum IDs for CPUs sharing cache = 0x1 (1)
> maximum IDs for cores in pkg = 0x3f (63)
> system coherency line size = 0x40 (64)
> physical line partitions = 0x1 (1)
> ways of associativity = 0x10 (16)
> number of sets = 0x40 (64)
> WBINVD/INVD acts on lower caches = false
> inclusive to lower caches = false
> complex cache indexing = false
> number of sets (s) = 64
> (size synth) = 65536 (64 KB)
> --- cache 2 ---
> cache type = unified cache (3)
> cache level = 0x2 (2)
> self-initializing cache level = true
> fully associative cache = false
> maximum IDs for CPUs sharing cache = 0x1 (1)
> maximum IDs for cores in pkg = 0x3f (63)
> system coherency line size = 0x40 (64)
> physical line partitions = 0x1 (1)
> ways of associativity = 0x10 (16)
> number of sets = 0x800 (2048)
> WBINVD/INVD acts on lower caches = false
> inclusive to lower caches = false
> complex cache indexing = false
> number of sets (s) = 2048
> (size synth) = 2097152 (2 MB)
> --- cache 3 ---
> cache type = unified cache (3)
> cache level = 0x3 (3)
> self-initializing cache level = true
> fully associative cache = false
> maximum IDs for CPUs sharing cache = 0xff (255)
> maximum IDs for cores in pkg = 0x3f (63)
> system coherency line size = 0x40 (64)
> physical line partitions = 0x1 (1)
> ways of associativity = 0x10 (16)
> number of sets = 0x48000 (294912)
> WBINVD/INVD acts on lower caches = false
> inclusive to lower caches = false
> complex cache indexing = true
> number of sets (s) = 294912
> (size synth) = 301989888 (288 MB)
> --- cache 4 ---
> cache type = no more caches (0)
>
> Suggested-by: Tejus GK <tejus.gk@nutanix.com>
> Suggested-by: Jason Zeng <jason.zeng@intel.com>
> Suggested-by: "Daniel P . Berrangé" <berrange@redhat.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
> target/i386/cpu.c | 96 +++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 96 insertions(+)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index fcaa2625b023..b40f1a5b6648 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -2886,6 +2886,97 @@ static const CPUCaches epyc_turin_cache_info = {
> }
> };
>
> +static const CPUCaches xeon_gnr_cache_info = {
> + .l1d_cache = &(CPUCacheInfo) {
> + /* CPUID 0x4.0x0.EAX */
> + .type = DATA_CACHE,
> + .level = 1,
> + .self_init = true,
> +
> + /* CPUID 0x4.0x0.EBX */
> + .line_size = 64,
> + .partitions = 1,
> + .associativity = 12,
> +
> + /* CPUID 0x4.0x0.ECX */
> + .sets = 64,
> +
> + /* CPUID 0x4.0x0.EDX */
> + .no_invd_sharing = false,
> + .inclusive = false,
> + .complex_indexing = false,
> +
> + .size = 48 * KiB,
> + .share_level = CPU_TOPOLOGY_LEVEL_CORE,
> + },
> + .l1i_cache = &(CPUCacheInfo) {
> + /* CPUID 0x4.0x1.EAX */
> + .type = INSTRUCTION_CACHE,
> + .level = 1,
> + .self_init = true,
> +
> + /* CPUID 0x4.0x1.EBX */
> + .line_size = 64,
> + .partitions = 1,
> + .associativity = 16,
> +
> + /* CPUID 0x4.0x1.ECX */
> + .sets = 64,
> +
> + /* CPUID 0x4.0x1.EDX */
> + .no_invd_sharing = false,
> + .inclusive = false,
> + .complex_indexing = false,
> +
> + .size = 64 * KiB,
> + .share_level = CPU_TOPOLOGY_LEVEL_CORE,
> + },
> + .l2_cache = &(CPUCacheInfo) {
> + /* CPUID 0x4.0x2.EAX */
> + .type = UNIFIED_CACHE,
> + .level = 2,
> + .self_init = true,
> +
> + /* CPUID 0x4.0x2.EBX */
> + .line_size = 64,
> + .partitions = 1,
> + .associativity = 16,
> +
> + /* CPUID 0x4.0x2.ECX */
> + .sets = 2048,
> +
> + /* CPUID 0x4.0x2.EDX */
> + .no_invd_sharing = false,
> + .inclusive = false,
> + .complex_indexing = false,
> +
> + .size = 2 * MiB,
> + .share_level = CPU_TOPOLOGY_LEVEL_CORE,
> + },
> + .l3_cache = &(CPUCacheInfo) {
> + /* CPUID 0x4.0x3.EAX */
> + .type = UNIFIED_CACHE,
> + .level = 3,
> + .self_init = true,
> +
> + /* CPUID 0x4.0x3.EBX */
> + .line_size = 64,
> + .partitions = 1,
> + .associativity = 16,
> +
> + /* CPUID 0x4.0x3.ECX */
> + .sets = 294912,
> +
> + /* CPUID 0x4.0x3.EDX */
> + .no_invd_sharing = false,
> + .inclusive = false,
> + .complex_indexing = true,
> +
> + .size = 288 * MiB,
> + .share_level = CPU_TOPOLOGY_LEVEL_SOCKET,
> + },
> +};
> +
> static const CPUCaches xeon_srf_cache_info = {
> .l1d_cache = &(CPUCacheInfo) {
> /* CPUID 0x4.0x0.EAX */
> @@ -4954,6 +5045,11 @@ static const X86CPUDefinition builtin_x86_defs[] = {
> { /* end of list */ }
> }
> },
> + {
> + .version = 3,
> + .note = "with gnr-sp cache model",
> + .cache_info = &xeon_gnr_cache_info,
> + },
> { /* end of list */ },
> },
> },
Reviewed-by: Tao Su <tao1.su@linux.intel.com>
> --
> 2.34.1
>
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 3/8] i386/cpu: Introduce cache model for SapphireRapids
2025-06-26 8:31 ` [PATCH 3/8] i386/cpu: Introduce cache model for SapphireRapids Zhao Liu
2025-07-04 3:35 ` Mi, Dapeng
@ 2025-07-07 0:58 ` Tao Su
1 sibling, 0 replies; 26+ messages in thread
From: Tao Su @ 2025-07-07 0:58 UTC (permalink / raw)
To: Zhao Liu
Cc: Paolo Bonzini, Daniel P . Berrangé, Igor Mammedov,
Eduardo Habkost, Ewan Hai, Jason Zeng, Xiaoyao Li, Tao Su, Yi Lai,
Dapeng Mi, Tejus GK, Manish Mishra, qemu-devel
On Thu, Jun 26, 2025 at 04:31:00PM +0800, Zhao Liu wrote:
> Add the cache model to SapphireRapids (v4) to better emulate its
> environment.
>
> The cache model is based on SapphireRapids-SP (Scalable Performance):
>
> --- cache 0 ---
> cache type = data cache (1)
> cache level = 0x1 (1)
> self-initializing cache level = true
> fully associative cache = false
> maximum IDs for CPUs sharing cache = 0x1 (1)
> maximum IDs for cores in pkg = 0x3f (63)
> system coherency line size = 0x40 (64)
> physical line partitions = 0x1 (1)
> ways of associativity = 0xc (12)
> number of sets = 0x40 (64)
> WBINVD/INVD acts on lower caches = false
> inclusive to lower caches = false
> complex cache indexing = false
> number of sets (s) = 64
> (size synth) = 49152 (48 KB)
> --- cache 1 ---
> cache type = instruction cache (2)
> cache level = 0x1 (1)
> self-initializing cache level = true
> fully associative cache = false
> maximum IDs for CPUs sharing cache = 0x1 (1)
> maximum IDs for cores in pkg = 0x3f (63)
> system coherency line size = 0x40 (64)
> physical line partitions = 0x1 (1)
> ways of associativity = 0x8 (8)
> number of sets = 0x40 (64)
> WBINVD/INVD acts on lower caches = false
> inclusive to lower caches = false
> complex cache indexing = false
> number of sets (s) = 64
> (size synth) = 32768 (32 KB)
> --- cache 2 ---
> cache type = unified cache (3)
> cache level = 0x2 (2)
> self-initializing cache level = true
> fully associative cache = false
> maximum IDs for CPUs sharing cache = 0x1 (1)
> maximum IDs for cores in pkg = 0x3f (63)
> system coherency line size = 0x40 (64)
> physical line partitions = 0x1 (1)
> ways of associativity = 0x10 (16)
> number of sets = 0x800 (2048)
> WBINVD/INVD acts on lower caches = false
> inclusive to lower caches = false
> complex cache indexing = false
> number of sets (s) = 2048
> (size synth) = 2097152 (2 MB)
> --- cache 3 ---
> cache type = unified cache (3)
> cache level = 0x3 (3)
> self-initializing cache level = true
> fully associative cache = false
> maximum IDs for CPUs sharing cache = 0x7f (127)
> maximum IDs for cores in pkg = 0x3f (63)
> system coherency line size = 0x40 (64)
> physical line partitions = 0x1 (1)
> ways of associativity = 0xf (15)
> number of sets = 0x10000 (65536)
> WBINVD/INVD acts on lower caches = false
> inclusive to lower caches = false
> complex cache indexing = true
> number of sets (s) = 65536
> (size synth) = 62914560 (60 MB)
> --- cache 4 ---
> cache type = no more caches (0)
>
> Suggested-by: Tejus GK <tejus.gk@nutanix.com>
> Suggested-by: Jason Zeng <jason.zeng@intel.com>
> Suggested-by: "Daniel P . Berrangé" <berrange@redhat.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
> target/i386/cpu.c | 96 +++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 96 insertions(+)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index b40f1a5b6648..a7f2e5dd3fcb 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -2886,6 +2886,97 @@ static const CPUCaches epyc_turin_cache_info = {
> }
> };
>
> +static const CPUCaches xeon_spr_cache_info = {
> + .l1d_cache = &(CPUCacheInfo) {
> + /* CPUID 0x4.0x0.EAX */
> + .type = DATA_CACHE,
> + .level = 1,
> + .self_init = true,
> +
> + /* CPUID 0x4.0x0.EBX */
> + .line_size = 64,
> + .partitions = 1,
> + .associativity = 12,
> +
> + /* CPUID 0x4.0x0.ECX */
> + .sets = 64,
> +
> + /* CPUID 0x4.0x0.EDX */
> + .no_invd_sharing = false,
> + .inclusive = false,
> + .complex_indexing = false,
> +
> + .size = 48 * KiB,
> + .share_level = CPU_TOPOLOGY_LEVEL_CORE,
> + },
> + .l1i_cache = &(CPUCacheInfo) {
> + /* CPUID 0x4.0x1.EAX */
> + .type = INSTRUCTION_CACHE,
> + .level = 1,
> + .self_init = true,
> +
> + /* CPUID 0x4.0x1.EBX */
> + .line_size = 64,
> + .partitions = 1,
> + .associativity = 8,
> +
> + /* CPUID 0x4.0x1.ECX */
> + .sets = 64,
> +
> + /* CPUID 0x4.0x1.EDX */
> + .no_invd_sharing = false,
> + .inclusive = false,
> + .complex_indexing = false,
> +
> + .size = 32 * KiB,
> + .share_level = CPU_TOPOLOGY_LEVEL_CORE,
> + },
> + .l2_cache = &(CPUCacheInfo) {
> + /* CPUID 0x4.0x2.EAX */
> + .type = UNIFIED_CACHE,
> + .level = 2,
> + .self_init = true,
> +
> + /* CPUID 0x4.0x2.EBX */
> + .line_size = 64,
> + .partitions = 1,
> + .associativity = 16,
> +
> + /* CPUID 0x4.0x2.ECX */
> + .sets = 2048,
> +
> + /* CPUID 0x4.0x2.EDX */
> + .no_invd_sharing = false,
> + .inclusive = false,
> + .complex_indexing = false,
> +
> + .size = 2 * MiB,
> + .share_level = CPU_TOPOLOGY_LEVEL_CORE,
> + },
> + .l3_cache = &(CPUCacheInfo) {
> + /* CPUID 0x4.0x3.EAX */
> + .type = UNIFIED_CACHE,
> + .level = 3,
> + .self_init = true,
> +
> + /* CPUID 0x4.0x3.EBX */
> + .line_size = 64,
> + .partitions = 1,
> + .associativity = 15,
> +
> + /* CPUID 0x4.0x3.ECX */
> + .sets = 65536,
> +
> + /* CPUID 0x4.0x3.EDX */
> + .no_invd_sharing = false,
> + .inclusive = false,
> + .complex_indexing = true,
> +
> + .size = 60 * MiB,
> + .share_level = CPU_TOPOLOGY_LEVEL_SOCKET,
> + },
> +};
> +
> static const CPUCaches xeon_gnr_cache_info = {
> .l1d_cache = &(CPUCacheInfo) {
> /* CPUID 0x4.0x0.EAX */
> @@ -4892,6 +4983,11 @@ static const X86CPUDefinition builtin_x86_defs[] = {
> { /* end of list */ }
> }
> },
> + {
> + .version = 4,
> + .note = "with spr-sp cache model",
> + .cache_info = &xeon_spr_cache_info,
> + },
> { /* end of list */ }
> }
> },
Reviewed-by: Tao Su <tao1.su@linux.intel.com>
> --
> 2.34.1
>
>
^ permalink raw reply [flat|nested] 26+ messages in thread
end of thread, other threads:[~2025-07-07 1:07 UTC | newest]
Thread overview: 26+ messages
2025-06-26 8:30 [PATCH 0/8] i386/cpu: Intel cache model & topo CPUID enhencement Zhao Liu
2025-06-26 8:30 ` [PATCH 1/8] i386/cpu: Introduce cache model for SierraForest Zhao Liu
2025-07-04 3:33 ` Mi, Dapeng
2025-07-07 0:57 ` Tao Su
2025-06-26 8:30 ` [PATCH 2/8] i386/cpu: Introduce cache model for GraniteRapids Zhao Liu
2025-07-04 3:34 ` Mi, Dapeng
2025-07-07 0:58 ` Tao Su
2025-06-26 8:31 ` [PATCH 3/8] i386/cpu: Introduce cache model for SapphireRapids Zhao Liu
2025-07-04 3:35 ` Mi, Dapeng
2025-07-07 0:58 ` Tao Su
2025-06-26 8:31 ` [PATCH 4/8] i386/cpu: Introduce cache model for YongFeng Zhao Liu
2025-06-29 9:47 ` Ewan Hai
2025-07-02 6:35 ` Zhao Liu
2025-07-02 9:35 ` Ewan Hai
2025-06-26 8:31 ` [PATCH 5/8] i386/cpu: Add a "x-force-cpuid-0x1f" property Zhao Liu
2025-06-26 12:07 ` Ewan Hai
2025-06-27 3:05 ` Zhao Liu
2025-06-27 6:48 ` Ewan Hai
2025-06-27 10:00 ` Zhao Liu
2025-07-04 3:38 ` Mi, Dapeng
2025-06-26 8:31 ` [PATCH 6/8] i386/cpu: Enable 0x1f leaf for SierraForest by default Zhao Liu
2025-07-04 3:45 ` Mi, Dapeng
2025-06-26 8:31 ` [PATCH 7/8] i386/cpu: Enable 0x1f leaf for GraniteRapids " Zhao Liu
2025-07-04 3:47 ` Mi, Dapeng
2025-06-26 8:31 ` [PATCH 8/8] i386/cpu: Enable 0x1f leaf for SapphireRapids " Zhao Liu
2025-07-04 3:48 ` Mi, Dapeng