[PATCH v4 0/7] Add EPYC-Genoa model and update previous EPYC Models


* [PATCH v4 0/7] Add EPYC-Genoa model and update previous EPYC Models
@ 2023-05-04 20:53 Babu Moger
  2023-05-04 20:53 ` [PATCH v4 1/7] target/i386: allow versioned CPUs to specify new cache_info Babu Moger
                   ` (7 more replies)
  0 siblings, 8 replies; 18+ messages in thread
From: Babu Moger @ 2023-05-04 20:53 UTC
  To: pbonzini, richard.henderson
  Cc: weijiang.yang, philmd, dwmw, paul, joao.m.martins, qemu-devel,
	mtosatti, kvm, mst, marcel.apfelbaum, yang.zhong, jing2.liu,
	vkuznets, michael.roth, wei.huang2, berrange, babu.moger, bdas

This series updates the AMD EPYC models and adds a new EPYC-Genoa model.

The series makes the following changes:
a. Allow versioned CPUs to specify new cache_info pointers.
b. Add EPYC-v4, EPYC-Rome-v3 and EPYC-Milan-v2, which fix
   cache_info.complex_indexing.
c. Add a few missing feature bits to EPYC-Milan-v2.
d. Add a CPU model for the AMD EPYC Genoa processor series.

This series depends on the following recent kernel commits:
8c19b6f257fa ("KVM: x86: Propagate the AMD Automatic IBRS feature to the guest")
e7862eda309e ("x86/cpu: Support AMD Automatic IBRS")
5b909d4ae59a ("x86/cpu, kvm: Add the Null Selector Clears Base feature")
a9dc9ec5a1fa ("x86/cpu, kvm: Add the NO_NESTED_DATA_BP feature")
0977cfac6e76 ("KVM: nSVM: Implement support for nested VNMI")
fa4c027a7956 ("KVM: x86: Add support for SVM's Virtual NMI")
---
v4:
  Minor text changes and function name change in patch1 (Robert Hoo).

v3:
  Refreshed the patches on top of latest master.
  Add CPU model for AMD EPYC Genoa processor series (zen4)
  
v2:
  Refreshed the patches on top of latest master.
  Changed the feature NULL_SELECT_CLEARS_BASE to NULL_SEL_CLR_BASE to
  match the kernel name.
  https://lore.kernel.org/kvm/20221205233235.622491-3-kim.phillips@amd.com/

v1: https://lore.kernel.org/kvm/167001034454.62456.7111414518087569436.stgit@bmoger-ubuntu/
v2: https://lore.kernel.org/kvm/20230106185700.28744-1-babu.moger@amd.com/
v3: https://lore.kernel.org/kvm/20230424163401.23018-1-babu.moger@amd.com/

Babu Moger (5):
  target/i386: Add a couple of feature bits in 8000_0008_EBX
  target/i386: Add feature bits for CPUID_Fn80000021_EAX
  target/i386: Add missing feature bits in EPYC-Milan model
  target/i386: Add VNMI and automatic IBRS feature bits
  target/i386: Add EPYC-Genoa model to support Zen 4 processor series

Michael Roth (2):
  target/i386: allow versioned CPUs to specify new cache_info
  target/i386: Add new EPYC CPU versions with updated cache_info

 target/i386/cpu.c | 375 +++++++++++++++++++++++++++++++++++++++++++++-
 target/i386/cpu.h |  15 ++
 2 files changed, 384 insertions(+), 6 deletions(-)

-- 
2.34.1




* [PATCH v4 1/7] target/i386: allow versioned CPUs to specify new cache_info
  2023-05-04 20:53 [PATCH v4 0/7] Add EPYC-Genoa model and update previous EPYC Models Babu Moger
@ 2023-05-04 20:53 ` Babu Moger
  2023-05-04 20:53 ` [PATCH v4 2/7] target/i386: Add new EPYC CPU versions with updated cache_info Babu Moger
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Babu Moger @ 2023-05-04 20:53 UTC
  To: pbonzini, richard.henderson
  Cc: weijiang.yang, philmd, dwmw, paul, joao.m.martins, qemu-devel,
	mtosatti, kvm, mst, marcel.apfelbaum, yang.zhong, jing2.liu,
	vkuznets, michael.roth, wei.huang2, berrange, babu.moger, bdas

From: Michael Roth <michael.roth@amd.com>

New EPYC CPU versions require small changes to their cache_info.
Because the current QEMU x86 CPU definition does not support versioned
cache_info, we would have to declare a new CPU type for each such case.
To avoid the duplicated work, add "cache_info" to X86CPUVersionDefinition,
to allow new cache_info pointers to be specified for a new CPU version.
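
The lookup rule this enables can be sketched in isolation. The snippet below
is only an illustration: Caches, VersionDef and resolve_cache_info() are
hypothetical, reduced stand-ins (not the QEMU types or the function added in
this patch). It assumes a version inherits the most recent non-NULL
cache_info from the versions before it, and falls back to the base
definition otherwise.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for CPUCaches and
 * X86CPUVersionDefinition; only the fields needed here are kept. */
typedef struct Caches {
    unsigned l3_size_mib;
} Caches;

typedef struct VersionDef {
    int version;               /* 0 terminates the list */
    const Caches *cache_info;  /* NULL means "inherit from earlier" */
} VersionDef;

/*
 * Walk the version list up to and including `wanted`, remembering the
 * most recent non-NULL cache_info. Returns `base` when no version up to
 * `wanted` overrides it.
 */
static const Caches *resolve_cache_info(const Caches *base,
                                        const VersionDef *versions,
                                        int wanted)
{
    const Caches *result = base;
    const VersionDef *v;

    for (v = versions; v->version; v++) {
        if (v->cache_info) {
            result = v->cache_info;
        }
        if (v->version == wanted) {
            break;
        }
    }

    /* The requested version must exist in the list. */
    assert(v->version == wanted);
    return result;
}
```

With a list of versions 1..4 where only version 3 carries an override,
versions 1 and 2 resolve to the base table and versions 3 and 4 resolve to
the override.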

Co-developed-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
---
 target/i386/cpu.c | 35 ++++++++++++++++++++++++++++++++---
 1 file changed, 32 insertions(+), 3 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 6576287e5b..6e5d2779c9 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1598,6 +1598,7 @@ typedef struct X86CPUVersionDefinition {
     const char *alias;
     const char *note;
     PropValue *props;
+    const CPUCaches *const cache_info;
 } X86CPUVersionDefinition;
 
 /* Base definition for a CPU model */
@@ -5192,6 +5193,31 @@ static void x86_cpu_apply_version_props(X86CPU *cpu, X86CPUModel *model)
     assert(vdef->version == version);
 }
 
+static const CPUCaches *x86_cpu_get_versioned_cache_info(X86CPU *cpu,
+                                                         X86CPUModel *model)
+{
+    const X86CPUVersionDefinition *vdef;
+    X86CPUVersion version = x86_cpu_model_resolve_version(model);
+    const CPUCaches *cache_info = model->cpudef->cache_info;
+
+    if (version == CPU_VERSION_LEGACY) {
+        return cache_info;
+    }
+
+    for (vdef = x86_cpu_def_get_versions(model->cpudef); vdef->version; vdef++) {
+        if (vdef->cache_info) {
+            cache_info = vdef->cache_info;
+        }
+
+        if (vdef->version == version) {
+            break;
+        }
+    }
+
+    assert(vdef->version == version);
+    return cache_info;
+}
+
 /*
  * Load data from X86CPUDefinition into a X86CPU object.
  * Only for builtin_x86_defs models initialized with x86_register_cpudef_types.
@@ -5224,7 +5250,7 @@ static void x86_cpu_load_model(X86CPU *cpu, X86CPUModel *model)
     }
 
     /* legacy-cache defaults to 'off' if CPU model provides cache info */
-    cpu->legacy_cache = !def->cache_info;
+    cpu->legacy_cache = !x86_cpu_get_versioned_cache_info(cpu, model);
 
     env->features[FEAT_1_ECX] |= CPUID_EXT_HYPERVISOR;
 
@@ -6703,14 +6729,17 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
 
     /* Cache information initialization */
     if (!cpu->legacy_cache) {
-        if (!xcc->model || !xcc->model->cpudef->cache_info) {
+        const CPUCaches *cache_info =
+            x86_cpu_get_versioned_cache_info(cpu, xcc->model);
+
+        if (!xcc->model || !cache_info) {
             g_autofree char *name = x86_cpu_class_get_model_name(xcc);
             error_setg(errp,
                        "CPU model '%s' doesn't support legacy-cache=off", name);
             return;
         }
         env->cache_info_cpuid2 = env->cache_info_cpuid4 = env->cache_info_amd =
-            *xcc->model->cpudef->cache_info;
+            *cache_info;
     } else {
         /* Build legacy cache information */
         env->cache_info_cpuid2.l1d_cache = &legacy_l1d_cache;
-- 
2.34.1




* [PATCH v4 2/7] target/i386: Add new EPYC CPU versions with updated cache_info
  2023-05-04 20:53 [PATCH v4 0/7] Add EPYC-Genoa model and update previous EPYC Models Babu Moger
  2023-05-04 20:53 ` [PATCH v4 1/7] target/i386: allow versioned CPUs to specify new cache_info Babu Moger
@ 2023-05-04 20:53 ` Babu Moger
  2023-05-04 20:53 ` [PATCH v4 3/7] target/i386: Add a couple of feature bits in 8000_0008_EBX Babu Moger
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Babu Moger @ 2023-05-04 20:53 UTC
  To: pbonzini, richard.henderson
  Cc: weijiang.yang, philmd, dwmw, paul, joao.m.martins, qemu-devel,
	mtosatti, kvm, mst, marcel.apfelbaum, yang.zhong, jing2.liu,
	vkuznets, michael.roth, wei.huang2, berrange, babu.moger, bdas

From: Michael Roth <michael.roth@amd.com>

Introduce new EPYC CPU versions: EPYC-v4 and EPYC-Rome-v3.
The only difference vs. older models is an updated cache_info with
the 'complex_indexing' bit unset, since this bit is not currently
defined for AMD and may cause problems should it be used for
something else in the future. Setting this bit will also cause
CPUID validation failures when running SEV-SNP guests.
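
Since the new tables repeat the cache geometry and only clear
'complex_indexing', a quick sanity check of the numbers may be useful. The
sketch below assumes the usual relation size = line_size * associativity *
partitions * sets; CacheGeometry and cache_geometry_is_consistent() are
hypothetical names for illustration, and the values are the EPYC-v4 L3
values from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define KiB 1024u
#define MiB (1024u * KiB)

/* Hypothetical, reduced cache descriptor (not QEMU's CPUCacheInfo). */
typedef struct CacheGeometry {
    uint32_t size;           /* total size in bytes */
    uint16_t line_size;      /* bytes per line */
    uint16_t associativity;  /* ways */
    uint16_t partitions;
    uint32_t sets;
    bool complex_indexing;
} CacheGeometry;

/* True when the advertised size matches the product of the geometry. */
static bool cache_geometry_is_consistent(const CacheGeometry *c)
{
    uint64_t computed = (uint64_t)c->line_size * c->associativity *
                        c->partitions * c->sets;
    return computed == c->size;
}

/* L3 values of the EPYC-v4 table from this patch. */
static const CacheGeometry epyc_v4_l3 = {
    .size = 8 * MiB,
    .line_size = 64,
    .associativity = 16,
    .partitions = 1,
    .sets = 8192,
    .complex_indexing = false,
};
```

For the EPYC-v4 L3 this gives 64 * 16 * 1 * 8192 = 8 MiB, matching the
advertised size; the same check holds for the 16 MiB EPYC-Rome-v3 L3 with
16384 sets.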

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
---
 target/i386/cpu.c | 118 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 118 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 6e5d2779c9..6c20ce86d1 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1707,6 +1707,56 @@ static const CPUCaches epyc_cache_info = {
     },
 };
 
+static const CPUCaches epyc_v4_cache_info = {
+    .l1d_cache = &(CPUCacheInfo) {
+        .type = DATA_CACHE,
+        .level = 1,
+        .size = 32 * KiB,
+        .line_size = 64,
+        .associativity = 8,
+        .partitions = 1,
+        .sets = 64,
+        .lines_per_tag = 1,
+        .self_init = 1,
+        .no_invd_sharing = true,
+    },
+    .l1i_cache = &(CPUCacheInfo) {
+        .type = INSTRUCTION_CACHE,
+        .level = 1,
+        .size = 64 * KiB,
+        .line_size = 64,
+        .associativity = 4,
+        .partitions = 1,
+        .sets = 256,
+        .lines_per_tag = 1,
+        .self_init = 1,
+        .no_invd_sharing = true,
+    },
+    .l2_cache = &(CPUCacheInfo) {
+        .type = UNIFIED_CACHE,
+        .level = 2,
+        .size = 512 * KiB,
+        .line_size = 64,
+        .associativity = 8,
+        .partitions = 1,
+        .sets = 1024,
+        .lines_per_tag = 1,
+    },
+    .l3_cache = &(CPUCacheInfo) {
+        .type = UNIFIED_CACHE,
+        .level = 3,
+        .size = 8 * MiB,
+        .line_size = 64,
+        .associativity = 16,
+        .partitions = 1,
+        .sets = 8192,
+        .lines_per_tag = 1,
+        .self_init = true,
+        .inclusive = true,
+        .complex_indexing = false,
+    },
+};
+
 static const CPUCaches epyc_rome_cache_info = {
     .l1d_cache = &(CPUCacheInfo) {
         .type = DATA_CACHE,
@@ -1757,6 +1807,56 @@ static const CPUCaches epyc_rome_cache_info = {
     },
 };
 
+static const CPUCaches epyc_rome_v3_cache_info = {
+    .l1d_cache = &(CPUCacheInfo) {
+        .type = DATA_CACHE,
+        .level = 1,
+        .size = 32 * KiB,
+        .line_size = 64,
+        .associativity = 8,
+        .partitions = 1,
+        .sets = 64,
+        .lines_per_tag = 1,
+        .self_init = 1,
+        .no_invd_sharing = true,
+    },
+    .l1i_cache = &(CPUCacheInfo) {
+        .type = INSTRUCTION_CACHE,
+        .level = 1,
+        .size = 32 * KiB,
+        .line_size = 64,
+        .associativity = 8,
+        .partitions = 1,
+        .sets = 64,
+        .lines_per_tag = 1,
+        .self_init = 1,
+        .no_invd_sharing = true,
+    },
+    .l2_cache = &(CPUCacheInfo) {
+        .type = UNIFIED_CACHE,
+        .level = 2,
+        .size = 512 * KiB,
+        .line_size = 64,
+        .associativity = 8,
+        .partitions = 1,
+        .sets = 1024,
+        .lines_per_tag = 1,
+    },
+    .l3_cache = &(CPUCacheInfo) {
+        .type = UNIFIED_CACHE,
+        .level = 3,
+        .size = 16 * MiB,
+        .line_size = 64,
+        .associativity = 16,
+        .partitions = 1,
+        .sets = 16384,
+        .lines_per_tag = 1,
+        .self_init = true,
+        .inclusive = true,
+        .complex_indexing = false,
+    },
+};
+
 static const CPUCaches epyc_milan_cache_info = {
     .l1d_cache = &(CPUCacheInfo) {
         .type = DATA_CACHE,
@@ -4091,6 +4191,15 @@ static const X86CPUDefinition builtin_x86_defs[] = {
                     { /* end of list */ }
                 }
             },
+            {
+                .version = 4,
+                .props = (PropValue[]) {
+                    { "model-id",
+                      "AMD EPYC-v4 Processor" },
+                    { /* end of list */ }
+                },
+                .cache_info = &epyc_v4_cache_info
+            },
             { /* end of list */ }
         }
     },
@@ -4210,6 +4319,15 @@ static const X86CPUDefinition builtin_x86_defs[] = {
                     { /* end of list */ }
                 }
             },
+            {
+                .version = 3,
+                .props = (PropValue[]) {
+                    { "model-id",
+                      "AMD EPYC-Rome-v3 Processor" },
+                    { /* end of list */ }
+                },
+                .cache_info = &epyc_rome_v3_cache_info
+            },
             { /* end of list */ }
         }
     },
-- 
2.34.1




* [PATCH v4 3/7] target/i386: Add a couple of feature bits in 8000_0008_EBX
  2023-05-04 20:53 [PATCH v4 0/7] Add EPYC-Genoa model and update previous EPYC Models Babu Moger
  2023-05-04 20:53 ` [PATCH v4 1/7] target/i386: allow versioned CPUs to specify new cache_info Babu Moger
  2023-05-04 20:53 ` [PATCH v4 2/7] target/i386: Add new EPYC CPU versions with updated cache_info Babu Moger
@ 2023-05-04 20:53 ` Babu Moger
  2023-05-04 20:53 ` [PATCH v4 4/7] target/i386: Add feature bits for CPUID_Fn80000021_EAX Babu Moger
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Babu Moger @ 2023-05-04 20:53 UTC
  To: pbonzini, richard.henderson
  Cc: weijiang.yang, philmd, dwmw, paul, joao.m.martins, qemu-devel,
	mtosatti, kvm, mst, marcel.apfelbaum, yang.zhong, jing2.liu,
	vkuznets, michael.roth, wei.huang2, berrange, babu.moger, bdas

Add the following feature bits.

amd-psfd : Predictive Store Forwarding Disable:
           PSF is a hardware-based micro-architectural optimization
           designed to improve the performance of code execution by
           predicting address dependencies between loads and stores.
           While SSBD (Speculative Store Bypass Disable) disables both
           PSF and speculative store bypass, PSFD only disables PSF.
           PSFD may be desirable for the software which is concerned
           with the speculative behavior of PSF but desires a smaller
           performance impact than setting SSBD.
           Depends on the following kernel commit:
           b73a54321ad8 ("KVM: x86: Expose Predictive Store Forwarding Disable")

stibp-always-on :
           Single Thread Indirect Branch Prediction mode has enhanced
           performance and may be left always on.

The documentation for the features is available in the links below.
a. Processor Programming Reference (PPR) for AMD Family 19h Model 01h,
   Revision B1 Processors
b. SECURITY ANALYSIS OF AMD PREDICTIVE STORE FORWARDING
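
In the feature-name table, the position of a name is the bit number in
CPUID[8000_0008].EBX, which is why "stibp-always-on" lands in slot 17 and
"amd-psfd" in slot 28. A minimal sketch of that correspondence follows; the
table and both helper functions are hypothetical and list only the bits
whose positions appear in the cpu.h hunk of this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Names indexed by bit position; unlisted slots stay NULL.
 * Illustration only, not the full QEMU table. */
static const char *const ebx_8000_0008_names[32] = {
    [14] = "ibrs",
    [15] = "amd-stibp",
    [17] = "stibp-always-on",
    [24] = "amd-ssbd",
    [28] = "amd-psfd",
};

/* Return the bit number for a feature name, or -1 if unknown. */
static int feature_bit_by_name(const char *name)
{
    for (int bit = 0; bit < 32; bit++) {
        const char *candidate = ebx_8000_0008_names[bit];
        if (candidate && strcmp(candidate, name) == 0) {
            return bit;
        }
    }
    return -1;
}

/* Return the single-bit mask for a feature name, or 0 if unknown. */
static uint32_t feature_mask_by_name(const char *name)
{
    int bit = feature_bit_by_name(name);
    return bit < 0 ? 0 : (uint32_t)1 << bit;
}
```

The masks produced this way agree with the new definitions
CPUID_8000_0008_EBX_STIBP_ALWAYS_ON (1U << 17) and
CPUID_8000_0008_EBX_AMD_PSFD (1U << 28).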

Signed-off-by: Babu Moger <babu.moger@amd.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://www.amd.com/system/files/documents/security-analysis-predictive-store-forwarding.pdf
Link: https://www.amd.com/system/files/TechDocs/55898_B1_pub_0.50.zip
---
 target/i386/cpu.c | 4 ++--
 target/i386/cpu.h | 4 ++++
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 6c20ce86d1..1a79d224da 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -911,10 +911,10 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
             NULL, NULL, NULL, NULL,
             NULL, "wbnoinvd", NULL, NULL,
             "ibpb", NULL, "ibrs", "amd-stibp",
-            NULL, NULL, NULL, NULL,
+            NULL, "stibp-always-on", NULL, NULL,
             NULL, NULL, NULL, NULL,
             "amd-ssbd", "virt-ssbd", "amd-no-ssb", NULL,
-            NULL, NULL, NULL, NULL,
+            "amd-psfd", NULL, NULL, NULL,
         },
         .cpuid = { .eax = 0x80000008, .reg = R_EBX, },
         .tcg_features = 0,
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index d243e290d3..14645e3cb8 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -932,8 +932,12 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 #define CPUID_8000_0008_EBX_IBRS        (1U << 14)
 /* Single Thread Indirect Branch Predictors */
 #define CPUID_8000_0008_EBX_STIBP       (1U << 15)
+/* STIBP mode has enhanced performance and may be left always on */
+#define CPUID_8000_0008_EBX_STIBP_ALWAYS_ON    (1U << 17)
 /* Speculative Store Bypass Disable */
 #define CPUID_8000_0008_EBX_AMD_SSBD    (1U << 24)
+/* Predictive Store Forwarding Disable */
+#define CPUID_8000_0008_EBX_AMD_PSFD    (1U << 28)
 
 #define CPUID_XSAVE_XSAVEOPT   (1U << 0)
 #define CPUID_XSAVE_XSAVEC     (1U << 1)
-- 
2.34.1




* [PATCH v4 4/7] target/i386: Add feature bits for CPUID_Fn80000021_EAX
  2023-05-04 20:53 [PATCH v4 0/7] Add EPYC-Genoa model and update previous EPYC Models Babu Moger
                   ` (2 preceding siblings ...)
  2023-05-04 20:53 ` [PATCH v4 3/7] target/i386: Add a couple of feature bits in 8000_0008_EBX Babu Moger
@ 2023-05-04 20:53 ` Babu Moger
  2023-05-05  8:29   ` Paolo Bonzini
  2023-05-04 20:53 ` [PATCH v4 5/7] target/i386: Add missing feature bits in EPYC-Milan model Babu Moger
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 18+ messages in thread
From: Babu Moger @ 2023-05-04 20:53 UTC
  To: pbonzini, richard.henderson
  Cc: weijiang.yang, philmd, dwmw, paul, joao.m.martins, qemu-devel,
	mtosatti, kvm, mst, marcel.apfelbaum, yang.zhong, jing2.liu,
	vkuznets, michael.roth, wei.huang2, berrange, babu.moger, bdas

Add the following feature bits.
no-nested-data-bp         : Processor ignores nested data breakpoints.
lfence-always-serializing : LFENCE instruction is always serializing.
null-sel-clr-base         : Null Selector Clears Base. When this bit is
                            set, a null segment load clears the segment base.

The documentation for the features is available in the links below.
a. Processor Programming Reference (PPR) for AMD Family 19h Model 01h,
   Revision B1 Processors
b. AMD64 Architecture Programmer’s Manual Volumes 1–5, Publication No. 40332,
   Revision 4.05, October 2022
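
Because these bits live in a new extended leaf, the guest-visible maximum
extended level has to reach 0x80000021 whenever any of them is enabled. The
sketch below shows that rule under simple assumptions: adjust_min_level()
and min_xlevel_for_features() are hypothetical helpers written for this
example, and the bit positions are the ones defined in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions in CPUID[8000_0021].EAX as defined in this patch. */
#define EAX_8000_0021_NO_NESTED_DATA_BP          (1u << 0)
#define EAX_8000_0021_LFENCE_ALWAYS_SERIALIZING  (1u << 2)
#define EAX_8000_0021_NULL_SEL_CLR_BASE          (1u << 6)

/* Raise *min to `value` if it is currently lower; never lower it. */
static void adjust_min_level(uint32_t *min, uint32_t value)
{
    if (*min < value) {
        *min = value;
    }
}

/*
 * Return the minimum extended CPUID level needed so that leaf
 * 0x80000021 is reachable when any of its feature bits is set.
 */
static uint32_t min_xlevel_for_features(uint32_t current_min,
                                        uint32_t feat_8000_0021_eax)
{
    if (feat_8000_0021_eax) {
        adjust_min_level(&current_min, 0x80000021u);
    }
    return current_min;
}
```

A model whose xlevel is 0x8000001E therefore needs its minimum raised to
0x80000021 once any of the three bits is turned on, while a model with no
bits set in this leaf is left unchanged.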

Signed-off-by: Babu Moger <babu.moger@amd.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://www.amd.com/system/files/TechDocs/55898_B1_pub_0.50.zip
Link: https://www.amd.com/system/files/TechDocs/40332_4.05.pdf
---
 target/i386/cpu.c | 24 ++++++++++++++++++++++++
 target/i386/cpu.h |  8 ++++++++
 2 files changed, 32 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 1a79d224da..5c93c230e6 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -920,6 +920,22 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
         .tcg_features = 0,
         .unmigratable_flags = 0,
     },
+    [FEAT_8000_0021_EAX] = {
+        .type = CPUID_FEATURE_WORD,
+        .feat_names = {
+            "no-nested-data-bp", NULL, "lfence-always-serializing", NULL,
+            NULL, NULL, "null-sel-clr-base", NULL,
+            NULL, NULL, NULL, NULL,
+            NULL, NULL, NULL, NULL,
+            NULL, NULL, NULL, NULL,
+            NULL, NULL, NULL, NULL,
+            NULL, NULL, NULL, NULL,
+            NULL, NULL, NULL, NULL,
+        },
+        .cpuid = { .eax = 0x80000021, .reg = R_EAX, },
+        .tcg_features = 0,
+        .unmigratable_flags = 0,
+    },
     [FEAT_XSAVE] = {
         .type = CPUID_FEATURE_WORD,
         .feat_names = {
@@ -6135,6 +6151,10 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             *ebx |= sev_get_reduced_phys_bits() << 6;
         }
         break;
+    case 0x80000021:
+        *eax = env->features[FEAT_8000_0021_EAX];
+        *ebx = *ecx = *edx = 0;
+        break;
     default:
         /* reserved values: zero */
         *eax = 0;
@@ -6564,6 +6584,10 @@ void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
             x86_cpu_adjust_level(cpu, &env->cpuid_min_xlevel, 0x8000001F);
         }
 
+        if (env->features[FEAT_8000_0021_EAX]) {
+            x86_cpu_adjust_level(cpu, &env->cpuid_min_xlevel, 0x80000021);
+        }
+
         /* SGX requires CPUID[0x12] for EPC enumeration */
         if (env->features[FEAT_7_0_EBX] & CPUID_7_0_EBX_SGX) {
             x86_cpu_adjust_level(cpu, &env->cpuid_min_level, 0x12);
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 14645e3cb8..7cf811d8fe 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -600,6 +600,7 @@ typedef enum FeatureWord {
     FEAT_8000_0001_ECX, /* CPUID[8000_0001].ECX */
     FEAT_8000_0007_EDX, /* CPUID[8000_0007].EDX */
     FEAT_8000_0008_EBX, /* CPUID[8000_0008].EBX */
+    FEAT_8000_0021_EAX, /* CPUID[8000_0021].EAX */
     FEAT_C000_0001_EDX, /* CPUID[C000_0001].EDX */
     FEAT_KVM,           /* CPUID[4000_0001].EAX (KVM_CPUID_FEATURES) */
     FEAT_KVM_HINTS,     /* CPUID[4000_0001].EDX */
@@ -939,6 +940,13 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 /* Predictive Store Forwarding Disable */
 #define CPUID_8000_0008_EBX_AMD_PSFD    (1U << 28)
 
+/* Processor ignores nested data breakpoints */
+#define CPUID_8000_0021_EAX_No_NESTED_DATA_BP    (1U << 0)
+/* LFENCE is always serializing */
+#define CPUID_8000_0021_EAX_LFENCE_ALWAYS_SERIALIZING    (1U << 2)
+/* Null Selector Clears Base */
+#define CPUID_8000_0021_EAX_NULL_SEL_CLR_BASE    (1U << 6)
+
 #define CPUID_XSAVE_XSAVEOPT   (1U << 0)
 #define CPUID_XSAVE_XSAVEC     (1U << 1)
 #define CPUID_XSAVE_XGETBV1    (1U << 2)
-- 
2.34.1




* [PATCH v4 5/7] target/i386: Add missing feature bits in EPYC-Milan model
  2023-05-04 20:53 [PATCH v4 0/7] Add EPYC-Genoa model and update previous EPYC Models Babu Moger
                   ` (3 preceding siblings ...)
  2023-05-04 20:53 ` [PATCH v4 4/7] target/i386: Add feature bits for CPUID_Fn80000021_EAX Babu Moger
@ 2023-05-04 20:53 ` Babu Moger
  2023-05-04 20:53 ` [PATCH v4 6/7] target/i386: Add VNMI and automatic IBRS feature bits Babu Moger
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Babu Moger @ 2023-05-04 20:53 UTC
  To: pbonzini, richard.henderson
  Cc: weijiang.yang, philmd, dwmw, paul, joao.m.martins, qemu-devel,
	mtosatti, kvm, mst, marcel.apfelbaum, yang.zhong, jing2.liu,
	vkuznets, michael.roth, wei.huang2, berrange, babu.moger, bdas

Add the following feature bits to the EPYC-Milan model and bump the version.
vaes                      : Vector VAES(ENC|DEC), VAES(ENC|DEC)LAST instruction
                            support
vpclmulqdq                : Vector VPCLMULQDQ instruction support
stibp-always-on           : Single Thread Indirect Branch Prediction mode has
                            enhanced performance and may be left always on
amd-psfd                  : Predictive Store Forwarding Disable
no-nested-data-bp         : Processor ignores nested data breakpoints
lfence-always-serializing : LFENCE instruction is always serializing
null-sel-clr-base         : Null Selector Clears Base. When this bit is
                            set, a null segment load clears the segment base

These new features are added in EPYC-Milan-v2. The "-cpu help" output
after the change will be:

    x86 EPYC-Milan             (alias configured by machine type)
    x86 EPYC-Milan-v1          AMD EPYC-Milan Processor
    x86 EPYC-Milan-v2          AMD EPYC-Milan Processor

The documentation for the features is available in the links below.
a. Processor Programming Reference (PPR) for AMD Family 19h Model 01h,
   Revision B1 Processors
b. SECURITY ANALYSIS OF AMD PREDICTIVE STORE FORWARDING
c. AMD64 Architecture Programmer’s Manual Volumes 1–5, Publication No. 40332,
   Revision 4.05, October 2022

Signed-off-by: Babu Moger <babu.moger@amd.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://www.amd.com/system/files/TechDocs/55898_B1_pub_0.50.zip
Link: https://www.amd.com/system/files/documents/security-analysis-predictive-store-forwarding.pdf
Link: https://www.amd.com/system/files/TechDocs/40332_4.05.pdf
---
 target/i386/cpu.c | 70 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 5c93c230e6..0a6fb2fc82 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1923,6 +1923,56 @@ static const CPUCaches epyc_milan_cache_info = {
     },
 };
 
+static const CPUCaches epyc_milan_v2_cache_info = {
+    .l1d_cache = &(CPUCacheInfo) {
+        .type = DATA_CACHE,
+        .level = 1,
+        .size = 32 * KiB,
+        .line_size = 64,
+        .associativity = 8,
+        .partitions = 1,
+        .sets = 64,
+        .lines_per_tag = 1,
+        .self_init = 1,
+        .no_invd_sharing = true,
+    },
+    .l1i_cache = &(CPUCacheInfo) {
+        .type = INSTRUCTION_CACHE,
+        .level = 1,
+        .size = 32 * KiB,
+        .line_size = 64,
+        .associativity = 8,
+        .partitions = 1,
+        .sets = 64,
+        .lines_per_tag = 1,
+        .self_init = 1,
+        .no_invd_sharing = true,
+    },
+    .l2_cache = &(CPUCacheInfo) {
+        .type = UNIFIED_CACHE,
+        .level = 2,
+        .size = 512 * KiB,
+        .line_size = 64,
+        .associativity = 8,
+        .partitions = 1,
+        .sets = 1024,
+        .lines_per_tag = 1,
+    },
+    .l3_cache = &(CPUCacheInfo) {
+        .type = UNIFIED_CACHE,
+        .level = 3,
+        .size = 32 * MiB,
+        .line_size = 64,
+        .associativity = 16,
+        .partitions = 1,
+        .sets = 32768,
+        .lines_per_tag = 1,
+        .self_init = true,
+        .inclusive = true,
+        .complex_indexing = false,
+    },
+};
+
 /* The following VMX features are not supported by KVM and are left out in the
  * CPU definitions:
  *
@@ -4401,6 +4451,26 @@ static const X86CPUDefinition builtin_x86_defs[] = {
         .xlevel = 0x8000001E,
         .model_id = "AMD EPYC-Milan Processor",
         .cache_info = &epyc_milan_cache_info,
+        .versions = (X86CPUVersionDefinition[]) {
+            { .version = 1 },
+            {
+                .version = 2,
+                .props = (PropValue[]) {
+                    { "model-id",
+                      "AMD EPYC-Milan-v2 Processor" },
+                    { "vaes", "on" },
+                    { "vpclmulqdq", "on" },
+                    { "stibp-always-on", "on" },
+                    { "amd-psfd", "on" },
+                    { "no-nested-data-bp", "on" },
+                    { "lfence-always-serializing", "on" },
+                    { "null-sel-clr-base", "on" },
+                    { /* end of list */ }
+                },
+                .cache_info = &epyc_milan_v2_cache_info
+            },
+            { /* end of list */ }
+        }
     },
 };
 
-- 
2.34.1




* [PATCH v4 6/7] target/i386: Add VNMI and automatic IBRS feature bits
  2023-05-04 20:53 [PATCH v4 0/7] Add EPYC-Genoa model and update previous EPYC Models Babu Moger
                   ` (4 preceding siblings ...)
  2023-05-04 20:53 ` [PATCH v4 5/7] target/i386: Add missing feature bits in EPYC-Milan model Babu Moger
@ 2023-05-04 20:53 ` Babu Moger
  2023-05-04 20:53 ` [PATCH v4 7/7] target/i386: Add EPYC-Genoa model to support Zen 4 processor series Babu Moger
  2023-05-05  8:31 ` [PATCH v4 0/7] Add EPYC-Genoa model and update previous EPYC Models Paolo Bonzini
  7 siblings, 0 replies; 18+ messages in thread
From: Babu Moger @ 2023-05-04 20:53 UTC
  To: pbonzini, richard.henderson
  Cc: weijiang.yang, philmd, dwmw, paul, joao.m.martins, qemu-devel,
	mtosatti, kvm, mst, marcel.apfelbaum, yang.zhong, jing2.liu,
	vkuznets, michael.roth, wei.huang2, berrange, babu.moger, bdas

Add the following feature bits.

vnmi: Virtual NMI (VNMI) allows the hypervisor to inject an NMI into the
      guest without using the Event Injection mechanism, so the hypervisor
      is not required to track the guest NMI state or intercept IRET.
      The presence of this feature is indicated via the CPUID function
      0x8000000A_EDX[25].


auto-ibrs :
      The AMD Zen4 core supports a new feature called Automatic IBRS.
      It is a "set-and-forget" feature: unlike, e.g., software-toggled
      SPEC_CTRL.IBRS, the hardware manages its IBRS mitigation
      resources automatically across CPL transitions.
      The presence of this feature is indicated via the CPUID function
      0x80000021_EAX[8].

The documentation for the features is available in the links below.
a. Processor Programming Reference (PPR) for AMD Family 19h Model 01h,
   Revision B1 Processors
b. AMD64 Architecture Programmer’s Manual Volumes 1–5, Publication No. 40332,
   Revision 4.05, October 2022

Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Link: https://www.amd.com/system/files/TechDocs/55898_B1_pub_0.50.zip
Link: https://www.amd.com/system/files/TechDocs/40332_4.05.pdf
---
 target/i386/cpu.c | 4 ++--
 target/i386/cpu.h | 3 +++
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 0a6fb2fc82..d50ace84bf 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -806,7 +806,7 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
             "pfthreshold", "avic", NULL, "v-vmsave-vmload",
             "vgif", NULL, NULL, NULL,
             NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
+            NULL, "vnmi", NULL, NULL,
             "svme-addr-chk", NULL, NULL, NULL,
         },
         .cpuid = { .eax = 0x8000000A, .reg = R_EDX, },
@@ -925,7 +925,7 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
         .feat_names = {
             "no-nested-data-bp", NULL, "lfence-always-serializing", NULL,
             NULL, NULL, "null-sel-clr-base", NULL,
-            NULL, NULL, NULL, NULL,
+            "auto-ibrs", NULL, NULL, NULL,
             NULL, NULL, NULL, NULL,
             NULL, NULL, NULL, NULL,
             NULL, NULL, NULL, NULL,
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 7cf811d8fe..f6575f1f01 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -773,6 +773,7 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 #define CPUID_SVM_AVIC            (1U << 13)
 #define CPUID_SVM_V_VMSAVE_VMLOAD (1U << 15)
 #define CPUID_SVM_VGIF            (1U << 16)
+#define CPUID_SVM_VNMI            (1U << 25)
 #define CPUID_SVM_SVME_ADDR_CHK   (1U << 28)
 
 /* Support RDFSBASE/RDGSBASE/WRFSBASE/WRGSBASE */
@@ -946,6 +947,8 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 #define CPUID_8000_0021_EAX_LFENCE_ALWAYS_SERIALIZING    (1U << 2)
 /* Null Selector Clears Base */
 #define CPUID_8000_0021_EAX_NULL_SEL_CLR_BASE    (1U << 6)
+/* Automatic IBRS */
+#define CPUID_8000_0021_EAX_AUTO_IBRS   (1U << 8)
 
 #define CPUID_XSAVE_XSAVEOPT   (1U << 0)
 #define CPUID_XSAVE_XSAVEC     (1U << 1)
-- 
2.34.1




* [PATCH v4 7/7] target/i386: Add EPYC-Genoa model to support Zen 4 processor series
  2023-05-04 20:53 [PATCH v4 0/7] Add EPYC-Genoa model and update previous EPYC Models Babu Moger
                   ` (5 preceding siblings ...)
  2023-05-04 20:53 ` [PATCH v4 6/7] target/i386: Add VNMI and automatic IBRS feature bits Babu Moger
@ 2023-05-04 20:53 ` Babu Moger
  2024-11-08 18:15   ` Maksim Davydov
  2023-05-05  8:31 ` [PATCH v4 0/7] Add EPYC-Genoa model and update previous EPYC Models Paolo Bonzini
  7 siblings, 1 reply; 18+ messages in thread
From: Babu Moger @ 2023-05-04 20:53 UTC
  To: pbonzini, richard.henderson
  Cc: weijiang.yang, philmd, dwmw, paul, joao.m.martins, qemu-devel,
	mtosatti, kvm, mst, marcel.apfelbaum, yang.zhong, jing2.liu,
	vkuznets, michael.roth, wei.huang2, berrange, babu.moger, bdas

Add support for AMD EPYC Genoa generation processors. The new model
is displayed as EPYC-Genoa.

Add the following new feature bits on top of the feature bits from
the previous generation EPYC models.

avx512f         : AVX-512 Foundation instruction
avx512dq        : AVX-512 Doubleword & Quadword Instruction
avx512ifma      : AVX-512 Integer Fused Multiply Add instruction
avx512cd        : AVX-512 Conflict Detection instruction
avx512bw        : AVX-512 Byte and Word Instructions
avx512vl        : AVX-512 Vector Length Extension Instructions
avx512vbmi      : AVX-512 Vector Byte Manipulation Instruction
avx512_vbmi2    : AVX-512 Additional Vector Byte Manipulation Instruction
gfni            : AVX-512 Galois Field New Instructions
avx512_vnni     : AVX-512 Vector Neural Network Instructions
avx512_bitalg   : AVX-512 Bit Algorithms, add bit algorithms Instructions
avx512_vpopcntdq: AVX-512 Vector Population Count Doubleword and
                  Quadword Instructions
avx512_bf16     : AVX-512 BFLOAT16 instructions
la57            : 57-bit virtual address support (5-level Page Tables)
vnmi            : Virtual NMI (VNMI) allows the hypervisor to inject an NMI
                  into the guest without using the Event Injection mechanism,
                  so it does not need to track the guest NMI state or
                  intercept IRET.
auto-ibrs       : The AMD Zen4 core supports a new feature called Automatic IBRS.
                  It is a "set-and-forget" feature that means that, unlike e.g.,
                  s/w-toggled SPEC_CTRL.IBRS, h/w manages its IBRS mitigation
                  resources automatically across CPL transitions.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 target/i386/cpu.c | 122 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 122 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index d50ace84bf..71fe1e02ee 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1973,6 +1973,56 @@ static const CPUCaches epyc_milan_v2_cache_info = {
     },
 };
 
+static const CPUCaches epyc_genoa_cache_info = {
+    .l1d_cache = &(CPUCacheInfo) {
+        .type = DATA_CACHE,
+        .level = 1,
+        .size = 32 * KiB,
+        .line_size = 64,
+        .associativity = 8,
+        .partitions = 1,
+        .sets = 64,
+        .lines_per_tag = 1,
+        .self_init = 1,
+        .no_invd_sharing = true,
+    },
+    .l1i_cache = &(CPUCacheInfo) {
+        .type = INSTRUCTION_CACHE,
+        .level = 1,
+        .size = 32 * KiB,
+        .line_size = 64,
+        .associativity = 8,
+        .partitions = 1,
+        .sets = 64,
+        .lines_per_tag = 1,
+        .self_init = 1,
+        .no_invd_sharing = true,
+    },
+    .l2_cache = &(CPUCacheInfo) {
+        .type = UNIFIED_CACHE,
+        .level = 2,
+        .size = 1 * MiB,
+        .line_size = 64,
+        .associativity = 8,
+        .partitions = 1,
+        .sets = 2048,
+        .lines_per_tag = 1,
+    },
+    .l3_cache = &(CPUCacheInfo) {
+        .type = UNIFIED_CACHE,
+        .level = 3,
+        .size = 32 * MiB,
+        .line_size = 64,
+        .associativity = 16,
+        .partitions = 1,
+        .sets = 32768,
+        .lines_per_tag = 1,
+        .self_init = true,
+        .inclusive = true,
+        .complex_indexing = false,
+    },
+};
+
 /* The following VMX features are not supported by KVM and are left out in the
  * CPU definitions:
  *
@@ -4472,6 +4522,78 @@ static const X86CPUDefinition builtin_x86_defs[] = {
             { /* end of list */ }
         }
     },
+    {
+        .name = "EPYC-Genoa",
+        .level = 0xd,
+        .vendor = CPUID_VENDOR_AMD,
+        .family = 25,
+        .model = 17,
+        .stepping = 0,
+        .features[FEAT_1_EDX] =
+            CPUID_SSE2 | CPUID_SSE | CPUID_FXSR | CPUID_MMX | CPUID_CLFLUSH |
+            CPUID_PSE36 | CPUID_PAT | CPUID_CMOV | CPUID_MCA | CPUID_PGE |
+            CPUID_MTRR | CPUID_SEP | CPUID_APIC | CPUID_CX8 | CPUID_MCE |
+            CPUID_PAE | CPUID_MSR | CPUID_TSC | CPUID_PSE | CPUID_DE |
+            CPUID_VME | CPUID_FP87,
+        .features[FEAT_1_ECX] =
+            CPUID_EXT_RDRAND | CPUID_EXT_F16C | CPUID_EXT_AVX |
+            CPUID_EXT_XSAVE | CPUID_EXT_AES |  CPUID_EXT_POPCNT |
+            CPUID_EXT_MOVBE | CPUID_EXT_SSE42 | CPUID_EXT_SSE41 |
+            CPUID_EXT_PCID | CPUID_EXT_CX16 | CPUID_EXT_FMA |
+            CPUID_EXT_SSSE3 | CPUID_EXT_MONITOR | CPUID_EXT_PCLMULQDQ |
+            CPUID_EXT_SSE3,
+        .features[FEAT_8000_0001_EDX] =
+            CPUID_EXT2_LM | CPUID_EXT2_RDTSCP | CPUID_EXT2_PDPE1GB |
+            CPUID_EXT2_FFXSR | CPUID_EXT2_MMXEXT | CPUID_EXT2_NX |
+            CPUID_EXT2_SYSCALL,
+        .features[FEAT_8000_0001_ECX] =
+            CPUID_EXT3_OSVW | CPUID_EXT3_3DNOWPREFETCH |
+            CPUID_EXT3_MISALIGNSSE | CPUID_EXT3_SSE4A | CPUID_EXT3_ABM |
+            CPUID_EXT3_CR8LEG | CPUID_EXT3_SVM | CPUID_EXT3_LAHF_LM |
+            CPUID_EXT3_TOPOEXT | CPUID_EXT3_PERFCORE,
+        .features[FEAT_8000_0008_EBX] =
+            CPUID_8000_0008_EBX_CLZERO | CPUID_8000_0008_EBX_XSAVEERPTR |
+            CPUID_8000_0008_EBX_WBNOINVD | CPUID_8000_0008_EBX_IBPB |
+            CPUID_8000_0008_EBX_IBRS | CPUID_8000_0008_EBX_STIBP |
+            CPUID_8000_0008_EBX_STIBP_ALWAYS_ON |
+            CPUID_8000_0008_EBX_AMD_SSBD | CPUID_8000_0008_EBX_AMD_PSFD,
+        .features[FEAT_8000_0021_EAX] =
+            CPUID_8000_0021_EAX_No_NESTED_DATA_BP |
+            CPUID_8000_0021_EAX_LFENCE_ALWAYS_SERIALIZING |
+            CPUID_8000_0021_EAX_NULL_SEL_CLR_BASE |
+            CPUID_8000_0021_EAX_AUTO_IBRS,
+        .features[FEAT_7_0_EBX] =
+            CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 | CPUID_7_0_EBX_AVX2 |
+            CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_BMI2 | CPUID_7_0_EBX_ERMS |
+            CPUID_7_0_EBX_INVPCID | CPUID_7_0_EBX_AVX512F |
+            CPUID_7_0_EBX_AVX512DQ | CPUID_7_0_EBX_RDSEED | CPUID_7_0_EBX_ADX |
+            CPUID_7_0_EBX_SMAP | CPUID_7_0_EBX_AVX512IFMA |
+            CPUID_7_0_EBX_CLFLUSHOPT | CPUID_7_0_EBX_CLWB |
+            CPUID_7_0_EBX_AVX512CD | CPUID_7_0_EBX_SHA_NI |
+            CPUID_7_0_EBX_AVX512BW | CPUID_7_0_EBX_AVX512VL,
+        .features[FEAT_7_0_ECX] =
+            CPUID_7_0_ECX_AVX512_VBMI | CPUID_7_0_ECX_UMIP | CPUID_7_0_ECX_PKU |
+            CPUID_7_0_ECX_AVX512_VBMI2 | CPUID_7_0_ECX_GFNI |
+            CPUID_7_0_ECX_VAES | CPUID_7_0_ECX_VPCLMULQDQ |
+            CPUID_7_0_ECX_AVX512VNNI | CPUID_7_0_ECX_AVX512BITALG |
+            CPUID_7_0_ECX_AVX512_VPOPCNTDQ | CPUID_7_0_ECX_LA57 |
+            CPUID_7_0_ECX_RDPID,
+        .features[FEAT_7_0_EDX] =
+            CPUID_7_0_EDX_FSRM,
+        .features[FEAT_7_1_EAX] =
+            CPUID_7_1_EAX_AVX512_BF16,
+        .features[FEAT_XSAVE] =
+            CPUID_XSAVE_XSAVEOPT | CPUID_XSAVE_XSAVEC |
+            CPUID_XSAVE_XGETBV1 | CPUID_XSAVE_XSAVES,
+        .features[FEAT_6_EAX] =
+            CPUID_6_EAX_ARAT,
+        .features[FEAT_SVM] =
+            CPUID_SVM_NPT | CPUID_SVM_NRIPSAVE | CPUID_SVM_VNMI |
+            CPUID_SVM_SVME_ADDR_CHK,
+        .xlevel = 0x80000022,
+        .model_id = "AMD EPYC-Genoa Processor",
+        .cache_info = &epyc_genoa_cache_info,
+    },
 };
 
 /*
-- 
2.34.1




* Re: [PATCH v4 4/7] target/i386: Add feature bits for CPUID_Fn80000021_EAX
  2023-05-04 20:53 ` [PATCH v4 4/7] target/i386: Add feature bits for CPUID_Fn80000021_EAX Babu Moger
@ 2023-05-05  8:29   ` Paolo Bonzini
  0 siblings, 0 replies; 18+ messages in thread
From: Paolo Bonzini @ 2023-05-05  8:29 UTC (permalink / raw)
  To: Babu Moger, richard.henderson
  Cc: weijiang.yang, philmd, dwmw, paul, joao.m.martins, qemu-devel,
	mtosatti, kvm, mst, marcel.apfelbaum, yang.zhong, jing2.liu,
	vkuznets, michael.roth, wei.huang2, berrange, bdas

On 5/4/23 22:53, Babu Moger wrote:
> Add the following feature bits.
> no-nested-data-bp	  : Processor ignores nested data breakpoints.

This bit is useless, unfortunately.  Another similar bit is the one
about the availability of FCS/FDS in the x87 save state.

They say that something is _not_ available, so a strict interpretation 
would prevent migrating from any old processor to Genoa, because in 
theory you never know if guests are using nested data breakpoints.

In practice, this does not really matter because no one used 
them---that's why AMD could get away with removing them---but please 
tell the architects that while they're free to deprecate and remove old 
features, adding CPUID is basically pointless.
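
The point about "negative" bits can be made concrete with a small sketch.
It assumes a simplified migration rule (a guest may move only if every
CPUID bit it saw on the source is also set on the destination) and
contrasts it with the strict reading described above. The function names
and the EX_ macros are hypothetical and are not QEMU code; bit 0 is assumed
to be the position of no-nested-data-bp in CPUID 0x80000021 EAX, and bit 8
is the Automatic IBRS bit added by this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions in CPUID 0x80000021 EAX (names are hypothetical). */
#define EX_NO_NESTED_DATA_BP (1U << 0)   /* "negative" bit: a capability was removed */
#define EX_AUTO_IBRS         (1U << 8)   /* "positive" bit: a capability was added */

static_assert((EX_NO_NESTED_DATA_BP & EX_AUTO_IBRS) == 0, "masks must not overlap");

/* Usual rule: every bit the guest saw on the source must be set on the destination. */
bool can_migrate_usual(uint32_t src, uint32_t dst)
{
    return (src & ~dst) == 0;
}

/*
 * Strict rule: positive bits follow the usual rule, but a negative bit set on
 * the destination means something the source still had is now gone, so the
 * destination may only set negative bits that the source already set.
 */
bool can_migrate_strict(uint32_t src, uint32_t dst, uint32_t negative)
{
    bool positives_ok = ((src & ~negative) & ~dst) == 0;
    bool negatives_ok = ((dst & negative) & ~src) == 0;
    return positives_ok && negatives_ok;
}
```

With an older CPU modeled as 0 and Genoa as both bits set, the usual rule
allows old-to-Genoa migration while the strict rule rejects it, which is the
situation described in this message.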

Paolo


* Re: [PATCH v4 0/7] Add EPYC-Genoa model and update previous EPYC Models
  2023-05-04 20:53 [PATCH v4 0/7] Add EPYC-Genoa model and update previous EPYC Models Babu Moger
                   ` (6 preceding siblings ...)
  2023-05-04 20:53 ` [PATCH v4 7/7] target/i386: Add EPYC-Genoa model to support Zen 4 processor series Babu Moger
@ 2023-05-05  8:31 ` Paolo Bonzini
  2023-05-05 17:15   ` Moger, Babu
  7 siblings, 1 reply; 18+ messages in thread
From: Paolo Bonzini @ 2023-05-05  8:31 UTC (permalink / raw)
  To: Babu Moger
  Cc: pbonzini, richard.henderson, weijiang.yang, philmd, dwmw, paul,
	joao.m.martins, qemu-devel, mtosatti, kvm, mst, marcel.apfelbaum,
	yang.zhong, jing2.liu, vkuznets, michael.roth, wei.huang2,
	berrange, bdas

Queued, thanks.

Paolo




* RE: [PATCH v4 0/7] Add EPYC-Genoa model and update previous EPYC Models
  2023-05-05  8:31 ` [PATCH v4 0/7] Add EPYC-Genoa model and update previous EPYC Models Paolo Bonzini
@ 2023-05-05 17:15   ` Moger, Babu
  0 siblings, 0 replies; 18+ messages in thread
From: Moger, Babu @ 2023-05-05 17:15 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: richard.henderson@linaro.org, weijiang.yang@intel.com,
	philmd@linaro.org, dwmw@amazon.co.uk, paul@xen.org,
	joao.m.martins@oracle.com, qemu-devel@nongnu.org,
	mtosatti@redhat.com, kvm@vger.kernel.org, mst@redhat.com,
	marcel.apfelbaum@gmail.com, yang.zhong@intel.com,
	jing2.liu@intel.com, vkuznets@redhat.com, Roth, Michael,
	Huang2, Wei, berrange@redhat.com, bdas@redhat.com



> -----Original Message-----
> From: Paolo Bonzini <pbonzini@redhat.com>
> Sent: Friday, May 5, 2023 3:31 AM
> To: Moger, Babu <Babu.Moger@amd.com>
> Cc: pbonzini@redhat.com; richard.henderson@linaro.org;
> weijiang.yang@intel.com; philmd@linaro.org; dwmw@amazon.co.uk;
> paul@xen.org; joao.m.martins@oracle.com; qemu-devel@nongnu.org;
> mtosatti@redhat.com; kvm@vger.kernel.org; mst@redhat.com;
> marcel.apfelbaum@gmail.com; yang.zhong@intel.com; jing2.liu@intel.com;
> vkuznets@redhat.com; Roth, Michael <Michael.Roth@amd.com>; Huang2, Wei
> <Wei.Huang2@amd.com>; berrange@redhat.com; bdas@redhat.com
> Subject: Re: [PATCH v4 0/7] Add EPYC-Genoa model and update previous EPYC
> Models
> 
> Queued, thanks.

Thank You.
Babu



* Re: [PATCH v4 7/7] target/i386: Add EPYC-Genoa model to support Zen 4 processor series
  2023-05-04 20:53 ` [PATCH v4 7/7] target/i386: Add EPYC-Genoa model to support Zen 4 processor series Babu Moger
@ 2024-11-08 18:15   ` Maksim Davydov
  2024-11-08 20:56     ` Moger, Babu
  0 siblings, 1 reply; 18+ messages in thread
From: Maksim Davydov @ 2024-11-08 18:15 UTC (permalink / raw)
  To: Babu Moger
  Cc: weijiang.yang, philmd, dwmw, paul, joao.m.martins, qemu-devel,
	mtosatti, kvm, mst, marcel.apfelbaum, yang.zhong, jing2.liu,
	vkuznets, michael.roth, wei.huang2, berrange, bdas, pbonzini,
	richard.henderson

Hi!
I compared the EPYC-Genoa CPU model with the CPUID output from a real
EPYC Genoa host and found some mismatches that confused me. Could you
help me understand them?

On 5/4/23 23:53, Babu Moger wrote:
> Adds the support for AMD EPYC Genoa generation processors. The model
> display for the new processor will be EPYC-Genoa.
> 
> Adds the following new feature bits on top of the feature bits from
> the previous generation EPYC models.
> 
> avx512f         : AVX-512 Foundation instruction
> avx512dq        : AVX-512 Doubleword & Quadword Instruction
> avx512ifma      : AVX-512 Integer Fused Multiply Add instruction
> avx512cd        : AVX-512 Conflict Detection instruction
> avx512bw        : AVX-512 Byte and Word Instructions
> avx512vl        : AVX-512 Vector Length Extension Instructions
> avx512vbmi      : AVX-512 Vector Byte Manipulation Instruction
> avx512_vbmi2    : AVX-512 Additional Vector Byte Manipulation Instruction
> gfni            : AVX-512 Galois Field New Instructions
> avx512_vnni     : AVX-512 Vector Neural Network Instructions
> avx512_bitalg   : AVX-512 Bit Algorithms, add bit algorithms Instructions
> avx512_vpopcntdq: AVX-512 AVX-512 Vector Population Count Doubleword and
>                    Quadword Instructions
> avx512_bf16	: AVX-512 BFLOAT16 instructions
> la57            : 57-bit virtual address support (5-level Page Tables)
> vnmi            : Virtual NMI (VNMI) allows the hypervisor to inject the NMI
>                    into the guest without using Event Injection mechanism
>                    meaning not required to track the guest NMI and intercepting
>                    the IRET.
> auto-ibrs       : The AMD Zen4 core supports a new feature called Automatic IBRS.
>                    It is a "set-and-forget" feature that means that, unlike e.g.,
>                    s/w-toggled SPEC_CTRL.IBRS, h/w manages its IBRS mitigation
>                    resources automatically across CPL transitions.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
>   target/i386/cpu.c | 122 ++++++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 122 insertions(+)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index d50ace84bf..71fe1e02ee 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -1973,6 +1973,56 @@ static const CPUCaches epyc_milan_v2_cache_info = {
>       },
>   };
>   
> +static const CPUCaches epyc_genoa_cache_info = {
> +    .l1d_cache = &(CPUCacheInfo) {
> +        .type = DATA_CACHE,
> +        .level = 1,
> +        .size = 32 * KiB,
> +        .line_size = 64,
> +        .associativity = 8,
> +        .partitions = 1,
> +        .sets = 64,
> +        .lines_per_tag = 1,
> +        .self_init = 1,
> +        .no_invd_sharing = true,
> +    },
> +    .l1i_cache = &(CPUCacheInfo) {
> +        .type = INSTRUCTION_CACHE,
> +        .level = 1,
> +        .size = 32 * KiB,
> +        .line_size = 64,
> +        .associativity = 8,
> +        .partitions = 1,
> +        .sets = 64,
> +        .lines_per_tag = 1,
> +        .self_init = 1,
> +        .no_invd_sharing = true,
> +    },
> +    .l2_cache = &(CPUCacheInfo) {
> +        .type = UNIFIED_CACHE,
> +        .level = 2,
> +        .size = 1 * MiB,
> +        .line_size = 64,
> +        .associativity = 8,
> +        .partitions = 1,
> +        .sets = 2048,
> +        .lines_per_tag = 1,

1. Why is the L2 cache not shown as inclusive and self-initializing?

The PPR for AMD Family 19h Model 11h says for L2 (CPUID 0x8000001d):
* cache inclusive. Read-only. Reset: Fixed,1.
* cache is self-initializing. Read-only. Reset: Fixed,1.
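
For readers matching the PPR title (hexadecimal family and model) against
the decimal `.family = 25` and `.model = 17` in the definition below, here
is a minimal sketch of how family, model and stepping are packed into
CPUID leaf 1 EAX. It assumes the AMD layout with a base family of 0xF; the
function names are hypothetical and are not taken from QEMU.

```c
#include <assert.h>
#include <stdint.h>

/* Pack family/model/stepping into CPUID leaf 1 EAX (assumes family >= 0xF). */
uint32_t encode_fms(unsigned family, unsigned model, unsigned stepping)
{
    assert(family >= 0xF && family <= 0xF + 0xFF);
    assert(model <= 0xFF && stepping <= 0xF);

    uint32_t eax = stepping;                 /* bits 3:0   stepping        */
    eax |= (model & 0xFU) << 4;              /* bits 7:4   base model      */
    eax |= 0xFU << 8;                        /* bits 11:8  base family     */
    eax |= (model >> 4) << 16;               /* bits 19:16 extended model  */
    eax |= (uint32_t)(family - 0xF) << 20;   /* bits 27:20 extended family */
    return eax;
}

/* Displayed family: base family plus extended family when the base is 0xF. */
unsigned decode_family(uint32_t eax)
{
    unsigned base = (eax >> 8) & 0xF;
    unsigned ext = (eax >> 20) & 0xFF;
    return base == 0xF ? base + ext : base;
}

/* Displayed model: extended model forms the high nibble when base family is 0xF. */
unsigned decode_model(uint32_t eax)
{
    unsigned base_family = (eax >> 8) & 0xF;
    unsigned base = (eax >> 4) & 0xF;
    unsigned ext = (eax >> 16) & 0xF;
    return base_family == 0xF ? (ext << 4) | base : base;
}
```

Under these assumptions, family 25 and model 17 decode back to 0x19 and
0x11, which is the "Family 19h Model 11h" of the PPR title.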

> +    },
> +    .l3_cache = &(CPUCacheInfo) {
> +        .type = UNIFIED_CACHE,
> +        .level = 3,
> +        .size = 32 * MiB,
> +        .line_size = 64,
> +        .associativity = 16,
> +        .partitions = 1,
> +        .sets = 32768,
> +        .lines_per_tag = 1,
> +        .self_init = true,
> +        .inclusive = true,
> +        .complex_indexing = false,

2. Why is the L3 cache shown as inclusive? And why does L3 not report
that the WBINVD/INVD instruction is not guaranteed to invalidate all
lower-level caches (EDX bit 0)?

The PPR for AMD Family 19h Model 11h says for L3 (CPUID 0x8000001d):
* cache inclusive. Read-only. Reset: Fixed,0.
* Write-Back Invalidate/Invalidate. Read-only. Reset: Fixed,1.
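
As a reference for questions 1 and 2, the following is a simplified sketch
of how one cache description maps onto the CPUID 0x8000001d registers. The
struct and function names are hypothetical (they only loosely mirror the
fields used in the patch), and the NumSharingCache field is omitted for
brevity; the bit positions follow the AMD layout as I understand it, with
EDX bit 0 for WBINVD and EDX bit 1 for inclusiveness.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one cache level. */
struct cache_desc {
    unsigned type;               /* 1 = data, 2 = instruction, 3 = unified */
    unsigned level;
    unsigned line_size, ways, partitions, sets;
    bool self_init;              /* EAX bit 8 */
    bool inclusive;              /* EDX bit 1 */
    bool wbinvd_not_guaranteed;  /* EDX bit 0 */
};

struct cache_leaf { uint32_t eax, ebx, ecx, edx; };

/* Encode a description into the four registers of one 0x8000001d subleaf. */
struct cache_leaf encode_cache_leaf(const struct cache_desc *c)
{
    struct cache_leaf l;

    assert(c->line_size && c->ways && c->partitions && c->sets);
    l.eax = (c->type & 0x1F) | ((c->level & 0x7) << 5) |
            (c->self_init ? 1U << 8 : 0);
    l.ebx = (c->line_size - 1) | ((c->partitions - 1) << 12) |
            ((c->ways - 1) << 22);
    l.ecx = c->sets - 1;
    l.edx = (c->wbinvd_not_guaranteed ? 1U << 0 : 0) |
            (c->inclusive ? 1U << 1 : 0);
    return l;
}

/* Size implied by the geometry: line size * partitions * ways * sets. */
uint64_t cache_size_bytes(const struct cache_leaf *l)
{
    uint64_t line = (l->ebx & 0xFFF) + 1;
    uint64_t parts = ((l->ebx >> 12) & 0x3FF) + 1;
    uint64_t ways = ((l->ebx >> 22) & 0x3FF) + 1;
    uint64_t sets = (uint64_t)l->ecx + 1;
    return line * parts * ways * sets;
}
```

Feeding in the L3 geometry from the patch (64-byte lines, 16 ways, 1
partition, 32768 sets) gives 32 MiB, and using the PPR values quoted above
(inclusive = 0, WBINVD bit = 1) yields EDX = 0x1 rather than the 0x2 that
the current model implies.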



3. Why is the default stub used for the TLB, rather than real values as
for the other caches?

> +    },
> +};
> +
>   /* The following VMX features are not supported by KVM and are left out in the
>    * CPU definitions:
>    *
> @@ -4472,6 +4522,78 @@ static const X86CPUDefinition builtin_x86_defs[] = {
>               { /* end of list */ }
>           }
>       },
> +    {
> +        .name = "EPYC-Genoa",
> +        .level = 0xd,
> +        .vendor = CPUID_VENDOR_AMD,
> +        .family = 25,
> +        .model = 17,
> +        .stepping = 0,
> +        .features[FEAT_1_EDX] =
> +            CPUID_SSE2 | CPUID_SSE | CPUID_FXSR | CPUID_MMX | CPUID_CLFLUSH |
> +            CPUID_PSE36 | CPUID_PAT | CPUID_CMOV | CPUID_MCA | CPUID_PGE |
> +            CPUID_MTRR | CPUID_SEP | CPUID_APIC | CPUID_CX8 | CPUID_MCE |
> +            CPUID_PAE | CPUID_MSR | CPUID_TSC | CPUID_PSE | CPUID_DE |
> +            CPUID_VME | CPUID_FP87,
> +        .features[FEAT_1_ECX] =
> +            CPUID_EXT_RDRAND | CPUID_EXT_F16C | CPUID_EXT_AVX |
> +            CPUID_EXT_XSAVE | CPUID_EXT_AES |  CPUID_EXT_POPCNT |
> +            CPUID_EXT_MOVBE | CPUID_EXT_SSE42 | CPUID_EXT_SSE41 |
> +            CPUID_EXT_PCID | CPUID_EXT_CX16 | CPUID_EXT_FMA |
> +            CPUID_EXT_SSSE3 | CPUID_EXT_MONITOR | CPUID_EXT_PCLMULQDQ |
> +            CPUID_EXT_SSE3,
> +        .features[FEAT_8000_0001_EDX] =
> +            CPUID_EXT2_LM | CPUID_EXT2_RDTSCP | CPUID_EXT2_PDPE1GB |
> +            CPUID_EXT2_FFXSR | CPUID_EXT2_MMXEXT | CPUID_EXT2_NX |
> +            CPUID_EXT2_SYSCALL,
> +        .features[FEAT_8000_0001_ECX] =
> +            CPUID_EXT3_OSVW | CPUID_EXT3_3DNOWPREFETCH |
> +            CPUID_EXT3_MISALIGNSSE | CPUID_EXT3_SSE4A | CPUID_EXT3_ABM |
> +            CPUID_EXT3_CR8LEG | CPUID_EXT3_SVM | CPUID_EXT3_LAHF_LM |
> +            CPUID_EXT3_TOPOEXT | CPUID_EXT3_PERFCORE,
> +        .features[FEAT_8000_0008_EBX] =
> +            CPUID_8000_0008_EBX_CLZERO | CPUID_8000_0008_EBX_XSAVEERPTR |
> +            CPUID_8000_0008_EBX_WBNOINVD | CPUID_8000_0008_EBX_IBPB |
> +            CPUID_8000_0008_EBX_IBRS | CPUID_8000_0008_EBX_STIBP |
> +            CPUID_8000_0008_EBX_STIBP_ALWAYS_ON |
> +            CPUID_8000_0008_EBX_AMD_SSBD | CPUID_8000_0008_EBX_AMD_PSFD,

4. Why are the 0x80000008_EBX features related to speculation
vulnerabilities (BTC_NO, IBPB_RET, IbrsPreferred, INT_WBINVD) not set?

> +        .features[FEAT_8000_0021_EAX] =
> +            CPUID_8000_0021_EAX_No_NESTED_DATA_BP |
> +            CPUID_8000_0021_EAX_LFENCE_ALWAYS_SERIALIZING |
> +            CPUID_8000_0021_EAX_NULL_SEL_CLR_BASE |
> +            CPUID_8000_0021_EAX_AUTO_IBRS,

5. Why are some 0x80000021_EAX features not set?
(FsGsKernelGsBaseNonSerializing, FSRC and FSRS)

> +        .features[FEAT_7_0_EBX] =
> +            CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 | CPUID_7_0_EBX_AVX2 |
> +            CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_BMI2 | CPUID_7_0_EBX_ERMS |
> +            CPUID_7_0_EBX_INVPCID | CPUID_7_0_EBX_AVX512F |
> +            CPUID_7_0_EBX_AVX512DQ | CPUID_7_0_EBX_RDSEED | CPUID_7_0_EBX_ADX |
> +            CPUID_7_0_EBX_SMAP | CPUID_7_0_EBX_AVX512IFMA |
> +            CPUID_7_0_EBX_CLFLUSHOPT | CPUID_7_0_EBX_CLWB |
> +            CPUID_7_0_EBX_AVX512CD | CPUID_7_0_EBX_SHA_NI |
> +            CPUID_7_0_EBX_AVX512BW | CPUID_7_0_EBX_AVX512VL,
> +        .features[FEAT_7_0_ECX] =
> +            CPUID_7_0_ECX_AVX512_VBMI | CPUID_7_0_ECX_UMIP | CPUID_7_0_ECX_PKU |
> +            CPUID_7_0_ECX_AVX512_VBMI2 | CPUID_7_0_ECX_GFNI |
> +            CPUID_7_0_ECX_VAES | CPUID_7_0_ECX_VPCLMULQDQ |
> +            CPUID_7_0_ECX_AVX512VNNI | CPUID_7_0_ECX_AVX512BITALG |
> +            CPUID_7_0_ECX_AVX512_VPOPCNTDQ | CPUID_7_0_ECX_LA57 |
> +            CPUID_7_0_ECX_RDPID,
> +        .features[FEAT_7_0_EDX] =
> +            CPUID_7_0_EDX_FSRM,

6. Why is L1D_FLUSH not set? Is it because only processors vulnerable
to MMIO stale data have to use it?

> +        .features[FEAT_7_1_EAX] =
> +            CPUID_7_1_EAX_AVX512_BF16,
> +        .features[FEAT_XSAVE] =
> +            CPUID_XSAVE_XSAVEOPT | CPUID_XSAVE_XSAVEC |
> +            CPUID_XSAVE_XGETBV1 | CPUID_XSAVE_XSAVES,
> +        .features[FEAT_6_EAX] =
> +            CPUID_6_EAX_ARAT,
> +        .features[FEAT_SVM] =
> +            CPUID_SVM_NPT | CPUID_SVM_NRIPSAVE | CPUID_SVM_VNMI |
> +            CPUID_SVM_SVME_ADDR_CHK,
> +        .xlevel = 0x80000022,
> +        .model_id = "AMD EPYC-Genoa Processor",
> +        .cache_info = &epyc_genoa_cache_info,
> +    },
>   };
>   
>   /*

-- 
Best regards,
Maksim Davydov



* Re: [PATCH v4 7/7] target/i386: Add EPYC-Genoa model to support Zen 4 processor series
  2024-11-08 18:15   ` Maksim Davydov
@ 2024-11-08 20:56     ` Moger, Babu
  2024-11-12 10:09       ` Maksim Davydov
  0 siblings, 1 reply; 18+ messages in thread
From: Moger, Babu @ 2024-11-08 20:56 UTC (permalink / raw)
  To: Maksim Davydov, Babu Moger
  Cc: weijiang.yang, philmd, dwmw, paul, joao.m.martins, qemu-devel,
	mtosatti, kvm, mst, marcel.apfelbaum, yang.zhong, jing2.liu,
	vkuznets, michael.roth, wei.huang2, berrange, bdas, pbonzini,
	richard.henderson

Hi Maksim,

Thanks for looking into this. I will fix the bits mentioned below in
the upcoming Genoa/Turin model update.

I have a few comments below.

On 11/8/2024 12:15 PM, Maksim Davydov wrote:
> Hi!
> I compared EPYC-Genoa CPU model with CPUID output from real EPYC Genoa 
> host. I found some mismatches that confused me. Could you help me to 
> understand them?
> 
> On 5/4/23 23:53, Babu Moger wrote:
>> Adds the support for AMD EPYC Genoa generation processors. The model
>> display for the new processor will be EPYC-Genoa.
>>
>> Adds the following new feature bits on top of the feature bits from
>> the previous generation EPYC models.
>>
>> avx512f         : AVX-512 Foundation instruction
>> avx512dq        : AVX-512 Doubleword & Quadword Instruction
>> avx512ifma      : AVX-512 Integer Fused Multiply Add instruction
>> avx512cd        : AVX-512 Conflict Detection instruction
>> avx512bw        : AVX-512 Byte and Word Instructions
>> avx512vl        : AVX-512 Vector Length Extension Instructions
>> avx512vbmi      : AVX-512 Vector Byte Manipulation Instruction
>> avx512_vbmi2    : AVX-512 Additional Vector Byte Manipulation Instruction
>> gfni            : AVX-512 Galois Field New Instructions
>> avx512_vnni     : AVX-512 Vector Neural Network Instructions
>> avx512_bitalg   : AVX-512 Bit Algorithms, add bit algorithms Instructions
>> avx512_vpopcntdq: AVX-512 AVX-512 Vector Population Count Doubleword and
>>                    Quadword Instructions
>> avx512_bf16    : AVX-512 BFLOAT16 instructions
>> la57            : 57-bit virtual address support (5-level Page Tables)
>> vnmi            : Virtual NMI (VNMI) allows the hypervisor to inject the NMI
>>                    into the guest without using Event Injection mechanism
>>                    meaning not required to track the guest NMI and intercepting
>>                    the IRET.
>> auto-ibrs       : The AMD Zen4 core supports a new feature called Automatic IBRS.
>>                    It is a "set-and-forget" feature that means that, unlike e.g.,
>>                    s/w-toggled SPEC_CTRL.IBRS, h/w manages its IBRS mitigation
>>                    resources automatically across CPL transitions.
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>>   target/i386/cpu.c | 122 ++++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 122 insertions(+)
>>
>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>> index d50ace84bf..71fe1e02ee 100644
>> --- a/target/i386/cpu.c
>> +++ b/target/i386/cpu.c
>> @@ -1973,6 +1973,56 @@ static const CPUCaches epyc_milan_v2_cache_info = {
>>       },
>>   };
>> +static const CPUCaches epyc_genoa_cache_info = {
>> +    .l1d_cache = &(CPUCacheInfo) {
>> +        .type = DATA_CACHE,
>> +        .level = 1,
>> +        .size = 32 * KiB,
>> +        .line_size = 64,
>> +        .associativity = 8,
>> +        .partitions = 1,
>> +        .sets = 64,
>> +        .lines_per_tag = 1,
>> +        .self_init = 1,
>> +        .no_invd_sharing = true,
>> +    },
>> +    .l1i_cache = &(CPUCacheInfo) {
>> +        .type = INSTRUCTION_CACHE,
>> +        .level = 1,
>> +        .size = 32 * KiB,
>> +        .line_size = 64,
>> +        .associativity = 8,
>> +        .partitions = 1,
>> +        .sets = 64,
>> +        .lines_per_tag = 1,
>> +        .self_init = 1,
>> +        .no_invd_sharing = true,
>> +    },
>> +    .l2_cache = &(CPUCacheInfo) {
>> +        .type = UNIFIED_CACHE,
>> +        .level = 2,
>> +        .size = 1 * MiB,
>> +        .line_size = 64,
>> +        .associativity = 8,
>> +        .partitions = 1,
>> +        .sets = 2048,
>> +        .lines_per_tag = 1,
> 
> 1. Why L2 cache is not shown as inclusive and self-initializing?
> 
> PPR for AMD Family 19h Model 11 says for L2 (0x8000001d):
> * cache inclusive. Read-only. Reset: Fixed,1.
> * cache is self-initializing. Read-only. Reset: Fixed,1.

Yes. That is correct. This needs to be fixed. I will fix it.
> 
>> +    },
>> +    .l3_cache = &(CPUCacheInfo) {
>> +        .type = UNIFIED_CACHE,
>> +        .level = 3,
>> +        .size = 32 * MiB,
>> +        .line_size = 64,
>> +        .associativity = 16,
>> +        .partitions = 1,
>> +        .sets = 32768,
>> +        .lines_per_tag = 1,
>> +        .self_init = true,
>> +        .inclusive = true,
>> +        .complex_indexing = false,
> 
> 2. Why L3 cache is shown as inclusive? Why is it not shown in L3 that 
> the WBINVD/INVD instruction is not guaranteed to invalidate all lower 
> level caches (0 bit)?
> 
> PPR for AMD Family 19h Model 11 says for L2 (0x8000001d):
> * cache inclusive. Read-only. Reset: Fixed,0.
> * Write-Back Invalidate/Invalidate. Read-only. Reset: Fixed,1.
> 

Yes. Both of these need to be fixed. I will fix them.

> 
> 
> 3. Why the default stub is used for TLB, but not real values as for 
> other caches?

Can you please elaborate on this?

> 
>> +    },
>> +};
>> +
>>   /* The following VMX features are not supported by KVM and are left out in the
>>    * CPU definitions:
>>    *
>> @@ -4472,6 +4522,78 @@ static const X86CPUDefinition builtin_x86_defs[] = {
>>               { /* end of list */ }
>>           }
>>       },
>> +    {
>> +        .name = "EPYC-Genoa",
>> +        .level = 0xd,
>> +        .vendor = CPUID_VENDOR_AMD,
>> +        .family = 25,
>> +        .model = 17,
>> +        .stepping = 0,
>> +        .features[FEAT_1_EDX] =
>> +            CPUID_SSE2 | CPUID_SSE | CPUID_FXSR | CPUID_MMX | CPUID_CLFLUSH |
>> +            CPUID_PSE36 | CPUID_PAT | CPUID_CMOV | CPUID_MCA | CPUID_PGE |
>> +            CPUID_MTRR | CPUID_SEP | CPUID_APIC | CPUID_CX8 | CPUID_MCE |
>> +            CPUID_PAE | CPUID_MSR | CPUID_TSC | CPUID_PSE | CPUID_DE |
>> +            CPUID_VME | CPUID_FP87,
>> +        .features[FEAT_1_ECX] =
>> +            CPUID_EXT_RDRAND | CPUID_EXT_F16C | CPUID_EXT_AVX |
>> +            CPUID_EXT_XSAVE | CPUID_EXT_AES |  CPUID_EXT_POPCNT |
>> +            CPUID_EXT_MOVBE | CPUID_EXT_SSE42 | CPUID_EXT_SSE41 |
>> +            CPUID_EXT_PCID | CPUID_EXT_CX16 | CPUID_EXT_FMA |
>> +            CPUID_EXT_SSSE3 | CPUID_EXT_MONITOR | CPUID_EXT_PCLMULQDQ |
>> +            CPUID_EXT_SSE3,
>> +        .features[FEAT_8000_0001_EDX] =
>> +            CPUID_EXT2_LM | CPUID_EXT2_RDTSCP | CPUID_EXT2_PDPE1GB |
>> +            CPUID_EXT2_FFXSR | CPUID_EXT2_MMXEXT | CPUID_EXT2_NX |
>> +            CPUID_EXT2_SYSCALL,
>> +        .features[FEAT_8000_0001_ECX] =
>> +            CPUID_EXT3_OSVW | CPUID_EXT3_3DNOWPREFETCH |
>> +            CPUID_EXT3_MISALIGNSSE | CPUID_EXT3_SSE4A | CPUID_EXT3_ABM |
>> +            CPUID_EXT3_CR8LEG | CPUID_EXT3_SVM | CPUID_EXT3_LAHF_LM |
>> +            CPUID_EXT3_TOPOEXT | CPUID_EXT3_PERFCORE,
>> +        .features[FEAT_8000_0008_EBX] =
>> +            CPUID_8000_0008_EBX_CLZERO | CPUID_8000_0008_EBX_XSAVEERPTR |
>> +            CPUID_8000_0008_EBX_WBNOINVD | CPUID_8000_0008_EBX_IBPB |
>> +            CPUID_8000_0008_EBX_IBRS | CPUID_8000_0008_EBX_STIBP |
>> +            CPUID_8000_0008_EBX_STIBP_ALWAYS_ON |
>> +            CPUID_8000_0008_EBX_AMD_SSBD | CPUID_8000_0008_EBX_AMD_PSFD,
> 
> 4. Why 0x80000008_EBX features related to speculation vulnerabilities 
> (BTC_NO, IBPB_RET, IbrsPreferred, INT_WBINVD) are not set?

KVM does not expose these bits to the guests yet.

I normally check using the ioctl KVM_GET_SUPPORTED_CPUID.
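
A minimal sketch of such a check follows. It issues KVM_GET_SUPPORTED_CPUID
on /dev/kvm and looks up a single bit in the returned entries. The helper
names are hypothetical, the buffer size of 256 entries is an assumption
(the kernel fails the ioctl with E2BIG if it is too small), and error
handling is reduced to returning NULL.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/kvm.h>

/* Find a leaf/subleaf in an entry array; returns NULL if it is absent. */
const struct kvm_cpuid_entry2 *find_leaf(const struct kvm_cpuid_entry2 *e,
                                         uint32_t nent, uint32_t function,
                                         uint32_t index)
{
    for (uint32_t i = 0; i < nent; i++) {
        if (e[i].function == function && e[i].index == index) {
            return &e[i];
        }
    }
    return NULL;
}

/* Returns 1 if the EBX bit of subleaf 0 is set, 0 if clear or leaf absent. */
int ebx_bit_set(const struct kvm_cpuid_entry2 *e, uint32_t nent,
                uint32_t function, unsigned bit)
{
    assert(bit < 32);
    const struct kvm_cpuid_entry2 *leaf = find_leaf(e, nent, function, 0);
    return leaf && ((leaf->ebx >> bit) & 1U);
}

/* Query the host KVM; the caller frees the result. NULL on any failure. */
struct kvm_cpuid2 *get_supported_cpuid(void)
{
    const uint32_t nent = 256;   /* assumed to be large enough */
    int fd = open("/dev/kvm", O_RDWR);
    if (fd < 0) {
        return NULL;
    }
    struct kvm_cpuid2 *c =
        calloc(1, sizeof(*c) + nent * sizeof(struct kvm_cpuid_entry2));
    if (c) {
        c->nent = nent;
        if (ioctl(fd, KVM_GET_SUPPORTED_CPUID, c) < 0) {
            free(c);
            c = NULL;
        }
    }
    close(fd);
    return c;
}
```

A caller would obtain `c = get_supported_cpuid()` and, if it is non-NULL,
test for example `ebx_bit_set(c->entries, c->nent, 0x80000008, 29)`, where
bit 29 is assumed here to be BTC_NO. A result of 0 means KVM on that host
does not offer the bit to guests, independent of what the bare-metal CPUID
reports.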


> 
>> +        .features[FEAT_8000_0021_EAX] =
>> +            CPUID_8000_0021_EAX_No_NESTED_DATA_BP |
>> +            CPUID_8000_0021_EAX_LFENCE_ALWAYS_SERIALIZING |
>> +            CPUID_8000_0021_EAX_NULL_SEL_CLR_BASE |
>> +            CPUID_8000_0021_EAX_AUTO_IBRS,
> 
> 5. Why some 0x80000021_EAX features are not set? 
> (FsGsKernelGsBaseNonSerializing, FSRC and FSRS)

KVM does not expose the FSRC and FSRS bits to guests yet.

KVM does report the FsGsKernelGsBaseNonSerializing bit. I will check
whether we can add this bit to the Genoa and Turin models.

> 
>> +        .features[FEAT_7_0_EBX] =
>> +            CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 | CPUID_7_0_EBX_AVX2 |
>> +            CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_BMI2 | CPUID_7_0_EBX_ERMS |
>> +            CPUID_7_0_EBX_INVPCID | CPUID_7_0_EBX_AVX512F |
>> +            CPUID_7_0_EBX_AVX512DQ | CPUID_7_0_EBX_RDSEED | CPUID_7_0_EBX_ADX |
>> +            CPUID_7_0_EBX_SMAP | CPUID_7_0_EBX_AVX512IFMA |
>> +            CPUID_7_0_EBX_CLFLUSHOPT | CPUID_7_0_EBX_CLWB |
>> +            CPUID_7_0_EBX_AVX512CD | CPUID_7_0_EBX_SHA_NI |
>> +            CPUID_7_0_EBX_AVX512BW | CPUID_7_0_EBX_AVX512VL,
>> +        .features[FEAT_7_0_ECX] =
>> +            CPUID_7_0_ECX_AVX512_VBMI | CPUID_7_0_ECX_UMIP | CPUID_7_0_ECX_PKU |
>> +            CPUID_7_0_ECX_AVX512_VBMI2 | CPUID_7_0_ECX_GFNI |
>> +            CPUID_7_0_ECX_VAES | CPUID_7_0_ECX_VPCLMULQDQ |
>> +            CPUID_7_0_ECX_AVX512VNNI | CPUID_7_0_ECX_AVX512BITALG |
>> +            CPUID_7_0_ECX_AVX512_VPOPCNTDQ | CPUID_7_0_ECX_LA57 |
>> +            CPUID_7_0_ECX_RDPID,
>> +        .features[FEAT_7_0_EDX] =
>> +            CPUID_7_0_EDX_FSRM,
> 
> 6. Why L1D_FLUSH is not set? Because only vulnerable MMIO stale data 
> processors have to use it, am I right?

KVM does not expose L1D_FLUSH to guests. I am not sure why; I need to
investigate.


> 
>> +        .features[FEAT_7_1_EAX] =
>> +            CPUID_7_1_EAX_AVX512_BF16,
>> +        .features[FEAT_XSAVE] =
>> +            CPUID_XSAVE_XSAVEOPT | CPUID_XSAVE_XSAVEC |
>> +            CPUID_XSAVE_XGETBV1 | CPUID_XSAVE_XSAVES,
>> +        .features[FEAT_6_EAX] =
>> +            CPUID_6_EAX_ARAT,
>> +        .features[FEAT_SVM] =
>> +            CPUID_SVM_NPT | CPUID_SVM_NRIPSAVE | CPUID_SVM_VNMI |
>> +            CPUID_SVM_SVME_ADDR_CHK,
>> +        .xlevel = 0x80000022,
>> +        .model_id = "AMD EPYC-Genoa Processor",
>> +        .cache_info = &epyc_genoa_cache_info,
>> +    },
>>   };
>>   /*
> 

-- 
- Babu Moger



* Re: [PATCH v4 7/7] target/i386: Add EPYC-Genoa model to support Zen 4 processor series
  2024-11-08 20:56     ` Moger, Babu
@ 2024-11-12 10:09       ` Maksim Davydov
  2024-11-12 16:23         ` Moger, Babu
  2024-11-13 16:23         ` Moger, Babu
  0 siblings, 2 replies; 18+ messages in thread
From: Maksim Davydov @ 2024-11-12 10:09 UTC (permalink / raw)
  To: babu.moger
  Cc: weijiang.yang, philmd, dwmw, paul, joao.m.martins, qemu-devel,
	mtosatti, kvm, mst, marcel.apfelbaum, yang.zhong, jing2.liu,
	vkuznets, michael.roth, wei.huang2, berrange, bdas, pbonzini,
	richard.henderson



On 11/8/24 23:56, Moger, Babu wrote:
> Hi Maxim,
> 
> Thanks for looking into this. I will fix the bits I mentioned below in 
> upcoming Genoa/Turin model update.
> 
> I have few comments below.
> 
> On 11/8/2024 12:15 PM, Maksim Davydov wrote:
>> Hi!
>> I compared EPYC-Genoa CPU model with CPUID output from real EPYC Genoa 
>> host. I found some mismatches that confused me. Could you help me to 
>> understand them?
>>
>> On 5/4/23 23:53, Babu Moger wrote:
>>> Adds the support for AMD EPYC Genoa generation processors. The model
>>> display for the new processor will be EPYC-Genoa.
>>>
>>> Adds the following new feature bits on top of the feature bits from
>>> the previous generation EPYC models.
>>>
>>> avx512f         : AVX-512 Foundation instruction
>>> avx512dq        : AVX-512 Doubleword & Quadword Instruction
>>> avx512ifma      : AVX-512 Integer Fused Multiply Add instruction
>>> avx512cd        : AVX-512 Conflict Detection instruction
>>> avx512bw        : AVX-512 Byte and Word Instructions
>>> avx512vl        : AVX-512 Vector Length Extension Instructions
>>> avx512vbmi      : AVX-512 Vector Byte Manipulation Instruction
>>> avx512_vbmi2    : AVX-512 Additional Vector Byte Manipulation 
>>> Instruction
>>> gfni            : AVX-512 Galois Field New Instructions
>>> avx512_vnni     : AVX-512 Vector Neural Network Instructions
>>> avx512_bitalg   : AVX-512 Bit Algorithms, add bit algorithms 
>>> Instructions
>>> avx512_vpopcntdq: AVX-512 AVX-512 Vector Population Count Doubleword and
>>>                    Quadword Instructions
>>> avx512_bf16    : AVX-512 BFLOAT16 instructions
>>> la57            : 57-bit virtual address support (5-level Page Tables)
>>> vnmi            : Virtual NMI (VNMI) allows the hypervisor to inject 
>>> the NMI
>>>                    into the guest without using Event Injection 
>>> mechanism
>>>                    meaning not required to track the guest NMI and 
>>> intercepting
>>>                    the IRET.
>>> auto-ibrs       : The AMD Zen4 core supports a new feature called 
>>> Automatic IBRS.
>>>                    It is a "set-and-forget" feature that means that, 
>>> unlike e.g.,
>>>                    s/w-toggled SPEC_CTRL.IBRS, h/w manages its IBRS 
>>> mitigation
>>>                    resources automatically across CPL transitions.
>>>
>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>>> ---
>>>   target/i386/cpu.c | 122 ++++++++++++++++++++++++++++++++++++++++++++++
>>>   1 file changed, 122 insertions(+)
>>>
>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>> index d50ace84bf..71fe1e02ee 100644
>>> --- a/target/i386/cpu.c
>>> +++ b/target/i386/cpu.c
>>> @@ -1973,6 +1973,56 @@ static const CPUCaches 
>>> epyc_milan_v2_cache_info = {
>>>       },
>>>   };
>>> +static const CPUCaches epyc_genoa_cache_info = {
>>> +    .l1d_cache = &(CPUCacheInfo) {
>>> +        .type = DATA_CACHE,
>>> +        .level = 1,
>>> +        .size = 32 * KiB,
>>> +        .line_size = 64,
>>> +        .associativity = 8,
>>> +        .partitions = 1,
>>> +        .sets = 64,
>>> +        .lines_per_tag = 1,
>>> +        .self_init = 1,
>>> +        .no_invd_sharing = true,
>>> +    },
>>> +    .l1i_cache = &(CPUCacheInfo) {
>>> +        .type = INSTRUCTION_CACHE,
>>> +        .level = 1,
>>> +        .size = 32 * KiB,
>>> +        .line_size = 64,
>>> +        .associativity = 8,
>>> +        .partitions = 1,
>>> +        .sets = 64,
>>> +        .lines_per_tag = 1,
>>> +        .self_init = 1,
>>> +        .no_invd_sharing = true,
>>> +    },
>>> +    .l2_cache = &(CPUCacheInfo) {
>>> +        .type = UNIFIED_CACHE,
>>> +        .level = 2,
>>> +        .size = 1 * MiB,
>>> +        .line_size = 64,
>>> +        .associativity = 8,
>>> +        .partitions = 1,
>>> +        .sets = 2048,
>>> +        .lines_per_tag = 1,
>>
>> 1. Why L2 cache is not shown as inclusive and self-initializing?
>>
>> PPR for AMD Family 19h Model 11h says for L2 (0x8000001d):
>> * cache inclusive. Read-only. Reset: Fixed,1.
>> * cache is self-initializing. Read-only. Reset: Fixed,1.
> 
> Yes. That is correct. This needs to be fixed. I will fix it.
>>
>>> +    },
>>> +    .l3_cache = &(CPUCacheInfo) {
>>> +        .type = UNIFIED_CACHE,
>>> +        .level = 3,
>>> +        .size = 32 * MiB,
>>> +        .line_size = 64,
>>> +        .associativity = 16,
>>> +        .partitions = 1,
>>> +        .sets = 32768,
>>> +        .lines_per_tag = 1,
>>> +        .self_init = true,
>>> +        .inclusive = true,
>>> +        .complex_indexing = false,
>>
>> 2. Why L3 cache is shown as inclusive? Why is it not shown in L3 that 
>> the WBINVD/INVD instruction is not guaranteed to invalidate all lower 
>> level caches (0 bit)?
>>
>> PPR for AMD Family 19h Model 11h says for L3 (0x8000001d):
>> * cache inclusive. Read-only. Reset: Fixed,0.
>> * Write-Back Invalidate/Invalidate. Read-only. Reset: Fixed,1.
>>
> 
> Yes. Both of these need to be fixed. I will fix them.
> 
>>
>>
>> 3. Why the default stub is used for TLB, but not real values as for 
>> other caches?
> 
> Can you please elaborate on this?
> 

For the L1i, L1d, L2 and L3 caches we provide the correct 
characteristics. In contrast, for the L1i TLB, L1d TLB, L2i TLB and 
L2d TLB (0x80000005 and 0x80000006) we use the same values for all CPU 
models. This sometimes gives strange results. For instance, the current 
default value in QEMU for L2 TLB associativity for 4 KB pages is 4, but 
4 is a reserved value for Genoa (according to the PPR for Family 19h 
Model 11h).
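
To make the TLB point concrete, here is a minimal sketch of how the L2
TLB fields for 4 KB pages are packed into CPUID Fn8000_0006 EBX. The
struct and helper names are hypothetical and are not QEMU code; the
field layout follows the AMD APM, and the example value used in the
note below (associativity field 4, 512 entries) is only illustrative.

```c
#include <stdint.h>

/* Fields of CPUID Fn8000_0006 EBX: L2 TLB for 4 KB pages. */
struct l2_tlb_4k {
    uint8_t  dtlb_assoc;    /* bits 31:28, encoded associativity */
    uint16_t dtlb_entries;  /* bits 27:16, number of entries */
    uint8_t  itlb_assoc;    /* bits 15:12, encoded associativity */
    uint16_t itlb_entries;  /* bits 11:0, number of entries */
};

/* Split a raw EBX value into its four fields (hypothetical helper). */
static struct l2_tlb_4k decode_l2_tlb_4k(uint32_t ebx)
{
    struct l2_tlb_4k t = {
        .dtlb_assoc   = (ebx >> 28) & 0xf,
        .dtlb_entries = (ebx >> 16) & 0xfff,
        .itlb_assoc   = (ebx >> 12) & 0xf,
        .itlb_entries = ebx & 0xfff,
    };
    return t;
}
```

Decoding 0x42004200 this way yields an associativity field of 4 for
both the data and the instruction TLB, which is the encoding a guest
would see when the same stub is used for every CPU model.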

>>
>>> +    },
>>> +};
>>> +
>>>   /* The following VMX features are not supported by KVM and are left 
>>> out in the
>>>    * CPU definitions:
>>>    *
>>> @@ -4472,6 +4522,78 @@ static const X86CPUDefinition 
>>> builtin_x86_defs[] = {
>>>               { /* end of list */ }
>>>           }
>>>       },
>>> +    {
>>> +        .name = "EPYC-Genoa",
>>> +        .level = 0xd,
>>> +        .vendor = CPUID_VENDOR_AMD,
>>> +        .family = 25,
>>> +        .model = 17,
>>> +        .stepping = 0,
>>> +        .features[FEAT_1_EDX] =
>>> +            CPUID_SSE2 | CPUID_SSE | CPUID_FXSR | CPUID_MMX | 
>>> CPUID_CLFLUSH |
>>> +            CPUID_PSE36 | CPUID_PAT | CPUID_CMOV | CPUID_MCA | 
>>> CPUID_PGE |
>>> +            CPUID_MTRR | CPUID_SEP | CPUID_APIC | CPUID_CX8 | 
>>> CPUID_MCE |
>>> +            CPUID_PAE | CPUID_MSR | CPUID_TSC | CPUID_PSE | CPUID_DE |
>>> +            CPUID_VME | CPUID_FP87,
>>> +        .features[FEAT_1_ECX] =
>>> +            CPUID_EXT_RDRAND | CPUID_EXT_F16C | CPUID_EXT_AVX |
>>> +            CPUID_EXT_XSAVE | CPUID_EXT_AES |  CPUID_EXT_POPCNT |
>>> +            CPUID_EXT_MOVBE | CPUID_EXT_SSE42 | CPUID_EXT_SSE41 |
>>> +            CPUID_EXT_PCID | CPUID_EXT_CX16 | CPUID_EXT_FMA |
>>> +            CPUID_EXT_SSSE3 | CPUID_EXT_MONITOR | CPUID_EXT_PCLMULQDQ |
>>> +            CPUID_EXT_SSE3,
>>> +        .features[FEAT_8000_0001_EDX] =
>>> +            CPUID_EXT2_LM | CPUID_EXT2_RDTSCP | CPUID_EXT2_PDPE1GB |
>>> +            CPUID_EXT2_FFXSR | CPUID_EXT2_MMXEXT | CPUID_EXT2_NX |
>>> +            CPUID_EXT2_SYSCALL,
>>> +        .features[FEAT_8000_0001_ECX] =
>>> +            CPUID_EXT3_OSVW | CPUID_EXT3_3DNOWPREFETCH |
>>> +            CPUID_EXT3_MISALIGNSSE | CPUID_EXT3_SSE4A | 
>>> CPUID_EXT3_ABM |
>>> +            CPUID_EXT3_CR8LEG | CPUID_EXT3_SVM | CPUID_EXT3_LAHF_LM |
>>> +            CPUID_EXT3_TOPOEXT | CPUID_EXT3_PERFCORE,
>>> +        .features[FEAT_8000_0008_EBX] =
>>> +            CPUID_8000_0008_EBX_CLZERO | 
>>> CPUID_8000_0008_EBX_XSAVEERPTR |
>>> +            CPUID_8000_0008_EBX_WBNOINVD | CPUID_8000_0008_EBX_IBPB |
>>> +            CPUID_8000_0008_EBX_IBRS | CPUID_8000_0008_EBX_STIBP |
>>> +            CPUID_8000_0008_EBX_STIBP_ALWAYS_ON |
>>> +            CPUID_8000_0008_EBX_AMD_SSBD | 
>>> CPUID_8000_0008_EBX_AMD_PSFD,
>>
>> 4. Why 0x80000008_EBX features related to speculation vulnerabilities 
>> (BTC_NO, IBPB_RET, IbrsPreferred, INT_WBINVD) are not set?
> 
> KVM does not expose these bits to the guests yet.
> 
> I normally check using the ioctl KVM_GET_SUPPORTED_CPUID.
> 

I'm not sure, but at least the first two of these features seem 
helpful for choosing the appropriate mitigation. Do you think we 
should add them to KVM?
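
For reference, the KVM_GET_SUPPORTED_CPUID check mentioned above could
look roughly like the sketch below. The helper names and the buffer
size of 256 entries are assumptions made for this example; only the
ioctl itself and the structures from <linux/kvm.h> are real.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/kvm.h>

/* Find one leaf/subleaf in a KVM_GET_SUPPORTED_CPUID result, or NULL. */
static const struct kvm_cpuid_entry2 *
find_cpuid_entry(const struct kvm_cpuid2 *cpuid, uint32_t function,
                 uint32_t index)
{
    for (uint32_t i = 0; i < cpuid->nent; i++) {
        const struct kvm_cpuid_entry2 *e = &cpuid->entries[i];
        if (e->function == function && e->index == index) {
            return e;
        }
    }
    return NULL;
}

/* Ask KVM which CPUID bits it can expose; the caller frees the result. */
static struct kvm_cpuid2 *get_supported_cpuid(void)
{
    uint32_t nent = 256; /* assumed large enough; KVM fails with E2BIG if not */
    struct kvm_cpuid2 *cpuid;
    int fd = open("/dev/kvm", O_RDWR);

    if (fd < 0) {
        return NULL;
    }
    cpuid = calloc(1, sizeof(*cpuid) + nent * sizeof(cpuid->entries[0]));
    if (cpuid) {
        cpuid->nent = nent;
        if (ioctl(fd, KVM_GET_SUPPORTED_CPUID, cpuid) < 0) {
            free(cpuid);
            cpuid = NULL;
        }
    }
    close(fd);
    return cpuid;
}
```

A caller would then look up function 0x80000008, index 0, and test the
bits of interest in the returned entry's ebx field.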

> 
>>
>>> +        .features[FEAT_8000_0021_EAX] =
>>> +            CPUID_8000_0021_EAX_No_NESTED_DATA_BP |
>>> +            CPUID_8000_0021_EAX_LFENCE_ALWAYS_SERIALIZING |
>>> +            CPUID_8000_0021_EAX_NULL_SEL_CLR_BASE |
>>> +            CPUID_8000_0021_EAX_AUTO_IBRS,
>>
>> 5. Why some 0x80000021_EAX features are not set? 
>> (FsGsKernelGsBaseNonSerializing, FSRC and FSRS)
> 
> KVM does not expose FSRC and FSRS bits to the guests yet.

But KVM already exposes the same features (CPUID 0x7, ECX=1: EAX bits 
11 and 12) for Intel CPU models. Should we add these bits for AMD to 
KVM as well?
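
As a rough illustration of why the two vendors need separate handling:
the same two features sit at different bit positions in the Intel leaf
and in the AMD leaf. The macro and function names below are
hypothetical; the bit positions are taken from the Intel SDM (leaf 7,
subleaf 1, EAX bits 11 and 12) and the AMD APM (Fn8000_0021 EAX bits 10
and 11) and should be double-checked against those documents.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.(EAX=7,ECX=1):EAX, as defined in the Intel SDM. */
#define LEAF_7_1_EAX_FSRS        (1u << 11)  /* fast short REP STOSB */
#define LEAF_7_1_EAX_FSRC        (1u << 12)  /* fast short REP CMPSB/SCASB */

/* CPUID Fn8000_0021 EAX, as defined in the AMD APM. */
#define LEAF_8000_0021_EAX_FSRS  (1u << 10)
#define LEAF_8000_0021_EAX_FSRC  (1u << 11)

/* True if either vendor's leaf reports fast short REP STOSB. */
static bool has_fsrs(uint32_t leaf_7_1_eax, uint32_t leaf_8000_0021_eax)
{
    return (leaf_7_1_eax & LEAF_7_1_EAX_FSRS) ||
           (leaf_8000_0021_eax & LEAF_8000_0021_EAX_FSRS);
}

/* True if either vendor's leaf reports fast short REP CMPSB. */
static bool has_fsrc(uint32_t leaf_7_1_eax, uint32_t leaf_8000_0021_eax)
{
    return (leaf_7_1_eax & LEAF_7_1_EAX_FSRC) ||
           (leaf_8000_0021_eax & LEAF_8000_0021_EAX_FSRC);
}
```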

> 
> The KVM reports the bit FsGsKernelGsBaseNonSerializing. I will check if 
> we can add this bit to the Genoa and Turin.
> 
>>
>>> +        .features[FEAT_7_0_EBX] =
>>> +            CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 | 
>>> CPUID_7_0_EBX_AVX2 |
>>> +            CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_BMI2 | 
>>> CPUID_7_0_EBX_ERMS |
>>> +            CPUID_7_0_EBX_INVPCID | CPUID_7_0_EBX_AVX512F |
>>> +            CPUID_7_0_EBX_AVX512DQ | CPUID_7_0_EBX_RDSEED | 
>>> CPUID_7_0_EBX_ADX |
>>> +            CPUID_7_0_EBX_SMAP | CPUID_7_0_EBX_AVX512IFMA |
>>> +            CPUID_7_0_EBX_CLFLUSHOPT | CPUID_7_0_EBX_CLWB |
>>> +            CPUID_7_0_EBX_AVX512CD | CPUID_7_0_EBX_SHA_NI |
>>> +            CPUID_7_0_EBX_AVX512BW | CPUID_7_0_EBX_AVX512VL,
>>> +        .features[FEAT_7_0_ECX] =
>>> +            CPUID_7_0_ECX_AVX512_VBMI | CPUID_7_0_ECX_UMIP | 
>>> CPUID_7_0_ECX_PKU |
>>> +            CPUID_7_0_ECX_AVX512_VBMI2 | CPUID_7_0_ECX_GFNI |
>>> +            CPUID_7_0_ECX_VAES | CPUID_7_0_ECX_VPCLMULQDQ |
>>> +            CPUID_7_0_ECX_AVX512VNNI | CPUID_7_0_ECX_AVX512BITALG |
>>> +            CPUID_7_0_ECX_AVX512_VPOPCNTDQ | CPUID_7_0_ECX_LA57 |
>>> +            CPUID_7_0_ECX_RDPID,
>>> +        .features[FEAT_7_0_EDX] =
>>> +            CPUID_7_0_EDX_FSRM,
>>
>> 6. Why L1D_FLUSH is not set? Because only vulnerable MMIO stale data 
>> processors have to use it, am I right?
> 
> KVM does not expose L1D_FLUSH to the guests. Not sure why. Need to 
> investigate.
> 

It seems that KVM has exposed L1D_FLUSH since commit da3db168fb67.

> 
>>
>>> +        .features[FEAT_7_1_EAX] =
>>> +            CPUID_7_1_EAX_AVX512_BF16,
>>> +        .features[FEAT_XSAVE] =
>>> +            CPUID_XSAVE_XSAVEOPT | CPUID_XSAVE_XSAVEC |
>>> +            CPUID_XSAVE_XGETBV1 | CPUID_XSAVE_XSAVES,
>>> +        .features[FEAT_6_EAX] =
>>> +            CPUID_6_EAX_ARAT,
>>> +        .features[FEAT_SVM] =
>>> +            CPUID_SVM_NPT | CPUID_SVM_NRIPSAVE | CPUID_SVM_VNMI |
>>> +            CPUID_SVM_SVME_ADDR_CHK,
>>> +        .xlevel = 0x80000022,
>>> +        .model_id = "AMD EPYC-Genoa Processor",
>>> +        .cache_info = &epyc_genoa_cache_info,
>>> +    },
>>>   };
>>>   /*
>>
> 

So, if you don't mind, I will send a patch to KVM within a few hours. I 
will add the bits for FSRC and FSRS, and some bits from 0x80000008_EBX.

-- 
Best regards,
Maksim Davydov



* Re: [PATCH v4 7/7] target/i386: Add EPYC-Genoa model to support Zen 4 processor series
  2024-11-12 10:09       ` Maksim Davydov
@ 2024-11-12 16:23         ` Moger, Babu
  2024-11-13 14:15           ` Maksim Davydov
  2024-11-13 16:23         ` Moger, Babu
  1 sibling, 1 reply; 18+ messages in thread
From: Moger, Babu @ 2024-11-12 16:23 UTC (permalink / raw)
  To: Maksim Davydov
  Cc: weijiang.yang, philmd, dwmw, paul, joao.m.martins, qemu-devel,
	mtosatti, kvm, mst, marcel.apfelbaum, yang.zhong, jing2.liu,
	vkuznets, michael.roth, wei.huang2, berrange, bdas, pbonzini,
	richard.henderson

Hi Maksim,

On 11/12/24 04:09, Maksim Davydov wrote:
> 
> 
> On 11/8/24 23:56, Moger, Babu wrote:
>> Hi Maksim,
>>
>> Thanks for looking into this. I will fix the bits I mentioned below in
>> the upcoming Genoa/Turin model update.
>>
>> I have a few comments below.
>>
>> On 11/8/2024 12:15 PM, Maksim Davydov wrote:
>>> Hi!
>>> I compared EPYC-Genoa CPU model with CPUID output from real EPYC Genoa
>>> host. I found some mismatches that confused me. Could you help me to
>>> understand them?
>>>
>>> On 5/4/23 23:53, Babu Moger wrote:
>>>> Adds the support for AMD EPYC Genoa generation processors. The model
>>>> display for the new processor will be EPYC-Genoa.
>>>>
>>>> Adds the following new feature bits on top of the feature bits from
>>>> the previous generation EPYC models.
>>>>
>>>> avx512f         : AVX-512 Foundation instruction
>>>> avx512dq        : AVX-512 Doubleword & Quadword Instruction
>>>> avx512ifma      : AVX-512 Integer Fused Multiply Add instruction
>>>> avx512cd        : AVX-512 Conflict Detection instruction
>>>> avx512bw        : AVX-512 Byte and Word Instructions
>>>> avx512vl        : AVX-512 Vector Length Extension Instructions
>>>> avx512vbmi      : AVX-512 Vector Byte Manipulation Instruction
>>>> avx512_vbmi2    : AVX-512 Additional Vector Byte Manipulation Instruction
>>>> gfni            : AVX-512 Galois Field New Instructions
>>>> avx512_vnni     : AVX-512 Vector Neural Network Instructions
>>>> avx512_bitalg   : AVX-512 Bit Algorithms, add bit algorithms Instructions
>>>> avx512_vpopcntdq: AVX-512 AVX-512 Vector Population Count Doubleword and
>>>>                    Quadword Instructions
>>>> avx512_bf16    : AVX-512 BFLOAT16 instructions
>>>> la57            : 57-bit virtual address support (5-level Page Tables)
>>>> vnmi            : Virtual NMI (VNMI) allows the hypervisor to inject
>>>> the NMI
>>>>                    into the guest without using Event Injection mechanism
>>>>                    meaning not required to track the guest NMI and
>>>> intercepting
>>>>                    the IRET.
>>>> auto-ibrs       : The AMD Zen4 core supports a new feature called
>>>> Automatic IBRS.
>>>>                    It is a "set-and-forget" feature that means that,
>>>> unlike e.g.,
>>>>                    s/w-toggled SPEC_CTRL.IBRS, h/w manages its IBRS
>>>> mitigation
>>>>                    resources automatically across CPL transitions.
>>>>
>>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>>>> ---
>>>>   target/i386/cpu.c | 122 ++++++++++++++++++++++++++++++++++++++++++++++
>>>>   1 file changed, 122 insertions(+)
>>>>
>>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>>> index d50ace84bf..71fe1e02ee 100644
>>>> --- a/target/i386/cpu.c
>>>> +++ b/target/i386/cpu.c
>>>> @@ -1973,6 +1973,56 @@ static const CPUCaches epyc_milan_v2_cache_info
>>>> = {
>>>>       },
>>>>   };
>>>> +static const CPUCaches epyc_genoa_cache_info = {
>>>> +    .l1d_cache = &(CPUCacheInfo) {
>>>> +        .type = DATA_CACHE,
>>>> +        .level = 1,
>>>> +        .size = 32 * KiB,
>>>> +        .line_size = 64,
>>>> +        .associativity = 8,
>>>> +        .partitions = 1,
>>>> +        .sets = 64,
>>>> +        .lines_per_tag = 1,
>>>> +        .self_init = 1,
>>>> +        .no_invd_sharing = true,
>>>> +    },
>>>> +    .l1i_cache = &(CPUCacheInfo) {
>>>> +        .type = INSTRUCTION_CACHE,
>>>> +        .level = 1,
>>>> +        .size = 32 * KiB,
>>>> +        .line_size = 64,
>>>> +        .associativity = 8,
>>>> +        .partitions = 1,
>>>> +        .sets = 64,
>>>> +        .lines_per_tag = 1,
>>>> +        .self_init = 1,
>>>> +        .no_invd_sharing = true,
>>>> +    },
>>>> +    .l2_cache = &(CPUCacheInfo) {
>>>> +        .type = UNIFIED_CACHE,
>>>> +        .level = 2,
>>>> +        .size = 1 * MiB,
>>>> +        .line_size = 64,
>>>> +        .associativity = 8,
>>>> +        .partitions = 1,
>>>> +        .sets = 2048,
>>>> +        .lines_per_tag = 1,
>>>
>>> 1. Why L2 cache is not shown as inclusive and self-initializing?
>>>
>>> PPR for AMD Family 19h Model 11h says for L2 (0x8000001d):
>>> * cache inclusive. Read-only. Reset: Fixed,1.
>>> * cache is self-initializing. Read-only. Reset: Fixed,1.
>>
>> Yes. That is correct. This needs to be fixed. I will fix it.
>>>
>>>> +    },
>>>> +    .l3_cache = &(CPUCacheInfo) {
>>>> +        .type = UNIFIED_CACHE,
>>>> +        .level = 3,
>>>> +        .size = 32 * MiB,
>>>> +        .line_size = 64,
>>>> +        .associativity = 16,
>>>> +        .partitions = 1,
>>>> +        .sets = 32768,
>>>> +        .lines_per_tag = 1,
>>>> +        .self_init = true,
>>>> +        .inclusive = true,
>>>> +        .complex_indexing = false,
>>>
>>> 2. Why L3 cache is shown as inclusive? Why is it not shown in L3 that
>>> the WBINVD/INVD instruction is not guaranteed to invalidate all lower
>>> level caches (0 bit)?
>>>
>>> PPR for AMD Family 19h Model 11h says for L3 (0x8000001d):
>>> * cache inclusive. Read-only. Reset: Fixed,0.
>>> * Write-Back Invalidate/Invalidate. Read-only. Reset: Fixed,1.
>>>
>>
>> Yes. Both of these need to be fixed. I will fix them.
>>
>>>
>>>
>>> 3. Why the default stub is used for TLB, but not real values as for
>>> other caches?
>>
>> Can you please elaborate on this?
>>
> 
> For L1i, L1d, L2 and L3 cache we provide the correct information about
> characteristics. In contrast, for L1i TLB, L1d TLB, L2i TLB and L2d TLB
> (0x80000005 and 0x80000006) we use the same value for all CPU models.
> Sometimes it seems strange. For instance, the current default value in
> QEMU for L2 TLB associativity for 4 KB pages is 4. But 4 is a reserved
> value for Genoa (as PPR for Family 19h Model 11h says)

Yes. I see that. We may need to address this sometime in the future.

> 
>>>
>>>> +    },
>>>> +};
>>>> +
>>>>   /* The following VMX features are not supported by KVM and are left
>>>> out in the
>>>>    * CPU definitions:
>>>>    *
>>>> @@ -4472,6 +4522,78 @@ static const X86CPUDefinition
>>>> builtin_x86_defs[] = {
>>>>               { /* end of list */ }
>>>>           }
>>>>       },
>>>> +    {
>>>> +        .name = "EPYC-Genoa",
>>>> +        .level = 0xd,
>>>> +        .vendor = CPUID_VENDOR_AMD,
>>>> +        .family = 25,
>>>> +        .model = 17,
>>>> +        .stepping = 0,
>>>> +        .features[FEAT_1_EDX] =
>>>> +            CPUID_SSE2 | CPUID_SSE | CPUID_FXSR | CPUID_MMX |
>>>> CPUID_CLFLUSH |
>>>> +            CPUID_PSE36 | CPUID_PAT | CPUID_CMOV | CPUID_MCA |
>>>> CPUID_PGE |
>>>> +            CPUID_MTRR | CPUID_SEP | CPUID_APIC | CPUID_CX8 |
>>>> CPUID_MCE |
>>>> +            CPUID_PAE | CPUID_MSR | CPUID_TSC | CPUID_PSE | CPUID_DE |
>>>> +            CPUID_VME | CPUID_FP87,
>>>> +        .features[FEAT_1_ECX] =
>>>> +            CPUID_EXT_RDRAND | CPUID_EXT_F16C | CPUID_EXT_AVX |
>>>> +            CPUID_EXT_XSAVE | CPUID_EXT_AES |  CPUID_EXT_POPCNT |
>>>> +            CPUID_EXT_MOVBE | CPUID_EXT_SSE42 | CPUID_EXT_SSE41 |
>>>> +            CPUID_EXT_PCID | CPUID_EXT_CX16 | CPUID_EXT_FMA |
>>>> +            CPUID_EXT_SSSE3 | CPUID_EXT_MONITOR | CPUID_EXT_PCLMULQDQ |
>>>> +            CPUID_EXT_SSE3,
>>>> +        .features[FEAT_8000_0001_EDX] =
>>>> +            CPUID_EXT2_LM | CPUID_EXT2_RDTSCP | CPUID_EXT2_PDPE1GB |
>>>> +            CPUID_EXT2_FFXSR | CPUID_EXT2_MMXEXT | CPUID_EXT2_NX |
>>>> +            CPUID_EXT2_SYSCALL,
>>>> +        .features[FEAT_8000_0001_ECX] =
>>>> +            CPUID_EXT3_OSVW | CPUID_EXT3_3DNOWPREFETCH |
>>>> +            CPUID_EXT3_MISALIGNSSE | CPUID_EXT3_SSE4A | CPUID_EXT3_ABM |
>>>> +            CPUID_EXT3_CR8LEG | CPUID_EXT3_SVM | CPUID_EXT3_LAHF_LM |
>>>> +            CPUID_EXT3_TOPOEXT | CPUID_EXT3_PERFCORE,
>>>> +        .features[FEAT_8000_0008_EBX] =
>>>> +            CPUID_8000_0008_EBX_CLZERO |
>>>> CPUID_8000_0008_EBX_XSAVEERPTR |
>>>> +            CPUID_8000_0008_EBX_WBNOINVD | CPUID_8000_0008_EBX_IBPB |
>>>> +            CPUID_8000_0008_EBX_IBRS | CPUID_8000_0008_EBX_STIBP |
>>>> +            CPUID_8000_0008_EBX_STIBP_ALWAYS_ON |
>>>> +            CPUID_8000_0008_EBX_AMD_SSBD | CPUID_8000_0008_EBX_AMD_PSFD,
>>>
>>> 4. Why 0x80000008_EBX features related to speculation vulnerabilities
>>> (BTC_NO, IBPB_RET, IbrsPreferred, INT_WBINVD) are not set?
>>
>> KVM does not expose these bits to the guests yet.
>>
>> I normally check using the ioctl KVM_GET_SUPPORTED_CPUID.
>>
> 
> I'm not sure, but at least the first two of these features seem to be
> helpful to choose the appropriate mitigation. Do you think that we should
> add them to KVM?

Yes. Sure.

> 
>>
>>>
>>>> +        .features[FEAT_8000_0021_EAX] =
>>>> +            CPUID_8000_0021_EAX_No_NESTED_DATA_BP |
>>>> +            CPUID_8000_0021_EAX_LFENCE_ALWAYS_SERIALIZING |
>>>> +            CPUID_8000_0021_EAX_NULL_SEL_CLR_BASE |
>>>> +            CPUID_8000_0021_EAX_AUTO_IBRS,
>>>
>>> 5. Why some 0x80000021_EAX features are not set?
>>> (FsGsKernelGsBaseNonSerializing, FSRC and FSRS)
>>
>> KVM does not expose FSRC and FSRS bits to the guests yet.
> 
> But KVM exposes the same features (0x7 ecx=1, EAX bits 11 and 12) for
> Intel CPU models. Do we have to add these bits for AMD to KVM?

Yes. Sure.
> 
>>
>> The KVM reports the bit FsGsKernelGsBaseNonSerializing. I will check if
>> we can add this bit to the Genoa and Turin.

I will add this in my QEMU series.

>>
>>>
>>>> +        .features[FEAT_7_0_EBX] =
>>>> +            CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 |
>>>> CPUID_7_0_EBX_AVX2 |
>>>> +            CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_BMI2 |
>>>> CPUID_7_0_EBX_ERMS |
>>>> +            CPUID_7_0_EBX_INVPCID | CPUID_7_0_EBX_AVX512F |
>>>> +            CPUID_7_0_EBX_AVX512DQ | CPUID_7_0_EBX_RDSEED |
>>>> CPUID_7_0_EBX_ADX |
>>>> +            CPUID_7_0_EBX_SMAP | CPUID_7_0_EBX_AVX512IFMA |
>>>> +            CPUID_7_0_EBX_CLFLUSHOPT | CPUID_7_0_EBX_CLWB |
>>>> +            CPUID_7_0_EBX_AVX512CD | CPUID_7_0_EBX_SHA_NI |
>>>> +            CPUID_7_0_EBX_AVX512BW | CPUID_7_0_EBX_AVX512VL,
>>>> +        .features[FEAT_7_0_ECX] =
>>>> +            CPUID_7_0_ECX_AVX512_VBMI | CPUID_7_0_ECX_UMIP |
>>>> CPUID_7_0_ECX_PKU |
>>>> +            CPUID_7_0_ECX_AVX512_VBMI2 | CPUID_7_0_ECX_GFNI |
>>>> +            CPUID_7_0_ECX_VAES | CPUID_7_0_ECX_VPCLMULQDQ |
>>>> +            CPUID_7_0_ECX_AVX512VNNI | CPUID_7_0_ECX_AVX512BITALG |
>>>> +            CPUID_7_0_ECX_AVX512_VPOPCNTDQ | CPUID_7_0_ECX_LA57 |
>>>> +            CPUID_7_0_ECX_RDPID,
>>>> +        .features[FEAT_7_0_EDX] =
>>>> +            CPUID_7_0_EDX_FSRM,
>>>
>>> 6. Why L1D_FLUSH is not set? Because only vulnerable MMIO stale data
>>> processors have to use it, am I right?
>>
>> KVM does not expose L1D_FLUSH to the guests. Not sure why. Need to
>> investigate.
>>
> 
> It seems that KVM has exposed L1D_FLUSH since da3db168fb67

Sure. Will update my patch series.

> 
>>
>>>
>>>> +        .features[FEAT_7_1_EAX] =
>>>> +            CPUID_7_1_EAX_AVX512_BF16,
>>>> +        .features[FEAT_XSAVE] =
>>>> +            CPUID_XSAVE_XSAVEOPT | CPUID_XSAVE_XSAVEC |
>>>> +            CPUID_XSAVE_XGETBV1 | CPUID_XSAVE_XSAVES,
>>>> +        .features[FEAT_6_EAX] =
>>>> +            CPUID_6_EAX_ARAT,
>>>> +        .features[FEAT_SVM] =
>>>> +            CPUID_SVM_NPT | CPUID_SVM_NRIPSAVE | CPUID_SVM_VNMI |
>>>> +            CPUID_SVM_SVME_ADDR_CHK,
>>>> +        .xlevel = 0x80000022,
>>>> +        .model_id = "AMD EPYC-Genoa Processor",
>>>> +        .cache_info = &epyc_genoa_cache_info,
>>>> +    },
>>>>   };
>>>>   /*
>>>
>>
> 
> So, If you don't mind, I will send a patch to KVM within a few hours. I
> will add bits for FSRC, FSRS and some bits from 0x80000008_EBX
> 

FSRC and FSRS are not used anywhere in the kernel. They are mostly
informational, but it does not hurt to add them. Please go ahead.

-- 
Thanks
Babu Moger



* Re: [PATCH v4 7/7] target/i386: Add EPYC-Genoa model to support Zen 4 processor series
  2024-11-12 16:23         ` Moger, Babu
@ 2024-11-13 14:15           ` Maksim Davydov
  0 siblings, 0 replies; 18+ messages in thread
From: Maksim Davydov @ 2024-11-13 14:15 UTC (permalink / raw)
  To: babu.moger
  Cc: weijiang.yang, philmd, dwmw, paul, joao.m.martins, qemu-devel,
	mtosatti, kvm, mst, marcel.apfelbaum, yang.zhong, jing2.liu,
	vkuznets, michael.roth, wei.huang2, berrange, bdas, pbonzini,
	richard.henderson

Hi!
Thank you very much!
I'm looking forward to your Genoa/Turin series.

I've also sent a patch series to KVM:
https://lore.kernel.org/lkml/20241113133042.702340-1-davydov-max@yandex-team.ru/

On 11/12/24 19:23, Moger, Babu wrote:
> Hi Maksim,
> 
> On 11/12/24 04:09, Maksim Davydov wrote:
>>
>>
>> On 11/8/24 23:56, Moger, Babu wrote:
>>> Hi Maksim,
>>>
>>> Thanks for looking into this. I will fix the bits I mentioned below in
>>> the upcoming Genoa/Turin model update.
>>>
>>> I have a few comments below.
>>>
>>> On 11/8/2024 12:15 PM, Maksim Davydov wrote:
>>>> Hi!
>>>> I compared EPYC-Genoa CPU model with CPUID output from real EPYC Genoa
>>>> host. I found some mismatches that confused me. Could you help me to
>>>> understand them?
>>>>
>>>> On 5/4/23 23:53, Babu Moger wrote:
>>>>> Adds the support for AMD EPYC Genoa generation processors. The model
>>>>> display for the new processor will be EPYC-Genoa.
>>>>>
>>>>> Adds the following new feature bits on top of the feature bits from
>>>>> the previous generation EPYC models.
>>>>>
>>>>> avx512f         : AVX-512 Foundation instruction
>>>>> avx512dq        : AVX-512 Doubleword & Quadword Instruction
>>>>> avx512ifma      : AVX-512 Integer Fused Multiply Add instruction
>>>>> avx512cd        : AVX-512 Conflict Detection instruction
>>>>> avx512bw        : AVX-512 Byte and Word Instructions
>>>>> avx512vl        : AVX-512 Vector Length Extension Instructions
>>>>> avx512vbmi      : AVX-512 Vector Byte Manipulation Instruction
>>>>> avx512_vbmi2    : AVX-512 Additional Vector Byte Manipulation Instruction
>>>>> gfni            : AVX-512 Galois Field New Instructions
>>>>> avx512_vnni     : AVX-512 Vector Neural Network Instructions
>>>>> avx512_bitalg   : AVX-512 Bit Algorithms, add bit algorithms Instructions
>>>>> avx512_vpopcntdq: AVX-512 AVX-512 Vector Population Count Doubleword and
>>>>>                     Quadword Instructions
>>>>> avx512_bf16    : AVX-512 BFLOAT16 instructions
>>>>> la57            : 57-bit virtual address support (5-level Page Tables)
>>>>> vnmi            : Virtual NMI (VNMI) allows the hypervisor to inject
>>>>> the NMI
>>>>>                     into the guest without using Event Injection mechanism
>>>>>                     meaning not required to track the guest NMI and
>>>>> intercepting
>>>>>                     the IRET.
>>>>> auto-ibrs       : The AMD Zen4 core supports a new feature called
>>>>> Automatic IBRS.
>>>>>                     It is a "set-and-forget" feature that means that,
>>>>> unlike e.g.,
>>>>>                     s/w-toggled SPEC_CTRL.IBRS, h/w manages its IBRS
>>>>> mitigation
>>>>>                     resources automatically across CPL transitions.
>>>>>
>>>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>>>>> ---
>>>>>    target/i386/cpu.c | 122 ++++++++++++++++++++++++++++++++++++++++++++++
>>>>>    1 file changed, 122 insertions(+)
>>>>>
>>>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>>>> index d50ace84bf..71fe1e02ee 100644
>>>>> --- a/target/i386/cpu.c
>>>>> +++ b/target/i386/cpu.c
>>>>> @@ -1973,6 +1973,56 @@ static const CPUCaches epyc_milan_v2_cache_info
>>>>> = {
>>>>>        },
>>>>>    };
>>>>> +static const CPUCaches epyc_genoa_cache_info = {
>>>>> +    .l1d_cache = &(CPUCacheInfo) {
>>>>> +        .type = DATA_CACHE,
>>>>> +        .level = 1,
>>>>> +        .size = 32 * KiB,
>>>>> +        .line_size = 64,
>>>>> +        .associativity = 8,
>>>>> +        .partitions = 1,
>>>>> +        .sets = 64,
>>>>> +        .lines_per_tag = 1,
>>>>> +        .self_init = 1,
>>>>> +        .no_invd_sharing = true,
>>>>> +    },
>>>>> +    .l1i_cache = &(CPUCacheInfo) {
>>>>> +        .type = INSTRUCTION_CACHE,
>>>>> +        .level = 1,
>>>>> +        .size = 32 * KiB,
>>>>> +        .line_size = 64,
>>>>> +        .associativity = 8,
>>>>> +        .partitions = 1,
>>>>> +        .sets = 64,
>>>>> +        .lines_per_tag = 1,
>>>>> +        .self_init = 1,
>>>>> +        .no_invd_sharing = true,
>>>>> +    },
>>>>> +    .l2_cache = &(CPUCacheInfo) {
>>>>> +        .type = UNIFIED_CACHE,
>>>>> +        .level = 2,
>>>>> +        .size = 1 * MiB,
>>>>> +        .line_size = 64,
>>>>> +        .associativity = 8,
>>>>> +        .partitions = 1,
>>>>> +        .sets = 2048,
>>>>> +        .lines_per_tag = 1,
>>>>
>>>> 1. Why L2 cache is not shown as inclusive and self-initializing?
>>>>
>>>> PPR for AMD Family 19h Model 11h says for L2 (0x8000001d):
>>>> * cache inclusive. Read-only. Reset: Fixed,1.
>>>> * cache is self-initializing. Read-only. Reset: Fixed,1.
>>>
>>> Yes. That is correct. This needs to be fixed. I will fix it.
>>>>
>>>>> +    },
>>>>> +    .l3_cache = &(CPUCacheInfo) {
>>>>> +        .type = UNIFIED_CACHE,
>>>>> +        .level = 3,
>>>>> +        .size = 32 * MiB,
>>>>> +        .line_size = 64,
>>>>> +        .associativity = 16,
>>>>> +        .partitions = 1,
>>>>> +        .sets = 32768,
>>>>> +        .lines_per_tag = 1,
>>>>> +        .self_init = true,
>>>>> +        .inclusive = true,
>>>>> +        .complex_indexing = false,
>>>>
>>>> 2. Why L3 cache is shown as inclusive? Why is it not shown in L3 that
>>>> the WBINVD/INVD instruction is not guaranteed to invalidate all lower
>>>> level caches (0 bit)?
>>>>
>>>> PPR for AMD Family 19h Model 11h says for L3 (0x8000001d):
>>>> * cache inclusive. Read-only. Reset: Fixed,0.
>>>> * Write-Back Invalidate/Invalidate. Read-only. Reset: Fixed,1.
>>>>
>>>
>>> Yes. Both of these need to be fixed. I will fix them.
>>>
>>>>
>>>>
>>>> 3. Why the default stub is used for TLB, but not real values as for
>>>> other caches?
>>>
>>> Can you please elaborate on this?
>>>
>>
>> For L1i, L1d, L2 and L3 cache we provide the correct information about
>> characteristics. In contrast, for L1i TLB, L1d TLB, L2i TLB and L2d TLB
>> (0x80000005 and 0x80000006) we use the same value for all CPU models.
>> Sometimes it seems strange. For instance, the current default value in
>> QEMU for L2 TLB associativity for 4 KB pages is 4. But 4 is a reserved
>> value for Genoa (as PPR for Family 19h Model 11h says)
> 
> Yes. I see that. We may need to address this sometime in the future.
> 
>>
>>>>
>>>>> +    },
>>>>> +};
>>>>> +
>>>>>    /* The following VMX features are not supported by KVM and are left
>>>>> out in the
>>>>>     * CPU definitions:
>>>>>     *
>>>>> @@ -4472,6 +4522,78 @@ static const X86CPUDefinition
>>>>> builtin_x86_defs[] = {
>>>>>                { /* end of list */ }
>>>>>            }
>>>>>        },
>>>>> +    {
>>>>> +        .name = "EPYC-Genoa",
>>>>> +        .level = 0xd,
>>>>> +        .vendor = CPUID_VENDOR_AMD,
>>>>> +        .family = 25,
>>>>> +        .model = 17,
>>>>> +        .stepping = 0,
>>>>> +        .features[FEAT_1_EDX] =
>>>>> +            CPUID_SSE2 | CPUID_SSE | CPUID_FXSR | CPUID_MMX |
>>>>> CPUID_CLFLUSH |
>>>>> +            CPUID_PSE36 | CPUID_PAT | CPUID_CMOV | CPUID_MCA |
>>>>> CPUID_PGE |
>>>>> +            CPUID_MTRR | CPUID_SEP | CPUID_APIC | CPUID_CX8 |
>>>>> CPUID_MCE |
>>>>> +            CPUID_PAE | CPUID_MSR | CPUID_TSC | CPUID_PSE | CPUID_DE |
>>>>> +            CPUID_VME | CPUID_FP87,
>>>>> +        .features[FEAT_1_ECX] =
>>>>> +            CPUID_EXT_RDRAND | CPUID_EXT_F16C | CPUID_EXT_AVX |
>>>>> +            CPUID_EXT_XSAVE | CPUID_EXT_AES |  CPUID_EXT_POPCNT |
>>>>> +            CPUID_EXT_MOVBE | CPUID_EXT_SSE42 | CPUID_EXT_SSE41 |
>>>>> +            CPUID_EXT_PCID | CPUID_EXT_CX16 | CPUID_EXT_FMA |
>>>>> +            CPUID_EXT_SSSE3 | CPUID_EXT_MONITOR | CPUID_EXT_PCLMULQDQ |
>>>>> +            CPUID_EXT_SSE3,
>>>>> +        .features[FEAT_8000_0001_EDX] =
>>>>> +            CPUID_EXT2_LM | CPUID_EXT2_RDTSCP | CPUID_EXT2_PDPE1GB |
>>>>> +            CPUID_EXT2_FFXSR | CPUID_EXT2_MMXEXT | CPUID_EXT2_NX |
>>>>> +            CPUID_EXT2_SYSCALL,
>>>>> +        .features[FEAT_8000_0001_ECX] =
>>>>> +            CPUID_EXT3_OSVW | CPUID_EXT3_3DNOWPREFETCH |
>>>>> +            CPUID_EXT3_MISALIGNSSE | CPUID_EXT3_SSE4A | CPUID_EXT3_ABM |
>>>>> +            CPUID_EXT3_CR8LEG | CPUID_EXT3_SVM | CPUID_EXT3_LAHF_LM |
>>>>> +            CPUID_EXT3_TOPOEXT | CPUID_EXT3_PERFCORE,
>>>>> +        .features[FEAT_8000_0008_EBX] =
>>>>> +            CPUID_8000_0008_EBX_CLZERO |
>>>>> CPUID_8000_0008_EBX_XSAVEERPTR |
>>>>> +            CPUID_8000_0008_EBX_WBNOINVD | CPUID_8000_0008_EBX_IBPB |
>>>>> +            CPUID_8000_0008_EBX_IBRS | CPUID_8000_0008_EBX_STIBP |
>>>>> +            CPUID_8000_0008_EBX_STIBP_ALWAYS_ON |
>>>>> +            CPUID_8000_0008_EBX_AMD_SSBD | CPUID_8000_0008_EBX_AMD_PSFD,
>>>>
>>>> 4. Why 0x80000008_EBX features related to speculation vulnerabilities
>>>> (BTC_NO, IBPB_RET, IbrsPreferred, INT_WBINVD) are not set?
>>>
>>> KVM does not expose these bits to the guests yet.
>>>
>>> I normally check using the ioctl KVM_GET_SUPPORTED_CPUID.
>>>
>>
>> I'm not sure, but at least the first two of these features seem to be
>> helpful to choose the appropriate mitigation. Do you think that we should
>> add them to KVM?
> 
> Yes. Sure.
> 
>>
>>>
>>>>
>>>>> +        .features[FEAT_8000_0021_EAX] =
>>>>> +            CPUID_8000_0021_EAX_No_NESTED_DATA_BP |
>>>>> +            CPUID_8000_0021_EAX_LFENCE_ALWAYS_SERIALIZING |
>>>>> +            CPUID_8000_0021_EAX_NULL_SEL_CLR_BASE |
>>>>> +            CPUID_8000_0021_EAX_AUTO_IBRS,
>>>>
>>>> 5. Why some 0x80000021_EAX features are not set?
>>>> (FsGsKernelGsBaseNonSerializing, FSRC and FSRS)
>>>
>>> KVM does not expose FSRC and FSRS bits to the guests yet.
>>
>> But KVM exposes the same features (0x7 ecx=1, bits 10 and 11) for Intel
>> CPU models. Do we have to add these bits for AMD to KVM?
> 
> Yes. Sure.
>>
>>>
>>> The KVM reports the bit FsGsKernelGsBaseNonSerializing. I will check if
>>> we can add this bit to the Genoa and Turin.
> 
> Will add this in my qemu series.
> 
>>>
>>>>
>>>>> +        .features[FEAT_7_0_EBX] =
>>>>> +            CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 |
>>>>> CPUID_7_0_EBX_AVX2 |
>>>>> +            CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_BMI2 |
>>>>> CPUID_7_0_EBX_ERMS |
>>>>> +            CPUID_7_0_EBX_INVPCID | CPUID_7_0_EBX_AVX512F |
>>>>> +            CPUID_7_0_EBX_AVX512DQ | CPUID_7_0_EBX_RDSEED |
>>>>> CPUID_7_0_EBX_ADX |
>>>>> +            CPUID_7_0_EBX_SMAP | CPUID_7_0_EBX_AVX512IFMA |
>>>>> +            CPUID_7_0_EBX_CLFLUSHOPT | CPUID_7_0_EBX_CLWB |
>>>>> +            CPUID_7_0_EBX_AVX512CD | CPUID_7_0_EBX_SHA_NI |
>>>>> +            CPUID_7_0_EBX_AVX512BW | CPUID_7_0_EBX_AVX512VL,
>>>>> +        .features[FEAT_7_0_ECX] =
>>>>> +            CPUID_7_0_ECX_AVX512_VBMI | CPUID_7_0_ECX_UMIP |
>>>>> CPUID_7_0_ECX_PKU |
>>>>> +            CPUID_7_0_ECX_AVX512_VBMI2 | CPUID_7_0_ECX_GFNI |
>>>>> +            CPUID_7_0_ECX_VAES | CPUID_7_0_ECX_VPCLMULQDQ |
>>>>> +            CPUID_7_0_ECX_AVX512VNNI | CPUID_7_0_ECX_AVX512BITALG |
>>>>> +            CPUID_7_0_ECX_AVX512_VPOPCNTDQ | CPUID_7_0_ECX_LA57 |
>>>>> +            CPUID_7_0_ECX_RDPID,
>>>>> +        .features[FEAT_7_0_EDX] =
>>>>> +            CPUID_7_0_EDX_FSRM,
>>>>
>>>> 6. Why L1D_FLUSH is not set? Because only vulnerable MMIO stale data
>>>> processors have to use it, am I right?
>>>
>>> KVM does not expose L1D_FLUSH to the guests. Not sure why. Need to
>>> investigate.
>>>
>>
>> It seems that KVM has exposed L1D_FLUSH since da3db168fb67
> 
> Sure. Will update my patch series.
> 
>>
>>>
>>>>
>>>>> +        .features[FEAT_7_1_EAX] =
>>>>> +            CPUID_7_1_EAX_AVX512_BF16,
>>>>> +        .features[FEAT_XSAVE] =
>>>>> +            CPUID_XSAVE_XSAVEOPT | CPUID_XSAVE_XSAVEC |
>>>>> +            CPUID_XSAVE_XGETBV1 | CPUID_XSAVE_XSAVES,
>>>>> +        .features[FEAT_6_EAX] =
>>>>> +            CPUID_6_EAX_ARAT,
>>>>> +        .features[FEAT_SVM] =
>>>>> +            CPUID_SVM_NPT | CPUID_SVM_NRIPSAVE | CPUID_SVM_VNMI |
>>>>> +            CPUID_SVM_SVME_ADDR_CHK,
>>>>> +        .xlevel = 0x80000022,
>>>>> +        .model_id = "AMD EPYC-Genoa Processor",
>>>>> +        .cache_info = &epyc_genoa_cache_info,
>>>>> +    },
>>>>>    };
>>>>>    /*
>>>>
>>>
>>
>> So, if you don't mind, I will send a patch to KVM within a few hours. I
>> will add bits for FSRC, FSRS and some bits from 0x80000008_EBX.
>>
> 
> FSRC and FSRS are not used anywhere in the kernel. They are mostly
> informational. It does not hurt to add them. Please go ahead.
> 

-- 
Best regards,
Maksim Davydov


* Re: [PATCH v4 7/7] target/i386: Add EPYC-Genoa model to support Zen 4 processor series
  2024-11-12 10:09       ` Maksim Davydov
  2024-11-12 16:23         ` Moger, Babu
@ 2024-11-13 16:23         ` Moger, Babu
  2024-11-14 16:59           ` Moger, Babu
  1 sibling, 1 reply; 18+ messages in thread
From: Moger, Babu @ 2024-11-13 16:23 UTC (permalink / raw)
  To: Maksim Davydov, pbonzini@redhat.com, Roth, Michael
  Cc: weijiang.yang, philmd, dwmw, paul, joao.m.martins, qemu-devel,
	mtosatti, kvm, mst, marcel.apfelbaum, yang.zhong, jing2.liu,
	vkuznets, michael.roth, wei.huang2, berrange, bdas, pbonzini,
	richard.henderson

Adding Paolo.

On 11/12/24 04:09, Maksim Davydov wrote:
> 
> 
> On 11/8/24 23:56, Moger, Babu wrote:
>> Hi Maxim,
>>
>> Thanks for looking into this. I will fix the bits I mentioned below in
>> upcoming Genoa/Turin model update.
>>
>> I have few comments below.
>>
>> On 11/8/2024 12:15 PM, Maksim Davydov wrote:
>>> Hi!
>>> I compared EPYC-Genoa CPU model with CPUID output from real EPYC Genoa
>>> host. I found some mismatches that confused me. Could you help me to
>>> understand them?
>>>
>>> On 5/4/23 23:53, Babu Moger wrote:
>>>> Adds the support for AMD EPYC Genoa generation processors. The model
>>>> display for the new processor will be EPYC-Genoa.
>>>>
>>>> Adds the following new feature bits on top of the feature bits from
>>>> the previous generation EPYC models.
>>>>
>>>> avx512f         : AVX-512 Foundation instruction
>>>> avx512dq        : AVX-512 Doubleword & Quadword Instruction
>>>> avx512ifma      : AVX-512 Integer Fused Multiply Add instruction
>>>> avx512cd        : AVX-512 Conflict Detection instruction
>>>> avx512bw        : AVX-512 Byte and Word Instructions
>>>> avx512vl        : AVX-512 Vector Length Extension Instructions
>>>> avx512vbmi      : AVX-512 Vector Byte Manipulation Instruction
>>>> avx512_vbmi2    : AVX-512 Additional Vector Byte Manipulation Instruction
>>>> gfni            : AVX-512 Galois Field New Instructions
>>>> avx512_vnni     : AVX-512 Vector Neural Network Instructions
>>>> avx512_bitalg   : AVX-512 Bit Algorithms, add bit algorithms Instructions
>>>> avx512_vpopcntdq: AVX-512 AVX-512 Vector Population Count Doubleword and
>>>>                    Quadword Instructions
>>>> avx512_bf16    : AVX-512 BFLOAT16 instructions
>>>> la57            : 57-bit virtual address support (5-level Page Tables)
>>>> vnmi            : Virtual NMI (VNMI) allows the hypervisor to inject
>>>> the NMI
>>>>                    into the guest without using Event Injection mechanism
>>>>                    meaning not required to track the guest NMI and
>>>> intercepting
>>>>                    the IRET.
>>>> auto-ibrs       : The AMD Zen4 core supports a new feature called
>>>> Automatic IBRS.
>>>>                    It is a "set-and-forget" feature that means that,
>>>> unlike e.g.,
>>>>                    s/w-toggled SPEC_CTRL.IBRS, h/w manages its IBRS
>>>> mitigation
>>>>                    resources automatically across CPL transitions.
>>>>
>>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>>>> ---
>>>>   target/i386/cpu.c | 122 ++++++++++++++++++++++++++++++++++++++++++++++
>>>>   1 file changed, 122 insertions(+)
>>>>
>>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>>> index d50ace84bf..71fe1e02ee 100644
>>>> --- a/target/i386/cpu.c
>>>> +++ b/target/i386/cpu.c
>>>> @@ -1973,6 +1973,56 @@ static const CPUCaches epyc_milan_v2_cache_info
>>>> = {
>>>>       },
>>>>   };
>>>> +static const CPUCaches epyc_genoa_cache_info = {
>>>> +    .l1d_cache = &(CPUCacheInfo) {
>>>> +        .type = DATA_CACHE,
>>>> +        .level = 1,
>>>> +        .size = 32 * KiB,
>>>> +        .line_size = 64,
>>>> +        .associativity = 8,
>>>> +        .partitions = 1,
>>>> +        .sets = 64,
>>>> +        .lines_per_tag = 1,
>>>> +        .self_init = 1,
>>>> +        .no_invd_sharing = true,
>>>> +    },
>>>> +    .l1i_cache = &(CPUCacheInfo) {
>>>> +        .type = INSTRUCTION_CACHE,
>>>> +        .level = 1,
>>>> +        .size = 32 * KiB,
>>>> +        .line_size = 64,
>>>> +        .associativity = 8,
>>>> +        .partitions = 1,
>>>> +        .sets = 64,
>>>> +        .lines_per_tag = 1,
>>>> +        .self_init = 1,
>>>> +        .no_invd_sharing = true,
>>>> +    },
>>>> +    .l2_cache = &(CPUCacheInfo) {
>>>> +        .type = UNIFIED_CACHE,
>>>> +        .level = 2,
>>>> +        .size = 1 * MiB,
>>>> +        .line_size = 64,
>>>> +        .associativity = 8,
>>>> +        .partitions = 1,
>>>> +        .sets = 2048,
>>>> +        .lines_per_tag = 1,
>>>
>>> 1. Why L2 cache is not shown as inclusive and self-initializing?
>>>
>>> PPR for AMD Family 19h Model 11h says for L2 (0x8000001d):
>>> * cache inclusive. Read-only. Reset: Fixed,1.
>>> * cache is self-initializing. Read-only. Reset: Fixed,1.
>>
>> Yes. That is correct. This needs to be fixed. I will fix it.
>>>
>>>> +    },
>>>> +    .l3_cache = &(CPUCacheInfo) {
>>>> +        .type = UNIFIED_CACHE,
>>>> +        .level = 3,
>>>> +        .size = 32 * MiB,
>>>> +        .line_size = 64,
>>>> +        .associativity = 16,
>>>> +        .partitions = 1,
>>>> +        .sets = 32768,
>>>> +        .lines_per_tag = 1,
>>>> +        .self_init = true,
>>>> +        .inclusive = true,
>>>> +        .complex_indexing = false,
>>>
>>> 2. Why L3 cache is shown as inclusive? Why is it not shown in L3 that
>>> the WBINVD/INVD instruction is not guaranteed to invalidate all lower
>>> level caches (0 bit)?
>>>
>>> PPR for AMD Family 19h Model 11h says for L3 (0x8000001d):
>>> * cache inclusive. Read-only. Reset: Fixed,0.
>>> * Write-Back Invalidate/Invalidate. Read-only. Reset: Fixed,1.
>>>
>>
>> Yes. Both of these need to be fixed. I will fix them.
>>
>>>
>>>
>>> 3. Why the default stub is used for TLB, but not real values as for
>>> other caches?
>>
>> Can you please elaborate on this?
>>
> 
> For L1i, L1d, L2 and L3 cache we provide the correct information about
> characteristics. In contrast, for L1i TLB, L1d TLB, L2i TLB and L2d TLB
> (0x80000005 and 0x80000006) we use the same value for all CPU models.
> Sometimes it seems strange. For instance, the current default value in
> QEMU for L2 TLB associativity for 4 KB pages is 4. But 4 is a reserved
> value for Genoa (as PPR for Family 19h Model 11h says)
> 
>>>
>>>> +    },
>>>> +};
>>>> +
>>>>   /* The following VMX features are not supported by KVM and are left
>>>> out in the
>>>>    * CPU definitions:
>>>>    *
>>>> @@ -4472,6 +4522,78 @@ static const X86CPUDefinition
>>>> builtin_x86_defs[] = {
>>>>               { /* end of list */ }
>>>>           }
>>>>       },
>>>> +    {
>>>> +        .name = "EPYC-Genoa",
>>>> +        .level = 0xd,
>>>> +        .vendor = CPUID_VENDOR_AMD,
>>>> +        .family = 25,
>>>> +        .model = 17,
>>>> +        .stepping = 0,
>>>> +        .features[FEAT_1_EDX] =
>>>> +            CPUID_SSE2 | CPUID_SSE | CPUID_FXSR | CPUID_MMX |
>>>> CPUID_CLFLUSH |
>>>> +            CPUID_PSE36 | CPUID_PAT | CPUID_CMOV | CPUID_MCA |
>>>> CPUID_PGE |
>>>> +            CPUID_MTRR | CPUID_SEP | CPUID_APIC | CPUID_CX8 |
>>>> CPUID_MCE |
>>>> +            CPUID_PAE | CPUID_MSR | CPUID_TSC | CPUID_PSE | CPUID_DE |
>>>> +            CPUID_VME | CPUID_FP87,
>>>> +        .features[FEAT_1_ECX] =
>>>> +            CPUID_EXT_RDRAND | CPUID_EXT_F16C | CPUID_EXT_AVX |
>>>> +            CPUID_EXT_XSAVE | CPUID_EXT_AES |  CPUID_EXT_POPCNT |
>>>> +            CPUID_EXT_MOVBE | CPUID_EXT_SSE42 | CPUID_EXT_SSE41 |
>>>> +            CPUID_EXT_PCID | CPUID_EXT_CX16 | CPUID_EXT_FMA |
>>>> +            CPUID_EXT_SSSE3 | CPUID_EXT_MONITOR | CPUID_EXT_PCLMULQDQ |
>>>> +            CPUID_EXT_SSE3,
>>>> +        .features[FEAT_8000_0001_EDX] =
>>>> +            CPUID_EXT2_LM | CPUID_EXT2_RDTSCP | CPUID_EXT2_PDPE1GB |
>>>> +            CPUID_EXT2_FFXSR | CPUID_EXT2_MMXEXT | CPUID_EXT2_NX |
>>>> +            CPUID_EXT2_SYSCALL,
>>>> +        .features[FEAT_8000_0001_ECX] =
>>>> +            CPUID_EXT3_OSVW | CPUID_EXT3_3DNOWPREFETCH |
>>>> +            CPUID_EXT3_MISALIGNSSE | CPUID_EXT3_SSE4A | CPUID_EXT3_ABM |
>>>> +            CPUID_EXT3_CR8LEG | CPUID_EXT3_SVM | CPUID_EXT3_LAHF_LM |
>>>> +            CPUID_EXT3_TOPOEXT | CPUID_EXT3_PERFCORE,
>>>> +        .features[FEAT_8000_0008_EBX] =
>>>> +            CPUID_8000_0008_EBX_CLZERO |
>>>> CPUID_8000_0008_EBX_XSAVEERPTR |
>>>> +            CPUID_8000_0008_EBX_WBNOINVD | CPUID_8000_0008_EBX_IBPB |
>>>> +            CPUID_8000_0008_EBX_IBRS | CPUID_8000_0008_EBX_STIBP |
>>>> +            CPUID_8000_0008_EBX_STIBP_ALWAYS_ON |
>>>> +            CPUID_8000_0008_EBX_AMD_SSBD | CPUID_8000_0008_EBX_AMD_PSFD,
>>>
>>> 4. Why 0x80000008_EBX features related to speculation vulnerabilities
>>> (BTC_NO, IBPB_RET, IbrsPreferred, INT_WBINVD) are not set?
>>
>> KVM does not expose these bits to the guests yet.
>>
>> I normally check using the ioctl KVM_GET_SUPPORTED_CPUID.
>>
> 
> I'm not sure, but at least the first two of these features seem to be
> helpful to choose the appropriate mitigation. Do you think that we should
> add them to KVM?
> 
>>
>>>
>>>> +        .features[FEAT_8000_0021_EAX] =
>>>> +            CPUID_8000_0021_EAX_No_NESTED_DATA_BP |
>>>> +            CPUID_8000_0021_EAX_LFENCE_ALWAYS_SERIALIZING |
>>>> +            CPUID_8000_0021_EAX_NULL_SEL_CLR_BASE |
>>>> +            CPUID_8000_0021_EAX_AUTO_IBRS,
>>>
>>> 5. Why some 0x80000021_EAX features are not set?
>>> (FsGsKernelGsBaseNonSerializing, FSRC and FSRS)
>>
>> KVM does not expose FSRC and FSRS bits to the guests yet.
> 
> But KVM exposes the same features (0x7 ecx=1, bits 10 and 11) for Intel
> CPU models. Do we have to add these bits for AMD to KVM?
> 
>>
>> The KVM reports the bit FsGsKernelGsBaseNonSerializing. I will check if
>> we can add this bit to the Genoa and Turin.
>>
>>>
>>>> +        .features[FEAT_7_0_EBX] =
>>>> +            CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 |
>>>> CPUID_7_0_EBX_AVX2 |
>>>> +            CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_BMI2 |
>>>> CPUID_7_0_EBX_ERMS |
>>>> +            CPUID_7_0_EBX_INVPCID | CPUID_7_0_EBX_AVX512F |
>>>> +            CPUID_7_0_EBX_AVX512DQ | CPUID_7_0_EBX_RDSEED |
>>>> CPUID_7_0_EBX_ADX |
>>>> +            CPUID_7_0_EBX_SMAP | CPUID_7_0_EBX_AVX512IFMA |
>>>> +            CPUID_7_0_EBX_CLFLUSHOPT | CPUID_7_0_EBX_CLWB |
>>>> +            CPUID_7_0_EBX_AVX512CD | CPUID_7_0_EBX_SHA_NI |
>>>> +            CPUID_7_0_EBX_AVX512BW | CPUID_7_0_EBX_AVX512VL,
>>>> +        .features[FEAT_7_0_ECX] =
>>>> +            CPUID_7_0_ECX_AVX512_VBMI | CPUID_7_0_ECX_UMIP |
>>>> CPUID_7_0_ECX_PKU |
>>>> +            CPUID_7_0_ECX_AVX512_VBMI2 | CPUID_7_0_ECX_GFNI |
>>>> +            CPUID_7_0_ECX_VAES | CPUID_7_0_ECX_VPCLMULQDQ |
>>>> +            CPUID_7_0_ECX_AVX512VNNI | CPUID_7_0_ECX_AVX512BITALG |
>>>> +            CPUID_7_0_ECX_AVX512_VPOPCNTDQ | CPUID_7_0_ECX_LA57 |
>>>> +            CPUID_7_0_ECX_RDPID,
>>>> +        .features[FEAT_7_0_EDX] =
>>>> +            CPUID_7_0_EDX_FSRM,
>>>
>>> 6. Why L1D_FLUSH is not set? Because only vulnerable MMIO stale data
>>> processors have to use it, am I right?
>>
>> KVM does not expose L1D_FLUSH to the guests. Not sure why. Need to
>> investigate.
>>
> 
> It seems that KVM has exposed L1D_FLUSH since da3db168fb67

Paolo,

I see that the L1D_FLUSH feature bit is masked for SEV-SNP guests.

https://lore.kernel.org/qemu-devel/20240704095806.1780273-17-pbonzini@redhat.com/

Any reason for this?

Genoa and Turin hardware support the L1D_FLUSH feature.

I tested with the masking line commented out, and the SEV-SNP guest boots fine.

diff --git a/target/i386/sev.c b/target/i386/sev.c
index a0d271f898..f5cc37bcc7 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -962,7 +962,7 @@ sev_snp_mask_cpuid_features(X86ConfidentialGuest *cg, uint32_t feature, uint32_t
         if (index == 0 && reg == R_EDX) {
             return value & ~(CPUID_7_0_EDX_SPEC_CTRL |
                              CPUID_7_0_EDX_STIBP |
-                             CPUID_7_0_EDX_FLUSH_L1D |
+                             //CPUID_7_0_EDX_FLUSH_L1D |
                              CPUID_7_0_EDX_ARCH_CAPABILITIES |
                              CPUID_7_0_EDX_CORE_CAPABILITY |
                              CPUID_7_0_EDX_SPEC_CTRL_SSBD);
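
To make the effect of the change concrete, here is a minimal, self-contained
sketch of the EDX masking with and without FLUSH_L1D. It is not the QEMU code:
mask_7_0_edx() and keep_flush_l1d are hypothetical names, and the EDX_* macros
are local stand-ins for the CPUID_7_0_EDX_* definitions, using the
architectural bit positions of CPUID.(EAX=7,ECX=0):EDX.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPUID.(EAX=7,ECX=0):EDX bit positions (local stand-ins, not QEMU's macros). */
#define EDX_FSRM              (1U << 4)
#define EDX_SPEC_CTRL         (1U << 26)
#define EDX_STIBP             (1U << 27)
#define EDX_FLUSH_L1D         (1U << 28)
#define EDX_ARCH_CAPABILITIES (1U << 29)
#define EDX_CORE_CAPABILITY   (1U << 30)
#define EDX_SPEC_CTRL_SSBD    (1U << 31)

/*
 * Hypothetical model of the leaf 7, index 0, EDX branch of the SNP mask.
 * keep_flush_l1d == false mirrors the current behaviour (bit 28 cleared);
 * keep_flush_l1d == true mirrors the tested change (bit 28 passed through).
 */
uint32_t mask_7_0_edx(uint32_t value, bool keep_flush_l1d)
{
    uint32_t masked = EDX_SPEC_CTRL | EDX_STIBP | EDX_FLUSH_L1D |
                      EDX_ARCH_CAPABILITIES | EDX_CORE_CAPABILITY |
                      EDX_SPEC_CTRL_SSBD;

    if (keep_flush_l1d) {
        masked &= ~EDX_FLUSH_L1D;
    }

    /* FSRM, the only EDX bit the EPYC-Genoa model sets, is never masked. */
    assert((masked & EDX_FSRM) == 0);

    return value & ~masked;
}
```

With an all-ones input, the current mask leaves 0x03ffffff and the modified
mask leaves 0x13ffffff, so the two differ only in bit 28.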



If there are no objections, I can add this patch to the Turin series.

Thanks

-- 
Thanks
Babu Moger


* Re: [PATCH v4 7/7] target/i386: Add EPYC-Genoa model to support Zen 4 processor series
  2024-11-13 16:23         ` Moger, Babu
@ 2024-11-14 16:59           ` Moger, Babu
  0 siblings, 0 replies; 18+ messages in thread
From: Moger, Babu @ 2024-11-14 16:59 UTC (permalink / raw)
  To: Maksim Davydov, pbonzini@redhat.com, Roth, Michael
  Cc: weijiang.yang, philmd, dwmw, paul, joao.m.martins, qemu-devel,
	mtosatti, kvm, mst, marcel.apfelbaum, yang.zhong, jing2.liu,
	vkuznets, wei.huang2, berrange, bdas, richard.henderson



On 11/13/24 10:23, Moger, Babu wrote:
> Adding Paolo.
> 
> On 11/12/24 04:09, Maksim Davydov wrote:
>>
>>
>> On 11/8/24 23:56, Moger, Babu wrote:
>>> Hi Maxim,
>>>
>>> Thanks for looking into this. I will fix the bits I mentioned below in
>>> upcoming Genoa/Turin model update.
>>>
>>> I have few comments below.
>>>
>>> On 11/8/2024 12:15 PM, Maksim Davydov wrote:
>>>> Hi!
>>>> I compared EPYC-Genoa CPU model with CPUID output from real EPYC Genoa
>>>> host. I found some mismatches that confused me. Could you help me to
>>>> understand them?
>>>>
>>>> On 5/4/23 23:53, Babu Moger wrote:
>>>>> Adds the support for AMD EPYC Genoa generation processors. The model
>>>>> display for the new processor will be EPYC-Genoa.
>>>>>
>>>>> Adds the following new feature bits on top of the feature bits from
>>>>> the previous generation EPYC models.
>>>>>
>>>>> avx512f         : AVX-512 Foundation instruction
>>>>> avx512dq        : AVX-512 Doubleword & Quadword Instruction
>>>>> avx512ifma      : AVX-512 Integer Fused Multiply Add instruction
>>>>> avx512cd        : AVX-512 Conflict Detection instruction
>>>>> avx512bw        : AVX-512 Byte and Word Instructions
>>>>> avx512vl        : AVX-512 Vector Length Extension Instructions
>>>>> avx512vbmi      : AVX-512 Vector Byte Manipulation Instruction
>>>>> avx512_vbmi2    : AVX-512 Additional Vector Byte Manipulation Instruction
>>>>> gfni            : AVX-512 Galois Field New Instructions
>>>>> avx512_vnni     : AVX-512 Vector Neural Network Instructions
>>>>> avx512_bitalg   : AVX-512 Bit Algorithms, add bit algorithms Instructions
>>>>> avx512_vpopcntdq: AVX-512 AVX-512 Vector Population Count Doubleword and
>>>>>                    Quadword Instructions
>>>>> avx512_bf16    : AVX-512 BFLOAT16 instructions
>>>>> la57            : 57-bit virtual address support (5-level Page Tables)
>>>>> vnmi            : Virtual NMI (VNMI) allows the hypervisor to inject
>>>>> the NMI
>>>>>                    into the guest without using Event Injection mechanism
>>>>>                    meaning not required to track the guest NMI and
>>>>> intercepting
>>>>>                    the IRET.
>>>>> auto-ibrs       : The AMD Zen4 core supports a new feature called
>>>>> Automatic IBRS.
>>>>>                    It is a "set-and-forget" feature that means that,
>>>>> unlike e.g.,
>>>>>                    s/w-toggled SPEC_CTRL.IBRS, h/w manages its IBRS
>>>>> mitigation
>>>>>                    resources automatically across CPL transitions.
>>>>>
>>>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>>>>> ---
>>>>>   target/i386/cpu.c | 122 ++++++++++++++++++++++++++++++++++++++++++++++
>>>>>   1 file changed, 122 insertions(+)
>>>>>
>>>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>>>> index d50ace84bf..71fe1e02ee 100644
>>>>> --- a/target/i386/cpu.c
>>>>> +++ b/target/i386/cpu.c
>>>>> @@ -1973,6 +1973,56 @@ static const CPUCaches epyc_milan_v2_cache_info
>>>>> = {
>>>>>       },
>>>>>   };
>>>>> +static const CPUCaches epyc_genoa_cache_info = {
>>>>> +    .l1d_cache = &(CPUCacheInfo) {
>>>>> +        .type = DATA_CACHE,
>>>>> +        .level = 1,
>>>>> +        .size = 32 * KiB,
>>>>> +        .line_size = 64,
>>>>> +        .associativity = 8,
>>>>> +        .partitions = 1,
>>>>> +        .sets = 64,
>>>>> +        .lines_per_tag = 1,
>>>>> +        .self_init = 1,
>>>>> +        .no_invd_sharing = true,
>>>>> +    },
>>>>> +    .l1i_cache = &(CPUCacheInfo) {
>>>>> +        .type = INSTRUCTION_CACHE,
>>>>> +        .level = 1,
>>>>> +        .size = 32 * KiB,
>>>>> +        .line_size = 64,
>>>>> +        .associativity = 8,
>>>>> +        .partitions = 1,
>>>>> +        .sets = 64,
>>>>> +        .lines_per_tag = 1,
>>>>> +        .self_init = 1,
>>>>> +        .no_invd_sharing = true,
>>>>> +    },
>>>>> +    .l2_cache = &(CPUCacheInfo) {
>>>>> +        .type = UNIFIED_CACHE,
>>>>> +        .level = 2,
>>>>> +        .size = 1 * MiB,
>>>>> +        .line_size = 64,
>>>>> +        .associativity = 8,
>>>>> +        .partitions = 1,
>>>>> +        .sets = 2048,
>>>>> +        .lines_per_tag = 1,
>>>>
>>>> 1. Why L2 cache is not shown as inclusive and self-initializing?
>>>>
>>>> PPR for AMD Family 19h Model 11h says for L2 (0x8000001d):
>>>> * cache inclusive. Read-only. Reset: Fixed,1.
>>>> * cache is self-initializing. Read-only. Reset: Fixed,1.
>>>
>>> Yes. That is correct. This needs to be fixed. I will fix it.
>>>>
>>>>> +    },
>>>>> +    .l3_cache = &(CPUCacheInfo) {
>>>>> +        .type = UNIFIED_CACHE,
>>>>> +        .level = 3,
>>>>> +        .size = 32 * MiB,
>>>>> +        .line_size = 64,
>>>>> +        .associativity = 16,
>>>>> +        .partitions = 1,
>>>>> +        .sets = 32768,
>>>>> +        .lines_per_tag = 1,
>>>>> +        .self_init = true,
>>>>> +        .inclusive = true,
>>>>> +        .complex_indexing = false,
>>>>
>>>> 2. Why L3 cache is shown as inclusive? Why is it not shown in L3 that
>>>> the WBINVD/INVD instruction is not guaranteed to invalidate all lower
>>>> level caches (0 bit)?
>>>>
>>>> PPR for AMD Family 19h Model 11h says for L3 (0x8000001d):
>>>> * cache inclusive. Read-only. Reset: Fixed,0.
>>>> * Write-Back Invalidate/Invalidate. Read-only. Reset: Fixed,1.
>>>>
>>>
>>> Yes. Both of these need to be fixed. I will fix them.
>>>
>>>>
>>>>
>>>> 3. Why the default stub is used for TLB, but not real values as for
>>>> other caches?
>>>
>>> Can you please elaborate on this?
>>>
>>
>> For L1i, L1d, L2 and L3 cache we provide the correct information about
>> characteristics. In contrast, for L1i TLB, L1d TLB, L2i TLB and L2d TLB
>> (0x80000005 and 0x80000006) we use the same value for all CPU models.
>> Sometimes it seems strange. For instance, the current default value in
>> QEMU for L2 TLB associativity for 4 KB pages is 4. But 4 is a reserved
>> value for Genoa (as PPR for Family 19h Model 11h says)
>>
>>>>
>>>>> +    },
>>>>> +};
>>>>> +
>>>>>   /* The following VMX features are not supported by KVM and are left
>>>>> out in the
>>>>>    * CPU definitions:
>>>>>    *
>>>>> @@ -4472,6 +4522,78 @@ static const X86CPUDefinition
>>>>> builtin_x86_defs[] = {
>>>>>               { /* end of list */ }
>>>>>           }
>>>>>       },
>>>>> +    {
>>>>> +        .name = "EPYC-Genoa",
>>>>> +        .level = 0xd,
>>>>> +        .vendor = CPUID_VENDOR_AMD,
>>>>> +        .family = 25,
>>>>> +        .model = 17,
>>>>> +        .stepping = 0,
>>>>> +        .features[FEAT_1_EDX] =
>>>>> +            CPUID_SSE2 | CPUID_SSE | CPUID_FXSR | CPUID_MMX |
>>>>> CPUID_CLFLUSH |
>>>>> +            CPUID_PSE36 | CPUID_PAT | CPUID_CMOV | CPUID_MCA |
>>>>> CPUID_PGE |
>>>>> +            CPUID_MTRR | CPUID_SEP | CPUID_APIC | CPUID_CX8 |
>>>>> CPUID_MCE |
>>>>> +            CPUID_PAE | CPUID_MSR | CPUID_TSC | CPUID_PSE | CPUID_DE |
>>>>> +            CPUID_VME | CPUID_FP87,
>>>>> +        .features[FEAT_1_ECX] =
>>>>> +            CPUID_EXT_RDRAND | CPUID_EXT_F16C | CPUID_EXT_AVX |
>>>>> +            CPUID_EXT_XSAVE | CPUID_EXT_AES |  CPUID_EXT_POPCNT |
>>>>> +            CPUID_EXT_MOVBE | CPUID_EXT_SSE42 | CPUID_EXT_SSE41 |
>>>>> +            CPUID_EXT_PCID | CPUID_EXT_CX16 | CPUID_EXT_FMA |
>>>>> +            CPUID_EXT_SSSE3 | CPUID_EXT_MONITOR | CPUID_EXT_PCLMULQDQ |
>>>>> +            CPUID_EXT_SSE3,
>>>>> +        .features[FEAT_8000_0001_EDX] =
>>>>> +            CPUID_EXT2_LM | CPUID_EXT2_RDTSCP | CPUID_EXT2_PDPE1GB |
>>>>> +            CPUID_EXT2_FFXSR | CPUID_EXT2_MMXEXT | CPUID_EXT2_NX |
>>>>> +            CPUID_EXT2_SYSCALL,
>>>>> +        .features[FEAT_8000_0001_ECX] =
>>>>> +            CPUID_EXT3_OSVW | CPUID_EXT3_3DNOWPREFETCH |
>>>>> +            CPUID_EXT3_MISALIGNSSE | CPUID_EXT3_SSE4A | CPUID_EXT3_ABM |
>>>>> +            CPUID_EXT3_CR8LEG | CPUID_EXT3_SVM | CPUID_EXT3_LAHF_LM |
>>>>> +            CPUID_EXT3_TOPOEXT | CPUID_EXT3_PERFCORE,
>>>>> +        .features[FEAT_8000_0008_EBX] =
>>>>> +            CPUID_8000_0008_EBX_CLZERO |
>>>>> CPUID_8000_0008_EBX_XSAVEERPTR |
>>>>> +            CPUID_8000_0008_EBX_WBNOINVD | CPUID_8000_0008_EBX_IBPB |
>>>>> +            CPUID_8000_0008_EBX_IBRS | CPUID_8000_0008_EBX_STIBP |
>>>>> +            CPUID_8000_0008_EBX_STIBP_ALWAYS_ON |
>>>>> +            CPUID_8000_0008_EBX_AMD_SSBD | CPUID_8000_0008_EBX_AMD_PSFD,
>>>>
>>>> 4. Why 0x80000008_EBX features related to speculation vulnerabilities
>>>> (BTC_NO, IBPB_RET, IbrsPreferred, INT_WBINVD) are not set?
>>>
>>> KVM does not expose these bits to the guests yet.
>>>
>>> I normally check using the ioctl KVM_GET_SUPPORTED_CPUID.
>>>
>>
>> I'm not sure, but at least the first two of these features seem to be
>> helpful to choose the appropriate mitigation. Do you think that we should
>> add them to KVM?
>>
>>>
>>>>
>>>>> +        .features[FEAT_8000_0021_EAX] =
>>>>> +            CPUID_8000_0021_EAX_No_NESTED_DATA_BP |
>>>>> +            CPUID_8000_0021_EAX_LFENCE_ALWAYS_SERIALIZING |
>>>>> +            CPUID_8000_0021_EAX_NULL_SEL_CLR_BASE |
>>>>> +            CPUID_8000_0021_EAX_AUTO_IBRS,
>>>>
>>>> 5. Why some 0x80000021_EAX features are not set?
>>>> (FsGsKernelGsBaseNonSerializing, FSRC and FSRS)
>>>
>>> KVM does not expose FSRC and FSRS bits to the guests yet.
>>
>> But KVM exposes the same features (0x7 ecx=1, bits 10 and 11) for Intel
>> CPU models. Do we have to add these bits for AMD to KVM?
>>
>>>
>>> The KVM reports the bit FsGsKernelGsBaseNonSerializing. I will check if
>>> we can add this bit to the Genoa and Turin.
>>>
>>>>
>>>>> +        .features[FEAT_7_0_EBX] =
>>>>> +            CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 |
>>>>> CPUID_7_0_EBX_AVX2 |
>>>>> +            CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_BMI2 |
>>>>> CPUID_7_0_EBX_ERMS |
>>>>> +            CPUID_7_0_EBX_INVPCID | CPUID_7_0_EBX_AVX512F |
>>>>> +            CPUID_7_0_EBX_AVX512DQ | CPUID_7_0_EBX_RDSEED |
>>>>> CPUID_7_0_EBX_ADX |
>>>>> +            CPUID_7_0_EBX_SMAP | CPUID_7_0_EBX_AVX512IFMA |
>>>>> +            CPUID_7_0_EBX_CLFLUSHOPT | CPUID_7_0_EBX_CLWB |
>>>>> +            CPUID_7_0_EBX_AVX512CD | CPUID_7_0_EBX_SHA_NI |
>>>>> +            CPUID_7_0_EBX_AVX512BW | CPUID_7_0_EBX_AVX512VL,
>>>>> +        .features[FEAT_7_0_ECX] =
>>>>> +            CPUID_7_0_ECX_AVX512_VBMI | CPUID_7_0_ECX_UMIP |
>>>>> CPUID_7_0_ECX_PKU |
>>>>> +            CPUID_7_0_ECX_AVX512_VBMI2 | CPUID_7_0_ECX_GFNI |
>>>>> +            CPUID_7_0_ECX_VAES | CPUID_7_0_ECX_VPCLMULQDQ |
>>>>> +            CPUID_7_0_ECX_AVX512VNNI | CPUID_7_0_ECX_AVX512BITALG |
>>>>> +            CPUID_7_0_ECX_AVX512_VPOPCNTDQ | CPUID_7_0_ECX_LA57 |
>>>>> +            CPUID_7_0_ECX_RDPID,
>>>>> +        .features[FEAT_7_0_EDX] =
>>>>> +            CPUID_7_0_EDX_FSRM,
>>>>
>>>> 6. Why L1D_FLUSH is not set? Because only vulnerable MMIO stale data
>>>> processors have to use it, am I right?

Yes. That is correct. We don't need to expose this bit on AMD guests.

I will not add L1D_FLUSH.
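
For anyone who wants to repeat the check, the lookup I do on a CPUID list such
as the one KVM_GET_SUPPORTED_CPUID returns is roughly the sketch below. It
is only an illustration: struct cpuid_entry is a simplified, hypothetical
stand-in for the kernel's struct kvm_cpuid_entry2, and cpuid_bit_supported()
and enum cpuid_reg are made-up names. L1D_FLUSH would be function 7, index 0,
EDX bit 28.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical stand-in for struct kvm_cpuid_entry2. */
struct cpuid_entry {
    uint32_t function;
    uint32_t index;
    uint32_t eax, ebx, ecx, edx;
};

enum cpuid_reg { REG_EAX, REG_EBX, REG_ECX, REG_EDX };

/* Return true if the given bit is set in the matching leaf/subleaf entry. */
bool cpuid_bit_supported(const struct cpuid_entry *entries, size_t n,
                         uint32_t function, uint32_t index,
                         enum cpuid_reg reg, unsigned bit)
{
    assert(entries != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        const struct cpuid_entry *e = &entries[i];
        uint32_t value;

        if (e->function != function || e->index != index) {
            continue;
        }

        switch (reg) {
        case REG_EAX: value = e->eax; break;
        case REG_EBX: value = e->ebx; break;
        case REG_ECX: value = e->ecx; break;
        case REG_EDX: value = e->edx; break;
        default: return false;
        }

        /* Range-check the bit first so the shift is always well defined. */
        return bit < 32 && ((value >> bit) & 1U);
    }

    /* Leaf/subleaf not present in the list at all. */
    return false;
}
```

A call such as cpuid_bit_supported(entries, n, 7, 0, REG_EDX, 28) then says
whether the list advertises L1D_FLUSH. Whether a guest should see the bit is
a separate question.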

Thanks
Babu


end of thread, other threads:[~2024-11-14 17:05 UTC | newest]

Thread overview: 18+ messages
2023-05-04 20:53 [PATCH v4 0/7] Add EPYC-Genoa model and update previous EPYC Models Babu Moger
2023-05-04 20:53 ` [PATCH v4 1/7] target/i386: allow versioned CPUs to specify new cache_info Babu Moger
2023-05-04 20:53 ` [PATCH v4 2/7] target/i386: Add new EPYC CPU versions with updated cache_info Babu Moger
2023-05-04 20:53 ` [PATCH v4 3/7] target/i386: Add a couple of feature bits in 8000_0008_EBX Babu Moger
2023-05-04 20:53 ` [PATCH v4 4/7] target/i386: Add feature bits for CPUID_Fn80000021_EAX Babu Moger
2023-05-05  8:29   ` Paolo Bonzini
2023-05-04 20:53 ` [PATCH v4 5/7] target/i386: Add missing feature bits in EPYC-Milan model Babu Moger
2023-05-04 20:53 ` [PATCH v4 6/7] target/i386: Add VNMI and automatic IBRS feature bits Babu Moger
2023-05-04 20:53 ` [PATCH v4 7/7] target/i386: Add EPYC-Genoa model to support Zen 4 processor series Babu Moger
2024-11-08 18:15   ` Maksim Davydov
2024-11-08 20:56     ` Moger, Babu
2024-11-12 10:09       ` Maksim Davydov
2024-11-12 16:23         ` Moger, Babu
2024-11-13 14:15           ` Maksim Davydov
2024-11-13 16:23         ` Moger, Babu
2024-11-14 16:59           ` Moger, Babu
2023-05-05  8:31 ` [PATCH v4 0/7] Add EPYC-Genoa model and update previous EPYC Models Paolo Bonzini
2023-05-05 17:15   ` Moger, Babu
