qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/8] Add AVX10.1 CPUID support and GraniteRapids-v2 model
@ 2024-10-29 15:18 Paolo Bonzini
  2024-10-29 15:18 ` [PATCH 1/8] target/i386: cpu: set correct supported XCR0 features for TCG Paolo Bonzini
                   ` (8 more replies)
  0 siblings, 9 replies; 31+ messages in thread
From: Paolo Bonzini @ 2024-10-29 15:18 UTC (permalink / raw)
  To: qemu-devel; +Cc: tao1.su, zhao1.liu, xiaoyao.li

Most of the patches here are from Tao Su's v1.  The main issue in his
version were two:

- overlooking kvm_cpu_xsave_init(), which currently looks at ExtSaveArea.
  This would get a bit ugly for extended save states that are enabled
  by both AVX512 and AVX10.  Patches 1-2 change kvm_cpu_xsave_init()
  to look at ExtSaveArea's size field instead of testing features.

- downgrading silently to KVM reported value if the avx10_version property
  is >= kvm reported value.  Xiaoyao Li suggested basing this on
  cpu->check_cpuid and cpu->enforce_cpuid.  Also, the check must accept
  any accelerator and not just KVM.  I moved the check to
  x86_cpu_filter_features in patch 4.

I don't have a Granite Rapids machine, so please test! :)

Paolo

Paolo Bonzini (3):
  target/i386: cpu: set correct supported XCR0 features for TCG
  target/i386: do not rely on ExtSaveArea for accelerator-supported XCR0 bits
  target/i386: return bool from x86_cpu_filter_features

Tao Su (5):
  target/i386: add AVX10 feature and AVX10 version property
  target/i386: add CPUID.24 features for AVX10
  target/i386: Add feature dependencies for AVX10
  target/i386: Add AVX512 state when AVX10 is supported
  target/i386: Introduce GraniteRapids-v2 model

 target/i386/cpu.h         |  16 ++++
 target/i386/cpu.c         | 175 ++++++++++++++++++++++++++++++++++----
 target/i386/kvm/kvm-cpu.c |   8 --
 target/i386/kvm/kvm.c     |   3 +-
 4 files changed, 175 insertions(+), 27 deletions(-)

-- 
2.47.0



^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 1/8] target/i386: cpu: set correct supported XCR0 features for TCG
  2024-10-29 15:18 [PATCH v2 0/8] Add AVX10.1 CPUID support and GraniteRapids-v2 model Paolo Bonzini
@ 2024-10-29 15:18 ` Paolo Bonzini
  2024-10-30  2:56   ` Zhao Liu
  2024-10-29 15:18 ` [PATCH 2/8] target/i386: do not rely on ExtSaveArea for accelerator-supported XCR0 bits Paolo Bonzini
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 31+ messages in thread
From: Paolo Bonzini @ 2024-10-29 15:18 UTC (permalink / raw)
  To: qemu-devel; +Cc: tao1.su, zhao1.liu, xiaoyao.li

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/cpu.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 5886b44fcf7..f08e9b8f1bc 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1318,7 +1318,9 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
             .needs_ecx = true, .ecx = 0,
             .reg = R_EAX,
         },
-        .tcg_features = ~0U,
+        .tcg_features = XSTATE_FP_MASK | XSTATE_SSE_MASK |
+            XSTATE_YMM_MASK | XSTATE_BNDREGS_MASK | XSTATE_BNDCSR_MASK |
+            XSTATE_PKRU_MASK,
         .migratable_flags = XSTATE_FP_MASK | XSTATE_SSE_MASK |
             XSTATE_YMM_MASK | XSTATE_BNDREGS_MASK | XSTATE_BNDCSR_MASK |
             XSTATE_OPMASK_MASK | XSTATE_ZMM_Hi256_MASK | XSTATE_Hi16_ZMM_MASK |
@@ -1331,7 +1333,7 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
             .needs_ecx = true, .ecx = 0,
             .reg = R_EDX,
         },
-        .tcg_features = ~0U,
+        .tcg_features = 0U,
     },
     /*Below are MSR exposed features*/
     [FEAT_ARCH_CAPABILITIES] = {
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 2/8] target/i386: do not rely on ExtSaveArea for accelerator-supported XCR0 bits
  2024-10-29 15:18 [PATCH v2 0/8] Add AVX10.1 CPUID support and GraniteRapids-v2 model Paolo Bonzini
  2024-10-29 15:18 ` [PATCH 1/8] target/i386: cpu: set correct supported XCR0 features for TCG Paolo Bonzini
@ 2024-10-29 15:18 ` Paolo Bonzini
  2024-10-30  3:50   ` Zhao Liu
  2024-10-29 15:18 ` [PATCH 3/8] target/i386: return bool from x86_cpu_filter_features Paolo Bonzini
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 31+ messages in thread
From: Paolo Bonzini @ 2024-10-29 15:18 UTC (permalink / raw)
  To: qemu-devel; +Cc: tao1.su, zhao1.liu, xiaoyao.li

Right now, QEMU is using the "feature" and "bits" fields of ExtSaveArea
to query the accelerator for the support status of extended save areas.
This is a problem for AVX10, which attaches two feature bits (AVX512F
and AVX10) to the same extended save states.

To keep the AVX10 hacks to the minimum, limit usage of esa->features
and esa->bits.  Instead, just query the accelerator for the 0xD leaf.
Do it in common code and clear esa->size if an extended save state is
unsupported.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/cpu.c         | 33 +++++++++++++++++++++++++++++++--
 target/i386/kvm/kvm-cpu.c |  4 ----
 2 files changed, 31 insertions(+), 6 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index f08e9b8f1bc..1ee4d988caf 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -7102,6 +7102,15 @@ static void x86_cpu_set_sgxlepubkeyhash(CPUX86State *env)
 #endif
 }
 
+static bool cpuid_has_xsave_feature(CPUX86State *env, const ExtSaveArea *esa)
+{
+    if (!esa->size) {
+        return false;
+    }
+
+    return (env->features[esa->feature] & esa->bits);
+}
+
 static void x86_cpu_reset_hold(Object *obj, ResetType type)
 {
     CPUState *cs = CPU(obj);
@@ -7210,7 +7219,7 @@ static void x86_cpu_reset_hold(Object *obj, ResetType type)
         if (!((1 << i) & CPUID_XSTATE_XCR0_MASK)) {
             continue;
         }
-        if (env->features[esa->feature] & esa->bits) {
+        if (cpuid_has_xsave_feature(env, esa)) {
             xcr0 |= 1ull << i;
         }
     }
@@ -7348,7 +7357,7 @@ static void x86_cpu_enable_xsave_components(X86CPU *cpu)
     mask = 0;
     for (i = 0; i < ARRAY_SIZE(x86_ext_save_areas); i++) {
         const ExtSaveArea *esa = &x86_ext_save_areas[i];
-        if (env->features[esa->feature] & esa->bits) {
+        if (cpuid_has_xsave_feature(env, esa)) {
             mask |= (1ULL << i);
         }
     }
@@ -8020,6 +8029,26 @@ static void x86_cpu_register_feature_bit_props(X86CPUClass *xcc,
 
 static void x86_cpu_post_initfn(Object *obj)
 {
+    static bool first = true;
+    uint64_t supported_xcr0;
+    int i;
+
+    if (first) {
+        first = false;
+
+        supported_xcr0 =
+            ((uint64_t) x86_cpu_get_supported_feature_word(NULL, FEAT_XSAVE_XCR0_HI) << 32) |
+            x86_cpu_get_supported_feature_word(NULL, FEAT_XSAVE_XCR0_LO);
+
+        for (i = XSTATE_SSE_BIT + 1; i < XSAVE_STATE_AREA_COUNT; i++) {
+            ExtSaveArea *esa = &x86_ext_save_areas[i];
+
+            if (!(supported_xcr0 & (1 << i))) {
+                esa->size = 0;
+            }
+        }
+    }
+
     accel_cpu_instance_init(CPU(obj));
 }
 
diff --git a/target/i386/kvm/kvm-cpu.c b/target/i386/kvm/kvm-cpu.c
index 6bf8dcfc607..d9306418490 100644
--- a/target/i386/kvm/kvm-cpu.c
+++ b/target/i386/kvm/kvm-cpu.c
@@ -133,10 +133,6 @@ static void kvm_cpu_xsave_init(void)
         if (!esa->size) {
             continue;
         }
-        if ((x86_cpu_get_supported_feature_word(NULL, esa->feature) & esa->bits)
-            != esa->bits) {
-            continue;
-        }
         host_cpuid(0xd, i, &eax, &ebx, &ecx, &edx);
         if (eax != 0) {
             assert(esa->size == eax);
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 3/8] target/i386: return bool from x86_cpu_filter_features
  2024-10-29 15:18 [PATCH v2 0/8] Add AVX10.1 CPUID support and GraniteRapids-v2 model Paolo Bonzini
  2024-10-29 15:18 ` [PATCH 1/8] target/i386: cpu: set correct supported XCR0 features for TCG Paolo Bonzini
  2024-10-29 15:18 ` [PATCH 2/8] target/i386: do not rely on ExtSaveArea for accelerator-supported XCR0 bits Paolo Bonzini
@ 2024-10-29 15:18 ` Paolo Bonzini
  2024-10-30  5:19   ` Zhao Liu
  2024-10-29 15:18 ` [PATCH 4/8] target/i386: add AVX10 feature and AVX10 version property Paolo Bonzini
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 31+ messages in thread
From: Paolo Bonzini @ 2024-10-29 15:18 UTC (permalink / raw)
  To: qemu-devel; +Cc: tao1.su, zhao1.liu, xiaoyao.li

Prepare for filtering non-boolean features such as AVX10 version.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/cpu.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 1ee4d988caf..c2f6045ec1c 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5840,7 +5840,7 @@ static void x86_cpu_parse_featurestr(const char *typename, char *features,
     }
 }
 
-static void x86_cpu_filter_features(X86CPU *cpu, bool verbose);
+static bool x86_cpu_filter_features(X86CPU *cpu, bool verbose);
 
 /* Build a list with the name of all features on a feature word array */
 static void x86_cpu_list_feature_names(FeatureWordArray features,
@@ -7558,7 +7558,7 @@ void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
  *
  * Returns: 0 if all flags are supported by the host, non-zero otherwise.
  */
-static void x86_cpu_filter_features(X86CPU *cpu, bool verbose)
+static bool x86_cpu_filter_features(X86CPU *cpu, bool verbose)
 {
     CPUX86State *env = &cpu->env;
     FeatureWord w;
@@ -7610,6 +7610,8 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool verbose)
             mark_unavailable_features(cpu, FEAT_7_0_EBX, CPUID_7_0_EBX_INTEL_PT, prefix);
         }
     }
+
+    return x86_cpu_have_filtered_features(cpu);
 }
 
 static void x86_cpu_hyperv_realize(X86CPU *cpu)
@@ -7707,14 +7709,14 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
         }
     }
 
-    x86_cpu_filter_features(cpu, cpu->check_cpuid || cpu->enforce_cpuid);
-
-    if (cpu->enforce_cpuid && x86_cpu_have_filtered_features(cpu)) {
-        error_setg(&local_err,
-                   accel_uses_host_cpuid() ?
+    if (x86_cpu_filter_features(cpu, cpu->check_cpuid || cpu->enforce_cpuid)) {
+        if (cpu->enforce_cpuid) {
+            error_setg(&local_err,
+                       accel_uses_host_cpuid() ?
                        "Host doesn't support requested features" :
                        "TCG doesn't support requested features");
-        goto out;
+            goto out;
+        }
     }
 
     /* On AMD CPUs, some CPUID[8000_0001].EDX bits must match the bits on
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 4/8] target/i386: add AVX10 feature and AVX10 version property
  2024-10-29 15:18 [PATCH v2 0/8] Add AVX10.1 CPUID support and GraniteRapids-v2 model Paolo Bonzini
                   ` (2 preceding siblings ...)
  2024-10-29 15:18 ` [PATCH 3/8] target/i386: return bool from x86_cpu_filter_features Paolo Bonzini
@ 2024-10-29 15:18 ` Paolo Bonzini
  2024-10-30  3:05   ` Tao Su
  2024-10-30  8:44   ` Zhao Liu
  2024-10-29 15:18 ` [PATCH 5/8] target/i386: add CPUID.24 features for AVX10 Paolo Bonzini
                   ` (4 subsequent siblings)
  8 siblings, 2 replies; 31+ messages in thread
From: Paolo Bonzini @ 2024-10-29 15:18 UTC (permalink / raw)
  To: qemu-devel; +Cc: tao1.su, zhao1.liu, xiaoyao.li

From: Tao Su <tao1.su@linux.intel.com>

When AVX10 enable bit is set, the 0x24 leaf will be present as "AVX10
Converged Vector ISA leaf" containing fields for the version number and
the supported vector bit lengths.

Introduce avx10-version property so that avx10 version can be controlled
by user and cpu model. Per spec, avx10 version can never be 0, the default
value of avx10-version is set to 0 to determine whether it is specified by
user.  The default can come from the device model or, for the max model,
from KVM's reported value.

Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Link: https://lore.kernel.org/r/20241028024512.156724-3-tao1.su@linux.intel.com
Link: https://lore.kernel.org/r/20241028024512.156724-4-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/cpu.h     |  4 +++
 target/i386/cpu.c     | 64 ++++++++++++++++++++++++++++++++++++++-----
 target/i386/kvm/kvm.c |  3 +-
 3 files changed, 63 insertions(+), 8 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index a0a122cb5bf..72e98b25114 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -975,6 +975,8 @@ uint64_t x86_cpu_get_supported_feature_word(X86CPU *cpu, FeatureWord w);
 #define CPUID_7_1_EDX_AMX_COMPLEX       (1U << 8)
 /* PREFETCHIT0/1 Instructions */
 #define CPUID_7_1_EDX_PREFETCHITI       (1U << 14)
+/* Support for Advanced Vector Extensions 10 */
+#define CPUID_7_1_EDX_AVX10             (1U << 19)
 /* Flexible return and event delivery (FRED) */
 #define CPUID_7_1_EAX_FRED              (1U << 17)
 /* Load into IA32_KERNEL_GS_BASE (LKGS) */
@@ -1954,6 +1956,8 @@ typedef struct CPUArchState {
     uint32_t cpuid_vendor3;
     uint32_t cpuid_version;
     FeatureWordArray features;
+    /* AVX10 version */
+    uint8_t avx10_version;
     /* Features that were explicitly enabled/disabled */
     FeatureWordArray user_features;
     uint32_t cpuid_model[12];
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index c2f6045ec1c..fdbfcc59da4 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -46,6 +46,9 @@
 #include "cpu-internal.h"
 
 static void x86_cpu_realizefn(DeviceState *dev, Error **errp);
+static void x86_cpu_get_supported_cpuid(uint32_t func, uint32_t index,
+                                        uint32_t *eax, uint32_t *ebx,
+                                        uint32_t *ecx, uint32_t *edx);
 
 /* Helpers for building CPUID[2] descriptors: */
 
@@ -1132,7 +1135,7 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
             "avx-vnni-int8", "avx-ne-convert", NULL, NULL,
             "amx-complex", NULL, "avx-vnni-int16", NULL,
             NULL, NULL, "prefetchiti", NULL,
-            NULL, NULL, NULL, NULL,
+            NULL, NULL, NULL, "avx10",
             NULL, NULL, NULL, NULL,
             NULL, NULL, NULL, NULL,
             NULL, NULL, NULL, NULL,
@@ -1989,6 +1992,7 @@ typedef struct X86CPUDefinition {
     int family;
     int model;
     int stepping;
+    int avx10_version;
     FeatureWordArray features;
     const char *model_id;
     const CPUCaches *const cache_info;
@@ -5338,6 +5342,8 @@ static Property max_x86_cpu_properties[] = {
 static void max_x86_cpu_realize(DeviceState *dev, Error **errp)
 {
     Object *obj = OBJECT(dev);
+    X86CPU *cpu = X86_CPU(dev);
+    CPUX86State *env = &cpu->env;
 
     if (!object_property_get_int(obj, "family", &error_abort)) {
         if (X86_CPU(obj)->env.features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM) {
@@ -5351,6 +5357,14 @@ static void max_x86_cpu_realize(DeviceState *dev, Error **errp)
         }
     }
 
+    if ((env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10) && !env->avx10_version) {
+        uint32_t eax, ebx, ecx, edx;
+        x86_cpu_get_supported_cpuid(0x24, 0,
+                                    &eax, &ebx, &ecx, &edx);
+
+        env->avx10_version = ebx & 0xff;
+    }
+
     x86_cpu_realizefn(dev, errp);
 }
 
@@ -6331,6 +6345,9 @@ static void x86_cpu_load_model(X86CPU *cpu, X86CPUModel *model)
      */
     object_property_set_str(OBJECT(cpu), "vendor", def->vendor, &error_abort);
 
+    object_property_set_int(OBJECT(cpu), "avx10-version", def->avx10_version,
+                            &error_abort);
+
     x86_cpu_apply_version_props(cpu, model);
 
     /*
@@ -6859,6 +6876,16 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         }
         break;
     }
+    case 0x24: {
+        *eax = 0;
+        *ebx = 0;
+        *ecx = 0;
+        *edx = 0;
+        if (env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10) {
+            *ebx = env->avx10_version;
+        }
+        break;
+    }
     case 0x40000000:
         /*
          * CPUID code in kvm_arch_init_vcpu() ignores stuff
@@ -7513,6 +7540,11 @@ void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
             x86_cpu_adjust_level(cpu, &env->cpuid_min_level, 0x1F);
         }
 
+        /* Advanced Vector Extensions 10 (AVX10) requires CPUID[0x24] */
+        if (env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10) {
+            x86_cpu_adjust_level(cpu, &env->cpuid_min_level, 0x24);
+        }
+
         /* SVM requires CPUID[0x8000000A] */
         if (env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_SVM) {
             x86_cpu_adjust_level(cpu, &env->cpuid_min_xlevel, 0x8000000A);
@@ -7563,6 +7595,10 @@ static bool x86_cpu_filter_features(X86CPU *cpu, bool verbose)
     CPUX86State *env = &cpu->env;
     FeatureWord w;
     const char *prefix = NULL;
+    bool have_filtered_features;
+
+    uint32_t eax_0, ebx_0, ecx_0, edx_0;
+    uint32_t eax_1, ebx_1, ecx_1, edx_1;
 
     if (verbose) {
         prefix = accel_uses_host_cpuid()
@@ -7584,13 +7620,10 @@ static bool x86_cpu_filter_features(X86CPU *cpu, bool verbose)
      */
     if ((env->features[FEAT_7_0_EBX] & CPUID_7_0_EBX_INTEL_PT) &&
         kvm_enabled()) {
-        uint32_t eax_0, ebx_0, ecx_0, edx_0_unused;
-        uint32_t eax_1, ebx_1, ecx_1_unused, edx_1_unused;
-
         x86_cpu_get_supported_cpuid(0x14, 0,
-                                    &eax_0, &ebx_0, &ecx_0, &edx_0_unused);
+                                    &eax_0, &ebx_0, &ecx_0, &edx_0);
         x86_cpu_get_supported_cpuid(0x14, 1,
-                                    &eax_1, &ebx_1, &ecx_1_unused, &edx_1_unused);
+                                    &eax_1, &ebx_1, &ecx_1, &edx_1);
 
         if (!eax_0 ||
            ((ebx_0 & INTEL_PT_MINIMAL_EBX) != INTEL_PT_MINIMAL_EBX) ||
@@ -7611,7 +7644,23 @@ static bool x86_cpu_filter_features(X86CPU *cpu, bool verbose)
         }
     }
 
-    return x86_cpu_have_filtered_features(cpu);
+    have_filtered_features = x86_cpu_have_filtered_features(cpu);
+
+    if (env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10) {
+        x86_cpu_get_supported_cpuid(0x24, 0,
+                                    &eax_0, &ebx_0, &ecx_0, &edx_0);
+        uint8_t version = ebx_0 & 0xff;
+
+        if (version < env->avx10_version) {
+            if (prefix) {
+                warn_report("%s: avx10.%d", prefix, env->avx10_version);
+            }
+            env->avx10_version = version;
+            have_filtered_features = true;
+        }
+    }
+
+    return have_filtered_features;
 }
 
 static void x86_cpu_hyperv_realize(X86CPU *cpu)
@@ -8395,6 +8444,7 @@ static Property x86_cpu_properties[] = {
     DEFINE_PROP_UINT32("min-level", X86CPU, env.cpuid_min_level, 0),
     DEFINE_PROP_UINT32("min-xlevel", X86CPU, env.cpuid_min_xlevel, 0),
     DEFINE_PROP_UINT32("min-xlevel2", X86CPU, env.cpuid_min_xlevel2, 0),
+    DEFINE_PROP_UINT8("avx10-version", X86CPU, env.avx10_version, 0),
     DEFINE_PROP_UINT64("ucode-rev", X86CPU, ucode_rev, 0),
     DEFINE_PROP_BOOL("full-cpuid-auto-level", X86CPU, full_cpuid_auto_level, true),
     DEFINE_PROP_STRING("hv-vendor-id", X86CPU, hyperv_vendor),
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index fd9f1988920..8e17942c3ba 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1923,7 +1923,8 @@ static uint32_t kvm_x86_build_cpuid(CPUX86State *env,
         case 0x7:
         case 0x14:
         case 0x1d:
-        case 0x1e: {
+        case 0x1e:
+        case 0x24: {
             uint32_t times;
 
             c->function = i;
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 5/8] target/i386: add CPUID.24 features for AVX10
  2024-10-29 15:18 [PATCH v2 0/8] Add AVX10.1 CPUID support and GraniteRapids-v2 model Paolo Bonzini
                   ` (3 preceding siblings ...)
  2024-10-29 15:18 ` [PATCH 4/8] target/i386: add AVX10 feature and AVX10 version property Paolo Bonzini
@ 2024-10-29 15:18 ` Paolo Bonzini
  2024-10-30  8:50   ` Zhao Liu
  2024-10-29 15:18 ` [PATCH 6/8] target/i386: Add feature dependencies " Paolo Bonzini
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 31+ messages in thread
From: Paolo Bonzini @ 2024-10-29 15:18 UTC (permalink / raw)
  To: qemu-devel; +Cc: tao1.su, zhao1.liu, xiaoyao.li

From: Tao Su <tao1.su@linux.intel.com>

Introduce features for the supported vector bit lengths.

Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Link: https://lore.kernel.org/r/20241028024512.156724-3-tao1.su@linux.intel.com
Link: https://lore.kernel.org/r/20241028024512.156724-4-tao1.su@linux.intel.com
---
 target/i386/cpu.h |  8 ++++++++
 target/i386/cpu.c | 15 +++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 72e98b25114..f8f97fe9330 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -665,6 +665,7 @@ typedef enum FeatureWord {
     FEAT_XSAVE_XSS_HI,     /* CPUID[EAX=0xd,ECX=1].EDX */
     FEAT_7_1_EDX,       /* CPUID[EAX=7,ECX=1].EDX */
     FEAT_7_2_EDX,       /* CPUID[EAX=7,ECX=2].EDX */
+    FEAT_24_0_EBX,      /* CPUID[EAX=0x24,ECX=0].EBX */
     FEATURE_WORDS,
 } FeatureWord;
 
@@ -993,6 +994,13 @@ uint64_t x86_cpu_get_supported_feature_word(X86CPU *cpu, FeatureWord w);
 /* Packets which contain IP payload have LIP values */
 #define CPUID_14_0_ECX_LIP              (1U << 31)
 
+/* AVX10 128-bit vector support is present */
+#define CPUID_24_0_EBX_AVX10_128        (1U << 16)
+/* AVX10 256-bit vector support is present */
+#define CPUID_24_0_EBX_AVX10_256        (1U << 17)
+/* AVX10 512-bit vector support is present */
+#define CPUID_24_0_EBX_AVX10_512        (1U << 18)
+
 /* RAS Features */
 #define CPUID_8000_0007_EBX_OVERFLOW_RECOV    (1U << 0)
 #define CPUID_8000_0007_EBX_SUCCOR      (1U << 1)
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index fdbfcc59da4..3f7fed8e485 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -901,6 +901,7 @@ void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1,
 #define TCG_SGX_12_0_EAX_FEATURES 0
 #define TCG_SGX_12_0_EBX_FEATURES 0
 #define TCG_SGX_12_1_EAX_FEATURES 0
+#define TCG_24_0_EBX_FEATURES 0
 
 #if defined CONFIG_USER_ONLY
 #define CPUID_8000_0008_EBX_KERNEL_FEATURES (CPUID_8000_0008_EBX_IBPB | \
@@ -1166,6 +1167,20 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
         },
         .tcg_features = TCG_7_2_EDX_FEATURES,
     },
+    [FEAT_24_0_EBX] = {
+        .type = CPUID_FEATURE_WORD,
+        .feat_names = {
+            [16] = "avx10-128",
+            [17] = "avx10-256",
+            [18] = "avx10-512",
+        },
+        .cpuid = {
+            .eax = 0x24,
+            .needs_ecx = true, .ecx = 0,
+            .reg = R_EBX,
+        },
+        .tcg_features = TCG_24_0_EBX_FEATURES,
+    },
     [FEAT_8000_0007_EDX] = {
         .type = CPUID_FEATURE_WORD,
         .feat_names = {
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 6/8] target/i386: Add feature dependencies for AVX10
  2024-10-29 15:18 [PATCH v2 0/8] Add AVX10.1 CPUID support and GraniteRapids-v2 model Paolo Bonzini
                   ` (4 preceding siblings ...)
  2024-10-29 15:18 ` [PATCH 5/8] target/i386: add CPUID.24 features for AVX10 Paolo Bonzini
@ 2024-10-29 15:18 ` Paolo Bonzini
  2024-10-29 15:18 ` [PATCH 7/8] target/i386: Add AVX512 state when AVX10 is supported Paolo Bonzini
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 31+ messages in thread
From: Paolo Bonzini @ 2024-10-29 15:18 UTC (permalink / raw)
  To: qemu-devel; +Cc: tao1.su, zhao1.liu, xiaoyao.li, Xuelian Guo

From: Tao Su <tao1.su@linux.intel.com>

Since the highest supported vector length for a processor implies that
all lesser vector lengths are also supported, add the dependencies of
the supported vector lengths. If all vector lengths aren't supported,
clear AVX10 enable bit as well.

Note that the order of AVX10 related dependencies should be kept as:
        CPUID_24_0_EBX_AVX10_128     -> CPUID_24_0_EBX_AVX10_256,
        CPUID_24_0_EBX_AVX10_256     -> CPUID_24_0_EBX_AVX10_512,
        CPUID_24_0_EBX_AVX10_VL_MASK -> CPUID_7_1_EDX_AVX10,
        CPUID_7_1_EDX_AVX10          -> CPUID_24_0_EBX,
so that prevent user from setting weird CPUID combinations, e.g. 256-bits
and 512-bits are supported but 128-bits is not, no vector lengths are
supported but AVX10 enable bit is still set.

Since AVX10_128 will be reserved as 1, adding these dependencies has the
bonus that when user sets -cpu host,-avx10-128, CPUID_7_1_EDX_AVX10 and
CPUID_24_0_EBX will be disabled automatically.

Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Link: https://lore.kernel.org/r/20241028024512.156724-5-tao1.su@linux.intel.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/cpu.h |  4 ++++
 target/i386/cpu.c | 16 ++++++++++++++++
 2 files changed, 20 insertions(+)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index f8f97fe9330..59959b8b7a4 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1000,6 +1000,10 @@ uint64_t x86_cpu_get_supported_feature_word(X86CPU *cpu, FeatureWord w);
 #define CPUID_24_0_EBX_AVX10_256        (1U << 17)
 /* AVX10 512-bit vector support is present */
 #define CPUID_24_0_EBX_AVX10_512        (1U << 18)
+/* AVX10 vector length support mask */
+#define CPUID_24_0_EBX_AVX10_VL_MASK    (CPUID_24_0_EBX_AVX10_128 | \
+                                         CPUID_24_0_EBX_AVX10_256 | \
+                                         CPUID_24_0_EBX_AVX10_512)
 
 /* RAS Features */
 #define CPUID_8000_0007_EBX_OVERFLOW_RECOV    (1U << 0)
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 3f7fed8e485..4c86a49ad05 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1787,6 +1787,22 @@ static FeatureDep feature_dependencies[] = {
         .from = { FEAT_7_0_EBX,             CPUID_7_0_EBX_SGX },
         .to = { FEAT_SGX_12_1_EAX,          ~0ull },
     },
+    {
+        .from = { FEAT_24_0_EBX,            CPUID_24_0_EBX_AVX10_128 },
+        .to = { FEAT_24_0_EBX,              CPUID_24_0_EBX_AVX10_256 },
+    },
+    {
+        .from = { FEAT_24_0_EBX,            CPUID_24_0_EBX_AVX10_256 },
+        .to = { FEAT_24_0_EBX,              CPUID_24_0_EBX_AVX10_512 },
+    },
+    {
+        .from = { FEAT_24_0_EBX,            CPUID_24_0_EBX_AVX10_VL_MASK },
+        .to = { FEAT_7_1_EDX,               CPUID_7_1_EDX_AVX10 },
+    },
+    {
+        .from = { FEAT_7_1_EDX,             CPUID_7_1_EDX_AVX10 },
+        .to = { FEAT_24_0_EBX,              ~0ull },
+    },
 };
 
 typedef struct X86RegisterInfo32 {
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 7/8] target/i386: Add AVX512 state when AVX10 is supported
  2024-10-29 15:18 [PATCH v2 0/8] Add AVX10.1 CPUID support and GraniteRapids-v2 model Paolo Bonzini
                   ` (5 preceding siblings ...)
  2024-10-29 15:18 ` [PATCH 6/8] target/i386: Add feature dependencies " Paolo Bonzini
@ 2024-10-29 15:18 ` Paolo Bonzini
  2024-10-29 20:11   ` Paolo Bonzini
  2024-10-30  8:54   ` Zhao Liu
  2024-10-29 15:18 ` [PATCH 8/8] target/i386: Introduce GraniteRapids-v2 model Paolo Bonzini
  2024-10-30  3:18 ` [PATCH v2 0/8] Add AVX10.1 CPUID support and " Tao Su
  8 siblings, 2 replies; 31+ messages in thread
From: Paolo Bonzini @ 2024-10-29 15:18 UTC (permalink / raw)
  To: qemu-devel; +Cc: tao1.su, zhao1.liu, xiaoyao.li

From: Tao Su <tao1.su@linux.intel.com>

AVX10 state enumeration in CPUID leaf D and enabling in XCR0 register
are identical to AVX512 state regardless of the supported vector lengths.

Given that some E-cores will support AVX10 but not support AVX512, add
AVX512 state components to guest when AVX10 is enabled.

Based on a patch by Tao Su <tao1.su@linux.intel.com>

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/cpu.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 4c86a49ad05..b6799ddafa9 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -7166,7 +7166,15 @@ static bool cpuid_has_xsave_feature(CPUX86State *env, const ExtSaveArea *esa)
         return false;
     }
 
-    return (env->features[esa->feature] & esa->bits);
+    if (env->features[esa->feature] & esa->bits) {
+        return true;
+    }
+    if (esa->feature == FEAT_7_0_EBX && esa->bits = CPUID_7_0_EBX_AVX512F
+        && (features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10)) {
+        return true;
+    }
+
+    return false;
 }
 
 static void x86_cpu_reset_hold(Object *obj, ResetType type)
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 8/8] target/i386: Introduce GraniteRapids-v2 model
  2024-10-29 15:18 [PATCH v2 0/8] Add AVX10.1 CPUID support and GraniteRapids-v2 model Paolo Bonzini
                   ` (6 preceding siblings ...)
  2024-10-29 15:18 ` [PATCH 7/8] target/i386: Add AVX512 state when AVX10 is supported Paolo Bonzini
@ 2024-10-29 15:18 ` Paolo Bonzini
  2024-10-30  3:18 ` [PATCH v2 0/8] Add AVX10.1 CPUID support and " Tao Su
  8 siblings, 0 replies; 31+ messages in thread
From: Paolo Bonzini @ 2024-10-29 15:18 UTC (permalink / raw)
  To: qemu-devel; +Cc: tao1.su, zhao1.liu, xiaoyao.li, Xuelian Guo

From: Tao Su <tao1.su@linux.intel.com>

Update GraniteRapids CPU model to add AVX10 and the missing features(ss,
tsc-adjust, cldemote, movdiri, movdir64b).

Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Link: https://lore.kernel.org/r/20241028024512.156724-7-tao1.su@linux.intel.com
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/cpu.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index b6799ddafa9..c39e0eb924c 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -4403,6 +4403,23 @@ static const X86CPUDefinition builtin_x86_defs[] = {
         .model_id = "Intel Xeon Processor (GraniteRapids)",
         .versions = (X86CPUVersionDefinition[]) {
             { .version = 1 },
+            {
+                .version = 2,
+                .props = (PropValue[]) {
+                    { "ss", "on" },
+                    { "tsc-adjust", "on" },
+                    { "cldemote", "on" },
+                    { "movdiri", "on" },
+                    { "movdir64b", "on" },
+                    { "avx10", "on" },
+                    { "avx10-128", "on" },
+                    { "avx10-256", "on" },
+                    { "avx10-512", "on" },
+                    { "avx10-version", "1" },
+                    { "stepping", "1" },
+                    { /* end of list */ }
+                }
+            },
             { /* end of list */ },
         },
     },
-- 
2.47.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 7/8] target/i386: Add AVX512 state when AVX10 is supported
  2024-10-29 15:18 ` [PATCH 7/8] target/i386: Add AVX512 state when AVX10 is supported Paolo Bonzini
@ 2024-10-29 20:11   ` Paolo Bonzini
  2024-10-30  8:54   ` Zhao Liu
  1 sibling, 0 replies; 31+ messages in thread
From: Paolo Bonzini @ 2024-10-29 20:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: tao1.su, zhao1.liu, xiaoyao.li

On Tue, Oct 29, 2024 at 4:19 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> From: Tao Su <tao1.su@linux.intel.com>
>
> AVX10 state enumeration in CPUID leaf D and enabling in XCR0 register
> are identical to AVX512 state regardless of the supported vector lengths.
>
> Given that some E-cores will support AVX10 but not support AVX512, add
> AVX512 state components to guest when AVX10 is enabled.
>
> Based on a patch by Tao Su <tao1.su@linux.intel.com>
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  target/i386/cpu.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 4c86a49ad05..b6799ddafa9 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -7166,7 +7166,15 @@ static bool cpuid_has_xsave_feature(CPUX86State *env, const ExtSaveArea *esa)
>          return false;
>      }
>
> -    return (env->features[esa->feature] & esa->bits);
> +    if (env->features[esa->feature] & esa->bits) {
> +        return true;
> +    }
> +    if (esa->feature == FEAT_7_0_EBX && esa->bits = CPUID_7_0_EBX_AVX512F
> +        && (features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10)) {
> +        return true;
> +    }

Oops, a few syntax issues in this hunk. :(

The right version is at branch avx10-for-intel of
https://gitlab.com/bonzini/qemu/.

> +
> +    return false;
>  }
>
>  static void x86_cpu_reset_hold(Object *obj, ResetType type)
> --
> 2.47.0
>



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/8] target/i386: cpu: set correct supported XCR0 features for TCG
  2024-10-29 15:18 ` [PATCH 1/8] target/i386: cpu: set correct supported XCR0 features for TCG Paolo Bonzini
@ 2024-10-30  2:56   ` Zhao Liu
  0 siblings, 0 replies; 31+ messages in thread
From: Zhao Liu @ 2024-10-30  2:56 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, tao1.su, xiaoyao.li

On Tue, Oct 29, 2024 at 04:18:51PM +0100, Paolo Bonzini wrote:
> Date: Tue, 29 Oct 2024 16:18:51 +0100
> From: Paolo Bonzini <pbonzini@redhat.com>
> Subject: [PATCH 1/8] target/i386: cpu: set correct supported XCR0 features
>  for TCG
> X-Mailer: git-send-email 2.47.0
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  target/i386/cpu.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/8] target/i386: add AVX10 feature and AVX10 version property
  2024-10-29 15:18 ` [PATCH 4/8] target/i386: add AVX10 feature and AVX10 version property Paolo Bonzini
@ 2024-10-30  3:05   ` Tao Su
  2024-10-30  8:09     ` Zhao Liu
  2024-10-30  8:44   ` Zhao Liu
  1 sibling, 1 reply; 31+ messages in thread
From: Tao Su @ 2024-10-30  3:05 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, zhao1.liu, xiaoyao.li

On Tue, Oct 29, 2024 at 04:18:54PM +0100, Paolo Bonzini wrote:
> From: Tao Su <tao1.su@linux.intel.com>

[ ... ]

>  static void max_x86_cpu_realize(DeviceState *dev, Error **errp)
>  {
>      Object *obj = OBJECT(dev);
> +    X86CPU *cpu = X86_CPU(dev);
> +    CPUX86State *env = &cpu->env;
>  
>      if (!object_property_get_int(obj, "family", &error_abort)) {
>          if (X86_CPU(obj)->env.features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM) {
> @@ -5351,6 +5357,14 @@ static void max_x86_cpu_realize(DeviceState *dev, Error **errp)
>          }
>      }
>  
> +    if ((env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10) && !env->avx10_version) {

CPUID_7_1_EDX_AVX10 is not set now and will be set in x86_cpu_realizefn().
How about just checking !env->avx10_version? Because cpu_x86_cpuid will
also check (env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10).

> +        uint32_t eax, ebx, ecx, edx;
> +        x86_cpu_get_supported_cpuid(0x24, 0,
> +                                    &eax, &ebx, &ecx, &edx);
> +
> +        env->avx10_version = ebx & 0xff;
> +    }
> +
>      x86_cpu_realizefn(dev, errp);
>  }
>  
> @@ -6331,6 +6345,9 @@ static void x86_cpu_load_model(X86CPU *cpu, X86CPUModel *model)
>       */
>      object_property_set_str(OBJECT(cpu), "vendor", def->vendor, &error_abort);
>  
> +    object_property_set_int(OBJECT(cpu), "avx10-version", def->avx10_version,
> +                            &error_abort);

NIT: avx10-version is defined as UINT8, is it better to use object_property_set_uint?

> +
>      x86_cpu_apply_version_props(cpu, model);
>  
>      /*
> @@ -6859,6 +6876,16 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>          }
>          break;
>      }
> +    case 0x24: {
> +        *eax = 0;
> +        *ebx = 0;
> +        *ecx = 0;
> +        *edx = 0;
> +        if (env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10) {
> +            *ebx = env->avx10_version;
> +        }

The CPUIDs of vector lengths are missing here, according to your previous
comment, it should be:

+        if ((env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10) && count == 0) {
+            *ebx = env->features[FEAT_24_0_EBX] | env->avx10_version;
+        }

[ ... ]


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/8] Add AVX10.1 CPUID support and GraniteRapids-v2 model
  2024-10-29 15:18 [PATCH v2 0/8] Add AVX10.1 CPUID support and GraniteRapids-v2 model Paolo Bonzini
                   ` (7 preceding siblings ...)
  2024-10-29 15:18 ` [PATCH 8/8] target/i386: Introduce GraniteRapids-v2 model Paolo Bonzini
@ 2024-10-30  3:18 ` Tao Su
  2024-10-30  8:35   ` Paolo Bonzini
  8 siblings, 1 reply; 31+ messages in thread
From: Tao Su @ 2024-10-30  3:18 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, zhao1.liu, xiaoyao.li

On Tue, Oct 29, 2024 at 04:18:50PM +0100, Paolo Bonzini wrote:
> Most of the patches here are from Tao Su's v1.  The main issue in his
> version were two:
> 
> - overlooking kvm_cpu_xsave_init(), which currently looks at ExtSaveArea.
>   This would get a bit ugly for extended save states that are enabled
>   by both AVX512 and AVX10.  Patches 1-2 change kvm_cpu_xsave_init()
>   to look at ExtSaveArea's size field instead of testing features.
> 
> - downgrading silently to KVM reported value if the avx10_version property
>   is >= kvm reported value.  Xiaoyao Li suggested basing this on
>   cpu->check_cpuid and cpu->enforce_cpuid.  Also, the check must accept
>   any accelerator and not just KVM.  I moved the check to
>   x86_cpu_filter_features in patch 4.
> 
> I don't have a Granite Rapids machine, so please test! :)

I test it on Granite Rapids and all meet expection with my minor changes on
patch4 :)

> 
> Paolo
> 
> Paolo Bonzini (3):
>   target/i386: cpu: set correct supported XCR0 features for TCG
>   target/i386: do not rely on ExtSaveArea for accelerator-supported XCR0 bits
>   target/i386: return bool from x86_cpu_filter_features
> 
> Tao Su (5):
>   target/i386: add AVX10 feature and AVX10 version property
>   target/i386: add CPUID.24 features for AVX10
>   target/i386: Add feature dependencies for AVX10
>   target/i386: Add AVX512 state when AVX10 is supported
>   target/i386: Introduce GraniteRapids-v2 model
> 
>  target/i386/cpu.h         |  16 ++++
>  target/i386/cpu.c         | 175 ++++++++++++++++++++++++++++++++++----
>  target/i386/kvm/kvm-cpu.c |   8 --
>  target/i386/kvm/kvm.c     |   3 +-
>  4 files changed, 175 insertions(+), 27 deletions(-)
> 
> -- 
> 2.47.0
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/8] target/i386: do not rely on ExtSaveArea for accelerator-supported XCR0 bits
  2024-10-29 15:18 ` [PATCH 2/8] target/i386: do not rely on ExtSaveArea for accelerator-supported XCR0 bits Paolo Bonzini
@ 2024-10-30  3:50   ` Zhao Liu
  0 siblings, 0 replies; 31+ messages in thread
From: Zhao Liu @ 2024-10-30  3:50 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, tao1.su, xiaoyao.li

On Tue, Oct 29, 2024 at 04:18:52PM +0100, Paolo Bonzini wrote:
> Date: Tue, 29 Oct 2024 16:18:52 +0100
> From: Paolo Bonzini <pbonzini@redhat.com>
> Subject: [PATCH 2/8] target/i386: do not rely on ExtSaveArea for
>  accelerator-supported XCR0 bits
> X-Mailer: git-send-email 2.47.0
> 
> Right now, QEMU is using the "feature" and "bits" fields of ExtSaveArea
> to query the accelerator for the support status of extended save areas.
> This is a problem for AVX10, which attaches two feature bits (AVX512F
> and AVX10) to the same extended save states.
> 
> To keep the AVX10 hacks to the minimum, limit usage of esa->features
> and esa->bits.  Instead, just query the accelerator for the 0xD leaf.
> Do it in common code and clear esa->size if an extended save state is
> unsupported.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  target/i386/cpu.c         | 33 +++++++++++++++++++++++++++++++--
>  target/i386/kvm/kvm-cpu.c |  4 ----
>  2 files changed, 31 insertions(+), 6 deletions(-)

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/8] target/i386: return bool from x86_cpu_filter_features
  2024-10-29 15:18 ` [PATCH 3/8] target/i386: return bool from x86_cpu_filter_features Paolo Bonzini
@ 2024-10-30  5:19   ` Zhao Liu
  0 siblings, 0 replies; 31+ messages in thread
From: Zhao Liu @ 2024-10-30  5:19 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, tao1.su, xiaoyao.li

[snip]

> @@ -7558,7 +7558,7 @@ void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
>   *
>   * Returns: 0 if all flags are supported by the host, non-zero otherwise.

Comment can be updated as well. :-)

Returns: true if any flag is not supported by the host, false otherwise.

>   */
> -static void x86_cpu_filter_features(X86CPU *cpu, bool verbose)
> +static bool x86_cpu_filter_features(X86CPU *cpu, bool verbose)
>  {
>      CPUX86State *env = &cpu->env;
>      FeatureWord w;
> @@ -7610,6 +7610,8 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool verbose)
>              mark_unavailable_features(cpu, FEAT_7_0_EBX, CPUID_7_0_EBX_INTEL_PT, prefix);
>          }
>      }
> +
> +    return x86_cpu_have_filtered_features(cpu);
>  }
>  
>  static void x86_cpu_hyperv_realize(X86CPU *cpu)
> @@ -7707,14 +7709,14 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
>          }
>      }
>  
> -    x86_cpu_filter_features(cpu, cpu->check_cpuid || cpu->enforce_cpuid);
> -
> -    if (cpu->enforce_cpuid && x86_cpu_have_filtered_features(cpu)) {
> -        error_setg(&local_err,
> -                   accel_uses_host_cpuid() ?
> +    if (x86_cpu_filter_features(cpu, cpu->check_cpuid || cpu->enforce_cpuid)) {
> +        if (cpu->enforce_cpuid) {
> +            error_setg(&local_err,

It seems that we don't need local_err here, as this function already has
an errp parameter. I will clean up the error handling of this function
later.

> +                       accel_uses_host_cpuid() ?
>                         "Host doesn't support requested features" :
>                         "TCG doesn't support requested features");
> -        goto out;
> +            goto out;
> +        }
>      }

LGTM,

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/8] target/i386: add AVX10 feature and AVX10 version property
  2024-10-30  3:05   ` Tao Su
@ 2024-10-30  8:09     ` Zhao Liu
  0 siblings, 0 replies; 31+ messages in thread
From: Zhao Liu @ 2024-10-30  8:09 UTC (permalink / raw)
  To: Tao Su; +Cc: Paolo Bonzini, qemu-devel, xiaoyao.li

On Wed, Oct 30, 2024 at 11:05:23AM +0800, Tao Su wrote:
> Date: Wed, 30 Oct 2024 11:05:23 +0800
> From: Tao Su <tao1.su@linux.intel.com>
> Subject: Re: [PATCH 4/8] target/i386: add AVX10 feature and AVX10 version
>  property
> 
> On Tue, Oct 29, 2024 at 04:18:54PM +0100, Paolo Bonzini wrote:
> > From: Tao Su <tao1.su@linux.intel.com>
> 
> [ ... ]
> 
> >  static void max_x86_cpu_realize(DeviceState *dev, Error **errp)
> >  {
> >      Object *obj = OBJECT(dev);
> > +    X86CPU *cpu = X86_CPU(dev);
> > +    CPUX86State *env = &cpu->env;
> >  
> >      if (!object_property_get_int(obj, "family", &error_abort)) {
> >          if (X86_CPU(obj)->env.features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM) {
> > @@ -5351,6 +5357,14 @@ static void max_x86_cpu_realize(DeviceState *dev, Error **errp)
> >          }
> >      }
> >  
> > +    if ((env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10) && !env->avx10_version) {
> 
> CPUID_7_1_EDX_AVX10 is not set now and will be set in x86_cpu_realizefn().
> How about just checking !env->avx10_version? Because cpu_x86_cpuid will
> also check (env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10).

If you explicitly turn on avx10 via -cpu max,+avx10, then CPUID_7_1_EDX_AVX10
will be there. But I agree, this is not a good place to check avx10 and
avx10_version.



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/8] Add AVX10.1 CPUID support and GraniteRapids-v2 model
  2024-10-30  3:18 ` [PATCH v2 0/8] Add AVX10.1 CPUID support and " Tao Su
@ 2024-10-30  8:35   ` Paolo Bonzini
  2024-10-30  8:52     ` Tao Su
  0 siblings, 1 reply; 31+ messages in thread
From: Paolo Bonzini @ 2024-10-30  8:35 UTC (permalink / raw)
  To: Tao Su; +Cc: qemu-devel, Zhao Liu, Xiaoyao Li

[-- Attachment #1: Type: text/plain, Size: 1103 bytes --]

Il mer 30 ott 2024, 04:23 Tao Su <tao1.su@linux.intel.com> ha scritto:

> > I don't have a Granite Rapids machine, so please test! :)
>
> I test it on Granite Rapids and all meet expection with my minor changes on
> patch4 :)
>

Thanks, can you send v3?

Paolo

>
> > Paolo
> >
> > Paolo Bonzini (3):
> >   target/i386: cpu: set correct supported XCR0 features for TCG
> >   target/i386: do not rely on ExtSaveArea for accelerator-supported XCR0
> bits
> >   target/i386: return bool from x86_cpu_filter_features
> >
> > Tao Su (5):
> >   target/i386: add AVX10 feature and AVX10 version property
> >   target/i386: add CPUID.24 features for AVX10
> >   target/i386: Add feature dependencies for AVX10
> >   target/i386: Add AVX512 state when AVX10 is supported
> >   target/i386: Introduce GraniteRapids-v2 model
> >
> >  target/i386/cpu.h         |  16 ++++
> >  target/i386/cpu.c         | 175 ++++++++++++++++++++++++++++++++++----
> >  target/i386/kvm/kvm-cpu.c |   8 --
> >  target/i386/kvm/kvm.c     |   3 +-
> >  4 files changed, 175 insertions(+), 27 deletions(-)
> >
> > --
> > 2.47.0
> >
>
>

[-- Attachment #2: Type: text/html, Size: 1916 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/8] target/i386: add AVX10 feature and AVX10 version property
  2024-10-29 15:18 ` [PATCH 4/8] target/i386: add AVX10 feature and AVX10 version property Paolo Bonzini
  2024-10-30  3:05   ` Tao Su
@ 2024-10-30  8:44   ` Zhao Liu
  2024-10-30  9:37     ` Tao Su
  1 sibling, 1 reply; 31+ messages in thread
From: Zhao Liu @ 2024-10-30  8:44 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, tao1.su, xiaoyao.li

On Tue, Oct 29, 2024 at 04:18:54PM +0100, Paolo Bonzini wrote:
> Date: Tue, 29 Oct 2024 16:18:54 +0100
> From: Paolo Bonzini <pbonzini@redhat.com>
> Subject: [PATCH 4/8] target/i386: add AVX10 feature and AVX10 version
>  property
> X-Mailer: git-send-email 2.47.0
> 
> From: Tao Su <tao1.su@linux.intel.com>
> 
> When AVX10 enable bit is set, the 0x24 leaf will be present as "AVX10
> Converged Vector ISA leaf" containing fields for the version number and
> the supported vector bit lengths.
> 
> Introduce avx10-version property so that avx10 version can be controlled
> by user and cpu model. Per spec, avx10 version can never be 0, the default
> value of avx10-version is set to 0 to determine whether it is specified by
> user.

The default value of 0 does not reflect whether the user has set it to 0.
According to the description here, the spec clearly prohibits 0, so
should we report an error when the user sets it to 0?

If so, it might be better to change the default value to -1 and adjust
based on the host's support.

> The default can come from the device model or, for the max model,
> from KVM's reported value.
> 
> Signed-off-by: Tao Su <tao1.su@linux.intel.com>
> Link: https://lore.kernel.org/r/20241028024512.156724-3-tao1.su@linux.intel.com
> Link: https://lore.kernel.org/r/20241028024512.156724-4-tao1.su@linux.intel.com
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  target/i386/cpu.h     |  4 +++
>  target/i386/cpu.c     | 64 ++++++++++++++++++++++++++++++++++++++-----
>  target/i386/kvm/kvm.c |  3 +-
>  3 files changed, 63 insertions(+), 8 deletions(-)

[snip]

> @@ -7611,7 +7644,23 @@ static bool x86_cpu_filter_features(X86CPU *cpu, bool verbose)
>          }
>      }
>  
> -    return x86_cpu_have_filtered_features(cpu);
> +    have_filtered_features = x86_cpu_have_filtered_features(cpu);
> +
> +    if (env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10) {
> +        x86_cpu_get_supported_cpuid(0x24, 0,
> +                                    &eax_0, &ebx_0, &ecx_0, &edx_0);
> +        uint8_t version = ebx_0 & 0xff;
> +
> +        if (version < env->avx10_version) {
> +            if (prefix) {
> +                warn_report("%s: avx10.%d", prefix, env->avx10_version);

Perhaps also tell user about revised version?

warn_report("%s: avx10.%d. Adjust to avx10.%d",
            prefix, env->avx10_version, version);

> +            }
> +            env->avx10_version = version;
> +            have_filtered_features = true;
> +        }
> +    }


Per Tao's comment, perhaps we can check AVX10 and version here (default
version is 0):

@@ -7674,13 +7682,21 @@ static bool x86_cpu_filter_features(X86CPU *cpu, bool verbose)
                                     &eax_0, &ebx_0, &ecx_0, &edx_0);
         uint8_t version = ebx_0 & 0xff;

-        if (version < env->avx10_version) {
+        if (!env->avx10_version) {
+            env->avx10_version = version;
+        } else (version < env->avx10_version) {
             if (prefix) {
-                warn_report("%s: avx10.%d", prefix, env->avx10_version);
+                warn_report("%s: avx10.%d. Adjust to avx10.%d",
+                            prefix, env->avx10_version, version);
             }
             env->avx10_version = version;
             have_filtered_features = true;
         }
+    } else if (env->avx10_version && prefix) {
+        if (prefix) {
+            warn_report("%s: avx10.%d.", prefix, env->avx10_version);
+        }
+        have_filtered_features = true;
     }

     return have_filtered_features;

> +    return have_filtered_features;
>  }
>  
>  static void x86_cpu_hyperv_realize(X86CPU *cpu)
> @@ -8395,6 +8444,7 @@ static Property x86_cpu_properties[] = {
>      DEFINE_PROP_UINT32("min-level", X86CPU, env.cpuid_min_level, 0),
>      DEFINE_PROP_UINT32("min-xlevel", X86CPU, env.cpuid_min_xlevel, 0),
>      DEFINE_PROP_UINT32("min-xlevel2", X86CPU, env.cpuid_min_xlevel2, 0),
> +    DEFINE_PROP_UINT8("avx10-version", X86CPU, env.avx10_version, 0),

As my first comment, we could consider changing the default value to -1.

Thanks,
Zhao



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/8] target/i386: add CPUID.24 features for AVX10
  2024-10-29 15:18 ` [PATCH 5/8] target/i386: add CPUID.24 features for AVX10 Paolo Bonzini
@ 2024-10-30  8:50   ` Zhao Liu
  0 siblings, 0 replies; 31+ messages in thread
From: Zhao Liu @ 2024-10-30  8:50 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, tao1.su, xiaoyao.li

On Tue, Oct 29, 2024 at 04:18:55PM +0100, Paolo Bonzini wrote:
> Date: Tue, 29 Oct 2024 16:18:55 +0100
> From: Paolo Bonzini <pbonzini@redhat.com>
> Subject: [PATCH 5/8] target/i386: add CPUID.24 features for AVX10
> X-Mailer: git-send-email 2.47.0
> 
> From: Tao Su <tao1.su@linux.intel.com>
> 
> Introduce features for the supported vector bit lengths.
> 
> Signed-off-by: Tao Su <tao1.su@linux.intel.com>
> Link: https://lore.kernel.org/r/20241028024512.156724-3-tao1.su@linux.intel.com
> Link: https://lore.kernel.org/r/20241028024512.156724-4-tao1.su@linux.intel.com

You missed your Signed-of.

> ---
>  target/i386/cpu.h |  8 ++++++++
>  target/i386/cpu.c | 15 +++++++++++++++
>  2 files changed, 23 insertions(+)
> 

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/8] Add AVX10.1 CPUID support and GraniteRapids-v2 model
  2024-10-30  8:35   ` Paolo Bonzini
@ 2024-10-30  8:52     ` Tao Su
  0 siblings, 0 replies; 31+ messages in thread
From: Tao Su @ 2024-10-30  8:52 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, Zhao Liu, Xiaoyao Li

On Wed, Oct 30, 2024 at 09:35:43AM +0100, Paolo Bonzini wrote:
> Il mer 30 ott 2024, 04:23 Tao Su <tao1.su@linux.intel.com> ha scritto:
> 
> > > I don't have a Granite Rapids machine, so please test! :)
> >
> > I test it on Granite Rapids and all meet expection with my minor changes on
> > patch4 :)
> >
> 
> Thanks, can you send v3?

Sure, I will.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 7/8] target/i386: Add AVX512 state when AVX10 is supported
  2024-10-29 15:18 ` [PATCH 7/8] target/i386: Add AVX512 state when AVX10 is supported Paolo Bonzini
  2024-10-29 20:11   ` Paolo Bonzini
@ 2024-10-30  8:54   ` Zhao Liu
  1 sibling, 0 replies; 31+ messages in thread
From: Zhao Liu @ 2024-10-30  8:54 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, tao1.su, xiaoyao.li

On Tue, Oct 29, 2024 at 04:18:57PM +0100, Paolo Bonzini wrote:
> Date: Tue, 29 Oct 2024 16:18:57 +0100
> From: Paolo Bonzini <pbonzini@redhat.com>
> Subject: [PATCH 7/8] target/i386: Add AVX512 state when AVX10 is supported
> X-Mailer: git-send-email 2.47.0
> 
> From: Tao Su <tao1.su@linux.intel.com>
> 
> AVX10 state enumeration in CPUID leaf D and enabling in XCR0 register
> are identical to AVX512 state regardless of the supported vector lengths.
> 
> Given that some E-cores will support AVX10 but not support AVX512, add
> AVX512 state components to guest when AVX10 is enabled.
> 
> Based on a patch by Tao Su <tao1.su@linux.intel.com>
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  target/i386/cpu.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/8] target/i386: add AVX10 feature and AVX10 version property
  2024-10-30  8:44   ` Zhao Liu
@ 2024-10-30  9:37     ` Tao Su
  2024-10-30 13:21       ` Zhao Liu
  0 siblings, 1 reply; 31+ messages in thread
From: Tao Su @ 2024-10-30  9:37 UTC (permalink / raw)
  To: Zhao Liu; +Cc: Paolo Bonzini, qemu-devel, xiaoyao.li

On Wed, Oct 30, 2024 at 04:44:54PM +0800, Zhao Liu wrote:
> On Tue, Oct 29, 2024 at 04:18:54PM +0100, Paolo Bonzini wrote:
> > Date: Tue, 29 Oct 2024 16:18:54 +0100
> > From: Paolo Bonzini <pbonzini@redhat.com>
> > Subject: [PATCH 4/8] target/i386: add AVX10 feature and AVX10 version
> >  property
> > X-Mailer: git-send-email 2.47.0
> > 
> > From: Tao Su <tao1.su@linux.intel.com>
> > 
> > When AVX10 enable bit is set, the 0x24 leaf will be present as "AVX10
> > Converged Vector ISA leaf" containing fields for the version number and
> > the supported vector bit lengths.
> > 
> > Introduce avx10-version property so that avx10 version can be controlled
> > by user and cpu model. Per spec, avx10 version can never be 0, the default
> > value of avx10-version is set to 0 to determine whether it is specified by
> > user.
> 
> The default value of 0 does not reflect whether the user has set it to 0.
> According to the description here, the spec clearly prohibits 0, so
> should we report an error when the user sets it to 0?
> 
> If so, it might be better to change the default value to -1 and adjust
> based on the host's support.
> 

If user sets version to 0, it will directly use reported version, this
should be a more neat and intuitive way?

> > The default can come from the device model or, for the max model,
> > from KVM's reported value.
> > 
> > Signed-off-by: Tao Su <tao1.su@linux.intel.com>
> > Link: https://lore.kernel.org/r/20241028024512.156724-3-tao1.su@linux.intel.com
> > Link: https://lore.kernel.org/r/20241028024512.156724-4-tao1.su@linux.intel.com
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > ---
> >  target/i386/cpu.h     |  4 +++
> >  target/i386/cpu.c     | 64 ++++++++++++++++++++++++++++++++++++++-----
> >  target/i386/kvm/kvm.c |  3 +-
> >  3 files changed, 63 insertions(+), 8 deletions(-)
> 
> [snip]
> 
> > @@ -7611,7 +7644,23 @@ static bool x86_cpu_filter_features(X86CPU *cpu, bool verbose)
> >          }
> >      }
> >  
> > -    return x86_cpu_have_filtered_features(cpu);
> > +    have_filtered_features = x86_cpu_have_filtered_features(cpu);
> > +
> > +    if (env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10) {
> > +        x86_cpu_get_supported_cpuid(0x24, 0,
> > +                                    &eax_0, &ebx_0, &ecx_0, &edx_0);
> > +        uint8_t version = ebx_0 & 0xff;
> > +
> > +        if (version < env->avx10_version) {
> > +            if (prefix) {
> > +                warn_report("%s: avx10.%d", prefix, env->avx10_version);
> 
> Perhaps also tell user about revised version?
> 
> warn_report("%s: avx10.%d. Adjust to avx10.%d",
>             prefix, env->avx10_version, version);
> 

I see, thanks!

> > +            }
> > +            env->avx10_version = version;
> > +            have_filtered_features = true;
> > +        }
> > +    }
> 
> 
> Per Tao's comment, perhaps we can check AVX10 and version here (default
> version is 0):
> 
> @@ -7674,13 +7682,21 @@ static bool x86_cpu_filter_features(X86CPU *cpu, bool verbose)
>                                      &eax_0, &ebx_0, &ecx_0, &edx_0);
>          uint8_t version = ebx_0 & 0xff;
> 
> -        if (version < env->avx10_version) {
> +        if (!env->avx10_version) {
> +            env->avx10_version = version;

x86_cpu_filter_features() is not a good place to assign avx10_version, I
still tend to set it in max_x86_cpu_realize().

> +        } else (version < env->avx10_version) {
>              if (prefix) {
> -                warn_report("%s: avx10.%d", prefix, env->avx10_version);
> +                warn_report("%s: avx10.%d. Adjust to avx10.%d",
> +                            prefix, env->avx10_version, version);
>              }
>              env->avx10_version = version;
>              have_filtered_features = true;
>          }
> +    } else if (env->avx10_version && prefix) {
> +        if (prefix) {

I think it is reasonable, especially when we don't check AVX10 enable bit
in max_x86_cpu_realize(). But checking prefix here again seems not
necessary.

> +            warn_report("%s: avx10.%d.", prefix, env->avx10_version);
> +        }
> +        have_filtered_features = true;
>      }
> 
>      return have_filtered_features;
> 
> > +    return have_filtered_features;
> >  }
> >  
> >  static void x86_cpu_hyperv_realize(X86CPU *cpu)
> > @@ -8395,6 +8444,7 @@ static Property x86_cpu_properties[] = {
> >      DEFINE_PROP_UINT32("min-level", X86CPU, env.cpuid_min_level, 0),
> >      DEFINE_PROP_UINT32("min-xlevel", X86CPU, env.cpuid_min_xlevel, 0),
> >      DEFINE_PROP_UINT32("min-xlevel2", X86CPU, env.cpuid_min_xlevel2, 0),
> > +    DEFINE_PROP_UINT8("avx10-version", X86CPU, env.avx10_version, 0),
> 
> As my first comment, we could consider changing the default value to -1.
> 

I still think 0 is better…



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/8] target/i386: add AVX10 feature and AVX10 version property
  2024-10-30  9:37     ` Tao Su
@ 2024-10-30 13:21       ` Zhao Liu
  2024-10-30 14:05         ` Tao Su
  0 siblings, 1 reply; 31+ messages in thread
From: Zhao Liu @ 2024-10-30 13:21 UTC (permalink / raw)
  To: Tao Su; +Cc: Paolo Bonzini, qemu-devel, xiaoyao.li

> > > Introduce avx10-version property so that avx10 version can be controlled
> > > by user and cpu model. Per spec, avx10 version can never be 0, the default
> > > value of avx10-version is set to 0 to determine whether it is specified by
> > > user.
> > 
> > The default value of 0 does not reflect whether the user has set it to 0.
> > According to the description here, the spec clearly prohibits 0, so
> > should we report an error when the user sets it to 0?
> > 
> > If so, it might be better to change the default value to -1 and adjust
> > based on the host's support.
> > 
> 
> If user sets version to 0, it will directly use reported version, this
> should be a more neat and intuitive way?

The code implementation is actually similar for different initial
values. And about this:

> If user sets version to 0, it will directly use reported version", 

It's defining a special behavior for the API, which is based on the
special 0 value, and there needs to be documentation to let the user
know that 0 will be considered legal as well as that it will be quietly
overridden... But AFAIK there doesn't seem to be any place to add
documentation for the property ...

There may be similar problems with -1, e.g. if the user writes -1, there
is no way to report an error for the user's behavior. But it's better
than 0. After all, no one would think that a version of -1 is correct.
Topology IDs have been initialized to -1 to include the user's 0 value
in the check.

> > > The default can come from the device model or, for the max model,
> > > from KVM's reported value.
> > > 
> > > Signed-off-by: Tao Su <tao1.su@linux.intel.com>
> > > Link: https://lore.kernel.org/r/20241028024512.156724-3-tao1.su@linux.intel.com
> > > Link: https://lore.kernel.org/r/20241028024512.156724-4-tao1.su@linux.intel.com
> > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > > ---
> > >  target/i386/cpu.h     |  4 +++
> > >  target/i386/cpu.c     | 64 ++++++++++++++++++++++++++++++++++++++-----
> > >  target/i386/kvm/kvm.c |  3 +-
> > >  3 files changed, 63 insertions(+), 8 deletions(-)
> > 
> > [snip]
> > 
> > > @@ -7611,7 +7644,23 @@ static bool x86_cpu_filter_features(X86CPU *cpu, bool verbose)
> > >          }
> > >      }
> > >  
> > > -    return x86_cpu_have_filtered_features(cpu);
> > > +    have_filtered_features = x86_cpu_have_filtered_features(cpu);
> > > +
> > > +    if (env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10) {
> > > +        x86_cpu_get_supported_cpuid(0x24, 0,
> > > +                                    &eax_0, &ebx_0, &ecx_0, &edx_0);
> > > +        uint8_t version = ebx_0 & 0xff;
> > > +
> > > +        if (version < env->avx10_version) {
> > > +            if (prefix) {
> > > +                warn_report("%s: avx10.%d", prefix, env->avx10_version);
> > 
> > Perhaps also tell user about revised version?
> > 
> > warn_report("%s: avx10.%d. Adjust to avx10.%d",
> >             prefix, env->avx10_version, version);
> > 
> 
> I see, thanks!

Welcome!

> > > +            }
> > > +            env->avx10_version = version;
> > > +            have_filtered_features = true;
> > > +        }
> > > +    }
> > 
> > 
> > Per Tao's comment, perhaps we can check AVX10 and version here (default
> > version is 0):
> > 
> > @@ -7674,13 +7682,21 @@ static bool x86_cpu_filter_features(X86CPU *cpu, bool verbose)
> >                                      &eax_0, &ebx_0, &ecx_0, &edx_0);
> >          uint8_t version = ebx_0 & 0xff;
> > 
> > -        if (version < env->avx10_version) {
> > +        if (!env->avx10_version) {
> > +            env->avx10_version = version;
> 
> x86_cpu_filter_features() is not a good place to assign avx10_version, I
> still tend to set it in max_x86_cpu_realize().

It's not proper to get the host's version when AVX10 cannot be enabled,
even maybe host doesn't support AVX10.

As you found out earlier, max_x86_cpu_realize doesn't know if AVX10 can
be enabled or not.

> > +        } else (version < env->avx10_version) {
> >              if (prefix) {
> > -                warn_report("%s: avx10.%d", prefix, env->avx10_version);
> > +                warn_report("%s: avx10.%d. Adjust to avx10.%d",
> > +                            prefix, env->avx10_version, version);
> >              }
> >              env->avx10_version = version;
> >              have_filtered_features = true;
> >          }
> > +    } else if (env->avx10_version && prefix) {

Oops, here we only need to check env->avx10_version...

> > +        if (prefix) {
> 
> I think it is reasonable, especially when we don't check AVX10 enable bit
> in max_x86_cpu_realize(). But checking prefix here again seems not
> necessary.

Thanks! We only need the second check since have_filtered_features
should change whether prefix exists or not.

> > +            warn_report("%s: avx10.%d.", prefix, env->avx10_version);
> > +        }
> > +        have_filtered_features = true;
> >      }
> > 
> >      return have_filtered_features;
> > 
> > > +    return have_filtered_features;
> > >  }
> > >  
> > >  static void x86_cpu_hyperv_realize(X86CPU *cpu)
> > > @@ -8395,6 +8444,7 @@ static Property x86_cpu_properties[] = {
> > >      DEFINE_PROP_UINT32("min-level", X86CPU, env.cpuid_min_level, 0),
> > >      DEFINE_PROP_UINT32("min-xlevel", X86CPU, env.cpuid_min_xlevel, 0),
> > >      DEFINE_PROP_UINT32("min-xlevel2", X86CPU, env.cpuid_min_xlevel2, 0),
> > > +    DEFINE_PROP_UINT8("avx10-version", X86CPU, env.avx10_version, 0),
> > 
> > As my first comment, we could consider changing the default value to -1.
> > 
> 
> I still think 0 is better…
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/8] target/i386: add AVX10 feature and AVX10 version property
  2024-10-30 13:21       ` Zhao Liu
@ 2024-10-30 14:05         ` Tao Su
  2024-10-30 15:55           ` Zhao Liu
  0 siblings, 1 reply; 31+ messages in thread
From: Tao Su @ 2024-10-30 14:05 UTC (permalink / raw)
  To: Zhao Liu; +Cc: Paolo Bonzini, qemu-devel, xiaoyao.li

On Wed, Oct 30, 2024 at 09:21:36PM +0800, Zhao Liu wrote:
> > > > Introduce avx10-version property so that avx10 version can be controlled
> > > > by user and cpu model. Per spec, avx10 version can never be 0, the default
> > > > value of avx10-version is set to 0 to determine whether it is specified by
> > > > user.
> > > 
> > > The default value of 0 does not reflect whether the user has set it to 0.
> > > According to the description here, the spec clearly prohibits 0, so
> > > should we report an error when the user sets it to 0?
> > > 
> > > If so, it might be better to change the default value to -1 and adjust
> > > based on the host's support.
> > > 
> > 
> > If user sets version to 0, it will directly use reported version, this
> > should be a more neat and intuitive way?
> 
> The code implementation is actually similar for different initial
> values. And about this:
> 
> > If user sets version to 0, it will directly use reported version", 
> 
> It's defining a special behavior for the API, which is based on the
> special 0 value, and there needs to be documentation to let the user
> know that 0 will be considered legal as well as that it will be quietly
> overridden... But AFAIK there doesn't seem to be any place to add
> documentation for the property ...
> 
> There may be similar problems with -1, e.g. if the user writes -1, there
> is no way to report an error for the user's behavior. But it's better
> than 0. After all, no one would think that a version of -1 is correct.
> Topology IDs have been initialized to -1 to include the user's 0 value
> in the check.

Thanks for your explanation, but I really think the users who set
avx10-version should also know avx10.0 doesn’t exist, so using 0 is same
as -1…

To solve the initial value issue fundamentally, maybe we can add get/set
callbacks when adding avx10-version property? It should be simpler to
limit what users set.

[ ... ]

> > > @@ -7674,13 +7682,21 @@ static bool x86_cpu_filter_features(X86CPU *cpu, bool verbose)
> > >                                      &eax_0, &ebx_0, &ecx_0, &edx_0);
> > >          uint8_t version = ebx_0 & 0xff;
> > > 
> > > -        if (version < env->avx10_version) {
> > > +        if (!env->avx10_version) {
> > > +            env->avx10_version = version;
> > 
> > x86_cpu_filter_features() is not a good place to assign avx10_version, I
> > still tend to set it in max_x86_cpu_realize().
> 
> It's not proper to get the host's version when AVX10 cannot be enabled,
> even maybe host doesn't support AVX10.
> 
> As you found out earlier, max_x86_cpu_realize doesn't know if AVX10 can
> be enabled or not.
> 

How about moving to x86_cpu_expand_features()? We can set when checking
cpu->max_features.

[ ... ]


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/8] target/i386: add AVX10 feature and AVX10 version property
  2024-10-30 14:05         ` Tao Su
@ 2024-10-30 15:55           ` Zhao Liu
  2024-10-31  4:39             ` Tao Su
  0 siblings, 1 reply; 31+ messages in thread
From: Zhao Liu @ 2024-10-30 15:55 UTC (permalink / raw)
  To: Tao Su; +Cc: Paolo Bonzini, qemu-devel, xiaoyao.li

On Wed, Oct 30, 2024 at 10:05:51PM +0800, Tao Su wrote:
> Date: Wed, 30 Oct 2024 22:05:51 +0800
> From: Tao Su <tao1.su@linux.intel.com>
> Subject: Re: [PATCH 4/8] target/i386: add AVX10 feature and AVX10 version
>  property
> 
> On Wed, Oct 30, 2024 at 09:21:36PM +0800, Zhao Liu wrote:
> > > > > Introduce avx10-version property so that avx10 version can be controlled
> > > > > by user and cpu model. Per spec, avx10 version can never be 0, the default
> > > > > value of avx10-version is set to 0 to determine whether it is specified by
> > > > > user.
> > > > 
> > > > The default value of 0 does not reflect whether the user has set it to 0.
> > > > According to the description here, the spec clearly prohibits 0, so
> > > > should we report an error when the user sets it to 0?
> > > > 
> > > > If so, it might be better to change the default value to -1 and adjust
> > > > based on the host's support.
> > > > 
> > > 
> > > If user sets version to 0, it will directly use reported version, this
> > > should be a more neat and intuitive way?
> > 
> > The code implementation is actually similar for different initial
> > values. And about this:
> > 
> > > If user sets version to 0, it will directly use reported version", 
> > 
> > It's defining a special behavior for the API, which is based on the
> > special 0 value, and there needs to be documentation to let the user
> > know that 0 will be considered legal as well as that it will be quietly
> > overridden... But AFAIK there doesn't seem to be any place to add
> > documentation for the property ...
> > 
> > There may be similar problems with -1, e.g. if the user writes -1, there
> > is no way to report an error for the user's behavior. But it's better
> > than 0. After all, no one would think that a version of -1 is correct.
> > Topology IDs have been initialized to -1 to include the user's 0 value
> > in the check.
> 
> Thanks for your explanation, but I really think the users who set
> avx10-version should also know avx10.0 doesn’t exist, so using 0 is same
> as -1…

I see. "Per spec, avx10 version can never be 0", so showing the warning
for avx10-version=0 is as it should be.

> To solve the initial value issue fundamentally, maybe we can add get/set
> callbacks when adding avx10-version property? It should be simpler to
> limit what users set.

It's unnecessary. Similar cases using -1 are already common, such as for
APIC ID, NUMA node ID, topology IDs, etc. The initial value is -1 simply
because we need to handle the case where users explicitly set it to 0.
If you don’t want to see -1, you can define a macro like APIC ID did
(#define UNSET_AVX10_VERSION -1).

> > > > @@ -7674,13 +7682,21 @@ static bool x86_cpu_filter_features(X86CPU *cpu, bool verbose)
> > > >                                      &eax_0, &ebx_0, &ecx_0, &edx_0);
> > > >          uint8_t version = ebx_0 & 0xff;
> > > > 
> > > > -        if (version < env->avx10_version) {
> > > > +        if (!env->avx10_version) {
> > > > +            env->avx10_version = version;
> > > 
> > > x86_cpu_filter_features() is not a good place to assign avx10_version, I
> > > still tend to set it in max_x86_cpu_realize().
> > 
> > It's not proper to get the host's version when AVX10 cannot be enabled,
> > even maybe host doesn't support AVX10.
> > 
> > As you found out earlier, max_x86_cpu_realize doesn't know if AVX10 can
> > be enabled or not.
> > 
> 
> How about moving to x86_cpu_expand_features()? We can set when checking
> cpu->max_features.

The feature bit set in x86_cpu_expand_features() is unstable since it
may be masked later in x86_cpu_filter_features(). :)

Thanks,
Zhao



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/8] target/i386: add AVX10 feature and AVX10 version property
  2024-10-30 15:55           ` Zhao Liu
@ 2024-10-31  4:39             ` Tao Su
  2024-10-31  5:52               ` Xiaoyao Li
  2024-10-31  7:18               ` Zhao Liu
  0 siblings, 2 replies; 31+ messages in thread
From: Tao Su @ 2024-10-31  4:39 UTC (permalink / raw)
  To: Zhao Liu; +Cc: Paolo Bonzini, qemu-devel, xiaoyao.li

On Wed, Oct 30, 2024 at 11:55:34PM +0800, Zhao Liu wrote:
> On Wed, Oct 30, 2024 at 10:05:51PM +0800, Tao Su wrote:
> > Date: Wed, 30 Oct 2024 22:05:51 +0800
> > From: Tao Su <tao1.su@linux.intel.com>
> > Subject: Re: [PATCH 4/8] target/i386: add AVX10 feature and AVX10 version
> >  property
> > 
> > On Wed, Oct 30, 2024 at 09:21:36PM +0800, Zhao Liu wrote:
> > > > > > Introduce avx10-version property so that avx10 version can be controlled
> > > > > > by user and cpu model. Per spec, avx10 version can never be 0, the default
> > > > > > value of avx10-version is set to 0 to determine whether it is specified by
> > > > > > user.
> > > > > 
> > > > > The default value of 0 does not reflect whether the user has set it to 0.
> > > > > According to the description here, the spec clearly prohibits 0, so
> > > > > should we report an error when the user sets it to 0?
> > > > > 
> > > > > If so, it might be better to change the default value to -1 and adjust
> > > > > based on the host's support.
> > > > > 
> > > > 
> > > > If user sets version to 0, it will directly use reported version, this
> > > > should be a more neat and intuitive way?
> > > 
> > > The code implementation is actually similar for different initial
> > > values. And about this:
> > > 
> > > > If user sets version to 0, it will directly use reported version", 
> > > 
> > > It's defining a special behavior for the API, which is based on the
> > > special 0 value, and there needs to be documentation to let the user
> > > know that 0 will be considered legal as well as that it will be quietly
> > > overridden... But AFAIK there doesn't seem to be any place to add
> > > documentation for the property ...
> > > 
> > > There may be similar problems with -1, e.g. if the user writes -1, there
> > > is no way to report an error for the user's behavior. But it's better
> > > than 0. After all, no one would think that a version of -1 is correct.
> > > Topology IDs have been initialized to -1 to include the user's 0 value
> > > in the check.
> > 
> > Thanks for your explanation, but I really think the users who set
> > avx10-version should also know avx10.0 doesn’t exist, so using 0 is same
> > as -1…
> 
> I see. "Per spec, avx10 version can never be 0", so showing the warning
> for avx10-version=0 is as it should be.
> 
> > To solve the initial value issue fundamentally, maybe we can add get/set
> > callbacks when adding avx10-version property? It should be simpler to
> > limit what users set.
> 
> It's unnecessary. Similar cases using -1 are already common, such as for
> APIC ID, NUMA node ID, topology IDs, etc. The initial value is -1 simply
> because we need to handle the case where users explicitly set it to 0.
> If you don’t want to see -1, you can define a macro like APIC ID did
> (#define UNSET_AVX10_VERSION -1).
> 

OK, I will change the default value to -1.

> > > > > @@ -7674,13 +7682,21 @@ static bool x86_cpu_filter_features(X86CPU *cpu, bool verbose)
> > > > >                                      &eax_0, &ebx_0, &ecx_0, &edx_0);
> > > > >          uint8_t version = ebx_0 & 0xff;
> > > > > 
> > > > > -        if (version < env->avx10_version) {
> > > > > +        if (!env->avx10_version) {
> > > > > +            env->avx10_version = version;
> > > > 
> > > > x86_cpu_filter_features() is not a good place to assign avx10_version, I
> > > > still tend to set it in max_x86_cpu_realize().
> > > 
> > > It's not proper to get the host's version when AVX10 cannot be enabled,
> > > even maybe host doesn't support AVX10.
> > > 
> > > As you found out earlier, max_x86_cpu_realize doesn't know if AVX10 can
> > > be enabled or not.
> > > 
> > 
> > How about moving to x86_cpu_expand_features()? We can set when checking
> > cpu->max_features.
> 
> The feature bit set in x86_cpu_expand_features() is unstable since it
> may be masked later in x86_cpu_filter_features(). :)
> 

A lot of feature bits are set in x86_cpu_expand_features() with reported
value, so I think avx10_version can also be set to reported value there.

I mainly want to let avx10_version be assigned only when -cpu host or max,
so that it can be distinguished from the cpu model. This should also be
Paolo's original intention in v2.



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/8] target/i386: add AVX10 feature and AVX10 version property
  2024-10-31  4:39             ` Tao Su
@ 2024-10-31  5:52               ` Xiaoyao Li
  2024-10-31  6:07                 ` Tao Su
  2024-10-31  7:12                 ` Zhao Liu
  2024-10-31  7:18               ` Zhao Liu
  1 sibling, 2 replies; 31+ messages in thread
From: Xiaoyao Li @ 2024-10-31  5:52 UTC (permalink / raw)
  To: Tao Su, Zhao Liu; +Cc: Paolo Bonzini, qemu-devel

On 10/31/2024 12:39 PM, Tao Su wrote:
> On Wed, Oct 30, 2024 at 11:55:34PM +0800, Zhao Liu wrote:
>> On Wed, Oct 30, 2024 at 10:05:51PM +0800, Tao Su wrote:
>>> Date: Wed, 30 Oct 2024 22:05:51 +0800
>>> From: Tao Su <tao1.su@linux.intel.com>
>>> Subject: Re: [PATCH 4/8] target/i386: add AVX10 feature and AVX10 version
>>>   property
>>>
>>> On Wed, Oct 30, 2024 at 09:21:36PM +0800, Zhao Liu wrote:
>>>>>>> Introduce avx10-version property so that avx10 version can be controlled
>>>>>>> by user and cpu model. Per spec, avx10 version can never be 0, the default
>>>>>>> value of avx10-version is set to 0 to determine whether it is specified by
>>>>>>> user.
>>>>>>
>>>>>> The default value of 0 does not reflect whether the user has set it to 0.
>>>>>> According to the description here, the spec clearly prohibits 0, so
>>>>>> should we report an error when the user sets it to 0?
>>>>>>
>>>>>> If so, it might be better to change the default value to -1 and adjust
>>>>>> based on the host's support.
>>>>>>
>>>>>
>>>>> If user sets version to 0, it will directly use reported version, this
>>>>> should be a more neat and intuitive way?
>>>>
>>>> The code implementation is actually similar for different initial
>>>> values. And about this:
>>>>
>>>>> If user sets version to 0, it will directly use reported version",
>>>>
>>>> It's defining a special behavior for the API, which is based on the
>>>> special 0 value, and there needs to be documentation to let the user
>>>> know that 0 will be considered legal as well as that it will be quietly
>>>> overridden... But AFAIK there doesn't seem to be any place to add
>>>> documentation for the property ...
>>>>
>>>> There may be similar problems with -1, e.g. if the user writes -1, there
>>>> is no way to report an error for the user's behavior. But it's better
>>>> than 0. After all, no one would think that a version of -1 is correct.
>>>> Topology IDs have been initialized to -1 to include the user's 0 value
>>>> in the check.
>>>
>>> Thanks for your explanation, but I really think the users who set
>>> avx10-version should also know avx10.0 doesn’t exist, so using 0 is same
>>> as -1…
>>
>> I see. "Per spec, avx10 version can never be 0", so showing the warning
>> for avx10-version=0 is as it should be.
>>
>>> To solve the initial value issue fundamentally, maybe we can add get/set
>>> callbacks when adding avx10-version property? It should be simpler to
>>> limit what users set.
>>
>> It's unnecessary. Similar cases using -1 are already common, such as for
>> APIC ID, NUMA node ID, topology IDs, etc. The initial value is -1 simply
>> because we need to handle the case where users explicitly set it to 0.
>> If you don’t want to see -1, you can define a macro like APIC ID did
>> (#define UNSET_AVX10_VERSION -1).
>>
> 
> OK, I will change the default value to -1.

Then please remember to handle the issue like ...

>>>>>> @@ -7674,13 +7682,21 @@ static bool x86_cpu_filter_features(X86CPU *cpu, bool verbose)
>>>>>>                                       &eax_0, &ebx_0, &ecx_0, &edx_0);
>>>>>>           uint8_t version = ebx_0 & 0xff;
>>>>>>
>>>>>> -        if (version < env->avx10_version) {
>>>>>> +        if (!env->avx10_version) {
>>>>>> +            env->avx10_version = version;
>>>>>
>>>>> x86_cpu_filter_features() is not a good place to assign avx10_version, I
>>>>> still tend to set it in max_x86_cpu_realize().
>>>>
>>>> It's not proper to get the host's version when AVX10 cannot be enabled,
>>>> even maybe host doesn't support AVX10.
>>>>
>>>> As you found out earlier, max_x86_cpu_realize doesn't know if AVX10 can
>>>> be enabled or not.
>>>>
>>>
>>> How about moving to x86_cpu_expand_features()? We can set when checking
>>> cpu->max_features.
>>
>> The feature bit set in x86_cpu_expand_features() is unstable since it
>> may be masked later in x86_cpu_filter_features(). :)
>>
> 
> A lot of feature bits are set in x86_cpu_expand_features() with reported
> value, so I think avx10_version can also be set to reported value there.

I agree.

> I mainly want to let avx10_version be assigned only when -cpu host or max,
> so that it can be distinguished from the cpu model. This should also be
> Paolo's original intention in v2.

avx10_version needs to be assigned with a default valid value, when user 
enables avx10 explicitly without specifying avx10_version. It also 
applies to (existing) named cpu models other than GraniteRapids-v2 
(which is added by this series). E.g.,

	-cpu GraniteRapids-v1,+avx10

So if you are going to make default value as -1, then you need to add 
something in x86_cpu_load_model()

     if (!def->avx10_version) {
         def->avx10_version = -1;
     }





^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/8] target/i386: add AVX10 feature and AVX10 version property
  2024-10-31  5:52               ` Xiaoyao Li
@ 2024-10-31  6:07                 ` Tao Su
  2024-10-31  7:12                 ` Zhao Liu
  1 sibling, 0 replies; 31+ messages in thread
From: Tao Su @ 2024-10-31  6:07 UTC (permalink / raw)
  To: Xiaoyao Li; +Cc: Zhao Liu, Paolo Bonzini, qemu-devel

On Thu, Oct 31, 2024 at 01:52:24PM +0800, Xiaoyao Li wrote:

[ ... ]

> > I mainly want to let avx10_version be assigned only when -cpu host or max,
> > so that it can be distinguished from the cpu model. This should also be
> > Paolo's original intention in v2.
> 
> avx10_version needs to be assigned with a default valid value, when user
> enables avx10 explicitly without specifying avx10_version. It also applies
> to (existing) named cpu models other than GraniteRapids-v2 (which is added
> by this series). E.g.,
> 
> 	-cpu GraniteRapids-v1,+avx10
> 
> So if you are going to make default value as -1, then you need to add
> something in x86_cpu_load_model()
> 
>     if (!def->avx10_version) {
>         def->avx10_version = -1;
>     }

Good suggestion, thanks for the reminder!



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/8] target/i386: add AVX10 feature and AVX10 version property
  2024-10-31  5:52               ` Xiaoyao Li
  2024-10-31  6:07                 ` Tao Su
@ 2024-10-31  7:12                 ` Zhao Liu
  1 sibling, 0 replies; 31+ messages in thread
From: Zhao Liu @ 2024-10-31  7:12 UTC (permalink / raw)
  To: Xiaoyao Li; +Cc: Tao Su, Paolo Bonzini, qemu-devel

On Thu, Oct 31, 2024 at 01:52:24PM +0800, Xiaoyao Li wrote:
> Date: Thu, 31 Oct 2024 13:52:24 +0800
> From: Xiaoyao Li <xiaoyao.li@intel.com>
> Subject: Re: [PATCH 4/8] target/i386: add AVX10 feature and AVX10 version
>  property
> 
> On 10/31/2024 12:39 PM, Tao Su wrote:
> > On Wed, Oct 30, 2024 at 11:55:34PM +0800, Zhao Liu wrote:
> > > On Wed, Oct 30, 2024 at 10:05:51PM +0800, Tao Su wrote:
> > > > Date: Wed, 30 Oct 2024 22:05:51 +0800
> > > > From: Tao Su <tao1.su@linux.intel.com>
> > > > Subject: Re: [PATCH 4/8] target/i386: add AVX10 feature and AVX10 version
> > > >   property
> > > > 
> > > > On Wed, Oct 30, 2024 at 09:21:36PM +0800, Zhao Liu wrote:
> > > > > > > > Introduce avx10-version property so that avx10 version can be controlled
> > > > > > > > by user and cpu model. Per spec, avx10 version can never be 0, the default
> > > > > > > > value of avx10-version is set to 0 to determine whether it is specified by
> > > > > > > > user.
> > > > > > > 
> > > > > > > The default value of 0 does not reflect whether the user has set it to 0.
> > > > > > > According to the description here, the spec clearly prohibits 0, so
> > > > > > > should we report an error when the user sets it to 0?
> > > > > > > 
> > > > > > > If so, it might be better to change the default value to -1 and adjust
> > > > > > > based on the host's support.
> > > > > > > 
> > > > > > 
> > > > > > If user sets version to 0, it will directly use reported version, this
> > > > > > should be a more neat and intuitive way?
> > > > > 
> > > > > The code implementation is actually similar for different initial
> > > > > values. And about this:
> > > > > 
> > > > > > If user sets version to 0, it will directly use reported version",
> > > > > 
> > > > > It's defining a special behavior for the API, which is based on the
> > > > > special 0 value, and there needs to be documentation to let the user
> > > > > know that 0 will be considered legal as well as that it will be quietly
> > > > > overridden... But AFAIK there doesn't seem to be any place to add
> > > > > documentation for the property ...
> > > > > 
> > > > > There may be similar problems with -1, e.g. if the user writes -1, there
> > > > > is no way to report an error for the user's behavior. But it's better
> > > > > than 0. After all, no one would think that a version of -1 is correct.
> > > > > Topology IDs have been initialized to -1 to include the user's 0 value
> > > > > in the check.
> > > > 
> > > > Thanks for your explanation, but I really think the users who set
> > > > avx10-version should also know avx10.0 doesn’t exist, so using 0 is same
> > > > as -1…
> > > 
> > > I see. "Per spec, avx10 version can never be 0", so showing the warning
> > > for avx10-version=0 is as it should be.
> > > 
> > > > To solve the initial value issue fundamentally, maybe we can add get/set
> > > > callbacks when adding avx10-version property? It should be simpler to
> > > > limit what users set.
> > > 
> > > It's unnecessary. Similar cases using -1 are already common, such as for
> > > APIC ID, NUMA node ID, topology IDs, etc. The initial value is -1 simply
> > > because we need to handle the case where users explicitly set it to 0.
> > > If you don’t want to see -1, you can define a macro like APIC ID did
> > > (#define UNSET_AVX10_VERSION -1).
> > > 
> > 
> > OK, I will change the default value to -1.
> 
> Then please remember to handle the issue like ...
> 
> > > > > > > @@ -7674,13 +7682,21 @@ static bool x86_cpu_filter_features(X86CPU *cpu, bool verbose)
> > > > > > >                                       &eax_0, &ebx_0, &ecx_0, &edx_0);
> > > > > > >           uint8_t version = ebx_0 & 0xff;
> > > > > > > 
> > > > > > > -        if (version < env->avx10_version) {
> > > > > > > +        if (!env->avx10_version) {
> > > > > > > +            env->avx10_version = version;
> > > > > > 
> > > > > > x86_cpu_filter_features() is not a good place to assign avx10_version, I
> > > > > > still tend to set it in max_x86_cpu_realize().
> > > > > 
> > > > > It's not proper to get the host's version when AVX10 cannot be enabled,
> > > > > even maybe host doesn't support AVX10.
> > > > > 
> > > > > As you found out earlier, max_x86_cpu_realize doesn't know if AVX10 can
> > > > > be enabled or not.
> > > > > 
> > > > 
> > > > How about moving to x86_cpu_expand_features()? We can set when checking
> > > > cpu->max_features.
> > > 
> > > The feature bit set in x86_cpu_expand_features() is unstable since it
> > > may be masked later in x86_cpu_filter_features(). :)
> > > 
> > 
> > A lot of feature bits are set in x86_cpu_expand_features() with reported
> > value, so I think avx10_version can also be set to reported value there.
> 
> I agree.
> 
> > I mainly want to let avx10_version be assigned only when -cpu host or max,
> > so that it can be distinguished from the cpu model. This should also be
> > Paolo's original intention in v2.
> 
> avx10_version needs to be assigned with a default valid value, when user
> enables avx10 explicitly without specifying avx10_version. It also applies
> to (existing) named cpu models other than GraniteRapids-v2 (which is added
> by this series). E.g.,
> 
> 	-cpu GraniteRapids-v1,+avx10
> 
> So if you are going to make default value as -1, then you need to add
> something in x86_cpu_load_model()
> 
>     if (!def->avx10_version) {
>         def->avx10_version = -1;
>     }

Yes, this is because the model's field defaults to 0, and avx10-version
is set once when the model is loaded.

Such a check seems necessary, but it does make the code more redundant,
so I'm starting to agree with default 0. :)

Thanks,
Zhao





^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/8] target/i386: add AVX10 feature and AVX10 version property
  2024-10-31  4:39             ` Tao Su
  2024-10-31  5:52               ` Xiaoyao Li
@ 2024-10-31  7:18               ` Zhao Liu
  2024-10-31  7:19                 ` Tao Su
  1 sibling, 1 reply; 31+ messages in thread
From: Zhao Liu @ 2024-10-31  7:18 UTC (permalink / raw)
  To: Tao Su; +Cc: Paolo Bonzini, qemu-devel, xiaoyao.li

> > > > > > @@ -7674,13 +7682,21 @@ static bool x86_cpu_filter_features(X86CPU *cpu, bool verbose)
> > > > > >                                      &eax_0, &ebx_0, &ecx_0, &edx_0);
> > > > > >          uint8_t version = ebx_0 & 0xff;
> > > > > > 
> > > > > > -        if (version < env->avx10_version) {
> > > > > > +        if (!env->avx10_version) {
> > > > > > +            env->avx10_version = version;
> > > > > 
> > > > > x86_cpu_filter_features() is not a good place to assign avx10_version, I
> > > > > still tend to set it in max_x86_cpu_realize().
> > > > 
> > > > It's not proper to get the host's version when AVX10 cannot be enabled,
> > > > even maybe host doesn't support AVX10.
> > > > 
> > > > As you found out earlier, max_x86_cpu_realize doesn't know if AVX10 can
> > > > be enabled or not.
> > > > 
> > > 
> > > How about moving to x86_cpu_expand_features()? We can set when checking
> > > cpu->max_features.
> > 
> > The feature bit set in x86_cpu_expand_features() is unstable since it
> > may be masked later in x86_cpu_filter_features(). :)
> > 
> 
> A lot of feature bits are set in x86_cpu_expand_features() with reported
> value, so I think avx10_version can also be set to reported value there.
> 
> I mainly want to let avx10_version be assigned only when -cpu host or max,
> so that it can be distinguished from the cpu model. This should also be
> Paolo's original intention in v2.

OK. In this case, extend avx10-version is also consistent with the
semantics of this function. Even if host doesn't support avx10, then in
principle it's ok to read unimplemented avx10-version as 0.

Pls go ahead. :)

Thanks,
Zhao



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/8] target/i386: add AVX10 feature and AVX10 version property
  2024-10-31  7:18               ` Zhao Liu
@ 2024-10-31  7:19                 ` Tao Su
  0 siblings, 0 replies; 31+ messages in thread
From: Tao Su @ 2024-10-31  7:19 UTC (permalink / raw)
  To: Zhao Liu; +Cc: Paolo Bonzini, qemu-devel, xiaoyao.li

On Thu, Oct 31, 2024 at 03:18:37PM +0800, Zhao Liu wrote:
> > > > > > > @@ -7674,13 +7682,21 @@ static bool x86_cpu_filter_features(X86CPU *cpu, bool verbose)
> > > > > > >                                      &eax_0, &ebx_0, &ecx_0, &edx_0);
> > > > > > >          uint8_t version = ebx_0 & 0xff;
> > > > > > > 
> > > > > > > -        if (version < env->avx10_version) {
> > > > > > > +        if (!env->avx10_version) {
> > > > > > > +            env->avx10_version = version;
> > > > > > 
> > > > > > x86_cpu_filter_features() is not a good place to assign avx10_version, I
> > > > > > still tend to set it in max_x86_cpu_realize().
> > > > > 
> > > > > It's not proper to get the host's version when AVX10 cannot be enabled,
> > > > > even maybe host doesn't support AVX10.
> > > > > 
> > > > > As you found out earlier, max_x86_cpu_realize doesn't know if AVX10 can
> > > > > be enabled or not.
> > > > > 
> > > > 
> > > > How about moving to x86_cpu_expand_features()? We can set when checking
> > > > cpu->max_features.
> > > 
> > > The feature bit set in x86_cpu_expand_features() is unstable since it
> > > may be masked later in x86_cpu_filter_features(). :)
> > > 
> > 
> > A lot of feature bits are set in x86_cpu_expand_features() with reported
> > value, so I think avx10_version can also be set to reported value there.
> > 
> > I mainly want to let avx10_version be assigned only when -cpu host or max,
> > so that it can be distinguished from the cpu model. This should also be
> > Paolo's original intention in v2.
> 
> OK. In this case, extend avx10-version is also consistent with the
> semantics of this function. Even if host doesn't support avx10, then in
> principle it's ok to read unimplemented avx10-version as 0.
> 
> Pls go ahead. :)

I will submit v3 based on all your comments, thanks for review :)



^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2024-10-31  7:24 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-29 15:18 [PATCH v2 0/8] Add AVX10.1 CPUID support and GraniteRapids-v2 model Paolo Bonzini
2024-10-29 15:18 ` [PATCH 1/8] target/i386: cpu: set correct supported XCR0 features for TCG Paolo Bonzini
2024-10-30  2:56   ` Zhao Liu
2024-10-29 15:18 ` [PATCH 2/8] target/i386: do not rely on ExtSaveArea for accelerator-supported XCR0 bits Paolo Bonzini
2024-10-30  3:50   ` Zhao Liu
2024-10-29 15:18 ` [PATCH 3/8] target/i386: return bool from x86_cpu_filter_features Paolo Bonzini
2024-10-30  5:19   ` Zhao Liu
2024-10-29 15:18 ` [PATCH 4/8] target/i386: add AVX10 feature and AVX10 version property Paolo Bonzini
2024-10-30  3:05   ` Tao Su
2024-10-30  8:09     ` Zhao Liu
2024-10-30  8:44   ` Zhao Liu
2024-10-30  9:37     ` Tao Su
2024-10-30 13:21       ` Zhao Liu
2024-10-30 14:05         ` Tao Su
2024-10-30 15:55           ` Zhao Liu
2024-10-31  4:39             ` Tao Su
2024-10-31  5:52               ` Xiaoyao Li
2024-10-31  6:07                 ` Tao Su
2024-10-31  7:12                 ` Zhao Liu
2024-10-31  7:18               ` Zhao Liu
2024-10-31  7:19                 ` Tao Su
2024-10-29 15:18 ` [PATCH 5/8] target/i386: add CPUID.24 features for AVX10 Paolo Bonzini
2024-10-30  8:50   ` Zhao Liu
2024-10-29 15:18 ` [PATCH 6/8] target/i386: Add feature dependencies " Paolo Bonzini
2024-10-29 15:18 ` [PATCH 7/8] target/i386: Add AVX512 state when AVX10 is supported Paolo Bonzini
2024-10-29 20:11   ` Paolo Bonzini
2024-10-30  8:54   ` Zhao Liu
2024-10-29 15:18 ` [PATCH 8/8] target/i386: Introduce GraniteRapids-v2 model Paolo Bonzini
2024-10-30  3:18 ` [PATCH v2 0/8] Add AVX10.1 CPUID support and " Tao Su
2024-10-30  8:35   ` Paolo Bonzini
2024-10-30  8:52     ` Tao Su

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).