* [PATCH 0/6] Add AVX10.1 CPUID support and GraniteRapids-v2 model
@ 2024-10-28 2:45 Tao Su
2024-10-28 2:45 ` [PATCH 1/6] target/i386: Add AVX512 state when AVX10 is supported Tao Su
` (5 more replies)
0 siblings, 6 replies; 29+ messages in thread
From: Tao Su @ 2024-10-28 2:45 UTC (permalink / raw)
To: qemu-devel; +Cc: pbonzini, mtosatti, xiaoyao.li, xuelian.guo, tao1.su
Add AVX10.1 CPUID support, i.e. add AVX10 support bit via
CPUID.(EAX=07H, ECX=01H):EDX[bit 19] and new CPUID leaf 0x24H so that
guest OS and applications can query the AVX10 CPUIDs directly. The AVX10.1
spec can be found in [*], it is worth mentioning that
VL128 (CPUID.(EAX=24H, ECX=00H):EBX[bit 16]) was dropped in rev3.0, but it
will be added back and reserved as 1.
Since GraniteRapids (stepping 1) is the first platform to support AVX10,
introduce GraniteRapids-v2 CPU model to add AVX10 in this patch set, and
add some missing features as well.
[*] https://cdrdv2.intel.com/v1/dl/getContent/784267
---
Tao Su (6):
target/i386: Add AVX512 state when AVX10 is supported
target/i386: add avx10-version property
target/i386: Add CPUID.24 leaf for AVX10
target/i386: Add feature dependencies for AVX10
target/i386: Add support for AVX10 in CPUID enumeration
target/i386: Introduce GraniteRapids-v2 model
target/i386/cpu.c | 90 ++++++++++++++++++++++++++++++++++++++++++-
target/i386/cpu.h | 16 ++++++++
target/i386/kvm/kvm.c | 3 +-
3 files changed, 107 insertions(+), 2 deletions(-)
base-commit: cea8ac78545a83e1f01c94d89d6f5a3f6b5c05d2
--
2.34.1
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH 1/6] target/i386: Add AVX512 state when AVX10 is supported
2024-10-28 2:45 [PATCH 0/6] Add AVX10.1 CPUID support and GraniteRapids-v2 model Tao Su
@ 2024-10-28 2:45 ` Tao Su
2024-10-28 8:41 ` Paolo Bonzini
2024-10-28 15:12 ` Xiaoyao Li
2024-10-28 2:45 ` [PATCH 2/6] target/i386: add avx10-version property Tao Su
` (4 subsequent siblings)
5 siblings, 2 replies; 29+ messages in thread
From: Tao Su @ 2024-10-28 2:45 UTC (permalink / raw)
To: qemu-devel; +Cc: pbonzini, mtosatti, xiaoyao.li, xuelian.guo, tao1.su
AVX10 state enumeration in CPUID leaf D and enabling in XCR0 register
are identical to AVX512 state regardless of the supported vector lengths.
Given that some E-cores will support AVX10 but not support AVX512, add
AVX512 state components to guest when AVX10 is enabled.
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
---
target/i386/cpu.c | 14 ++++++++++++++
target/i386/cpu.h | 2 ++
2 files changed, 16 insertions(+)
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 1ff1af032e..d845ff5e4e 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -7177,6 +7177,13 @@ static void x86_cpu_reset_hold(Object *obj, ResetType type)
}
if (env->features[esa->feature] & esa->bits) {
xcr0 |= 1ull << i;
+ continue;
+ }
+ if (i == XSTATE_OPMASK_BIT || i == XSTATE_ZMM_Hi256_BIT ||
+ i == XSTATE_Hi16_ZMM_BIT) {
+ if (env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10) {
+ xcr0 |= 1ull << i;
+ }
}
}
@@ -7315,6 +7322,13 @@ static void x86_cpu_enable_xsave_components(X86CPU *cpu)
const ExtSaveArea *esa = &x86_ext_save_areas[i];
if (env->features[esa->feature] & esa->bits) {
mask |= (1ULL << i);
+ continue;
+ }
+ if (i == XSTATE_OPMASK_BIT || i == XSTATE_ZMM_Hi256_BIT ||
+ i == XSTATE_Hi16_ZMM_BIT) {
+ if (env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10) {
+ mask |= (1ULL << i);
+ }
}
}
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 74886d1580..280bec701c 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -972,6 +972,8 @@ uint64_t x86_cpu_get_supported_feature_word(X86CPU *cpu, FeatureWord w);
#define CPUID_7_1_EDX_AMX_COMPLEX (1U << 8)
/* PREFETCHIT0/1 Instructions */
#define CPUID_7_1_EDX_PREFETCHITI (1U << 14)
+/* Support for Advanced Vector Extensions 10 */
+#define CPUID_7_1_EDX_AVX10 (1U << 19)
/* Flexible return and event delivery (FRED) */
#define CPUID_7_1_EAX_FRED (1U << 17)
/* Load into IA32_KERNEL_GS_BASE (LKGS) */
--
2.34.1
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 2/6] target/i386: add avx10-version property
2024-10-28 2:45 [PATCH 0/6] Add AVX10.1 CPUID support and GraniteRapids-v2 model Tao Su
2024-10-28 2:45 ` [PATCH 1/6] target/i386: Add AVX512 state when AVX10 is supported Tao Su
@ 2024-10-28 2:45 ` Tao Su
2024-10-28 15:10 ` Xiaoyao Li
2024-10-28 2:45 ` [PATCH 3/6] target/i386: Add CPUID.24 leaf for AVX10 Tao Su
` (3 subsequent siblings)
5 siblings, 1 reply; 29+ messages in thread
From: Tao Su @ 2024-10-28 2:45 UTC (permalink / raw)
To: qemu-devel; +Cc: pbonzini, mtosatti, xiaoyao.li, xuelian.guo, tao1.su
Introduce avx10-version property so that avx10 version can be controlled
by user and cpu model. Per spec, avx10 version can never be 0, the default
value of avx10-version is set to 0 to determine whether it is specified by
user.
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
---
target/i386/cpu.c | 1 +
target/i386/cpu.h | 2 ++
2 files changed, 3 insertions(+)
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index d845ff5e4e..5b434a107a 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -8343,6 +8343,7 @@ static Property x86_cpu_properties[] = {
DEFINE_PROP_UINT32("min-level", X86CPU, env.cpuid_min_level, 0),
DEFINE_PROP_UINT32("min-xlevel", X86CPU, env.cpuid_min_xlevel, 0),
DEFINE_PROP_UINT32("min-xlevel2", X86CPU, env.cpuid_min_xlevel2, 0),
+ DEFINE_PROP_UINT8("avx10-version", X86CPU, env.avx10_version, 0),
DEFINE_PROP_UINT64("ucode-rev", X86CPU, ucode_rev, 0),
DEFINE_PROP_BOOL("full-cpuid-auto-level", X86CPU, full_cpuid_auto_level, true),
DEFINE_PROP_STRING("hv-vendor-id", X86CPU, hyperv_vendor),
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 280bec701c..d845384dcd 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1920,6 +1920,8 @@ typedef struct CPUArchState {
uint32_t cpuid_vendor3;
uint32_t cpuid_version;
FeatureWordArray features;
+ /* AVX10 version */
+ uint8_t avx10_version;
/* Features that were explicitly enabled/disabled */
FeatureWordArray user_features;
uint32_t cpuid_model[12];
--
2.34.1
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 3/6] target/i386: Add CPUID.24 leaf for AVX10
2024-10-28 2:45 [PATCH 0/6] Add AVX10.1 CPUID support and GraniteRapids-v2 model Tao Su
2024-10-28 2:45 ` [PATCH 1/6] target/i386: Add AVX512 state when AVX10 is supported Tao Su
2024-10-28 2:45 ` [PATCH 2/6] target/i386: add avx10-version property Tao Su
@ 2024-10-28 2:45 ` Tao Su
2024-10-28 15:04 ` Xiaoyao Li
2024-10-29 8:25 ` Paolo Bonzini
2024-10-28 2:45 ` [PATCH 4/6] target/i386: Add feature dependencies " Tao Su
` (2 subsequent siblings)
5 siblings, 2 replies; 29+ messages in thread
From: Tao Su @ 2024-10-28 2:45 UTC (permalink / raw)
To: qemu-devel; +Cc: pbonzini, mtosatti, xiaoyao.li, xuelian.guo, tao1.su
When AVX10 enable bit is set, the 0x24 leaf will be present as "AVX10
Converged Vector ISA leaf" containing fields for the version number and
the supported vector bit lengths.
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
---
target/i386/cpu.c | 40 ++++++++++++++++++++++++++++++++++++++++
target/i386/cpu.h | 8 ++++++++
target/i386/kvm/kvm.c | 3 ++-
3 files changed, 50 insertions(+), 1 deletion(-)
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 5b434a107a..91fae0dcb7 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -898,6 +898,7 @@ void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1,
#define TCG_SGX_12_0_EAX_FEATURES 0
#define TCG_SGX_12_0_EBX_FEATURES 0
#define TCG_SGX_12_1_EAX_FEATURES 0
+#define TCG_24_0_EBX_FEATURES 0
#if defined CONFIG_USER_ONLY
#define CPUID_8000_0008_EBX_KERNEL_FEATURES (CPUID_8000_0008_EBX_IBPB | \
@@ -1163,6 +1164,20 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
},
.tcg_features = TCG_7_2_EDX_FEATURES,
},
+ [FEAT_24_0_EBX] = {
+ .type = CPUID_FEATURE_WORD,
+ .feat_names = {
+ [16] = "avx10-128",
+ [17] = "avx10-256",
+ [18] = "avx10-512",
+ },
+ .cpuid = {
+ .eax = 0x24,
+ .needs_ecx = true, .ecx = 0,
+ .reg = R_EBX,
+ },
+ .tcg_features = TCG_24_0_EBX_FEATURES,
+ },
[FEAT_8000_0007_EDX] = {
.type = CPUID_FEATURE_WORD,
.feat_names = {
@@ -6835,6 +6850,26 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
}
break;
}
+ case 0x24: {
+ *eax = 0;
+ *ebx = 0;
+ *ecx = 0;
+ *edx = 0;
+ if (!(env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10)) {
+ break;
+ }
+
+ if (count == 0) {
+ uint8_t v = kvm_arch_get_supported_cpuid(cs->kvm_state, 0x24,
+ 0, R_EBX);
+ if (env->avx10_version && env->avx10_version < v) {
+ v = env->avx10_version;
+ }
+
+ *ebx = env->features[FEAT_24_0_EBX] | v;
+ }
+ break;
+ }
case 0x40000000:
/*
* CPUID code in kvm_arch_init_vcpu() ignores stuff
@@ -7483,6 +7518,11 @@ void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
x86_cpu_adjust_level(cpu, &env->cpuid_min_level, 0x1F);
}
+ /* Advanced Vector Extensions 10 (AVX10) requires CPUID[0x24] */
+ if (env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10) {
+ x86_cpu_adjust_level(cpu, &env->cpuid_min_level, 0x24);
+ }
+
/* SVM requires CPUID[0x8000000A] */
if (env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_SVM) {
x86_cpu_adjust_level(cpu, &env->cpuid_min_xlevel, 0x8000000A);
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index d845384dcd..5566a13f4f 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -662,6 +662,7 @@ typedef enum FeatureWord {
FEAT_XSAVE_XSS_HI, /* CPUID[EAX=0xd,ECX=1].EDX */
FEAT_7_1_EDX, /* CPUID[EAX=7,ECX=1].EDX */
FEAT_7_2_EDX, /* CPUID[EAX=7,ECX=2].EDX */
+ FEAT_24_0_EBX, /* CPUID[EAX=0x24,ECX=0].EBX */
FEATURE_WORDS,
} FeatureWord;
@@ -990,6 +991,13 @@ uint64_t x86_cpu_get_supported_feature_word(X86CPU *cpu, FeatureWord w);
/* Packets which contain IP payload have LIP values */
#define CPUID_14_0_ECX_LIP (1U << 31)
+/* AVX10 128-bit vector support is present */
+#define CPUID_24_0_EBX_AVX10_128 (1U << 16)
+/* AVX10 256-bit vector support is present */
+#define CPUID_24_0_EBX_AVX10_256 (1U << 17)
+/* AVX10 512-bit vector support is present */
+#define CPUID_24_0_EBX_AVX10_512 (1U << 18)
+
/* RAS Features */
#define CPUID_8000_0007_EBX_OVERFLOW_RECOV (1U << 0)
#define CPUID_8000_0007_EBX_SUCCOR (1U << 1)
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index fd9f198892..8e17942c3b 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1923,7 +1923,8 @@ static uint32_t kvm_x86_build_cpuid(CPUX86State *env,
case 0x7:
case 0x14:
case 0x1d:
- case 0x1e: {
+ case 0x1e:
+ case 0x24: {
uint32_t times;
c->function = i;
--
2.34.1
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 4/6] target/i386: Add feature dependencies for AVX10
2024-10-28 2:45 [PATCH 0/6] Add AVX10.1 CPUID support and GraniteRapids-v2 model Tao Su
` (2 preceding siblings ...)
2024-10-28 2:45 ` [PATCH 3/6] target/i386: Add CPUID.24 leaf for AVX10 Tao Su
@ 2024-10-28 2:45 ` Tao Su
2024-10-28 8:45 ` Paolo Bonzini
2024-10-29 14:47 ` Zhao Liu
2024-10-28 2:45 ` [PATCH 5/6] target/i386: Add support for AVX10 in CPUID enumeration Tao Su
2024-10-28 2:45 ` [PATCH 6/6] target/i386: Introduce GraniteRapids-v2 model Tao Su
5 siblings, 2 replies; 29+ messages in thread
From: Tao Su @ 2024-10-28 2:45 UTC (permalink / raw)
To: qemu-devel; +Cc: pbonzini, mtosatti, xiaoyao.li, xuelian.guo, tao1.su
Since the highest supported vector length for a processor implies that
all lesser vector lengths are also supported, add the dependencies of
the supported vector lengths. If all vector lengths aren't supported,
clear AVX10 enable bit as well.
Note that the order of AVX10 related dependencies should be kept as:
CPUID_24_0_EBX_AVX10_128 -> CPUID_24_0_EBX_AVX10_256,
CPUID_24_0_EBX_AVX10_256 -> CPUID_24_0_EBX_AVX10_512,
CPUID_24_0_EBX_AVX10_VL_MASK -> CPUID_7_1_EDX_AVX10,
CPUID_7_1_EDX_AVX10 -> CPUID_24_0_EBX,
so that prevent user from setting weird CPUID combinations, e.g. 256-bits
and 512-bits are supported but 128-bits is not, no vector lengths are
supported but AVX10 enable bit is still set.
Since AVX10_128 will be reserved as 1, adding these dependencies has the
bonus that when user sets -cpu host,-avx10-128, CPUID_7_1_EDX_AVX10 and
CPUID_24_0_EBX will be disabled automatically.
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
---
target/i386/cpu.c | 16 ++++++++++++++++
target/i386/cpu.h | 4 ++++
2 files changed, 20 insertions(+)
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 91fae0dcb7..9666dbf29c 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1760,6 +1760,22 @@ static FeatureDep feature_dependencies[] = {
.from = { FEAT_7_0_EBX, CPUID_7_0_EBX_SGX },
.to = { FEAT_SGX_12_1_EAX, ~0ull },
},
+ {
+ .from = { FEAT_24_0_EBX, CPUID_24_0_EBX_AVX10_128 },
+ .to = { FEAT_24_0_EBX, CPUID_24_0_EBX_AVX10_256 },
+ },
+ {
+ .from = { FEAT_24_0_EBX, CPUID_24_0_EBX_AVX10_256 },
+ .to = { FEAT_24_0_EBX, CPUID_24_0_EBX_AVX10_512 },
+ },
+ {
+ .from = { FEAT_24_0_EBX, CPUID_24_0_EBX_AVX10_VL_MASK },
+ .to = { FEAT_7_1_EDX, CPUID_7_1_EDX_AVX10 },
+ },
+ {
+ .from = { FEAT_7_1_EDX, CPUID_7_1_EDX_AVX10 },
+ .to = { FEAT_24_0_EBX, ~0ull },
+ },
};
typedef struct X86RegisterInfo32 {
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 5566a13f4f..e4c947b478 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -997,6 +997,10 @@ uint64_t x86_cpu_get_supported_feature_word(X86CPU *cpu, FeatureWord w);
#define CPUID_24_0_EBX_AVX10_256 (1U << 17)
/* AVX10 512-bit vector support is present */
#define CPUID_24_0_EBX_AVX10_512 (1U << 18)
+/* AVX10 vector length support mask */
+#define CPUID_24_0_EBX_AVX10_VL_MASK (CPUID_24_0_EBX_AVX10_128 | \
+ CPUID_24_0_EBX_AVX10_256 | \
+ CPUID_24_0_EBX_AVX10_512)
/* RAS Features */
#define CPUID_8000_0007_EBX_OVERFLOW_RECOV (1U << 0)
--
2.34.1
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 5/6] target/i386: Add support for AVX10 in CPUID enumeration
2024-10-28 2:45 [PATCH 0/6] Add AVX10.1 CPUID support and GraniteRapids-v2 model Tao Su
` (3 preceding siblings ...)
2024-10-28 2:45 ` [PATCH 4/6] target/i386: Add feature dependencies " Tao Su
@ 2024-10-28 2:45 ` Tao Su
2024-10-28 2:45 ` [PATCH 6/6] target/i386: Introduce GraniteRapids-v2 model Tao Su
5 siblings, 0 replies; 29+ messages in thread
From: Tao Su @ 2024-10-28 2:45 UTC (permalink / raw)
To: qemu-devel; +Cc: pbonzini, mtosatti, xiaoyao.li, xuelian.guo, tao1.su
Intel AVX10 represents the first major new vector ISA since the
introduction of Intel AVX512, which will establish a common, converged
vector instruction set across all Intel architectures.
AVX10 enable bit is enumerated via CPUID.(EAX=7,ECX=1):EDX[bit 19]. Add
the CPUID definition for AVX10 enable bit, AVX10 will be enabled
automatically when using '-cpu host' and KVM advertises AVX10 to
userspace.
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
---
target/i386/cpu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 9666dbf29c..adde98fd26 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1133,7 +1133,7 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
"avx-vnni-int8", "avx-ne-convert", NULL, NULL,
"amx-complex", NULL, "avx-vnni-int16", NULL,
NULL, NULL, "prefetchiti", NULL,
- NULL, NULL, NULL, NULL,
+ NULL, NULL, NULL, "avx10",
NULL, NULL, NULL, NULL,
NULL, NULL, NULL, NULL,
NULL, NULL, NULL, NULL,
--
2.34.1
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 6/6] target/i386: Introduce GraniteRapids-v2 model
2024-10-28 2:45 [PATCH 0/6] Add AVX10.1 CPUID support and GraniteRapids-v2 model Tao Su
` (4 preceding siblings ...)
2024-10-28 2:45 ` [PATCH 5/6] target/i386: Add support for AVX10 in CPUID enumeration Tao Su
@ 2024-10-28 2:45 ` Tao Su
2024-10-29 14:58 ` Zhao Liu
5 siblings, 1 reply; 29+ messages in thread
From: Tao Su @ 2024-10-28 2:45 UTC (permalink / raw)
To: qemu-devel; +Cc: pbonzini, mtosatti, xiaoyao.li, xuelian.guo, tao1.su
Update GraniteRapids CPU model to add AVX10 and the missing features(ss,
tsc-adjust, cldemote, movdiri, movdir64b).
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
---
target/i386/cpu.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index adde98fd26..8d72c08b66 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -4375,6 +4375,23 @@ static const X86CPUDefinition builtin_x86_defs[] = {
.model_id = "Intel Xeon Processor (GraniteRapids)",
.versions = (X86CPUVersionDefinition[]) {
{ .version = 1 },
+ {
+ .version = 2,
+ .props = (PropValue[]) {
+ { "ss", "on" },
+ { "tsc-adjust", "on" },
+ { "cldemote", "on" },
+ { "movdiri", "on" },
+ { "movdir64b", "on" },
+ { "avx10", "on" },
+ { "avx10-128", "on" },
+ { "avx10-256", "on" },
+ { "avx10-512", "on" },
+ { "avx10-version", "1" },
+ { "stepping", "1" },
+ { /* end of list */ }
+ }
+ },
{ /* end of list */ },
},
},
--
2.34.1
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH 1/6] target/i386: Add AVX512 state when AVX10 is supported
2024-10-28 2:45 ` [PATCH 1/6] target/i386: Add AVX512 state when AVX10 is supported Tao Su
@ 2024-10-28 8:41 ` Paolo Bonzini
2024-10-28 9:25 ` Tao Su
2024-10-28 15:12 ` Xiaoyao Li
1 sibling, 1 reply; 29+ messages in thread
From: Paolo Bonzini @ 2024-10-28 8:41 UTC (permalink / raw)
To: Tao Su, qemu-devel; +Cc: mtosatti, xiaoyao.li, xuelian.guo
On 10/28/24 03:45, Tao Su wrote:
> AVX10 state enumeration in CPUID leaf D and enabling in XCR0 register
> are identical to AVX512 state regardless of the supported vector lengths.
>
> Given that some E-cores will support AVX10 but not support AVX512, add
> AVX512 state components to guest when AVX10 is enabled.
>
> Tested-by: Xuelian Guo <xuelian.guo@intel.com>
> Signed-off-by: Tao Su <tao1.su@linux.intel.com>
> ---
> target/i386/cpu.c | 14 ++++++++++++++
> target/i386/cpu.h | 2 ++
> 2 files changed, 16 insertions(+)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 1ff1af032e..d845ff5e4e 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -7177,6 +7177,13 @@ static void x86_cpu_reset_hold(Object *obj, ResetType type)
> }
> if (env->features[esa->feature] & esa->bits) {
> xcr0 |= 1ull << i;
> + continue;
> + }
> + if (i == XSTATE_OPMASK_BIT || i == XSTATE_ZMM_Hi256_BIT ||
> + i == XSTATE_Hi16_ZMM_BIT) {
Can you confirm that XSTATE_ZMM_Hi256_BIT depends on AVX10 and not
AVX10-512?
Thanks,
Paolo
> + if (env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10) {
> + xcr0 |= 1ull << i;
> + }
> }
> }
>
> @@ -7315,6 +7322,13 @@ static void x86_cpu_enable_xsave_components(X86CPU *cpu)
> const ExtSaveArea *esa = &x86_ext_save_areas[i];
> if (env->features[esa->feature] & esa->bits) {
> mask |= (1ULL << i);
> + continue;
> + }
> + if (i == XSTATE_OPMASK_BIT || i == XSTATE_ZMM_Hi256_BIT ||
> + i == XSTATE_Hi16_ZMM_BIT) {
> + if (env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10) {
> + mask |= (1ULL << i);
> + }
> }
> }
>
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index 74886d1580..280bec701c 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -972,6 +972,8 @@ uint64_t x86_cpu_get_supported_feature_word(X86CPU *cpu, FeatureWord w);
> #define CPUID_7_1_EDX_AMX_COMPLEX (1U << 8)
> /* PREFETCHIT0/1 Instructions */
> #define CPUID_7_1_EDX_PREFETCHITI (1U << 14)
> +/* Support for Advanced Vector Extensions 10 */
> +#define CPUID_7_1_EDX_AVX10 (1U << 19)
> /* Flexible return and event delivery (FRED) */
> #define CPUID_7_1_EAX_FRED (1U << 17)
> /* Load into IA32_KERNEL_GS_BASE (LKGS) */
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 4/6] target/i386: Add feature dependencies for AVX10
2024-10-28 2:45 ` [PATCH 4/6] target/i386: Add feature dependencies " Tao Su
@ 2024-10-28 8:45 ` Paolo Bonzini
2024-10-28 10:02 ` Tao Su
2024-10-29 14:47 ` Zhao Liu
1 sibling, 1 reply; 29+ messages in thread
From: Paolo Bonzini @ 2024-10-28 8:45 UTC (permalink / raw)
To: Tao Su, qemu-devel; +Cc: mtosatti, xiaoyao.li, xuelian.guo
On 10/28/24 03:45, Tao Su wrote:
> Since the highest supported vector length for a processor implies that
> all lesser vector lengths are also supported, add the dependencies of
> the supported vector lengths. If all vector lengths aren't supported,
> clear AVX10 enable bit as well.
>
> Note that the order of AVX10 related dependencies should be kept as:
> CPUID_24_0_EBX_AVX10_128 -> CPUID_24_0_EBX_AVX10_256,
> CPUID_24_0_EBX_AVX10_256 -> CPUID_24_0_EBX_AVX10_512,
> CPUID_24_0_EBX_AVX10_VL_MASK -> CPUID_7_1_EDX_AVX10,
I think you need to add a set of dependencies so that avx10 cannot be
set, unless all the older AVX features that it's composed of are
available. From the manual these are
AVX512F, AVX512CD, AVX512VW, AVX512DQ, AVX512_VBMI, AVX512_IFMA,
AVX512_VNNI, AVX512_BF16, AVX512_VPOPCNTDQ, AVX512_VBMI2, VAES, GFNI,
VPCLMULQDQ, AVX512_BITALG, AVX512_FP16.
Otherwise, the changes look good.
Thanks,
Paolo
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/6] target/i386: Add AVX512 state when AVX10 is supported
2024-10-28 8:41 ` Paolo Bonzini
@ 2024-10-28 9:25 ` Tao Su
2024-10-29 8:49 ` Paolo Bonzini
0 siblings, 1 reply; 29+ messages in thread
From: Tao Su @ 2024-10-28 9:25 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: qemu-devel, mtosatti, xiaoyao.li, xuelian.guo
On Mon, Oct 28, 2024 at 09:41:14AM +0100, Paolo Bonzini wrote:
> On 10/28/24 03:45, Tao Su wrote:
> > AVX10 state enumeration in CPUID leaf D and enabling in XCR0 register
> > are identical to AVX512 state regardless of the supported vector lengths.
> >
> > Given that some E-cores will support AVX10 but not support AVX512, add
> > AVX512 state components to guest when AVX10 is enabled.
> >
> > Tested-by: Xuelian Guo <xuelian.guo@intel.com>
> > Signed-off-by: Tao Su <tao1.su@linux.intel.com>
> > ---
> > target/i386/cpu.c | 14 ++++++++++++++
> > target/i386/cpu.h | 2 ++
> > 2 files changed, 16 insertions(+)
> >
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index 1ff1af032e..d845ff5e4e 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -7177,6 +7177,13 @@ static void x86_cpu_reset_hold(Object *obj, ResetType type)
> > }
> > if (env->features[esa->feature] & esa->bits) {
> > xcr0 |= 1ull << i;
> > + continue;
> > + }
> > + if (i == XSTATE_OPMASK_BIT || i == XSTATE_ZMM_Hi256_BIT ||
> > + i == XSTATE_Hi16_ZMM_BIT) {
>
> Can you confirm that XSTATE_ZMM_Hi256_BIT depends on AVX10 and not
> AVX10-512?
>
Sorry, I should attach AVX10.2 spec [*].
In 3.1.3, spec said Intel AVX10 state enumeration in CPUID leaf 0xD and
enabling in XCR0 register are identical to Intel AVX-512 regardless of the
maximum vector length supported.
So XSTATE_ZMM_Hi256_BIT doesn't depend on AVX10-512.
[*] https://cdrdv2.intel.com/v1/dl/getContent/828965
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 4/6] target/i386: Add feature dependencies for AVX10
2024-10-28 8:45 ` Paolo Bonzini
@ 2024-10-28 10:02 ` Tao Su
2024-10-28 10:45 ` Paolo Bonzini
0 siblings, 1 reply; 29+ messages in thread
From: Tao Su @ 2024-10-28 10:02 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: qemu-devel, mtosatti, xiaoyao.li, xuelian.guo
On Mon, Oct 28, 2024 at 09:45:39AM +0100, Paolo Bonzini wrote:
> On 10/28/24 03:45, Tao Su wrote:
> > Since the highest supported vector length for a processor implies that
> > all lesser vector lengths are also supported, add the dependencies of
> > the supported vector lengths. If all vector lengths aren't supported,
> > clear AVX10 enable bit as well.
> >
> > Note that the order of AVX10 related dependencies should be kept as:
> > CPUID_24_0_EBX_AVX10_128 -> CPUID_24_0_EBX_AVX10_256,
> > CPUID_24_0_EBX_AVX10_256 -> CPUID_24_0_EBX_AVX10_512,
> > CPUID_24_0_EBX_AVX10_VL_MASK -> CPUID_7_1_EDX_AVX10,
>
> I think you need to add a set of dependencies so that avx10 cannot be set,
> unless all the older AVX features that it's composed of are available. From
> the manual these are
>
> AVX512F, AVX512CD, AVX512VW, AVX512DQ, AVX512_VBMI, AVX512_IFMA,
> AVX512_VNNI, AVX512_BF16, AVX512_VPOPCNTDQ, AVX512_VBMI2, VAES, GFNI,
> VPCLMULQDQ, AVX512_BITALG, AVX512_FP16.
Thanks for such a quick review!!
AVX10.1 spec said:
Intel AVX-512 will continue to be supported on P-core-only processors for
the foreseeable future to support legacy applications. However, new vector
ISA features will only be added to the Intel AVX10 ISA moving forward.
While Intel AVX10/512 includes all Intel AVX-512 instructions, it
important to note that applications compiled to Intel AVX-512 with vector
length limited to 256 bits are not guaranteed to be compatible on an Intel
AVX10/256 processor.
I.e. AVX10/256 processors will support old AVX-512 instructions
(limited to 256 bits and enumerated by AVX10) but not set AVX-512 related
CPUIDs. So, I think we can't add these dependencies…
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 4/6] target/i386: Add feature dependencies for AVX10
2024-10-28 10:02 ` Tao Su
@ 2024-10-28 10:45 ` Paolo Bonzini
2024-10-28 12:23 ` Tao Su
2024-10-28 14:48 ` Xiaoyao Li
0 siblings, 2 replies; 29+ messages in thread
From: Paolo Bonzini @ 2024-10-28 10:45 UTC (permalink / raw)
To: Tao Su; +Cc: qemu-devel, mtosatti, xiaoyao.li, xuelian.guo
On 10/28/24 11:02, Tao Su wrote:
> On Mon, Oct 28, 2024 at 09:45:39AM +0100, Paolo Bonzini wrote:
>> On 10/28/24 03:45, Tao Su wrote:
>>> Since the highest supported vector length for a processor implies that
>>> all lesser vector lengths are also supported, add the dependencies of
>>> the supported vector lengths. If all vector lengths aren't supported,
>>> clear AVX10 enable bit as well.
>>>
>>> Note that the order of AVX10 related dependencies should be kept as:
>>> CPUID_24_0_EBX_AVX10_128 -> CPUID_24_0_EBX_AVX10_256,
>>> CPUID_24_0_EBX_AVX10_256 -> CPUID_24_0_EBX_AVX10_512,
>>> CPUID_24_0_EBX_AVX10_VL_MASK -> CPUID_7_1_EDX_AVX10,
>>
>> I think you need to add a set of dependencies so that avx10 cannot be set,
>> unless all the older AVX features that it's composed of are available. From
>> the manual these are
>>
>> AVX512F, AVX512CD, AVX512VW, AVX512DQ, AVX512_VBMI, AVX512_IFMA,
>> AVX512_VNNI, AVX512_BF16, AVX512_VPOPCNTDQ, AVX512_VBMI2, VAES, GFNI,
>> VPCLMULQDQ, AVX512_BITALG, AVX512_FP16.
>
> Thanks for such a quick review!!
>
> AVX10.1 spec said:
> Intel AVX-512 will continue to be supported on P-core-only processors for
> the foreseeable future to support legacy applications. However, new vector
> ISA features will only be added to the Intel AVX10 ISA moving forward.
> While Intel AVX10/512 includes all Intel AVX-512 instructions, it
> important to note that applications compiled to Intel AVX-512 with vector
> length limited to 256 bits are not guaranteed to be compatible on an Intel
> AVX10/256 processor.
>
> I.e. AVX10/256 processors will support old AVX-512 instructions
> (limited to 256 bits and enumerated by AVX10) but not set AVX-512 related
> CPUIDs. So, I think we can't add these dependencies…
Of course you're right about AVX10 in general, you still need to add the
dependency but only for CPUID_24_0_EBX_AVX10_512.
Paolo
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 4/6] target/i386: Add feature dependencies for AVX10
2024-10-28 10:45 ` Paolo Bonzini
@ 2024-10-28 12:23 ` Tao Su
2024-10-28 14:48 ` Xiaoyao Li
1 sibling, 0 replies; 29+ messages in thread
From: Tao Su @ 2024-10-28 12:23 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: qemu-devel, mtosatti, xiaoyao.li, xuelian.guo
On Mon, Oct 28, 2024 at 11:45:25AM +0100, Paolo Bonzini wrote:
> On 10/28/24 11:02, Tao Su wrote:
> > On Mon, Oct 28, 2024 at 09:45:39AM +0100, Paolo Bonzini wrote:
> > > On 10/28/24 03:45, Tao Su wrote:
> > > > Since the highest supported vector length for a processor implies that
> > > > all lesser vector lengths are also supported, add the dependencies of
> > > > the supported vector lengths. If all vector lengths aren't supported,
> > > > clear AVX10 enable bit as well.
> > > >
> > > > Note that the order of AVX10 related dependencies should be kept as:
> > > > CPUID_24_0_EBX_AVX10_128 -> CPUID_24_0_EBX_AVX10_256,
> > > > CPUID_24_0_EBX_AVX10_256 -> CPUID_24_0_EBX_AVX10_512,
> > > > CPUID_24_0_EBX_AVX10_VL_MASK -> CPUID_7_1_EDX_AVX10,
> > >
> > > I think you need to add a set of dependencies so that avx10 cannot be set,
> > > unless all the older AVX features that it's composed of are available. From
> > > the manual these are
> > >
> > > AVX512F, AVX512CD, AVX512VW, AVX512DQ, AVX512_VBMI, AVX512_IFMA,
> > > AVX512_VNNI, AVX512_BF16, AVX512_VPOPCNTDQ, AVX512_VBMI2, VAES, GFNI,
> > > VPCLMULQDQ, AVX512_BITALG, AVX512_FP16.
> >
> > Thanks for such a quick review!!
> >
> > AVX10.1 spec said:
> > Intel AVX-512 will continue to be supported on P-core-only processors for
> > the foreseeable future to support legacy applications. However, new vector
> > ISA features will only be added to the Intel AVX10 ISA moving forward.
> > While Intel AVX10/512 includes all Intel AVX-512 instructions, it
> > important to note that applications compiled to Intel AVX-512 with vector
> > length limited to 256 bits are not guaranteed to be compatible on an Intel
> > AVX10/256 processor.
> >
> > I.e. AVX10/256 processors will support old AVX-512 instructions
> > (limited to 256 bits and enumerated by AVX10) but not set AVX-512 related
> > CPUIDs. So, I think we can't add these dependencies…
>
> Of course you're right about AVX10 in general, you still need to add the
> dependency but only for CPUID_24_0_EBX_AVX10_512.
>
I agree, will add in v2, thanks!
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 4/6] target/i386: Add feature dependencies for AVX10
2024-10-28 10:45 ` Paolo Bonzini
2024-10-28 12:23 ` Tao Su
@ 2024-10-28 14:48 ` Xiaoyao Li
2024-10-28 14:50 ` Paolo Bonzini
1 sibling, 1 reply; 29+ messages in thread
From: Xiaoyao Li @ 2024-10-28 14:48 UTC (permalink / raw)
To: Paolo Bonzini, Tao Su; +Cc: qemu-devel, mtosatti, xuelian.guo
On 10/28/2024 6:45 PM, Paolo Bonzini wrote:
> On 10/28/24 11:02, Tao Su wrote:
>> On Mon, Oct 28, 2024 at 09:45:39AM +0100, Paolo Bonzini wrote:
>>> On 10/28/24 03:45, Tao Su wrote:
>>>> Since the highest supported vector length for a processor implies that
>>>> all lesser vector lengths are also supported, add the dependencies of
>>>> the supported vector lengths. If all vector lengths aren't supported,
>>>> clear AVX10 enable bit as well.
>>>>
>>>> Note that the order of AVX10 related dependencies should be kept as:
>>>> CPUID_24_0_EBX_AVX10_128 -> CPUID_24_0_EBX_AVX10_256,
>>>> CPUID_24_0_EBX_AVX10_256 -> CPUID_24_0_EBX_AVX10_512,
>>>> CPUID_24_0_EBX_AVX10_VL_MASK -> CPUID_7_1_EDX_AVX10,
>>>
>>> I think you need to add a set of dependencies so that avx10 cannot be
>>> set,
>>> unless all the older AVX features that it's composed of are
>>> available. From
>>> the manual these are
>>>
>>> AVX512F, AVX512CD, AVX512VW, AVX512DQ, AVX512_VBMI, AVX512_IFMA,
>>> AVX512_VNNI, AVX512_BF16, AVX512_VPOPCNTDQ, AVX512_VBMI2, VAES, GFNI,
>>> VPCLMULQDQ, AVX512_BITALG, AVX512_FP16.
>>
>> Thanks for such a quick review!!
>>
>> AVX10.1 spec said:
>> Intel AVX-512 will continue to be supported on P-core-only processors for
>> the foreseeable future to support legacy applications. However, new
>> vector
>> ISA features will only be added to the Intel AVX10 ISA moving forward.
>> While Intel AVX10/512 includes all Intel AVX-512 instructions, it
>> important to note that applications compiled to Intel AVX-512 with vector
>> length limited to 256 bits are not guaranteed to be compatible on an
>> Intel
>> AVX10/256 processor.
>>
>> I.e. AVX10/256 processors will support old AVX-512 instructions
>> (limited to 256 bits and enumerated by AVX10) but not set AVX-512 related
>> CPUIDs. So, I think we can't add these dependencies…
>
> Of course you're right about AVX10 in general, you still need to add the
> dependency but only for CPUID_24_0_EBX_AVX10_512.
What if future E-core processor starts to support AVX10/512 but not
enumerating any individual AVX512 bit? (AVX10.1 spec only states the
compatibility behavior for P-core-only processors)
> Paolo
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 4/6] target/i386: Add feature dependencies for AVX10
2024-10-28 14:48 ` Xiaoyao Li
@ 2024-10-28 14:50 ` Paolo Bonzini
2024-10-28 15:08 ` Xiaoyao Li
0 siblings, 1 reply; 29+ messages in thread
From: Paolo Bonzini @ 2024-10-28 14:50 UTC (permalink / raw)
To: Xiaoyao Li; +Cc: Tao Su, qemu-devel, mtosatti, xuelian.guo
On Mon, Oct 28, 2024 at 3:48 PM Xiaoyao Li <xiaoyao.li@intel.com> wrote:
>
> On 10/28/2024 6:45 PM, Paolo Bonzini wrote:
> > On 10/28/24 11:02, Tao Su wrote:
> >> On Mon, Oct 28, 2024 at 09:45:39AM +0100, Paolo Bonzini wrote:
> >>> On 10/28/24 03:45, Tao Su wrote:
> >>>> Since the highest supported vector length for a processor implies that
> >>>> all lesser vector lengths are also supported, add the dependencies of
> >>>> the supported vector lengths. If all vector lengths aren't supported,
> >>>> clear AVX10 enable bit as well.
> >>>>
> >>>> Note that the order of AVX10 related dependencies should be kept as:
> >>>> CPUID_24_0_EBX_AVX10_128 -> CPUID_24_0_EBX_AVX10_256,
> >>>> CPUID_24_0_EBX_AVX10_256 -> CPUID_24_0_EBX_AVX10_512,
> >>>> CPUID_24_0_EBX_AVX10_VL_MASK -> CPUID_7_1_EDX_AVX10,
> >>>
> >>> I think you need to add a set of dependencies so that avx10 cannot be
> >>> set,
> >>> unless all the older AVX features that it's composed of are
> >>> available. From
> >>> the manual these are
> >>>
> >>> AVX512F, AVX512CD, AVX512VW, AVX512DQ, AVX512_VBMI, AVX512_IFMA,
> >>> AVX512_VNNI, AVX512_BF16, AVX512_VPOPCNTDQ, AVX512_VBMI2, VAES, GFNI,
> >>> VPCLMULQDQ, AVX512_BITALG, AVX512_FP16.
> >>
> >> Thanks for such a quick review!!
> >>
> >> AVX10.1 spec said:
> >> Intel AVX-512 will continue to be supported on P-core-only processors for
> >> the foreseeable future to support legacy applications. However, new
> >> vector
> >> ISA features will only be added to the Intel AVX10 ISA moving forward.
> >> While Intel AVX10/512 includes all Intel AVX-512 instructions, it
> >> important to note that applications compiled to Intel AVX-512 with vector
> >> length limited to 256 bits are not guaranteed to be compatible on an
> >> Intel
> >> AVX10/256 processor.
> >>
> >> I.e. AVX10/256 processors will support old AVX-512 instructions
> >> (limited to 256 bits and enumerated by AVX10) but not set AVX-512 related
> >> CPUIDs. So, I think we can't add these dependencies…
> >
> > Of course you're right about AVX10 in general, you still need to add the
> > dependency but only for CPUID_24_0_EBX_AVX10_512.
>
> What if future E-core processor starts to support AVX10/512 but not
> enumerating any individual AVX512 bit? (AVX10.1 spec only states the
> compatibility behavior for P-core-only processors)
KVM and QEMU could always specify the bits. If you want to ask around
if this is possible then go ahead.
In the meanwhile I actually can apply Tao Su's patches, since the
dependencies are merely a safety feature.
Paolo
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 3/6] target/i386: Add CPUID.24 leaf for AVX10
2024-10-28 2:45 ` [PATCH 3/6] target/i386: Add CPUID.24 leaf for AVX10 Tao Su
@ 2024-10-28 15:04 ` Xiaoyao Li
2024-10-29 6:13 ` Tao Su
2024-10-29 8:25 ` Paolo Bonzini
1 sibling, 1 reply; 29+ messages in thread
From: Xiaoyao Li @ 2024-10-28 15:04 UTC (permalink / raw)
To: Tao Su, qemu-devel; +Cc: pbonzini, mtosatti, xuelian.guo
On 10/28/2024 10:45 AM, Tao Su wrote:
> When AVX10 enable bit is set, the 0x24 leaf will be present as "AVX10
> Converged Vector ISA leaf" containing fields for the version number and
> the supported vector bit lengths.
>
> Tested-by: Xuelian Guo <xuelian.guo@intel.com>
> Signed-off-by: Tao Su <tao1.su@linux.intel.com>
> ---
> target/i386/cpu.c | 40 ++++++++++++++++++++++++++++++++++++++++
> target/i386/cpu.h | 8 ++++++++
> target/i386/kvm/kvm.c | 3 ++-
> 3 files changed, 50 insertions(+), 1 deletion(-)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 5b434a107a..91fae0dcb7 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -898,6 +898,7 @@ void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1,
> #define TCG_SGX_12_0_EAX_FEATURES 0
> #define TCG_SGX_12_0_EBX_FEATURES 0
> #define TCG_SGX_12_1_EAX_FEATURES 0
> +#define TCG_24_0_EBX_FEATURES 0
>
> #if defined CONFIG_USER_ONLY
> #define CPUID_8000_0008_EBX_KERNEL_FEATURES (CPUID_8000_0008_EBX_IBPB | \
> @@ -1163,6 +1164,20 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
> },
> .tcg_features = TCG_7_2_EDX_FEATURES,
> },
> + [FEAT_24_0_EBX] = {
> + .type = CPUID_FEATURE_WORD,
> + .feat_names = {
> + [16] = "avx10-128",
> + [17] = "avx10-256",
> + [18] = "avx10-512",
> + },
> + .cpuid = {
> + .eax = 0x24,
> + .needs_ecx = true, .ecx = 0,
> + .reg = R_EBX,
> + },
> + .tcg_features = TCG_24_0_EBX_FEATURES,
> + },
> [FEAT_8000_0007_EDX] = {
> .type = CPUID_FEATURE_WORD,
> .feat_names = {
> @@ -6835,6 +6850,26 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> }
> break;
> }
> + case 0x24: {
> + *eax = 0;
> + *ebx = 0;
> + *ecx = 0;
> + *edx = 0;
> + if (!(env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10)) {
> + break;
> + }
> +
> + if (count == 0) {
> + uint8_t v = kvm_arch_get_supported_cpuid(cs->kvm_state, 0x24,
> + 0, R_EBX);
> + if (env->avx10_version && env->avx10_version < v) {
> + v = env->avx10_version;
> + }
Here, if user specified avx10_version is >= kvm reported value, it uses
KVM's reported value silently.
I think it's not good. It'd better to validate if user specified value
can be satisfied or not, and emit a warning when not. e.g., in
x86_cpu_filter_features() or in kvm_cpu_realizefn(). Also we can put the
behavior along with it that "use KVM reported maximum value when
avx10_version is 0 "
then, here we can simply do
*ebx = env->features[FEAT_24_0_EBX] | env->avx10_version;
> + *ebx = env->features[FEAT_24_0_EBX] | v;
> + }
> + break;
> + }
> case 0x40000000:
> /*
> * CPUID code in kvm_arch_init_vcpu() ignores stuff
> @@ -7483,6 +7518,11 @@ void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
> x86_cpu_adjust_level(cpu, &env->cpuid_min_level, 0x1F);
> }
>
> + /* Advanced Vector Extensions 10 (AVX10) requires CPUID[0x24] */
> + if (env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10) {
> + x86_cpu_adjust_level(cpu, &env->cpuid_min_level, 0x24);
> + }
> +
> /* SVM requires CPUID[0x8000000A] */
> if (env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_SVM) {
> x86_cpu_adjust_level(cpu, &env->cpuid_min_xlevel, 0x8000000A);
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index d845384dcd..5566a13f4f 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -662,6 +662,7 @@ typedef enum FeatureWord {
> FEAT_XSAVE_XSS_HI, /* CPUID[EAX=0xd,ECX=1].EDX */
> FEAT_7_1_EDX, /* CPUID[EAX=7,ECX=1].EDX */
> FEAT_7_2_EDX, /* CPUID[EAX=7,ECX=2].EDX */
> + FEAT_24_0_EBX, /* CPUID[EAX=0x24,ECX=0].EBX */
> FEATURE_WORDS,
> } FeatureWord;
>
> @@ -990,6 +991,13 @@ uint64_t x86_cpu_get_supported_feature_word(X86CPU *cpu, FeatureWord w);
> /* Packets which contain IP payload have LIP values */
> #define CPUID_14_0_ECX_LIP (1U << 31)
>
> +/* AVX10 128-bit vector support is present */
> +#define CPUID_24_0_EBX_AVX10_128 (1U << 16)
> +/* AVX10 256-bit vector support is present */
> +#define CPUID_24_0_EBX_AVX10_256 (1U << 17)
> +/* AVX10 512-bit vector support is present */
> +#define CPUID_24_0_EBX_AVX10_512 (1U << 18)
> +
> /* RAS Features */
> #define CPUID_8000_0007_EBX_OVERFLOW_RECOV (1U << 0)
> #define CPUID_8000_0007_EBX_SUCCOR (1U << 1)
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index fd9f198892..8e17942c3b 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -1923,7 +1923,8 @@ static uint32_t kvm_x86_build_cpuid(CPUX86State *env,
> case 0x7:
> case 0x14:
> case 0x1d:
> - case 0x1e: {
> + case 0x1e:
> + case 0x24: {
> uint32_t times;
>
> c->function = i;
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 4/6] target/i386: Add feature dependencies for AVX10
2024-10-28 14:50 ` Paolo Bonzini
@ 2024-10-28 15:08 ` Xiaoyao Li
0 siblings, 0 replies; 29+ messages in thread
From: Xiaoyao Li @ 2024-10-28 15:08 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Tao Su, qemu-devel, mtosatti, xuelian.guo
On 10/28/2024 10:50 PM, Paolo Bonzini wrote:
> On Mon, Oct 28, 2024 at 3:48 PM Xiaoyao Li <xiaoyao.li@intel.com> wrote:
>>
>> On 10/28/2024 6:45 PM, Paolo Bonzini wrote:
>>> On 10/28/24 11:02, Tao Su wrote:
>>>> On Mon, Oct 28, 2024 at 09:45:39AM +0100, Paolo Bonzini wrote:
>>>>> On 10/28/24 03:45, Tao Su wrote:
>>>>>> Since the highest supported vector length for a processor implies that
>>>>>> all lesser vector lengths are also supported, add the dependencies of
>>>>>> the supported vector lengths. If all vector lengths aren't supported,
>>>>>> clear AVX10 enable bit as well.
>>>>>>
>>>>>> Note that the order of AVX10 related dependencies should be kept as:
>>>>>> CPUID_24_0_EBX_AVX10_128 -> CPUID_24_0_EBX_AVX10_256,
>>>>>> CPUID_24_0_EBX_AVX10_256 -> CPUID_24_0_EBX_AVX10_512,
>>>>>> CPUID_24_0_EBX_AVX10_VL_MASK -> CPUID_7_1_EDX_AVX10,
>>>>>
>>>>> I think you need to add a set of dependencies so that avx10 cannot be
>>>>> set,
>>>>> unless all the older AVX features that it's composed of are
>>>>> available. From
>>>>> the manual these are
>>>>>
>>>>> AVX512F, AVX512CD, AVX512VW, AVX512DQ, AVX512_VBMI, AVX512_IFMA,
>>>>> AVX512_VNNI, AVX512_BF16, AVX512_VPOPCNTDQ, AVX512_VBMI2, VAES, GFNI,
>>>>> VPCLMULQDQ, AVX512_BITALG, AVX512_FP16.
>>>>
>>>> Thanks for such a quick review!!
>>>>
>>>> AVX10.1 spec said:
>>>> Intel AVX-512 will continue to be supported on P-core-only processors for
>>>> the foreseeable future to support legacy applications. However, new
>>>> vector
>>>> ISA features will only be added to the Intel AVX10 ISA moving forward.
>>>> While Intel AVX10/512 includes all Intel AVX-512 instructions, it
>>>> important to note that applications compiled to Intel AVX-512 with vector
>>>> length limited to 256 bits are not guaranteed to be compatible on an
>>>> Intel
>>>> AVX10/256 processor.
>>>>
>>>> I.e. AVX10/256 processors will support old AVX-512 instructions
>>>> (limited to 256 bits and enumerated by AVX10) but not set AVX-512 related
>>>> CPUIDs. So, I think we can't add these dependencies…
>>>
>>> Of course you're right about AVX10 in general, you still need to add the
>>> dependency but only for CPUID_24_0_EBX_AVX10_512.
>>
>> What if future E-core processor starts to support AVX10/512 but not
>> enumerating any individual AVX512 bit? (AVX10.1 spec only states the
>> compatibility behavior for P-core-only processors)
>
> KVM and QEMU could always specify the bits.
That will need KVM to report individual AVX512 bits in
KVM_GET_SUPPORTED_CPUID to not break QEMU.
> If you want to ask around
> if this is possible then go ahead.
We will ask internal architects and get back with the answer.
> In the meanwhile I actually can apply Tao Su's patches, since the
> dependencies are merely a safety feature.
Sound reasonable.
> Paolo
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/6] target/i386: add avx10-version property
2024-10-28 2:45 ` [PATCH 2/6] target/i386: add avx10-version property Tao Su
@ 2024-10-28 15:10 ` Xiaoyao Li
2024-10-29 6:14 ` Tao Su
0 siblings, 1 reply; 29+ messages in thread
From: Xiaoyao Li @ 2024-10-28 15:10 UTC (permalink / raw)
To: Tao Su, qemu-devel; +Cc: pbonzini, mtosatti, xuelian.guo
On 10/28/2024 10:45 AM, Tao Su wrote:
> Introduce avx10-version property so that avx10 version can be controlled
> by user and cpu model. Per spec, avx10 version can never be 0, the default
> value of avx10-version is set to 0 to determine whether it is specified by
> user.
I think it's better to merge this patch into next one. It's intact to
show how avx10_version is supposed to work.
> Tested-by: Xuelian Guo <xuelian.guo@intel.com>
> Signed-off-by: Tao Su <tao1.su@linux.intel.com>
> ---
> target/i386/cpu.c | 1 +
> target/i386/cpu.h | 2 ++
> 2 files changed, 3 insertions(+)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index d845ff5e4e..5b434a107a 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -8343,6 +8343,7 @@ static Property x86_cpu_properties[] = {
> DEFINE_PROP_UINT32("min-level", X86CPU, env.cpuid_min_level, 0),
> DEFINE_PROP_UINT32("min-xlevel", X86CPU, env.cpuid_min_xlevel, 0),
> DEFINE_PROP_UINT32("min-xlevel2", X86CPU, env.cpuid_min_xlevel2, 0),
> + DEFINE_PROP_UINT8("avx10-version", X86CPU, env.avx10_version, 0),
> DEFINE_PROP_UINT64("ucode-rev", X86CPU, ucode_rev, 0),
> DEFINE_PROP_BOOL("full-cpuid-auto-level", X86CPU, full_cpuid_auto_level, true),
> DEFINE_PROP_STRING("hv-vendor-id", X86CPU, hyperv_vendor),
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index 280bec701c..d845384dcd 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -1920,6 +1920,8 @@ typedef struct CPUArchState {
> uint32_t cpuid_vendor3;
> uint32_t cpuid_version;
> FeatureWordArray features;
> + /* AVX10 version */
> + uint8_t avx10_version;
> /* Features that were explicitly enabled/disabled */
> FeatureWordArray user_features;
> uint32_t cpuid_model[12];
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/6] target/i386: Add AVX512 state when AVX10 is supported
2024-10-28 2:45 ` [PATCH 1/6] target/i386: Add AVX512 state when AVX10 is supported Tao Su
2024-10-28 8:41 ` Paolo Bonzini
@ 2024-10-28 15:12 ` Xiaoyao Li
1 sibling, 0 replies; 29+ messages in thread
From: Xiaoyao Li @ 2024-10-28 15:12 UTC (permalink / raw)
To: Tao Su, qemu-devel; +Cc: pbonzini, mtosatti, xuelian.guo
On 10/28/2024 10:45 AM, Tao Su wrote:
> AVX10 state enumeration in CPUID leaf D and enabling in XCR0 register
> are identical to AVX512 state regardless of the supported vector lengths.
>
> Given that some E-cores will support AVX10 but not support AVX512, add
> AVX512 state components to guest when AVX10 is enabled.
>
> Tested-by: Xuelian Guo <xuelian.guo@intel.com>
> Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
> target/i386/cpu.c | 14 ++++++++++++++
> target/i386/cpu.h | 2 ++
> 2 files changed, 16 insertions(+)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 1ff1af032e..d845ff5e4e 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -7177,6 +7177,13 @@ static void x86_cpu_reset_hold(Object *obj, ResetType type)
> }
> if (env->features[esa->feature] & esa->bits) {
> xcr0 |= 1ull << i;
> + continue;
> + }
> + if (i == XSTATE_OPMASK_BIT || i == XSTATE_ZMM_Hi256_BIT ||
> + i == XSTATE_Hi16_ZMM_BIT) {
> + if (env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10) {
> + xcr0 |= 1ull << i;
> + }
> }
> }
>
> @@ -7315,6 +7322,13 @@ static void x86_cpu_enable_xsave_components(X86CPU *cpu)
> const ExtSaveArea *esa = &x86_ext_save_areas[i];
> if (env->features[esa->feature] & esa->bits) {
> mask |= (1ULL << i);
> + continue;
> + }
> + if (i == XSTATE_OPMASK_BIT || i == XSTATE_ZMM_Hi256_BIT ||
> + i == XSTATE_Hi16_ZMM_BIT) {
> + if (env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10) {
> + mask |= (1ULL << i);
> + }
> }
> }
>
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index 74886d1580..280bec701c 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -972,6 +972,8 @@ uint64_t x86_cpu_get_supported_feature_word(X86CPU *cpu, FeatureWord w);
> #define CPUID_7_1_EDX_AMX_COMPLEX (1U << 8)
> /* PREFETCHIT0/1 Instructions */
> #define CPUID_7_1_EDX_PREFETCHITI (1U << 14)
> +/* Support for Advanced Vector Extensions 10 */
> +#define CPUID_7_1_EDX_AVX10 (1U << 19)
> /* Flexible return and event delivery (FRED) */
> #define CPUID_7_1_EAX_FRED (1U << 17)
> /* Load into IA32_KERNEL_GS_BASE (LKGS) */
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 3/6] target/i386: Add CPUID.24 leaf for AVX10
2024-10-28 15:04 ` Xiaoyao Li
@ 2024-10-29 6:13 ` Tao Su
0 siblings, 0 replies; 29+ messages in thread
From: Tao Su @ 2024-10-29 6:13 UTC (permalink / raw)
To: Xiaoyao Li; +Cc: qemu-devel, pbonzini, mtosatti, xuelian.guo
On Mon, Oct 28, 2024 at 11:04:07PM +0800, Xiaoyao Li wrote:
> On 10/28/2024 10:45 AM, Tao Su wrote:
> > + case 0x24: {
> > + *eax = 0;
> > + *ebx = 0;
> > + *ecx = 0;
> > + *edx = 0;
> > + if (!(env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10)) {
> > + break;
> > + }
> > +
> > + if (count == 0) {
> > + uint8_t v = kvm_arch_get_supported_cpuid(cs->kvm_state, 0x24,
> > + 0, R_EBX);
> > + if (env->avx10_version && env->avx10_version < v) {
> > + v = env->avx10_version;
> > + }
>
> Here, if user specified avx10_version is >= kvm reported value, it uses
> KVM's reported value silently.
>
> I think it's not good. It'd better to validate if user specified value can
> be satisfied or not, and emit a warning when not. e.g., in
> x86_cpu_filter_features() or in kvm_cpu_realizefn(). Also we can put the
> behavior along with it that "use KVM reported maximum value when
> avx10_version is 0 "
>
> then, here we can simply do
>
> *ebx = env->features[FEAT_24_0_EBX] | env->avx10_version;
Is it necessary to add enforce_cpuid for avx10_version too?
How about checking this in x86_cpu_realizefn, because I see this may be a
more suitable place to implement check_cpuid and enforce_cpuid.
@@ -7816,6 +7810,29 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
x86_cpu_filter_features(cpu, cpu->check_cpuid || cpu->enforce_cpuid);
+ if (env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10) {
+ uint8_t version =
+ kvm_arch_get_supported_cpuid(cs->kvm_state, 0x24, 0, R_EBX);
+
+ if (!env->avx10_version) {
+ env->avx10_version = version;
+ }
+
+ if (version < env->avx10_version) {
+ const char *msg = accel_uses_host_cpuid()
+ ? "Host doesn't support requested feature"
+ : "TCG doesn't support requested feature";
+ if (cpu->enforce_cpuid) {
+ error_setg(&local_err, "%s: avx10.%d", msg,
+ env->avx10_version);
+ goto out;
+ } else if (cpu->check_cpuid) {
+ warn_report("%s: avx10.%d", msg, env->avx10_version);
+ }
+ env->avx10_version = version;
+ }
+ }
+
if (cpu->enforce_cpuid && x86_cpu_have_filtered_features(cpu)) {
error_setg(&local_err,
accel_uses_host_cpuid() ?
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/6] target/i386: add avx10-version property
2024-10-28 15:10 ` Xiaoyao Li
@ 2024-10-29 6:14 ` Tao Su
0 siblings, 0 replies; 29+ messages in thread
From: Tao Su @ 2024-10-29 6:14 UTC (permalink / raw)
To: Xiaoyao Li; +Cc: qemu-devel, pbonzini, mtosatti, xuelian.guo
On Mon, Oct 28, 2024 at 11:10:45PM +0800, Xiaoyao Li wrote:
> On 10/28/2024 10:45 AM, Tao Su wrote:
> > Introduce avx10-version property so that avx10 version can be controlled
> > by user and cpu model. Per spec, avx10 version can never be 0, the default
> > value of avx10-version is set to 0 to determine whether it is specified by
> > user.
>
> I think it's better to merge this patch into next one. It's intact to show
> how avx10_version is supposed to work.
>
Will do in v2, thanks for this suggestion!
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 3/6] target/i386: Add CPUID.24 leaf for AVX10
2024-10-28 2:45 ` [PATCH 3/6] target/i386: Add CPUID.24 leaf for AVX10 Tao Su
2024-10-28 15:04 ` Xiaoyao Li
@ 2024-10-29 8:25 ` Paolo Bonzini
2024-10-29 14:29 ` Tao Su
1 sibling, 1 reply; 29+ messages in thread
From: Paolo Bonzini @ 2024-10-29 8:25 UTC (permalink / raw)
To: Tao Su, qemu-devel; +Cc: mtosatti, xiaoyao.li, xuelian.guo
On 10/28/24 03:45, Tao Su wrote:
> @@ -6835,6 +6850,26 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> }
> break;
> }
> + case 0x24: {
> + *eax = 0;
> + *ebx = 0;
> + *ecx = 0;
> + *edx = 0;
> + if (!(env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10)) {
> + break;
> + }
> +
> + if (count == 0) {
> + uint8_t v = kvm_arch_get_supported_cpuid(cs->kvm_state, 0x24,
> + 0, R_EBX);
> + if (env->avx10_version && env->avx10_version < v) {
> + v = env->avx10_version;
> + }
> +
> + *ebx = env->features[FEAT_24_0_EBX] | v;
> + }
> + break;
> + }
> case 0x40000000:
> /*
> * CPUID code in kvm_arch_init_vcpu() ignores stuff
This check should be done elsewhere (called by x86_cpu_realizefn());
cpu_x86_cpuid() should only report the value:
if ((env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10) && count == 0) {
*ebx = env->features[FEAT_24_0_EBX] | env->avx10_version;
}
Also, the check should use x86_cpu_get_supported_cpuid() because KVM is not the
only accelerator.
>
> + if (env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10) {
> + uint8_t version =
> + kvm_arch_get_supported_cpuid(cs->kvm_state, 0x24, 0, R_EBX);
> +
> + if (!env->avx10_version) {
> + env->avx10_version = version;
> + }
> +
This should not be done here, but in max_x86_cpu_realize(). It should also
use x86_cpu_get_supported_cpuid().
For Granite Rapids you're only setting the AVX10 version in v2 and therefore
you don't need it, but there should also be (for the future) an avx10_version
field in X86CPUDefinition, which is set into the avx10-version property at
x86_cpu_load_model().
> index d845384dcd..5566a13f4f 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -662,6 +662,7 @@ typedef enum FeatureWord {
> FEAT_XSAVE_XSS_HI, /* CPUID[EAX=0xd,ECX=1].EDX */
> FEAT_7_1_EDX, /* CPUID[EAX=7,ECX=1].EDX */
> FEAT_7_2_EDX, /* CPUID[EAX=7,ECX=2].EDX */
> + FEAT_24_0_EBX, /* CPUID[EAX=0x24,ECX=0].EBX */
Adding FEAT_24_0_EBX should be a separate patch.
Paolo
> FEATURE_WORDS,
> } FeatureWord;
>
> @@ -990,6 +991,13 @@ uint64_t x86_cpu_get_supported_feature_word(X86CPU *cpu, FeatureWord w);
> /* Packets which contain IP payload have LIP values */
> #define CPUID_14_0_ECX_LIP (1U << 31)
>
> +/* AVX10 128-bit vector support is present */
> +#define CPUID_24_0_EBX_AVX10_128 (1U << 16)
> +/* AVX10 256-bit vector support is present */
> +#define CPUID_24_0_EBX_AVX10_256 (1U << 17)
> +/* AVX10 512-bit vector support is present */
> +#define CPUID_24_0_EBX_AVX10_512 (1U << 18)
> +
> /* RAS Features */
> #define CPUID_8000_0007_EBX_OVERFLOW_RECOV (1U << 0)
> #define CPUID_8000_0007_EBX_SUCCOR (1U << 1)
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index fd9f198892..8e17942c3b 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -1923,7 +1923,8 @@ static uint32_t kvm_x86_build_cpuid(CPUX86State *env,
> case 0x7:
> case 0x14:
> case 0x1d:
> - case 0x1e: {
> + case 0x1e:
> + case 0x24: {
> uint32_t times;
>
> c->function = i;
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/6] target/i386: Add AVX512 state when AVX10 is supported
2024-10-28 9:25 ` Tao Su
@ 2024-10-29 8:49 ` Paolo Bonzini
2024-10-29 9:29 ` Tao Su
0 siblings, 1 reply; 29+ messages in thread
From: Paolo Bonzini @ 2024-10-29 8:49 UTC (permalink / raw)
To: Tao Su; +Cc: qemu-devel, mtosatti, xiaoyao.li, xuelian.guo
On 10/28/24 10:25, Tao Su wrote:
> On Mon, Oct 28, 2024 at 09:41:14AM +0100, Paolo Bonzini wrote:
>> On 10/28/24 03:45, Tao Su wrote:
>>> AVX10 state enumeration in CPUID leaf D and enabling in XCR0 register
>>> are identical to AVX512 state regardless of the supported vector lengths.
>>>
>>> Given that some E-cores will support AVX10 but not support AVX512, add
>>> AVX512 state components to guest when AVX10 is enabled.
>>>
>>> Tested-by: Xuelian Guo <xuelian.guo@intel.com>
>>> Signed-off-by: Tao Su <tao1.su@linux.intel.com>
>>> ---
>>> target/i386/cpu.c | 14 ++++++++++++++
>>> target/i386/cpu.h | 2 ++
>>> 2 files changed, 16 insertions(+)
>>>
>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>> index 1ff1af032e..d845ff5e4e 100644
>>> --- a/target/i386/cpu.c
>>> +++ b/target/i386/cpu.c
>>> @@ -7177,6 +7177,13 @@ static void x86_cpu_reset_hold(Object *obj, ResetType type)
>>> }
>>> if (env->features[esa->feature] & esa->bits) {
>>> xcr0 |= 1ull << i;
>>> + continue;
>>> + }
>>> + if (i == XSTATE_OPMASK_BIT || i == XSTATE_ZMM_Hi256_BIT ||
>>> + i == XSTATE_Hi16_ZMM_BIT) {
>>
>> Can you confirm that XSTATE_ZMM_Hi256_BIT depends on AVX10 and not
>> AVX10-512?
>>
>
> Sorry, I should attach AVX10.2 spec [*].
>
> In 3.1.3, spec said Intel AVX10 state enumeration in CPUID leaf 0xD and
> enabling in XCR0 register are identical to Intel AVX-512 regardless of the
> maximum vector length supported.
>
> So XSTATE_ZMM_Hi256_BIT doesn't depend on AVX10-512.
>
> [*] https://cdrdv2.intel.com/v1/dl/getContent/828965
Ok, thanks.
Another related issue is that kvm_cpu_xsave_init() is using esa->feature
and esa->bits, which misses these three features.
I think we need to change the code to not look at esa->feature at all.
I'll send a v2 of your series.
Paolo
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/6] target/i386: Add AVX512 state when AVX10 is supported
2024-10-29 8:49 ` Paolo Bonzini
@ 2024-10-29 9:29 ` Tao Su
0 siblings, 0 replies; 29+ messages in thread
From: Tao Su @ 2024-10-29 9:29 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: qemu-devel, mtosatti, xiaoyao.li, xuelian.guo
On Tue, Oct 29, 2024 at 09:49:39AM +0100, Paolo Bonzini wrote:
> On 10/28/24 10:25, Tao Su wrote:
> > On Mon, Oct 28, 2024 at 09:41:14AM +0100, Paolo Bonzini wrote:
> > > On 10/28/24 03:45, Tao Su wrote:
> > > > AVX10 state enumeration in CPUID leaf D and enabling in XCR0 register
> > > > are identical to AVX512 state regardless of the supported vector lengths.
> > > >
> > > > Given that some E-cores will support AVX10 but not support AVX512, add
> > > > AVX512 state components to guest when AVX10 is enabled.
> > > >
> > > > Tested-by: Xuelian Guo <xuelian.guo@intel.com>
> > > > Signed-off-by: Tao Su <tao1.su@linux.intel.com>
> > > > ---
> > > > target/i386/cpu.c | 14 ++++++++++++++
> > > > target/i386/cpu.h | 2 ++
> > > > 2 files changed, 16 insertions(+)
> > > >
> > > > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > > > index 1ff1af032e..d845ff5e4e 100644
> > > > --- a/target/i386/cpu.c
> > > > +++ b/target/i386/cpu.c
> > > > @@ -7177,6 +7177,13 @@ static void x86_cpu_reset_hold(Object *obj, ResetType type)
> > > > }
> > > > if (env->features[esa->feature] & esa->bits) {
> > > > xcr0 |= 1ull << i;
> > > > + continue;
> > > > + }
> > > > + if (i == XSTATE_OPMASK_BIT || i == XSTATE_ZMM_Hi256_BIT ||
> > > > + i == XSTATE_Hi16_ZMM_BIT) {
> > >
> > > Can you confirm that XSTATE_ZMM_Hi256_BIT depends on AVX10 and not
> > > AVX10-512?
> > >
> >
> > Sorry, I should attach AVX10.2 spec [*].
> >
> > In 3.1.3, spec said Intel AVX10 state enumeration in CPUID leaf 0xD and
> > enabling in XCR0 register are identical to Intel AVX-512 regardless of the
> > maximum vector length supported.
> >
> > So XSTATE_ZMM_Hi256_BIT doesn't depend on AVX10-512.
> >
> > [*] https://cdrdv2.intel.com/v1/dl/getContent/828965
>
> Ok, thanks.
>
> Another related issue is that kvm_cpu_xsave_init() is using esa->feature and
> esa->bits, which misses these three features.
Yes, it has issue if AVX512F is not reported but AVX10 is reported, thanks for
pointing out!
>
> I think we need to change the code to not look at esa->feature at all. I'll
> send a v2 of your series.
>
Yes, ExtSaveArea can't set more feature bits, which makes the code a bit ugly.
Looking forward to the better implementation :-)
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 3/6] target/i386: Add CPUID.24 leaf for AVX10
2024-10-29 8:25 ` Paolo Bonzini
@ 2024-10-29 14:29 ` Tao Su
0 siblings, 0 replies; 29+ messages in thread
From: Tao Su @ 2024-10-29 14:29 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: qemu-devel, mtosatti, xiaoyao.li, xuelian.guo
On Tue, Oct 29, 2024 at 09:25:53AM +0100, Paolo Bonzini wrote:
> On 10/28/24 03:45, Tao Su wrote:
> > @@ -6835,6 +6850,26 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> > }
> > break;
> > }
> > + case 0x24: {
> > + *eax = 0;
> > + *ebx = 0;
> > + *ecx = 0;
> > + *edx = 0;
> > + if (!(env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10)) {
> > + break;
> > + }
> > +
> > + if (count == 0) {
> > + uint8_t v = kvm_arch_get_supported_cpuid(cs->kvm_state, 0x24,
> > + 0, R_EBX);
> > + if (env->avx10_version && env->avx10_version < v) {
> > + v = env->avx10_version;
> > + }
> > +
> > + *ebx = env->features[FEAT_24_0_EBX] | v;
> > + }
> > + break;
> > + }
> > case 0x40000000:
> > /*
> > * CPUID code in kvm_arch_init_vcpu() ignores stuff
>
> This check should be done elsewhere (called by x86_cpu_realizefn());
> cpu_x86_cpuid() should only report the value:
>
> if ((env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10) && count == 0) {
> *ebx = env->features[FEAT_24_0_EBX] | env->avx10_version;
> }
>
> Also, the check should use x86_cpu_get_supported_cpuid() because KVM is not the
> only accelerator.
>
Yes, I see, I add check here:
@@ -7679,6 +7719,27 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
x86_cpu_filter_features(cpu, cpu->check_cpuid || cpu->enforce_cpuid);
+ if (env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10) {
+ uint32_t eax, ebx, ecx, edx;
+ x86_cpu_get_supported_cpuid(0x24, 0, &eax, &ebx, &ecx, &edx);
+
+ ebx &= 0xff;
+
+ if (ebx < env->avx10_version) {
+ const char *msg = accel_uses_host_cpuid()
+ ? "Host doesn't support requested feature"
+ : "TCG doesn't support requested feature";
+ if (cpu->enforce_cpuid) {
+ error_setg(&local_err, "%s: avx10.%d", msg,
+ env->avx10_version);
+ goto out;
+ } else if (cpu->check_cpuid) {
+ warn_report("%s: avx10.%d", msg, env->avx10_version);
+ }
+ env->avx10_version = ebx;
+ }
+ }
+
if (cpu->enforce_cpuid && x86_cpu_have_filtered_features(cpu)) {
error_setg(&local_err,
accel_uses_host_cpuid() ?
>
> >
> > + if (env->features[FEAT_7_1_EDX] & CPUID_7_1_EDX_AVX10) {
> > + uint8_t version =
> > + kvm_arch_get_supported_cpuid(cs->kvm_state, 0x24, 0, R_EBX);
> > +
> > + if (!env->avx10_version) {
> > + env->avx10_version = version;
> > + }
> > +
>
> This should not be done here, but in max_x86_cpu_realize(). It should also
> use x86_cpu_get_supported_cpuid().
>
Yes, I try to add here:
@@ -5327,6 +5365,12 @@ static void max_x86_cpu_realize(DeviceState *dev, Error **errp)
}
}
+ if (!object_property_get_uint(obj, "avx10-version", &error_abort)) {
+ uint32_t eax, ebx, ecx, edx;
+ x86_cpu_get_supported_cpuid(0x24, 0, &eax, &ebx, &ecx, &edx);
+ object_property_set_uint(obj, "avx10-version", ebx & 0xff, &error_abort);
+ }
+
x86_cpu_realizefn(dev, errp);
}
> For Granite Rapids you're only setting the AVX10 version in v2 and therefore
> you don't need it, but there should also be (for the future) an avx10_version
> field in X86CPUDefinition, which is set into the avx10-version property at
> x86_cpu_load_model().
>
Yes, for new CPU model, we should do that.
> > index d845384dcd..5566a13f4f 100644
> > --- a/target/i386/cpu.h
> > +++ b/target/i386/cpu.h
> > @@ -662,6 +662,7 @@ typedef enum FeatureWord {
> > FEAT_XSAVE_XSS_HI, /* CPUID[EAX=0xd,ECX=1].EDX */
> > FEAT_7_1_EDX, /* CPUID[EAX=7,ECX=1].EDX */
> > FEAT_7_2_EDX, /* CPUID[EAX=7,ECX=2].EDX */
> > + FEAT_24_0_EBX, /* CPUID[EAX=0x24,ECX=0].EBX */
>
> Adding FEAT_24_0_EBX should be a separate patch.
>
Yes, all you said above are excellent suggestions and I have tested on
my platform. Should I submit a v2 with these changes or wait for you to
send v2 directly? :-)
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 4/6] target/i386: Add feature dependencies for AVX10
2024-10-29 14:47 ` Zhao Liu
@ 2024-10-29 14:36 ` Tao Su
0 siblings, 0 replies; 29+ messages in thread
From: Tao Su @ 2024-10-29 14:36 UTC (permalink / raw)
To: Zhao Liu; +Cc: qemu-devel, pbonzini, mtosatti, xiaoyao.li, xuelian.guo
On Tue, Oct 29, 2024 at 10:47:04PM +0800, Zhao Liu wrote:
> Hi Tao,
>
> On Mon, Oct 28, 2024 at 10:45:10AM +0800, Tao Su wrote:
> > Date: Mon, 28 Oct 2024 10:45:10 +0800
> > From: Tao Su <tao1.su@linux.intel.com>
> > Subject: [PATCH 4/6] target/i386: Add feature dependencies for AVX10
> > X-Mailer: git-send-email 2.34.1
> >
> > Since the highest supported vector length for a processor implies that
> > all lesser vector lengths are also supported, add the dependencies of
> > the supported vector lengths. If all vector lengths aren't supported,
> > clear AVX10 enable bit as well.
> >
> > Note that the order of AVX10 related dependencies should be kept as:
> > CPUID_24_0_EBX_AVX10_128 -> CPUID_24_0_EBX_AVX10_256,
> > CPUID_24_0_EBX_AVX10_256 -> CPUID_24_0_EBX_AVX10_512,
> > CPUID_24_0_EBX_AVX10_VL_MASK -> CPUID_7_1_EDX_AVX10,
> > CPUID_7_1_EDX_AVX10 -> CPUID_24_0_EBX,
> > so that prevent user from setting weird CPUID combinations, e.g. 256-bits
> > and 512-bits are supported but 128-bits is not, no vector lengths are
> > supported but AVX10 enable bit is still set.
> >
> > Since AVX10_128 will be reserved as 1,
>
> Does this means AVX10_128 bit is reserved and it is always 1?
>
> From the spec you linked in cover letter (Table 1-1. CPUID Enumeration
> of Intel® AVX10), it seems AVX10_128 bit is marked as reserved.
>
> It's worth describing its behavior.
Yes, AVX10_128 will be reserved as 1. SDM(Volume 1, Ch16) and AVX10.2
have already described this. I believe AVX10.1 spec (linked in cover
letter) will change the description.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 4/6] target/i386: Add feature dependencies for AVX10
2024-10-28 2:45 ` [PATCH 4/6] target/i386: Add feature dependencies " Tao Su
2024-10-28 8:45 ` Paolo Bonzini
@ 2024-10-29 14:47 ` Zhao Liu
2024-10-29 14:36 ` Tao Su
1 sibling, 1 reply; 29+ messages in thread
From: Zhao Liu @ 2024-10-29 14:47 UTC (permalink / raw)
To: Tao Su; +Cc: qemu-devel, pbonzini, mtosatti, xiaoyao.li, xuelian.guo
Hi Tao,
On Mon, Oct 28, 2024 at 10:45:10AM +0800, Tao Su wrote:
> Date: Mon, 28 Oct 2024 10:45:10 +0800
> From: Tao Su <tao1.su@linux.intel.com>
> Subject: [PATCH 4/6] target/i386: Add feature dependencies for AVX10
> X-Mailer: git-send-email 2.34.1
>
> Since the highest supported vector length for a processor implies that
> all lesser vector lengths are also supported, add the dependencies of
> the supported vector lengths. If all vector lengths aren't supported,
> clear AVX10 enable bit as well.
>
> Note that the order of AVX10 related dependencies should be kept as:
> CPUID_24_0_EBX_AVX10_128 -> CPUID_24_0_EBX_AVX10_256,
> CPUID_24_0_EBX_AVX10_256 -> CPUID_24_0_EBX_AVX10_512,
> CPUID_24_0_EBX_AVX10_VL_MASK -> CPUID_7_1_EDX_AVX10,
> CPUID_7_1_EDX_AVX10 -> CPUID_24_0_EBX,
> so that prevent user from setting weird CPUID combinations, e.g. 256-bits
> and 512-bits are supported but 128-bits is not, no vector lengths are
> supported but AVX10 enable bit is still set.
>
> Since AVX10_128 will be reserved as 1,
Does this means AVX10_128 bit is reserved and it is always 1?
From the spec you linked in cover letter (Table 1-1. CPUID Enumeration
of Intel® AVX10), it seems AVX10_128 bit is marked as reserved.
It's worth describing its behavior.
> adding these dependencies has the
> bonus that when user sets -cpu host,-avx10-128, CPUID_7_1_EDX_AVX10 and
> CPUID_24_0_EBX will be disabled automatically.
>
> Tested-by: Xuelian Guo <xuelian.guo@intel.com>
> Signed-off-by: Tao Su <tao1.su@linux.intel.com>
> ---
> target/i386/cpu.c | 16 ++++++++++++++++
> target/i386/cpu.h | 4 ++++
> 2 files changed, 20 insertions(+)
Otherwise,
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 6/6] target/i386: Introduce GraniteRapids-v2 model
2024-10-28 2:45 ` [PATCH 6/6] target/i386: Introduce GraniteRapids-v2 model Tao Su
@ 2024-10-29 14:58 ` Zhao Liu
2024-10-30 1:28 ` Tao Su
0 siblings, 1 reply; 29+ messages in thread
From: Zhao Liu @ 2024-10-29 14:58 UTC (permalink / raw)
To: Tao Su; +Cc: qemu-devel, pbonzini, mtosatti, xiaoyao.li, xuelian.guo
On Mon, Oct 28, 2024 at 10:45:12AM +0800, Tao Su wrote:
> Date: Mon, 28 Oct 2024 10:45:12 +0800
> From: Tao Su <tao1.su@linux.intel.com>
> Subject: [PATCH 6/6] target/i386: Introduce GraniteRapids-v2 model
> X-Mailer: git-send-email 2.34.1
>
> Update GraniteRapids CPU model to add AVX10 and the missing features(ss,
> tsc-adjust, cldemote, movdiri, movdir64b).
Do you have datasheet link? It's better to add the link in the commit
message for easy comparison checking.
> Tested-by: Xuelian Guo <xuelian.guo@intel.com>
> Signed-off-by: Tao Su <tao1.su@linux.intel.com>
> ---
> target/i386/cpu.c | 17 +++++++++++++++++
> 1 file changed, 17 insertions(+)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index adde98fd26..8d72c08b66 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -4375,6 +4375,23 @@ static const X86CPUDefinition builtin_x86_defs[] = {
> .model_id = "Intel Xeon Processor (GraniteRapids)",
> .versions = (X86CPUVersionDefinition[]) {
> { .version = 1 },
> + {
> + .version = 2,
> + .props = (PropValue[]) {
> + { "ss", "on" },
> + { "tsc-adjust", "on" },
> + { "cldemote", "on" },
> + { "movdiri", "on" },
> + { "movdir64b", "on" },
> + { "avx10", "on" },
> + { "avx10-128", "on" },
> + { "avx10-256", "on" },
> + { "avx10-512", "on" },
> + { "avx10-version", "1" },
> + { "stepping", "1" },
> + { /* end of list */ }
> + }
> + },
> { /* end of list */ },
> },
> },
> --
> 2.34.1
>
LGTM,
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
BTW, Could you please update the CPU model you added in
docs/system/cpu-models-x86.rst.inc (section "Preferred CPU models for
Intel x86 hosts") as well? Although this document has been inactive for
some time, it hasn't been deprecated, and we can pick it up again to
continue updating it, helping QEMU users understand QEMU's support for
x86 CPU/features.
Thanks,
Zhao
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 6/6] target/i386: Introduce GraniteRapids-v2 model
2024-10-29 14:58 ` Zhao Liu
@ 2024-10-30 1:28 ` Tao Su
0 siblings, 0 replies; 29+ messages in thread
From: Tao Su @ 2024-10-30 1:28 UTC (permalink / raw)
To: Zhao Liu; +Cc: qemu-devel, pbonzini, mtosatti, xiaoyao.li, xuelian.guo
On Tue, Oct 29, 2024 at 10:58:39PM +0800, Zhao Liu wrote:
> On Mon, Oct 28, 2024 at 10:45:12AM +0800, Tao Su wrote:
> > Date: Mon, 28 Oct 2024 10:45:12 +0800
> > From: Tao Su <tao1.su@linux.intel.com>
> > Subject: [PATCH 6/6] target/i386: Introduce GraniteRapids-v2 model
> > X-Mailer: git-send-email 2.34.1
> >
> > Update GraniteRapids CPU model to add AVX10 and the missing features(ss,
> > tsc-adjust, cldemote, movdiri, movdir64b).
>
> Do you have datasheet link? It's better to add the link in the commit
> message for easy comparison checking.
>
Sorry, I think we can get new CPUIDs in ISE[*], but as far as I know,
there is no datasheet which lists all CPUIDs.
[*] https://cdrdv2.intel.com/v1/dl/getContent/671368
> > Tested-by: Xuelian Guo <xuelian.guo@intel.com>
> > Signed-off-by: Tao Su <tao1.su@linux.intel.com>
> > ---
> > target/i386/cpu.c | 17 +++++++++++++++++
> > 1 file changed, 17 insertions(+)
> >
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index adde98fd26..8d72c08b66 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -4375,6 +4375,23 @@ static const X86CPUDefinition builtin_x86_defs[] = {
> > .model_id = "Intel Xeon Processor (GraniteRapids)",
> > .versions = (X86CPUVersionDefinition[]) {
> > { .version = 1 },
> > + {
> > + .version = 2,
> > + .props = (PropValue[]) {
> > + { "ss", "on" },
> > + { "tsc-adjust", "on" },
> > + { "cldemote", "on" },
> > + { "movdiri", "on" },
> > + { "movdir64b", "on" },
> > + { "avx10", "on" },
> > + { "avx10-128", "on" },
> > + { "avx10-256", "on" },
> > + { "avx10-512", "on" },
> > + { "avx10-version", "1" },
> > + { "stepping", "1" },
> > + { /* end of list */ }
> > + }
> > + },
> > { /* end of list */ },
> > },
> > },
> > --
> > 2.34.1
> >
>
> LGTM,
>
> Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
>
> BTW, Could you please update the CPU model you added in
> docs/system/cpu-models-x86.rst.inc (section "Preferred CPU models for
> Intel x86 hosts") as well? Although this document has been inactive for
> some time, it hasn't been deprecated, and we can pick it up again to
> continue updating it, helping QEMU users understand QEMU's support for
> x86 CPU/features.
Yes, thanks for this suggestion! I think I can update the doc when I
introduce new CPU models (e.g. upcoming Clearwater Forest).
^ permalink raw reply [flat|nested] 29+ messages in thread
end of thread, other threads:[~2024-10-30 1:33 UTC | newest]
Thread overview: 29+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-28 2:45 [PATCH 0/6] Add AVX10.1 CPUID support and GraniteRapids-v2 model Tao Su
2024-10-28 2:45 ` [PATCH 1/6] target/i386: Add AVX512 state when AVX10 is supported Tao Su
2024-10-28 8:41 ` Paolo Bonzini
2024-10-28 9:25 ` Tao Su
2024-10-29 8:49 ` Paolo Bonzini
2024-10-29 9:29 ` Tao Su
2024-10-28 15:12 ` Xiaoyao Li
2024-10-28 2:45 ` [PATCH 2/6] target/i386: add avx10-version property Tao Su
2024-10-28 15:10 ` Xiaoyao Li
2024-10-29 6:14 ` Tao Su
2024-10-28 2:45 ` [PATCH 3/6] target/i386: Add CPUID.24 leaf for AVX10 Tao Su
2024-10-28 15:04 ` Xiaoyao Li
2024-10-29 6:13 ` Tao Su
2024-10-29 8:25 ` Paolo Bonzini
2024-10-29 14:29 ` Tao Su
2024-10-28 2:45 ` [PATCH 4/6] target/i386: Add feature dependencies " Tao Su
2024-10-28 8:45 ` Paolo Bonzini
2024-10-28 10:02 ` Tao Su
2024-10-28 10:45 ` Paolo Bonzini
2024-10-28 12:23 ` Tao Su
2024-10-28 14:48 ` Xiaoyao Li
2024-10-28 14:50 ` Paolo Bonzini
2024-10-28 15:08 ` Xiaoyao Li
2024-10-29 14:47 ` Zhao Liu
2024-10-29 14:36 ` Tao Su
2024-10-28 2:45 ` [PATCH 5/6] target/i386: Add support for AVX10 in CPUID enumeration Tao Su
2024-10-28 2:45 ` [PATCH 6/6] target/i386: Introduce GraniteRapids-v2 model Tao Su
2024-10-29 14:58 ` Zhao Liu
2024-10-30 1:28 ` Tao Su
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).