qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Igor Mammedov <imammedo@redhat.com>
To: Lei Wang <lei4.wang@intel.com>
Cc: pbonzini@redhat.com, qemu-devel@nongnu.org, dgilbert@redhat.com,
	berrange@redhat.com, xiaoyao.li@intel.com,
	yang.zhong@linux.intel.com
Subject: Re: [PATCH v3 6/6] i386: Add new CPU model SapphireRapids
Date: Thu, 2 Feb 2023 11:40:55 +0100	[thread overview]
Message-ID: <20230202114055.2b8dd9c3@imammedo.users.ipa.redhat.com> (raw)
In-Reply-To: <20230106083826.5384-7-lei4.wang@intel.com>

On Fri,  6 Jan 2023 00:38:26 -0800
Lei Wang <lei4.wang@intel.com> wrote:

> The new CPU model mostly inherits features from Icelake-Server, while
> adding new features:
>  - AMX (Advance Matrix eXtensions)
>  - Bus Lock Debug Exception
> and new instructions:
>  - AVX VNNI (Vector Neural Network Instruction):
>     - VPDPBUS: Multiply and Add Unsigned and Signed Bytes
>     - VPDPBUSDS: Multiply and Add Unsigned and Signed Bytes with Saturation
>     - VPDPWSSD: Multiply and Add Signed Word Integers
>     - VPDPWSSDS: Multiply and Add Signed Integers with Saturation
>  - FP16: Replicates existing AVX512 computational SP (FP32) instructions
>    using FP16 instead of FP32 for ~2X performance gain
>  - SERIALIZE: Provide software with a simple way to force the processor to
>    complete all modifications, faster, allowed in all privilege levels and
>    not causing an unconditional VM exit
>  - TSX Suspend Load Address Tracking: Allows programmers to choose which
>    memory accesses do not need to be tracked in the TSX read set
>  - AVX512_BF16: Vector Neural Network Instructions supporting BFLOAT16
>    inputs and conversion instructions from IEEE single precision

you should mention all new features here, preferably in format:
 feature-flag:  expected CPUID feature bits , so reviewer would be able to find it in spec
 
also you mention that it inherits most of the features from Icelake cpu,
it would be nice to provide cpuid diff between real Icelake and new cpu
somewhere in cover letter to simplify comparison.

> 
> Features may be added in future versions:
>  - CET (virtualization support hasn't been merged)
> Instructions may be added in future versions:
>  - fast zero-length MOVSB (KVM doesn't support yet)
>  - fast short STOSB (KVM doesn't support yet)
>  - fast short CMPSB, SCASB (KVM doesn't support yet)
> 
> Signed-off-by: Lei Wang <lei4.wang@intel.com>
> Reviewed-by: Robert Hoo <robert.hu@linux.intel.com>
> ---
>  target/i386/cpu.c | 135 ++++++++++++++++++++++++++++++++++++++++++++++
>  target/i386/cpu.h |   4 ++
>  2 files changed, 139 insertions(+)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 946df29a3d..5e1ecd50b3 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -3576,6 +3576,141 @@ static const X86CPUDefinition builtin_x86_defs[] = {
>              { /* end of list */ }
>          }
>      },
> +    {
> +        .name = "SapphireRapids",
> +        .level = 0x20,
> +        .vendor = CPUID_VENDOR_INTEL,
> +        .family = 6,
> +        .model = 143,
> +        .stepping = 4,
> +        /*
> +         * please keep the ascending order so that we can have a clear view of
> +         * bit position of each feature.
> +         */

unless you have a way to enforce this comment. I wouldn't expect that would
work in practice.

Also If you wish to bring more order here, you should start with a prerequisite
patch that would do the same to all existing CPU models.
That way one can easily see a difference between Icelake and new cpu model.

As it is in this patch (with all unnecessary reordering) it's hard for reviewer
to see differences.
I'd suggest to just copy Icelake definition and modify it to suit new cpu model
(without reordering all feature bits)  

> +        .features[FEAT_1_EDX] =
> +            CPUID_FP87 | CPUID_VME | CPUID_DE | CPUID_PSE | CPUID_TSC |
> +            CPUID_MSR | CPUID_PAE | CPUID_MCE | CPUID_CX8 | CPUID_APIC |
> +            CPUID_SEP | CPUID_MTRR | CPUID_PGE | CPUID_MCA | CPUID_CMOV |
> +            CPUID_PAT | CPUID_PSE36 | CPUID_CLFLUSH | CPUID_MMX | CPUID_FXSR |
> +            CPUID_SSE | CPUID_SSE2,
> +        .features[FEAT_1_ECX] =
> +            CPUID_EXT_SSE3 | CPUID_EXT_PCLMULQDQ | CPUID_EXT_SSSE3 |
> +            CPUID_EXT_FMA | CPUID_EXT_CX16 | CPUID_EXT_PCID | CPUID_EXT_SSE41 |
> +            CPUID_EXT_SSE42 | CPUID_EXT_X2APIC | CPUID_EXT_MOVBE |
> +            CPUID_EXT_POPCNT | CPUID_EXT_TSC_DEADLINE_TIMER | CPUID_EXT_AES |
> +            CPUID_EXT_XSAVE | CPUID_EXT_AVX | CPUID_EXT_F16C | CPUID_EXT_RDRAND,
> +        .features[FEAT_8000_0001_EDX] =
> +            CPUID_EXT2_SYSCALL | CPUID_EXT2_NX | CPUID_EXT2_PDPE1GB |
> +            CPUID_EXT2_RDTSCP | CPUID_EXT2_LM,
> +        .features[FEAT_8000_0001_ECX] =
> +            CPUID_EXT3_LAHF_LM | CPUID_EXT3_ABM | CPUID_EXT3_3DNOWPREFETCH,
> +        .features[FEAT_8000_0008_EBX] =
> +            CPUID_8000_0008_EBX_WBNOINVD,
> +        .features[FEAT_7_0_EBX] =
> +            CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 | CPUID_7_0_EBX_HLE |
> +            CPUID_7_0_EBX_AVX2 | CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_BMI2 |
> +            CPUID_7_0_EBX_ERMS | CPUID_7_0_EBX_INVPCID | CPUID_7_0_EBX_RTM |
> +            CPUID_7_0_EBX_AVX512F | CPUID_7_0_EBX_AVX512DQ |
> +            CPUID_7_0_EBX_RDSEED | CPUID_7_0_EBX_ADX | CPUID_7_0_EBX_SMAP |
> +            CPUID_7_0_EBX_AVX512IFMA | CPUID_7_0_EBX_CLFLUSHOPT |
> +            CPUID_7_0_EBX_CLWB | CPUID_7_0_EBX_AVX512CD | CPUID_7_0_EBX_SHA_NI |
> +            CPUID_7_0_EBX_AVX512BW | CPUID_7_0_EBX_AVX512VL,
> +        .features[FEAT_7_0_ECX] =
> +            CPUID_7_0_ECX_AVX512_VBMI | CPUID_7_0_ECX_UMIP | CPUID_7_0_ECX_PKU |
> +            CPUID_7_0_ECX_AVX512_VBMI2 | CPUID_7_0_ECX_GFNI |
> +            CPUID_7_0_ECX_VAES | CPUID_7_0_ECX_VPCLMULQDQ |
> +            CPUID_7_0_ECX_AVX512VNNI | CPUID_7_0_ECX_AVX512BITALG |
> +            CPUID_7_0_ECX_AVX512_VPOPCNTDQ | CPUID_7_0_ECX_LA57 |
> +            CPUID_7_0_ECX_RDPID | CPUID_7_0_ECX_BUS_LOCK_DETECT,
> +        .features[FEAT_7_0_EDX] =
> +            CPUID_7_0_EDX_FSRM | CPUID_7_0_EDX_SERIALIZE |
> +            CPUID_7_0_EDX_TSX_LDTRK | CPUID_7_0_EDX_AMX_BF16 |
> +            CPUID_7_0_EDX_AVX512_FP16 | CPUID_7_0_EDX_AMX_TILE |
> +            CPUID_7_0_EDX_AMX_INT8 | CPUID_7_0_EDX_SPEC_CTRL |
> +            CPUID_7_0_EDX_ARCH_CAPABILITIES | CPUID_7_0_EDX_SPEC_CTRL_SSBD,
> +        .features[FEAT_ARCH_CAPABILITIES] =
> +            MSR_ARCH_CAP_RDCL_NO | MSR_ARCH_CAP_IBRS_ALL |
> +            MSR_ARCH_CAP_SKIP_L1DFL_VMENTRY | MSR_ARCH_CAP_MDS_NO |
> +            MSR_ARCH_CAP_PSCHANGE_MC_NO | MSR_ARCH_CAP_TAA_NO,
> +        .features[FEAT_XSAVE] =
> +            CPUID_XSAVE_XSAVEOPT | CPUID_XSAVE_XSAVEC |
> +            CPUID_XSAVE_XGETBV1 | CPUID_XSAVE_XSAVES | CPUID_D_1_EAX_XFD,
> +        .features[FEAT_6_EAX] =
> +            CPUID_6_EAX_ARAT,
> +        .features[FEAT_7_1_EAX] =
> +            CPUID_7_1_EAX_AVX_VNNI | CPUID_7_1_EAX_AVX512_BF16,
> +        .features[FEAT_1D_1_EAX] = INTEL_SPR_AMX_TOTAL_TILE_BYTES |
> +            (INTEL_SPR_AMX_BYTES_PER_TILE << 16),
> +        .features[FEAT_1D_1_EBX] = INTEL_SPR_AMX_BYTES_PER_ROW |
> +            (INTEL_SPR_AMX_TILE_MAX_NAMES << 16),
> +        .features[FEAT_1D_1_ECX] = INTEL_SPR_AMX_TILE_MAX_ROWS,
> +        .features[FEAT_1E_0_EBX] = INTEL_SPR_AMX_TMUL_MAX_K |
> +            (INTEL_SPR_AMX_TMUL_MAX_N << 8),
> +        .features[FEAT_VMX_BASIC] =
> +            MSR_VMX_BASIC_INS_OUTS | MSR_VMX_BASIC_TRUE_CTLS,
> +        .features[FEAT_VMX_ENTRY_CTLS] =
> +            VMX_VM_ENTRY_LOAD_DEBUG_CONTROLS | VMX_VM_ENTRY_IA32E_MODE |
> +            VMX_VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |
> +            VMX_VM_ENTRY_LOAD_IA32_PAT | VMX_VM_ENTRY_LOAD_IA32_EFER,
> +        .features[FEAT_VMX_EPT_VPID_CAPS] =
> +            MSR_VMX_EPT_EXECONLY |
> +            MSR_VMX_EPT_PAGE_WALK_LENGTH_4 | MSR_VMX_EPT_PAGE_WALK_LENGTH_5 |
> +            MSR_VMX_EPT_WB | MSR_VMX_EPT_2MB | MSR_VMX_EPT_1GB |
> +            MSR_VMX_EPT_INVEPT | MSR_VMX_EPT_AD_BITS |
> +            MSR_VMX_EPT_INVEPT_SINGLE_CONTEXT | MSR_VMX_EPT_INVEPT_ALL_CONTEXT |
> +            MSR_VMX_EPT_INVVPID | MSR_VMX_EPT_INVVPID_SINGLE_ADDR |
> +            MSR_VMX_EPT_INVVPID_SINGLE_CONTEXT |
> +            MSR_VMX_EPT_INVVPID_ALL_CONTEXT |
> +            MSR_VMX_EPT_INVVPID_SINGLE_CONTEXT_NOGLOBALS,
> +        .features[FEAT_VMX_EXIT_CTLS] =
> +            VMX_VM_EXIT_SAVE_DEBUG_CONTROLS |
> +            VMX_VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |
> +            VMX_VM_EXIT_ACK_INTR_ON_EXIT | VMX_VM_EXIT_SAVE_IA32_PAT |
> +            VMX_VM_EXIT_LOAD_IA32_PAT | VMX_VM_EXIT_SAVE_IA32_EFER |
> +            VMX_VM_EXIT_LOAD_IA32_EFER | VMX_VM_EXIT_SAVE_VMX_PREEMPTION_TIMER,
> +        .features[FEAT_VMX_MISC] =
> +            MSR_VMX_MISC_STORE_LMA | MSR_VMX_MISC_ACTIVITY_HLT |
> +            MSR_VMX_MISC_VMWRITE_VMEXIT,
> +        .features[FEAT_VMX_PINBASED_CTLS] =
> +            VMX_PIN_BASED_EXT_INTR_MASK | VMX_PIN_BASED_NMI_EXITING |
> +            VMX_PIN_BASED_VIRTUAL_NMIS | VMX_PIN_BASED_VMX_PREEMPTION_TIMER |
> +            VMX_PIN_BASED_POSTED_INTR,
> +        .features[FEAT_VMX_PROCBASED_CTLS] =
> +            VMX_CPU_BASED_VIRTUAL_INTR_PENDING |
> +            VMX_CPU_BASED_USE_TSC_OFFSETING | VMX_CPU_BASED_HLT_EXITING |
> +            VMX_CPU_BASED_INVLPG_EXITING | VMX_CPU_BASED_MWAIT_EXITING |
> +            VMX_CPU_BASED_RDPMC_EXITING | VMX_CPU_BASED_RDTSC_EXITING |
> +            VMX_CPU_BASED_CR3_LOAD_EXITING | VMX_CPU_BASED_CR3_STORE_EXITING |
> +            VMX_CPU_BASED_CR8_LOAD_EXITING | VMX_CPU_BASED_CR8_STORE_EXITING |
> +            VMX_CPU_BASED_TPR_SHADOW | VMX_CPU_BASED_VIRTUAL_NMI_PENDING |
> +            VMX_CPU_BASED_MOV_DR_EXITING | VMX_CPU_BASED_UNCOND_IO_EXITING |
> +            VMX_CPU_BASED_USE_IO_BITMAPS | VMX_CPU_BASED_MONITOR_TRAP_FLAG |
> +            VMX_CPU_BASED_USE_MSR_BITMAPS | VMX_CPU_BASED_MONITOR_EXITING |
> +            VMX_CPU_BASED_PAUSE_EXITING |
> +            VMX_CPU_BASED_ACTIVATE_SECONDARY_CONTROLS,
> +        .features[FEAT_VMX_SECONDARY_CTLS] =
> +            VMX_SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
> +            VMX_SECONDARY_EXEC_ENABLE_EPT | VMX_SECONDARY_EXEC_DESC |
> +            VMX_SECONDARY_EXEC_RDTSCP |
> +            VMX_SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE |
> +            VMX_SECONDARY_EXEC_ENABLE_VPID | VMX_SECONDARY_EXEC_WBINVD_EXITING |
> +            VMX_SECONDARY_EXEC_UNRESTRICTED_GUEST |
> +            VMX_SECONDARY_EXEC_APIC_REGISTER_VIRT |
> +            VMX_SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
> +            VMX_SECONDARY_EXEC_RDRAND_EXITING |
> +            VMX_SECONDARY_EXEC_ENABLE_INVPCID |
> +            VMX_SECONDARY_EXEC_ENABLE_VMFUNC | VMX_SECONDARY_EXEC_SHADOW_VMCS |
> +            VMX_SECONDARY_EXEC_RDSEED_EXITING | VMX_SECONDARY_EXEC_ENABLE_PML |
> +            VMX_SECONDARY_EXEC_XSAVES,
> +        .features[FEAT_VMX_VMFUNC] =
> +            MSR_VMX_VMFUNC_EPT_SWITCHING,
> +        .xlevel = 0x80000008,
> +        .model_id = "Intel Xeon Processor (SapphireRapids)",
> +        .versions = (X86CPUVersionDefinition[]) {
> +            { .version = 1 },
> +            { /* end of list */ },
> +        },
> +    },
>      {
>          .name = "Denverton",
>          .level = 21,
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index 3e3e0cfe59..8361b9f3ff 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -893,10 +893,14 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
>  #define CPUID_7_0_EDX_TSX_LDTRK         (1U << 16)
>  /* Architectural LBRs */
>  #define CPUID_7_0_EDX_ARCH_LBR          (1U << 19)
> +/* AMX_BF16 instruction */
> +#define CPUID_7_0_EDX_AMX_BF16          (1U << 22)
>  /* AVX512_FP16 instruction */
>  #define CPUID_7_0_EDX_AVX512_FP16       (1U << 23)
>  /* AMX tile (two-dimensional register) */
>  #define CPUID_7_0_EDX_AMX_TILE          (1U << 24)
> +/* AMX_INT8 instruction */
> +#define CPUID_7_0_EDX_AMX_INT8          (1U << 25)
>  /* Speculation Control */
>  #define CPUID_7_0_EDX_SPEC_CTRL         (1U << 26)
>  /* Single Thread Indirect Branch Predictors */



  reply	other threads:[~2023-02-02 10:41 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-06  8:38 [PATCH v3 0/6] Support for new CPU model SapphireRapids Lei Wang
2023-01-06  8:38 ` [PATCH v3 1/6] i386: Introduce FeatureWordInfo for AMX CPUID leaf 0x1D and 0x1E Lei Wang
2023-02-06  7:45   ` Yuan Yao
2023-01-06  8:38 ` [PATCH v3 2/6] i386: Remove unused parameter "uint32_t bit" in feature_word_description() Lei Wang
2023-01-27 13:29   ` Igor Mammedov
2023-02-08 14:33   ` Xiaoyao Li
2023-01-06  8:38 ` [PATCH v3 3/6] i386: Introduce new struct "MultiBitFeatureInfo" for multi-bit features Lei Wang
2023-02-08 14:39   ` Xiaoyao Li
2023-01-06  8:38 ` [PATCH v3 4/6] i386: Mask and report unavailable multi-bit feature values Lei Wang
2023-02-06  7:43   ` Yuan Yao
2023-02-09  1:04     ` Wang, Lei
2023-02-09  3:29       ` Xiaoyao Li
2023-02-09  4:21         ` Wang, Lei
2023-02-09  5:59           ` Xiaoyao Li
2023-02-09  6:15             ` Wang, Lei
2023-02-09  9:26               ` Xiaoyao Li
2023-01-06  8:38 ` [PATCH v3 5/6] i386: Initialize AMX CPUID leaves with corresponding env->features[] leaves Lei Wang
2023-01-06  8:38 ` [PATCH v3 6/6] i386: Add new CPU model SapphireRapids Lei Wang
2023-02-02 10:40   ` Igor Mammedov [this message]
2023-02-03  6:02     ` Wang, Lei
2023-02-02 11:05 ` [PATCH v3 0/6] Support for " Igor Mammedov
2023-02-07  2:50   ` Wang, Lei
2023-03-06 12:49     ` Igor Mammedov
2023-02-08 14:53   ` Xiaoyao Li
2023-03-02 14:49   ` Robert Hoo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230202114055.2b8dd9c3@imammedo.users.ipa.redhat.com \
    --to=imammedo@redhat.com \
    --cc=berrange@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=lei4.wang@intel.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=xiaoyao.li@intel.com \
    --cc=yang.zhong@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).