Message-ID: <7e8deb37-4521-090a-cc77-83ece4e3aa19@intel.com>
Date: Tue, 10 Oct 2023 13:29:58 +0800
Subject: Re: [PATCH v2 08/58] i386/tdx: Adjust the supported CPUID based on TDX restrictions
From: Xiaoyao Li
To: Tina Zhang, Paolo Bonzini, Richard Henderson, "Michael S. Tsirkin",
 Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu, David Hildenbrand,
 Philippe Mathieu-Daudé, Daniel P. Berrangé, Cornelia Huck, Eric Blake,
 Markus Armbruster, Marcelo Tosatti, Gerd Hoffmann
Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org, Eduardo Habkost,
 Laszlo Ersek, Isaku Yamahata, erdemaktas@google.com, Chenyi Qiang
References: <20230818095041.1973309-1-xiaoyao.li@intel.com>
 <20230818095041.1973309-9-xiaoyao.li@intel.com>
 <2175b694-c21f-464e-afee-b9ee9da154c1@intel.com>
In-Reply-To: <2175b694-c21f-464e-afee-b9ee9da154c1@intel.com>

On 10/10/2023 9:02 AM, Tina Zhang wrote:
> Hi,
>
> On 8/18/23 17:49, Xiaoyao Li wrote:
>> According to Chapter "CPUID Virtualization" in TDX module spec, CPUID
>> bits of TD can be classified into 6 types:
>>
>> ------------------------------------------------------------------------
>> 1 | As configured | configurable by VMM, independent of native value;
>> ------------------------------------------------------------------------
>> 2 | As configured | configurable by VMM if the bit is supported natively
>>     (if native)   | Otherwise it equals as native(0).
>> ------------------------------------------------------------------------
>> 3 | Fixed         | fixed to 0/1
>> ------------------------------------------------------------------------
>> 4 | Native        | reflect the native value
>> ------------------------------------------------------------------------
>> 5 | Calculated    | calculated by TDX module.
>> ------------------------------------------------------------------------
>> 6 | Inducing #VE  | get #VE exception
>> ------------------------------------------------------------------------
>>
>> Note:
>> 1. All the configurable XFAM related features and TD attributes related
>>    features fall into type #2. And fixed0/1 bits of XFAM and TD
>>    attributes fall into type #3.
>>
>> 2. For CPUID leaves not listed in "CPUID virtualization Overview" table
>>    in TDX module spec, TDX module injects #VE to TDs when those are
>>    queried. For this case, TDs can request CPUID emulation from VMM via
>>    TDVMCALL and the values are fully controlled by VMM.
>>
>> Due to TDX module has its own virtualization policy on CPUID bits, it leads
>> to what reported via KVM_GET_SUPPORTED_CPUID diverges from the supported
>> CPUID bits for TDs. In order to keep a consistent CPUID configuration
>> between VMM and TDs. Adjust supported CPUID for TDs based on TDX
>> restrictions.
>>
>> Currently only focus on the CPUID leaves recognized by QEMU's
>> feature_word_info[] that are indexed by a FeatureWord.
>>
>> Introduce a TDX CPUID lookup table, which maintains 1 entry for each
>> FeatureWord. Each entry has below fields:
>>
>>   - tdx_fixed0/1: The bits that are fixed as 0/1;
>>
>>   - vmm_fixup:   The bits that are configurable from the view of TDX module.
>>                  But they requires emulation of VMM when they are configured
>>                  as enabled. For those, they are not supported if VMM doesn't
>>                  report them as supported. So they need be fixed up by
>>                  checking if VMM supports them.
>>
>>   - inducing_ve: TD gets #VE when querying this CPUID leaf. The result is
>>                  totally configurable by VMM.
>>
>>   - supported_on_ve: It's valid only when @inducing_ve is true. It represents
>>                  the maximum feature set supported that be emulated
>>                  for TDs.
>>
>> By applying TDX CPUID lookup table and TDX capabilities reported from
>> TDX module, the supported CPUID for TDs can be obtained from following
>> steps:
>>
>> - get the base of VMM supported feature set;
>>
>> - if the leaf is not a FeatureWord just return VMM's value without
>>    modification;
>>
>> - if the leaf is an inducing_ve type, applying supported_on_ve mask and
>>    return;
>>
>> - include all native bits, it covers type #2, #4, and parts of type #1.
>>    (it also includes some unsupported bits. The following step will
>>     correct it.)
>>
>> - apply fixed0/1 to it (it covers #3, and rectifies the previous step);
>>
>> - add configurable bits (it covers the other part of type #1);
>>
>> - fix the ones in vmm_fixup;
>>
>> - filter the one has valid .supported field;
>>
>> (Calculated type is ignored since it's determined at runtime).
>>
>> Co-developed-by: Chenyi Qiang
>> Signed-off-by: Chenyi Qiang
>> Signed-off-by: Xiaoyao Li
>> ---
>>   target/i386/cpu.h     |  16 +++
>>   target/i386/kvm/kvm.c |   4 +
>>   target/i386/kvm/tdx.c | 254 ++++++++++++++++++++++++++++++++++++++++++
>>   target/i386/kvm/tdx.h |   2 +
>>   4 files changed, 276 insertions(+)
>>
>> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
>> index e0771a10433b..c93dcd274531 100644
>> --- a/target/i386/cpu.h
>> +++ b/target/i386/cpu.h
>> @@ -780,6 +780,8 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
>>   /* Support RDFSBASE/RDGSBASE/WRFSBASE/WRGSBASE */
>>   #define CPUID_7_0_EBX_FSGSBASE          (1U << 0)
>> +/* Support for TSC adjustment MSR 0x3B */
>> +#define CPUID_7_0_EBX_TSC_ADJUST        (1U << 1)
>>   /* Support SGX */
>>   #define CPUID_7_0_EBX_SGX               (1U << 2)
>>   /* 1st Group of Advanced Bit Manipulation Extensions */
>> @@ -798,8 +800,12 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
>>   #define CPUID_7_0_EBX_INVPCID           (1U << 10)
>>   /* Restricted Transactional Memory */
>>   #define CPUID_7_0_EBX_RTM               (1U << 11)
>> +/* Cache QoS Monitoring */
>> +#define CPUID_7_0_EBX_PQM               (1U << 12)
>>   /* Memory Protection Extension */
>>   #define CPUID_7_0_EBX_MPX               (1U << 14)
>> +/* Resource Director Technology Allocation */
>> +#define CPUID_7_0_EBX_RDT_A             (1U << 15)
>>   /* AVX-512 Foundation */
>>   #define CPUID_7_0_EBX_AVX512F           (1U << 16)
>>   /* AVX-512 Doubleword & Quadword Instruction */
>> @@ -855,10 +861,16 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
>>   #define CPUID_7_0_ECX_AVX512VNNI        (1U << 11)
>>   /* Support for VPOPCNT[B,W] and VPSHUFBITQMB */
>>   #define CPUID_7_0_ECX_AVX512BITALG      (1U << 12)
>> +/* Intel Total Memory Encryption */
>> +#define CPUID_7_0_ECX_TME               (1U << 13)
>>   /* POPCNT for vectors of DW/QW */
>>   #define CPUID_7_0_ECX_AVX512_VPOPCNTDQ  (1U << 14)
>> +/* Placeholder for bit 15 */
>> +#define CPUID_7_0_ECX_FZM               (1U << 15)
>>   /* 5-level Page Tables */
>>   #define CPUID_7_0_ECX_LA57              (1U << 16)
>> +/* MAWAU for MPX */
>> +#define CPUID_7_0_ECX_MAWAU             (31U << 17)
>>   /* Read Processor ID */
>>   #define CPUID_7_0_ECX_RDPID             (1U << 22)
>>   /* Bus Lock Debug Exception */
>> @@ -869,6 +881,8 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
>>   #define CPUID_7_0_ECX_MOVDIRI           (1U << 27)
>>   /* Move 64 Bytes as Direct Store Instruction */
>>   #define CPUID_7_0_ECX_MOVDIR64B         (1U << 28)
>> +/* ENQCMD and ENQCMDS instructions */
>> +#define CPUID_7_0_ECX_ENQCMD            (1U << 29)
>>   /* Support SGX Launch Control */
>>   #define CPUID_7_0_ECX_SGX_LC            (1U << 30)
>>   /* Protection Keys for Supervisor-mode Pages */
>> @@ -886,6 +900,8 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
>>   #define CPUID_7_0_EDX_SERIALIZE         (1U << 14)
>>   /* TSX Suspend Load Address Tracking instruction */
>>   #define CPUID_7_0_EDX_TSX_LDTRK         (1U << 16)
>> +/* PCONFIG instruction */
>> +#define CPUID_7_0_EDX_PCONFIG           (1U << 18)
>>   /* Architectural LBRs */
>>   #define CPUID_7_0_EDX_ARCH_LBR          (1U << 19)
>>   /* AMX_BF16 instruction */
>> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
>> index ec5c07bffd38..46a455a1e331 100644
>> --- a/target/i386/kvm/kvm.c
>> +++ b/target/i386/kvm/kvm.c
>> @@ -539,6 +539,10 @@ uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function,
>>           ret |= 1U << KVM_HINTS_REALTIME;
>>       }
>> +    if (is_tdx_vm()) {
>> +        tdx_get_supported_cpuid(function, index, reg, &ret);
>> +    }
>> +
>>       return ret;
>>   }
>> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
>> index 56cb826f6125..3198bc9fd5fb 100644
>> --- a/target/i386/kvm/tdx.c
>> +++ b/target/i386/kvm/tdx.c
>> @@ -15,11 +15,129 @@
>>   #include "qemu/error-report.h"
>>   #include "qapi/error.h"
>>   #include "qom/object_interfaces.h"
>> +#include "standard-headers/asm-x86/kvm_para.h"
>>   #include "sysemu/kvm.h"
>> +#include "sysemu/sysemu.h"
>>   #include "hw/i386/x86.h"
>>   #include "kvm_i386.h"
>>   #include "tdx.h"
>> +#include "../cpu-internal.h"
>> +
>> +#define TDX_SUPPORTED_KVM_FEATURES  ((1U << KVM_FEATURE_NOP_IO_DELAY) | \
>> +                                     (1U << KVM_FEATURE_PV_UNHALT) | \
>> +                                     (1U << KVM_FEATURE_PV_TLB_FLUSH) | \
>> +                                     (1U << KVM_FEATURE_PV_SEND_IPI) | \
>> +                                     (1U << KVM_FEATURE_POLL_CONTROL) | \
>> +                                     (1U << KVM_FEATURE_PV_SCHED_YIELD) | \
>> +                                     (1U << KVM_FEATURE_MSI_EXT_DEST_ID))
>> +
>> +typedef struct KvmTdxCpuidLookup {
>> +    uint32_t tdx_fixed0;
>> +    uint32_t tdx_fixed1;
>> +
>> +    /*
>> +     * The CPUID bits that are configurable from the view of TDX module
>> +     * but require VMM emulation if configured to enabled by VMM.
>> +     *
>> +     * For those bits, they cannot be enabled actually if VMM (KVM/QEMU) cannot
>> +     * virtualize them.
>> +     */
>> +    uint32_t vmm_fixup;
>> +
>> +    bool inducing_ve;
>> +    /*
>> +     * The maximum supported feature set for given inducing-#VE leaf.
>> +     * It's valid only when .inducing_ve is true.
>> +     */
>> +    uint32_t supported_on_ve;
>> +} KvmTdxCpuidLookup;
>> +
>> + /*
>> +  * QEMU maintained TDX CPUID lookup tables, which reflects how CPUIDs are
>> +  * virtualized for guest TDs based on "CPUID virtualization" of TDX spec.
>> +  *
>> +  * Note:
>> +  *
>> +  * This table will be updated runtime by tdx_caps reported by platform.
>> +  *
>> +  */
>> +static KvmTdxCpuidLookup tdx_cpuid_lookup[FEATURE_WORDS] = {
>> +    [FEAT_1_EDX] = {
>> +        .tdx_fixed0 =
>> +            BIT(10) /* Reserved */ | BIT(20) /* Reserved */ | CPUID_IA64,
>> +        .tdx_fixed1 =
>> +            CPUID_MSR | CPUID_PAE | CPUID_MCE | CPUID_APIC |
>> +            CPUID_MTRR | CPUID_MCA | CPUID_CLFLUSH | CPUID_DTS,
>> +        .vmm_fixup =
>> +            CPUID_ACPI | CPUID_PBE,
> CPUID_HT might also be needed here, as it's disabled by QEMU when TD
> guest only has a single processor core.

Adding CPUID_HT here does not seem to be the correct fix. The root cause is
that CPUID_HT is wrongly treated as an auto_enabled bit; I will send a fix
separately.

> Regards,
> -Tina
>
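
For readers following the thread, the mask composition described in the
quoted commit message can be sketched roughly as below. This is only an
illustration of the listed steps, not the code from the patch: the names
TdxCpuidEntry and td_supported_word are hypothetical, the native and
configurable masks are passed in as plain arguments rather than read from
the host or from tdx_caps, the "not a FeatureWord" early return is assumed
to be handled by the caller, and the final ".supported" filtering step is
left out because the quoted text does not show how it is done.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in mirroring the fields of the quoted KvmTdxCpuidLookup. */
typedef struct TdxCpuidEntry {
    uint32_t tdx_fixed0;       /* bits fixed to 0 */
    uint32_t tdx_fixed1;       /* bits fixed to 1 */
    uint32_t vmm_fixup;        /* configurable bits that need VMM emulation */
    bool inducing_ve;          /* leaf causes #VE in the TD */
    uint32_t supported_on_ve;  /* valid only when inducing_ve is true */
} TdxCpuidEntry;

/*
 * Compose the supported bits of one feature word for a TD.
 *   vmm          - what the VMM reports as supported (the base value)
 *   native       - the host's native CPUID value (assumed input)
 *   configurable - bits the TDX module reports as configurable (assumed input)
 */
uint32_t td_supported_word(const TdxCpuidEntry *e, uint32_t vmm,
                           uint32_t native, uint32_t configurable)
{
    uint32_t ret;

    /* A bit cannot be fixed to both 0 and 1. */
    assert((e->tdx_fixed0 & e->tdx_fixed1) == 0);

    /* Inducing-#VE leaf: the VMM decides, limited by supported_on_ve. */
    if (e->inducing_ve) {
        return vmm & e->supported_on_ve;
    }

    /* Start from the VMM value and include all native bits. */
    ret = vmm | native;

    /* Apply fixed0/fixed1; this also drops native bits that are fixed to 0. */
    ret &= ~e->tdx_fixed0;
    ret |= e->tdx_fixed1;

    /* Add bits that are configurable regardless of the native value. */
    ret |= configurable;

    /* Drop vmm_fixup bits that the VMM itself does not support. */
    ret &= ~(e->vmm_fixup & ~vmm);

    return ret;
}
```

In this sketch a bit listed in vmm_fixup survives only if it was already
set in the VMM-reported value, which is the behaviour the commit message
describes for bits that require VMM emulation.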