public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: Adrian Hunter <adrian.hunter@intel.com>
To: Xiaoyao Li <xiaoyao.li@intel.com>, <pbonzini@redhat.com>,
	<seanjc@google.com>
Cc: <kvm@vger.kernel.org>, <rick.p.edgecombe@intel.com>,
	<kai.huang@intel.com>, <reinette.chatre@intel.com>,
	<tony.lindgren@linux.intel.com>, <binbin.wu@linux.intel.com>,
	<dmatlack@google.com>, <isaku.yamahata@intel.com>,
	<nik.borisov@suse.com>, <linux-kernel@vger.kernel.org>,
	<yan.y.zhao@intel.com>, <chao.gao@intel.com>,
	<weijiang.yang@intel.com>
Subject: Re: [PATCH V2 07/12] KVM: TDX: restore host xsave state when exit from the guest TD
Date: Thu, 27 Feb 2025 16:29:13 +0200	[thread overview]
Message-ID: <bac63116-916f-46f8-ae5b-945b43d181bc@intel.com> (raw)
In-Reply-To: <5cbb181c-f226-4d50-8b92-04775e8b65e0@intel.com>

On 25/02/25 08:43, Xiaoyao Li wrote:
> On 1/29/2025 5:58 PM, Adrian Hunter wrote:
>> From: Isaku Yamahata <isaku.yamahata@intel.com>
>>
>> On exiting from the guest TD, xsave state is clobbered.  Restore xsave
>> state on TD exit.
>>
>> Set up guest state so that existing kvm_load_host_xsave_state() can be
>> used. Do not allow VCPU entry if guest state conflicts with the TD's
>> configuration.
>>
>> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
>> ---
>> TD vcpu enter/exit v2:
>>   - Drop PT and CET feature flags (Chao)
>>   - Use cpu_feature_enabled() instead of static_cpu_has() (Chao)
>>   - Restore PKRU only if the host value differs from defined
>>     exit value (Chao)
>>   - Use defined masks to separate XFAM bits into XCR0/XSS (Adrian)
>>   - Use existing kvm_load_host_xsave_state() in place of
>>     tdx_restore_host_xsave_state() by defining guest CR4, XCR0, XSS
>>     and PKRU (Sean)
>>   - Do not enter if vital guest state is invalid (Adrian)
>>
>> TD vcpu enter/exit v1:
>>   - Remove noinstr on tdx_vcpu_enter_exit() (Sean)
>>   - Switch to kvm_host struct for xcr0 and xss
>>
>> v19:
>>   - Add EXPORT_SYMBOL_GPL(host_xcr0)
>>
>> v15 -> v16:
>>   - Added CET flag mask
>> ---
>>   arch/x86/kvm/vmx/tdx.c | 72 ++++++++++++++++++++++++++++++++++++++----
>>   1 file changed, 66 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
>> index 3f3d61935a58..e4355553569a 100644
>> --- a/arch/x86/kvm/vmx/tdx.c
>> +++ b/arch/x86/kvm/vmx/tdx.c
>> @@ -83,16 +83,21 @@ static u64 tdx_get_supported_attrs(const struct tdx_sys_info_td_conf *td_conf)
>>       return val;
>>   }
>>   +/*
>> + * Before returning from TDH.VP.ENTER, the TDX Module assigns:
>> + *   XCR0 to the TD’s user-mode feature bits of XFAM (bits 7:0, 9, 18:17)
>> + *   IA32_XSS to the TD's supervisor-mode feature bits of XFAM (bits 8, 16:10)
>> + */
>> +#define TDX_XFAM_XCR0_MASK    (GENMASK(7, 0) | BIT(9) | GENMASK(18, 17))
>> +#define TDX_XFAM_XSS_MASK    (BIT(8) | GENMASK(16, 10))
>> +#define TDX_XFAM_MASK        (TDX_XFAM_XCR0_MASK | TDX_XFAM_XSS_MASK)
> 
> No need to define TDX-specific mask for XFAM. kernel defines XFEATURE_MASK_* and you can define XFEATURE_XCR0_MASK and XFEATURE_XSS_MASK with XFEATURE_MASK_*.

Curently, those masks are being added only when the (host) kernel uses them.

> 
>>   static u64 tdx_get_supported_xfam(const struct tdx_sys_info_td_conf *td_conf)
>>   {
>>       u64 val = kvm_caps.supported_xcr0 | kvm_caps.supported_xss;
>>   -    /*
>> -     * PT and CET can be exposed to TD guest regardless of KVM's XSS, PT
>> -     * and, CET support.
>> -     */
>> -    val |= XFEATURE_MASK_PT | XFEATURE_MASK_CET_USER |
>> -           XFEATURE_MASK_CET_KERNEL;
>> +    /* Ensure features are in the masks */
>> +    val &= TDX_XFAM_MASK;
> 
> It's unncessary.
> 
> kvm_caps.supported_xcr0 | kvm_caps.supported_xss must be the subset of TDX_XFAM_MASK;

The code has changed in kvm-coco-queue.

> 
>>       if ((val & td_conf->xfam_fixed1) != td_conf->xfam_fixed1)
>>           return 0;
>> @@ -724,6 +729,19 @@ int tdx_vcpu_pre_run(struct kvm_vcpu *vcpu)
>>       return 1;
>>   }
>>   +static bool tdx_guest_state_is_invalid(struct kvm_vcpu *vcpu)
>> +{
>> +    struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm);
>> +
>> +    return vcpu->arch.xcr0 != (kvm_tdx->xfam & TDX_XFAM_XCR0_MASK) ||
>> +           vcpu->arch.ia32_xss != (kvm_tdx->xfam & TDX_XFAM_XSS_MASK) ||
>> +           vcpu->arch.pkru ||
>> +           (cpu_feature_enabled(X86_FEATURE_XSAVE) &&
>> +        !kvm_is_cr4_bit_set(vcpu, X86_CR4_OSXSAVE)) ||
>> +           (cpu_feature_enabled(X86_FEATURE_XSAVES) &&
>> +        !guest_cpu_cap_has(vcpu, X86_FEATURE_XSAVES));
> 
> guest_cpu_cap_has() is better to put into the path when userspace configures the vcpu model. So that KVM can return error to userspace earlier before running the vcpu.

The purpose of the function is to protect host state from being
restored incorrectly, which is why it is called before tdx_vcpu_enter_exit().
i.e. it is protecting against unexpected changes to guest state information
that do not match TD configuration.  That is largely because the KVM code base
is quite complex and the consequences of messing up host state are dire.
guest_cpu_cap_has() is very quick, so there is not really any reason
not to use it here.

> 
>> +}> +
>>   static noinstr void tdx_vcpu_enter_exit(struct kvm_vcpu *vcpu)
>>   {
>>       struct vcpu_tdx *tdx = to_tdx(vcpu);
>> @@ -740,6 +758,8 @@ static noinstr void tdx_vcpu_enter_exit(struct kvm_vcpu *vcpu)
>>     fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu, bool force_immediate_exit)
>>   {
>> +    struct vcpu_tdx *tdx = to_tdx(vcpu);
>> +
>>       /*
>>        * force_immediate_exit requires vCPU entering for events injection with
>>        * an immediately exit followed. But The TDX module doesn't guarantee
>> @@ -750,10 +770,22 @@ fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu, bool force_immediate_exit)
>>        */
>>       WARN_ON_ONCE(force_immediate_exit);
>>   +    if (WARN_ON_ONCE(tdx_guest_state_is_invalid(vcpu))) {
>> +        /*
>> +         * Invalid exit_reason becomes KVM_EXIT_INTERNAL_ERROR, refer
>> +         * tdx_handle_exit().
>> +         */
>> +        tdx->vt.exit_reason.full = -1u;
>> +        tdx->vp_enter_ret = -1u;
>> +        return EXIT_FASTPATH_NONE;
>> +    }
>> +
>>       trace_kvm_entry(vcpu, force_immediate_exit);
>>         tdx_vcpu_enter_exit(vcpu);
>>   +    kvm_load_host_xsave_state(vcpu);
>> +
>>       vcpu->arch.regs_avail &= ~TDX_REGS_UNSUPPORTED_SET;
>>         trace_kvm_exit(vcpu, KVM_ISA_VMX);
>> @@ -1878,9 +1910,23 @@ static int tdx_vcpu_get_cpuid(struct kvm_vcpu *vcpu, struct kvm_tdx_cmd *cmd)
>>       return r;
>>   }
>>   +static u64 tdx_guest_cr0(struct kvm_vcpu *vcpu, u64 cr4)
>> +{
>> +    u64 cr0 = ~CR0_RESERVED_BITS;
>> +
>> +    if (cr4 & X86_CR4_CET)
>> +        cr0 |= X86_CR0_WP;
>> +
>> +    cr0 |= X86_CR0_PE | X86_CR0_NE;
>> +    cr0 &= ~(X86_CR0_NW | X86_CR0_CD);
>> +
>> +    return cr0;
>> +}
>> +
>>   static int tdx_vcpu_init(struct kvm_vcpu *vcpu, struct kvm_tdx_cmd *cmd)
>>   {
>>       u64 apic_base;
>> +    struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm);
>>       struct vcpu_tdx *tdx = to_tdx(vcpu);
>>       int ret;
>>   @@ -1903,6 +1949,20 @@ static int tdx_vcpu_init(struct kvm_vcpu *vcpu, struct kvm_tdx_cmd *cmd)
>>       if (ret)
>>           return ret;
>>   +    vcpu->arch.cr4 = ~vcpu->arch.cr4_guest_rsvd_bits;
> 
> when userspace doesn't configure XFEATURE_MASK_PKRU in XFAM, it seems kvm_load_host_xsave_state() will skip restore host's PKRU value.

That's correct.

> 
> Besides, vcpu->arch.cr4_guest_rsvd_bits depends on KVM_SET_CPUID*, we need enfore the dependency that tdx_vcpu_init() needs to be called after vcpu model is configured.

Sean had some code to freeze the CPU model, hence the subsequent TODO.
refer https://lore.kernel.org/all/Z4FZKOzXIdhLOlU8@google.com/
Anything that matters will be caught by tdx_guest_state_is_invalid().

> 
>> +    vcpu->arch.cr0 = tdx_guest_cr0(vcpu, vcpu->arch.cr4);
>> +    /*
>> +     * On return from VP.ENTER, the TDX Module sets XCR0 and XSS to the
>> +     * maximal values supported by the guest, and zeroes PKRU, so from
>> +     * KVM's perspective, those are the guest's values at all times.
>> +     */
> 
> It's better to also call out that this is only to make kvm_load_host_xsave_state() work for TDX. They don't represent the real guest value.

It is to provide a reasonable value not just for
kvm_load_host_xsave_state().

Note Sean wrote the comment.

This patch set is owned by Paolo now, so it is up to him.

> 
>> +    vcpu->arch.ia32_xss = kvm_tdx->xfam & TDX_XFAM_XSS_MASK;
>> +    vcpu->arch.xcr0 = kvm_tdx->xfam & TDX_XFAM_XCR0_MASK;
>> +    vcpu->arch.pkru = 0;
>> +
>> +    /* TODO: freeze vCPU model before kvm_update_cpuid_runtime() */
>> +    kvm_update_cpuid_runtime(vcpu);
>> +
>>       tdx->state = VCPU_TD_STATE_INITIALIZED;
>>         return 0;
> 


  reply	other threads:[~2025-02-27 14:29 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-29  9:58 [PATCH V2 00/12] KVM: TDX: TD vcpu enter/exit Adrian Hunter
2025-01-29  9:58 ` [PATCH V2 01/12] x86/virt/tdx: Make tdh_vp_enter() noinstr Adrian Hunter
2025-02-16 18:26   ` Paolo Bonzini
2025-02-27 14:13     ` Adrian Hunter
2025-01-29  9:58 ` [PATCH V2 02/12] KVM: x86: Allow the use of kvm_load_host_xsave_state() with guest_state_protected Adrian Hunter
2025-02-20 10:50   ` Xiaoyao Li
2025-02-24 11:38     ` Adrian Hunter
2025-02-25  5:56       ` Xiaoyao Li
2025-02-27 14:14         ` Adrian Hunter
2025-03-06 18:04     ` Paolo Bonzini
2025-03-06 20:43       ` Sean Christopherson
2025-03-06 22:34         ` Paolo Bonzini
2025-03-07 23:04           ` Sean Christopherson
2025-03-10 19:08             ` Paolo Bonzini
2025-01-29  9:58 ` [PATCH V2 03/12] KVM: TDX: Set arch.has_protected_state to true Adrian Hunter
2025-02-20 12:35   ` Xiaoyao Li
2025-02-27 14:17     ` Adrian Hunter
2025-01-29  9:58 ` [PATCH V2 04/12] KVM: VMX: Move common fields of struct vcpu_{vmx,tdx} to a struct Adrian Hunter
2025-01-29  9:58 ` [PATCH V2 05/12] KVM: TDX: Implement TDX vcpu enter/exit path Adrian Hunter
2025-02-20 13:16   ` Xiaoyao Li
2025-02-24 12:27     ` Adrian Hunter
2025-02-25  6:15       ` Xiaoyao Li
2025-02-27 18:37         ` Adrian Hunter
2025-03-06 18:19           ` Paolo Bonzini
2025-03-06 19:13             ` Adrian Hunter
2025-01-29  9:58 ` [PATCH V2 06/12] KVM: TDX: vcpu_run: save/restore host state(host kernel gs) Adrian Hunter
2025-01-29  9:58 ` [PATCH V2 07/12] KVM: TDX: restore host xsave state when exit from the guest TD Adrian Hunter
2025-02-25  6:43   ` Xiaoyao Li
2025-02-27 14:29     ` Adrian Hunter [this message]
2025-02-28  1:58       ` Xiaoyao Li
2025-01-29  9:58 ` [PATCH V2 08/12] KVM: x86: Allow to update cached values in kvm_user_return_msrs w/o wrmsr Adrian Hunter
2025-02-25  7:00   ` Xiaoyao Li
2025-01-29  9:58 ` [PATCH V2 09/12] KVM: TDX: restore user ret MSRs Adrian Hunter
2025-02-25  7:01   ` Xiaoyao Li
2025-02-27 14:19     ` Adrian Hunter
2025-01-29  9:58 ` [PATCH V2 10/12] KVM: TDX: Disable support for TSX and WAITPKG Adrian Hunter
2025-01-29  9:59 ` [PATCH V2 11/12] KVM: TDX: Save and restore IA32_DEBUGCTL Adrian Hunter
2025-01-29  9:59 ` [PATCH V2 12/12] KVM: x86: Add a switch_db_regs flag to handle TDX's auto-switched behavior Adrian Hunter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bac63116-916f-46f8-ae5b-945b43d181bc@intel.com \
    --to=adrian.hunter@intel.com \
    --cc=binbin.wu@linux.intel.com \
    --cc=chao.gao@intel.com \
    --cc=dmatlack@google.com \
    --cc=isaku.yamahata@intel.com \
    --cc=kai.huang@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nik.borisov@suse.com \
    --cc=pbonzini@redhat.com \
    --cc=reinette.chatre@intel.com \
    --cc=rick.p.edgecombe@intel.com \
    --cc=seanjc@google.com \
    --cc=tony.lindgren@linux.intel.com \
    --cc=weijiang.yang@intel.com \
    --cc=xiaoyao.li@intel.com \
    --cc=yan.y.zhao@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox