Re: [PATCH 3/9] KVM: x86: Initialize guest cpu_caps based on guest CPUID

kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Sean Christopherson <seanjc@google.com>
To: Maxim Levitsky <mlevitsk@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/9] KVM: x86: Initialize guest cpu_caps based on guest CPUID
Date: Thu, 30 Nov 2023 17:51:28 -0800	[thread overview]
Message-ID: <ZWk8IMZamuemfwXG@google.com> (raw)
In-Reply-To: <3ad69657ba8e1b19d150db574193619cf0cb34df.camel@redhat.com>

On Sun, Nov 19, 2023, Maxim Levitsky wrote:
> On Fri, 2023-11-10 at 15:55 -0800, Sean Christopherson wrote:
> > +/*
> > + * This isn't truly "unsafe", but all callers except kvm_cpu_after_set_cpuid()
> > + * should use __cpuid_entry_get_reg(), which provides compile-time validation
> > + * of the input.
> > + */
> > +static u32 cpuid_get_reg_unsafe(struct kvm_cpuid_entry2 *entry, u32 reg)
> > +{
> > +	switch (reg) {
> > +	case CPUID_EAX:
> > +		return entry->eax;
> > +	case CPUID_EBX:
> > +		return entry->ebx;
> > +	case CPUID_ECX:
> > +		return entry->ecx;
> > +	case CPUID_EDX:
> > +		return entry->edx;
> > +	default:
> > +		WARN_ON_ONCE(1);
> > +		return 0;
> > +	}
> > +}

...

> >  static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> >  {
> >  	struct kvm_lapic *apic = vcpu->arch.apic;
> >  	struct kvm_cpuid_entry2 *best;
> >  	bool allow_gbpages;
> > +	int i;
> >  
> > -	memset(vcpu->arch.cpu_caps, 0, sizeof(vcpu->arch.cpu_caps));
> > +	BUILD_BUG_ON(ARRAY_SIZE(reverse_cpuid) != NR_KVM_CPU_CAPS);
> > +
> > +	/*
> > +	 * Reset guest capabilities to userspace's guest CPUID definition, i.e.
> > +	 * honor userspace's definition for features that don't require KVM or
> > +	 * hardware management/support (or that KVM simply doesn't care about).
> > +	 */
> > +	for (i = 0; i < NR_KVM_CPU_CAPS; i++) {
> > +		const struct cpuid_reg cpuid = reverse_cpuid[i];
> > +
> > +		best = kvm_find_cpuid_entry_index(vcpu, cpuid.function, cpuid.index);
> > +		if (best)
> > +			vcpu->arch.cpu_caps[i] = cpuid_get_reg_unsafe(best, cpuid.reg);
> 
> Why not just use __cpuid_entry_get_reg? 
> 
> cpuid.reg comes from read/only 'reverse_cpuid' anyway, and in fact
> it seems that all callers of __cpuid_entry_get_reg, take the reg value from
> x86_feature_cpuid() which also takes it from 'reverse_cpuid'.
> 
> So if the compiler is smart enough to not complain in these cases, I don't
> see why this case is different.

It's because the input isn't a compile-time constant, and so the BUILD_BUG() in
the default path will fire.  All of the compile-time assertions in reverse_cpuid.h
rely on the feature being a constant value, which allows the compiler to optimize
away the dead paths, i.e. turn __cpuid_entry_get_reg()'s switch statement into
simple pointer arithmetic and thus omit the BUILD_BUG() code.

> Also why not to initialize guest_caps = host_caps & userspace_cpuid?
>
> If this was the default we won't need any guest_cpu_cap_restrict and such,
> instead it will just work.

Hrm, I definitely like the idea.  Unfortunately, unless we do an audit of all
~120 uses of guest_cpuid_has(), restricting those based on kvm_cpu_caps might
break userspace.

Aside from purging the governed feature nomenclature, the main goal of this series
provide a way to do fast lookups of all known guest CPUID bits without needing to
opt-in on a feature-by-feature basis, including for features that are fully
controlled by userspace.

It's definitely doable, but I'm not all that confident that the end result would
be a net positive, e.g. I believe we would need to special case things like the
feature bits that gate MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD.  MOVBE and RDPID
are other features that come to mind, where KVM emulates the feature in software
but it won't be set in kvm_cpu_caps.

Oof, and MONITOR and MWAIT too, as KVM deliberately doesn't advertise those to
userspace.

So yeah, I'm not opposed to trying that route at some point, but I really don't
want to do that in this series as the risk of subtly breaking something is super
high.

> Special code will only be needed in few more complex cases, like forced exposed
> of a feature to a guest due to a virtualization hole.
> 
> 
> > +		else
> > +			vcpu->arch.cpu_caps[i] = 0;
> > +	}
> >  
> >  	/*
> >  	 * If TDP is enabled, let the guest use GBPAGES if they're supported in
> > @@ -342,8 +380,7 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> >  	 */
> >  	allow_gbpages = tdp_enabled ? boot_cpu_has(X86_FEATURE_GBPAGES) :
> >  				      guest_cpuid_has(vcpu, X86_FEATURE_GBPAGES);
> > -	if (allow_gbpages)
> > -		guest_cpu_cap_set(vcpu, X86_FEATURE_GBPAGES);
> > +	guest_cpu_cap_change(vcpu, X86_FEATURE_GBPAGES, allow_gbpages);
> 
> IMHO the original code was more readable, now I need to look up the
> 'guest_cpu_cap_change()' to understand what is going on.

The change is "necessary".  The issue is that with the caps 0-initialied, the
!allow_gbpages could simply do nothing.  Now, KVM needs to explicitly clear the
flag, i.e. would need to do:

	if (allow_gbpages)
		guest_cpu_cap_set(vcpu, X86_FEATURE_GBPAGES);
	else
		guest_cpu_cap_clear(vcpu, X86_FEATURE_GBPAGES);

I don't much love the name either, but it pairs with cpuid_entry_change() and I
want to keep the kvm_cpu_cap, cpuid_entry, and guest_cpu_cap APIs in sync as far
as the APIs go.  The only reason kvm_cpu_cap_change() doesn't exist is because
there aren't any flows that need to toggle a bit.

> >  static __always_inline bool guest_cpu_cap_has(struct kvm_vcpu *vcpu,
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index 8a99a73b6ee5..5827328e30f1 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -4315,14 +4315,14 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> >  	 * XSS on VM-Enter/VM-Exit.  Failure to do so would effectively give
> >  	 * the guest read/write access to the host's XSS.
> >  	 */
> > -	if (boot_cpu_has(X86_FEATURE_XSAVE) &&
> > -	    boot_cpu_has(X86_FEATURE_XSAVES) &&
> > -	    guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
> > -		guest_cpu_cap_set(vcpu, X86_FEATURE_XSAVES);
> > +	guest_cpu_cap_change(vcpu, X86_FEATURE_XSAVES,
> > +			     boot_cpu_has(X86_FEATURE_XSAVE) &&
> > +			     boot_cpu_has(X86_FEATURE_XSAVES) &&
> > +			     guest_cpuid_has(vcpu, X86_FEATURE_XSAVE));
> 
> In theory this change does change behavior, now the X86_FEATURE_XSAVE will
> be set iff the condition is true, but before it was set *if* the condition was true.

No, before it was set if and only if the condition was true, because in that case
caps were 0-initialized, i.e. this was/is the only way for XSAVE to be set.

> > -	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_NRIPS);
> > -	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_TSCRATEMSR);
> > -	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_LBRV);
> > +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_NRIPS);
> > +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_TSCRATEMSR);
> > +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_LBRV);
> 
> One of the main reasons I don't like governed features is this manual list.

To be fair, the manual lists predate the governed features.

> I want to reach the point that one won't need to add anything manually,
> unless there is a good reason to do so, and there are only a few exceptions
> when the guest cap is set, while the host's isn't.

Yeah, agreed.

next prev parent reply	other threads:[~2023-12-01  1:51 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-10 23:55 [PATCH 0/9] KVM: x86: Replace governed features with guest cpu_caps Sean Christopherson
2023-11-10 23:55 ` [PATCH 1/9] KVM: x86: Rename "governed features" helpers to use "guest_cpu_cap" Sean Christopherson
2023-11-19 17:08   ` Maxim Levitsky
2023-11-21  3:20   ` Chao Gao
2023-11-10 23:55 ` [PATCH 2/9] KVM: x86: Replace guts of "goverened" features with comprehensive cpu_caps Sean Christopherson
2023-11-14  9:12   ` Binbin Wu
2023-11-19 17:22   ` Maxim Levitsky
2023-11-28  1:24     ` Sean Christopherson
2023-11-10 23:55 ` [PATCH 3/9] KVM: x86: Initialize guest cpu_caps based on guest CPUID Sean Christopherson
2023-11-16  3:16   ` Yang, Weijiang
2023-11-16 22:29     ` Sean Christopherson
2023-11-17  8:33       ` Yang, Weijiang
2023-11-21  3:10         ` Yuan Yao
2023-11-19 17:32   ` Maxim Levitsky
2023-12-01  1:51     ` Sean Christopherson [this message]
2023-12-21 16:59       ` Maxim Levitsky
2024-01-05  2:13         ` Sean Christopherson
2024-01-12  0:44           ` Sean Christopherson
2023-11-10 23:55 ` [PATCH 4/9] KVM: x86: Avoid double CPUID lookup when updating MWAIT at runtime Sean Christopherson
2023-11-19 17:33   ` Maxim Levitsky
2023-11-10 23:55 ` [PATCH 5/9] KVM: x86: Drop unnecessary check that cpuid_entry2_find() returns right leaf Sean Christopherson
2023-11-19 17:33   ` Maxim Levitsky
2023-11-10 23:55 ` [PATCH 6/9] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features Sean Christopherson
2023-11-13  8:03   ` Robert Hoo
2023-11-14 13:48     ` Sean Christopherson
2023-11-15  1:59       ` Robert Hoo
2023-11-15 15:09         ` Sean Christopherson
2023-11-17  1:28           ` Robert Hoo
2023-11-16  2:24   ` Yang, Weijiang
2023-11-16 22:19     ` Sean Christopherson
2023-11-19 17:35   ` Maxim Levitsky
2023-11-24  6:33     ` Xu Yilun
2023-11-28  0:43       ` Sean Christopherson
2023-11-28  5:13         ` Xu Yilun
2023-11-10 23:55 ` [PATCH 7/9] KVM: x86: Shuffle code to prepare for dropping guest_cpuid_has() Sean Christopherson
2023-11-19 17:35   ` Maxim Levitsky
2023-11-10 23:55 ` [PATCH 8/9] KVM: x86: Replace all guest CPUID feature queries with cpu_caps check Sean Christopherson
2023-11-19 17:35   ` Maxim Levitsky
2023-11-10 23:55 ` [PATCH 9/9] KVM: x86: Restrict XSAVE in cpu_caps based on KVM capabilities Sean Christopherson
2023-11-19 17:36   ` Maxim Levitsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZWk8IMZamuemfwXG@google.com \
    --to=seanjc@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mlevitsk@redhat.com \
    --cc=pbonzini@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).