Re: [PATCH 3/9] KVM: x86: Initialize guest cpu_caps based on guest CPUID

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Sean Christopherson <seanjc@google.com>
To: Maxim Levitsky <mlevitsk@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/9] KVM: x86: Initialize guest cpu_caps based on guest CPUID
Date: Thu, 30 Nov 2023 17:51:28 -0800	[thread overview]
Message-ID: <ZWk8IMZamuemfwXG@google.com> (raw)
In-Reply-To: <3ad69657ba8e1b19d150db574193619cf0cb34df.camel@redhat.com>

On Sun, Nov 19, 2023, Maxim Levitsky wrote:
> On Fri, 2023-11-10 at 15:55 -0800, Sean Christopherson wrote:
> > +/*
> > + * This isn't truly "unsafe", but all callers except kvm_cpu_after_set_cpuid()
> > + * should use __cpuid_entry_get_reg(), which provides compile-time validation
> > + * of the input.
> > + */
> > +static u32 cpuid_get_reg_unsafe(struct kvm_cpuid_entry2 *entry, u32 reg)
> > +{
> > +	switch (reg) {
> > +	case CPUID_EAX:
> > +		return entry->eax;
> > +	case CPUID_EBX:
> > +		return entry->ebx;
> > +	case CPUID_ECX:
> > +		return entry->ecx;
> > +	case CPUID_EDX:
> > +		return entry->edx;
> > +	default:
> > +		WARN_ON_ONCE(1);
> > +		return 0;
> > +	}
> > +}

...

> >  static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> >  {
> >  	struct kvm_lapic *apic = vcpu->arch.apic;
> >  	struct kvm_cpuid_entry2 *best;
> >  	bool allow_gbpages;
> > +	int i;
> >  
> > -	memset(vcpu->arch.cpu_caps, 0, sizeof(vcpu->arch.cpu_caps));
> > +	BUILD_BUG_ON(ARRAY_SIZE(reverse_cpuid) != NR_KVM_CPU_CAPS);
> > +
> > +	/*
> > +	 * Reset guest capabilities to userspace's guest CPUID definition, i.e.
> > +	 * honor userspace's definition for features that don't require KVM or
> > +	 * hardware management/support (or that KVM simply doesn't care about).
> > +	 */
> > +	for (i = 0; i < NR_KVM_CPU_CAPS; i++) {
> > +		const struct cpuid_reg cpuid = reverse_cpuid[i];
> > +
> > +		best = kvm_find_cpuid_entry_index(vcpu, cpuid.function, cpuid.index);
> > +		if (best)
> > +			vcpu->arch.cpu_caps[i] = cpuid_get_reg_unsafe(best, cpuid.reg);
> 
> Why not just use __cpuid_entry_get_reg? 
> 
> cpuid.reg comes from read/only 'reverse_cpuid' anyway, and in fact
> it seems that all callers of __cpuid_entry_get_reg, take the reg value from
> x86_feature_cpuid() which also takes it from 'reverse_cpuid'.
> 
> So if the compiler is smart enough to not complain in these cases, I don't
> see why this case is different.

It's because the input isn't a compile-time constant, and so the BUILD_BUG() in
the default path will fire.  All of the compile-time assertions in reverse_cpuid.h
rely on the feature being a constant value, which allows the compiler to optimize
away the dead paths, i.e. turn __cpuid_entry_get_reg()'s switch statement into
simple pointer arithmetic and thus omit the BUILD_BUG() code.

> Also why not to initialize guest_caps = host_caps & userspace_cpuid?
>
> If this was the default we won't need any guest_cpu_cap_restrict and such,
> instead it will just work.

Hrm, I definitely like the idea.  Unfortunately, unless we do an audit of all
~120 uses of guest_cpuid_has(), restricting those based on kvm_cpu_caps might
break userspace.

Aside from purging the governed feature nomenclature, the main goal of this series
provide a way to do fast lookups of all known guest CPUID bits without needing to
opt-in on a feature-by-feature basis, including for features that are fully
controlled by userspace.

It's definitely doable, but I'm not all that confident that the end result would
be a net positive, e.g. I believe we would need to special case things like the
feature bits that gate MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD.  MOVBE and RDPID
are other features that come to mind, where KVM emulates the feature in software
but it won't be set in kvm_cpu_caps.

Oof, and MONITOR and MWAIT too, as KVM deliberately doesn't advertise those to
userspace.

So yeah, I'm not opposed to trying that route at some point, but I really don't
want to do that in this series as the risk of subtly breaking something is super
high.

> Special code will only be needed in few more complex cases, like forced exposed
> of a feature to a guest due to a virtualization hole.
> 
> 
> > +		else
> > +			vcpu->arch.cpu_caps[i] = 0;
> > +	}
> >  
> >  	/*
> >  	 * If TDP is enabled, let the guest use GBPAGES if they're supported in
> > @@ -342,8 +380,7 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> >  	 */
> >  	allow_gbpages = tdp_enabled ? boot_cpu_has(X86_FEATURE_GBPAGES) :
> >  				      guest_cpuid_has(vcpu, X86_FEATURE_GBPAGES);
> > -	if (allow_gbpages)
> > -		guest_cpu_cap_set(vcpu, X86_FEATURE_GBPAGES);
> > +	guest_cpu_cap_change(vcpu, X86_FEATURE_GBPAGES, allow_gbpages);
> 
> IMHO the original code was more readable, now I need to look up the
> 'guest_cpu_cap_change()' to understand what is going on.

The change is "necessary".  The issue is that with the caps 0-initialied, the
!allow_gbpages could simply do nothing.  Now, KVM needs to explicitly clear the
flag, i.e. would need to do:

	if (allow_gbpages)
		guest_cpu_cap_set(vcpu, X86_FEATURE_GBPAGES);
	else
		guest_cpu_cap_clear(vcpu, X86_FEATURE_GBPAGES);

I don't much love the name either, but it pairs with cpuid_entry_change() and I
want to keep the kvm_cpu_cap, cpuid_entry, and guest_cpu_cap APIs in sync as far
as the APIs go.  The only reason kvm_cpu_cap_change() doesn't exist is because
there aren't any flows that need to toggle a bit.

> >  static __always_inline bool guest_cpu_cap_has(struct kvm_vcpu *vcpu,
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index 8a99a73b6ee5..5827328e30f1 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -4315,14 +4315,14 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> >  	 * XSS on VM-Enter/VM-Exit.  Failure to do so would effectively give
> >  	 * the guest read/write access to the host's XSS.
> >  	 */
> > -	if (boot_cpu_has(X86_FEATURE_XSAVE) &&
> > -	    boot_cpu_has(X86_FEATURE_XSAVES) &&
> > -	    guest_cpuid_has(vcpu, X86_FEATURE_XSAVE))
> > -		guest_cpu_cap_set(vcpu, X86_FEATURE_XSAVES);
> > +	guest_cpu_cap_change(vcpu, X86_FEATURE_XSAVES,
> > +			     boot_cpu_has(X86_FEATURE_XSAVE) &&
> > +			     boot_cpu_has(X86_FEATURE_XSAVES) &&
> > +			     guest_cpuid_has(vcpu, X86_FEATURE_XSAVE));
> 
> In theory this change does change behavior, now the X86_FEATURE_XSAVE will
> be set iff the condition is true, but before it was set *if* the condition was true.

No, before it was set if and only if the condition was true, because in that case
caps were 0-initialized, i.e. this was/is the only way for XSAVE to be set.

> > -	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_NRIPS);
> > -	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_TSCRATEMSR);
> > -	guest_cpu_cap_check_and_set(vcpu, X86_FEATURE_LBRV);
> > +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_NRIPS);
> > +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_TSCRATEMSR);
> > +	guest_cpu_cap_restrict(vcpu, X86_FEATURE_LBRV);
> 
> One of the main reasons I don't like governed features is this manual list.

To be fair, the manual lists predate the governed features.

> I want to reach the point that one won't need to add anything manually,
> unless there is a good reason to do so, and there are only a few exceptions
> when the guest cap is set, while the host's isn't.

Yeah, agreed.

next prev parent reply	other threads:[~2023-12-01  1:51 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-10 23:55 [PATCH 0/9] KVM: x86: Replace governed features with guest cpu_caps Sean Christopherson
2023-11-10 23:55 ` [PATCH 1/9] KVM: x86: Rename "governed features" helpers to use "guest_cpu_cap" Sean Christopherson
2023-11-19 17:08   ` Maxim Levitsky
2023-11-21  3:20   ` Chao Gao
2023-11-10 23:55 ` [PATCH 2/9] KVM: x86: Replace guts of "goverened" features with comprehensive cpu_caps Sean Christopherson
2023-11-14  9:12   ` Binbin Wu
2023-11-19 17:22   ` Maxim Levitsky
2023-11-28  1:24     ` Sean Christopherson
2023-11-10 23:55 ` [PATCH 3/9] KVM: x86: Initialize guest cpu_caps based on guest CPUID Sean Christopherson
2023-11-16  3:16   ` Yang, Weijiang
2023-11-16 22:29     ` Sean Christopherson
2023-11-17  8:33       ` Yang, Weijiang
2023-11-21  3:10         ` Yuan Yao
2023-11-19 17:32   ` Maxim Levitsky
2023-12-01  1:51     ` Sean Christopherson [this message]
2023-12-21 16:59       ` Maxim Levitsky
2024-01-05  2:13         ` Sean Christopherson
2024-01-12  0:44           ` Sean Christopherson
2023-11-10 23:55 ` [PATCH 4/9] KVM: x86: Avoid double CPUID lookup when updating MWAIT at runtime Sean Christopherson
2023-11-19 17:33   ` Maxim Levitsky
2023-11-10 23:55 ` [PATCH 5/9] KVM: x86: Drop unnecessary check that cpuid_entry2_find() returns right leaf Sean Christopherson
2023-11-19 17:33   ` Maxim Levitsky
2023-11-10 23:55 ` [PATCH 6/9] KVM: x86: Update guest cpu_caps at runtime for dynamic CPUID-based features Sean Christopherson
2023-11-13  8:03   ` Robert Hoo
2023-11-14 13:48     ` Sean Christopherson
2023-11-15  1:59       ` Robert Hoo
2023-11-15 15:09         ` Sean Christopherson
2023-11-17  1:28           ` Robert Hoo
2023-11-16  2:24   ` Yang, Weijiang
2023-11-16 22:19     ` Sean Christopherson
2023-11-19 17:35   ` Maxim Levitsky
2023-11-24  6:33     ` Xu Yilun
2023-11-28  0:43       ` Sean Christopherson
2023-11-28  5:13         ` Xu Yilun
2023-11-10 23:55 ` [PATCH 7/9] KVM: x86: Shuffle code to prepare for dropping guest_cpuid_has() Sean Christopherson
2023-11-19 17:35   ` Maxim Levitsky
2023-11-10 23:55 ` [PATCH 8/9] KVM: x86: Replace all guest CPUID feature queries with cpu_caps check Sean Christopherson
2023-11-19 17:35   ` Maxim Levitsky
2023-11-10 23:55 ` [PATCH 9/9] KVM: x86: Restrict XSAVE in cpu_caps based on KVM capabilities Sean Christopherson
2023-11-19 17:36   ` Maxim Levitsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZWk8IMZamuemfwXG@google.com \
    --to=seanjc@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mlevitsk@redhat.com \
    --cc=pbonzini@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.