public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Vitaly Kuznetsov <vkuznets@redhat.com>
To: Sean Christopherson <seanjc@google.com>
Cc: kvm@vger.kernel.org, Paolo Bonzini <pbonzini@redhat.com>,
	Li RongQing <lirongqing@baidu.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/3] KVM: x86: Use actual kvm_cpuid.base for clearing KVM_FEATURE_PV_UNHALT
Date: Thu, 29 Feb 2024 14:20:30 +0100	[thread overview]
Message-ID: <87h6hrmmox.fsf@redhat.com> (raw)
In-Reply-To: <Zd_BY8Us6TYNBueI@google.com>

Sean Christopherson <seanjc@google.com> writes:

> On Wed, Feb 28, 2024, Vitaly Kuznetsov wrote:
>> @@ -273,6 +273,7 @@ static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_e
>>  				       int nent)
>>  {
>>  	struct kvm_cpuid_entry2 *best;
>> +	struct kvm_hypervisor_cpuid kvm_cpuid;
>>  
>>  	best = cpuid_entry2_find(entries, nent, 1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
>>  	if (best) {
>> @@ -299,10 +300,12 @@ static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_e
>>  		     cpuid_entry_has(best, X86_FEATURE_XSAVEC)))
>>  		best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
>>  
>> -	best = __kvm_find_kvm_cpuid_features(vcpu, entries, nent);
>> -	if (kvm_hlt_in_guest(vcpu->kvm) && best &&
>> -		(best->eax & (1 << KVM_FEATURE_PV_UNHALT)))
>> -		best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT);
>> +	kvm_cpuid = __kvm_get_hypervisor_cpuid(entries, nent, KVM_SIGNATURE);
>> +	if (kvm_cpuid.base) {
>> +		best = __kvm_find_kvm_cpuid_features(entries, nent, kvm_cpuid.base);
>> +		if (kvm_hlt_in_guest(vcpu->kvm) && best)
>> +			best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT);
>> +	}
>>  
>>  	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT)) {
>>  		best = cpuid_entry2_find(entries, nent, 0x1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
>
> Not now, as we need a minimal fix, but we need to fix the root problem, this is
> way to brittle.  Multiple helpers take @vcpu, including __kvm_update_cpuid_runtime(),
> before the incoming CPUID is set.  That's just asking for new bugs to
> crop up.

Yes, I'm all for making this all more robust but I think it would be
nice to have a smaller, backportable fix for the real, observed issue first.

>
> Am I missing something, or can we just swap() the new and old, update the new
> in the context of the vCPU, and then undo the swap() if there's an issue?
> vcpu->mutex is held, and accessing this state from a different task is wildly
> unsafe, so I don't see any problem with temporarily having an in-flux state.
>

I don't see why this approach shouldn't work and I agree it looks like
it would make things better but I can't say that I'm in love with
it. Ideally, I would want to see the following "atomic" workflow for all
updates:

- Check that the supplied data is correct, return an error if not. No
changes to the state on this step.
- Tweak the data if needed.
- Update the state and apply the side-effects of the update. Ideally,
there should be no errors on this step as rollback can be
problemmatic. In the real world we will have to handle e.g. failed
memory allocations here but in most cases the best course of action is
to kill the VM.

Well, kvm_set_cpuid() is not like that. At least:
- kvm_hv_vcpu_init() is a side-effect but we apply it before all checks
are complete. There's no way back.
- kvm_check_cpuid() sounds like a pure checker but in reallity we end up
mangling guest FPU state in fpstate_realloc()

Both are probably "no big deal" but certainly break the atomicity.

> If we want to be paranoid, we can probably get away with killing the VM if the
> vCPU has run and the incoming CPUID is "bad", e.g. to guard against something
> in kvm_set_cpuid() consuming soon-to-be-stale state.  And that's actually a
> feature of sorts, because _if_ something in kvm_set_cpuid() consumes the vCPU's
> CPUID, then we have a bug _now_ that affects the happy path.
>
> Completely untested (I haven't updated the myriad helpers), but this would allow
> us to revert/remove all of the changes that allow peeking at a CPUID array that
> lives outside of the vCPU.

Thanks, assuming there's no urgency, let me take a look at this in the
course of the next week or so.

>
> static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
>                         int nent)
> {
> 	int r, i;
>
> 	swap(vcpu->arch.cpuid_entries, e2);
> 	swap(vcpu->arch.cpuid_nent, nent);
>
> #ifdef CONFIG_KVM_HYPERV
> 	if (kvm_cpuid_has_hyperv(vcpu)) {
> 		r = kvm_hv_vcpu_init(vcpu);
> 		if (r)
> 			goto err;
> 	}
> #endif
>
> 	r = kvm_check_cpuid(vcpu);
> 	if (r)
> 		goto err;
>
> 	kvm_update_cpuid_runtime(vcpu);
>
> 	/*
> 	 * KVM does not correctly handle changing guest CPUID after KVM_RUN, as
> 	 * MAXPHYADDR, GBPAGES support, AMD reserved bit behavior, etc.. aren't
> 	 * tracked in kvm_mmu_page_role.  As a result, KVM may miss guest page
> 	 * faults due to reusing SPs/SPTEs. In practice no sane VMM mucks with
> 	 * the core vCPU model on the fly. It would've been better to forbid any
> 	 * KVM_SET_CPUID{,2} calls after KVM_RUN altogether but unfortunately
> 	 * some VMMs (e.g. QEMU) reuse vCPU fds for CPU hotplug/unplug and do
> 	 * KVM_SET_CPUID{,2} again. To support this legacy behavior, check
> 	 * whether the supplied CPUID data is equal to what's already set.
> 	 */
> 	if (kvm_vcpu_has_run(vcpu)) {
> 		r = kvm_cpuid_check_equal(vcpu, e2, nent);
> 		if (r)
> 			goto err;
> 	}
>
> 	vcpu->arch.kvm_cpuid = kvm_get_hypervisor_cpuid(vcpu, KVM_SIGNATURE);
> #ifdef CONFIG_KVM_XEN
> 	vcpu->arch.xen.cpuid = kvm_get_hypervisor_cpuid(vcpu, XEN_SIGNATURE);
> #endif
> 	kvm_vcpu_after_set_cpuid(vcpu);
>
> 	kvfree(e2);
> 	return 0;
>
> err:
> 	swap(vcpu->arch.cpuid_entries, e2);
> 	swap(vcpu->arch.cpuid_nent, nent);
> 	return r;
> }
>

-- 
Vitaly


  reply	other threads:[~2024-02-29 13:20 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-28 10:18 [PATCH 0/3] KVM: x86: Fix KVM_FEATURE_PV_UNHALT update logic Vitaly Kuznetsov
2024-02-28 10:18 ` [PATCH 1/3] KVM: x86: Introduce __kvm_get_hypervisor_cpuid() helper Vitaly Kuznetsov
2024-02-28 10:18 ` [PATCH 2/3] KVM: x86: Use actual kvm_cpuid.base for clearing KVM_FEATURE_PV_UNHALT Vitaly Kuznetsov
2024-02-28 23:27   ` Sean Christopherson
2024-02-29 13:20     ` Vitaly Kuznetsov [this message]
2024-02-29 18:54       ` Sean Christopherson
2024-02-28 10:18 ` [PATCH 3/3] KVM: selftests: Check that KVM_FEATURE_PV_UNHALT is cleared with KVM_X86_DISABLE_EXITS_HLT Vitaly Kuznetsov
2024-03-08  4:13   ` Sean Christopherson
2024-03-08  4:13 ` [PATCH 0/3] KVM: x86: Fix KVM_FEATURE_PV_UNHALT update logic Sean Christopherson
2024-03-08 10:44   ` Vitaly Kuznetsov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87h6hrmmox.fsf@redhat.com \
    --to=vkuznets@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lirongqing@baidu.com \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox