All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Aaron Lewis <aaronlewis@google.com>
Cc: kvm@vger.kernel.org, pbonzini@redhat.com, jmattson@google.com
Subject: Re: [PATCH] KVM: x86: Free the MSR filter after destorying VCPUs
Date: Thu, 22 Aug 2024 10:24:30 -0700	[thread overview]
Message-ID: <Zsd0TqCeY3B5Sb5b@google.com> (raw)
In-Reply-To: <20240703175618.2304869-2-aaronlewis@google.com>

On Wed, Jul 03, 2024, Aaron Lewis wrote:
> Delay freeing the MSR filter until after the VCPUs are destroyed to
> avoid a possible use-after-free when freeing a nested guest.
> 
> This callstack is from a 5.15 kernel, but the issue still appears to be
> possible upstream.
> 
> kvm_msr_allowed+0x4c/0xd0
> __kvm_set_msr+0x12d/0x1e0
> kvm_set_msr+0x19/0x40
> load_vmcs12_host_state+0x2d8/0x6e0 [kvm_intel]
> nested_vmx_vmexit+0x715/0xbd0 [kvm_intel]
> nested_vmx_free_vcpu+0x33/0x50 [kvm_intel]
> vmx_free_vcpu+0x54/0xc0 [kvm_intel]
> kvm_arch_vcpu_destroy+0x28/0xf0
> kvm_vcpu_destroy+0x12/0x50
> kvm_arch_destroy_vm+0x12c/0x1c0
> kvm_put_kvm+0x263/0x3c0
> kvm_vm_release+0x21/0x30
> __fput+0xb9/0x240
> ____fput+0xe/0x20
> task_work_run+0x6f/0xd0
> syscall_exit_to_user_mode+0x123/0x300
> do_syscall_64+0x72/0xb0
> entry_SYSCALL_64_after_hwframe+0x61/0xc6
> 
> Fixes: b318e8decf6b ("KVM: x86: Protect userspace MSR filter with SRCU, and set atomically-ish")
> 
> Signed-off-by: Aaron Lewis <aaronlewis@google.com>
> Suggested-by: Jim Mattson <jmattson@google.com>
> ---
>  arch/x86/kvm/x86.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 66c4381460dc..638696efa17e 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -12711,12 +12711,12 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
>  	}
>  	kvm_unload_vcpu_mmus(kvm);
>  	static_call_cond(kvm_x86_vm_destroy)(kvm);
> -	kvm_free_msr_filter(srcu_dereference_check(kvm->arch.msr_filter, &kvm->srcu, 1));
>  	kvm_pic_destroy(kvm);
>  	kvm_ioapic_destroy(kvm);
>  	kvm_destroy_vcpus(kvm);
>  	kvfree(rcu_dereference_check(kvm->arch.apic_map, 1));
>  	kfree(srcu_dereference_check(kvm->arch.pmu_event_filter, &kvm->srcu, 1));
> +	kvm_free_msr_filter(srcu_dereference_check(kvm->arch.msr_filter, &kvm->srcu, 1));

This makes the KFENCE/KASAN issues go away, but I'm not so sure it actually fixes
the main problem, which is that doing nested_vmx_vmexit() when freeing a vCPU is
insane.  E.g. it can do VMWRITEs, VMPTRLD, read and write guest memory, write
guest MSRs (which could theoretically write _host_ MSRs), and so on and so fort.

And free_nested() really should free _everything_, i.e. it shouldn't rely on
nested_vmx_vmexit() to do things like cancel hrtimers.

Completely untested, but at a glance I think we can/should do this.  If we need
a fix for stable, then we can move the MSR filter freeing, but going forward, I
think we need to fix the underlying mess.

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 2392a7ef254d..f5210834a246 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -359,6 +359,8 @@ static void free_nested(struct kvm_vcpu *vcpu)
        kvm_vcpu_unmap(vcpu, &vmx->nested.pi_desc_map, true);
        vmx->nested.pi_desc = NULL;
 
+       hrtimer_cancel(&to_vmx(vcpu)->nested.preemption_timer);
+
        kvm_mmu_free_roots(vcpu->kvm, &vcpu->arch.guest_mmu, KVM_MMU_ROOTS_ALL);
 
        nested_release_evmcs(vcpu);
@@ -372,9 +374,10 @@ static void free_nested(struct kvm_vcpu *vcpu)
  */
 void nested_vmx_free_vcpu(struct kvm_vcpu *vcpu)
 {
-       vcpu_load(vcpu);
-       vmx_leave_nested(vcpu);
-       vcpu_put(vcpu);
+       struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+       vmx->loaded_vmcs = &vmx->vmcs01;
+       free_nested(vcpu);
 }
 
 #define EPTP_PA_MASK   GENMASK_ULL(51, 12)


      reply	other threads:[~2024-08-22 17:24 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-03 17:56 [PATCH] KVM: x86: Free the MSR filter after destorying VCPUs Aaron Lewis
2024-08-22 17:24 ` Sean Christopherson [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zsd0TqCeY3B5Sb5b@google.com \
    --to=seanjc@google.com \
    --cc=aaronlewis@google.com \
    --cc=jmattson@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.