stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 6.12.y] KVM: x86: Free vCPUs before freeing VM state
@ 2025-07-25 18:58 Kevin Cheng
  2025-07-26  1:37 ` Sasha Levin
  0 siblings, 1 reply; 8+ messages in thread
From: Kevin Cheng @ 2025-07-25 18:58 UTC (permalink / raw)
  To: stable
  Cc: seanjc, Aaron Lewis, Jim Mattson, Yan Zhao, Rick P Edgecombe,
	Kai Huang, Isaku Yamahata, Kevin Cheng

From: Sean Christopherson <seanjc@google.com>

[ Upstream commit 17bcd714426386fda741a4bccd96a2870179344b ]

Free vCPUs before freeing any VM state, as both SVM and VMX may access
VM state when "freeing" a vCPU that is currently "in" L2, i.e. that needs
to be kicked out of nested guest mode.

Commit 6fcee03df6a1 ("KVM: x86: avoid loading a vCPU after .vm_destroy was
called") partially fixed the issue, but for unknown reasons only moved the
MMU unloading before VM destruction.  Complete the change, and free all
vCPU state prior to destroying VM state, as nVMX accesses even more state
than nSVM.

In addition to the AVIC, KVM can hit a use-after-free on MSR filters:

  kvm_msr_allowed+0x4c/0xd0
  __kvm_set_msr+0x12d/0x1e0
  kvm_set_msr+0x19/0x40
  load_vmcs12_host_state+0x2d8/0x6e0 [kvm_intel]
  nested_vmx_vmexit+0x715/0xbd0 [kvm_intel]
  nested_vmx_free_vcpu+0x33/0x50 [kvm_intel]
  vmx_free_vcpu+0x54/0xc0 [kvm_intel]
  kvm_arch_vcpu_destroy+0x28/0xf0
  kvm_vcpu_destroy+0x12/0x50
  kvm_arch_destroy_vm+0x12c/0x1c0
  kvm_put_kvm+0x263/0x3c0
  kvm_vm_release+0x21/0x30

and an upcoming fix to process injectable interrupts on nested VM-Exit
will access the PIC:

  BUG: kernel NULL pointer dereference, address: 0000000000000090
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  CPU: 23 UID: 1000 PID: 2658 Comm: kvm-nx-lpage-re
  RIP: 0010:kvm_cpu_has_extint+0x2f/0x60 [kvm]
  Call Trace:
   <TASK>
   kvm_cpu_has_injectable_intr+0xe/0x60 [kvm]
   nested_vmx_vmexit+0x2d7/0xdf0 [kvm_intel]
   nested_vmx_free_vcpu+0x40/0x50 [kvm_intel]
   vmx_vcpu_free+0x2d/0x80 [kvm_intel]
   kvm_arch_vcpu_destroy+0x2d/0x130 [kvm]
   kvm_destroy_vcpus+0x8a/0x100 [kvm]
   kvm_arch_destroy_vm+0xa7/0x1d0 [kvm]
   kvm_destroy_vm+0x172/0x300 [kvm]
   kvm_vcpu_release+0x31/0x50 [kvm]

Inarguably, both nSVM and nVMX need to be fixed, but punt on those
cleanups for the moment.  Conceptually, vCPUs should be freed before VM
state.  Assets like the I/O APIC and PIC _must_ be allocated before vCPUs
are created, so it stands to reason that they must be freed _after_ vCPUs
are destroyed.

Reported-by: Aaron Lewis <aaronlewis@google.com>
Closes: https://lore.kernel.org/all/20240703175618.2304869-2-aaronlewis@google.com
Cc: Jim Mattson <jmattson@google.com>
Cc: Yan Zhao <yan.y.zhao@intel.com>
Cc: Rick P Edgecombe <rick.p.edgecombe@intel.com>
Cc: Kai Huang <kai.huang@intel.com>
Cc: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Kevin Cheng <chengkev@google.com>
---
 arch/x86/kvm/x86.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f378d479fea3f..7f91b11e6f0ec 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12888,11 +12888,11 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
 		mutex_unlock(&kvm->slots_lock);
 	}
 	kvm_unload_vcpu_mmus(kvm);
+	kvm_destroy_vcpus(kvm);
 	kvm_x86_call(vm_destroy)(kvm);
 	kvm_free_msr_filter(srcu_dereference_check(kvm->arch.msr_filter, &kvm->srcu, 1));
 	kvm_pic_destroy(kvm);
 	kvm_ioapic_destroy(kvm);
-	kvm_destroy_vcpus(kvm);
 	kvfree(rcu_dereference_check(kvm->arch.apic_map, 1));
 	kfree(srcu_dereference_check(kvm->arch.pmu_event_filter, &kvm->srcu, 1));
 	kvm_mmu_uninit_vm(kvm);
-- 
2.50.1.470.g6ba607880d-goog


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 6.12.y] KVM: x86: Free vCPUs before freeing VM state
  2025-07-25 18:58 [PATCH 6.12.y] KVM: x86: Free vCPUs before freeing VM state Kevin Cheng
@ 2025-07-26  1:37 ` Sasha Levin
  2025-07-28 17:51   ` [PATCH 6.12.y v2] " Kevin Cheng
  0 siblings, 1 reply; 8+ messages in thread
From: Sasha Levin @ 2025-07-26  1:37 UTC (permalink / raw)
  To: stable; +Cc: Sasha Levin

[ Sasha's backport helper bot ]

Hi,

✅ All tests passed successfully. No issues detected.
No action required from the submitter.

The upstream commit SHA1 provided is correct: 17bcd714426386fda741a4bccd96a2870179344b

WARNING: Author mismatch between patch and upstream commit:
Backport author: Kevin Cheng <chengkev@google.com>
Commit author: Sean Christopherson <seanjc@google.com>

Status in newer kernel trees:
6.15.y | Present (exact SHA1)

Note: The patch differs from the upstream commit:
---
1:  17bcd7144263 ! 1:  c2bfc381d88f KVM: x86: Free vCPUs before freeing VM state
    @@ Metadata
      ## Commit message ##
         KVM: x86: Free vCPUs before freeing VM state
     
    +    [ Upstream commit 17bcd714426386fda741a4bccd96a2870179344b ]
    +
         Free vCPUs before freeing any VM state, as both SVM and VMX may access
         VM state when "freeing" a vCPU that is currently "in" L2, i.e. that needs
         to be kicked out of nested guest mode.
    @@ Commit message
         Cc: Kai Huang <kai.huang@intel.com>
         Cc: Isaku Yamahata <isaku.yamahata@intel.com>
         Signed-off-by: Sean Christopherson <seanjc@google.com>
    -    Message-ID: <20250224235542.2562848-2-seanjc@google.com>
    -    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    +    Signed-off-by: Kevin Cheng <chengkev@google.com>
     
      ## arch/x86/kvm/x86.c ##
     @@ arch/x86/kvm/x86.c: void kvm_arch_destroy_vm(struct kvm *kvm)

---

Results of testing on various branches:

| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| origin/linux-6.12.y       | Success     | Success    |

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 6.12.y v2] KVM: x86: Free vCPUs before freeing VM state
  2025-07-05 21:40 [PATCH 6.12.y v2] mm/vmalloc: fix data race in show_numa_info() Sasha Levin
@ 2025-07-28 17:50 ` Kevin Cheng
  2025-07-29 19:49   ` Sasha Levin
  0 siblings, 1 reply; 8+ messages in thread
From: Kevin Cheng @ 2025-07-28 17:50 UTC (permalink / raw)
  To: sashal
  Cc: aha310510, stable, Sean Christopherson, Aaron Lewis, Jim Mattson,
	Yan Zhao, Rick P Edgecombe, Kai Huang, Isaku Yamahata,
	Paolo Bonzini, Kevin Cheng

From: Sean Christopherson <seanjc@google.com>

[ Upstream commit 17bcd714426386fda741a4bccd96a2870179344b ]

Free vCPUs before freeing any VM state, as both SVM and VMX may access
VM state when "freeing" a vCPU that is currently "in" L2, i.e. that needs
to be kicked out of nested guest mode.

Commit 6fcee03df6a1 ("KVM: x86: avoid loading a vCPU after .vm_destroy was
called") partially fixed the issue, but for unknown reasons only moved the
MMU unloading before VM destruction.  Complete the change, and free all
vCPU state prior to destroying VM state, as nVMX accesses even more state
than nSVM.

In addition to the AVIC, KVM can hit a use-after-free on MSR filters:

  kvm_msr_allowed+0x4c/0xd0
  __kvm_set_msr+0x12d/0x1e0
  kvm_set_msr+0x19/0x40
  load_vmcs12_host_state+0x2d8/0x6e0 [kvm_intel]
  nested_vmx_vmexit+0x715/0xbd0 [kvm_intel]
  nested_vmx_free_vcpu+0x33/0x50 [kvm_intel]
  vmx_free_vcpu+0x54/0xc0 [kvm_intel]
  kvm_arch_vcpu_destroy+0x28/0xf0
  kvm_vcpu_destroy+0x12/0x50
  kvm_arch_destroy_vm+0x12c/0x1c0
  kvm_put_kvm+0x263/0x3c0
  kvm_vm_release+0x21/0x30

and an upcoming fix to process injectable interrupts on nested VM-Exit
will access the PIC:

  BUG: kernel NULL pointer dereference, address: 0000000000000090
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  CPU: 23 UID: 1000 PID: 2658 Comm: kvm-nx-lpage-re
  RIP: 0010:kvm_cpu_has_extint+0x2f/0x60 [kvm]
  Call Trace:
   <TASK>
   kvm_cpu_has_injectable_intr+0xe/0x60 [kvm]
   nested_vmx_vmexit+0x2d7/0xdf0 [kvm_intel]
   nested_vmx_free_vcpu+0x40/0x50 [kvm_intel]
   vmx_vcpu_free+0x2d/0x80 [kvm_intel]
   kvm_arch_vcpu_destroy+0x2d/0x130 [kvm]
   kvm_destroy_vcpus+0x8a/0x100 [kvm]
   kvm_arch_destroy_vm+0xa7/0x1d0 [kvm]
   kvm_destroy_vm+0x172/0x300 [kvm]
   kvm_vcpu_release+0x31/0x50 [kvm]

Inarguably, both nSVM and nVMX need to be fixed, but punt on those
cleanups for the moment.  Conceptually, vCPUs should be freed before VM
state.  Assets like the I/O APIC and PIC _must_ be allocated before vCPUs
are created, so it stands to reason that they must be freed _after_ vCPUs
are destroyed.

Reported-by: Aaron Lewis <aaronlewis@google.com>
Closes: https://lore.kernel.org/all/20240703175618.2304869-2-aaronlewis@google.com
Cc: Jim Mattson <jmattson@google.com>
Cc: Yan Zhao <yan.y.zhao@intel.com>
Cc: Rick P Edgecombe <rick.p.edgecombe@intel.com>
Cc: Kai Huang <kai.huang@intel.com>
Cc: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20250224235542.2562848-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Cheng <chengkev@google.com>
---
 arch/x86/kvm/x86.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f378d479fea3f..7f91b11e6f0ec 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12888,11 +12888,11 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
 		mutex_unlock(&kvm->slots_lock);
 	}
 	kvm_unload_vcpu_mmus(kvm);
+	kvm_destroy_vcpus(kvm);
 	kvm_x86_call(vm_destroy)(kvm);
 	kvm_free_msr_filter(srcu_dereference_check(kvm->arch.msr_filter, &kvm->srcu, 1));
 	kvm_pic_destroy(kvm);
 	kvm_ioapic_destroy(kvm);
-	kvm_destroy_vcpus(kvm);
 	kvfree(rcu_dereference_check(kvm->arch.apic_map, 1));
 	kfree(srcu_dereference_check(kvm->arch.pmu_event_filter, &kvm->srcu, 1));
 	kvm_mmu_uninit_vm(kvm);
-- 
2.50.1.487.gc89ff58d15-goog


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 6.12.y v2] KVM: x86: Free vCPUs before freeing VM state
  2025-07-26  1:37 ` Sasha Levin
@ 2025-07-28 17:51   ` Kevin Cheng
  2025-07-29 14:46     ` Greg KH
  2025-07-29 19:45     ` Sasha Levin
  0 siblings, 2 replies; 8+ messages in thread
From: Kevin Cheng @ 2025-07-28 17:51 UTC (permalink / raw)
  To: sashal
  Cc: stable, Sean Christopherson, Aaron Lewis, Jim Mattson, Yan Zhao,
	Rick P Edgecombe, Kai Huang, Isaku Yamahata, Paolo Bonzini,
	Kevin Cheng

From: Sean Christopherson <seanjc@google.com>

[ Upstream commit 17bcd714426386fda741a4bccd96a2870179344b ]

Free vCPUs before freeing any VM state, as both SVM and VMX may access
VM state when "freeing" a vCPU that is currently "in" L2, i.e. that needs
to be kicked out of nested guest mode.

Commit 6fcee03df6a1 ("KVM: x86: avoid loading a vCPU after .vm_destroy was
called") partially fixed the issue, but for unknown reasons only moved the
MMU unloading before VM destruction.  Complete the change, and free all
vCPU state prior to destroying VM state, as nVMX accesses even more state
than nSVM.

In addition to the AVIC, KVM can hit a use-after-free on MSR filters:

  kvm_msr_allowed+0x4c/0xd0
  __kvm_set_msr+0x12d/0x1e0
  kvm_set_msr+0x19/0x40
  load_vmcs12_host_state+0x2d8/0x6e0 [kvm_intel]
  nested_vmx_vmexit+0x715/0xbd0 [kvm_intel]
  nested_vmx_free_vcpu+0x33/0x50 [kvm_intel]
  vmx_free_vcpu+0x54/0xc0 [kvm_intel]
  kvm_arch_vcpu_destroy+0x28/0xf0
  kvm_vcpu_destroy+0x12/0x50
  kvm_arch_destroy_vm+0x12c/0x1c0
  kvm_put_kvm+0x263/0x3c0
  kvm_vm_release+0x21/0x30

and an upcoming fix to process injectable interrupts on nested VM-Exit
will access the PIC:

  BUG: kernel NULL pointer dereference, address: 0000000000000090
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  CPU: 23 UID: 1000 PID: 2658 Comm: kvm-nx-lpage-re
  RIP: 0010:kvm_cpu_has_extint+0x2f/0x60 [kvm]
  Call Trace:
   <TASK>
   kvm_cpu_has_injectable_intr+0xe/0x60 [kvm]
   nested_vmx_vmexit+0x2d7/0xdf0 [kvm_intel]
   nested_vmx_free_vcpu+0x40/0x50 [kvm_intel]
   vmx_vcpu_free+0x2d/0x80 [kvm_intel]
   kvm_arch_vcpu_destroy+0x2d/0x130 [kvm]
   kvm_destroy_vcpus+0x8a/0x100 [kvm]
   kvm_arch_destroy_vm+0xa7/0x1d0 [kvm]
   kvm_destroy_vm+0x172/0x300 [kvm]
   kvm_vcpu_release+0x31/0x50 [kvm]

Inarguably, both nSVM and nVMX need to be fixed, but punt on those
cleanups for the moment.  Conceptually, vCPUs should be freed before VM
state.  Assets like the I/O APIC and PIC _must_ be allocated before vCPUs
are created, so it stands to reason that they must be freed _after_ vCPUs
are destroyed.

Reported-by: Aaron Lewis <aaronlewis@google.com>
Closes: https://lore.kernel.org/all/20240703175618.2304869-2-aaronlewis@google.com
Cc: Jim Mattson <jmattson@google.com>
Cc: Yan Zhao <yan.y.zhao@intel.com>
Cc: Rick P Edgecombe <rick.p.edgecombe@intel.com>
Cc: Kai Huang <kai.huang@intel.com>
Cc: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20250224235542.2562848-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Cheng <chengkev@google.com>
---
 arch/x86/kvm/x86.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f378d479fea3f..7f91b11e6f0ec 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12888,11 +12888,11 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
 		mutex_unlock(&kvm->slots_lock);
 	}
 	kvm_unload_vcpu_mmus(kvm);
+	kvm_destroy_vcpus(kvm);
 	kvm_x86_call(vm_destroy)(kvm);
 	kvm_free_msr_filter(srcu_dereference_check(kvm->arch.msr_filter, &kvm->srcu, 1));
 	kvm_pic_destroy(kvm);
 	kvm_ioapic_destroy(kvm);
-	kvm_destroy_vcpus(kvm);
 	kvfree(rcu_dereference_check(kvm->arch.apic_map, 1));
 	kfree(srcu_dereference_check(kvm->arch.pmu_event_filter, &kvm->srcu, 1));
 	kvm_mmu_uninit_vm(kvm);
-- 
2.50.1.487.gc89ff58d15-goog


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 6.12.y v2] KVM: x86: Free vCPUs before freeing VM state
  2025-07-28 17:51   ` [PATCH 6.12.y v2] " Kevin Cheng
@ 2025-07-29 14:46     ` Greg KH
  2025-07-29 15:31       ` Kevin Cheng
  2025-07-29 19:45     ` Sasha Levin
  1 sibling, 1 reply; 8+ messages in thread
From: Greg KH @ 2025-07-29 14:46 UTC (permalink / raw)
  To: Kevin Cheng
  Cc: sashal, stable, Sean Christopherson, Aaron Lewis, Jim Mattson,
	Yan Zhao, Rick P Edgecombe, Kai Huang, Isaku Yamahata,
	Paolo Bonzini

On Mon, Jul 28, 2025 at 05:51:22PM +0000, Kevin Cheng wrote:
> From: Sean Christopherson <seanjc@google.com>
> 
> [ Upstream commit 17bcd714426386fda741a4bccd96a2870179344b ]
> 
> Free vCPUs before freeing any VM state, as both SVM and VMX may access
> VM state when "freeing" a vCPU that is currently "in" L2, i.e. that needs
> to be kicked out of nested guest mode.
> 
> Commit 6fcee03df6a1 ("KVM: x86: avoid loading a vCPU after .vm_destroy was
> called") partially fixed the issue, but for unknown reasons only moved the
> MMU unloading before VM destruction.  Complete the change, and free all
> vCPU state prior to destroying VM state, as nVMX accesses even more state
> than nSVM.
> 
> In addition to the AVIC, KVM can hit a use-after-free on MSR filters:
> 
>   kvm_msr_allowed+0x4c/0xd0
>   __kvm_set_msr+0x12d/0x1e0
>   kvm_set_msr+0x19/0x40
>   load_vmcs12_host_state+0x2d8/0x6e0 [kvm_intel]
>   nested_vmx_vmexit+0x715/0xbd0 [kvm_intel]
>   nested_vmx_free_vcpu+0x33/0x50 [kvm_intel]
>   vmx_free_vcpu+0x54/0xc0 [kvm_intel]
>   kvm_arch_vcpu_destroy+0x28/0xf0
>   kvm_vcpu_destroy+0x12/0x50
>   kvm_arch_destroy_vm+0x12c/0x1c0
>   kvm_put_kvm+0x263/0x3c0
>   kvm_vm_release+0x21/0x30
> 
> and an upcoming fix to process injectable interrupts on nested VM-Exit
> will access the PIC:
> 
>   BUG: kernel NULL pointer dereference, address: 0000000000000090
>   #PF: supervisor read access in kernel mode
>   #PF: error_code(0x0000) - not-present page
>   CPU: 23 UID: 1000 PID: 2658 Comm: kvm-nx-lpage-re
>   RIP: 0010:kvm_cpu_has_extint+0x2f/0x60 [kvm]
>   Call Trace:
>    <TASK>
>    kvm_cpu_has_injectable_intr+0xe/0x60 [kvm]
>    nested_vmx_vmexit+0x2d7/0xdf0 [kvm_intel]
>    nested_vmx_free_vcpu+0x40/0x50 [kvm_intel]
>    vmx_vcpu_free+0x2d/0x80 [kvm_intel]
>    kvm_arch_vcpu_destroy+0x2d/0x130 [kvm]
>    kvm_destroy_vcpus+0x8a/0x100 [kvm]
>    kvm_arch_destroy_vm+0xa7/0x1d0 [kvm]
>    kvm_destroy_vm+0x172/0x300 [kvm]
>    kvm_vcpu_release+0x31/0x50 [kvm]
> 
> Inarguably, both nSVM and nVMX need to be fixed, but punt on those
> cleanups for the moment.  Conceptually, vCPUs should be freed before VM
> state.  Assets like the I/O APIC and PIC _must_ be allocated before vCPUs
> are created, so it stands to reason that they must be freed _after_ vCPUs
> are destroyed.
> 
> Reported-by: Aaron Lewis <aaronlewis@google.com>
> Closes: https://lore.kernel.org/all/20240703175618.2304869-2-aaronlewis@google.com
> Cc: Jim Mattson <jmattson@google.com>
> Cc: Yan Zhao <yan.y.zhao@intel.com>
> Cc: Rick P Edgecombe <rick.p.edgecombe@intel.com>
> Cc: Kai Huang <kai.huang@intel.com>
> Cc: Isaku Yamahata <isaku.yamahata@intel.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> Message-ID: <20250224235542.2562848-2-seanjc@google.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Kevin Cheng <chengkev@google.com>
> ---
>  arch/x86/kvm/x86.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index f378d479fea3f..7f91b11e6f0ec 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -12888,11 +12888,11 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
>  		mutex_unlock(&kvm->slots_lock);
>  	}
>  	kvm_unload_vcpu_mmus(kvm);
> +	kvm_destroy_vcpus(kvm);
>  	kvm_x86_call(vm_destroy)(kvm);
>  	kvm_free_msr_filter(srcu_dereference_check(kvm->arch.msr_filter, &kvm->srcu, 1));
>  	kvm_pic_destroy(kvm);
>  	kvm_ioapic_destroy(kvm);
> -	kvm_destroy_vcpus(kvm);
>  	kvfree(rcu_dereference_check(kvm->arch.apic_map, 1));
>  	kfree(srcu_dereference_check(kvm->arch.pmu_event_filter, &kvm->srcu, 1));
>  	kvm_mmu_uninit_vm(kvm);
> -- 
> 2.50.1.487.gc89ff58d15-goog
> 
> 

What changed in this v2 version?

confused,

greg k-h

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 6.12.y v2] KVM: x86: Free vCPUs before freeing VM state
  2025-07-29 14:46     ` Greg KH
@ 2025-07-29 15:31       ` Kevin Cheng
  0 siblings, 0 replies; 8+ messages in thread
From: Kevin Cheng @ 2025-07-29 15:31 UTC (permalink / raw)
  To: Greg KH
  Cc: sashal, stable, Sean Christopherson, Aaron Lewis, Jim Mattson,
	Yan Zhao, Rick P Edgecombe, Kai Huang, Isaku Yamahata,
	Paolo Bonzini

I left out

Message-ID: <20250224235542.2562848-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

mistakingly.

Thank you,
Kevin



On Tue, Jul 29, 2025 at 10:46 AM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Mon, Jul 28, 2025 at 05:51:22PM +0000, Kevin Cheng wrote:
> > From: Sean Christopherson <seanjc@google.com>
> >
> > [ Upstream commit 17bcd714426386fda741a4bccd96a2870179344b ]
> >
> > Free vCPUs before freeing any VM state, as both SVM and VMX may access
> > VM state when "freeing" a vCPU that is currently "in" L2, i.e. that needs
> > to be kicked out of nested guest mode.
> >
> > Commit 6fcee03df6a1 ("KVM: x86: avoid loading a vCPU after .vm_destroy was
> > called") partially fixed the issue, but for unknown reasons only moved the
> > MMU unloading before VM destruction.  Complete the change, and free all
> > vCPU state prior to destroying VM state, as nVMX accesses even more state
> > than nSVM.
> >
> > In addition to the AVIC, KVM can hit a use-after-free on MSR filters:
> >
> >   kvm_msr_allowed+0x4c/0xd0
> >   __kvm_set_msr+0x12d/0x1e0
> >   kvm_set_msr+0x19/0x40
> >   load_vmcs12_host_state+0x2d8/0x6e0 [kvm_intel]
> >   nested_vmx_vmexit+0x715/0xbd0 [kvm_intel]
> >   nested_vmx_free_vcpu+0x33/0x50 [kvm_intel]
> >   vmx_free_vcpu+0x54/0xc0 [kvm_intel]
> >   kvm_arch_vcpu_destroy+0x28/0xf0
> >   kvm_vcpu_destroy+0x12/0x50
> >   kvm_arch_destroy_vm+0x12c/0x1c0
> >   kvm_put_kvm+0x263/0x3c0
> >   kvm_vm_release+0x21/0x30
> >
> > and an upcoming fix to process injectable interrupts on nested VM-Exit
> > will access the PIC:
> >
> >   BUG: kernel NULL pointer dereference, address: 0000000000000090
> >   #PF: supervisor read access in kernel mode
> >   #PF: error_code(0x0000) - not-present page
> >   CPU: 23 UID: 1000 PID: 2658 Comm: kvm-nx-lpage-re
> >   RIP: 0010:kvm_cpu_has_extint+0x2f/0x60 [kvm]
> >   Call Trace:
> >    <TASK>
> >    kvm_cpu_has_injectable_intr+0xe/0x60 [kvm]
> >    nested_vmx_vmexit+0x2d7/0xdf0 [kvm_intel]
> >    nested_vmx_free_vcpu+0x40/0x50 [kvm_intel]
> >    vmx_vcpu_free+0x2d/0x80 [kvm_intel]
> >    kvm_arch_vcpu_destroy+0x2d/0x130 [kvm]
> >    kvm_destroy_vcpus+0x8a/0x100 [kvm]
> >    kvm_arch_destroy_vm+0xa7/0x1d0 [kvm]
> >    kvm_destroy_vm+0x172/0x300 [kvm]
> >    kvm_vcpu_release+0x31/0x50 [kvm]
> >
> > Inarguably, both nSVM and nVMX need to be fixed, but punt on those
> > cleanups for the moment.  Conceptually, vCPUs should be freed before VM
> > state.  Assets like the I/O APIC and PIC _must_ be allocated before vCPUs
> > are created, so it stands to reason that they must be freed _after_ vCPUs
> > are destroyed.
> >
> > Reported-by: Aaron Lewis <aaronlewis@google.com>
> > Closes: https://lore.kernel.org/all/20240703175618.2304869-2-aaronlewis@google.com
> > Cc: Jim Mattson <jmattson@google.com>
> > Cc: Yan Zhao <yan.y.zhao@intel.com>
> > Cc: Rick P Edgecombe <rick.p.edgecombe@intel.com>
> > Cc: Kai Huang <kai.huang@intel.com>
> > Cc: Isaku Yamahata <isaku.yamahata@intel.com>
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > Message-ID: <20250224235542.2562848-2-seanjc@google.com>
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > Signed-off-by: Kevin Cheng <chengkev@google.com>
> > ---
> >  arch/x86/kvm/x86.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index f378d479fea3f..7f91b11e6f0ec 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -12888,11 +12888,11 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
> >               mutex_unlock(&kvm->slots_lock);
> >       }
> >       kvm_unload_vcpu_mmus(kvm);
> > +     kvm_destroy_vcpus(kvm);
> >       kvm_x86_call(vm_destroy)(kvm);
> >       kvm_free_msr_filter(srcu_dereference_check(kvm->arch.msr_filter, &kvm->srcu, 1));
> >       kvm_pic_destroy(kvm);
> >       kvm_ioapic_destroy(kvm);
> > -     kvm_destroy_vcpus(kvm);
> >       kvfree(rcu_dereference_check(kvm->arch.apic_map, 1));
> >       kfree(srcu_dereference_check(kvm->arch.pmu_event_filter, &kvm->srcu, 1));
> >       kvm_mmu_uninit_vm(kvm);
> > --
> > 2.50.1.487.gc89ff58d15-goog
> >
> >
>
> What changed in this v2 version?
>
> confused,
>
> greg k-h

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 6.12.y v2] KVM: x86: Free vCPUs before freeing VM state
  2025-07-28 17:51   ` [PATCH 6.12.y v2] " Kevin Cheng
  2025-07-29 14:46     ` Greg KH
@ 2025-07-29 19:45     ` Sasha Levin
  1 sibling, 0 replies; 8+ messages in thread
From: Sasha Levin @ 2025-07-29 19:45 UTC (permalink / raw)
  To: stable; +Cc: Sasha Levin

[ Sasha's backport helper bot ]

Hi,

✅ All tests passed successfully. No issues detected.
No action required from the submitter.

The upstream commit SHA1 provided is correct: 17bcd714426386fda741a4bccd96a2870179344b

WARNING: Author mismatch between patch and upstream commit:
Backport author: Kevin Cheng <chengkev@google.com>
Commit author: Sean Christopherson <seanjc@google.com>

Status in newer kernel trees:
6.15.y | Present (exact SHA1)

Note: The patch differs from the upstream commit:
---
1:  17bcd7144263 ! 1:  ee33d92e5a01 KVM: x86: Free vCPUs before freeing VM state
    @@ Metadata
      ## Commit message ##
         KVM: x86: Free vCPUs before freeing VM state
     
    +    [ Upstream commit 17bcd714426386fda741a4bccd96a2870179344b ]
    +
         Free vCPUs before freeing any VM state, as both SVM and VMX may access
         VM state when "freeing" a vCPU that is currently "in" L2, i.e. that needs
         to be kicked out of nested guest mode.
    @@ Commit message
         Signed-off-by: Sean Christopherson <seanjc@google.com>
         Message-ID: <20250224235542.2562848-2-seanjc@google.com>
         Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    +    Signed-off-by: Kevin Cheng <chengkev@google.com>
     
      ## arch/x86/kvm/x86.c ##
     @@ arch/x86/kvm/x86.c: void kvm_arch_destroy_vm(struct kvm *kvm)

---

Results of testing on various branches:

| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| origin/linux-6.12.y       | Success     | Success    |

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 6.12.y v2] KVM: x86: Free vCPUs before freeing VM state
  2025-07-28 17:50 ` [PATCH 6.12.y v2] KVM: x86: Free vCPUs before freeing VM state Kevin Cheng
@ 2025-07-29 19:49   ` Sasha Levin
  0 siblings, 0 replies; 8+ messages in thread
From: Sasha Levin @ 2025-07-29 19:49 UTC (permalink / raw)
  To: stable; +Cc: Sasha Levin

[ Sasha's backport helper bot ]

Hi,

✅ All tests passed successfully. No issues detected.
No action required from the submitter.

The upstream commit SHA1 provided is correct: 17bcd714426386fda741a4bccd96a2870179344b

WARNING: Author mismatch between patch and upstream commit:
Backport author: Kevin Cheng <chengkev@google.com>
Commit author: Sean Christopherson <seanjc@google.com>

Status in newer kernel trees:
6.15.y | Present (exact SHA1)

Note: The patch differs from the upstream commit:
---
1:  17bcd7144263 ! 1:  0bfb70711c37 KVM: x86: Free vCPUs before freeing VM state
    @@ Metadata
      ## Commit message ##
         KVM: x86: Free vCPUs before freeing VM state
     
    +    [ Upstream commit 17bcd714426386fda741a4bccd96a2870179344b ]
    +
         Free vCPUs before freeing any VM state, as both SVM and VMX may access
         VM state when "freeing" a vCPU that is currently "in" L2, i.e. that needs
         to be kicked out of nested guest mode.
    @@ Commit message
         Signed-off-by: Sean Christopherson <seanjc@google.com>
         Message-ID: <20250224235542.2562848-2-seanjc@google.com>
         Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    +    Signed-off-by: Kevin Cheng <chengkev@google.com>
     
      ## arch/x86/kvm/x86.c ##
     @@ arch/x86/kvm/x86.c: void kvm_arch_destroy_vm(struct kvm *kvm)

---

Results of testing on various branches:

| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| origin/linux-6.12.y       | Success     | Success    |

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2025-07-29 19:49 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-25 18:58 [PATCH 6.12.y] KVM: x86: Free vCPUs before freeing VM state Kevin Cheng
2025-07-26  1:37 ` Sasha Levin
2025-07-28 17:51   ` [PATCH 6.12.y v2] " Kevin Cheng
2025-07-29 14:46     ` Greg KH
2025-07-29 15:31       ` Kevin Cheng
2025-07-29 19:45     ` Sasha Levin
  -- strict thread matches above, loose matches on Subject: below --
2025-07-05 21:40 [PATCH 6.12.y v2] mm/vmalloc: fix data race in show_numa_info() Sasha Levin
2025-07-28 17:50 ` [PATCH 6.12.y v2] KVM: x86: Free vCPUs before freeing VM state Kevin Cheng
2025-07-29 19:49   ` Sasha Levin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).