public inbox for kvm@vger.kernel.org
* Re: [PATCH] KVM:SVM: Flush cache only on CPUs running SEV guest
@ 2024-03-05 14:57 Zheyun Shen
  2024-03-05 16:49 ` Sean Christopherson
  0 siblings, 1 reply; 8+ messages in thread
From: Zheyun Shen @ 2024-03-05 14:57 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: pbonzini, tglx, thomas.lendacky, kvm, linux-kernel

On Mon, Mar 04, 2024, Sean Christopherson wrote:
> Instead of copy+paste WBINVD+cpumask_clear() everywhere, add a prep patch to
> replace relevant open coded calls to wbinvd_on_all_cpus() with calls to
> sev_guest_memory_reclaimed().  Then only sev_guest_memory_reclaimed() needs to be
> updated, and IMO it helps document why KVM is blasting WBINVD.

> I'm also pretty sure this should be a cpumask_var_t, and dynamically allocated
> as appropriate.  And at that point, it should be allocated and filled if and only
> if the CPU doesn't have X86_FEATURE_SME_COHERENT

I notice that several callers of wbinvd_on_all_cpus() must use WBINVD to flush the cache,
rather than using CLFLUSH or doing nothing, even when the CPU has X86_FEATURE_SME_COHERENT,
according to https://github.com/AMDESE/linux/commit/2e2409afe5f0c284c7dfe5504058e8d115806a7d
Therefore, I think the flush operation should be split into two functions. One is the
optimized WBINVD, which ignores X86_FEATURE_SME_COHERENT, and the other is
sev_guest_memory_reclaimed(), which should use CLFLUSH instead of WBINVD when
X86_FEATURE_SME_COHERENT is present. Thus the cpumask struct should exist whether or not
the CPU has X86_FEATURE_SME_COHERENT.

Besides, if we rely on X86_FEATURE_SME_COHERENT to get rid of WBINVD in sev_guest_memory_reclaimed(),
we must ensure that CLFLUSH is called on the corresponding addresses, as mentioned in
https://github.com/AMDESE/linux/commit/d45829b351ee6ec5f54dd55e6aca1f44fe239fe6
However, the callers of sev_guest_memory_reclaimed() (e.g., kvm_mmu_notifier_invalidate_range_start())
only get HVAs that belong to userspace (e.g., QEMU), so calling CLFLUSH on such an HVA may
lead to a page fault in the kernel. I was wondering whether notifying userspace applications to
do the CLFLUSH themselves is the only solution here. But for the sake of safety, maybe KVM
cannot leave this work to untrusted userspace applications?

Or should I just temporarily ignore the X86_FEATURE_SME_COHERENT scenario, which is hard to
implement, and refine the patch only for wbinvd_on_all_cpus()?

Best regards,
Zheyun Shen

^ permalink raw reply	[flat|nested] 8+ messages in thread
* [PATCH] KVM:SVM: Flush cache only on CPUs running SEV guest
@ 2024-03-01  2:30 Zheyun Shen
  2024-03-04 17:55 ` Sean Christopherson
  0 siblings, 1 reply; 8+ messages in thread
From: Zheyun Shen @ 2024-03-01  2:30 UTC (permalink / raw)
  To: seanjc, pbonzini, tglx; +Cc: kvm, linux-kernel

On AMD CPUs that do not enforce cache coherency in hardware, every memory page
reclamation from an SEV guest triggers a call to wbinvd_on_all_cpus(), which hurts
the performance of other programs on the host.

Typically, an AMD server may have 128 cores or more, while the SEV guest might only
utilize 8 of those cores. Meanwhile, the host can use qemu-affinity to bind these 8 vCPUs
to specific physical CPUs.

Therefore, recording the physical core numbers each time a vCPU runs can help avoid
flushing the caches of all CPUs every time.

Signed-off-by: Zheyun Shen <szy0127@sjtu.edu.cn>
---
 arch/x86/include/asm/smp.h |  1 +
 arch/x86/kvm/svm/sev.c     | 28 ++++++++++++++++++++++++----
 arch/x86/kvm/svm/svm.c     |  4 ++++
 arch/x86/kvm/svm/svm.h     |  3 +++
 arch/x86/lib/cache-smp.c   |  7 +++++++
 5 files changed, 39 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 4fab2ed45..19297202b 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -120,6 +120,7 @@ void native_play_dead(void);
 void play_dead_common(void);
 void wbinvd_on_cpu(int cpu);
 int wbinvd_on_all_cpus(void);
+int wbinvd_on_cpus(struct cpumask *cpumask);
 
 void smp_kick_mwait_play_dead(void);
 
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index f760106c3..b6ed9a878 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -215,6 +215,21 @@ static void sev_asid_free(struct kvm_sev_info *sev)
         sev->misc_cg = NULL;
 }
 
+struct cpumask *sev_get_cpumask(struct kvm *kvm)
+{
+        struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+
+        return &sev->cpumask;
+}
+
+void sev_clear_cpumask(struct kvm *kvm)
+{
+        struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+
+        cpumask_clear(&sev->cpumask);
+}
+
 static void sev_decommission(unsigned int handle)
 {
         struct sev_data_decommission decommission;
@@ -255,6 +270,7 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
         if (unlikely(sev->active))
                 return ret;
 
+        cpumask_clear(&sev->cpumask);
         sev->active = true;
         sev->es_active = argp->id == KVM_SEV_ES_INIT;
         asid = sev_asid_new(sev);
@@ -2048,7 +2064,8 @@ int sev_mem_enc_unregister_region(struct kvm *kvm,
          * releasing the pages back to the system for use. CLFLUSH will
          * not do this, so issue a WBINVD.
          */
-        wbinvd_on_all_cpus();
+        wbinvd_on_cpus(sev_get_cpumask(kvm));
+        sev_clear_cpumask(kvm);
 
         __unregister_enc_region_locked(kvm, region);
 
@@ -2152,7 +2169,8 @@ void sev_vm_destroy(struct kvm *kvm)
          * releasing the pages back to the system for use. CLFLUSH will
          * not do this, so issue a WBINVD.
          */
-        wbinvd_on_all_cpus();
+        wbinvd_on_cpus(sev_get_cpumask(kvm));
+        sev_clear_cpumask(kvm);
 
         /*
          * if userspace was terminated before unregistering the memory regions
@@ -2343,7 +2361,8 @@ static void sev_flush_encrypted_page(struct kvm_vcpu *vcpu, void *va)
         return;
 
 do_wbinvd:
-        wbinvd_on_all_cpus();
+        wbinvd_on_cpus(sev_get_cpumask(vcpu->kvm));
+        sev_clear_cpumask(vcpu->kvm);
 }
 
 void sev_guest_memory_reclaimed(struct kvm *kvm)
@@ -2351,7 +2370,8 @@ void sev_guest_memory_reclaimed(struct kvm *kvm)
         if (!sev_guest(kvm))
                 return;
 
-        wbinvd_on_all_cpus();
+        wbinvd_on_cpus(sev_get_cpumask(kvm));
+        sev_clear_cpumask(kvm);
 }
 
 void sev_free_vcpu(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index e90b429c8..f9bfa6e57 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4107,6 +4107,10 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_in
 
         amd_clear_divider();
 
+        if (sev_guest(vcpu->kvm))
+                cpumask_set_cpu(smp_processor_id(), sev_get_cpumask(vcpu->kvm));
+
         if (sev_es_guest(vcpu->kvm))
                 __svm_sev_es_vcpu_run(svm, spec_ctrl_intercepted);
         else
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 8ef95139c..1577e200e 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -90,6 +90,7 @@ struct kvm_sev_info {
         struct list_head mirror_entry; /* Use as a list entry of mirrors */
         struct misc_cg *misc_cg; /* For misc cgroup accounting */
         atomic_t migration_in_progress;
+        struct cpumask cpumask; /* CPU list to flush */
 };
 
 struct kvm_svm {
@@ -694,6 +695,8 @@ void sev_es_vcpu_reset(struct vcpu_svm *svm);
 void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector);
 void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa);
 void sev_es_unmap_ghcb(struct vcpu_svm *svm);
+struct cpumask *sev_get_cpumask(struct kvm *kvm);
+void sev_clear_cpumask(struct kvm *kvm);
 
 /* vmenter.S */
 
diff --git a/arch/x86/lib/cache-smp.c b/arch/x86/lib/cache-smp.c
index 7af743bd3..8806f53ba 100644
--- a/arch/x86/lib/cache-smp.c
+++ b/arch/x86/lib/cache-smp.c
@@ -20,3 +20,10 @@ int wbinvd_on_all_cpus(void)
         return 0;
 }
 EXPORT_SYMBOL(wbinvd_on_all_cpus);
+
+int wbinvd_on_cpus(struct cpumask *mask)
+{
+        on_each_cpu_cond_mask(NULL, __wbinvd, NULL, 1, mask);
+        return 0;
+}
+EXPORT_SYMBOL(wbinvd_on_cpus);
--
2.34.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2024-03-06 16:45 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-05 14:57 [PATCH] KVM:SVM: Flush cache only on CPUs running SEV guest Zheyun Shen
2024-03-05 16:49 ` Sean Christopherson
  -- strict thread matches above, loose matches on Subject: below --
2024-03-01  2:30 Zheyun Shen
2024-03-04 17:55 ` Sean Christopherson
2024-03-05 22:51   ` Tom Lendacky
2024-03-06  0:19     ` Sean Christopherson
2024-03-06 16:26       ` Tom Lendacky
2024-03-06 16:45         ` Sean Christopherson
