From: Yosry Ahmed <yosry.ahmed@linux.dev>
To: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
Michael Larabel <Michael@michaellarabel.com>,
Borislav Petkov <bp@alien8.de>
Subject: Re: [PATCH v2] KVM: SVM: Set/clear SRSO's BP_SPEC_REDUCE on 0 <=> 1 VM count transitions
Date: Tue, 6 May 2025 09:48:52 +0000 [thread overview]
Message-ID: <aBnbBL8Db0rHXxFX@google.com> (raw)
In-Reply-To: <20250505180300.973137-1-seanjc@google.com>
On Mon, May 05, 2025 at 11:03:00AM -0700, Sean Christopherson wrote:
> Set the magic BP_SPEC_REDUCE bit to mitigate SRSO when running VMs if and
> only if KVM has at least one active VM. Leaving the bit set at all times
> unfortunately degrades performance by a wee bit more than expected.
>
> Use a dedicated spinlock and counter instead of hooking virtualization
> enablement, as changing the behavior of kvm.enable_virt_at_load based on
> SRSO_BP_SPEC_REDUCE is painful, and has its own drawbacks, e.g. could
> result in performance issues for flows that are sensitive to VM creation
> latency.
>
> Defer setting BP_SPEC_REDUCE until VMRUN is imminent to avoid impacting
> performance on CPUs that aren't running VMs, e.g. if a setup is using
> housekeeping CPUs. Setting BP_SPEC_REDUCE in task context, i.e. without
> blasting IPIs to all CPUs, also helps avoid serializing 1<=>N transitions
> without incurring a gross amount of complexity (see the Link for details
> on how ugly coordinating via IPIs gets).
>
> Link: https://lore.kernel.org/all/aBOnzNCngyS_pQIW@google.com
> Fixes: 8442df2b49ed ("x86/bugs: KVM: Add support for SRSO_MSR_FIX")
> Reported-by: Michael Larabel <Michael@michaellarabel.com>
> Closes: https://www.phoronix.com/review/linux-615-amd-regression
> Cc: Borislav Petkov <bp@alien8.de>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>
> v2: Defer setting BP_SPEC_REDUCE until VMRUN is imminent, which in turn
> allows for eliding the lock on 0<=>1 transitions as there is no race
> with CPUs doing VMRUN before receiving the IPI to set the bit, and
> having multiple tasks take the lock during svm_srso_vm_init() is a-ok.
>
> v1: https://lore.kernel.org/all/20250502223456.887618-1-seanjc@google.com
>
> arch/x86/kvm/svm/svm.c | 71 ++++++++++++++++++++++++++++++++++++++----
> arch/x86/kvm/svm/svm.h | 2 ++
> 2 files changed, 67 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index cc1c721ba067..15f7a0703c16 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -607,9 +607,6 @@ static void svm_disable_virtualization_cpu(void)
> kvm_cpu_svm_disable();
>
> amd_pmu_disable_virt();
> -
> - if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
> - msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
> }
>
> static int svm_enable_virtualization_cpu(void)
> @@ -687,9 +684,6 @@ static int svm_enable_virtualization_cpu(void)
> rdmsr(MSR_TSC_AUX, sev_es_host_save_area(sd)->tsc_aux, msr_hi);
> }
>
> - if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
> - msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
> -
> return 0;
> }
>
> @@ -1518,6 +1512,63 @@ static void svm_vcpu_free(struct kvm_vcpu *vcpu)
> __free_pages(virt_to_page(svm->msrpm), get_order(MSRPM_SIZE));
> }
>
> +#ifdef CONFIG_CPU_MITIGATIONS
> +static DEFINE_SPINLOCK(srso_lock);
> +static atomic_t srso_nr_vms;
> +
> +static void svm_srso_clear_bp_spec_reduce(void *ign)
> +{
> + struct svm_cpu_data *sd = this_cpu_ptr(&svm_data);
> +
> + if (!sd->bp_spec_reduce_set)
> + return;
> +
> + msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
> + sd->bp_spec_reduce_set = false;
> +}
> +
> +static void svm_srso_vm_destroy(void)
> +{
> + if (!cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
> + return;
> +
> + if (atomic_dec_return(&srso_nr_vms))
> + return;
> +
> + guard(spinlock)(&srso_lock);
> +
> + /*
> + * Verify a new VM didn't come along, acquire the lock, and increment
> + * the count before this task acquired the lock.
> + */
> + if (atomic_read(&srso_nr_vms))
> + return;
> +
> + on_each_cpu(svm_srso_clear_bp_spec_reduce, NULL, 1);
Just a passing-by comment. I get worried about sending IPIs while
holding a spinlock because if someone ever tries to hold that spinlock
with IRQs disabled, it may cause a deadlock.
This is not the case for this lock, but it's not obvious (at least to
me) that holding it in a different code path that doesn't send IPIs with
IRQs disabled could cause a problem.
You could add a comment, convert it to a mutex to make this scenario
impossible, or dismiss my comment as being too paranoid/ridiculous :)
next prev parent reply other threads:[~2025-05-06 9:49 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-05 18:03 [PATCH v2] KVM: SVM: Set/clear SRSO's BP_SPEC_REDUCE on 0 <=> 1 VM count transitions Sean Christopherson
2025-05-06 9:48 ` Yosry Ahmed [this message]
2025-05-06 14:16 ` Sean Christopherson
2025-05-06 14:29 ` Yosry Ahmed
2025-05-06 15:57 ` Sean Christopherson
2025-05-07 7:05 ` Yosry Ahmed
2025-05-07 13:19 ` Sean Christopherson
2025-05-06 14:22 ` Borislav Petkov
2025-05-08 23:04 ` Sean Christopherson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aBnbBL8Db0rHXxFX@google.com \
--to=yosry.ahmed@linux.dev \
--cc=Michael@michaellarabel.com \
--cc=bp@alien8.de \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=pbonzini@redhat.com \
--cc=seanjc@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.