From: Sean Christopherson <seanjc@google.com>
To: Aaron Lewis <aaronlewis@google.com>
Cc: Mingwei Zhang <mizhang@google.com>,
Paolo Bonzini <pbonzini@redhat.com>,
"H. Peter Anvin" <hpa@zytor.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] KVM: x86/pmu: Reset perf_capabilities in vcpu to 0 if PDCM is disabled
Date: Wed, 24 Jan 2024 13:25:50 -0800 [thread overview]
Message-ID: <ZbGAXpFUso9JzIjo@google.com> (raw)
In-Reply-To: <CAAAPnDFAvJBuETUsBScX6WqSbf_j=5h_CpWwrPHwXdBxDg_LFQ@mail.gmail.com>
On Wed, Jan 24, 2024, Aaron Lewis wrote:
> On Wed, Jan 24, 2024 at 7:49 AM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Wed, Jan 24, 2024, Mingwei Zhang wrote:
> > > Reset vcpu->arch.perf_capabilities to 0 if PDCM is disabled in guest cpuid.
> > > Without this, there is an issue in live migration. In particular, to
> > > migrate a VM with no PDCM enabled, VMM on the source is able to retrieve a
> > > non-zero value by reading the MSR_IA32_PERF_CAPABILITIES. However, VMM on
> > > the target is unable to set the value. This creates confusions on the user
> > > side.
> > >
> > > Fundamentally, it is because vcpu->arch.perf_capabilities as the cached
> > > value of MSR_IA32_PERF_CAPABILITIES is incorrect, and there is nothing
> > > wrong on the kvm_get_msr_common() which just reads
> > > vcpu->arch.perf_capabilities.
> > >
> > > Fix the issue by adding the reset code in kvm_vcpu_after_set_cpuid(), i.e.
> > > early in VM setup time.
> > >
> > > Cc: Aaron Lewis <aaronlewis@google.com>
> > > Signed-off-by: Mingwei Zhang <mizhang@google.com>
> > > ---
> > > arch/x86/kvm/cpuid.c | 3 +++
> > > 1 file changed, 3 insertions(+)
> > >
> > > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > > index adba49afb5fe..416bee03c42a 100644
> > > --- a/arch/x86/kvm/cpuid.c
> > > +++ b/arch/x86/kvm/cpuid.c
> > > @@ -369,6 +369,9 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> > > vcpu->arch.maxphyaddr = cpuid_query_maxphyaddr(vcpu);
> > > vcpu->arch.reserved_gpa_bits = kvm_vcpu_reserved_gpa_bits_raw(vcpu);
> > >
> > > + /* Reset MSR_IA32_PERF_CAPABILITIES guest value to 0 if PDCM is off. */
> > > + if (!guest_cpuid_has(vcpu, X86_FEATURE_PDCM))
> > > + vcpu->arch.perf_capabilities = 0;
> >
> > No, this is just papering over the underlying bug. KVM shouldn't be stuffing
> > vcpu->arch.perf_capabilities without explicit writes from host userspace. E.g
> > KVM_SET_CPUID{,2} is allowed multiple times, at which point KVM could clobber a
> > host userspace write to MSR_IA32_PERF_CAPABILITIES. It's unlikely any userspace
> > actually does something like that, but KVM overwriting guest state is almost
> > never a good thing.
> >
> > I've been meaning to send a patch for a long time (IIRC, Aaron also ran into this?).
> > KVM needs to simply not stuff vcpu->arch.perf_capabilities. I believe we are
> > already fudging around this in our internal kernels, so I don't think there's a
> > need to carry a hack-a-fix for the destination kernel.
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 27e23714e960..fdef9d706d61 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -12116,7 +12116,6 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
> >
> > kvm_async_pf_hash_reset(vcpu);
> >
> > - vcpu->arch.perf_capabilities = kvm_caps.supported_perf_cap;
>
> Yeah, that will fix the issue we are seeing. The only thing that's
> not clear to me is if userspace should expect KVM to set this or if
> KVM should expect userspace to set this. How is that generally
> decided?
By "this", you mean the effective RESET value for vcpu->arch.perf_capabilities?
To be consistent with KVM's CPUID module at vCPU creation, which is completely
empty (vCPU has no PMU and no PDCM support) KVM *must* zero
vcpu->arch.perf_capabilities.
If userspace wants a non-zero value, then userspace needs to set CPUID to enable
PDCM and set MSR_IA32_PERF_CAPABILITIES.
MSR_IA32_ARCH_CAPABILITIES is in the same boat, e.g. a vCPU without
X86_FEATURE_ARCH_CAPABILITIES can end up seeing a non-zero MSR value. That too
should be excised.
In a perfect world, KVM would also zero-initialize vcpu->arch.msr_platform_info,
but that one is less obviously broken and also less obviously safe to remove.
commit e53d88af63ab4104e1226b8f9959f1e9903da10b
Author: Jim Mattson <jmattson@google.com>
AuthorDate: Tue Oct 30 12:20:21 2018 -0700
Commit: Paolo Bonzini <pbonzini@redhat.com>
CommitDate: Fri Dec 14 18:00:01 2018 +0100
kvm: x86: Don't modify MSR_PLATFORM_INFO on vCPU reset
If userspace has provided a different value for this MSR (e.g with the
turbo bits set), the userspace-provided value should survive a vCPU
reset. For backwards compatibility, MSR_PLATFORM_INFO is initialized
in kvm_arch_vcpu_setup.
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Drew Schmitt <dasch@google.com>
Cc: Abhiroop Dabral <adabral@paloaltonetworks.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In other words, KVM shouldn't define the vCPU model beyond the absolute bare
minimum that is required by the x86 architecture (as of P6 CPUs, which is more
or less the oldest CPU KVM can reasonably virtualize without carrying useless code).
next prev parent reply other threads:[~2024-01-24 21:25 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-24 0:38 [PATCH 0/2] minor fix on perf_capabilities in KVM/x86 Mingwei Zhang
2024-01-24 0:38 ` [PATCH 1/2] KVM: x86/pmu: Reset perf_capabilities in vcpu to 0 if PDCM is disabled Mingwei Zhang
2024-01-24 15:49 ` Sean Christopherson
2024-01-24 21:04 ` Aaron Lewis
2024-01-24 21:25 ` Sean Christopherson [this message]
2024-01-24 22:24 ` Mingwei Zhang
2024-01-24 22:51 ` Sean Christopherson
2024-01-25 0:14 ` Mingwei Zhang
2024-01-26 18:33 ` Sean Christopherson
2024-01-26 19:30 ` Mingwei Zhang
2024-01-26 19:34 ` Sean Christopherson
2024-01-29 14:40 ` Paolo Bonzini
2024-01-29 14:39 ` Paolo Bonzini
2024-01-31 19:43 ` Mingwei Zhang
2024-01-24 0:38 ` [PATCH 2/2] KVM: x86/pmu: Remove vcpu_get_perf_capabilities() Mingwei Zhang
2024-01-24 15:52 ` Sean Christopherson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZbGAXpFUso9JzIjo@google.com \
--to=seanjc@google.com \
--cc=aaronlewis@google.com \
--cc=hpa@zytor.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mizhang@google.com \
--cc=pbonzini@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox