public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: Mingwei Zhang <mizhang@google.com>
To: Sean Christopherson <seanjc@google.com>
Cc: Aaron Lewis <aaronlewis@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] KVM: x86/pmu: Reset perf_capabilities in vcpu to 0 if PDCM is disabled
Date: Wed, 24 Jan 2024 22:24:43 +0000	[thread overview]
Message-ID: <ZbGOK9m6UKkQ38bK@google.com> (raw)
In-Reply-To: <ZbGAXpFUso9JzIjo@google.com>

On Wed, Jan 24, 2024, Sean Christopherson wrote:
> On Wed, Jan 24, 2024, Aaron Lewis wrote:
> > On Wed, Jan 24, 2024 at 7:49 AM Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > On Wed, Jan 24, 2024, Mingwei Zhang wrote:
> > > > Reset vcpu->arch.perf_capabilities to 0 if PDCM is disabled in guest cpuid.
> > > > Without this, there is an issue in live migration. In particular, to
> > > > migrate a VM with no PDCM enabled, VMM on the source is able to retrieve a
> > > > non-zero value by reading the MSR_IA32_PERF_CAPABILITIES. However, VMM on
> > > > the target is unable to set the value. This creates confusions on the user
> > > > side.
> > > >
> > > > Fundamentally, it is because vcpu->arch.perf_capabilities as the cached
> > > > value of MSR_IA32_PERF_CAPABILITIES is incorrect, and there is nothing
> > > > wrong on the kvm_get_msr_common() which just reads
> > > > vcpu->arch.perf_capabilities.
> > > >
> > > > Fix the issue by adding the reset code in kvm_vcpu_after_set_cpuid(), i.e.
> > > > early in VM setup time.
> > > >
> > > > Cc: Aaron Lewis <aaronlewis@google.com>
> > > > Signed-off-by: Mingwei Zhang <mizhang@google.com>
> > > > ---
> > > >  arch/x86/kvm/cpuid.c | 3 +++
> > > >  1 file changed, 3 insertions(+)
> > > >
> > > > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > > > index adba49afb5fe..416bee03c42a 100644
> > > > --- a/arch/x86/kvm/cpuid.c
> > > > +++ b/arch/x86/kvm/cpuid.c
> > > > @@ -369,6 +369,9 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> > > >       vcpu->arch.maxphyaddr = cpuid_query_maxphyaddr(vcpu);
> > > >       vcpu->arch.reserved_gpa_bits = kvm_vcpu_reserved_gpa_bits_raw(vcpu);
> > > >
> > > > +     /* Reset MSR_IA32_PERF_CAPABILITIES guest value to 0 if PDCM is off. */
> > > > +     if (!guest_cpuid_has(vcpu, X86_FEATURE_PDCM))
> > > > +             vcpu->arch.perf_capabilities = 0;
> > >
> > > No, this is just papering over the underlying bug.  KVM shouldn't be stuffing
> > > vcpu->arch.perf_capabilities without explicit writes from host userspace.  E.g
> > > KVM_SET_CPUID{,2} is allowed multiple times, at which point KVM could clobber a
> > > host userspace write to MSR_IA32_PERF_CAPABILITIES.  It's unlikely any userspace
> > > actually does something like that, but KVM overwriting guest state is almost
> > > never a good thing.
> > >
> > > I've been meaning to send a patch for a long time (IIRC, Aaron also ran into this?).
> > > KVM needs to simply not stuff vcpu->arch.perf_capabilities.  I believe we are
> > > already fudging around this in our internal kernels, so I don't think there's a
> > > need to carry a hack-a-fix for the destination kernel.
> > >
> > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > index 27e23714e960..fdef9d706d61 100644
> > > --- a/arch/x86/kvm/x86.c
> > > +++ b/arch/x86/kvm/x86.c
> > > @@ -12116,7 +12116,6 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
> > >
> > >         kvm_async_pf_hash_reset(vcpu);
> > >
> > > -       vcpu->arch.perf_capabilities = kvm_caps.supported_perf_cap;
> > 
> > Yeah, that will fix the issue we are seeing.  The only thing that's
> > not clear to me is if userspace should expect KVM to set this or if
> > KVM should expect userspace to set this.  How is that generally
> > decided?
> 
> By "this", you mean the effective RESET value for vcpu->arch.perf_capabilities?
> To be consistent with KVM's CPUID module at vCPU creation, which is completely
> empty (vCPU has no PMU and no PDCM support) KVM *must* zero
> vcpu->arch.perf_capabilities.
> 
> If userspace wants a non-zero value, then userspace needs to set CPUID to enable
> PDCM and set MSR_IA32_PERF_CAPABILITIES.
> 
> MSR_IA32_ARCH_CAPABILITIES is in the same boat, e.g. a vCPU without
> X86_FEATURE_ARCH_CAPABILITIES can end up seeing a non-zero MSR value.  That too
> should be excised.
> 
hmm, does that mean KVM just allows an invalid vcpu state exist from
host point of view? I think this makes a lot of confusions on migration
where VMM on the source believes that a non-zero value from KVM_GET_MSRS
is valid and the VMM on the target will find it not true.

If we follow the suggestion by removing the initial value at vCPU
creation time, then I think it breaks the existing VMM code, since that
requires VMM to explicitly set the MSR, which I am not sure we do today.

The following code below is different. The key difference is that the
following code preserves a valid value, but this case is to not preserve
an invalid value. 

Thanks.
-Mingwei

> In a perfect world, KVM would also zero-initialize vcpu->arch.msr_platform_info,
> but that one is less obviously broken and also less obviously safe to remove.
> 
>   commit e53d88af63ab4104e1226b8f9959f1e9903da10b
>   Author:     Jim Mattson <jmattson@google.com>
>   AuthorDate: Tue Oct 30 12:20:21 2018 -0700
>   Commit:     Paolo Bonzini <pbonzini@redhat.com>
>   CommitDate: Fri Dec 14 18:00:01 2018 +0100
> 
>       kvm: x86: Don't modify MSR_PLATFORM_INFO on vCPU reset
>     
>       If userspace has provided a different value for this MSR (e.g with the
>       turbo bits set), the userspace-provided value should survive a vCPU
>       reset. For backwards compatibility, MSR_PLATFORM_INFO is initialized
>       in kvm_arch_vcpu_setup.
>     
>       Signed-off-by: Jim Mattson <jmattson@google.com>
>       Reviewed-by: Drew Schmitt <dasch@google.com>
>       Cc: Abhiroop Dabral <adabral@paloaltonetworks.com>
>       Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> 
> In other words, KVM shouldn't define the vCPU model beyond the absolute bare
> minimum that is required by the x86 architecture (as of P6 CPUs, which is more
> or less the oldest CPU KVM can reasonably virtualize without carrying useless code).

  reply	other threads:[~2024-01-24 22:24 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-24  0:38 [PATCH 0/2] minor fix on perf_capabilities in KVM/x86 Mingwei Zhang
2024-01-24  0:38 ` [PATCH 1/2] KVM: x86/pmu: Reset perf_capabilities in vcpu to 0 if PDCM is disabled Mingwei Zhang
2024-01-24 15:49   ` Sean Christopherson
2024-01-24 21:04     ` Aaron Lewis
2024-01-24 21:25       ` Sean Christopherson
2024-01-24 22:24         ` Mingwei Zhang [this message]
2024-01-24 22:51           ` Sean Christopherson
2024-01-25  0:14             ` Mingwei Zhang
2024-01-26 18:33               ` Sean Christopherson
2024-01-26 19:30                 ` Mingwei Zhang
2024-01-26 19:34                   ` Sean Christopherson
2024-01-29 14:40                   ` Paolo Bonzini
2024-01-29 14:39             ` Paolo Bonzini
2024-01-31 19:43               ` Mingwei Zhang
2024-01-24  0:38 ` [PATCH 2/2] KVM: x86/pmu: Remove vcpu_get_perf_capabilities() Mingwei Zhang
2024-01-24 15:52   ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZbGOK9m6UKkQ38bK@google.com \
    --to=mizhang@google.com \
    --cc=aaronlewis@google.com \
    --cc=hpa@zytor.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox