All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Wei W Wang <wei.w.wang@intel.com>
Cc: Mingwei Zhang <mizhang@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	 "H. Peter Anvin" <hpa@zytor.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	 "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Aaron Lewis <aaronlewis@google.com>,
	 Jim Mattson <jmattson@google.com>
Subject: Re: [PATCH] KVM: x86/pmu: Return correct value of IA32_PERF_CAPABILITIES for userspace after vCPU has run
Date: Wed, 13 Mar 2024 07:42:18 -0700	[thread overview]
Message-ID: <ZfG7SgyqTTtqF3cw@google.com> (raw)
In-Reply-To: <DS0PR11MB63731F54EA26D14CF7D6A3FDDC2A2@DS0PR11MB6373.namprd11.prod.outlook.com>

On Wed, Mar 13, 2024, Wei W Wang wrote:
> On Wednesday, March 13, 2024 8:38 AM, Mingwei Zhang wrote:
> > Return correct value of IA32_PERF_CAPABILITIES when userspace tries to read
> > it after vCPU has already run. Previously, KVM will always return the guest
> > cached value on get_msr() even if guest CPUID lacks X86_FEATURE_PDCM. The
> > guest cached value on default is kvm_caps.supported_perf_cap. However,
> > when userspace sets the value during live migration, the call fails because of
> > the check on X86_FEATURE_PDCM.
> 
> Could you point where in the set_msr path that could fail?
> (I don’t find there is a check of X86_FEATURE_PDCM in vmx_set_msr and
> kvm_set_msr_common)

The changelog is misleading, it's not the PDCM feature bit, it's the PMU version
check in vmx_set_msr():

	case MSR_IA32_PERF_CAPABILITIES:
		if (data && !vcpu_to_pmu(vcpu)->version)
			return 1;

> > Initially, it sounds like a pure userspace issue. It is not. After vCPU has run,
> > KVM should faithfully return correct value to satisify legitimate requests from
> > userspace such as VM suspend/resume and live migrartion. In this case, KVM
> > should return 0 when guest cpuid lacks X86_FEATURE_PDCM. 
> Some typos above (satisfy, migration, CPUID)
> 
> Seems the description here isn’t aligned to your code below?
> The code below prevents userspace from reading the MSR value (not return 0 as the
> read value) in that case.

Ya.

> >So fix the
> > problem by adding an additional check in vmx_set_msr().
> > 
> > Note that IA32_PERF_CAPABILITIES is emulated on AMD side, which is fine
> > because it set_msr() is guarded by kvm_caps.supported_perf_cap which is
> > always 0.
> > 
> > Cc: Aaron Lewis <aaronlewis@google.com>
> > Cc: Jim Mattson <jmattson@google.com>
> > Signed-off-by: Mingwei Zhang <mizhang@google.com>
> > ---
> >  arch/x86/kvm/vmx/vmx.c | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> > 
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index
> > 40e3780d73ae..6d8667b56091 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -2049,6 +2049,17 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu,
> > struct msr_data *msr_info)
> >  		msr_info->data = to_vmx(vcpu)->msr_ia32_sgxlepubkeyhash
> >  			[msr_info->index - MSR_IA32_SGXLEPUBKEYHASH0];
> >  		break;
> > +	case MSR_IA32_PERF_CAPABILITIES:
> > +		/*
> > +		 * Host VMM should not get potentially invalid MSR value if
> > vCPU
> > +		 * has already run but guest cpuid lacks the support for the
> > +		 * MSR.
> > +		 */
> > +		if (msr_info->host_initiated &&
> > +		    kvm_vcpu_has_run(vcpu) &&
> > +		    !guest_cpuid_has(vcpu, X86_FEATURE_PDCM))
> > +			return 1;

As Wei pointed out, this doesn't match the changelog.  And I don't think this is
what we want, at least not in isolation.  Making KVM more restrictive on userspace
reads doesn't solve the live migration save/restore issue, and the kvm_vcpu_has_run()
adds yet another flavor of MSR handling.

We discussed this whole MSRs mess at PUCK this morning.  I forgot to hit RECORD,
but Paolo took notes and will post them soon.

Going from memory, the plan is to:

  1. Commit to, and document, that userspace must do KVM_SET_CPUID{,2} prior to
     KVM_SET_MSRS.

  2. Go with roughly what I proposed in the CET thread (for unsupported MSRS,
     read 0 and drop writes of '0')[*].

  3. Add a quire for PERF_CAPABILITIES, ARCH_CAPABILITIES, and PLATFORM_INFO
     (if quirk==enabled, keep KVM's current behavior; if quirk==disabled, zero-
      initialize the MSRs).

With those pieces in place, KVM can simply check X86_FEATURE_PDCM for both reads
and writes to PERF_CAPABILITIES, and the common userspace MSR handling will
convert "unsupported" to "success" as appropriate.

[*] https://lore.kernel.org/all/ZfDdS8rtVtyEr0UR@google.com

  reply	other threads:[~2024-03-13 14:42 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-13  0:37 [PATCH] KVM: x86/pmu: Return correct value of IA32_PERF_CAPABILITIES for userspace after vCPU has run Mingwei Zhang
2024-03-13 10:22 ` Wang, Wei W
2024-03-13 14:42   ` Sean Christopherson [this message]
2024-03-13 15:01     ` Paolo Bonzini
2024-03-13 18:18       ` Mingwei Zhang
2024-03-13 17:30   ` Mingwei Zhang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZfG7SgyqTTtqF3cw@google.com \
    --to=seanjc@google.com \
    --cc=aaronlewis@google.com \
    --cc=hpa@zytor.com \
    --cc=jmattson@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mizhang@google.com \
    --cc=pbonzini@redhat.com \
    --cc=wei.w.wang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.