public inbox for kvm@vger.kernel.org
From: Sean Christopherson <seanjc@google.com>
To: Yosry Ahmed <yosry@kernel.org>
Cc: Jim Mattson <jmattson@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	kvm@vger.kernel.org,  linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 4/6] KVM: x86/pmu: Re-evaluate Host-Only/Guest-Only on nested SVM transitions
Date: Wed, 22 Apr 2026 15:42:09 -0700
Message-ID: <aelOwWh-kfM5EYsL@google.com>
In-Reply-To: <aefSuVNRtepzL921@google.com>

On Tue, Apr 21, 2026, Yosry Ahmed wrote:
> On Thu, Apr 09, 2026 at 02:21:14PM -0700, Sean Christopherson wrote:
> > On Thu, Apr 09, 2026, Sean Christopherson wrote:
> > > On Thu, Apr 09, 2026, Jim Mattson wrote:
> > > > On Thu, Apr 9, 2026 at 10:48 AM Sean Christopherson <seanjc@google.com> wrote:
> > > > > On Thu, Apr 09, 2026, Jim Mattson wrote:
> > > > > > > > In general, this deferral is misguided. The G/H bits should be
> > > > > > > > re-evaluated before we call kvm_pmu_instruction_retired() for an
> > > > > > > > emulated instruction.

...

> > > > > > > > This happens too late for VMRUN, since we have already called
> > > > > > > > kvm_pmu_instruction_retired() via kvm_skip_emulated_instruction(), and
> > > > > > > > VMRUN counts as a *guest* instruction.
> > > > > > >
> > > > > > > It's just VMRUN that's problematic though, correct?  I.e. the scheme as a whole
> > > > > > > is fine, we just need to special case VMRUN due to SVM's erratum^Warchitecture.
> > > > > > > Alternatively, maybe we could get AMD to document the silly VMRUN behavior as an
> > > > > > > erratum, then we could claim KVM is architecturally superior. :-D
> > > > > >
> > > > > > Here, it's just VMRUN. Above, it's WRMSR(EFER).
> > > > >
> > > > > But clearing EFER.SVME while in the guest generates architecturally undefined
> > > > > behavior.  I don't see any reason to complicate PMU virtualization for that
> > > > > scenario, especially now that KVM synthesizes triple fault for L1.
> > > > 
> > > > L1 can clear the virtual EFER.SVME. That is well-defined.
> > > 
> > > Gah, I forgot that the H/G bits are ignored when EFER.SVME=0.  That's really
> > > annoying.
> > 
> > What do you think about having two flavors of kvm_pmu_handle_nested_transition()?
> > One that defers via a request, and a "special" (SVM-only?) version that does
> > direct updates.
> > 
> > Poking into PMU state in arbitrary contexts makes me nervous.  E.g. when calling
> > svm_leave_nested(), odds are good EFER isn't even correct, and the update *needs*
> > to be deferred.
> 
> Hmm is it really that bad?

It's not horrible, but it's a lot of "I think" and "should" and whatnot.  I
generally agree that it's unlikely to be a problem, but I can point at far too
many bugs where KVM unexpectedly invokes a helper and consumes stale state.

I'm not completely opposed to non-deferred updates, but I really don't want to
use them for svm_leave_nested(). 

> - In the emulated VMRUN and #VMEXIT paths, EFER.SVME should be set in
>   both L1 and L2, so it should be fine.
> 
> - In the restore path entering guest mode, EFER.SVME should also be set
>   in both L1 and L2.
> 
> So I think svm_leave_nested() is the only interesting case:
> 
> - In the restore path, svm_leave_nested() should only be called if the
>   CPU is in guest mode and EFER.SVME is set in both L1 and L2.
> 
> - In the EFER update path, L1 will get a shutdown if we forcefully leave
>   nested anyway, unless userspace is setting state. Either way, the
>   value of EFER should be correct and valid to use to update the PMU
>   here.
> 
> - In the vCPU free path, it shouldn't really matter, but the value of
>   EFER should still be correct.

> So I *think* generally the value of EFER should be fine to use. The
> other inputs are is_guest_mode() and eventsel. In both cases we should
> just make sure to update the PMU *after* updating the state.
> 
> So I think we'd end up with something similar to Jim's v2:
> https://lore.kernel.org/kvm/20260129232835.3710773-1-jmattson@google.com/
> 
> We will directly re-evaluate eventsel_hw on nested transitions, EFER
> updates, and PMU MSR updates -- without deferring anything.
>
> We'd still need to make other changes:
> - Always disable the PMC if EFER.SVME is clear and either H/G bit (or
>   both) is set.
> 
> - Handle VMRUN correctly. I was going to suggest just moving the call to
>   kvm_skip_emulated_instruction() to the end of the function, but that
>   doesn't handle the case where we inject #VMEXIT(INVALID) due to a
>   VMRUN failure (e.g. consistency checks, loading CR3, etc).
> 
>   I am actually not sure if the instruction should count in host or
>   guest mode in this case. Logically, we never entered the guest, so
>   perhaps counting it in host mode is the right thing to do? I think
>   we'll also need to test what HW does.
> 
>   Honestly, it would be a lot easier if someone from AMD could just tell
>   us these things :)
> 
>   Basically:
>   - Does the PMU generally count based on processor state (e.g. guest
>     mode, EFER.SVME) before or after instruction retirement?
>   - A successful VMRUN will be counted in guest mode, what about a
>     failed VMRUN that produces #VMEXIT(INVALID)?
> 
> > I definitely don't love having two separate update mechanisms, but it seems like
> > the safest option in this case.
> 
> Same here, and I like the deferred handling, but to Jim's point I don't
> think we can use it everywhere :/

Why can't we defer the svm_leave_nested() case?  The only flows that invoke
svm_leave_nested() are non-architectural, so being precise there doesn't matter at
all (and I'm not convinced it matters in general given none of us can figure
out what hardware is _supposed_ to do).

Having a synchronous path for architectural flows, and a deferred mechanism for
everything else seems reasonable, and would all but eliminate my concerns about
consuming stale state and/or doing things like attempting to write MSRs while
freeing a vCPU.
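
Purely as an illustration of the split described above (this is not code from
the series), here is a minimal userspace sketch.  Everything prefixed with
toy_ is hypothetical and does not exist in KVM.  The bit positions are the
ones I believe Linux uses for EFER.SVME and the AMD event-select bits.  The
"SVME clear and either H/G bit set => disabled" rule is the one proposed
earlier in the thread; treating "both bits set with SVME=1" as "count in
both modes" is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions (believed to match Linux's definitions). */
#define EFER_SVME          (1ULL << 12)
#define EVENTSEL_ENABLE    (1ULL << 22)
#define EVENTSEL_GUESTONLY (1ULL << 40)
#define EVENTSEL_HOSTONLY  (1ULL << 41)

/* Hypothetical, stripped-down vCPU; not KVM's struct kvm_vcpu. */
struct toy_vcpu {
	uint64_t efer;
	uint64_t eventsel;    /* guest-visible event select */
	uint64_t eventsel_hw; /* value that would be loaded into hardware */
	bool guest_mode;      /* true while L2 is active */
	bool reeval_pending;  /* stand-in for a deferred request */
};

static bool toy_pmc_should_count(const struct toy_vcpu *v)
{
	bool host_only = (v->eventsel & EVENTSEL_HOSTONLY) != 0;
	bool guest_only = (v->eventsel & EVENTSEL_GUESTONLY) != 0;

	if (!(v->eventsel & EVENTSEL_ENABLE))
		return false;
	if (!host_only && !guest_only)
		return true;
	/* Rule proposed in the thread: SVME clear + any H/G bit => disabled. */
	if (!(v->efer & EFER_SVME))
		return false;
	/* Assumption: both bits set counts in host and guest mode. */
	if (host_only && guest_only)
		return true;
	return guest_only == v->guest_mode;
}

/* Synchronous path: consume EFER, guest_mode and eventsel right now. */
static void toy_pmu_reeval(struct toy_vcpu *v)
{
	if (toy_pmc_should_count(v))
		v->eventsel_hw = v->eventsel;
	else
		v->eventsel_hw = v->eventsel & ~EVENTSEL_ENABLE;
	v->reeval_pending = false;
}

/* Deferred path: only record that a re-evaluation is needed. */
static void toy_request_reeval(struct toy_vcpu *v)
{
	v->reeval_pending = true;
}

/* Would run before re-entering the guest, once state is consistent. */
static void toy_service_requests(struct toy_vcpu *v)
{
	if (v->reeval_pending)
		toy_pmu_reeval(v);
}

/* Architectural flow (emulated VMRUN): update state, then sync re-eval. */
static void toy_emulated_vmrun(struct toy_vcpu *v)
{
	v->guest_mode = true;
	toy_pmu_reeval(v);
}

/* Non-architectural flow (forced leave): never touch PMU state here. */
static void toy_leave_nested(struct toy_vcpu *v)
{
	v->guest_mode = false;
	toy_request_reeval(v);
}

/* Walk a Guest-Only counter through both kinds of transition. */
static void toy_selfcheck(void)
{
	struct toy_vcpu v = {
		.efer = EFER_SVME,
		.eventsel = EVENTSEL_ENABLE | EVENTSEL_GUESTONLY,
	};

	toy_pmu_reeval(&v);
	assert(!(v.eventsel_hw & EVENTSEL_ENABLE)); /* host mode: off */
	toy_emulated_vmrun(&v);
	assert((v.eventsel_hw & EVENTSEL_ENABLE) != 0); /* updated at once */
	toy_leave_nested(&v);
	assert((v.eventsel_hw & EVENTSEL_ENABLE) != 0); /* stale until serviced */
	toy_service_requests(&v);
	assert(!(v.eventsel_hw & EVENTSEL_ENABLE));
}
```

The point of the sketch is only the shape: the forced-leave helper never reads
EFER or writes eventsel_hw itself, so it cannot consume stale state, at the
cost of the hardware value being briefly out of date until requests are
serviced.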

Thread overview: 28+ messages
2026-03-26  3:11 [PATCH v4 0/6] KVM: x86/pmu: Add support for AMD Host-Only/Guest-Only bits Yosry Ahmed
2026-03-26  3:11 ` [PATCH v4 1/6] KVM: x86: Move enable_pmu/enable_mediated_pmu to pmu.h and pmu.c Yosry Ahmed
2026-03-26  3:11 ` [PATCH v4 2/6] KVM: x86: Move guest_mode helpers to x86.h Yosry Ahmed
2026-03-26 22:48   ` kernel test robot
2026-03-26 23:18     ` Yosry Ahmed
2026-03-27  3:15   ` kernel test robot
2026-03-26  3:11 ` [PATCH v4 3/6] KVM: x86/pmu: Disable counters based on Host-Only/Guest-Only bits in SVM Yosry Ahmed
2026-04-07  1:30   ` Sean Christopherson
2026-03-26  3:11 ` [PATCH v4 4/6] KVM: x86/pmu: Re-evaluate Host-Only/Guest-Only on nested SVM transitions Yosry Ahmed
2026-04-07  1:35   ` Sean Christopherson
2026-04-09  4:59   ` Jim Mattson
2026-04-09 17:22     ` Sean Christopherson
2026-04-09 17:29       ` Jim Mattson
2026-04-09 17:48         ` Sean Christopherson
2026-04-09 18:35           ` Jim Mattson
2026-04-09 18:38             ` Sean Christopherson
2026-04-09 21:21               ` Sean Christopherson
2026-04-10  3:50                 ` Jim Mattson
2026-04-15 21:26                   ` Sean Christopherson
2026-04-15 23:07                     ` Jim Mattson
2026-04-16  0:29                       ` Sean Christopherson
2026-04-17 22:51                         ` Jim Mattson
2026-04-21 20:01                 ` Yosry Ahmed
2026-04-22 22:42                   ` Sean Christopherson [this message]
2026-03-26  3:11 ` [PATCH v4 5/6] KVM: x86/pmu: Allow Host-Only/Guest-Only bits with nSVM and mediated PMU Yosry Ahmed
2026-03-26  3:11 ` [PATCH v4 6/6] KVM: selftests: Add svm_pmu_host_guest_test for Host-Only/Guest-Only bits Yosry Ahmed
2026-04-07  1:39   ` Sean Christopherson
2026-04-07  3:23     ` Jim Mattson
