From: Sean Christopherson <seanjc@google.com>
To: Yosry Ahmed <yosry@kernel.org>
Cc: Jim Mattson <jmattson@google.com>,
Paolo Bonzini <pbonzini@redhat.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 4/6] KVM: x86/pmu: Re-evaluate Host-Only/Guest-Only on nested SVM transitions
Date: Wed, 22 Apr 2026 15:42:09 -0700
Message-ID: <aelOwWh-kfM5EYsL@google.com>
In-Reply-To: <aefSuVNRtepzL921@google.com>

On Tue, Apr 21, 2026, Yosry Ahmed wrote:
> On Thu, Apr 09, 2026 at 02:21:14PM -0700, Sean Christopherson wrote:
> > On Thu, Apr 09, 2026, Sean Christopherson wrote:
> > > On Thu, Apr 09, 2026, Jim Mattson wrote:
> > > > On Thu, Apr 9, 2026 at 10:48 AM Sean Christopherson <seanjc@google.com> wrote:
> > > > > On Thu, Apr 09, 2026, Jim Mattson wrote:
> > > > > > > > In general, this deferral is misguided. The G/H bits should be
> > > > > > > > re-evaluated before we call kvm_pmu_instruction_retired() for an
> > > > > > > > emulated instruction.
...
> > > > > > > > This happens too late for VMRUN, since we have already called
> > > > > > > > kvm_pmu_instruction_retired() via kvm_skip_emulated_instruction(), and
> > > > > > > > VMRUN counts as a *guest* instruction.
> > > > > > >
> > > > > > > It's just VMRUN that's problematic though, correct? I.e. the scheme as a whole
> > > > > > > is fine, we just need to special case VMRUN due to SVM's erratum^Warchitecture.
> > > > > > > Alternatively, maybe we could get AMD to document the silly VMRUN behavior as an
> > > > > > > erratum, then we could claim KVM is architecturally superior. :-D
> > > > > >
> > > > > > Here, it's just VMRUN. Above, it's WRMSR(EFER).
> > > > >
> > > > > But clearing EFER.SVME while in the guest generates architecturally undefined
> > > > > behavior. I don't see any reason to complicate PMU virtualization for that
> > > > > scenario, especially now that KVM synthesizes triple fault for L1.
> > > >
> > > > L1 can clear the virtual EFER.SVME. That is well-defined.
> > >
> > > Gah, I forgot that the H/G bits are ignored when EFER.SVME=0. That's really
> > > annoying.
> >
> > What do you think about having two flavors of kvm_pmu_handle_nested_transition()?
> > One that defers via a request, and a "special" (SVM-only?) version that does
> > direct updates.
> >
> > Poking into PMU state in arbitrary contexts makes me nervous. E.g. when calling
> > svm_leave_nested(), odds are good EFER isn't even correct, and the update *needs*
> > to be deferred.
>
> Hmm is it really that bad?

It's not horrible, but it's a lot of "I think" and "should" and whatnot. I
generally agree that it's unlikely to be a problem, but I can point at far too
many bugs where KVM unexpectedly invokes a helper and consumes stale state.

I'm not completely opposed to non-deferred updates, but I really don't want to
use them for svm_leave_nested().

> - In the emulated VMRUN and #VMEXIT paths, EFER.SVME should be set in
> both L1 and L2, so it should be fine.
>
> - In the restore path entering guest mode, EFER.SVME should also be set
> in both L1 and L2.
>
> So I think svm_leave_nested() is the only interesting case:
>
> - In the restore path, svm_leave_nested() should only be called if the
> CPU is in guest mode and EFER.SVME is set in both L1 and L2.
>
> - In the EFER update path, L1 will get a shutdown if we forcefully leave
> nested anyway, unless userspace is setting state. Either way, the
> value of EFER should be correct and valid to use to update the PMU
> here.
>
> - In the vCPU free path, it shouldn't really matter, but the value of
> EFER should still be correct.
>
> So I *think* generally the value of EFER should be fine to use. The
> other inputs are is_guest_mode() and eventsel. In both cases we should
> just make sure to update the PMU *after* updating the state.
>
> So I think we'd end up with something similar to Jim's v2:
> https://lore.kernel.org/kvm/20260129232835.3710773-1-jmattson@google.com/
>
> We will directly re-evaluate eventsel_hw on nested transitions, EFER
> updates, and PMU MSR updates -- without deferring anything.
>
> We'd still need to make other changes:
> - Always disable the PMC if EFER.SVME is clear and either H/G bit (or
> both) is set.
>
> - Handle VMRUN correctly. I was going to suggest just moving the call to
> kvm_skip_emulated_instruction() to the end of the function, but that
> doesn't handle the case where we inject #VMEXIT(INVALID) due to a
> VMRUN failure (e.g. consistency checks, loading CR3, etc).
>
> I am actually not sure if the instruction should count in host or
> guest mode in this case. Logically, we never entered the guest, so
> perhaps counting it in host mode is the right thing to do? I think
> we'll also need to test what HW does.
>
> Honestly, it would be a lot easier if someone from AMD could just tell
> us these things :)
>
> Basically:
> - Does the PMU generally count based on processor state (e.g. guest
> mode, EFER.SVME) before or after instruction retirement?
> - A successful VMRUN will be counted in guest mode, what about a
> failed VMRUN that produces #VMEXIT(INVALID)?
>
> > I definitely don't love having two separate update mechanisms, but it seems like
> > the safest option in this case.
>
> Same here, and I like the deferred handling, but to Jim's point I think
> we can't use it anywhere :/

Why can't we defer the svm_leave_nested() case? The only flows that invoke
svm_leave_nested() are non-architectural; being precise there doesn't matter at
all (and I'm not convinced it matters in general given none of us can figure
out what hardware is _supposed_ to do).

Having a synchronous path for architectural flows and a deferred mechanism for
everything else seems reasonable, and would all but eliminate my concerns about
consuming stale state and/or doing things like attempting to write MSRs while
freeing a vCPU.
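
To make the split concrete, here is a rough userspace model of the two flavors. It is only a sketch, not proposed KVM code: struct toy_vcpu and every toy_* helper are hypothetical names invented for illustration, and the deferred "request" is modeled as a plain flag. The enable check follows the rule quoted above (counter disabled if EFER.SVME is clear and either H/G bit is set); treating "both bits set with EFER.SVME=1" as "count in both modes" is an assumption on my part.

```c
#include <stdbool.h>
#include <stdint.h>

/* Event-select bits: enable (bit 22), Guest-Only (bit 40), Host-Only (bit 41). */
#define EVENTSEL_ENABLE    (1ULL << 22)
#define EVENTSEL_GUESTONLY (1ULL << 40)
#define EVENTSEL_HOSTONLY  (1ULL << 41)

/* Hypothetical stand-in for a vCPU with one counter; not KVM's real structs. */
struct toy_vcpu {
	bool efer_svme;       /* L1's virtual EFER.SVME */
	bool guest_mode;      /* true while L2 is active */
	uint64_t eventsel;    /* event select as written by the guest */
	bool counting;        /* derived state: is the counter running? */
	bool update_pending;  /* models a deferred re-evaluation request */
};

/* Decide whether the counter should run given the current vCPU state. */
static bool toy_pmc_should_count(const struct toy_vcpu *v)
{
	uint64_t hg = v->eventsel & (EVENTSEL_GUESTONLY | EVENTSEL_HOSTONLY);

	if (!(v->eventsel & EVENTSEL_ENABLE))
		return false;
	if (!hg)
		return true;
	/* Per the thread: disabled if EFER.SVME=0 and either H/G bit is set. */
	if (!v->efer_svme)
		return false;
	/* Assumption: both bits set with EFER.SVME=1 counts in both modes. */
	if (hg == (EVENTSEL_GUESTONLY | EVENTSEL_HOSTONLY))
		return true;
	return v->guest_mode ? hg == EVENTSEL_GUESTONLY
			     : hg == EVENTSEL_HOSTONLY;
}

/* Synchronous flavor, for architectural flows (emulated VMRUN, #VMEXIT). */
static void toy_pmu_update_now(struct toy_vcpu *v)
{
	v->counting = toy_pmc_should_count(v);
	v->update_pending = false;
}

/* Deferred flavor, for non-architectural flows (e.g. a forced leave-nested). */
static void toy_pmu_update_deferred(struct toy_vcpu *v)
{
	v->update_pending = true;
}

/* Runs before the next guest entry, once all state is known to be settled. */
static void toy_pmu_service_requests(struct toy_vcpu *v)
{
	if (v->update_pending)
		toy_pmu_update_now(v);
}
```

The point of the deferred flavor is that toy_pmu_update_deferred() never reads EFER or guest_mode at the call site, so it is safe to invoke from contexts where that state may be stale; the evaluation happens later in toy_pmu_service_requests().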