From: Sean Christopherson <seanjc@google.com>
To: Yosry Ahmed <yosry.ahmed@linux.dev>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
stable@vger.kernel.org
Subject: Re: [RFC PATCH 4/5] KVM: SVM: Recalculate nested RIPs after restoring REGS/SREGS
Date: Fri, 20 Feb 2026 14:50:40 -0800 [thread overview]
Message-ID: <aZjlQGmYT4ufduOT@google.com> (raw)
In-Reply-To: <unqj7mrl5j2feevcuwfpiurhtzppbdn7b5gimlalvunv3bx25y@5ko7vwgzxxdw>
On Fri, Feb 20, 2026, Yosry Ahmed wrote:
> > > > svm->nmi_l1_to_l2 = is_evtinj_nmi(vmcb02->control.event_inj);
> > > > if (is_evtinj_soft(vmcb02->control.event_inj)) {
> > > > svm->soft_int_injected = true;
> > > > - svm->soft_int_csbase = vmcb12_csbase;
> > > > - svm->soft_int_old_rip = vmcb12_rip;
> > > > +
> > > > if (guest_cpu_cap_has(vcpu, X86_FEATURE_NRIPS))
> > > > svm->soft_int_next_rip = svm->nested.ctl.next_rip;
> > >
> > > Why not move this too?
> >
> > For the same reason I think we should keep
> >
> > if (guest_cpu_cap_has(vcpu, X86_FEATURE_NRIPS))
> > vmcb02->control.next_rip = svm->nested.ctl.next_rip;
> >
> > where it is. When NRIPS is exposed to the guest, the incoming nested state is
> > the one and only source of truth. By keeping the code different, we'd effectively
> > be documenting that the host.NRIPS+!guest.NRIPS case is the anomaly.
>
> I see, makes sense. I like the fact that we should be able to completely
> drop vmcb12_rip and vmcb12_csbase with this (unless we want to start
> using it for the bus_lock_rip check), which will also remove the need
> for patch 2.
...
> It seems a bit fragile that the 'if' is somewhere and the 'else' (or the
> opposite condition) is somewhere else. They could get out of sync.
> Maybe
> a helper will make this better:
>
> /* Huge comment */
> bool nested_svm_use_vmcb12_next_rip(struct kvm_vcpu *vcpu)
> {
> return guest_cpu_cap_has(vcpu, X86_FEATURE_NRIPS) ||
> !svm->nested.nested_run_pending;
> }
>
> or maybe the name makes more sense as the negative:
> nested_svm_use_rip_as_next_rip()?
>
> I don't like both names..
>
> Aha! Maybe it's actually better to have the helper set the NextRip
> directly?
>
> /* Huge comment */
> u64 nested_vmcb02_update_next_rip(struct kvm_vcpu *vcpu)
> {
> u64 next_rip;
>
> if (WARN_ON_ONCE(svm->vmcb != svm->nested.vmcb02.ptr))
> return;
>
> if (!boot_cpu_has(X86_FEATURE_NRIPS))
> return;
>
> if (guest_cpu_cap_has(vcpu, X86_FEATURE_NRIPS) ||
> !svm->nested.nested_run_pending)
> next_rip = svm->nested.ctl.next_rip;
> else
> next_rip = kvm_rip_read(vcpu);
>
> svm->vmcb->control.next_rip = next_rip;
> }
>
> Then, we just call this from nested_vmcb02_prepare_control() and
> svm_vcpu_run() (with the soft IRQ stuff). In some cases we'll put the
> wrong thing in vmcb02 and then fix it up later, but I think that's fine
> (that's what the patch is doing anyway).
>
> However, we lose the fact that the whole thing is guarded by
> nested_run_pending, so performance in svm_vcpu_run() could suffer. We
> could leave it all guarded by nested_run_pending, as
> nested_vmcb02_update_next_rip() should do nothing in svm_vcpu_run()
> otherwise, but then the caller is depending on implementation details of
> the helper.
>
> Maybe just put it in svm_prepare_switch_to_guest() to begin with and not
> guard it with nested_run_pending?
Of all the options, I still like open coding the two, even though it means the
"else" will be separated from the "if", followed by the
nested_svm_use_vmcb12_next_rip() helper option. I straight up dislike
nested_vmcb02_update_next_rip() because it buries simple concepts behind a
complex function (copy vmcb12->next_rip to vmcb02->next_rip is at its core a
*very* simple idea). Oh, and it unnecessarily rewrites vmcb02->next_rip in the
common case where the guest has NRIPs.
I'm a-ok with the separate "if" and "else", because it's not a pure "else", and
that's the entire reason we have a mess in the first place: we wrote an if-else,
but what is actually necessitate by KVM's uAPI is two separate (but tightly
coupled) if-statements.
next prev parent reply other threads:[~2026-02-20 22:50 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-12 23:07 [RFC PATCH 0/5] KVM: nSVM: Fix RIP usage in the control area after restore Yosry Ahmed
2026-02-12 23:07 ` [RFC PATCH 1/5] KVM: nSVM: Do not use L2's RIP for vmcb02's NextRIP after first L2 VMRUN Yosry Ahmed
2026-02-18 23:22 ` Sean Christopherson
2026-02-18 23:38 ` Yosry Ahmed
2026-02-12 23:07 ` [RFC PATCH 2/5] KVM: nSVM: Use the correct RIP when restoring vmcb02's control area Yosry Ahmed
2026-02-12 23:07 ` [RFC PATCH 3/5] KVM: nSVM: Move updating NextRIP and soft IRQ RIPs into a helper Yosry Ahmed
2026-02-12 23:07 ` [RFC PATCH 4/5] KVM: SVM: Recalculate nested RIPs after restoring REGS/SREGS Yosry Ahmed
2026-02-19 0:13 ` Sean Christopherson
2026-02-19 0:26 ` Yosry Ahmed
2026-02-20 17:07 ` Sean Christopherson
2026-02-20 20:27 ` Yosry Ahmed
2026-02-20 22:50 ` Sean Christopherson [this message]
2026-02-12 23:07 ` [RFC PATCH 5/5] DO NOT MERGE: KVM: selftests: Reproduce nested RIP restore bug Yosry Ahmed
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aZjlQGmYT4ufduOT@google.com \
--to=seanjc@google.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=pbonzini@redhat.com \
--cc=stable@vger.kernel.org \
--cc=yosry.ahmed@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.