From: Sean Christopherson <seanjc@google.com>
To: Yosry Ahmed <yosry.ahmed@linux.dev>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
stable@vger.kernel.org
Subject: Re: [RFC PATCH 4/5] KVM: SVM: Recalculate nested RIPs after restoring REGS/SREGS
Date: Wed, 18 Feb 2026 16:13:29 -0800 [thread overview]
Message-ID: <aZZVqQrQ1iCNJhJJ@google.com> (raw)
In-Reply-To: <20260212230751.1871720-5-yosry.ahmed@linux.dev>
On Thu, Feb 12, 2026, Yosry Ahmed wrote:
> In the save/restore path, if KVM_SET_NESTED_STATE is performed before
> restoring REGS and/or SREGS , the values of CS and RIP used to
> initialize the vmcb02's NextRIP and soft interrupt tracking RIPs are
> incorrect.
>
> Recalculate them up after CS is set, or REGS are restored. This is only
> needed when a nested run is pending during restore. After L2 runs for
> the first time, any soft interrupts injected by L1 are already delivered
> or tracked by KVM separately for re-injection, so the CS and RIP values
> are no longer relevant.
>
> If KVM_SET_NESTED_STATE is performed after both REGS and SREGS are
> restored, it will just overwrite the fields.
Apparently I suggested this general idea, but ugh. :-)
> static void svm_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg)
> {
> kvm_register_mark_available(vcpu, reg);
> @@ -1826,6 +1844,8 @@ static void svm_set_segment(struct kvm_vcpu *vcpu,
> if (seg == VCPU_SREG_SS)
> /* This is symmetric with svm_get_segment() */
> svm->vmcb->save.cpl = (var->dpl & 3);
> + else if (seg == VCPU_SREG_CS)
> + svm_fixup_nested_rips(vcpu);
>
> vmcb_mark_dirty(svm->vmcb, VMCB_SEG);
> }
> @@ -5172,6 +5192,7 @@ struct kvm_x86_ops svm_x86_ops __initdata = {
> .get_rflags = svm_get_rflags,
> .set_rflags = svm_set_rflags,
> .get_if_flag = svm_get_if_flag,
> + .post_user_set_regs = svm_fixup_nested_rips,
>
> .flush_tlb_all = svm_flush_tlb_all,
> .flush_tlb_current = svm_flush_tlb_current,
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index db3f393192d9..35fe1d337273 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -12112,6 +12112,8 @@ static void __set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
> kvm_rip_write(vcpu, regs->rip);
> kvm_set_rflags(vcpu, regs->rflags | X86_EFLAGS_FIXED);
>
> + kvm_x86_call(post_user_set_regs)(vcpu);
I especially don't love this callback. Aside from adding a new kvm_x86_ops hook,
I don't like that _any_ CS change triggers a fixup, whereas only userspace writes
to RIP trigger a fixup. That _should_ be a moot point, because neither CS nor RIP
should change while nested_run_pending is true, but I dislike the asymmetry.
I was going to suggest we instead react to RIP being dirty, but what if we take
it a step further? Somewhat of a crazy idea, but what happens if we simply wait
until just before VMRUN to set soft_int_csbase, soft_int_old_rip, and
soft_int_next_rip (when the guest doesn't have NRIPS)?
E.g. after patch 2, completely untested...
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index aec17c80ed73..6fc1b2e212d2 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -863,12 +863,9 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
svm->nmi_l1_to_l2 = is_evtinj_nmi(vmcb02->control.event_inj);
if (is_evtinj_soft(vmcb02->control.event_inj)) {
svm->soft_int_injected = true;
- svm->soft_int_csbase = vmcb12_csbase;
- svm->soft_int_old_rip = vmcb12_rip;
+
if (guest_cpu_cap_has(vcpu, X86_FEATURE_NRIPS))
svm->soft_int_next_rip = svm->nested.ctl.next_rip;
- else
- svm->soft_int_next_rip = vmcb12_rip;
}
/* LBR_CTL_ENABLE_MASK is controlled by svm_update_lbrv() */
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 8f8bc863e214..358ec940ffc9 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4322,6 +4322,14 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu, u64 run_flags)
return EXIT_FASTPATH_EXIT_USERSPACE;
}
+ if (is_guest_mode(vcpu) && svm->nested.nested_run_pending &&
+ svm->soft_int_injected) {
+ svm->soft_int_csbase = svm->vmcb->save.cs.base;
+ svm->soft_int_old_rip = kvm_rip_read(vcpu);
+ if (!guest_cpu_cap_has(vcpu, X86_FEATURE_NRIPS))
+ svm->soft_int_next_rip = kvm_rip_read(vcpu);
+ }
+
sync_lapic_to_cr8(vcpu);
if (unlikely(svm->asid != svm->vmcb->control.asid)) {
next prev parent reply other threads:[~2026-02-19 0:13 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-12 23:07 [RFC PATCH 0/5] KVM: nSVM: Fix RIP usage in the control area after restore Yosry Ahmed
2026-02-12 23:07 ` [RFC PATCH 1/5] KVM: nSVM: Do not use L2's RIP for vmcb02's NextRIP after first L2 VMRUN Yosry Ahmed
2026-02-18 23:22 ` Sean Christopherson
2026-02-18 23:38 ` Yosry Ahmed
2026-02-12 23:07 ` [RFC PATCH 2/5] KVM: nSVM: Use the correct RIP when restoring vmcb02's control area Yosry Ahmed
2026-02-12 23:07 ` [RFC PATCH 3/5] KVM: nSVM: Move updating NextRIP and soft IRQ RIPs into a helper Yosry Ahmed
2026-02-12 23:07 ` [RFC PATCH 4/5] KVM: SVM: Recalculate nested RIPs after restoring REGS/SREGS Yosry Ahmed
2026-02-19 0:13 ` Sean Christopherson [this message]
2026-02-19 0:26 ` Yosry Ahmed
2026-02-20 17:07 ` Sean Christopherson
2026-02-20 20:27 ` Yosry Ahmed
2026-02-20 22:50 ` Sean Christopherson
2026-02-12 23:07 ` [RFC PATCH 5/5] DO NOT MERGE: KVM: selftests: Reproduce nested RIP restore bug Yosry Ahmed
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aZZVqQrQ1iCNJhJJ@google.com \
--to=seanjc@google.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=pbonzini@redhat.com \
--cc=stable@vger.kernel.org \
--cc=yosry.ahmed@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.