From: Yosry Ahmed <yosry@kernel.org>
To: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Jim Mattson <jmattson@google.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 3/7] KVM: SVM: Move RAX legality check to SVM insn interception handlers
Date: Fri, 13 Mar 2026 23:08:52 +0000 [thread overview]
Message-ID: <abSXvZGuxCGJH0ID@google.com> (raw)
In-Reply-To: <abSTQifukogF5yEF@google.com>
> > > +	/*
> > > +	 * VMware backdoor emulation on #GP interception only handles
> > > +	 * IN{S}, OUT{S}, and RDPMC.
> > > +	 */
> > > +	if (!is_guest_mode(vcpu))
> > > +		return kvm_emulate_instruction(vcpu, EMULTYPE_VMWARE_GP | EMULTYPE_NO_DECODE);
> >
> > AI review pointed out that we should not drop the page_address_valid()
> > from here, because if an SVM instruction is executed by L2, and KVM
> > intercepts the #GP, it should re-inject the #GP into L2 if RAX is
> > illegal instead of synthesizing a #VMEXIT to L1.
>
> No, because the intercept has higher priority than the #GP due to bad RAX.
Which is literally what I say next :P
>
> > My initial instinct is to keep the check here as well as in the intercept
> > handlers, but no, L1's intercept should take precedence over #GP due to
> > invalid RAX anyway. In fact, if L1 has the intercept set, then it must be set
> > in vmcb02, and KVM would get a #VMEXIT on the intercept not on #GP.
>
> Except for the erratum case.
Yes.
>
> > The actual problem is that the current code does not check if L1
> > actually sets the intercept in emulate_svm_instr().
>
> Oh dagnabbit. I had thought about this, multiple times, but wrote it off as a
> non-issue because if L1 wanted to intercept VMWHATEVER, KVM would set the intercept
> in vmcb02 and would get _that_ instead of a #GP. But the erratum case means that
> hardware could have signaled #GP even when the instruction should have been
> intercepted.
The problem is actually the other way around, it's when L1 does not want
to intercept it. So I think it's a problem regardless of the erratum.
> And I also forgot that KVM could be intercepting #GP for the VMware crud, which
> would unintentionally grab the CPL case too. Darn kitchen sink #GPs.
>
> > So if L1 and KVM do not set the intercept, and RAX is invalid, the current
> > code could synthesize a spurious #VMEXIT to L1 instead of reinjecting #GP.
> > The existing check on RAX prevents that, but it doesn't really fix the
> > problem because if we get #GP due to CPL != 0, we'll still generate a
> > spurious #VMEXIT to L1. What we really should be doing in gp_interception()
> > is:
> >
> > 1. if CPL != 0, re-inject #GP.
> > 2. If in guest mode and L1 intercepts the instruction, synthesize a #VMEXIT.
> > 3. Otherwise emulate the instruction, which would take care of
> > re-injecting the #GP if RAX is invalid with this patch.
> >
> > Something like this on top (over 2 patches):
> >
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index cf5ebdc4b27bf..8942272eb80b2 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -2237,10 +2237,11 @@ static int emulate_svm_instr(struct kvm_vcpu *vcpu, int opcode)
> >  		[SVM_INSTR_VMLOAD] = vmload_interception,
> >  		[SVM_INSTR_VMSAVE] = vmsave_interception,
> >  	};
> > +	int exit_code = guest_mode_exit_codes[opcode];
> >  	struct vcpu_svm *svm = to_svm(vcpu);
> >  
> > -	if (is_guest_mode(vcpu)) {
> > -		nested_svm_simple_vmexit(svm, guest_mode_exit_codes[opcode]);
> > +	if (is_guest_mode(vcpu) && vmcb12_is_intercept(&svm->nested.ctl, exit_code)) {
> > +		nested_svm_simple_vmexit(svm, exit_code);
> >  		return 1;
> >  	}
> >  	return svm_instr_handlers[opcode](vcpu);
> > @@ -2269,8 +2270,11 @@ static int gp_interception(struct kvm_vcpu *vcpu)
> >  		goto reinject;
> >  
> >  	opcode = svm_instr_opcode(vcpu);
> > -	if (opcode != NONE_SVM_INSTR)
> > +	if (opcode != NONE_SVM_INSTR) {
> > +		if (svm->vmcb->save.cpl)
> > +			goto reinject;
>
> Don't you need the page_address_valid() check here? Ooooh, no, because either
> emulate_svm_instr() will synthesize #VMEXIT, or svm_instr_handlers() will take
> care of the #GP. It's only CPL that needs to be checked early, because it has
> priority over the #VMEXIT.
Yeah, exactly my thought process.
>
> >  		return emulate_svm_instr(vcpu, opcode);
> > +	}
> >  
> >  	if (!enable_vmware_backdoor)
> >  		goto reinject;
> >
> > ---
> >
> > Sean, do you prefer that I send patches separately on top of this
> > series or a new version with these patches included?
>
> Go ahead and send an entirely new series. The less threads I have to chase down
> after I get back, the less likely I am to screw things up :-)
I will send one next week.
I might also add a patch at the end cleaning up all of this
svm_instr_opcode() and emulate_svm_instr() stuff. The code is
unnecessarily convoluted: we get the opcode in one place, then key off
of it in another.
I think it would be nicer with a single helper to handle SVM
instructions, and that would create a good spot to add a comment about
precedence ordering. Something like this:
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index a0dacbeaa3c5a..d5afcb179398b 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2235,54 +2235,42 @@ static int vmrun_interception(struct kvm_vcpu *vcpu)
 	return nested_svm_vmrun(vcpu);
 }
 
-enum {
-	NONE_SVM_INSTR,
-	SVM_INSTR_VMRUN,
-	SVM_INSTR_VMLOAD,
-	SVM_INSTR_VMSAVE,
-};
-
-/* Return NONE_SVM_INSTR if not SVM instrs, otherwise return decode result */
-static int svm_instr_opcode(struct kvm_vcpu *vcpu)
+static bool check_emulate_svm_instr(struct kvm_vcpu *vcpu, int *ret)
 {
+	struct vcpu_svm *svm = to_svm(vcpu);
 	struct x86_emulate_ctxt *ctxt = vcpu->arch.emulate_ctxt;
+	int exit_code;
 
 	if (ctxt->b != 0x1 || ctxt->opcode_len != 2)
-		return NONE_SVM_INSTR;
+		return false;
 
 	switch (ctxt->modrm) {
 	case 0xd8: /* VMRUN */
-		return SVM_INSTR_VMRUN;
+		exit_code = SVM_EXIT_VMRUN;
+		break;
 	case 0xda: /* VMLOAD */
-		return SVM_INSTR_VMLOAD;
+		exit_code = SVM_EXIT_VMLOAD;
+		break;
 	case 0xdb: /* VMSAVE */
-		return SVM_INSTR_VMSAVE;
-	default:
+		exit_code = SVM_EXIT_VMSAVE;
 		break;
+	default:
+		return false;
 	}
-	return NONE_SVM_INSTR;
-}
-
-static int emulate_svm_instr(struct kvm_vcpu *vcpu, int opcode)
-{
-	const int guest_mode_exit_codes[] = {
-		[SVM_INSTR_VMRUN] = SVM_EXIT_VMRUN,
-		[SVM_INSTR_VMLOAD] = SVM_EXIT_VMLOAD,
-		[SVM_INSTR_VMSAVE] = SVM_EXIT_VMSAVE,
-	};
-	int (*const svm_instr_handlers[])(struct kvm_vcpu *vcpu) = {
-		[SVM_INSTR_VMRUN] = vmrun_interception,
-		[SVM_INSTR_VMLOAD] = vmload_interception,
-		[SVM_INSTR_VMSAVE] = vmsave_interception,
-	};
-	struct vcpu_svm *svm = to_svm(vcpu);
+	/*
+	 * #GP due to CPL != 0 takes precedence over intercepts, but intercepts
+	 * take precedence over #GP due to invalid RAX (which is checked by the
+	 * exit handlers).
+	 */
+	*ret = 1;
+	if (svm->vmcb->save.cpl)
+		kvm_inject_gp(vcpu, 0);
+	else if (is_guest_mode(vcpu) && vmcb12_is_intercept(&svm->nested.ctl, exit_code))
+		nested_svm_simple_vmexit(svm, exit_code);
+	else
+		*ret = svm_invoke_exit_handler(vcpu, exit_code);
 
-	if (is_guest_mode(vcpu)) {
-		nested_svm_simple_vmexit(svm, guest_mode_exit_codes[opcode]);
-		return 1;
-	}
-	return svm_instr_handlers[opcode](vcpu);
+	return true;
 }
 
 /*
@@ -2297,7 +2285,7 @@ static int gp_interception(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 	u32 error_code = svm->vmcb->control.exit_info_1;
-	int opcode;
+	int r;
 
 	/* Both #GP cases have zero error_code */
 	if (error_code)
@@ -2307,9 +2295,8 @@ static int gp_interception(struct kvm_vcpu *vcpu)
 	if (x86_decode_emulated_instruction(vcpu, 0, NULL, 0) != EMULATION_OK)
 		goto reinject;
 
-	opcode = svm_instr_opcode(vcpu);
-	if (opcode != NONE_SVM_INSTR)
-		return emulate_svm_instr(vcpu, opcode);
+	if (check_emulate_svm_instr(vcpu, &r))
+		return r;
 
 	if (!enable_vmware_backdoor)
 		goto reinject;
---
The only thing I am unsure of is whether to check if it's an SVM
instruction in a separate helper to avoid the output parameter.
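For what it's worth, one way to avoid the output parameter is to keep a pure
decode helper that returns the exit code directly, with 0 meaning "not an SVM
instruction", and leave the precedence logic to the caller. A minimal userspace
sketch of that shape; the struct, field names, and exit-code values here are
stand-ins modeled on the thread, not the real KVM definitions:

```c
#include <stdbool.h>

/* Stand-ins for the SVM exit codes; values are illustrative only. */
enum {
	SVM_EXIT_NONE   = 0,
	SVM_EXIT_VMRUN  = 0x080,
	SVM_EXIT_VMLOAD = 0x07a,
	SVM_EXIT_VMSAVE = 0x07b,
};

/* Stand-in for the few x86_emulate_ctxt fields the decode looks at. */
struct insn_ctxt {
	int b;		/* primary opcode byte */
	int opcode_len;
	int modrm;
};

/*
 * Pure decode: map the decoded instruction to an SVM exit code, or
 * SVM_EXIT_NONE if it is not VMRUN/VMLOAD/VMSAVE.  No output parameter;
 * the caller keys off the return value.
 */
static int svm_instr_exit_code(const struct insn_ctxt *ctxt)
{
	if (ctxt->b != 0x1 || ctxt->opcode_len != 2)
		return SVM_EXIT_NONE;

	switch (ctxt->modrm) {
	case 0xd8: return SVM_EXIT_VMRUN;
	case 0xda: return SVM_EXIT_VMLOAD;
	case 0xdb: return SVM_EXIT_VMSAVE;
	default:   return SVM_EXIT_NONE;
	}
}
```

The #GP handler would then do something like "exit_code =
svm_instr_exit_code(ctxt); if (exit_code) return emulate_svm_instr(vcpu,
exit_code);", keeping the CPL/intercept/RAX precedence checks in a single
place without threading a *ret through the decode.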
2026-03-13 0:10 [PATCH v3 0/7] KVM: SVM: Fixes for VMCB12 checks and mapping Yosry Ahmed
2026-03-13 0:10 ` [PATCH v3 1/7] KVM: SVM: Drop RAX check for SVM instructions from the emulator Yosry Ahmed
2026-03-15 12:55 ` Paolo Bonzini
2026-03-16 13:49 ` Yosry Ahmed
2026-03-16 16:28 ` Yosry Ahmed
2026-03-17 13:15 ` Paolo Bonzini
2026-03-17 14:58 ` Jim Mattson
2026-03-18 15:55 ` Paolo Bonzini
2026-03-13 0:10 ` [PATCH v3 2/7] KVM: SVM: Check that RAX has legal GPA on #GP interception of SVM insns Yosry Ahmed
2026-03-13 0:10 ` [PATCH v3 3/7] KVM: SVM: Move RAX legality check to SVM insn interception handlers Yosry Ahmed
2026-03-13 18:17 ` Yosry Ahmed
2026-03-13 22:44 ` Sean Christopherson
2026-03-13 23:08 ` Yosry Ahmed [this message]
2026-03-16 15:25 ` Yosry Ahmed
2026-03-13 0:10 ` [PATCH v3 4/7] KVM: SVM: Treat mapping failures equally in VMLOAD/VMSAVE emulation Yosry Ahmed
2026-03-13 0:10 ` [PATCH v3 5/7] KVM: nSVM: Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails Yosry Ahmed
2026-03-13 0:10 ` [PATCH v3 6/7] KVM: selftests: Rework svm_nested_invalid_vmcb12_gpa Yosry Ahmed
2026-03-13 0:10 ` [PATCH v3 7/7] KVM: selftests: Drop 'invalid' from svm_nested_invalid_vmcb12_gpa's name Yosry Ahmed