public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Yosry Ahmed <yosry@kernel.org>
To: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Jim Mattson <jmattson@google.com>,
	 kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 3/7] KVM: SVM: Move RAX legality check to SVM insn interception handlers
Date: Fri, 13 Mar 2026 23:08:52 +0000	[thread overview]
Message-ID: <abSXvZGuxCGJH0ID@google.com> (raw)
In-Reply-To: <abSTQifukogF5yEF@google.com>

> > > +       /*
> > > +        * VMware backdoor emulation on #GP interception only handles
> > > +        * IN{S}, OUT{S}, and RDPMC.
> > > +        */
> > > +       if (!is_guest_mode(vcpu))
> > > +               return kvm_emulate_instruction(vcpu, EMULTYPE_VMWARE_GP | EMULTYPE_NO_DECODE);
> > 
> > AI review pointed out that we should not drop the page_address_valid()
> > from here, because if an SVM instruction is executed by L2, and KVM
> > intercepts the #GP, it should re-inject the #GP into L2 if RAX is
> > illegal instead of synthesizing a #VMEXIT to L1. 
> 
> No, because the intercept has higher priority than the #GP due to bad RAX.

Which is literally what I say next :P

> 
> > My initial instincth is to keep the check here as well as in the intercept
> > handlers, but no, L1's intercept should take precedence over #GP due to
> > invalid RAX anyway. In fact, if L1 has the intercept set, then it must be set
> > in vmcb02, and KVM would get a #VMEXIT on the intercept not on #GP.
> 
> Except for the erratum case.

Yes.

> 
> > The actual problem is that the current code does not check if L1
> > actually sets the intercept in emulate_svm_instr(). 
> 
> Oh dagnabbit.  I had thought about this, multiple times, but wrote it off as a
> non-issue because if L1 wanted to intercept VMWHATEVER, KVM would set the intercept
> in vmcb02 and would get _that_ instead of a #GP. But the erratum case means that
> hardware could have signaled #GP even when the instruction should have been
> intercepted.

The problem is actually the other way around, it's when L1 does not want
to intercept it. So I think it's a problem regardless of the erratum.

> And I also forgot the KVM could be intercepting #GP for the VMware crud, which
> would unintentionally grab the CPL case too.  Darn kitchen sink #GPs.
> 
> > So if L1 and KVM do not set the intercept, and RAX is invalid, the current
> > code could synthesize a spurious #VMEXIT to L1 instead of reinjecting #GP.
> > The existing check on RAX prevents that, but it doesn't really fix the
> > problem because if we get #GP due to CPL != 0, we'll still generate a
> > spurious #VMEXIT to L1. What we really should be doing in gp_interception()
> > is:
> > 
> > 1. if CPL != 0, re-inject #GP.
> > 2. If in guest mode and L1 intercepts the instruction, synthesize a #VMEXIT.
> > 3. Otherwise emulate the instruction, which would take care of
> > re-injecting the #GP if RAX is invalid with this patch.
> > 
> > Something like this on top (over 2 patches):
> > 
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index cf5ebdc4b27bf..8942272eb80b2 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -2237,10 +2237,11 @@ static int emulate_svm_instr(struct kvm_vcpu
> > *vcpu, int opcode)
> >                 [SVM_INSTR_VMLOAD] = vmload_interception,
> >                 [SVM_INSTR_VMSAVE] = vmsave_interception,
> >         };
> > +       int exit_code = guest_mode_exit_codes[opcode];
> >         struct vcpu_svm *svm = to_svm(vcpu);
> > 
> > -       if (is_guest_mode(vcpu)) {
> > -               nested_svm_simple_vmexit(svm, guest_mode_exit_codes[opcode]);
> > +       if (is_guest_mode(vcpu) &&
> > vmcb12_is_intercept(&svm->nested.ctl, exit_code))
> > +               nested_svm_simple_vmexit(svm, exit_code);
> >                 return 1;
> >         }
> >         return svm_instr_handlers[opcode](vcpu);
> > @@ -2269,8 +2270,11 @@ static int gp_interception(struct kvm_vcpu *vcpu)
> >                 goto reinject;
> > 
> >         opcode = svm_instr_opcode(vcpu);
> > -       if (opcode != NONE_SVM_INSTR)
> > +       if (opcode != NONE_SVM_INSTR) {
> > +               if (svm->vmcb->save.cpl)
> > +                       goto reinject;
> 
> Don't you need the page_address_valid() check here?  Ooooh, no, because either
> emulate_svm_instr() will synthesize #VMEXIT, or svm_instr_handlers() will take
> care of the #GP.  It's only CPL that needs to be checked early, because it has
> priority over the #VMEXIT.

Yeah, exactly my thought process.

> 
> >                 return emulate_svm_instr(vcpu, opcode);
> > +       }
> > 
> >         if (!enable_vmware_backdoor)
> >                 goto reinject;
> > 
> > ---
> > 
> > Sean, do you prefer that I send patches separately on top of this
> > series or a new version with these patches included?
> 
> Go ahead and send an entirely new series.  The less threads I have to chase down
> after I get back, the less likely I am to screw things up :-)

I will send one next week.

I might also add a patch at the end up cleaning up all of this
svm_instr_opcode() and emulate_svm_instr() stuff. The code is
unnecessarily convoluted, we get the opcode in one place then key off of
it in another.

I think it would be nicer with a single helper to handle SVM
instructions, and that would create a good spot to add a comment about
precedence ordering. Something like this:

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index a0dacbeaa3c5a..d5afcb179398b 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2235,54 +2235,42 @@ static int vmrun_interception(struct kvm_vcpu *vcpu)
 	return nested_svm_vmrun(vcpu);
 }
 
-enum {
-	NONE_SVM_INSTR,
-	SVM_INSTR_VMRUN,
-	SVM_INSTR_VMLOAD,
-	SVM_INSTR_VMSAVE,
-};
-
-/* Return NONE_SVM_INSTR if not SVM instrs, otherwise return decode result */
-static int svm_instr_opcode(struct kvm_vcpu *vcpu)
+static bool check_emulate_svm_instr(struct kvm_vcpu *vcpu, int *ret)
 {
 	struct x86_emulate_ctxt *ctxt = vcpu->arch.emulate_ctxt;
+	int exit_code;
 
 	if (ctxt->b != 0x1 || ctxt->opcode_len != 2)
-		return NONE_SVM_INSTR;
+		return false;
 
 	switch (ctxt->modrm) {
 	case 0xd8: /* VMRUN */
-		return SVM_INSTR_VMRUN;
+		exit_code = SVM_EXIT_VMRUN;
+		break;
 	case 0xda: /* VMLOAD */
-		return SVM_INSTR_VMLOAD;
+		exit_code = SVM_EXIT_VMLOAD;
+		break;
 	case 0xdb: /* VMSAVE */
-		return SVM_INSTR_VMSAVE;
-	default:
+		exit_code = SVM_EXIT_VMSAVE;
 		break;
+	default:
+		return false;
 	}
 
-	return NONE_SVM_INSTR;
-}
-
-static int emulate_svm_instr(struct kvm_vcpu *vcpu, int opcode)
-{
-	const int guest_mode_exit_codes[] = {
-		[SVM_INSTR_VMRUN] = SVM_EXIT_VMRUN,
-		[SVM_INSTR_VMLOAD] = SVM_EXIT_VMLOAD,
-		[SVM_INSTR_VMSAVE] = SVM_EXIT_VMSAVE,
-	};
-	int (*const svm_instr_handlers[])(struct kvm_vcpu *vcpu) = {
-		[SVM_INSTR_VMRUN] = vmrun_interception,
-		[SVM_INSTR_VMLOAD] = vmload_interception,
-		[SVM_INSTR_VMSAVE] = vmsave_interception,
-	};
-	struct vcpu_svm *svm = to_svm(vcpu);
+	/*
+	 * #GP due to CPL != 0 takes precedence over intercepts, but intercepts
+	 * take precedence over #GP due to invalid RAX (which is checked by the
+	 * exit handlers).
+	 */
+	*ret = 1;
+	if (to_svm(vcpu)->vmcb->save.cpl)
+		kvm_inject_gp(vcpu, 0);
+	else if (is_guest_mode(vcpu) && vmcb12_is_intercept(&svm->nested.ctl, exit_code))
+		nested_svm_simple_vmexit(svm, exit_code);
+	else
+		*ret = svm_invoke_exit_handler(vcpu, exit_code);
 
-	if (is_guest_mode(vcpu)) {
-		nested_svm_simple_vmexit(svm, guest_mode_exit_codes[opcode]);
-		return 1;
-	}
-	return svm_instr_handlers[opcode](vcpu);
+	return true;
 }
 
 /*
@@ -2297,7 +2285,7 @@ static int gp_interception(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 	u32 error_code = svm->vmcb->control.exit_info_1;
-	int opcode;
+	int r;
 
 	/* Both #GP cases have zero error_code */
 	if (error_code)
@@ -2307,9 +2295,8 @@ static int gp_interception(struct kvm_vcpu *vcpu)
 	if (x86_decode_emulated_instruction(vcpu, 0, NULL, 0) != EMULATION_OK)
 		goto reinject;
 
-	opcode = svm_instr_opcode(vcpu);
-	if (opcode != NONE_SVM_INSTR)
-		return emulate_svm_instr(vcpu, opcode);
+	if (check_emulate_svm_instr(vcpu, &r))
+		return r;
 
 	if (!enable_vmware_backdoor)
 		goto reinject;

---

The only thing I am unsure of is whether to check if it's an SVM
instruction in a separate helper to avoid the output parameter.

  reply	other threads:[~2026-03-13 23:08 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-13  0:10 [PATCH v3 0/7] KVM: SVM: Fixes for VMCB12 checks and mapping Yosry Ahmed
2026-03-13  0:10 ` [PATCH v3 1/7] KVM: SVM: Drop RAX check for SVM instructions from the emulator Yosry Ahmed
2026-03-15 12:55   ` Paolo Bonzini
2026-03-16 13:49     ` Yosry Ahmed
2026-03-16 16:28       ` Yosry Ahmed
2026-03-17 13:15       ` Paolo Bonzini
2026-03-17 14:58         ` Jim Mattson
2026-03-18 15:55           ` Paolo Bonzini
2026-03-13  0:10 ` [PATCH v3 2/7] KVM: SVM: Check that RAX has legal GPA on #GP interception of SVM insns Yosry Ahmed
2026-03-13  0:10 ` [PATCH v3 3/7] KVM: SVM: Move RAX legality check to SVM insn interception handlers Yosry Ahmed
2026-03-13 18:17   ` Yosry Ahmed
2026-03-13 22:44     ` Sean Christopherson
2026-03-13 23:08       ` Yosry Ahmed [this message]
2026-03-16 15:25     ` Yosry Ahmed
2026-03-13  0:10 ` [PATCH v3 4/7] KVM: SVM: Treat mapping failures equally in VMLOAD/VMSAVE emulation Yosry Ahmed
2026-03-13  0:10 ` [PATCH v3 5/7] KVM: nSVM: Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails Yosry Ahmed
2026-03-13  0:10 ` [PATCH v3 6/7] KVM: selftests: Rework svm_nested_invalid_vmcb12_gpa Yosry Ahmed
2026-03-13  0:10 ` [PATCH v3 7/7] KVM: selftests: Drop 'invalid' from svm_nested_invalid_vmcb12_gpa's name Yosry Ahmed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=abSXvZGuxCGJH0ID@google.com \
    --to=yosry@kernel.org \
    --cc=jmattson@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox