All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yosry Ahmed <yosry@kernel.org>
To: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Jim Mattson <jmattson@google.com>,
	 kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 3/7] KVM: SVM: Move RAX legality check to SVM insn interception handlers
Date: Fri, 13 Mar 2026 23:08:52 +0000	[thread overview]
Message-ID: <abSXvZGuxCGJH0ID@google.com> (raw)
In-Reply-To: <abSTQifukogF5yEF@google.com>

> > > +       /*
> > > +        * VMware backdoor emulation on #GP interception only handles
> > > +        * IN{S}, OUT{S}, and RDPMC.
> > > +        */
> > > +       if (!is_guest_mode(vcpu))
> > > +               return kvm_emulate_instruction(vcpu, EMULTYPE_VMWARE_GP | EMULTYPE_NO_DECODE);
> > 
> > AI review pointed out that we should not drop the page_address_valid()
> > from here, because if an SVM instruction is executed by L2, and KVM
> > intercepts the #GP, it should re-inject the #GP into L2 if RAX is
> > illegal instead of synthesizing a #VMEXIT to L1. 
> 
> No, because the intercept has higher priority than the #GP due to bad RAX.

Which is literally what I say next :P

> 
> > My initial instincth is to keep the check here as well as in the intercept
> > handlers, but no, L1's intercept should take precedence over #GP due to
> > invalid RAX anyway. In fact, if L1 has the intercept set, then it must be set
> > in vmcb02, and KVM would get a #VMEXIT on the intercept not on #GP.
> 
> Except for the erratum case.

Yes.

> 
> > The actual problem is that the current code does not check if L1
> > actually sets the intercept in emulate_svm_instr(). 
> 
> Oh dagnabbit.  I had thought about this, multiple times, but wrote it off as a
> non-issue because if L1 wanted to intercept VMWHATEVER, KVM would set the intercept
> in vmcb02 and would get _that_ instead of a #GP. But the erratum case means that
> hardware could have signaled #GP even when the instruction should have been
> intercepted.

The problem is actually the other way around, it's when L1 does not want
to intercept it. So I think it's a problem regardless of the erratum.

> And I also forgot the KVM could be intercepting #GP for the VMware crud, which
> would unintentionally grab the CPL case too.  Darn kitchen sink #GPs.
> 
> > So if L1 and KVM do not set the intercept, and RAX is invalid, the current
> > code could synthesize a spurious #VMEXIT to L1 instead of reinjecting #GP.
> > The existing check on RAX prevents that, but it doesn't really fix the
> > problem because if we get #GP due to CPL != 0, we'll still generate a
> > spurious #VMEXIT to L1. What we really should be doing in gp_interception()
> > is:
> > 
> > 1. if CPL != 0, re-inject #GP.
> > 2. If in guest mode and L1 intercepts the instruction, synthesize a #VMEXIT.
> > 3. Otherwise emulate the instruction, which would take care of
> > re-injecting the #GP if RAX is invalid with this patch.
> > 
> > Something like this on top (over 2 patches):
> > 
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index cf5ebdc4b27bf..8942272eb80b2 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -2237,10 +2237,11 @@ static int emulate_svm_instr(struct kvm_vcpu
> > *vcpu, int opcode)
> >                 [SVM_INSTR_VMLOAD] = vmload_interception,
> >                 [SVM_INSTR_VMSAVE] = vmsave_interception,
> >         };
> > +       int exit_code = guest_mode_exit_codes[opcode];
> >         struct vcpu_svm *svm = to_svm(vcpu);
> > 
> > -       if (is_guest_mode(vcpu)) {
> > -               nested_svm_simple_vmexit(svm, guest_mode_exit_codes[opcode]);
> > +       if (is_guest_mode(vcpu) &&
> > vmcb12_is_intercept(&svm->nested.ctl, exit_code))
> > +               nested_svm_simple_vmexit(svm, exit_code);
> >                 return 1;
> >         }
> >         return svm_instr_handlers[opcode](vcpu);
> > @@ -2269,8 +2270,11 @@ static int gp_interception(struct kvm_vcpu *vcpu)
> >                 goto reinject;
> > 
> >         opcode = svm_instr_opcode(vcpu);
> > -       if (opcode != NONE_SVM_INSTR)
> > +       if (opcode != NONE_SVM_INSTR) {
> > +               if (svm->vmcb->save.cpl)
> > +                       goto reinject;
> 
> Don't you need the page_address_valid() check here?  Ooooh, no, because either
> emulate_svm_instr() will synthesize #VMEXIT, or svm_instr_handlers() will take
> care of the #GP.  It's only CPL that needs to be checked early, because it has
> priority over the #VMEXIT.

Yeah, exactly my thought process.

> 
> >                 return emulate_svm_instr(vcpu, opcode);
> > +       }
> > 
> >         if (!enable_vmware_backdoor)
> >                 goto reinject;
> > 
> > ---
> > 
> > Sean, do you prefer that I send patches separately on top of this
> > series or a new version with these patches included?
> 
> Go ahead and send an entirely new series.  The less threads I have to chase down
> after I get back, the less likely I am to screw things up :-)

I will send one next week.

I might also add a patch at the end up cleaning up all of this
svm_instr_opcode() and emulate_svm_instr() stuff. The code is
unnecessarily convoluted, we get the opcode in one place then key off of
it in another.

I think it would be nicer with a single helper to handle SVM
instructions, and that would create a good spot to add a comment about
precedence ordering. Something like this:

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index a0dacbeaa3c5a..d5afcb179398b 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2235,54 +2235,42 @@ static int vmrun_interception(struct kvm_vcpu *vcpu)
 	return nested_svm_vmrun(vcpu);
 }
 
-enum {
-	NONE_SVM_INSTR,
-	SVM_INSTR_VMRUN,
-	SVM_INSTR_VMLOAD,
-	SVM_INSTR_VMSAVE,
-};
-
-/* Return NONE_SVM_INSTR if not SVM instrs, otherwise return decode result */
-static int svm_instr_opcode(struct kvm_vcpu *vcpu)
+static bool check_emulate_svm_instr(struct kvm_vcpu *vcpu, int *ret)
 {
 	struct x86_emulate_ctxt *ctxt = vcpu->arch.emulate_ctxt;
+	int exit_code;
 
 	if (ctxt->b != 0x1 || ctxt->opcode_len != 2)
-		return NONE_SVM_INSTR;
+		return false;
 
 	switch (ctxt->modrm) {
 	case 0xd8: /* VMRUN */
-		return SVM_INSTR_VMRUN;
+		exit_code = SVM_EXIT_VMRUN;
+		break;
 	case 0xda: /* VMLOAD */
-		return SVM_INSTR_VMLOAD;
+		exit_code = SVM_EXIT_VMLOAD;
+		break;
 	case 0xdb: /* VMSAVE */
-		return SVM_INSTR_VMSAVE;
-	default:
+		exit_code = SVM_EXIT_VMSAVE;
 		break;
+	default:
+		return false;
 	}
 
-	return NONE_SVM_INSTR;
-}
-
-static int emulate_svm_instr(struct kvm_vcpu *vcpu, int opcode)
-{
-	const int guest_mode_exit_codes[] = {
-		[SVM_INSTR_VMRUN] = SVM_EXIT_VMRUN,
-		[SVM_INSTR_VMLOAD] = SVM_EXIT_VMLOAD,
-		[SVM_INSTR_VMSAVE] = SVM_EXIT_VMSAVE,
-	};
-	int (*const svm_instr_handlers[])(struct kvm_vcpu *vcpu) = {
-		[SVM_INSTR_VMRUN] = vmrun_interception,
-		[SVM_INSTR_VMLOAD] = vmload_interception,
-		[SVM_INSTR_VMSAVE] = vmsave_interception,
-	};
-	struct vcpu_svm *svm = to_svm(vcpu);
+	/*
+	 * #GP due to CPL != 0 takes precedence over intercepts, but intercepts
+	 * take precedence over #GP due to invalid RAX (which is checked by the
+	 * exit handlers).
+	 */
+	*ret = 1;
+	if (to_svm(vcpu)->vmcb->save.cpl)
+		kvm_inject_gp(vcpu, 0);
+	else if (is_guest_mode(vcpu) && vmcb12_is_intercept(&svm->nested.ctl, exit_code))
+		nested_svm_simple_vmexit(svm, exit_code);
+	else
+		*ret = svm_invoke_exit_handler(vcpu, exit_code);
 
-	if (is_guest_mode(vcpu)) {
-		nested_svm_simple_vmexit(svm, guest_mode_exit_codes[opcode]);
-		return 1;
-	}
-	return svm_instr_handlers[opcode](vcpu);
+	return true;
 }
 
 /*
@@ -2297,7 +2285,7 @@ static int gp_interception(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 	u32 error_code = svm->vmcb->control.exit_info_1;
-	int opcode;
+	int r;
 
 	/* Both #GP cases have zero error_code */
 	if (error_code)
@@ -2307,9 +2295,8 @@ static int gp_interception(struct kvm_vcpu *vcpu)
 	if (x86_decode_emulated_instruction(vcpu, 0, NULL, 0) != EMULATION_OK)
 		goto reinject;
 
-	opcode = svm_instr_opcode(vcpu);
-	if (opcode != NONE_SVM_INSTR)
-		return emulate_svm_instr(vcpu, opcode);
+	if (check_emulate_svm_instr(vcpu, &r))
+		return r;
 
 	if (!enable_vmware_backdoor)
 		goto reinject;

---

The only thing I am unsure of is whether to check if it's an SVM
instruction in a separate helper to avoid the output parameter.

  reply	other threads:[~2026-03-13 23:08 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-13  0:10 [PATCH v3 0/7] KVM: SVM: Fixes for VMCB12 checks and mapping Yosry Ahmed
2026-03-13  0:10 ` [PATCH v3 1/7] KVM: SVM: Drop RAX check for SVM instructions from the emulator Yosry Ahmed
2026-03-15 12:55   ` Paolo Bonzini
2026-03-16 13:49     ` Yosry Ahmed
2026-03-16 16:28       ` Yosry Ahmed
2026-03-17 13:15       ` Paolo Bonzini
2026-03-17 14:58         ` Jim Mattson
2026-03-18 15:55           ` Paolo Bonzini
2026-03-13  0:10 ` [PATCH v3 2/7] KVM: SVM: Check that RAX has legal GPA on #GP interception of SVM insns Yosry Ahmed
2026-03-13  0:10 ` [PATCH v3 3/7] KVM: SVM: Move RAX legality check to SVM insn interception handlers Yosry Ahmed
2026-03-13 18:17   ` Yosry Ahmed
2026-03-13 22:44     ` Sean Christopherson
2026-03-13 23:08       ` Yosry Ahmed [this message]
2026-03-16 15:25     ` Yosry Ahmed
2026-03-13  0:10 ` [PATCH v3 4/7] KVM: SVM: Treat mapping failures equally in VMLOAD/VMSAVE emulation Yosry Ahmed
2026-03-13  0:10 ` [PATCH v3 5/7] KVM: nSVM: Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails Yosry Ahmed
2026-03-13  0:10 ` [PATCH v3 6/7] KVM: selftests: Rework svm_nested_invalid_vmcb12_gpa Yosry Ahmed
2026-03-13  0:10 ` [PATCH v3 7/7] KVM: selftests: Drop 'invalid' from svm_nested_invalid_vmcb12_gpa's name Yosry Ahmed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=abSXvZGuxCGJH0ID@google.com \
    --to=yosry@kernel.org \
    --cc=jmattson@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.