All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Jon Kohler <jon@nutanix.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"x86@kernel.org" <x86@kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] KVM: X86: set EXITING_GUEST_MODE as soon as vCPU exits
Date: Wed, 7 Dec 2022 20:18:45 +0000	[thread overview]
Message-ID: <Y5D1JWutV7+nARxS@google.com> (raw)
In-Reply-To: <B9071742-7C64-40F4-8A93-D61DC1FD4CE5@nutanix.com>

On Mon, Dec 05, 2022, Jon Kohler wrote:
> 
> > On Dec 1, 2022, at 2:17 PM, Sean Christopherson <seanjc@google.com> wrote:
> >> Changing vcpu->mode away from IN_GUEST_MODE as early as possible
> > 
> > Except this isn't as early as possible.  If we're going to bother doing something
> > like this, my vote is to move it into assembly.
> 
> In vmenter.S, tacking on to call vmx_spec_ctrl_restore_host seemed like the
> most logical place after handling all of the state saves and RSB work. Are you
> saying put it even closer to the exit, meaning before the FILL_RETURN_BUFFER?

Yes, assuming that's safe, in which case it's truly the "as early as possible"
location.

> >> gives IPI senders as much runway as possible to avoid ringing
> >> doorbell or sending posted interrupt IPI in AMD and Intel,
> >> respectively. Since this is done without an explicit memory
> >> barrier, the worst case is that the IPI sender sees IN_GUEST_MODE
> >> still and sends a spurious event, which is the behavior prior
> >> to this patch.
> > 
> > No, worst case scenario is that kvm_vcpu_exiting_guest_mode() sees EXITING_GUEST_MODE
> > and doesn't kick the vCPU.  For "kicks" that set a request, kvm_vcpu_exit_request()
> > will punt the vCPU out of the tight run loop, though there might be ordering issues.
> > 
> > But whether or not there are ordering issues is a moot point since there are uses
> > of kvm_vcpu_kick() that aren't accompanied by a request, e.g. to purge the PML
> > buffers.  In other words, kvm_vcpu_kick() absolutely cannot have false negatives.
> > We could modify KVM to require a request when using kvm_vcpu_kick(), but that's
> > a bit of a hack, and all of the possible ordering problems is still a pile of
> > complexity I'd rather avoid.
> > 
> > No small part of me thinks we'd be better off adding a dedicated flag to very
> > precisely track whether or not the vCPU is truly "in the guest" for the purposes
> > of sending IPIs.  Things like kicks have different requirements around IN_GUEST_MODE
> > than sending interrupts, e.g. KVM manually processes the IRR on every VM-Enter and
> > so lack of an IPI is a non-issue, whereas missing an IPI for a kick is problematic.
> > In other words, EXITING_GUEST_MODE really needs to mean "existing the run loop".
> 
> Do you mean:
> “one small part” (as in give this a shot, maybe), 
> or 
> “no small part” (as in good-god-don’t-do-this!)

Neither.  "No small part" as in "Most of my brain", i.e. "I haven't completely
thought things through, but I think we'd be better off adding a dedicated flag".

> I’m assuming you meant one small part :) sure, how about something like:
> 
> To my earlier comment about when to do this within a few instructions, I don’t want
> to clobber other stuff happening as part of the enter/exit, what if we repurposed/renamed
> vmx_update_host_rsp and vmx_spec_ctrl_restore_host to make them “do stuff before
> entry” and “do stuff right after entry returns” functions. That way we wouldn’t have to
> add another other function calls or change the existing control flow all that much.

I'd prefer not to wrap vmx_update_host_rsp(), that thing is a very special
snowflake.

I don't see why we'd have to add function calls or change the existing control
flow anyways.  The asm flows for VMX and SVM both take the vCPU in the form of
@vmx and @svm, so accessing the proposed excution mode field is just a matter of
adding an entry in arch/x86/kvm/kvm-asm-offsets.c.

And now that kvm-asm-offsets.c exists, I think it makes sense to drop the @regs
parameter for __vmx_vcpu_run(), e.g. to roughly match __svm_vcpu_run().

With that done as prep, accessing the vCPU immediately before/after VM-Enter and
VM-Exit is easy.

As a rough, incomplete sketch for VMX:

diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
index 766c6b3ef5ed..f80553e34f26 100644
--- a/arch/x86/kvm/vmx/vmenter.S
+++ b/arch/x86/kvm/vmx/vmenter.S
@@ -102,7 +102,7 @@ SYM_FUNC_START(__vmx_vcpu_run)
         * an LFENCE to stop speculation from skipping the wrmsr.
         */
 
-       /* Load @regs to RAX. */
+       /* Load @vmx to RAX. */
        mov (%_ASM_SP), %_ASM_AX
 
        /* Check if vmlaunch or vmresume is needed */
@@ -125,7 +125,11 @@ SYM_FUNC_START(__vmx_vcpu_run)
        mov VCPU_R14(%_ASM_AX), %r14
        mov VCPU_R15(%_ASM_AX), %r15
 #endif
-       /* Load guest RAX.  This kills the @regs pointer! */
+
+       /* Mark the vCPU as executing in the guest! */
+       movb $IN_GUEST_MODE, VCPU_EXECUTION_MODE(_%ASM_AX)
+
+       /* Load guest RAX.  This kills the @vmx pointer! */
        mov VCPU_RAX(%_ASM_AX), %_ASM_AX
 
        /* Check EFLAGS.ZF from 'testb' above */
@@ -163,9 +167,11 @@ SYM_INNER_LABEL(vmx_vmexit, SYM_L_GLOBAL)
        /* Temporarily save guest's RAX. */
        push %_ASM_AX
 
-       /* Reload @regs to RAX. */
+       /* Reload @vmx to RAX. */
        mov WORD_SIZE(%_ASM_SP), %_ASM_AX
 
+       movb $IN_HOST_MODE, VCPU_EXECUTION_MODE(_%ASM_AX)
+
        /* Save all guest registers, including RAX from the stack */
        pop           VCPU_RAX(%_ASM_AX)
        mov %_ASM_CX, VCPU_RCX(%_ASM_AX)
@@ -189,9 +195,12 @@ SYM_INNER_LABEL(vmx_vmexit, SYM_L_GLOBAL)
        xor %ebx, %ebx
 
 .Lclear_regs:
-       /* Discard @regs.  The register is irrelevant, it just can't be RBX. */
+       /* Pop @vmx.  The register can be anything except RBX. */
        pop %_ASM_AX
 
+       /* Set the execution mode (again) in case of VM-Fail. */
+       movb $IN_HOST_MODE, VCPU_EXECUTION_MODE(_%ASM_AX)
+
        /*
         * Clear all general purpose registers except RSP and RBX to prevent
         * speculative use of the guest's values, even those that are reloaded


> In “do before” we could set something like vcpu->non_root_mode = true
> In “do after” we could set vcpu->non_root_mode = false
> 
> Or perhaps something like (respectively)
> vcpu->operational_state = NON_ROOT_MODE
> vcpu->operational_state = ROOT_MODE
>
> Using the root/non-root moniker would precisely track when the whether the 
> vCPU is in guest, and aligns with the language used to describe such a state
> from the SDM.
> 
> Thoughts?

No, we should use KVM-defined names, not VMX-defined names, because we'll want
the same logic for SVM.  If we first rename GUEST_MODE => RUN_LOOP, then we can
use IN_GUEST_MODE and IN_HOST_MODE or whatever.

  reply	other threads:[~2022-12-07 20:18 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-29 18:22 [PATCH] KVM: X86: set EXITING_GUEST_MODE as soon as vCPU exits Jon Kohler
2022-11-29 19:47 ` Maxim Levitsky
2022-11-29 19:56   ` Jon Kohler
2022-12-01 15:45     ` Maxim Levitsky
2022-11-30  6:29 ` Chao Gao
2022-11-30 14:07   ` Jon Kohler
2022-12-01  4:55     ` Chao Gao
2022-12-01 14:57       ` Jon Kohler
2022-12-01 19:17 ` Sean Christopherson
2022-12-05 15:09   ` Jon Kohler
2022-12-07 20:18     ` Sean Christopherson [this message]
2022-12-09  8:06       ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y5D1JWutV7+nARxS@google.com \
    --to=seanjc@google.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=jon@nutanix.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.