All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Jon Kohler <jon@nutanix.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] KVM: X86: set EXITING_GUEST_MODE as soon as vCPU exits
Date: Thu, 1 Dec 2022 19:17:15 +0000	[thread overview]
Message-ID: <Y4j9u6YEpJ/px6kj@google.com> (raw)
In-Reply-To: <20221129182226.82087-1-jon@nutanix.com>

On Tue, Nov 29, 2022, Jon Kohler wrote:
> Set vcpu->mode to EXITING_GUEST_MODE as soon vCPU exits to reflect
> that we are indeed exiting guest mode, but not quite out of guest
> mode yet. Note: This is done lazily without an explicit memory
> barrier so that we do not regress the cost in the critical path
> of going from the exit to the exit handler.

This is not remotely sufficient justification.  Memory "barriers" are just compiler
barriers on x86, so the odds of regressing the VM-Enter/VM-Exit cost without
introducing a bug are miniscule.

> Flip back to IN_GUEST_MODE for exits that use
> EXIT_FASTPATH_REENTER_GUEST, such that we are IN_GUEST_MODE upon
> reentry.
> 
> Changing vcpu->mode away from IN_GUEST_MODE as early as possible

Except this isn't as early as possible.  If we're going to bother doing something
like this, my vote is to move it into assembly.

> gives IPI senders as much runway as possible to avoid ringing
> doorbell or sending posted interrupt IPI in AMD and Intel,
> respectively. Since this is done without an explicit memory
> barrier, the worst case is that the IPI sender sees IN_GUEST_MODE
> still and sends a spurious event, which is the behavior prior
> to this patch.

No, worst case scenario is that kvm_vcpu_exiting_guest_mode() sees EXITING_GUEST_MODE
and doesn't kick the vCPU.  For "kicks" that set a request, kvm_vcpu_exit_request()
will punt the vCPU out of the tight run loop, though there might be ordering issues.

But whether or not there are ordering issues is a moot point since there are uses
of kvm_vcpu_kick() that aren't accompanied by a request, e.g. to purge the PML
buffers.  In other words, kvm_vcpu_kick() absolutely cannot have false negatives.
We could modify KVM to require a request when using kvm_vcpu_kick(), but that's
a bit of a hack, and all of the possible ordering problems is still a pile of
complexity I'd rather avoid.

No small part of me thinks we'd be better off adding a dedicated flag to very
precisely track whether or not the vCPU is truly "in the guest" for the purposes
of sending IPIs.  Things like kicks have different requirements around IN_GUEST_MODE
than sending interrupts, e.g. KVM manually processes the IRR on every VM-Enter and
so lack of an IPI is a non-issue, whereas missing an IPI for a kick is problematic.
In other words, EXITING_GUEST_MODE really needs to mean "existing the run loop".

E.g. toggle the dedicated flag within a few instructions of VM-Enter and
VM-Exit for maximum efficiency for interrupts, and avoid having to make vcpu->mode
more complex than it already is.

To add clarity, we could even rename IN_GUEST_MODE and EXITING_GUEST_MODE to
something like IN_RUN_LOOP and EXITING_RUN_LOOP.

> Signed-off-by: Jon Kohler <jon@nutanix.com>
> ---
>  arch/x86/kvm/svm/svm.c |  7 +++++++
>  arch/x86/kvm/vmx/vmx.c | 23 +++++++++++++++++++++++
>  arch/x86/kvm/x86.c     |  8 ++++++++
>  3 files changed, 38 insertions(+)
> 
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index ce362e88a567..5f0c118a3ffd 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -3907,6 +3907,13 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_in
>  	else
>  		__svm_vcpu_run(svm, spec_ctrl_intercepted);
>  
> +	/* Optimize IPI reduction by setting mode immediately after vmexit

	/*
 	 * Because KVM isn't the crazy land of net/ block comments should like
	 * this. 
	 */

> +	 * without a memmory barrier as this as not paired anywhere.
> +	 * is will be set to OUTSIDE_GUEST_MODE in x86 common code with a memory
> +	 * barrier, after the host is done fully restoring various host states.
> +	 */
> +	vcpu->mode = EXITING_GUEST_MODE;
> +
>  	guest_state_exit_irqoff();
>  }
>  
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 63247c57c72c..243dcb87c727 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -5878,6 +5878,17 @@ static fastpath_t handle_fastpath_preemption_timer(struct kvm_vcpu *vcpu)
>  
>  	if (!vmx->req_immediate_exit &&
>  	    !unlikely(vmx->loaded_vmcs->hv_timer_soft_disabled)) {
> +		/* Reset IN_GUEST_MODE since we're going to reenter
> +		 * guest as part of this fast path. This is done as
> +		 * an optimization without a memory barrier since
> +		 * EXITING_GUEST_MODE is also set without a memory

Heh, justifying the lack of a memory barrier by saying pointing out the other
code you added doesn't use a memory barrier is interesting, to put it politely.

> +		 * barrier. This also needs to be reset prior to
> +		 * calling apic_timer_expired() so that
> +		 * kvm_use_posted_timer_interrupt() returns the proper
> +		 * value.
> +		 */
> +		if (vcpu->mode == EXITING_GUEST_MODE)
> +			vcpu->mode = IN_GUEST_MODE;

It's far easier, likely more performant, documents why this has a chance of working,
and significantly less error prone to do this unconditionally in either assembly
or after the EXIT_FASTPATH_REENTER_GUEST check in vcpu_enter_guest().

  parent reply	other threads:[~2022-12-01 19:17 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-29 18:22 [PATCH] KVM: X86: set EXITING_GUEST_MODE as soon as vCPU exits Jon Kohler
2022-11-29 19:47 ` Maxim Levitsky
2022-11-29 19:56   ` Jon Kohler
2022-12-01 15:45     ` Maxim Levitsky
2022-11-30  6:29 ` Chao Gao
2022-11-30 14:07   ` Jon Kohler
2022-12-01  4:55     ` Chao Gao
2022-12-01 14:57       ` Jon Kohler
2022-12-01 19:17 ` Sean Christopherson [this message]
2022-12-05 15:09   ` Jon Kohler
2022-12-07 20:18     ` Sean Christopherson
2022-12-09  8:06       ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y4j9u6YEpJ/px6kj@google.com \
    --to=seanjc@google.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=jon@nutanix.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.