linux-doc.vger.kernel.org archive mirror
From: Nicolas Saenz Julienne <nsaenz@amazon.com>
To: <linux-kernel@vger.kernel.org>, <kvm@vger.kernel.org>
Cc: <pbonzini@redhat.com>, <seanjc@google.com>, <vkuznets@redhat.com>,
	<linux-doc@vger.kernel.org>, <linux-hyperv@vger.kernel.org>,
	<linux-arch@vger.kernel.org>,
	<linux-trace-kernel@vger.kernel.org>, <graf@amazon.de>,
	<dwmw2@infradead.org>, <paul@amazon.com>, <nsaenz@amazon.com>,
	<mlevitsk@redhat.com>, <jgowans@amazon.com>, <corbet@lwn.net>,
	<decui@microsoft.com>, <tglx@linutronix.de>, <mingo@redhat.com>,
	<bp@alien8.de>, <dave.hansen@linux.intel.com>, <x86@kernel.org>,
	<amoorthy@google.com>
Subject: [PATCH 05/18] KVM: x86: hyper-v: Introduce MP_STATE_HV_INACTIVE_VTL
Date: Sun, 9 Jun 2024 15:49:34 +0000
Message-ID: <20240609154945.55332-6-nsaenz@amazon.com>
In-Reply-To: <20240609154945.55332-1-nsaenz@amazon.com>

Model inactive VTL vCPUs' behaviour with a new MP state.

Inactive VTLs are in an artificial halt state. They enter this state in
response to invoking HvCallVtlCall or HvCallVtlReturn. User-space, which
is VTL aware, can process the hypercall and set the vCPU to
MP_STATE_HV_INACTIVE_VTL. When a vCPU is run in this state it'll block
until a wakeup event is received. The rules of what constitutes an event
are analogous to halt's, except that inactive VTLs ignore RFLAGS.IF.

When a wakeup event is registered, KVM will exit to user-space with a
KVM_EXIT_SYSTEM_EVENT exit and a KVM_SYSTEM_EVENT_WAKEUP event type.
User-space is responsible for deciding whether the event has precedence
over the active VTL, and will switch the vCPU to KVM_MP_STATE_RUNNABLE
before resuming execution on it.

Running a KVM_MP_STATE_HV_INACTIVE_VTL vCPU with pending events will
return immediately to user-space.

Note that by re-using the readily available halt infrastructure in
KVM_RUN, MP_STATE_HV_INACTIVE_VTL correctly handles (or disables)
virtualisation features like the VMX preemption timer or APICv before
blocking.

Suggested-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Nicolas Saenz Julienne <nsaenz@amazon.com>

---

I do recall Sean mentioning that using MP states for this might have
unexpected side-effects, but that was in the context of introducing a
broader `HALTED_USERSPACE` style state. I believe that narrowing the MP
state's semantics down to the specifics of inactive VTLs cements this as
a VSM-only API and limits the ambiguity about the guest/vCPU's state upon
entering this execution mode. Alternatively, we could change RFLAGS.IF
in user-space before updating the MP state.
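
For illustration only, here is a minimal sketch of the user-space side of
the flow described in the commit message. It is not taken from any real
VMM: `struct vmm_vtl_vcpu`, `vmm_park_inactive_vtl()` and
`vmm_handle_wakeup()` are hypothetical names, and `has_precedence` stands
in for whatever policy the VMM uses to decide whether the event should
preempt the active VTL. The constants are copied locally so the sketch
builds without patched headers; a real VMM would apply `mp_state` through
KVM_SET_MP_STATE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Values mirrored from include/uapi/linux/kvm.h.
 * KVM_MP_STATE_HV_INACTIVE_VTL is the value added by this patch; the
 * others are existing uAPI values.
 */
#define KVM_EXIT_SYSTEM_EVENT		24
#define KVM_SYSTEM_EVENT_WAKEUP		4
#define KVM_MP_STATE_RUNNABLE		0
#define KVM_MP_STATE_HV_INACTIVE_VTL	11

/* Hypothetical VMM bookkeeping for one VTL's vCPU (not a KVM structure). */
struct vmm_vtl_vcpu {
	uint32_t mp_state;	/* value the VMM would pass to KVM_SET_MP_STATE */
};

/* After handling HvCallVtlCall/HvCallVtlReturn: park the VTL being left. */
static void vmm_park_inactive_vtl(struct vmm_vtl_vcpu *v)
{
	assert(v);
	v->mp_state = KVM_MP_STATE_HV_INACTIVE_VTL;
}

/*
 * Called after KVM_RUN returns on a parked vCPU. Returns true if the
 * vCPU was switched back to runnable, false if it stays parked.
 */
static bool vmm_handle_wakeup(struct vmm_vtl_vcpu *v, uint32_t exit_reason,
			      uint32_t event_type, bool has_precedence)
{
	assert(v);

	/* Only the wakeup system event is of interest here. */
	if (exit_reason != KVM_EXIT_SYSTEM_EVENT ||
	    event_type != KVM_SYSTEM_EVENT_WAKEUP)
		return false;

	/* Assumed VMM policy: the active VTL keeps running. */
	if (!has_precedence)
		return false;

	v->mp_state = KVM_MP_STATE_RUNNABLE;
	return true;
}
```

Note that, per the commit message, running a vCPU that is left parked
while events are pending returns to user-space immediately, so a VMM
would presumably not call KVM_RUN on it again until it decides to switch
VTLs.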

 Documentation/virt/kvm/api.rst | 19 +++++++++++++++++++
 arch/x86/kvm/hyperv.h          |  8 ++++++++
 arch/x86/kvm/svm/svm.c         |  7 ++++++-
 arch/x86/kvm/vmx/vmx.c         |  7 ++++++-
 arch/x86/kvm/x86.c             | 16 +++++++++++++++-
 include/uapi/linux/kvm.h       |  1 +
 6 files changed, 55 insertions(+), 3 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 17893b330b76f..e664c54a13b04 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -1517,6 +1517,8 @@ Possible values are:
                                  [s390]
    KVM_MP_STATE_SUSPENDED        the vcpu is in a suspend state and is waiting
                                  for a wakeup event [arm64]
+   KVM_MP_STATE_HV_INACTIVE_VTL  the vcpu is an inactive VTL and is waiting for
+                                 a wakeup event [x86]
    ==========================    ===============================================
 
 On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
@@ -1559,6 +1561,23 @@ KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not.
 On LoongArch, only the KVM_MP_STATE_RUNNABLE state is used to reflect
 whether the vcpu is runnable.
 
+For x86:
+^^^^^^^^
+
+KVM_MP_STATE_HV_INACTIVE_VTL is only available to a VM if Hyper-V's
+HV_ACCESS_VSM CPUID is exposed to the guest.  This processor state models the
+behavior of an inactive VTL and should only be used for this purpose. A
+userspace process should only switch a vCPU into this MP state in response to a
+HvCallVtlCall or HvCallVtlReturn hypercall.
+
+If a vCPU is in KVM_MP_STATE_HV_INACTIVE_VTL, KVM will emulate the
+architectural execution of a HLT instruction with the caveat that RFLAGS.IF is
+ignored when deciding whether to wake up (TLFS 12.12.2.1).  If a wakeup is
+recognized, KVM will exit to userspace with KVM_EXIT_SYSTEM_EVENT, where the
+event type is KVM_SYSTEM_EVENT_WAKEUP. Userspace has the responsibility to
+switch the vCPU back into KVM_MP_STATE_RUNNABLE state. Calling KVM_RUN on a
+KVM_MP_STATE_HV_INACTIVE_VTL vCPU with pending events will exit immediately.
+
 4.39 KVM_SET_MP_STATE
 ---------------------
 
diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
index d007d2203e0e4..d42fe3f85b002 100644
--- a/arch/x86/kvm/hyperv.h
+++ b/arch/x86/kvm/hyperv.h
@@ -271,6 +271,10 @@ static inline bool kvm_hv_cpuid_vsm_enabled(struct kvm_vcpu *vcpu)
 
 	return hv_vcpu && (hv_vcpu->cpuid_cache.features_ebx & HV_ACCESS_VSM);
 }
+static inline bool kvm_hv_vcpu_is_idle_vtl(struct kvm_vcpu *vcpu)
+{
+	return vcpu->arch.mp_state == KVM_MP_STATE_HV_INACTIVE_VTL;
+}
 #else /* CONFIG_KVM_HYPERV */
 static inline void kvm_hv_setup_tsc_page(struct kvm *kvm,
 					 struct pvclock_vcpu_time_info *hv_clock) {}
@@ -332,6 +336,10 @@ static inline bool kvm_hv_cpuid_vsm_enabled(struct kvm_vcpu *vcpu)
 {
 	return false;
 }
+static inline bool kvm_hv_vcpu_is_idle_vtl(struct kvm_vcpu *vcpu)
+{
+	return false;
+}
 #endif /* CONFIG_KVM_HYPERV */
 
 #endif /* __ARCH_X86_KVM_HYPERV_H__ */
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 296c524988f95..9671191fef4ea 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -49,6 +49,7 @@
 #include "svm.h"
 #include "svm_ops.h"
 
+#include "hyperv.h"
 #include "kvm_onhyperv.h"
 #include "svm_onhyperv.h"
 
@@ -3797,6 +3798,10 @@ bool svm_interrupt_blocked(struct kvm_vcpu *vcpu)
 	if (!gif_set(svm))
 		return true;
 
+	/*
+	 * The Hyper-V TLFS states that RFLAGS.IF is ignored when deciding
+	 * whether to block interrupts targeted at inactive VTLs.
+	 */
 	if (is_guest_mode(vcpu)) {
 		/* As long as interrupts are being delivered...  */
 		if ((svm->nested.ctl.int_ctl & V_INTR_MASKING_MASK)
@@ -3808,7 +3813,7 @@ bool svm_interrupt_blocked(struct kvm_vcpu *vcpu)
 		if (nested_exit_on_intr(svm))
 			return false;
 	} else {
-		if (!svm_get_if_flag(vcpu))
+		if (!svm_get_if_flag(vcpu) && !kvm_hv_vcpu_is_idle_vtl(vcpu))
 			return true;
 	}
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index b3c83c06f8265..ac0682fece604 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -5057,7 +5057,12 @@ bool vmx_interrupt_blocked(struct kvm_vcpu *vcpu)
 	if (is_guest_mode(vcpu) && nested_exit_on_intr(vcpu))
 		return false;
 
-	return !(vmx_get_rflags(vcpu) & X86_EFLAGS_IF) ||
+	/*
+	 * The Hyper-V TLFS states that RFLAGS.IF is ignored when deciding
+	 * whether to block interrupts targeted at inactive VTLs.
+	 */
+	return (!(vmx_get_rflags(vcpu) & X86_EFLAGS_IF) &&
+		!kvm_hv_vcpu_is_idle_vtl(vcpu)) ||
 	       (vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) &
 		(GUEST_INTR_STATE_STI | GUEST_INTR_STATE_MOV_SS));
 }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8c9e4281d978d..a6e2312ccb68f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -134,6 +134,7 @@ static int kvm_vcpu_do_singlestep(struct kvm_vcpu *vcpu);
 
 static int __set_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2);
 static void __get_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2);
+static inline bool kvm_vcpu_has_events(struct kvm_vcpu *vcpu);
 
 static DEFINE_MUTEX(vendor_module_lock);
 struct kvm_x86_ops kvm_x86_ops __read_mostly;
@@ -11176,7 +11177,8 @@ static inline int vcpu_block(struct kvm_vcpu *vcpu)
 			kvm_lapic_switch_to_sw_timer(vcpu);
 
 		kvm_vcpu_srcu_read_unlock(vcpu);
-		if (vcpu->arch.mp_state == KVM_MP_STATE_HALTED)
+		if (vcpu->arch.mp_state == KVM_MP_STATE_HALTED ||
+		    kvm_hv_vcpu_is_idle_vtl(vcpu))
 			kvm_vcpu_halt(vcpu);
 		else
 			kvm_vcpu_block(vcpu);
@@ -11218,6 +11220,7 @@ static inline int vcpu_block(struct kvm_vcpu *vcpu)
 		vcpu->arch.apf.halted = false;
 		break;
 	case KVM_MP_STATE_INIT_RECEIVED:
+	case KVM_MP_STATE_HV_INACTIVE_VTL:
 		break;
 	default:
 		WARN_ON_ONCE(1);
@@ -11264,6 +11267,13 @@ static int vcpu_run(struct kvm_vcpu *vcpu)
 		if (kvm_cpu_has_pending_timer(vcpu))
 			kvm_inject_pending_timer_irqs(vcpu);
 
+		if (kvm_hv_vcpu_is_idle_vtl(vcpu) && kvm_vcpu_has_events(vcpu)) {
+			r = 0;
+			vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
+			vcpu->run->system_event.type = KVM_SYSTEM_EVENT_WAKEUP;
+			break;
+		}
+
 		if (dm_request_for_irq_injection(vcpu) &&
 			kvm_vcpu_ready_for_interrupt_injection(vcpu)) {
 			r = 0;
@@ -11703,6 +11713,10 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
 			goto out;
 		break;
 
+	case KVM_MP_STATE_HV_INACTIVE_VTL:
+		if (is_guest_mode(vcpu) || !kvm_hv_cpuid_vsm_enabled(vcpu))
+			goto out;
+		break;
 	case KVM_MP_STATE_RUNNABLE:
 		break;
 
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index fbdee8d754595..f4864e6907e0b 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -564,6 +564,7 @@ struct kvm_vapic_addr {
 #define KVM_MP_STATE_LOAD              8
 #define KVM_MP_STATE_AP_RESET_HOLD     9
 #define KVM_MP_STATE_SUSPENDED         10
+#define KVM_MP_STATE_HV_INACTIVE_VTL   11
 
 struct kvm_mp_state {
 	__u32 mp_state;
-- 
2.40.1
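
As a review aid for the svm.c and vmx.c hunks above, the non-nested
interrupt-blocking decision can be modelled as a pure function. This is a
simplified stand-in, not kernel code: `model_interrupt_blocked()` and its
parameters are hypothetical, and the nested-guest and GIF checks are
deliberately left out. It follows the shape of the patched
vmx_interrupt_blocked() return expression.

```c
#include <stdbool.h>

/*
 * rflags_if:     guest RFLAGS.IF
 * idle_vtl:      vCPU is in KVM_MP_STATE_HV_INACTIVE_VTL
 * intr_shadow:   STI or MOV-SS interrupt shadow is active
 *
 * A cleared RFLAGS.IF only blocks interrupts when the vCPU is not an
 * inactive VTL; the interrupt shadow blocks them in either case.
 */
static bool model_interrupt_blocked(bool rflags_if, bool idle_vtl,
				    bool intr_shadow)
{
	return (!rflags_if && !idle_vtl) || intr_shadow;
}
```

With interrupts disabled and no shadow, the model reports "blocked" for a
regular vCPU and "not blocked" for an inactive VTL, which is the
behaviour the commit message attributes to the TLFS.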


Thread overview: 40+ messages
2024-06-09 15:49 [PATCH 00/18] Introducing Core Building Blocks for Hyper-V VSM Emulation Nicolas Saenz Julienne
2024-06-09 15:49 ` [PATCH 01/18] KVM: x86: hyper-v: Introduce XMM output support Nicolas Saenz Julienne
2024-07-08 14:59   ` Vitaly Kuznetsov
2024-07-17 14:12     ` Nicolas Saenz Julienne
2024-07-29 13:53       ` Vitaly Kuznetsov
2024-08-05 14:08         ` Nicolas Saenz Julienne
2024-06-09 15:49 ` [PATCH 02/18] KVM: x86: hyper-v: Introduce helpers to check if VSM is exposed to guest Nicolas Saenz Julienne
2024-06-09 15:49 ` [PATCH 03/18] hyperv-tlfs: Update struct hv_send_ipi{_ex}'s declarations Nicolas Saenz Julienne
2024-06-09 15:49 ` [PATCH 04/18] KVM: x86: hyper-v: Introduce VTL awareness to Hyper-V's PV-IPIs Nicolas Saenz Julienne
2024-09-13 18:02   ` Sean Christopherson
2024-09-16 14:52     ` Nicolas Saenz Julienne
2024-06-09 15:49 ` Nicolas Saenz Julienne [this message]
2024-09-13 19:01   ` [PATCH 05/18] KVM: x86: hyper-v: Introduce MP_STATE_HV_INACTIVE_VTL Sean Christopherson
2024-09-16 15:33     ` Nicolas Saenz Julienne
2024-09-18  7:56       ` Sean Christopherson
2024-06-09 15:49 ` [PATCH 06/18] KVM: x86: hyper-v: Exit on Get/SetVpRegisters hcall Nicolas Saenz Julienne
2024-06-09 15:49 ` [PATCH 07/18] KVM: x86: hyper-v: Exit on TranslateVirtualAddress hcall Nicolas Saenz Julienne
2024-06-09 15:49 ` [PATCH 08/18] KVM: x86: hyper-v: Exit on StartVirtualProcessor and GetVpIndexFromApicId hcalls Nicolas Saenz Julienne
2024-06-09 15:49 ` [PATCH 09/18] KVM: Define and communicate KVM_EXIT_MEMORY_FAULT RWX flags to userspace Nicolas Saenz Julienne
2024-06-09 15:49 ` [PATCH 10/18] KVM: x86: Keep track of instruction length during faults Nicolas Saenz Julienne
2024-09-13 19:10   ` Sean Christopherson
2024-06-09 15:49 ` [PATCH 11/18] KVM: x86: Pass the instruction length on memory fault user-space exits Nicolas Saenz Julienne
2024-09-13 19:11   ` Sean Christopherson
2024-09-16 15:53     ` Nicolas Saenz Julienne
2024-06-09 15:49 ` [PATCH 12/18] KVM: x86/mmu: Introduce infrastructure to handle non-executable mappings Nicolas Saenz Julienne
2024-06-09 15:49 ` [PATCH 13/18] KVM: x86/mmu: Avoid warning when installing non-private memory attributes Nicolas Saenz Julienne
2024-09-13 19:13   ` Sean Christopherson
2024-06-09 15:49 ` [PATCH 14/18] KVM: x86/mmu: Init memslot if memory attributes available Nicolas Saenz Julienne
2024-06-09 15:49 ` [PATCH 15/18] KVM: Introduce RWX memory attributes Nicolas Saenz Julienne
2024-06-09 15:49 ` [PATCH 16/18] KVM: x86: Take mem attributes into account when faulting memory Nicolas Saenz Julienne
2024-08-22 15:21   ` Nicolas Saenz Julienne
2024-08-22 16:58     ` Sean Christopherson
2024-09-13 18:26       ` Nicolas Saenz Julienne
2024-06-09 15:49 ` [PATCH 17/18] KVM: Introduce traces to track memory attributes modification Nicolas Saenz Julienne
2024-06-09 15:49 ` [PATCH 18/18] KVM: x86: hyper-v: Handle VSM hcalls in user-space Nicolas Saenz Julienne
2024-07-03  9:55 ` [PATCH 00/18] Introducing Core Building Blocks for Hyper-V VSM Emulation Nicolas Saenz Julienne
2024-07-03 12:48   ` Vitaly Kuznetsov
2024-07-03 13:18     ` Nicolas Saenz Julienne
2024-09-13 19:19 ` Sean Christopherson
2024-09-16 16:32   ` Nicolas Saenz Julienne
