public inbox for kvm@vger.kernel.org
* [PATCH v2 00/19] KVM hardware enable/disable reorganize
@ 2022-08-30 12:01 isaku.yamahata
  2022-08-30 12:01 ` [PATCH v2 01/19] KVM: x86: Drop kvm_user_return_msr_cpu_online() isaku.yamahata
                   ` (18 more replies)
  0 siblings, 19 replies; 30+ messages in thread
From: isaku.yamahata @ 2022-08-30 12:01 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: isaku.yamahata, isaku.yamahata, Paolo Bonzini,
	Sean Christopherson, Kai Huang, Chao Gao, Will Deacon

From: Isaku Yamahata <isaku.yamahata@intel.com>

This patch series implements the suggestion by Sean Christopherson [1]
to reorganize enabling/disabling of the CPU virtualization feature by
replacing the current arch-generic enable/disable logic with PM-related
hooks, and converts kvm/x86 to use the new hooks.

- Untangle the x86 hardware enable logic: snapshotting MSRs for the user
  return notifier, enabling CPU virtualization on CPU online and platform
  resume, and the actual enabling of the CPU virtualization feature.
- Introduce hooks related to PM.
- Convert kvm/x86 code to use those hooks.
- Split out hardware enabling/disabling logic into a separate file.  Compile
  it for non-x86 code.  Once conversion of other KVM archs is done, this file
  can be dropped.
- Delete cpus_hardware_enabled. 17/18 and 18/18

[1] https://lore.kernel.org/kvm/YvU+6fdkHaqQiKxp@google.com/

Changes from v1:
- Add a patch "KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section"
  so that the online/offline callbacks run in thread context and can use a
  mutex instead of a spin lock
- Fixes pointed out by Chao Gao

Chao Gao (3):
  KVM: x86: Move check_processor_compatibility from init ops to runtime
    ops
  Partially revert "KVM: Pass kvm_init()'s opaque param to additional
    arch funcs"
  KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section

Isaku Yamahata (16):
  KVM: x86: Drop kvm_user_return_msr_cpu_online()
  KVM: x86: Use this_cpu_ptr() instead of
    per_cpu_ptr(smp_processor_id())
  KVM: Drop kvm_count_lock and instead protect kvm_usage_count with
    kvm_lock
  KVM: Add arch hooks for PM events with empty stub
  KVM: x86: Move TSC fixup logic to KVM arch resume callback
  KVM: Add arch hook when VM is added/deleted
  KVM: Move out KVM arch PM hooks and hardware enable/disable logic
  KVM: kvm_arch.c: Remove _nolock post fix
  KVM: kvm_arch.c: Remove a global variable, hardware_enable_failed
  KVM: Do processor compatibility check on cpu online and resume
  KVM: x86: Duplicate arch callbacks related to pm events
  KVM: Eliminate kvm_arch_post_init_vm()
  KVM: x86: Delete kvm_arch_hardware_enable/disable()
  KVM: Add config to not compile kvm_arch.c
  RFC: KVM: x86: Remove cpus_hardware_enabled and related sanity check
  RFC: KVM: Remove cpus_hardware_enabled and related sanity check

 Documentation/virt/kvm/locking.rst |  14 +--
 arch/arm64/kvm/arm.c               |   2 +-
 arch/mips/kvm/mips.c               |   2 +-
 arch/powerpc/kvm/powerpc.c         |   2 +-
 arch/riscv/kvm/main.c              |   2 +-
 arch/s390/kvm/kvm-s390.c           |   2 +-
 arch/x86/include/asm/kvm-x86-ops.h |   1 +
 arch/x86/include/asm/kvm_host.h    |   2 +-
 arch/x86/kvm/Kconfig               |   1 +
 arch/x86/kvm/svm/svm.c             |   4 +-
 arch/x86/kvm/vmx/vmx.c             |  14 +--
 arch/x86/kvm/x86.c                 | 192 +++++++++++++++++++++-------
 include/linux/cpuhotplug.h         |   2 +-
 include/linux/kvm_host.h           |  14 ++-
 virt/kvm/Kconfig                   |   3 +
 virt/kvm/Makefile.kvm              |   3 +
 virt/kvm/kvm_arch.c                | 126 +++++++++++++++++++
 virt/kvm/kvm_main.c                | 193 +++++++++--------------------
 18 files changed, 373 insertions(+), 206 deletions(-)
 create mode 100644 virt/kvm/kvm_arch.c

-- 
2.25.1



* [PATCH v2 01/19] KVM: x86: Drop kvm_user_return_msr_cpu_online()
  2022-08-30 12:01 [PATCH v2 00/19] KVM hardware enable/disable reorganize isaku.yamahata
@ 2022-08-30 12:01 ` isaku.yamahata
  2022-09-01  5:29   ` Chao Gao
  2022-08-30 12:01 ` [PATCH v2 02/19] KVM: x86: Use this_cpu_ptr() instead of per_cpu_ptr(smp_processor_id()) isaku.yamahata
                   ` (17 subsequent siblings)
  18 siblings, 1 reply; 30+ messages in thread
From: isaku.yamahata @ 2022-08-30 12:01 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: isaku.yamahata, isaku.yamahata, Paolo Bonzini,
	Sean Christopherson, Kai Huang, Chao Gao, Will Deacon

From: Isaku Yamahata <isaku.yamahata@intel.com>

KVM/x86 uses a user return notifier to switch MSRs between guest and user
space values: snapshot host values on CPU online, change MSR values for the
guest, and restore them on returning to user space.  The current code abuses
kvm_arch_hardware_enable(), which is called on KVM module initialization or
CPU online, to take the snapshot.

Remove this abuse of kvm_arch_hardware_enable() by capturing the host
value on the first change of the MSR value for the guest VM instead of on
CPU online.
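
The lazy snapshot described above can be sketched in user space.  This is
only a toy model, not the kernel code: fake_msr[] stands in for
rdmsrl_safe()/wrmsrl_safe(), NR_SLOTS and all function names are
hypothetical, and the zero-initialized static array plays the role of the
__GFP_ZERO per-CPU allocation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_SLOTS 4	/* hypothetical; stands in for KVM_MAX_NR_USER_RETURN_MSRS */

/* Simulated hardware MSRs of one CPU (assumption: replaces rdmsrl/wrmsrl). */
static uint64_t fake_msr[NR_SLOTS];

struct uret_value {
	uint64_t host;
	uint64_t curr;
	bool initialized;
};

/* Zero-initialized, like the __GFP_ZERO per-CPU allocation in the patch. */
static struct uret_value values[NR_SLOTS];

/* Model of kvm_set_user_return_msr(): snapshot the host value on first use. */
static int set_user_return_msr(unsigned int slot, uint64_t value, uint64_t mask)
{
	struct uret_value *v = &values[slot];

	if (!v->initialized) {
		v->host = fake_msr[slot];
		v->curr = v->host;
		v->initialized = true;
	}

	/* Keep the host bits outside the mask, take the guest bits inside it. */
	value = (value & mask) | (v->host & ~mask);
	if (value == v->curr)
		return 0;

	fake_msr[slot] = value;
	v->curr = value;
	return 0;
}

/* Model of kvm_on_user_return(): restore every slot that diverged. */
static void on_user_return(void)
{
	for (unsigned int slot = 0; slot < NR_SLOTS; slot++) {
		struct uret_value *v = &values[slot];

		/* Never-touched slots have host == curr == 0 and are skipped. */
		if (v->host != v->curr) {
			fake_msr[slot] = v->host;
			v->curr = v->host;
		}
	}
}
```

A slot that was never set keeps host == curr == 0, so the restore loop
leaves its MSR alone without consulting the initialized flag.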

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
---
 arch/x86/kvm/x86.c | 43 ++++++++++++++++++++++++-------------------
 1 file changed, 24 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 205ebdc2b11b..16104a2f7d8e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -200,6 +200,7 @@ struct kvm_user_return_msrs {
 	struct kvm_user_return_msr_values {
 		u64 host;
 		u64 curr;
+		bool initialized;
 	} values[KVM_MAX_NR_USER_RETURN_MSRS];
 };
 
@@ -363,6 +364,10 @@ static void kvm_on_user_return(struct user_return_notifier *urn)
 	local_irq_restore(flags);
 	for (slot = 0; slot < kvm_nr_uret_msrs; ++slot) {
 		values = &msrs->values[slot];
+		/*
+		 * No need to check values->initialized because host = curr = 0
+		 * by __GFP_ZERO when !values->initialized.
+		 */
 		if (values->host != values->curr) {
 			wrmsrl(kvm_uret_msrs_list[slot], values->host);
 			values->curr = values->host;
@@ -409,34 +414,30 @@ int kvm_find_user_return_msr(u32 msr)
 }
 EXPORT_SYMBOL_GPL(kvm_find_user_return_msr);
 
-static void kvm_user_return_msr_cpu_online(void)
-{
-	unsigned int cpu = smp_processor_id();
-	struct kvm_user_return_msrs *msrs = per_cpu_ptr(user_return_msrs, cpu);
-	u64 value;
-	int i;
-
-	for (i = 0; i < kvm_nr_uret_msrs; ++i) {
-		rdmsrl_safe(kvm_uret_msrs_list[i], &value);
-		msrs->values[i].host = value;
-		msrs->values[i].curr = value;
-	}
-}
-
 int kvm_set_user_return_msr(unsigned slot, u64 value, u64 mask)
 {
 	unsigned int cpu = smp_processor_id();
 	struct kvm_user_return_msrs *msrs = per_cpu_ptr(user_return_msrs, cpu);
+	struct kvm_user_return_msr_values *values = &msrs->values[slot];
 	int err;
 
-	value = (value & mask) | (msrs->values[slot].host & ~mask);
-	if (value == msrs->values[slot].curr)
+	if (unlikely(!values->initialized)) {
+		u64 host_value;
+
+		rdmsrl_safe(kvm_uret_msrs_list[slot], &host_value);
+		values->host = host_value;
+		values->curr = host_value;
+		values->initialized = true;
+	}
+
+	value = (value & mask) | (values->host & ~mask);
+	if (value == values->curr)
 		return 0;
 	err = wrmsrl_safe(kvm_uret_msrs_list[slot], value);
 	if (err)
 		return 1;
 
-	msrs->values[slot].curr = value;
+	values->curr = value;
 	if (!msrs->registered) {
 		msrs->urn.on_user_return = kvm_on_user_return;
 		user_return_notifier_register(&msrs->urn);
@@ -9212,7 +9213,12 @@ int kvm_arch_init(void *opaque)
 		return -ENOMEM;
 	}
 
-	user_return_msrs = alloc_percpu(struct kvm_user_return_msrs);
+	/*
+	 * __GFP_ZERO to ensure user_return_msrs.values[].{host, curr} match.
+	 * See kvm_on_user_return()
+	 */
+	user_return_msrs = alloc_percpu_gfp(struct kvm_user_return_msrs,
+					    GFP_KERNEL | __GFP_ZERO);
 	if (!user_return_msrs) {
 		printk(KERN_ERR "kvm: failed to allocate percpu kvm_user_return_msrs\n");
 		r = -ENOMEM;
@@ -11836,7 +11842,6 @@ int kvm_arch_hardware_enable(void)
 	u64 max_tsc = 0;
 	bool stable, backwards_tsc = false;
 
-	kvm_user_return_msr_cpu_online();
 	ret = static_call(kvm_x86_hardware_enable)();
 	if (ret != 0)
 		return ret;
-- 
2.25.1



* [PATCH v2 02/19] KVM: x86: Use this_cpu_ptr() instead of per_cpu_ptr(smp_processor_id())
  2022-08-30 12:01 [PATCH v2 00/19] KVM hardware enable/disable reorganize isaku.yamahata
  2022-08-30 12:01 ` [PATCH v2 01/19] KVM: x86: Drop kvm_user_return_msr_cpu_online() isaku.yamahata
@ 2022-08-30 12:01 ` isaku.yamahata
  2022-09-01  5:56   ` Chao Gao
  2022-08-30 12:01 ` [PATCH v2 03/19] KVM: x86: Move check_processor_compatibility from init ops to runtime ops isaku.yamahata
                   ` (16 subsequent siblings)
  18 siblings, 1 reply; 30+ messages in thread
From: isaku.yamahata @ 2022-08-30 12:01 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: isaku.yamahata, isaku.yamahata, Paolo Bonzini,
	Sean Christopherson, Kai Huang, Chao Gao, Will Deacon

From: Isaku Yamahata <isaku.yamahata@intel.com>

Convert per_cpu_ptr(smp_processor_id()) to this_cpu_ptr() as a trivial
cleanup.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
---
 arch/x86/kvm/x86.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 16104a2f7d8e..7d5fff68befe 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -416,8 +416,7 @@ EXPORT_SYMBOL_GPL(kvm_find_user_return_msr);
 
 int kvm_set_user_return_msr(unsigned slot, u64 value, u64 mask)
 {
-	unsigned int cpu = smp_processor_id();
-	struct kvm_user_return_msrs *msrs = per_cpu_ptr(user_return_msrs, cpu);
+	struct kvm_user_return_msrs *msrs = this_cpu_ptr(user_return_msrs);
 	struct kvm_user_return_msr_values *values = &msrs->values[slot];
 	int err;
 
@@ -449,8 +448,7 @@ EXPORT_SYMBOL_GPL(kvm_set_user_return_msr);
 
 static void drop_user_return_notifiers(void)
 {
-	unsigned int cpu = smp_processor_id();
-	struct kvm_user_return_msrs *msrs = per_cpu_ptr(user_return_msrs, cpu);
+	struct kvm_user_return_msrs *msrs = this_cpu_ptr(user_return_msrs);
 
 	if (msrs->registered)
 		kvm_on_user_return(&msrs->urn);
-- 
2.25.1



* [PATCH v2 03/19] KVM: x86: Move check_processor_compatibility from init ops to runtime ops
  2022-08-30 12:01 [PATCH v2 00/19] KVM hardware enable/disable reorganize isaku.yamahata
  2022-08-30 12:01 ` [PATCH v2 01/19] KVM: x86: Drop kvm_user_return_msr_cpu_online() isaku.yamahata
  2022-08-30 12:01 ` [PATCH v2 02/19] KVM: x86: Use this_cpu_ptr() instead of per_cpu_ptr(smp_processor_id()) isaku.yamahata
@ 2022-08-30 12:01 ` isaku.yamahata
  2022-08-30 12:01 ` [PATCH v2 04/19] Partially revert "KVM: Pass kvm_init()'s opaque param to additional arch funcs" isaku.yamahata
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: isaku.yamahata @ 2022-08-30 12:01 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: isaku.yamahata, isaku.yamahata, Paolo Bonzini,
	Sean Christopherson, Kai Huang, Chao Gao, Will Deacon

From: Chao Gao <chao.gao@intel.com>

Move check_processor_compatibility from init ops to runtime ops so that
KVM can do compatibility checks on hotplugged CPUs. Drop __init from
check_processor_compatibility() and its callees.

Use a static_call() to invoke .check_processor_compatibility.

Opportunistically rename {svm,vmx}_check_processor_compat to conform
to the naming convention of fields of kvm_x86_ops.
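
As a rough user-space illustration of why the callback has to live in the
runtime ops: a pointer kept after initialization can still be invoked when
a CPU shows up later.  The struct and function names below are
hypothetical, and a plain function pointer stands in for the kernel's
static_call().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct kvm_x86_ops. */
struct runtime_ops {
	const char *name;
	int (*check_processor_compatibility)(void);
};

/* What the pretend vendor check reports; changed to simulate a bad CPU. */
static int vendor_check_result;

static int vendor_check_processor_compatibility(void)
{
	return vendor_check_result;
}

static const struct runtime_ops vendor_ops = {
	.name = "vendor",
	.check_processor_compatibility = vendor_check_processor_compatibility,
};

/* Set once at module init and kept afterwards, unlike init-only ops. */
static const struct runtime_ops *active_ops;

static void module_init_model(void)
{
	active_ops = &vendor_ops;
}

/* May run long after init, e.g. when a CPU is hotplugged. */
static int check_processor_compat_model(void)
{
	if (!active_ops)
		return -1;
	return active_ops->check_processor_compatibility();
}
```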

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220216031528.92558-2-chao.gao@intel.com
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
---
 arch/x86/include/asm/kvm-x86-ops.h |  1 +
 arch/x86/include/asm/kvm_host.h    |  2 +-
 arch/x86/kvm/svm/svm.c             |  4 ++--
 arch/x86/kvm/vmx/vmx.c             | 14 +++++++-------
 arch/x86/kvm/x86.c                 |  3 +--
 5 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index 51f777071584..3bc45932e2d1 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -129,6 +129,7 @@ KVM_X86_OP(msr_filter_changed)
 KVM_X86_OP(complete_emulated_msr)
 KVM_X86_OP(vcpu_deliver_sipi_vector)
 KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons);
+KVM_X86_OP(check_processor_compatibility)
 
 #undef KVM_X86_OP
 #undef KVM_X86_OP_OPTIONAL
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 2c96c43c313a..5df5d88d345f 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1445,6 +1445,7 @@ static inline u16 kvm_lapic_irq_dest_mode(bool dest_mode_logical)
 struct kvm_x86_ops {
 	const char *name;
 
+	int (*check_processor_compatibility)(void);
 	int (*hardware_enable)(void);
 	void (*hardware_disable)(void);
 	void (*hardware_unsetup)(void);
@@ -1655,7 +1656,6 @@ struct kvm_x86_nested_ops {
 struct kvm_x86_init_ops {
 	int (*cpu_has_kvm_support)(void);
 	int (*disabled_by_bios)(void);
-	int (*check_processor_compatibility)(void);
 	int (*hardware_setup)(void);
 	unsigned int (*handle_intel_pt_intr)(void);
 
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index f3813dbacb9f..371300f03f55 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4134,7 +4134,7 @@ svm_patch_hypercall(struct kvm_vcpu *vcpu, unsigned char *hypercall)
 	hypercall[2] = 0xd9;
 }
 
-static int __init svm_check_processor_compat(void)
+static int svm_check_processor_compatibility(void)
 {
 	return 0;
 }
@@ -4740,6 +4740,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
 	.name = "kvm_amd",
 
 	.hardware_unsetup = svm_hardware_unsetup,
+	.check_processor_compatibility = svm_check_processor_compatibility,
 	.hardware_enable = svm_hardware_enable,
 	.hardware_disable = svm_hardware_disable,
 	.has_emulated_msr = svm_has_emulated_msr,
@@ -5122,7 +5123,6 @@ static struct kvm_x86_init_ops svm_init_ops __initdata = {
 	.cpu_has_kvm_support = has_svm,
 	.disabled_by_bios = is_disabled,
 	.hardware_setup = svm_hardware_setup,
-	.check_processor_compatibility = svm_check_processor_compat,
 
 	.runtime_ops = &svm_x86_ops,
 	.pmu_ops = &amd_pmu_ops,
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index d7f8331d6f7e..3cf7f18a4115 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2495,8 +2495,8 @@ static bool cpu_has_sgx(void)
 	return cpuid_eax(0) >= 0x12 && (cpuid_eax(0x12) & BIT(0));
 }
 
-static __init int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt,
-				      u32 msr, u32 *result)
+static int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt,
+			       u32 msr, u32 *result)
 {
 	u32 vmx_msr_low, vmx_msr_high;
 	u32 ctl = ctl_min | ctl_opt;
@@ -2514,7 +2514,7 @@ static __init int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt,
 	return 0;
 }
 
-static __init u64 adjust_vmx_controls64(u64 ctl_opt, u32 msr)
+static u64 adjust_vmx_controls64(u64 ctl_opt, u32 msr)
 {
 	u64 allowed;
 
@@ -2523,8 +2523,8 @@ static __init u64 adjust_vmx_controls64(u64 ctl_opt, u32 msr)
 	return  ctl_opt & allowed;
 }
 
-static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
-				    struct vmx_capability *vmx_cap)
+static int setup_vmcs_config(struct vmcs_config *vmcs_conf,
+			     struct vmx_capability *vmx_cap)
 {
 	u32 vmx_msr_low, vmx_msr_high;
 	u32 min, opt, min2, opt2;
@@ -7417,7 +7417,7 @@ static int vmx_vm_init(struct kvm *kvm)
 	return 0;
 }
 
-static int __init vmx_check_processor_compat(void)
+static int vmx_check_processor_compatibility(void)
 {
 	struct vmcs_config vmcs_conf;
 	struct vmx_capability vmx_cap;
@@ -8015,6 +8015,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
 
 	.hardware_unsetup = vmx_hardware_unsetup,
 
+	.check_processor_compatibility = vmx_check_processor_compatibility,
 	.hardware_enable = vmx_hardware_enable,
 	.hardware_disable = vmx_hardware_disable,
 	.has_emulated_msr = vmx_has_emulated_msr,
@@ -8404,7 +8405,6 @@ static __init int hardware_setup(void)
 static struct kvm_x86_init_ops vmx_init_ops __initdata = {
 	.cpu_has_kvm_support = cpu_has_kvm_support,
 	.disabled_by_bios = vmx_disabled_by_bios,
-	.check_processor_compatibility = vmx_check_processor_compat,
 	.hardware_setup = hardware_setup,
 	.handle_intel_pt_intr = NULL,
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7d5fff68befe..985487fe0d63 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11997,7 +11997,6 @@ void kvm_arch_hardware_unsetup(void)
 int kvm_arch_check_processor_compat(void *opaque)
 {
 	struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
-	struct kvm_x86_init_ops *ops = opaque;
 
 	WARN_ON(!irqs_disabled());
 
@@ -12005,7 +12004,7 @@ int kvm_arch_check_processor_compat(void *opaque)
 	    __cr4_reserved_bits(cpu_has, &boot_cpu_data))
 		return -EIO;
 
-	return ops->check_processor_compatibility();
+	return static_call(kvm_x86_check_processor_compatibility)();
 }
 
 bool kvm_vcpu_is_reset_bsp(struct kvm_vcpu *vcpu)
-- 
2.25.1



* [PATCH v2 04/19] Partially revert "KVM: Pass kvm_init()'s opaque param to additional arch funcs"
  2022-08-30 12:01 [PATCH v2 00/19] KVM hardware enable/disable reorganize isaku.yamahata
                   ` (2 preceding siblings ...)
  2022-08-30 12:01 ` [PATCH v2 03/19] KVM: x86: Move check_processor_compatibility from init ops to runtime ops isaku.yamahata
@ 2022-08-30 12:01 ` isaku.yamahata
  2022-08-30 22:39   ` Huang, Kai
  2022-08-30 12:01 ` [PATCH v2 05/19] KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section isaku.yamahata
                   ` (14 subsequent siblings)
  18 siblings, 1 reply; 30+ messages in thread
From: isaku.yamahata @ 2022-08-30 12:01 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: isaku.yamahata, isaku.yamahata, Paolo Bonzini,
	Sean Christopherson, Kai Huang, Chao Gao, Will Deacon,
	Suzuki K Poulose, Anup Patel, Claudio Imbrenda

From: Chao Gao <chao.gao@intel.com>

This partially reverts commit b99040853738 ("KVM: Pass kvm_init()'s opaque
param to additional arch funcs"): remove opaque from
kvm_arch_check_processor_compat() because no one uses it now.
Address conflicts for ARM (due to file movement) and manually handle RISC-V,
which came after the commit.  The changes to kvm_arch_hardware_setup()
in the original commit are still needed, so they are not reverted.

The current implementation enables hardware (e.g. enables VMX on all CPUs)
and does arch-specific initialization on the first VM creation, and disables
hardware (in x86, disables VMX on all CPUs) on the last VM destruction.

To support TDX, hardware_enable_all() will be done during module loading
time.  As a result, CPU compatibility check will be opportunistically moved
to hardware_enable_nolock(), which doesn't take any argument.  Instead of
passing 'opaque' around to hardware_enable_nolock() and
hardware_enable_all(), just remove the unused 'opaque' argument from
kvm_arch_check_processor_compat().

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Anup Patel <anup@brainfault.org>
Acked-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20220216031528.92558-3-chao.gao@intel.com
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
---
 arch/arm64/kvm/arm.c       |  2 +-
 arch/mips/kvm/mips.c       |  2 +-
 arch/powerpc/kvm/powerpc.c |  2 +-
 arch/riscv/kvm/main.c      |  2 +-
 arch/s390/kvm/kvm-s390.c   |  2 +-
 arch/x86/kvm/x86.c         |  2 +-
 include/linux/kvm_host.h   |  2 +-
 virt/kvm/kvm_main.c        | 16 +++-------------
 8 files changed, 10 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 2ff0ef62abad..3385fb57c11a 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -68,7 +68,7 @@ int kvm_arch_hardware_setup(void *opaque)
 	return 0;
 }
 
-int kvm_arch_check_processor_compat(void *opaque)
+int kvm_arch_check_processor_compat(void)
 {
 	return 0;
 }
diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index a25e0b73ee70..092d09fb6a7e 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -140,7 +140,7 @@ int kvm_arch_hardware_setup(void *opaque)
 	return 0;
 }
 
-int kvm_arch_check_processor_compat(void *opaque)
+int kvm_arch_check_processor_compat(void)
 {
 	return 0;
 }
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index fb1490761c87..7b56d6ccfdfb 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -447,7 +447,7 @@ int kvm_arch_hardware_setup(void *opaque)
 	return 0;
 }
 
-int kvm_arch_check_processor_compat(void *opaque)
+int kvm_arch_check_processor_compat(void)
 {
 	return kvmppc_core_check_processor_compat();
 }
diff --git a/arch/riscv/kvm/main.c b/arch/riscv/kvm/main.c
index 1549205fe5fe..f8d6372d208f 100644
--- a/arch/riscv/kvm/main.c
+++ b/arch/riscv/kvm/main.c
@@ -20,7 +20,7 @@ long kvm_arch_dev_ioctl(struct file *filp,
 	return -EINVAL;
 }
 
-int kvm_arch_check_processor_compat(void *opaque)
+int kvm_arch_check_processor_compat(void)
 {
 	return 0;
 }
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index edfd4bbd0cba..e26d4dd85668 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -254,7 +254,7 @@ int kvm_arch_hardware_enable(void)
 	return 0;
 }
 
-int kvm_arch_check_processor_compat(void *opaque)
+int kvm_arch_check_processor_compat(void)
 {
 	return 0;
 }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 985487fe0d63..ca920b6b925d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11994,7 +11994,7 @@ void kvm_arch_hardware_unsetup(void)
 	static_call(kvm_x86_hardware_unsetup)();
 }
 
-int kvm_arch_check_processor_compat(void *opaque)
+int kvm_arch_check_processor_compat(void)
 {
 	struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
 
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index f4519d3689e1..eab352902de7 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1438,7 +1438,7 @@ int kvm_arch_hardware_enable(void);
 void kvm_arch_hardware_disable(void);
 int kvm_arch_hardware_setup(void *opaque);
 void kvm_arch_hardware_unsetup(void);
-int kvm_arch_check_processor_compat(void *opaque);
+int kvm_arch_check_processor_compat(void);
 int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu);
 bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu);
 int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 584a5bab3af3..4243a9541543 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5799,22 +5799,14 @@ void kvm_unregister_perf_callbacks(void)
 }
 #endif
 
-struct kvm_cpu_compat_check {
-	void *opaque;
-	int *ret;
-};
-
-static void check_processor_compat(void *data)
+static void check_processor_compat(void *rtn)
 {
-	struct kvm_cpu_compat_check *c = data;
-
-	*c->ret = kvm_arch_check_processor_compat(c->opaque);
+	*(int *)rtn = kvm_arch_check_processor_compat();
 }
 
 int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
 		  struct module *module)
 {
-	struct kvm_cpu_compat_check c;
 	int r;
 	int cpu;
 
@@ -5842,10 +5834,8 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
 	if (r < 0)
 		goto out_free_1;
 
-	c.ret = &r;
-	c.opaque = opaque;
 	for_each_online_cpu(cpu) {
-		smp_call_function_single(cpu, check_processor_compat, &c, 1);
+		smp_call_function_single(cpu, check_processor_compat, &r, 1);
 		if (r < 0)
 			goto out_free_2;
 	}
-- 
2.25.1



* [PATCH v2 05/19] KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section
  2022-08-30 12:01 [PATCH v2 00/19] KVM hardware enable/disable reorganize isaku.yamahata
                   ` (3 preceding siblings ...)
  2022-08-30 12:01 ` [PATCH v2 04/19] Partially revert "KVM: Pass kvm_init()'s opaque param to additional arch funcs" isaku.yamahata
@ 2022-08-30 12:01 ` isaku.yamahata
  2022-09-01  5:59   ` Chao Gao
  2022-09-01  6:18   ` Chao Gao
  2022-08-30 12:01 ` [PATCH v2 06/19] KVM: Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock isaku.yamahata
                   ` (13 subsequent siblings)
  18 siblings, 2 replies; 30+ messages in thread
From: isaku.yamahata @ 2022-08-30 12:01 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: isaku.yamahata, isaku.yamahata, Paolo Bonzini,
	Sean Christopherson, Kai Huang, Chao Gao, Will Deacon,
	Thomas Gleixner

From: Chao Gao <chao.gao@intel.com>

The CPU STARTING section doesn't allow callbacks to fail. Move KVM's
hotplug callback to the ONLINE section so that it can abort onlining a CPU
in certain cases to avoid potentially breaking VMs running on existing CPUs,
for example when KVM fails to enable hardware virtualization on the
hotplugged CPU.

Place KVM's hotplug state before CPUHP_AP_SCHED_WAIT_EMPTY as it ensures
when offlining a CPU, all user tasks and non-pinned kernel tasks have left
the CPU, i.e. there cannot be a vCPU task around. So, it is safe for KVM's
CPU offline callback to disable hardware virtualization at that point.
Likewise, KVM's online callback can enable hardware virtualization before
any vCPU task gets a chance to run on hotplugged CPUs.
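
The effect of a failing online callback can be sketched with a toy hotplug
sequence in user space.  This is an assumption-laden model, not the cpuhp
core: the step table, cpu_up_model() and the *_model callbacks are
hypothetical names, and the rollback simply tears down already completed
steps in reverse order.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

static bool hw_enabled;			/* virtualization state of the simulated CPU */
static bool hw_enable_should_fail;	/* knob to simulate an enable failure */
static int usage_count;			/* models kvm_usage_count */

/* Model of kvm_online_cpu(): fail the online step if enabling fails. */
static int kvm_online_model(void)
{
	if (!usage_count)
		return 0;
	if (hw_enable_should_fail)
		return -EIO;
	hw_enabled = true;
	return 0;
}

/* Model of kvm_offline_cpu(). */
static int kvm_offline_model(void)
{
	if (usage_count)
		hw_enabled = false;
	return 0;
}

/* An unrelated earlier step, used to observe the rollback. */
static int steps_run;
static int earlier_up(void) { steps_run++; return 0; }
static int earlier_down(void) { steps_run--; return 0; }

struct hp_step {
	int (*startup)(void);
	int (*teardown)(void);
};

static const struct hp_step online_steps[] = {
	{ earlier_up, earlier_down },
	{ kvm_online_model, kvm_offline_model },
};

#define NR_STEPS (sizeof(online_steps) / sizeof(online_steps[0]))

/* Run the steps in order; on failure undo the completed ones in reverse. */
static int cpu_up_model(void)
{
	for (size_t i = 0; i < NR_STEPS; i++) {
		int ret = online_steps[i].startup();

		if (ret) {
			while (i-- > 0)
				online_steps[i].teardown();
			return ret;
		}
	}
	return 0;
}
```

With usage_count set and the failure knob on, cpu_up_model() returns -EIO
and the earlier step is rolled back, which is the behaviour a STARTING
callback could not provide.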

KVM's CPU hotplug callbacks are renamed as well.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Link: https://lore.kernel.org/r/20220216031528.92558-6-chao.gao@intel.com
---
 include/linux/cpuhotplug.h |  2 +-
 virt/kvm/kvm_main.c        | 30 ++++++++++++++++++++++--------
 2 files changed, 23 insertions(+), 9 deletions(-)

diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index f61447913db9..7972bd63e0cb 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -185,7 +185,6 @@ enum cpuhp_state {
 	CPUHP_AP_CSKY_TIMER_STARTING,
 	CPUHP_AP_TI_GP_TIMER_STARTING,
 	CPUHP_AP_HYPERV_TIMER_STARTING,
-	CPUHP_AP_KVM_STARTING,
 	CPUHP_AP_KVM_ARM_VGIC_INIT_STARTING,
 	CPUHP_AP_KVM_ARM_VGIC_STARTING,
 	CPUHP_AP_KVM_ARM_TIMER_STARTING,
@@ -203,6 +202,7 @@ enum cpuhp_state {
 
 	/* Online section invoked on the hotplugged CPU from the hotplug thread */
 	CPUHP_AP_ONLINE_IDLE,
+	CPUHP_AP_KVM_ONLINE,
 	CPUHP_AP_SCHED_WAIT_EMPTY,
 	CPUHP_AP_SMPBOOT_THREADS,
 	CPUHP_AP_X86_VDSO_VMA_ONLINE,
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 4243a9541543..6ce6f27f2934 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5010,13 +5010,27 @@ static void hardware_enable_nolock(void *junk)
 	}
 }
 
-static int kvm_starting_cpu(unsigned int cpu)
+static int kvm_online_cpu(unsigned int cpu)
 {
+	int ret = 0;
+
 	raw_spin_lock(&kvm_count_lock);
-	if (kvm_usage_count)
+	/*
+	 * Abort the CPU online process if hardware virtualization cannot
+	 * be enabled. Otherwise running VMs would encounter unrecoverable
+	 * errors when scheduled to this CPU.
+	 */
+	if (kvm_usage_count) {
+		WARN_ON_ONCE(atomic_read(&hardware_enable_failed));
+
 		hardware_enable_nolock(NULL);
+		if (atomic_read(&hardware_enable_failed)) {
+			atomic_set(&hardware_enable_failed, 0);
+			ret = -EIO;
+		}
+	}
 	raw_spin_unlock(&kvm_count_lock);
-	return 0;
+	return ret;
 }
 
 static void hardware_disable_nolock(void *junk)
@@ -5029,7 +5043,7 @@ static void hardware_disable_nolock(void *junk)
 	kvm_arch_hardware_disable();
 }
 
-static int kvm_dying_cpu(unsigned int cpu)
+static int kvm_offline_cpu(unsigned int cpu)
 {
 	raw_spin_lock(&kvm_count_lock);
 	if (kvm_usage_count)
@@ -5840,8 +5854,8 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
 			goto out_free_2;
 	}
 
-	r = cpuhp_setup_state_nocalls(CPUHP_AP_KVM_STARTING, "kvm/cpu:starting",
-				      kvm_starting_cpu, kvm_dying_cpu);
+	r = cpuhp_setup_state_nocalls(CPUHP_AP_KVM_ONLINE, "kvm/cpu:online",
+				      kvm_online_cpu, kvm_offline_cpu);
 	if (r)
 		goto out_free_2;
 	register_reboot_notifier(&kvm_reboot_notifier);
@@ -5902,7 +5916,7 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
 	kmem_cache_destroy(kvm_vcpu_cache);
 out_free_3:
 	unregister_reboot_notifier(&kvm_reboot_notifier);
-	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_STARTING);
+	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_ONLINE);
 out_free_2:
 	kvm_arch_hardware_unsetup();
 out_free_1:
@@ -5928,7 +5942,7 @@ void kvm_exit(void)
 	kvm_async_pf_deinit();
 	unregister_syscore_ops(&kvm_syscore_ops);
 	unregister_reboot_notifier(&kvm_reboot_notifier);
-	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_STARTING);
+	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_ONLINE);
 	on_each_cpu(hardware_disable_nolock, NULL, 1);
 	kvm_arch_hardware_unsetup();
 	kvm_arch_exit();
-- 
2.25.1



* [PATCH v2 06/19] KVM: Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock
  2022-08-30 12:01 [PATCH v2 00/19] KVM hardware enable/disable reorganize isaku.yamahata
                   ` (4 preceding siblings ...)
  2022-08-30 12:01 ` [PATCH v2 05/19] KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section isaku.yamahata
@ 2022-08-30 12:01 ` isaku.yamahata
  2022-08-30 12:01 ` [PATCH v2 07/19] KVM: Add arch hooks for PM events with empty stub isaku.yamahata
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: isaku.yamahata @ 2022-08-30 12:01 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: isaku.yamahata, isaku.yamahata, Paolo Bonzini,
	Sean Christopherson, Kai Huang, Chao Gao, Will Deacon

From: Isaku Yamahata <isaku.yamahata@intel.com>

kvm_count_lock unnecessarily complicates the KVM locking convention.
Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock for
simplicity.

Opportunistically add some comments on locking.
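
The intended lock order (cpus lock => kvm_lock) and the first-user /
last-user behaviour of kvm_usage_count can be sketched as a
single-threaded toy.  The boolean "locks" and every *_model name are
hypothetical; they only check ordering with assert() and provide no real
mutual exclusion.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock state; the kernel uses cpus_read_lock() and mutex_lock(&kvm_lock). */
static bool cpus_lock_held;
static bool kvm_lock_held;

static int usage_count;		/* models kvm_usage_count */
static int enable_calls;	/* how often hardware was enabled on all CPUs */
static int disable_calls;	/* how often hardware was disabled on all CPUs */

static void cpus_read_lock_model(void)
{
	/* Order is cpus lock => kvm_lock, so kvm_lock must not be held here. */
	assert(!kvm_lock_held);
	cpus_lock_held = true;
}

static void cpus_read_unlock_model(void)
{
	cpus_lock_held = false;
}

static void kvm_lock_model(void)
{
	assert(!kvm_lock_held);
	kvm_lock_held = true;
}

static void kvm_unlock_model(void)
{
	kvm_lock_held = false;
}

/* Model of hardware_enable_all(): only the first user enables hardware. */
static void hardware_enable_all_model(void)
{
	cpus_read_lock_model();
	kvm_lock_model();
	if (++usage_count == 1)
		enable_calls++;
	kvm_unlock_model();
	cpus_read_unlock_model();
}

/* Model of hardware_disable_all(): only the last user disables hardware. */
static void hardware_disable_all_model(void)
{
	cpus_read_lock_model();
	kvm_lock_model();
	assert(usage_count > 0);
	if (--usage_count == 0)
		disable_calls++;
	kvm_unlock_model();
	cpus_read_unlock_model();
}
```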

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
---
 Documentation/virt/kvm/locking.rst | 14 +++++-------
 virt/kvm/kvm_main.c                | 36 +++++++++++++++++++++---------
 2 files changed, 30 insertions(+), 20 deletions(-)

diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
index 845a561629f1..8957e32aa724 100644
--- a/Documentation/virt/kvm/locking.rst
+++ b/Documentation/virt/kvm/locking.rst
@@ -216,15 +216,11 @@ time it will be set using the Dirty tracking mechanism described above.
 :Type:		mutex
 :Arch:		any
 :Protects:	- vm_list
-
-``kvm_count_lock``
-^^^^^^^^^^^^^^^^^^
-
-:Type:		raw_spinlock_t
-:Arch:		any
-:Protects:	- hardware virtualization enable/disable
-:Comment:	'raw' because hardware enabling/disabling must be atomic /wrt
-		migration.
+                - kvm_usage_count
+                - hardware virtualization enable/disable
+:Comment:	Use cpus_read_lock() for hardware virtualization enable/disable
+                because hardware enabling/disabling must be atomic /wrt
+                migration.  The lock order is cpus lock => kvm_lock.
 
 ``kvm->mn_invalidate_lock``
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 6ce6f27f2934..606ac6bb67d0 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -100,7 +100,6 @@ EXPORT_SYMBOL_GPL(halt_poll_ns_shrink);
  */
 
 DEFINE_MUTEX(kvm_lock);
-static DEFINE_RAW_SPINLOCK(kvm_count_lock);
 LIST_HEAD(vm_list);
 
 static cpumask_var_t cpus_hardware_enabled;
@@ -4996,6 +4995,8 @@ static void hardware_enable_nolock(void *junk)
 	int cpu = raw_smp_processor_id();
 	int r;
 
+	WARN_ON_ONCE(preemptible());
+
 	if (cpumask_test_cpu(cpu, cpus_hardware_enabled))
 		return;
 
@@ -5014,7 +5015,7 @@ static int kvm_online_cpu(unsigned int cpu)
 {
 	int ret = 0;
 
-	raw_spin_lock(&kvm_count_lock);
+	mutex_lock(&kvm_lock);
 	/*
 	 * Abort the CPU online process if hardware virtualization cannot
 	 * be enabled. Otherwise running VMs would encounter unrecoverable
@@ -5029,7 +5030,7 @@ static int kvm_online_cpu(unsigned int cpu)
 			ret = -EIO;
 		}
 	}
-	raw_spin_unlock(&kvm_count_lock);
+	mutex_unlock(&kvm_lock);
 	return ret;
 }
 
@@ -5037,6 +5038,8 @@ static void hardware_disable_nolock(void *junk)
 {
 	int cpu = raw_smp_processor_id();
 
+	WARN_ON_ONCE(preemptible());
+
 	if (!cpumask_test_cpu(cpu, cpus_hardware_enabled))
 		return;
 	cpumask_clear_cpu(cpu, cpus_hardware_enabled);
@@ -5045,10 +5048,10 @@ static void hardware_disable_nolock(void *junk)
 
 static int kvm_offline_cpu(unsigned int cpu)
 {
-	raw_spin_lock(&kvm_count_lock);
+	mutex_lock(&kvm_lock);
 	if (kvm_usage_count)
 		hardware_disable_nolock(NULL);
-	raw_spin_unlock(&kvm_count_lock);
+	mutex_unlock(&kvm_lock);
 	return 0;
 }
 
@@ -5063,16 +5066,19 @@ static void hardware_disable_all_nolock(void)
 
 static void hardware_disable_all(void)
 {
-	raw_spin_lock(&kvm_count_lock);
+	cpus_read_lock();
+	mutex_lock(&kvm_lock);
 	hardware_disable_all_nolock();
-	raw_spin_unlock(&kvm_count_lock);
+	mutex_unlock(&kvm_lock);
+	cpus_read_unlock();
 }
 
 static int hardware_enable_all(void)
 {
 	int r = 0;
 
-	raw_spin_lock(&kvm_count_lock);
+	cpus_read_lock();
+	mutex_lock(&kvm_lock);
 
 	kvm_usage_count++;
 	if (kvm_usage_count == 1) {
@@ -5085,7 +5091,8 @@ static int hardware_enable_all(void)
 		}
 	}
 
-	raw_spin_unlock(&kvm_count_lock);
+	mutex_unlock(&kvm_lock);
+	cpus_read_unlock();
 
 	return r;
 }
@@ -5691,15 +5698,22 @@ static void kvm_init_debug(void)
 
 static int kvm_suspend(void)
 {
-	if (kvm_usage_count)
+	/*
+	 * The caller ensures that CPU hotplug is disabled by
+	 * cpu_hotplug_disable() and other CPUs are offlined.  No need for
+	 * locking.
+	 */
+	if (kvm_usage_count) {
+		lockdep_assert_not_held(&kvm_lock);
 		hardware_disable_nolock(NULL);
+	}
 	return 0;
 }
 
 static void kvm_resume(void)
 {
 	if (kvm_usage_count) {
-		lockdep_assert_not_held(&kvm_count_lock);
+		lockdep_assert_not_held(&kvm_lock);
 		hardware_enable_nolock(NULL);
 	}
 }
-- 
2.25.1



* [PATCH v2 07/19] KVM: Add arch hooks for PM events with empty stub
  2022-08-30 12:01 [PATCH v2 00/19] KVM hardware enable/disable reorganize isaku.yamahata
                   ` (5 preceding siblings ...)
  2022-08-30 12:01 ` [PATCH v2 06/19] KVM: Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock isaku.yamahata
@ 2022-08-30 12:01 ` isaku.yamahata
  2022-08-30 12:01 ` [PATCH v2 08/19] KVM: x86: Move TSC fixup logic to KVM arch resume callback isaku.yamahata
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: isaku.yamahata @ 2022-08-30 12:01 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: isaku.yamahata, isaku.yamahata, Paolo Bonzini,
	Sean Christopherson, Kai Huang, Chao Gao, Will Deacon

From: Isaku Yamahata <isaku.yamahata@intel.com>

Add arch hooks for reboot, suspend, resume, and CPU-online/offline events
with empty stub functions.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
---
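Note for readers unfamiliar with the __weak pattern: the sketch below is a
userspace model, not kernel code.  It assumes GCC or Clang, and every demo_*
name is hypothetical.  It only illustrates how generic code can call a hook
unconditionally while an architecture that does nothing special falls back to
an empty default stub.

```c
#include <assert.h>

/* Userspace stand-in for the kernel's __weak annotation (GCC/Clang only). */
#define demo_weak __attribute__((weak))

/*
 * Default stub.  An architecture that provides a non-weak definition of the
 * same symbol in another object file replaces this one at link time.
 */
demo_weak int demo_arch_suspend(int usage_count)
{
	(void)usage_count;
	return 0;
}

/* Default stub for a hook that has no return value. */
demo_weak void demo_arch_resume(int usage_count)
{
	(void)usage_count;
}

/* Generic code calls the hooks without knowing who implements them. */
int demo_generic_suspend(int usage_count)
{
	return demo_arch_suspend(usage_count);
}

void demo_generic_resume(int usage_count)
{
	demo_arch_resume(usage_count);
}
```

With only the weak definitions linked in, demo_generic_suspend() returns 0,
which mirrors the empty stubs this patch adds.
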
 include/linux/kvm_host.h |  6 +++++
 virt/kvm/Makefile.kvm    |  2 +-
 virt/kvm/kvm_arch.c      | 44 ++++++++++++++++++++++++++++++++++
 virt/kvm/kvm_main.c      | 51 +++++++++++++++++++++++++---------------
 4 files changed, 83 insertions(+), 20 deletions(-)
 create mode 100644 virt/kvm/kvm_arch.c

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index eab352902de7..dd2a6d98d4de 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1448,6 +1448,12 @@ int kvm_arch_post_init_vm(struct kvm *kvm);
 void kvm_arch_pre_destroy_vm(struct kvm *kvm);
 int kvm_arch_create_vm_debugfs(struct kvm *kvm);
 
+int kvm_arch_suspend(int usage_count);
+void kvm_arch_resume(int usage_count);
+int kvm_arch_reboot(int val);
+int kvm_arch_online_cpu(unsigned int cpu, int usage_count);
+int kvm_arch_offline_cpu(unsigned int cpu, int usage_count);
+
 #ifndef __KVM_HAVE_ARCH_VM_ALLOC
 /*
  * All architectures that want to use vzalloc currently also
diff --git a/virt/kvm/Makefile.kvm b/virt/kvm/Makefile.kvm
index 2c27d5d0c367..c4210acabd35 100644
--- a/virt/kvm/Makefile.kvm
+++ b/virt/kvm/Makefile.kvm
@@ -5,7 +5,7 @@
 
 KVM ?= ../../../virt/kvm
 
-kvm-y := $(KVM)/kvm_main.o $(KVM)/eventfd.o $(KVM)/binary_stats.o
+kvm-y := $(KVM)/kvm_main.o $(KVM)/kvm_arch.o $(KVM)/eventfd.o $(KVM)/binary_stats.o
 kvm-$(CONFIG_KVM_VFIO) += $(KVM)/vfio.o
 kvm-$(CONFIG_KVM_MMIO) += $(KVM)/coalesced_mmio.o
 kvm-$(CONFIG_KVM_ASYNC_PF) += $(KVM)/async_pf.o
diff --git a/virt/kvm/kvm_arch.c b/virt/kvm/kvm_arch.c
new file mode 100644
index 000000000000..4748a76bcb03
--- /dev/null
+++ b/virt/kvm/kvm_arch.c
@@ -0,0 +1,44 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * kvm_arch.c: kvm default arch hooks for hardware enabling/disabling
+ * Copyright (c) 2022 Intel Corporation.
+ *
+ * Author:
+ *   Isaku Yamahata <isaku.yamahata@intel.com>
+ *                  <isaku.yamahata@gmail.com>
+ */
+
+#include <linux/kvm_host.h>
+
+/*
+ * Called after the VM is otherwise initialized, but just before adding it to
+ * the vm_list.
+ */
+__weak int kvm_arch_post_init_vm(struct kvm *kvm)
+{
+	return 0;
+}
+
+__weak int kvm_arch_online_cpu(unsigned int cpu, int usage_count)
+{
+	return 0;
+}
+
+__weak int kvm_arch_offline_cpu(unsigned int cpu, int usage_count)
+{
+	return 0;
+}
+
+__weak int kvm_arch_reboot(int val)
+{
+	return NOTIFY_OK;
+}
+
+__weak int kvm_arch_suspend(int usage_count)
+{
+	return 0;
+}
+
+__weak void kvm_arch_resume(int usage_count)
+{
+}
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 606ac6bb67d0..de336fba902b 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -144,6 +144,7 @@ static int kvm_no_compat_open(struct inode *inode, struct file *file)
 #endif
 static int hardware_enable_all(void);
 static void hardware_disable_all(void);
+static void hardware_disable_nolock(void *junk);
 
 static void kvm_io_bus_destroy(struct kvm_io_bus *bus);
 
@@ -1097,15 +1098,6 @@ static int kvm_create_vm_debugfs(struct kvm *kvm, const char *fdname)
 	return ret;
 }
 
-/*
- * Called after the VM is otherwise initialized, but just before adding it to
- * the vm_list.
- */
-int __weak kvm_arch_post_init_vm(struct kvm *kvm)
-{
-	return 0;
-}
-
 /*
  * Called just after removing the VM from the vm_list, but before doing any
  * other destruction.
@@ -5028,6 +5020,11 @@ static int kvm_online_cpu(unsigned int cpu)
 		if (atomic_read(&hardware_enable_failed)) {
 			atomic_set(&hardware_enable_failed, 0);
 			ret = -EIO;
+		} else {
+			ret = kvm_arch_online_cpu(cpu, kvm_usage_count);
+			if (ret) {
+				hardware_disable_nolock(NULL);
+			}
 		}
 	}
 	mutex_unlock(&kvm_lock);
@@ -5048,11 +5045,19 @@ static void hardware_disable_nolock(void *junk)
 
 static int kvm_offline_cpu(unsigned int cpu)
 {
+	int ret = 0;
+
 	mutex_lock(&kvm_lock);
-	if (kvm_usage_count)
+	if (kvm_usage_count) {
 		hardware_disable_nolock(NULL);
+		ret = kvm_arch_offline_cpu(cpu, kvm_usage_count);
+		if (ret) {
+			(void)hardware_enable_nolock(NULL);
+			atomic_set(&hardware_enable_failed, 0);
+		}
+	}
 	mutex_unlock(&kvm_lock);
-	return 0;
+	return ret;
 }
 
 static void hardware_disable_all_nolock(void)
@@ -5100,6 +5105,8 @@ static int hardware_enable_all(void)
 static int kvm_reboot(struct notifier_block *notifier, unsigned long val,
 		      void *v)
 {
+	int r;
+
 	/*
 	 * Some (well, at least mine) BIOSes hang on reboot if
 	 * in vmx root mode.
@@ -5108,8 +5115,15 @@ static int kvm_reboot(struct notifier_block *notifier, unsigned long val,
 	 */
 	pr_info("kvm: exiting hardware virtualization\n");
 	kvm_rebooting = true;
+
+	/* This hook is called without cpuhotplug disabled.  */
+	cpus_read_lock();
+	mutex_lock(&kvm_lock);
 	on_each_cpu(hardware_disable_nolock, NULL, 1);
-	return NOTIFY_OK;
+	r = kvm_arch_reboot(val);
+	mutex_unlock(&kvm_lock);
+	cpus_read_unlock();
+	return r;
 }
 
 static struct notifier_block kvm_reboot_notifier = {
@@ -5703,19 +5717,18 @@ static int kvm_suspend(void)
 	 * cpu_hotplug_disable() and other CPUs are offlined.  No need for
 	 * locking.
 	 */
-	if (kvm_usage_count) {
-		lockdep_assert_not_held(&kvm_lock);
+	lockdep_assert_not_held(&kvm_lock);
+	if (kvm_usage_count)
 		hardware_disable_nolock(NULL);
-	}
-	return 0;
+	return kvm_arch_suspend(kvm_usage_count);
 }
 
 static void kvm_resume(void)
 {
-	if (kvm_usage_count) {
-		lockdep_assert_not_held(&kvm_lock);
+	kvm_arch_resume(kvm_usage_count);
+	lockdep_assert_not_held(&kvm_lock);
+	if (kvm_usage_count)
 		hardware_enable_nolock(NULL);
-	}
 }
 
 static struct syscore_ops kvm_syscore_ops = {
-- 
2.25.1



* [PATCH v2 08/19] KVM: x86: Move TSC fixup logic to KVM arch resume callback
  2022-08-30 12:01 [PATCH v2 00/19] KVM hardware enable/disable reorganize isaku.yamahata
                   ` (6 preceding siblings ...)
  2022-08-30 12:01 ` [PATCH v2 07/19] KVM: Add arch hooks for PM events with empty stub isaku.yamahata
@ 2022-08-30 12:01 ` isaku.yamahata
  2022-08-30 12:01 ` [PATCH v2 09/19] KVM: Add arch hook when VM is added/deleted isaku.yamahata
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: isaku.yamahata @ 2022-08-30 12:01 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: isaku.yamahata, isaku.yamahata, Paolo Bonzini,
	Sean Christopherson, Kai Huang, Chao Gao, Will Deacon

From: Isaku Yamahata <isaku.yamahata@intel.com>

Commit 0dd6a6edb012 ("KVM: Dont mark TSC unstable due to S4 suspend") used
the kvm_arch_hardware_enable() callback to detect that the TSC went backward
due to S4 suspend.  That check is needed only when resuming from S4, not
every time virtualization hardware is enabled.  Move the logic to the
kvm_arch_resume() callback.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
---
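Note: the following is a simplified userspace sketch of the backward-TSC
detection that this patch moves, not the kernel implementation.  The demo_*
names are hypothetical, and the per-vCPU "last host TSC" array is an
assumption about the bookkeeping, which is not visible in the hunks below.
It shows the two points that matter here: the check is skipped when no VM
exists (usage_count == 0), and the correction is the distance between the
largest recorded TSC value and the TSC read on resume.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return how far the host TSC went backwards relative to the values recorded
 * before suspend, or 0 if it did not go backwards at all.
 */
uint64_t demo_tsc_backwards_delta(uint64_t local_tsc,
				  const uint64_t *last_host_tsc,
				  size_t nr_vcpus)
{
	uint64_t max_tsc = 0;
	int backwards_tsc = 0;
	size_t i;

	for (i = 0; i < nr_vcpus; i++) {
		if (last_host_tsc[i] > local_tsc) {
			backwards_tsc = 1;
			if (last_host_tsc[i] > max_tsc)
				max_tsc = last_host_tsc[i];
		}
	}
	return backwards_tsc ? max_tsc - local_tsc : 0;
}

/* Model of the resume hook: nothing to fix up when no VM is in use. */
uint64_t demo_arch_resume(int usage_count, uint64_t local_tsc,
			  const uint64_t *last_host_tsc, size_t nr_vcpus)
{
	if (!usage_count)
		return 0;
	return demo_tsc_backwards_delta(local_tsc, last_host_tsc, nr_vcpus);
}
```

In this model a caller would add the returned delta to the guest TSC offsets;
how the kernel applies the correction is outside the scope of the sketch.
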
 arch/x86/kvm/x86.c | 27 ++++++++++++++++-----------
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ca920b6b925d..0b112cd7de58 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11831,18 +11831,30 @@ void kvm_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
 EXPORT_SYMBOL_GPL(kvm_vcpu_deliver_sipi_vector);
 
 int kvm_arch_hardware_enable(void)
+{
+	return static_call(kvm_x86_hardware_enable)();
+}
+
+void kvm_arch_hardware_disable(void)
+{
+	static_call(kvm_x86_hardware_disable)();
+	drop_user_return_notifiers();
+}
+
+void kvm_arch_resume(int usage_count)
 {
 	struct kvm *kvm;
 	struct kvm_vcpu *vcpu;
 	unsigned long i;
-	int ret;
 	u64 local_tsc;
 	u64 max_tsc = 0;
 	bool stable, backwards_tsc = false;
 
-	ret = static_call(kvm_x86_hardware_enable)();
-	if (ret != 0)
-		return ret;
+	if (!usage_count)
+		return;
+
+	if (kvm_arch_hardware_enable())
+		return;
 
 	local_tsc = rdtsc();
 	stable = !kvm_check_tsc_unstable();
@@ -11917,13 +11929,6 @@ int kvm_arch_hardware_enable(void)
 		}
 
 	}
-	return 0;
-}
-
-void kvm_arch_hardware_disable(void)
-{
-	static_call(kvm_x86_hardware_disable)();
-	drop_user_return_notifiers();
 }
 
 static inline void kvm_ops_update(struct kvm_x86_init_ops *ops)
-- 
2.25.1



* [PATCH v2 09/19] KVM: Add arch hook when VM is added/deleted
  2022-08-30 12:01 [PATCH v2 00/19] KVM hardware enable/disable reorganize isaku.yamahata
                   ` (7 preceding siblings ...)
  2022-08-30 12:01 ` [PATCH v2 08/19] KVM: x86: Move TSC fixup logic to KVM arch resume callback isaku.yamahata
@ 2022-08-30 12:01 ` isaku.yamahata
  2022-08-30 12:01 ` [PATCH v2 10/19] KVM: Move out KVM arch PM hooks and hardware enable/disable logic isaku.yamahata
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: isaku.yamahata @ 2022-08-30 12:01 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: isaku.yamahata, isaku.yamahata, Paolo Bonzini,
	Sean Christopherson, Kai Huang, Chao Gao, Will Deacon

From: Isaku Yamahata <isaku.yamahata@intel.com>

Add arch hooks that are called when a VM is added or deleted, and pass
kvm_usage_count to them with kvm_lock held.  Move kvm_arch_post_init_vm()
under kvm_arch_add_vm().  kvm_arch_post_init_vm() will be deleted later,
once x86 overrides kvm_arch_add_vm().

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
---
 include/linux/kvm_host.h |  2 ++
 virt/kvm/kvm_arch.c      | 12 +++++++++++-
 virt/kvm/kvm_main.c      | 21 +++++++++++++++++----
 3 files changed, 30 insertions(+), 5 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index dd2a6d98d4de..f78364e01ca9 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1445,6 +1445,8 @@ int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu);
 bool kvm_arch_dy_runnable(struct kvm_vcpu *vcpu);
 bool kvm_arch_dy_has_pending_interrupt(struct kvm_vcpu *vcpu);
 int kvm_arch_post_init_vm(struct kvm *kvm);
+int kvm_arch_add_vm(struct kvm *kvm, int usage_count);
+int kvm_arch_del_vm(int usage_count);
 void kvm_arch_pre_destroy_vm(struct kvm *kvm);
 int kvm_arch_create_vm_debugfs(struct kvm *kvm);
 
diff --git a/virt/kvm/kvm_arch.c b/virt/kvm/kvm_arch.c
index 4748a76bcb03..0eac996f4981 100644
--- a/virt/kvm/kvm_arch.c
+++ b/virt/kvm/kvm_arch.c
@@ -10,11 +10,21 @@
 
 #include <linux/kvm_host.h>
 
+__weak int kvm_arch_post_init_vm(struct kvm *kvm)
+{
+	return 0;
+}
+
 /*
  * Called after the VM is otherwise initialized, but just before adding it to
  * the vm_list.
  */
-__weak int kvm_arch_post_init_vm(struct kvm *kvm)
+__weak int kvm_arch_add_vm(struct kvm *kvm, int usage_count)
+{
+	return kvm_arch_post_init_vm(kvm);
+}
+
+__weak int kvm_arch_del_vm(int usage_count)
 {
 	return 0;
 }
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index de336fba902b..5b9dc6d6ee28 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -145,6 +145,7 @@ static int kvm_no_compat_open(struct inode *inode, struct file *file)
 static int hardware_enable_all(void);
 static void hardware_disable_all(void);
 static void hardware_disable_nolock(void *junk);
+static void kvm_del_vm(void);
 
 static void kvm_io_bus_destroy(struct kvm_io_bus *bus);
 
@@ -1215,11 +1216,12 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname)
 	if (r)
 		goto out_err_no_debugfs;
 
-	r = kvm_arch_post_init_vm(kvm);
-	if (r)
-		goto out_err;
-
 	mutex_lock(&kvm_lock);
+	r = kvm_arch_add_vm(kvm, kvm_usage_count);
+	if (r) {
+		mutex_unlock(&kvm_lock);
+		goto out_err;
+	}
 	list_add(&kvm->vm_list, &vm_list);
 	mutex_unlock(&kvm_lock);
 
@@ -1239,6 +1241,7 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname)
 #endif
 out_err_no_mmu_notifier:
 	hardware_disable_all();
+	kvm_del_vm();
 out_err_no_disable:
 	kvm_arch_destroy_vm(kvm);
 out_err_no_arch_destroy_vm:
@@ -1319,6 +1322,7 @@ static void kvm_destroy_vm(struct kvm *kvm)
 	kvm_arch_free_vm(kvm);
 	preempt_notifier_dec();
 	hardware_disable_all();
+	kvm_del_vm();
 	mmdrop(mm);
 	module_put(kvm_chardev_ops.owner);
 }
@@ -5078,6 +5082,15 @@ static void hardware_disable_all(void)
 	cpus_read_unlock();
 }
 
+static void kvm_del_vm(void)
+{
+	cpus_read_lock();
+	mutex_lock(&kvm_lock);
+	kvm_arch_del_vm(kvm_usage_count);
+	mutex_unlock(&kvm_lock);
+	cpus_read_unlock();
+}
+
 static int hardware_enable_all(void)
 {
 	int r = 0;
-- 
2.25.1



* [PATCH v2 10/19] KVM: Move out KVM arch PM hooks and hardware enable/disable logic
  2022-08-30 12:01 [PATCH v2 00/19] KVM hardware enable/disable reorganize isaku.yamahata
                   ` (8 preceding siblings ...)
  2022-08-30 12:01 ` [PATCH v2 09/19] KVM: Add arch hook when VM is added/deleted isaku.yamahata
@ 2022-08-30 12:01 ` isaku.yamahata
  2022-08-30 12:01 ` [PATCH v2 11/19] KVM: kvm_arch.c: Remove _nolock post fix isaku.yamahata
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: isaku.yamahata @ 2022-08-30 12:01 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: isaku.yamahata, isaku.yamahata, Paolo Bonzini,
	Sean Christopherson, Kai Huang, Chao Gao, Will Deacon

From: Isaku Yamahata <isaku.yamahata@intel.com>

To make it clear that these hooks and the hardware enable/disable logic are
default implementations that KVM/x86 (and other KVM archs in the future)
will override, split them out into a single file.  Once the conversion of
all KVM archs is done, the file will be deleted.
kvm_arch_pre_hardware_unsetup() is introduced to avoid cross-arch code churn
for now.  Once things have settled down, kvm_arch_pre_hardware_unsetup() can
be merged into kvm_arch_hardware_unsetup() in each arch's code.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
---
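Note: a minimal userspace model of the usage counting after this patch, for
illustration only.  All demo_* names are hypothetical, locking is omitted,
and the rollback on failure is folded into demo_add_vm() (in the patch the
error path reaches kvm_del_vm() instead).  Generic code owns the counter; the
arch hook enables hardware only for the first VM and disables it only after
the last VM is gone.

```c
#include <assert.h>
#include <stdbool.h>

static int demo_usage_count;	/* models kvm_usage_count */
static bool demo_hw_enabled;	/* models "virtualization is enabled" */

/* Models the default kvm_arch_add_vm(): act only on the first VM. */
static int demo_arch_add_vm(int usage_count, bool fail)
{
	if (usage_count != 1)
		return 0;
	if (fail)
		return -1;
	demo_hw_enabled = true;
	return 0;
}

/* Models the default kvm_arch_del_vm(): act only once the count hits 0. */
static void demo_arch_del_vm(int usage_count)
{
	if (usage_count)
		return;
	demo_hw_enabled = false;
}

void demo_del_vm(void)
{
	assert(demo_usage_count > 0);
	demo_usage_count--;
	demo_arch_del_vm(demo_usage_count);
}

/* The "fail" flag is a test knob that simulates a failed hardware enable. */
int demo_add_vm(bool fail)
{
	int r;

	demo_usage_count++;
	r = demo_arch_add_vm(demo_usage_count, fail);
	if (r)
		demo_del_vm();	/* undo the increment on failure */
	return r;
}

bool demo_hw_is_enabled(void)
{
	return demo_hw_enabled;
}

int demo_get_usage_count(void)
{
	return demo_usage_count;
}
```

The counter is incremented before the hook runs, so the hook sees 1 for the
first VM, and decremented before the delete hook runs, so that hook sees 0
for the last VM.
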
 include/linux/kvm_host.h |   1 +
 virt/kvm/kvm_arch.c      |  91 +++++++++++++++++++++++-
 virt/kvm/kvm_main.c      | 145 ++++-----------------------------------
 3 files changed, 104 insertions(+), 133 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index f78364e01ca9..60f4ae9d6f48 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1437,6 +1437,7 @@ static inline void kvm_create_vcpu_debugfs(struct kvm_vcpu *vcpu) {}
 int kvm_arch_hardware_enable(void);
 void kvm_arch_hardware_disable(void);
 int kvm_arch_hardware_setup(void *opaque);
+void kvm_arch_pre_hardware_unsetup(void);
 void kvm_arch_hardware_unsetup(void);
 int kvm_arch_check_processor_compat(void);
 int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/kvm_arch.c b/virt/kvm/kvm_arch.c
index 0eac996f4981..51c6e9f03ed5 100644
--- a/virt/kvm/kvm_arch.c
+++ b/virt/kvm/kvm_arch.c
@@ -6,49 +6,136 @@
  * Author:
  *   Isaku Yamahata <isaku.yamahata@intel.com>
  *                  <isaku.yamahata@gmail.com>
+ *
+ * TODO: Delete this file once the conversion of all KVM arch is done.
  */
 
 #include <linux/kvm_host.h>
 
+static cpumask_t cpus_hardware_enabled = CPU_MASK_NONE;
+static atomic_t hardware_enable_failed;
+
 __weak int kvm_arch_post_init_vm(struct kvm *kvm)
 {
 	return 0;
 }
 
+static void hardware_enable_nolock(void *junk)
+{
+	int cpu = raw_smp_processor_id();
+	int r;
+
+	WARN_ON_ONCE(preemptible());
+
+	if (cpumask_test_cpu(cpu, &cpus_hardware_enabled))
+		return;
+
+	cpumask_set_cpu(cpu, &cpus_hardware_enabled);
+
+	r = kvm_arch_hardware_enable();
+
+	if (r) {
+		cpumask_clear_cpu(cpu, &cpus_hardware_enabled);
+		atomic_inc(&hardware_enable_failed);
+		pr_info("kvm: enabling virtualization on CPU%d failed\n", cpu);
+	}
+}
+
+static void hardware_disable_nolock(void *junk)
+{
+	int cpu = raw_smp_processor_id();
+
+	WARN_ON_ONCE(preemptible());
+
+	if (!cpumask_test_cpu(cpu, &cpus_hardware_enabled))
+		return;
+	cpumask_clear_cpu(cpu, &cpus_hardware_enabled);
+	kvm_arch_hardware_disable();
+}
+
+__weak void kvm_arch_pre_hardware_unsetup(void)
+{
+	on_each_cpu(hardware_disable_nolock, NULL, 1);
+}
+
 /*
  * Called after the VM is otherwise initialized, but just before adding it to
  * the vm_list.
  */
 __weak int kvm_arch_add_vm(struct kvm *kvm, int usage_count)
 {
-	return kvm_arch_post_init_vm(kvm);
+	int r = 0;
+
+	if (usage_count != 1)
+		return 0;
+
+	atomic_set(&hardware_enable_failed, 0);
+	on_each_cpu(hardware_enable_nolock, NULL, 1);
+
+	if (atomic_read(&hardware_enable_failed)) {
+		r = -EBUSY;
+		goto err;
+	}
+
+	r = kvm_arch_post_init_vm(kvm);
+err:
+	if (r)
+		on_each_cpu(hardware_disable_nolock, NULL, 1);
+	return r;
 }
 
 __weak int kvm_arch_del_vm(int usage_count)
 {
+	if (usage_count)
+		return 0;
+
+	on_each_cpu(hardware_disable_nolock, NULL, 1);
 	return 0;
 }
 
 __weak int kvm_arch_online_cpu(unsigned int cpu, int usage_count)
 {
-	return 0;
+	int ret = 0;
+
+	/*
+	 * Abort the CPU online process if hardware virtualization cannot
+	 * be enabled. Otherwise running VMs would encounter unrecoverable
+	 * errors when scheduled to this CPU.
+	 */
+	if (usage_count) {
+		WARN_ON_ONCE(atomic_read(&hardware_enable_failed));
+
+		hardware_enable_nolock(NULL);
+		if (atomic_read(&hardware_enable_failed)) {
+			atomic_set(&hardware_enable_failed, 0);
+			ret = -EIO;
+		}
+	}
+	return ret;
 }
 
 __weak int kvm_arch_offline_cpu(unsigned int cpu, int usage_count)
 {
+	if (usage_count)
+		hardware_disable_nolock(NULL);
 	return 0;
 }
 
 __weak int kvm_arch_reboot(int val)
 {
+	on_each_cpu(hardware_disable_nolock, NULL, 1);
 	return NOTIFY_OK;
 }
 
 __weak int kvm_arch_suspend(int usage_count)
 {
+	if (usage_count)
+		hardware_disable_nolock(NULL);
 	return 0;
 }
 
 __weak void kvm_arch_resume(int usage_count)
 {
+	if (usage_count)
+		hardware_enable_nolock(NULL);
 }
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 5b9dc6d6ee28..752edf9bc1c7 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -102,9 +102,7 @@ EXPORT_SYMBOL_GPL(halt_poll_ns_shrink);
 DEFINE_MUTEX(kvm_lock);
 LIST_HEAD(vm_list);
 
-static cpumask_var_t cpus_hardware_enabled;
 static int kvm_usage_count;
-static atomic_t hardware_enable_failed;
 
 static struct kmem_cache *kvm_vcpu_cache;
 
@@ -142,9 +140,6 @@ static int kvm_no_compat_open(struct inode *inode, struct file *file)
 #define KVM_COMPAT(c)	.compat_ioctl	= kvm_no_compat_ioctl,	\
 			.open		= kvm_no_compat_open
 #endif
-static int hardware_enable_all(void);
-static void hardware_disable_all(void);
-static void hardware_disable_nolock(void *junk);
 static void kvm_del_vm(void);
 
 static void kvm_io_bus_destroy(struct kvm_io_bus *bus);
@@ -1196,10 +1191,6 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname)
 	if (r)
 		goto out_err_no_arch_destroy_vm;
 
-	r = hardware_enable_all();
-	if (r)
-		goto out_err_no_disable;
-
 #ifdef CONFIG_HAVE_KVM_IRQFD
 	INIT_HLIST_HEAD(&kvm->irq_ack_notifier_list);
 #endif
@@ -1216,14 +1207,18 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname)
 	if (r)
 		goto out_err_no_debugfs;
 
+	cpus_read_lock();
 	mutex_lock(&kvm_lock);
+	kvm_usage_count++;
 	r = kvm_arch_add_vm(kvm, kvm_usage_count);
 	if (r) {
+		/* the following kvm_del_vm() decrements kvm_usage_count. */
 		mutex_unlock(&kvm_lock);
 		goto out_err;
 	}
 	list_add(&kvm->vm_list, &vm_list);
 	mutex_unlock(&kvm_lock);
+	cpus_read_unlock();
 
 	preempt_notifier_inc();
 	kvm_init_pm_notifier(kvm);
@@ -1240,9 +1235,7 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname)
 		mmu_notifier_unregister(&kvm->mmu_notifier, current->mm);
 #endif
 out_err_no_mmu_notifier:
-	hardware_disable_all();
 	kvm_del_vm();
-out_err_no_disable:
 	kvm_arch_destroy_vm(kvm);
 out_err_no_arch_destroy_vm:
 	WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count));
@@ -1321,7 +1314,6 @@ static void kvm_destroy_vm(struct kvm *kvm)
 	cleanup_srcu_struct(&kvm->srcu);
 	kvm_arch_free_vm(kvm);
 	preempt_notifier_dec();
-	hardware_disable_all();
 	kvm_del_vm();
 	mmdrop(mm);
 	module_put(kvm_chardev_ops.owner);
@@ -4986,135 +4978,37 @@ static struct miscdevice kvm_dev = {
 	&kvm_chardev_ops,
 };
 
-static void hardware_enable_nolock(void *junk)
-{
-	int cpu = raw_smp_processor_id();
-	int r;
-
-	WARN_ON_ONCE(preemptible());
-
-	if (cpumask_test_cpu(cpu, cpus_hardware_enabled))
-		return;
-
-	cpumask_set_cpu(cpu, cpus_hardware_enabled);
-
-	r = kvm_arch_hardware_enable();
-
-	if (r) {
-		cpumask_clear_cpu(cpu, cpus_hardware_enabled);
-		atomic_inc(&hardware_enable_failed);
-		pr_info("kvm: enabling virtualization on CPU%d failed\n", cpu);
-	}
-}
-
 static int kvm_online_cpu(unsigned int cpu)
 {
-	int ret = 0;
+	int ret;
 
 	mutex_lock(&kvm_lock);
-	/*
-	 * Abort the CPU online process if hardware virtualization cannot
-	 * be enabled. Otherwise running VMs would encounter unrecoverable
-	 * errors when scheduled to this CPU.
-	 */
-	if (kvm_usage_count) {
-		WARN_ON_ONCE(atomic_read(&hardware_enable_failed));
-
-		hardware_enable_nolock(NULL);
-		if (atomic_read(&hardware_enable_failed)) {
-			atomic_set(&hardware_enable_failed, 0);
-			ret = -EIO;
-		} else {
-			ret = kvm_arch_online_cpu(cpu, kvm_usage_count);
-			if (ret) {
-				hardware_disable_nolock(NULL);
-			}
-		}
-	}
+	ret = kvm_arch_online_cpu(cpu, kvm_usage_count);
 	mutex_unlock(&kvm_lock);
 	return ret;
 }
 
-static void hardware_disable_nolock(void *junk)
-{
-	int cpu = raw_smp_processor_id();
-
-	WARN_ON_ONCE(preemptible());
-
-	if (!cpumask_test_cpu(cpu, cpus_hardware_enabled))
-		return;
-	cpumask_clear_cpu(cpu, cpus_hardware_enabled);
-	kvm_arch_hardware_disable();
-}
-
 static int kvm_offline_cpu(unsigned int cpu)
 {
-	int ret = 0;
+	int ret;
 
 	mutex_lock(&kvm_lock);
-	if (kvm_usage_count) {
-		hardware_disable_nolock(NULL);
-		ret = kvm_arch_offline_cpu(cpu, kvm_usage_count);
-		if (ret) {
-			(void)hardware_enable_nolock(NULL);
-			atomic_set(&hardware_enable_failed, 0);
-		}
-	}
+	ret = kvm_arch_offline_cpu(cpu, kvm_usage_count);
 	mutex_unlock(&kvm_lock);
 	return ret;
 }
 
-static void hardware_disable_all_nolock(void)
-{
-	BUG_ON(!kvm_usage_count);
-
-	kvm_usage_count--;
-	if (!kvm_usage_count)
-		on_each_cpu(hardware_disable_nolock, NULL, 1);
-}
-
-static void hardware_disable_all(void)
-{
-	cpus_read_lock();
-	mutex_lock(&kvm_lock);
-	hardware_disable_all_nolock();
-	mutex_unlock(&kvm_lock);
-	cpus_read_unlock();
-}
-
 static void kvm_del_vm(void)
 {
 	cpus_read_lock();
 	mutex_lock(&kvm_lock);
+	WARN_ON_ONCE(!kvm_usage_count);
+	kvm_usage_count--;
 	kvm_arch_del_vm(kvm_usage_count);
 	mutex_unlock(&kvm_lock);
 	cpus_read_unlock();
 }
 
-static int hardware_enable_all(void)
-{
-	int r = 0;
-
-	cpus_read_lock();
-	mutex_lock(&kvm_lock);
-
-	kvm_usage_count++;
-	if (kvm_usage_count == 1) {
-		atomic_set(&hardware_enable_failed, 0);
-		on_each_cpu(hardware_enable_nolock, NULL, 1);
-
-		if (atomic_read(&hardware_enable_failed)) {
-			hardware_disable_all_nolock();
-			r = -EBUSY;
-		}
-	}
-
-	mutex_unlock(&kvm_lock);
-	cpus_read_unlock();
-
-	return r;
-}
-
 static int kvm_reboot(struct notifier_block *notifier, unsigned long val,
 		      void *v)
 {
@@ -5132,7 +5026,6 @@ static int kvm_reboot(struct notifier_block *notifier, unsigned long val,
 	/* This hook is called without cpuhotplug disabled.  */
 	cpus_read_lock();
 	mutex_lock(&kvm_lock);
-	on_each_cpu(hardware_disable_nolock, NULL, 1);
 	r = kvm_arch_reboot(val);
 	mutex_unlock(&kvm_lock);
 	cpus_read_unlock();
@@ -5731,17 +5624,13 @@ static int kvm_suspend(void)
 	 * locking.
 	 */
 	lockdep_assert_not_held(&kvm_lock);
-	if (kvm_usage_count)
-		hardware_disable_nolock(NULL);
 	return kvm_arch_suspend(kvm_usage_count);
 }
 
 static void kvm_resume(void)
 {
-	kvm_arch_resume(kvm_usage_count);
 	lockdep_assert_not_held(&kvm_lock);
-	if (kvm_usage_count)
-		hardware_enable_nolock(NULL);
+	kvm_arch_resume(kvm_usage_count);
 }
 
 static struct syscore_ops kvm_syscore_ops = {
@@ -5879,11 +5768,6 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
 	if (r)
 		goto out_irqfd;
 
-	if (!zalloc_cpumask_var(&cpus_hardware_enabled, GFP_KERNEL)) {
-		r = -ENOMEM;
-		goto out_free_0;
-	}
-
 	r = kvm_arch_hardware_setup(opaque);
 	if (r < 0)
 		goto out_free_1;
@@ -5960,8 +5844,6 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
 out_free_2:
 	kvm_arch_hardware_unsetup();
 out_free_1:
-	free_cpumask_var(cpus_hardware_enabled);
-out_free_0:
 	kvm_irqfd_exit();
 out_irqfd:
 	kvm_arch_exit();
@@ -5983,11 +5865,12 @@ void kvm_exit(void)
 	unregister_syscore_ops(&kvm_syscore_ops);
 	unregister_reboot_notifier(&kvm_reboot_notifier);
 	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_ONLINE);
-	on_each_cpu(hardware_disable_nolock, NULL, 1);
+	cpus_read_lock();
+	kvm_arch_pre_hardware_unsetup();
 	kvm_arch_hardware_unsetup();
+	cpus_read_unlock();
 	kvm_arch_exit();
 	kvm_irqfd_exit();
-	free_cpumask_var(cpus_hardware_enabled);
 	kvm_vfio_ops_exit();
 }
 EXPORT_SYMBOL_GPL(kvm_exit);
-- 
2.25.1



* [PATCH v2 11/19] KVM: kvm_arch.c: Remove _nolock post fix
  2022-08-30 12:01 [PATCH v2 00/19] KVM hardware enable/disable reorganize isaku.yamahata
                   ` (9 preceding siblings ...)
  2022-08-30 12:01 ` [PATCH v2 10/19] KVM: Move out KVM arch PM hooks and hardware enable/disable logic isaku.yamahata
@ 2022-08-30 12:01 ` isaku.yamahata
  2022-08-30 12:01 ` [PATCH v2 12/19] KVM: kvm_arch.c: Remove a global variable, hardware_enable_failed isaku.yamahata
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: isaku.yamahata @ 2022-08-30 12:01 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: isaku.yamahata, isaku.yamahata, Paolo Bonzini,
	Sean Christopherson, Kai Huang, Chao Gao, Will Deacon

From: Isaku Yamahata <isaku.yamahata@intel.com>

Now that all related callbacks are called under kvm_lock, the _nolock
postfix serves no purpose.  Remove it to shorten the function names.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
---
 virt/kvm/kvm_arch.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/virt/kvm/kvm_arch.c b/virt/kvm/kvm_arch.c
index 51c6e9f03ed5..491e92ef9e3d 100644
--- a/virt/kvm/kvm_arch.c
+++ b/virt/kvm/kvm_arch.c
@@ -20,7 +20,7 @@ __weak int kvm_arch_post_init_vm(struct kvm *kvm)
 	return 0;
 }
 
-static void hardware_enable_nolock(void *junk)
+static void hardware_enable(void *junk)
 {
 	int cpu = raw_smp_processor_id();
 	int r;
@@ -41,7 +41,7 @@ static void hardware_enable_nolock(void *junk)
 	}
 }
 
-static void hardware_disable_nolock(void *junk)
+static void hardware_disable(void *junk)
 {
 	int cpu = raw_smp_processor_id();
 
@@ -55,7 +55,7 @@ static void hardware_disable_nolock(void *junk)
 
 __weak void kvm_arch_pre_hardware_unsetup(void)
 {
-	on_each_cpu(hardware_disable_nolock, NULL, 1);
+	on_each_cpu(hardware_disable, NULL, 1);
 }
 
 /*
@@ -70,7 +70,7 @@ __weak int kvm_arch_add_vm(struct kvm *kvm, int usage_count)
 		return 0;
 
 	atomic_set(&hardware_enable_failed, 0);
-	on_each_cpu(hardware_enable_nolock, NULL, 1);
+	on_each_cpu(hardware_enable, NULL, 1);
 
 	if (atomic_read(&hardware_enable_failed)) {
 		r = -EBUSY;
@@ -80,7 +80,7 @@ __weak int kvm_arch_add_vm(struct kvm *kvm, int usage_count)
 	r = kvm_arch_post_init_vm(kvm);
 err:
 	if (r)
-		on_each_cpu(hardware_disable_nolock, NULL, 1);
+		on_each_cpu(hardware_disable, NULL, 1);
 	return r;
 }
 
@@ -89,7 +89,7 @@ __weak int kvm_arch_del_vm(int usage_count)
 	if (usage_count)
 		return 0;
 
-	on_each_cpu(hardware_disable_nolock, NULL, 1);
+	on_each_cpu(hardware_disable, NULL, 1);
 	return 0;
 }
 
@@ -105,7 +105,7 @@ __weak int kvm_arch_online_cpu(unsigned int cpu, int usage_count)
 	if (usage_count) {
 		WARN_ON_ONCE(atomic_read(&hardware_enable_failed));
 
-		hardware_enable_nolock(NULL);
+		hardware_enable(NULL);
 		if (atomic_read(&hardware_enable_failed)) {
 			atomic_set(&hardware_enable_failed, 0);
 			ret = -EIO;
@@ -117,25 +117,25 @@ __weak int kvm_arch_online_cpu(unsigned int cpu, int usage_count)
 __weak int kvm_arch_offline_cpu(unsigned int cpu, int usage_count)
 {
 	if (usage_count)
-		hardware_disable_nolock(NULL);
+		hardware_disable(NULL);
 	return 0;
 }
 
 __weak int kvm_arch_reboot(int val)
 {
-	on_each_cpu(hardware_disable_nolock, NULL, 1);
+	on_each_cpu(hardware_disable, NULL, 1);
 	return NOTIFY_OK;
 }
 
 __weak int kvm_arch_suspend(int usage_count)
 {
 	if (usage_count)
-		hardware_disable_nolock(NULL);
+		hardware_disable(NULL);
 	return 0;
 }
 
 __weak void kvm_arch_resume(int usage_count)
 {
 	if (usage_count)
-		hardware_enable_nolock(NULL);
+		hardware_enable(NULL);
 }
-- 
2.25.1



* [PATCH v2 12/19] KVM: kvm_arch.c: Remove a global variable, hardware_enable_failed
  2022-08-30 12:01 [PATCH v2 00/19] KVM hardware enable/disable reorganize isaku.yamahata
                   ` (10 preceding siblings ...)
  2022-08-30 12:01 ` [PATCH v2 11/19] KVM: kvm_arch.c: Remove _nolock post fix isaku.yamahata
@ 2022-08-30 12:01 ` isaku.yamahata
  2022-08-30 12:01 ` [PATCH v2 13/19] KVM: Do processor compatibility check on cpu online and resume isaku.yamahata
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: isaku.yamahata @ 2022-08-30 12:01 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: isaku.yamahata, isaku.yamahata, Paolo Bonzini,
	Sean Christopherson, Kai Huang, Chao Gao, Will Deacon

From: Isaku Yamahata <isaku.yamahata@intel.com>

A global variable hardware_enable_failed in kvm_arch.c is used only by
kvm_arch_add_vm() and hardware_enable().  It doesn't have to be a global
variable.  Make it function local.
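
For readers following along, the pattern can be sketched in userspace C as
below.  This is only an illustration under stated assumptions, not the kernel
code: fake_on_each_cpu(), fake_hardware_enable(), current_cpu and failing_cpu
are hypothetical stand-ins, and C11 atomic_int is used in place of the
kernel's atomic_t.  The point is that the failure counter lives on the
caller's stack and reaches the callback through its opaque argument.

```c
#include <errno.h>
#include <stdatomic.h>

#define NR_FAKE_CPUS 4

/* Hypothetical stand-ins: not kernel APIs. */
static int current_cpu;       /* plays the role of raw_smp_processor_id() */
static int failing_cpu = -1;  /* fake CPU that refuses to enable, -1 for none */

/* Stand-in for kvm_arch_hardware_enable(): fails only on failing_cpu. */
static int fake_hardware_enable(void)
{
	return current_cpu == failing_cpu ? -EIO : 0;
}

/* Per-CPU callback: the counter arrives via the opaque argument. */
static void hardware_enable_cb(void *arg)
{
	atomic_int *failed = arg;

	if (fake_hardware_enable())
		atomic_fetch_add(failed, 1);
}

/* Sequential stand-in for on_each_cpu(); the real one runs on every CPU. */
static void fake_on_each_cpu(void (*fn)(void *), void *arg)
{
	for (current_cpu = 0; current_cpu < NR_FAKE_CPUS; current_cpu++)
		fn(arg);
}

/* Caller owns the counter, so no file-scope state is needed for it. */
static int enable_all(void)
{
	atomic_int failed;

	atomic_init(&failed, 0);
	fake_on_each_cpu(hardware_enable_cb, &failed);
	return atomic_load(&failed) ? -EBUSY : 0;
}
```

Calling enable_all() with failing_cpu left at -1 reports success; setting
failing_cpu to a valid fake CPU number makes it return -EBUSY.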

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
---
 virt/kvm/kvm_arch.c | 56 +++++++++++++++++++++------------------------
 1 file changed, 26 insertions(+), 30 deletions(-)

diff --git a/virt/kvm/kvm_arch.c b/virt/kvm/kvm_arch.c
index 491e92ef9e3d..3990f85edab3 100644
--- a/virt/kvm/kvm_arch.c
+++ b/virt/kvm/kvm_arch.c
@@ -13,14 +13,13 @@
 #include <linux/kvm_host.h>
 
 static cpumask_t cpus_hardware_enabled = CPU_MASK_NONE;
-static atomic_t hardware_enable_failed;
 
 __weak int kvm_arch_post_init_vm(struct kvm *kvm)
 {
 	return 0;
 }
 
-static void hardware_enable(void *junk)
+static int __hardware_enable(void)
 {
 	int cpu = raw_smp_processor_id();
 	int r;
@@ -28,17 +27,21 @@ static void hardware_enable(void *junk)
 	WARN_ON_ONCE(preemptible());
 
 	if (cpumask_test_cpu(cpu, &cpus_hardware_enabled))
-		return;
-
-	cpumask_set_cpu(cpu, &cpus_hardware_enabled);
-
+		return 0;
 	r = kvm_arch_hardware_enable();
-
-	if (r) {
-		cpumask_clear_cpu(cpu, &cpus_hardware_enabled);
-		atomic_inc(&hardware_enable_failed);
+	if (r)
 		pr_info("kvm: enabling virtualization on CPU%d failed\n", cpu);
-	}
+	else
+		cpumask_set_cpu(cpu, &cpus_hardware_enabled);
+	return r;
+}
+
+static void hardware_enable(void *arg)
+{
+	atomic_t *failed = arg;
+
+	if (__hardware_enable())
+		atomic_inc(failed);
 }
 
 static void hardware_disable(void *junk)
@@ -64,15 +67,16 @@ __weak void kvm_arch_pre_hardware_unsetup(void)
  */
 __weak int kvm_arch_add_vm(struct kvm *kvm, int usage_count)
 {
+	atomic_t failed;
 	int r = 0;
 
 	if (usage_count != 1)
 		return 0;
 
-	atomic_set(&hardware_enable_failed, 0);
-	on_each_cpu(hardware_enable, NULL, 1);
+	atomic_set(&failed, 0);
+	on_each_cpu(hardware_enable, &failed, 1);
 
-	if (atomic_read(&hardware_enable_failed)) {
+	if (atomic_read(&failed)) {
 		r = -EBUSY;
 		goto err;
 	}
@@ -95,23 +99,15 @@ __weak int kvm_arch_del_vm(int usage_count)
 
 __weak int kvm_arch_online_cpu(unsigned int cpu, int usage_count)
 {
-	int ret = 0;
-
-	/*
-	 * Abort the CPU online process if hardware virtualization cannot
-	 * be enabled. Otherwise running VMs would encounter unrecoverable
-	 * errors when scheduled to this CPU.
-	 */
 	if (usage_count) {
-		WARN_ON_ONCE(atomic_read(&hardware_enable_failed));
-
-		hardware_enable(NULL);
-		if (atomic_read(&hardware_enable_failed)) {
-			atomic_set(&hardware_enable_failed, 0);
-			ret = -EIO;
-		}
+		/*
+		 * Abort the CPU online process if hardware virtualization cannot
+		 * be enabled. Otherwise running VMs would encounter unrecoverable
+		 * errors when scheduled to this CPU.
+		 */
+		return __hardware_enable();
 	}
-	return ret;
+	return 0;
 }
 
 __weak int kvm_arch_offline_cpu(unsigned int cpu, int usage_count)
@@ -137,5 +133,5 @@ __weak int kvm_arch_suspend(int usage_count)
 __weak void kvm_arch_resume(int usage_count)
 {
 	if (usage_count)
-		hardware_enable(NULL);
+		(void)__hardware_enable();
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 13/19] KVM: Do processor compatibility check on cpu online and resume
  2022-08-30 12:01 [PATCH v2 00/19] KVM hardware enable/disable reorganize isaku.yamahata
                   ` (11 preceding siblings ...)
  2022-08-30 12:01 ` [PATCH v2 12/19] KVM: kvm_arch.c: Remove a global variable, hardware_enable_failed isaku.yamahata
@ 2022-08-30 12:01 ` isaku.yamahata
  2022-08-30 12:01 ` [PATCH v2 14/19] KVM: x86: Duplicate arch callbacks related to pm events isaku.yamahata
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: isaku.yamahata @ 2022-08-30 12:01 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: isaku.yamahata, isaku.yamahata, Paolo Bonzini,
	Sean Christopherson, Kai Huang, Chao Gao, Will Deacon

From: Isaku Yamahata <isaku.yamahata@intel.com>

So far the processor compatibility check is not done for a newly added CPU.
It should be done.  In the CPU online case, the function is called by a
kernel thread bound to the CPU with IRQs not disabled, so remove
WARN_ON(!irqs_disabled()).
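
The resulting order of operations on CPU online can be sketched in userspace
C as follows.  This is a simplified model, not the kernel code: the fake_*
helpers and the compat_result/enable_result/enable_calls variables are
hypothetical knobs standing in for kvm_arch_check_processor_compat() and
__hardware_enable().  It shows that a compatibility failure aborts the online
regardless of usage_count, and that hardware is enabled only when a VM exists.

```c
#include <errno.h>

/* Hypothetical knobs: not kernel APIs. */
static int compat_result;  /* value the compatibility check should return */
static int enable_result;  /* value the hardware enable should return */
static int enable_calls;   /* how many times enable was attempted */

static int fake_check_processor_compat(void)
{
	return compat_result;
}

static int fake_hardware_enable(void)
{
	enable_calls++;
	return enable_result;
}

/* Models the online callback: compat check first, then conditional enable. */
static int online_cpu(int usage_count)
{
	int r;

	r = fake_check_processor_compat();
	if (r)
		return r;	/* incompatible CPU: refuse to bring it online */

	if (usage_count)
		return fake_hardware_enable();
	return 0;
}
```

With compat_result set to -EIO, online_cpu() returns -EIO and never attempts
to enable hardware, even when usage_count is zero.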

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
---
 arch/x86/kvm/x86.c  |  2 --
 virt/kvm/kvm_arch.c | 15 +++++++++++++--
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0b112cd7de58..ac185e199f69 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12003,8 +12003,6 @@ int kvm_arch_check_processor_compat(void)
 {
 	struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
 
-	WARN_ON(!irqs_disabled());
-
 	if (__cr4_reserved_bits(cpu_has, c) !=
 	    __cr4_reserved_bits(cpu_has, &boot_cpu_data))
 		return -EIO;
diff --git a/virt/kvm/kvm_arch.c b/virt/kvm/kvm_arch.c
index 3990f85edab3..e440d4a99c8a 100644
--- a/virt/kvm/kvm_arch.c
+++ b/virt/kvm/kvm_arch.c
@@ -99,6 +99,12 @@ __weak int kvm_arch_del_vm(int usage_count)
 
 __weak int kvm_arch_online_cpu(unsigned int cpu, int usage_count)
 {
+	int r;
+
+	r = kvm_arch_check_processor_compat();
+	if (r)
+		return r;
+
 	if (usage_count) {
 		/*
 		 * Abort the CPU online process if hardware virtualization cannot
@@ -132,6 +138,11 @@ __weak int kvm_arch_suspend(int usage_count)
 
 __weak void kvm_arch_resume(int usage_count)
 {
-	if (usage_count)
-		(void)__hardware_enable();
+	if (kvm_arch_check_processor_compat())
+		return; /* FIXME: disable KVM */
+
+	if (!usage_count)
+		return;
+
+	(void)__hardware_enable();
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 14/19] KVM: x86: Duplicate arch callbacks related to pm events
  2022-08-30 12:01 [PATCH v2 00/19] KVM hardware enable/disable reorganize isaku.yamahata
                   ` (12 preceding siblings ...)
  2022-08-30 12:01 ` [PATCH v2 13/19] KVM: Do processor compatibility check on cpu online and resume isaku.yamahata
@ 2022-08-30 12:01 ` isaku.yamahata
  2022-08-30 12:01 ` [PATCH v2 15/19] KVM: Eliminate kvm_arch_post_init_vm() isaku.yamahata
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: isaku.yamahata @ 2022-08-30 12:01 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: isaku.yamahata, isaku.yamahata, Paolo Bonzini,
	Sean Christopherson, Kai Huang, Chao Gao, Will Deacon

From: Isaku Yamahata <isaku.yamahata@intel.com>

Duplicate the arch callbacks related to PM events into x86 code so that
KVM/x86 can change them without worrying about breaking other archs.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
---
 arch/x86/kvm/x86.c | 131 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 126 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ac185e199f69..2485f3d792b2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11841,6 +11841,130 @@ void kvm_arch_hardware_disable(void)
 	drop_user_return_notifiers();
 }
 
+static cpumask_t cpus_hardware_enabled = CPU_MASK_NONE;
+
+int kvm_arch_post_init_vm(struct kvm *kvm)
+{
+	return kvm_mmu_post_init_vm(kvm);
+}
+
+static int __hardware_enable(void)
+{
+	int cpu = raw_smp_processor_id();
+	int r;
+
+	WARN_ON_ONCE(preemptible());
+
+	if (cpumask_test_cpu(cpu, &cpus_hardware_enabled))
+		return 0;
+	r = kvm_arch_hardware_enable();
+	if (r)
+		pr_info("kvm: enabling virtualization on CPU%d failed\n", cpu);
+	else
+		cpumask_set_cpu(cpu, &cpus_hardware_enabled);
+	return r;
+}
+
+static void hardware_enable(void *arg)
+{
+	atomic_t *failed = arg;
+
+	if (__hardware_enable())
+		atomic_inc(failed);
+}
+
+static void hardware_disable(void *junk)
+{
+	int cpu = raw_smp_processor_id();
+
+	WARN_ON_ONCE(preemptible());
+
+	if (!cpumask_test_cpu(cpu, &cpus_hardware_enabled))
+		return;
+	cpumask_clear_cpu(cpu, &cpus_hardware_enabled);
+	kvm_arch_hardware_disable();
+}
+
+void kvm_arch_pre_hardware_unsetup(void)
+{
+	on_each_cpu(hardware_disable, NULL, 1);
+}
+
+/*
+ * Called after the VM is otherwise initialized, but just before adding it to
+ * the vm_list.
+ */
+int kvm_arch_add_vm(struct kvm *kvm, int usage_count)
+{
+	atomic_t failed;
+	int r = 0;
+
+	if (usage_count != 1)
+		return 0;
+
+	atomic_set(&failed, 0);
+	on_each_cpu(hardware_enable, &failed, 1);
+
+	if (atomic_read(&failed)) {
+		r = -EBUSY;
+		goto err;
+	}
+
+	r = kvm_arch_post_init_vm(kvm);
+err:
+	if (r)
+		on_each_cpu(hardware_disable, NULL, 1);
+	return r;
+}
+
+int kvm_arch_del_vm(int usage_count)
+{
+	if (usage_count)
+		return 0;
+
+	on_each_cpu(hardware_disable, NULL, 1);
+	return 0;
+}
+
+int kvm_arch_online_cpu(unsigned int cpu, int usage_count)
+{
+	int r;
+
+	r = kvm_arch_check_processor_compat();
+	if (r)
+		return r;
+
+	if (usage_count) {
+		/*
+		 * Abort the CPU online process if hardware virtualization cannot
+		 * be enabled. Otherwise running VMs would encounter unrecoverable
+		 * errors when scheduled to this CPU.
+		 */
+		return __hardware_enable();
+	}
+	return 0;
+}
+
+int kvm_arch_offline_cpu(unsigned int cpu, int usage_count)
+{
+	if (usage_count)
+		hardware_disable(NULL);
+	return 0;
+}
+
+int kvm_arch_reboot(int val)
+{
+	on_each_cpu(hardware_disable, NULL, 1);
+	return NOTIFY_OK;
+}
+
+int kvm_arch_suspend(int usage_count)
+{
+	if (usage_count)
+		hardware_disable(NULL);
+	return 0;
+}
+
 void kvm_arch_resume(int usage_count)
 {
 	struct kvm *kvm;
@@ -11853,6 +11977,8 @@ void kvm_arch_resume(int usage_count)
 	if (!usage_count)
 		return;
 
+	if (kvm_arch_check_processor_compat())
+		return;
 	if (kvm_arch_hardware_enable())
 		return;
 
@@ -12102,11 +12228,6 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 	return ret;
 }
 
-int kvm_arch_post_init_vm(struct kvm *kvm)
-{
-	return kvm_mmu_post_init_vm(kvm);
-}
-
 static void kvm_unload_vcpu_mmu(struct kvm_vcpu *vcpu)
 {
 	vcpu_load(vcpu);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 15/19] KVM: Eliminate kvm_arch_post_init_vm()
  2022-08-30 12:01 [PATCH v2 00/19] KVM hardware enable/disable reorganize isaku.yamahata
                   ` (13 preceding siblings ...)
  2022-08-30 12:01 ` [PATCH v2 14/19] KVM: x86: Duplicate arch callbacks related to pm events isaku.yamahata
@ 2022-08-30 12:01 ` isaku.yamahata
  2022-08-30 12:01 ` [PATCH v2 16/19] KVM: x86: Delete kvm_arch_hardware_enable/disable() isaku.yamahata
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: isaku.yamahata @ 2022-08-30 12:01 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: isaku.yamahata, isaku.yamahata, Paolo Bonzini,
	Sean Christopherson, Kai Huang, Chao Gao, Will Deacon

From: Isaku Yamahata <isaku.yamahata@intel.com>

Now kvm_arch_post_init_vm() is used only by x86 kvm_arch_add_vm().  No other
arch defines it.  Merge x86 kvm_arch_post_init_vm() into x86
kvm_arch_add_vm() and eliminate kvm_arch_post_init_vm().

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
---
 arch/x86/kvm/x86.c       |  7 +------
 include/linux/kvm_host.h |  1 -
 virt/kvm/kvm_arch.c      | 12 +-----------
 3 files changed, 2 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2485f3d792b2..e5f066138ee9 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11843,11 +11843,6 @@ void kvm_arch_hardware_disable(void)
 
 static cpumask_t cpus_hardware_enabled = CPU_MASK_NONE;
 
-int kvm_arch_post_init_vm(struct kvm *kvm)
-{
-	return kvm_mmu_post_init_vm(kvm);
-}
-
 static int __hardware_enable(void)
 {
 	int cpu = raw_smp_processor_id();
@@ -11910,7 +11905,7 @@ int kvm_arch_add_vm(struct kvm *kvm, int usage_count)
 		goto err;
 	}
 
-	r = kvm_arch_post_init_vm(kvm);
+	r = kvm_mmu_post_init_vm(kvm);
 err:
 	if (r)
 		on_each_cpu(hardware_disable, NULL, 1);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 60f4ae9d6f48..8abbf7a1773b 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1445,7 +1445,6 @@ bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu);
 int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu);
 bool kvm_arch_dy_runnable(struct kvm_vcpu *vcpu);
 bool kvm_arch_dy_has_pending_interrupt(struct kvm_vcpu *vcpu);
-int kvm_arch_post_init_vm(struct kvm *kvm);
 int kvm_arch_add_vm(struct kvm *kvm, int usage_count);
 int kvm_arch_del_vm(int usage_count);
 void kvm_arch_pre_destroy_vm(struct kvm *kvm);
diff --git a/virt/kvm/kvm_arch.c b/virt/kvm/kvm_arch.c
index e440d4a99c8a..8f2d920a2a8f 100644
--- a/virt/kvm/kvm_arch.c
+++ b/virt/kvm/kvm_arch.c
@@ -14,11 +14,6 @@
 
 static cpumask_t cpus_hardware_enabled = CPU_MASK_NONE;
 
-__weak int kvm_arch_post_init_vm(struct kvm *kvm)
-{
-	return 0;
-}
-
 static int __hardware_enable(void)
 {
 	int cpu = raw_smp_processor_id();
@@ -78,13 +73,8 @@ __weak int kvm_arch_add_vm(struct kvm *kvm, int usage_count)
 
 	if (atomic_read(&failed)) {
 		r = -EBUSY;
-		goto err;
-	}
-
-	r = kvm_arch_post_init_vm(kvm);
-err:
-	if (r)
 		on_each_cpu(hardware_disable, NULL, 1);
+	}
 	return r;
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 16/19] KVM: x86: Delete kvm_arch_hardware_enable/disable()
  2022-08-30 12:01 [PATCH v2 00/19] KVM hardware enable/disable reorganize isaku.yamahata
                   ` (14 preceding siblings ...)
  2022-08-30 12:01 ` [PATCH v2 15/19] KVM: Eliminate kvm_arch_post_init_vm() isaku.yamahata
@ 2022-08-30 12:01 ` isaku.yamahata
  2022-08-30 12:01 ` [PATCH v2 17/19] KVM: Add config to not compile kvm_arch.c isaku.yamahata
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: isaku.yamahata @ 2022-08-30 12:01 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: isaku.yamahata, isaku.yamahata, Paolo Bonzini,
	Sean Christopherson, Kai Huang, Chao Gao, Will Deacon

From: Isaku Yamahata <isaku.yamahata@intel.com>

Now they're just function calls and there is no point in keeping them.
Opportunistically make kvm_arch_pre_hardware_unsetup() empty.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
---
 arch/x86/kvm/x86.c | 24 ++++++++----------------
 1 file changed, 8 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e5f066138ee9..14a464f7302b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -354,7 +354,7 @@ static void kvm_on_user_return(struct user_return_notifier *urn)
 
 	/*
 	 * Disabling irqs at this point since the following code could be
-	 * interrupted and executed through kvm_arch_hardware_disable()
+	 * interrupted and executed through hardware_disable()
 	 */
 	local_irq_save(flags);
 	if (msrs->registered) {
@@ -11830,17 +11830,6 @@ void kvm_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_deliver_sipi_vector);
 
-int kvm_arch_hardware_enable(void)
-{
-	return static_call(kvm_x86_hardware_enable)();
-}
-
-void kvm_arch_hardware_disable(void)
-{
-	static_call(kvm_x86_hardware_disable)();
-	drop_user_return_notifiers();
-}
-
 static cpumask_t cpus_hardware_enabled = CPU_MASK_NONE;
 
 static int __hardware_enable(void)
@@ -11852,7 +11841,7 @@ static int __hardware_enable(void)
 
 	if (cpumask_test_cpu(cpu, &cpus_hardware_enabled))
 		return 0;
-	r = kvm_arch_hardware_enable();
+	r = static_call(kvm_x86_hardware_enable)();
 	if (r)
 		pr_info("kvm: enabling virtualization on CPU%d failed\n", cpu);
 	else
@@ -11877,12 +11866,13 @@ static void hardware_disable(void *junk)
 	if (!cpumask_test_cpu(cpu, &cpus_hardware_enabled))
 		return;
 	cpumask_clear_cpu(cpu, &cpus_hardware_enabled);
-	kvm_arch_hardware_disable();
+	static_call(kvm_x86_hardware_disable)();
+	drop_user_return_notifiers();
 }
 
 void kvm_arch_pre_hardware_unsetup(void)
 {
-	on_each_cpu(hardware_disable, NULL, 1);
+	/* TODO: eliminate this function */
 }
 
 /*
@@ -11974,7 +11964,7 @@ void kvm_arch_resume(int usage_count)
 
 	if (kvm_arch_check_processor_compat())
 		return;
-	if (kvm_arch_hardware_enable())
+	if (static_call(kvm_x86_hardware_enable)())
 		return;
 
 	local_tsc = rdtsc();
@@ -12115,6 +12105,8 @@ int kvm_arch_hardware_setup(void *opaque)
 
 void kvm_arch_hardware_unsetup(void)
 {
+	on_each_cpu(hardware_disable, NULL, 1);
+
 	kvm_unregister_perf_callbacks();
 
 	static_call(kvm_x86_hardware_unsetup)();
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 17/19] KVM: Add config to not compile kvm_arch.c
  2022-08-30 12:01 [PATCH v2 00/19] KVM hardware enable/disable reorganize isaku.yamahata
                   ` (15 preceding siblings ...)
  2022-08-30 12:01 ` [PATCH v2 16/19] KVM: x86: Delete kvm_arch_hardware_enable/disable() isaku.yamahata
@ 2022-08-30 12:01 ` isaku.yamahata
  2022-08-30 12:01 ` [PATCH v2 18/19] RFC: KVM: x86: Remove cpus_hardware_enabled and related sanity check isaku.yamahata
  2022-08-30 12:01 ` [PATCH v2 19/19] RFC: KVM: " isaku.yamahata
  18 siblings, 0 replies; 30+ messages in thread
From: isaku.yamahata @ 2022-08-30 12:01 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: isaku.yamahata, isaku.yamahata, Paolo Bonzini,
	Sean Christopherson, Kai Huang, Chao Gao, Will Deacon

From: Isaku Yamahata <isaku.yamahata@intel.com>

Add a config option to not compile kvm_arch.c so that
kvm_arch_hardware_enable/disable() don't need to be defined.

Once the conversion of all KVM archs is done, this config and kvm_arch.c
should be removed.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
---
 arch/x86/kvm/Kconfig     | 1 +
 include/linux/kvm_host.h | 2 ++
 virt/kvm/Kconfig         | 3 +++
 virt/kvm/Makefile.kvm    | 5 ++++-
 4 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index e3cbd7706136..e2e16205425d 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -25,6 +25,7 @@ config KVM
 	depends on X86_LOCAL_APIC
 	select PREEMPT_NOTIFIERS
 	select MMU_NOTIFIER
+	select HAVE_KVM_OVERRIDE_HARDWARE_ENABLE
 	select HAVE_KVM_IRQCHIP
 	select HAVE_KVM_PFNCACHE
 	select HAVE_KVM_IRQFD
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 8abbf7a1773b..74111118db42 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1434,8 +1434,10 @@ void kvm_arch_create_vcpu_debugfs(struct kvm_vcpu *vcpu, struct dentry *debugfs_
 static inline void kvm_create_vcpu_debugfs(struct kvm_vcpu *vcpu) {}
 #endif
 
+#ifndef CONFIG_HAVE_KVM_OVERRIDE_HARDWARE_ENABLE
 int kvm_arch_hardware_enable(void);
 void kvm_arch_hardware_disable(void);
+#endif
 int kvm_arch_hardware_setup(void *opaque);
 void kvm_arch_pre_hardware_unsetup(void);
 void kvm_arch_hardware_unsetup(void);
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index a8c5c9f06b3c..917314a87696 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -72,3 +72,6 @@ config KVM_XFER_TO_GUEST_WORK
 
 config HAVE_KVM_PM_NOTIFIER
        bool
+
+config HAVE_KVM_OVERRIDE_HARDWARE_ENABLE
+	def_bool n
diff --git a/virt/kvm/Makefile.kvm b/virt/kvm/Makefile.kvm
index c4210acabd35..c0187ec4f83c 100644
--- a/virt/kvm/Makefile.kvm
+++ b/virt/kvm/Makefile.kvm
@@ -5,7 +5,10 @@
 
 KVM ?= ../../../virt/kvm
 
-kvm-y := $(KVM)/kvm_main.o $(KVM)/kvm_arch.o $(KVM)/eventfd.o $(KVM)/binary_stats.o
+kvm-y := $(KVM)/kvm_main.o $(KVM)/eventfd.o $(KVM)/binary_stats.o
+ifneq ($(CONFIG_HAVE_KVM_OVERRIDE_HARDWARE_ENABLE), y)
+kvm-y += $(KVM)/kvm_arch.o
+endif
 kvm-$(CONFIG_KVM_VFIO) += $(KVM)/vfio.o
 kvm-$(CONFIG_KVM_MMIO) += $(KVM)/coalesced_mmio.o
 kvm-$(CONFIG_KVM_ASYNC_PF) += $(KVM)/async_pf.o
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 18/19] RFC: KVM: x86: Remove cpus_hardware_enabled and related sanity check
  2022-08-30 12:01 [PATCH v2 00/19] KVM hardware enable/disable reorganize isaku.yamahata
                   ` (16 preceding siblings ...)
  2022-08-30 12:01 ` [PATCH v2 17/19] KVM: Add config to not compile kvm_arch.c isaku.yamahata
@ 2022-08-30 12:01 ` isaku.yamahata
  2022-08-30 12:01 ` [PATCH v2 19/19] RFC: KVM: " isaku.yamahata
  18 siblings, 0 replies; 30+ messages in thread
From: isaku.yamahata @ 2022-08-30 12:01 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: isaku.yamahata, isaku.yamahata, Paolo Bonzini,
	Sean Christopherson, Kai Huang, Chao Gao, Will Deacon

From: Isaku Yamahata <isaku.yamahata@intel.com>

The cpus_hardware_enabled mask seems to be incomplete protection against
other kernel components using the CPU virtualization feature.  Because it's
obscure and incomplete, remove the check.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
---
 arch/x86/kvm/x86.c | 15 +--------------
 1 file changed, 1 insertion(+), 14 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 14a464f7302b..10b83cbb29ba 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11830,22 +11830,15 @@ void kvm_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_deliver_sipi_vector);
 
-static cpumask_t cpus_hardware_enabled = CPU_MASK_NONE;
-
 static int __hardware_enable(void)
 {
-	int cpu = raw_smp_processor_id();
 	int r;
 
 	WARN_ON_ONCE(preemptible());
 
-	if (cpumask_test_cpu(cpu, &cpus_hardware_enabled))
-		return 0;
 	r = static_call(kvm_x86_hardware_enable)();
 	if (r)
-		pr_info("kvm: enabling virtualization on CPU%d failed\n", cpu);
-	else
-		cpumask_set_cpu(cpu, &cpus_hardware_enabled);
+		pr_info("kvm: enabling virtualization on CPU%d failed\n", smp_processor_id());
 	return r;
 }
 
@@ -11859,13 +11852,7 @@ static void hardware_enable(void *arg)
 
 static void hardware_disable(void *junk)
 {
-	int cpu = raw_smp_processor_id();
-
 	WARN_ON_ONCE(preemptible());
-
-	if (!cpumask_test_cpu(cpu, &cpus_hardware_enabled))
-		return;
-	cpumask_clear_cpu(cpu, &cpus_hardware_enabled);
 	static_call(kvm_x86_hardware_disable)();
 	drop_user_return_notifiers();
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 19/19] RFC: KVM: Remove cpus_hardware_enabled and related sanity check
  2022-08-30 12:01 [PATCH v2 00/19] KVM hardware enable/disable reorganize isaku.yamahata
                   ` (17 preceding siblings ...)
  2022-08-30 12:01 ` [PATCH v2 18/19] RFC: KVM: x86: Remove cpus_hardware_enabled and related sanity check isaku.yamahata
@ 2022-08-30 12:01 ` isaku.yamahata
  18 siblings, 0 replies; 30+ messages in thread
From: isaku.yamahata @ 2022-08-30 12:01 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: isaku.yamahata, isaku.yamahata, Paolo Bonzini,
	Sean Christopherson, Kai Huang, Chao Gao, Will Deacon

From: Isaku Yamahata <isaku.yamahata@intel.com>

The cpus_hardware_enabled mask seems to be incomplete protection against
other kernel components using the CPU virtualization feature.  Because it's
obscure and incomplete, remove the check.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
---
 virt/kvm/kvm_arch.c | 16 ++--------------
 1 file changed, 2 insertions(+), 14 deletions(-)

diff --git a/virt/kvm/kvm_arch.c b/virt/kvm/kvm_arch.c
index 8f2d920a2a8f..cbad0181c177 100644
--- a/virt/kvm/kvm_arch.c
+++ b/virt/kvm/kvm_arch.c
@@ -12,22 +12,16 @@
 
 #include <linux/kvm_host.h>
 
-static cpumask_t cpus_hardware_enabled = CPU_MASK_NONE;
-
 static int __hardware_enable(void)
 {
-	int cpu = raw_smp_processor_id();
 	int r;
 
 	WARN_ON_ONCE(preemptible());
 
-	if (cpumask_test_cpu(cpu, &cpus_hardware_enabled))
-		return 0;
 	r = kvm_arch_hardware_enable();
 	if (r)
-		pr_info("kvm: enabling virtualization on CPU%d failed\n", cpu);
-	else
-		cpumask_set_cpu(cpu, &cpus_hardware_enabled);
+		pr_info("kvm: enabling virtualization on CPU%d failed\n",
+			smp_processor_id());
 	return r;
 }
 
@@ -41,13 +35,7 @@ static void hardware_enable(void *arg)
 
 static void hardware_disable(void *junk)
 {
-	int cpu = raw_smp_processor_id();
-
 	WARN_ON_ONCE(preemptible());
-
-	if (!cpumask_test_cpu(cpu, &cpus_hardware_enabled))
-		return;
-	cpumask_clear_cpu(cpu, &cpus_hardware_enabled);
 	kvm_arch_hardware_disable();
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 04/19] Partially revert "KVM: Pass kvm_init()'s opaque param to additional arch funcs"
  2022-08-30 12:01 ` [PATCH v2 04/19] Partially revert "KVM: Pass kvm_init()'s opaque param to additional arch funcs" isaku.yamahata
@ 2022-08-30 22:39   ` Huang, Kai
  2022-09-01 18:01     ` Isaku Yamahata
  0 siblings, 1 reply; 30+ messages in thread
From: Huang, Kai @ 2022-08-30 22:39 UTC (permalink / raw)
  To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	Yamahata, Isaku
  Cc: Gao, Chao, Christopherson,, Sean, suzuki.poulose@arm.com,
	isaku.yamahata@gmail.com, will@kernel.org, anup@brainfault.org,
	pbonzini@redhat.com, imbrenda@linux.ibm.com

On Tue, 2022-08-30 at 05:01 -0700, isaku.yamahata@intel.com wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> This partially reverts commit b99040853738 ("KVM: Pass kvm_init()'s opaque
> param to additional arch funcs") remove opaque from
> kvm_arch_check_processor_compat because no one uses this opaque now.
> Address conflicts for ARM (due to file movement) and manually handle RISC-V
> which comes after the commit.  The change about kvm_arch_hardware_setup()
> in original commit are still needed so they are not reverted.
> 
> The current implementation enables hardware (e.g. enable VMX on all CPUs),
> arch-specific initialization for the first VM creation, and disables
> hardware (in x86, disable VMX on all CPUs) for last VM destruction.
> 
> To support TDX, hardware_enable_all() will be done during module loading
> time.  As a result, CPU compatibility check will be opportunistically moved
> to hardware_enable_nolock(), which doesn't take any argument.  Instead of
> passing 'opaque' around to hardware_enable_nolock() and
> hardware_enable_all(), just remove the unused 'opaque' argument from
> kvm_arch_check_processor_compat().

This patch is now not part of the TDX series, so it doesn't make a lot of
sense to keep the last two paragraphs here (their purpose is different).  I
think you can just use Chao's original patch.


> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Reviewed-by: Sean Christopherson <seanjc@google.com>
> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> Acked-by: Anup Patel <anup@brainfault.org>
> Acked-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
> Link: https://lore.kernel.org/r/20220216031528.92558-3-chao.gao@intel.com
> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> Reviewed-by: Kai Huang <kai.huang@intel.com>
> ---
>  arch/arm64/kvm/arm.c       |  2 +-
>  arch/mips/kvm/mips.c       |  2 +-
>  arch/powerpc/kvm/powerpc.c |  2 +-
>  arch/riscv/kvm/main.c      |  2 +-
>  arch/s390/kvm/kvm-s390.c   |  2 +-
>  arch/x86/kvm/x86.c         |  2 +-
>  include/linux/kvm_host.h   |  2 +-
>  virt/kvm/kvm_main.c        | 16 +++-------------
>  8 files changed, 10 insertions(+), 20 deletions(-)
> 
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 2ff0ef62abad..3385fb57c11a 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -68,7 +68,7 @@ int kvm_arch_hardware_setup(void *opaque)
>  	return 0;
>  }
>  
> -int kvm_arch_check_processor_compat(void *opaque)
> +int kvm_arch_check_processor_compat(void)
>  {
>  	return 0;
>  }
> diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
> index a25e0b73ee70..092d09fb6a7e 100644
> --- a/arch/mips/kvm/mips.c
> +++ b/arch/mips/kvm/mips.c
> @@ -140,7 +140,7 @@ int kvm_arch_hardware_setup(void *opaque)
>  	return 0;
>  }
>  
> -int kvm_arch_check_processor_compat(void *opaque)
> +int kvm_arch_check_processor_compat(void)
>  {
>  	return 0;
>  }
> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
> index fb1490761c87..7b56d6ccfdfb 100644
> --- a/arch/powerpc/kvm/powerpc.c
> +++ b/arch/powerpc/kvm/powerpc.c
> @@ -447,7 +447,7 @@ int kvm_arch_hardware_setup(void *opaque)
>  	return 0;
>  }
>  
> -int kvm_arch_check_processor_compat(void *opaque)
> +int kvm_arch_check_processor_compat(void)
>  {
>  	return kvmppc_core_check_processor_compat();
>  }
> diff --git a/arch/riscv/kvm/main.c b/arch/riscv/kvm/main.c
> index 1549205fe5fe..f8d6372d208f 100644
> --- a/arch/riscv/kvm/main.c
> +++ b/arch/riscv/kvm/main.c
> @@ -20,7 +20,7 @@ long kvm_arch_dev_ioctl(struct file *filp,
>  	return -EINVAL;
>  }
>  
> -int kvm_arch_check_processor_compat(void *opaque)
> +int kvm_arch_check_processor_compat(void)
>  {
>  	return 0;
>  }
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index edfd4bbd0cba..e26d4dd85668 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -254,7 +254,7 @@ int kvm_arch_hardware_enable(void)
>  	return 0;
>  }
>  
> -int kvm_arch_check_processor_compat(void *opaque)
> +int kvm_arch_check_processor_compat(void)
>  {
>  	return 0;
>  }
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 985487fe0d63..ca920b6b925d 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -11994,7 +11994,7 @@ void kvm_arch_hardware_unsetup(void)
>  	static_call(kvm_x86_hardware_unsetup)();
>  }
>  
> -int kvm_arch_check_processor_compat(void *opaque)
> +int kvm_arch_check_processor_compat(void)
>  {
>  	struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
>  
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index f4519d3689e1..eab352902de7 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -1438,7 +1438,7 @@ int kvm_arch_hardware_enable(void);
>  void kvm_arch_hardware_disable(void);
>  int kvm_arch_hardware_setup(void *opaque);
>  void kvm_arch_hardware_unsetup(void);
> -int kvm_arch_check_processor_compat(void *opaque);
> +int kvm_arch_check_processor_compat(void);
>  int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu);
>  bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu);
>  int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu);
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 584a5bab3af3..4243a9541543 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -5799,22 +5799,14 @@ void kvm_unregister_perf_callbacks(void)
>  }
>  #endif
>  
> -struct kvm_cpu_compat_check {
> -	void *opaque;
> -	int *ret;
> -};
> -
> -static void check_processor_compat(void *data)
> +static void check_processor_compat(void *rtn)
>  {
> -	struct kvm_cpu_compat_check *c = data;
> -
> -	*c->ret = kvm_arch_check_processor_compat(c->opaque);
> +	*(int *)rtn = kvm_arch_check_processor_compat();
>  }
>  
>  int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
>  		  struct module *module)
>  {
> -	struct kvm_cpu_compat_check c;
>  	int r;
>  	int cpu;
>  
> @@ -5842,10 +5834,8 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
>  	if (r < 0)
>  		goto out_free_1;
>  
> -	c.ret = &r;
> -	c.opaque = opaque;
>  	for_each_online_cpu(cpu) {
> -		smp_call_function_single(cpu, check_processor_compat, &c, 1);
> +		smp_call_function_single(cpu, check_processor_compat, &r, 1);
>  		if (r < 0)
>  			goto out_free_2;
>  	}


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 01/19] KVM: x86: Drop kvm_user_return_msr_cpu_online()
  2022-08-30 12:01 ` [PATCH v2 01/19] KVM: x86: Drop kvm_user_return_msr_cpu_online() isaku.yamahata
@ 2022-09-01  5:29   ` Chao Gao
  2022-09-01 14:12     ` Sean Christopherson
  0 siblings, 1 reply; 30+ messages in thread
From: Chao Gao @ 2022-09-01  5:29 UTC (permalink / raw)
  To: isaku.yamahata
  Cc: kvm, linux-kernel, isaku.yamahata, Paolo Bonzini,
	Sean Christopherson, Kai Huang, Will Deacon

On Tue, Aug 30, 2022 at 05:01:16AM -0700, isaku.yamahata@intel.com wrote:
>From: Isaku Yamahata <isaku.yamahata@intel.com>
>
>KVM/X86 uses user return notifier to switch MSR for guest or user space.
>Snapshot host values on CPU online, change MSR values for guest, and
>restore them on returning to user space.  The current code abuses
>kvm_arch_hardware_enable() which is called on kvm module initialization or
>CPU online.
>
>Remove this abuse of kvm_arch_hardware_enable() by capturing the host
>value on the first change of the MSR value for the guest instead of on
>CPU online.
>
>Suggested-by: Sean Christopherson <seanjc@google.com>
>Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
>---
> arch/x86/kvm/x86.c | 43 ++++++++++++++++++++++++-------------------
> 1 file changed, 24 insertions(+), 19 deletions(-)
>
>diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>index 205ebdc2b11b..16104a2f7d8e 100644
>--- a/arch/x86/kvm/x86.c
>+++ b/arch/x86/kvm/x86.c
>@@ -200,6 +200,7 @@ struct kvm_user_return_msrs {
> 	struct kvm_user_return_msr_values {
> 		u64 host;
> 		u64 curr;
>+		bool initialized;
> 	} values[KVM_MAX_NR_USER_RETURN_MSRS];

The benefit of having an "initialized" state for each user return MSR on
each CPU is small. A per-CPU state looks sufficient. With it, you can keep
kvm_user_return_msr_cpu_online() and simply call the function from
kvm_set_user_return_msr() if initialized is false on the current CPU.
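
A rough userspace sketch of that idea, for illustration only: one flag per CPU guards a one-time snapshot of all host values, and the setter triggers the snapshot lazily. Everything here is hypothetical (fake_rdmsr(), the array standing in for per-CPU data, NR_CPUS, NR_MSRS); it is not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 4	/* hypothetical CPU count for the sketch */
#define NR_MSRS 3	/* hypothetical number of tracked MSRs */

struct uret_msrs {
	bool initialized;	/* one flag per CPU, not one per MSR */
	struct {
		uint64_t host;
		uint64_t curr;
	} values[NR_MSRS];
};

/* Static storage is zero-initialized, so every flag starts out false. */
static struct uret_msrs per_cpu_msrs[NR_CPUS];
static int nr_snapshots;	/* counts how often a snapshot was taken */

/* Stand-in for reading a hardware MSR; returns a made-up host value. */
static uint64_t fake_rdmsr(int cpu, int slot)
{
	return 0x1000u * (uint64_t)(cpu + 1) + (uint64_t)slot;
}

/* Snapshot all host values once per CPU; later calls return early. */
static void uret_msr_init_cpu(int cpu)
{
	struct uret_msrs *msrs = &per_cpu_msrs[cpu];
	int i;

	if (msrs->initialized)
		return;

	for (i = 0; i < NR_MSRS; i++) {
		msrs->values[i].host = fake_rdmsr(cpu, i);
		msrs->values[i].curr = msrs->values[i].host;
	}
	msrs->initialized = true;
	nr_snapshots++;
}

/* Returns 1 if the tracked value changed, 0 if it was already current. */
static int set_uret_msr(int cpu, int slot, uint64_t value, uint64_t mask)
{
	struct uret_msrs *msrs;

	assert(cpu >= 0 && cpu < NR_CPUS && slot >= 0 && slot < NR_MSRS);
	uret_msr_init_cpu(cpu);		/* lazy snapshot on first use */
	msrs = &per_cpu_msrs[cpu];

	value = (value & mask) | (msrs->values[slot].host & ~mask);
	if (value == msrs->values[slot].curr)
		return 0;
	msrs->values[slot].curr = value;
	return 1;
}
```

The trade-off is one extra flag test on every set call in exchange for not needing a hook at CPU online time.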


* Re: [PATCH v2 02/19] KVM: x86: Use this_cpu_ptr() instead of per_cpu_ptr(smp_processor_id())
  2022-08-30 12:01 ` [PATCH v2 02/19] KVM: x86: Use this_cpu_ptr() instead of per_cpu_ptr(smp_processor_id()) isaku.yamahata
@ 2022-09-01  5:56   ` Chao Gao
  0 siblings, 0 replies; 30+ messages in thread
From: Chao Gao @ 2022-09-01  5:56 UTC (permalink / raw)
  To: isaku.yamahata
  Cc: kvm, linux-kernel, isaku.yamahata, Paolo Bonzini,
	Sean Christopherson, Kai Huang, Will Deacon

On Tue, Aug 30, 2022 at 05:01:17AM -0700, isaku.yamahata@intel.com wrote:
>From: Isaku Yamahata <isaku.yamahata@intel.com>
>
>Convert per_cpu_ptr(smp_processor_id()) to this_cpu_ptr() as a trivial
>cleanup.
>
>Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>

Reviewed-by: Chao Gao <chao.gao@intel.com>

>---
> arch/x86/kvm/x86.c | 6 ++----
> 1 file changed, 2 insertions(+), 4 deletions(-)
>
>diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>index 16104a2f7d8e..7d5fff68befe 100644
>--- a/arch/x86/kvm/x86.c
>+++ b/arch/x86/kvm/x86.c
>@@ -416,8 +416,7 @@ EXPORT_SYMBOL_GPL(kvm_find_user_return_msr);
> 
> int kvm_set_user_return_msr(unsigned slot, u64 value, u64 mask)
> {
>-	unsigned int cpu = smp_processor_id();
>-	struct kvm_user_return_msrs *msrs = per_cpu_ptr(user_return_msrs, cpu);
>+	struct kvm_user_return_msrs *msrs = this_cpu_ptr(user_return_msrs);
> 	struct kvm_user_return_msr_values *values = &msrs->values[slot];
> 	int err;
> 
>@@ -449,8 +448,7 @@ EXPORT_SYMBOL_GPL(kvm_set_user_return_msr);
> 
> static void drop_user_return_notifiers(void)
> {
>-	unsigned int cpu = smp_processor_id();
>-	struct kvm_user_return_msrs *msrs = per_cpu_ptr(user_return_msrs, cpu);
>+	struct kvm_user_return_msrs *msrs = this_cpu_ptr(user_return_msrs);
> 
> 	if (msrs->registered)
> 		kvm_on_user_return(&msrs->urn);
>-- 
>2.25.1
>


* Re: [PATCH v2 05/19] KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section
  2022-08-30 12:01 ` [PATCH v2 05/19] KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section isaku.yamahata
@ 2022-09-01  5:59   ` Chao Gao
  2022-09-01  6:18   ` Chao Gao
  1 sibling, 0 replies; 30+ messages in thread
From: Chao Gao @ 2022-09-01  5:59 UTC (permalink / raw)
  To: isaku.yamahata
  Cc: kvm, linux-kernel, isaku.yamahata, Paolo Bonzini,
	Sean Christopherson, Kai Huang, Will Deacon, Thomas Gleixner

On Tue, Aug 30, 2022 at 05:01:20AM -0700, isaku.yamahata@intel.com wrote:
>From: Chao Gao <chao.gao@intel.com>
>
>The CPU STARTING section doesn't allow callbacks to fail. Move KVM's
>hotplug callback to ONLINE section so that it can abort onlining a CPU in
>certain cases to avoid potentially breaking VMs running on existing CPUs.
>For example, when kvm fails to enable hardware virtualization on the
>hotplugged CPU.
>
>Place KVM's hotplug state before CPUHP_AP_SCHED_WAIT_EMPTY as it ensures
>when offlining a CPU, all user tasks and non-pinned kernel tasks have left
>the CPU, i.e. there cannot be a vCPU task around. So, it is safe for KVM's
>CPU offline callback to disable hardware virtualization at that point.
>Likewise, KVM's online callback can enable hardware virtualization before
>any vCPU task gets a chance to run on hotplugged CPUs.
>
>KVM's CPU hotplug callbacks are renamed as well.
>
>Suggested-by: Thomas Gleixner <tglx@linutronix.de>
>Signed-off-by: Chao Gao <chao.gao@intel.com>
>Link: https://lore.kernel.org/r/20220216031528.92558-6-chao.gao@intel.com

Note that Sean gave his Reviewed-by for the KVM changes.

https://lore.kernel.org/all/Yg%2FmxKrB5ZoRBIG+@google.com/


* Re: [PATCH v2 05/19] KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section
  2022-08-30 12:01 ` [PATCH v2 05/19] KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section isaku.yamahata
  2022-09-01  5:59   ` Chao Gao
@ 2022-09-01  6:18   ` Chao Gao
  2022-09-01 10:58     ` Huang, Kai
  1 sibling, 1 reply; 30+ messages in thread
From: Chao Gao @ 2022-09-01  6:18 UTC (permalink / raw)
  To: isaku.yamahata
  Cc: kvm, linux-kernel, isaku.yamahata, Paolo Bonzini,
	Sean Christopherson, Kai Huang, Will Deacon, Thomas Gleixner

On Tue, Aug 30, 2022 at 05:01:20AM -0700, isaku.yamahata@intel.com wrote:
>From: Chao Gao <chao.gao@intel.com>
>
>The CPU STARTING section doesn't allow callbacks to fail. Move KVM's
>hotplug callback to ONLINE section so that it can abort onlining a CPU in
>certain cases to avoid potentially breaking VMs running on existing CPUs.
>For example, when kvm fails to enable hardware virtualization on the
>hotplugged CPU.
>
>Place KVM's hotplug state before CPUHP_AP_SCHED_WAIT_EMPTY as it ensures
>when offlining a CPU, all user tasks and non-pinned kernel tasks have left
>the CPU, i.e. there cannot be a vCPU task around. So, it is safe for KVM's
>CPU offline callback to disable hardware virtualization at that point.
>Likewise, KVM's online callback can enable hardware virtualization before
>any vCPU task gets a chance to run on hotplugged CPUs.
>
>KVM's CPU hotplug callbacks are renamed as well.
>
>Suggested-by: Thomas Gleixner <tglx@linutronix.de>
>Signed-off-by: Chao Gao <chao.gao@intel.com>

Isaku, your signed-off-by is missing.

>Link: https://lore.kernel.org/r/20220216031528.92558-6-chao.gao@intel.com
>---
> include/linux/cpuhotplug.h |  2 +-
> virt/kvm/kvm_main.c        | 30 ++++++++++++++++++++++--------
> 2 files changed, 23 insertions(+), 9 deletions(-)
>
>diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
>index f61447913db9..7972bd63e0cb 100644
>--- a/include/linux/cpuhotplug.h
>+++ b/include/linux/cpuhotplug.h
>@@ -185,7 +185,6 @@ enum cpuhp_state {
> 	CPUHP_AP_CSKY_TIMER_STARTING,
> 	CPUHP_AP_TI_GP_TIMER_STARTING,
> 	CPUHP_AP_HYPERV_TIMER_STARTING,
>-	CPUHP_AP_KVM_STARTING,
> 	CPUHP_AP_KVM_ARM_VGIC_INIT_STARTING,
> 	CPUHP_AP_KVM_ARM_VGIC_STARTING,
> 	CPUHP_AP_KVM_ARM_TIMER_STARTING,

The movement of CPUHP_AP_KVM_STARTING changes the ordering between
CPUHP_AP_KVM_STARTING and CPUHP_AP_KVM_ARM_* above [1]. We need
the patch [2] from Marc to avoid breaking ARM.

[1] https://lore.kernel.org/lkml/87sfsq4xy8.wl-maz@kernel.org/
[2] https://lore.kernel.org/lkml/20220216031528.92558-5-chao.gao@intel.com/
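
To make the ordering concern concrete, here is a small userspace sketch (not kernel code). It assumes the usual hotplug model: startup callbacks run in ascending enum order, teardown callbacks in descending order, and only states after the STARTING section may fail and trigger a rollback. The state names and the fail_virt_enable knob are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced state list; enum order is call order. */
enum hp_state {
	HP_TIMER_STARTING,
	HP_ARCH_IRQ_STARTING,	/* think: arch-specific STARTING states */
	HP_ONLINE_BOUNDARY,	/* states after this one may fail */
	HP_VIRT_ONLINE,		/* think: the moved KVM state */
	HP_SCHED_ACTIVE,
	HP_NR_STATES
};

static enum hp_state trace[2 * HP_NR_STATES];	/* callbacks, in call order */
static size_t trace_len;
static int fail_virt_enable;	/* set to 1 to simulate an enable failure */

static void record(enum hp_state s)
{
	assert(trace_len < sizeof(trace) / sizeof(trace[0]));
	trace[trace_len++] = s;
}

static int startup(enum hp_state s)
{
	record(s);
	if (s == HP_VIRT_ONLINE && fail_virt_enable)
		return -1;
	return 0;
}

static void teardown(enum hp_state s)
{
	record(s);
}

/* Bring a CPU up state by state; undo completed states on failure. */
static int cpu_up(void)
{
	int s;

	trace_len = 0;
	for (s = 0; s < HP_NR_STATES; s++) {
		if (startup((enum hp_state)s) < 0) {
			while (--s >= 0)
				teardown((enum hp_state)s);
			return -1;
		}
	}
	return 0;
}
```

In this model, anything that used to run after the virtualization callback but still sits in the STARTING section now runs before it, which is the kind of reordering the ARM states are exposed to.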


* Re: [PATCH v2 05/19] KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section
  2022-09-01  6:18   ` Chao Gao
@ 2022-09-01 10:58     ` Huang, Kai
  2022-09-01 16:52       ` Isaku Yamahata
  0 siblings, 1 reply; 30+ messages in thread
From: Huang, Kai @ 2022-09-01 10:58 UTC (permalink / raw)
  To: Gao, Chao, Yamahata, Isaku
  Cc: tglx@linutronix.de, kvm@vger.kernel.org, pbonzini@redhat.com,
	linux-kernel@vger.kernel.org, isaku.yamahata@gmail.com,
	will@kernel.org, Christopherson, Sean

On Thu, 2022-09-01 at 14:18 +0800, Gao, Chao wrote:
> On Tue, Aug 30, 2022 at 05:01:20AM -0700, isaku.yamahata@intel.com wrote:
> > From: Chao Gao <chao.gao@intel.com>
> > 
> > The CPU STARTING section doesn't allow callbacks to fail. Move KVM's
> > hotplug callback to ONLINE section so that it can abort onlining a CPU in
> > certain cases to avoid potentially breaking VMs running on existing CPUs.
> > For example, when kvm fails to enable hardware virtualization on the
> > hotplugged CPU.
> > 
> > Place KVM's hotplug state before CPUHP_AP_SCHED_WAIT_EMPTY as it ensures
> > when offlining a CPU, all user tasks and non-pinned kernel tasks have left
> > the CPU, i.e. there cannot be a vCPU task around. So, it is safe for KVM's
> > CPU offline callback to disable hardware virtualization at that point.
> > Likewise, KVM's online callback can enable hardware virtualization before
> > any vCPU task gets a chance to run on hotplugged CPUs.
> > 
> > KVM's CPU hotplug callbacks are renamed as well.
> > 
> > Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> > Signed-off-by: Chao Gao <chao.gao@intel.com>
> 
> Isaku, your signed-off-by is missing.
> 
> > Link: https://lore.kernel.org/r/20220216031528.92558-6-chao.gao@intel.com
> > ---
> > include/linux/cpuhotplug.h |  2 +-
> > virt/kvm/kvm_main.c        | 30 ++++++++++++++++++++++--------
> > 2 files changed, 23 insertions(+), 9 deletions(-)
> > 
> > diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
> > index f61447913db9..7972bd63e0cb 100644
> > --- a/include/linux/cpuhotplug.h
> > +++ b/include/linux/cpuhotplug.h
> > @@ -185,7 +185,6 @@ enum cpuhp_state {
> > 	CPUHP_AP_CSKY_TIMER_STARTING,
> > 	CPUHP_AP_TI_GP_TIMER_STARTING,
> > 	CPUHP_AP_HYPERV_TIMER_STARTING,
> > -	CPUHP_AP_KVM_STARTING,
> > 	CPUHP_AP_KVM_ARM_VGIC_INIT_STARTING,
> > 	CPUHP_AP_KVM_ARM_VGIC_STARTING,
> > 	CPUHP_AP_KVM_ARM_TIMER_STARTING,
> 
> The movement of CPUHP_AP_KVM_STARTING changes the ordering between
> CPUHP_AP_KVM_STARTING and CPUHP_AP_KVM_ARM_* above [1]. We need
> the patch [2] from Marc to avoid breaking ARM.
> 
> [1] https://lore.kernel.org/lkml/87sfsq4xy8.wl-maz@kernel.org/
> [2] https://lore.kernel.org/lkml/20220216031528.92558-5-chao.gao@intel.com/

How about Isaku just takes your series directly (adding his SoB) and adds any
additional patches on top?

-- 
Thanks,
-Kai




* Re: [PATCH v2 01/19] KVM: x86: Drop kvm_user_return_msr_cpu_online()
  2022-09-01  5:29   ` Chao Gao
@ 2022-09-01 14:12     ` Sean Christopherson
  2022-09-01 17:49       ` Isaku Yamahata
  0 siblings, 1 reply; 30+ messages in thread
From: Sean Christopherson @ 2022-09-01 14:12 UTC (permalink / raw)
  To: Chao Gao
  Cc: isaku.yamahata, kvm, linux-kernel, isaku.yamahata, Paolo Bonzini,
	Kai Huang, Will Deacon

On Thu, Sep 01, 2022, Chao Gao wrote:
> On Tue, Aug 30, 2022 at 05:01:16AM -0700, isaku.yamahata@intel.com wrote:
> >From: Isaku Yamahata <isaku.yamahata@intel.com>
> >
> >KVM/X86 uses user return notifier to switch MSR for guest or user space.
> >Snapshot host values on CPU online, change MSR values for guest, and
> >restore them on returning to user space.  The current code abuses
> >kvm_arch_hardware_enable() which is called on kvm module initialization or
> >CPU online.
> >
> >Remove this abuse of kvm_arch_hardware_enable() by capturing the host
> >value on the first change of the MSR value for the guest instead of on
> >CPU online.
> >
> >Suggested-by: Sean Christopherson <seanjc@google.com>
> >Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> >---
> > arch/x86/kvm/x86.c | 43 ++++++++++++++++++++++++-------------------
> > 1 file changed, 24 insertions(+), 19 deletions(-)
> >
> >diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> >index 205ebdc2b11b..16104a2f7d8e 100644
> >--- a/arch/x86/kvm/x86.c
> >+++ b/arch/x86/kvm/x86.c
> >@@ -200,6 +200,7 @@ struct kvm_user_return_msrs {
> > 	struct kvm_user_return_msr_values {
> > 		u64 host;
> > 		u64 curr;
> >+		bool initialized;
> > 	} values[KVM_MAX_NR_USER_RETURN_MSRS];
> 
> The benefit of having an "initialized" state for each user return MSR on
> each CPU is small. A per-CPU state looks sufficient. With it, you can keep
> kvm_user_return_msr_cpu_online() and simply call the function from
> kvm_set_user_return_msr() if initialized is false on the current CPU.

Yep, a per-CPU flag is what I intended.  This is the completely untested patch
that's sitting in a development branch of mine.

---
 arch/x86/kvm/x86.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index eca76f187e4b..1328326acfae 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -194,6 +194,7 @@ module_param(eager_page_split, bool, 0644);
 
 struct kvm_user_return_msrs {
 	struct user_return_notifier urn;
+	bool initialized;
 	bool registered;
 	struct kvm_user_return_msr_values {
 		u64 host;
@@ -400,18 +401,20 @@ int kvm_find_user_return_msr(u32 msr)
 	return -1;
 }
 
-static void kvm_user_return_msr_cpu_online(void)
+static void kvm_user_return_msr_init_cpu(struct kvm_user_return_msrs *msrs)
 {
-	unsigned int cpu = smp_processor_id();
-	struct kvm_user_return_msrs *msrs = per_cpu_ptr(user_return_msrs, cpu);
 	u64 value;
 	int i;
 
+	if (msrs->initialized)
+		return;
+
 	for (i = 0; i < kvm_nr_uret_msrs; ++i) {
 		rdmsrl_safe(kvm_uret_msrs_list[i], &value);
 		msrs->values[i].host = value;
 		msrs->values[i].curr = value;
 	}
+	msrs->initialized = true;
 }
 
 int kvm_set_user_return_msr(unsigned slot, u64 value, u64 mask)
@@ -420,6 +423,8 @@ int kvm_set_user_return_msr(unsigned slot, u64 value, u64 mask)
 	struct kvm_user_return_msrs *msrs = per_cpu_ptr(user_return_msrs, cpu);
 	int err;
 
+	kvm_user_return_msr_init_cpu(msrs);
+
 	value = (value & mask) | (msrs->values[slot].host & ~mask);
 	if (value == msrs->values[slot].curr)
 		return 0;
@@ -11740,7 +11745,6 @@ int kvm_arch_hardware_enable(void)
 	u64 max_tsc = 0;
 	bool stable, backwards_tsc = false;
 
-	kvm_user_return_msr_cpu_online();
 	ret = static_call(kvm_x86_hardware_enable)();
 	if (ret != 0)
 		return ret;

base-commit: a8f21d1980fbd7e877ed174142f7f572d547e611
-- 



* Re: [PATCH v2 05/19] KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section
  2022-09-01 10:58     ` Huang, Kai
@ 2022-09-01 16:52       ` Isaku Yamahata
  0 siblings, 0 replies; 30+ messages in thread
From: Isaku Yamahata @ 2022-09-01 16:52 UTC (permalink / raw)
  To: Huang, Kai
  Cc: Gao, Chao, Yamahata, Isaku, tglx@linutronix.de,
	kvm@vger.kernel.org, pbonzini@redhat.com,
	linux-kernel@vger.kernel.org, isaku.yamahata@gmail.com,
	will@kernel.org, Christopherson, Sean

On Thu, Sep 01, 2022 at 10:58:04AM +0000,
"Huang, Kai" <kai.huang@intel.com> wrote:

> On Thu, 2022-09-01 at 14:18 +0800, Gao, Chao wrote:
> > On Tue, Aug 30, 2022 at 05:01:20AM -0700, isaku.yamahata@intel.com wrote:
> > > From: Chao Gao <chao.gao@intel.com>
> > > 
> > > The CPU STARTING section doesn't allow callbacks to fail. Move KVM's
> > > hotplug callback to ONLINE section so that it can abort onlining a CPU in
> > > certain cases to avoid potentially breaking VMs running on existing CPUs.
> > > For example, when kvm fails to enable hardware virtualization on the
> > > hotplugged CPU.
> > > 
> > > Place KVM's hotplug state before CPUHP_AP_SCHED_WAIT_EMPTY as it ensures
> > > when offlining a CPU, all user tasks and non-pinned kernel tasks have left
> > > the CPU, i.e. there cannot be a vCPU task around. So, it is safe for KVM's
> > > CPU offline callback to disable hardware virtualization at that point.
> > > Likewise, KVM's online callback can enable hardware virtualization before
> > > any vCPU task gets a chance to run on hotplugged CPUs.
> > > 
> > > KVM's CPU hotplug callbacks are renamed as well.
> > > 
> > > Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> > > Signed-off-by: Chao Gao <chao.gao@intel.com>
> > 
> > Isaku, your signed-off-by is missing.
> > 
> > > Link: https://lore.kernel.org/r/20220216031528.92558-6-chao.gao@intel.com
> > > ---
> > > include/linux/cpuhotplug.h |  2 +-
> > > virt/kvm/kvm_main.c        | 30 ++++++++++++++++++++++--------
> > > 2 files changed, 23 insertions(+), 9 deletions(-)
> > > 
> > > diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
> > > index f61447913db9..7972bd63e0cb 100644
> > > --- a/include/linux/cpuhotplug.h
> > > +++ b/include/linux/cpuhotplug.h
> > > @@ -185,7 +185,6 @@ enum cpuhp_state {
> > > 	CPUHP_AP_CSKY_TIMER_STARTING,
> > > 	CPUHP_AP_TI_GP_TIMER_STARTING,
> > > 	CPUHP_AP_HYPERV_TIMER_STARTING,
> > > -	CPUHP_AP_KVM_STARTING,
> > > 	CPUHP_AP_KVM_ARM_VGIC_INIT_STARTING,
> > > 	CPUHP_AP_KVM_ARM_VGIC_STARTING,
> > > 	CPUHP_AP_KVM_ARM_TIMER_STARTING,
> > 
> > The movement of CPUHP_AP_KVM_STARTING changes the ordering between
> > CPUHP_AP_KVM_STARTING and CPUHP_AP_KVM_ARM_* above [1]. We need
> > the patch [2] from Marc to avoid breaking ARM.
> > 
> > [1] https://lore.kernel.org/lkml/87sfsq4xy8.wl-maz@kernel.org/
> > [2] https://lore.kernel.org/lkml/20220216031528.92558-5-chao.gao@intel.com/
> 
> How about Isaku just takes your series directly (adding his SoB) and adds any
> additional patches on top?

OK, will do.  Although I hoped to slim it down, I've ended up taking most of
it, four out of six patches.  So why not the other two as well.
-- 
Isaku Yamahata <isaku.yamahata@gmail.com>


* Re: [PATCH v2 01/19] KVM: x86: Drop kvm_user_return_msr_cpu_online()
  2022-09-01 14:12     ` Sean Christopherson
@ 2022-09-01 17:49       ` Isaku Yamahata
  0 siblings, 0 replies; 30+ messages in thread
From: Isaku Yamahata @ 2022-09-01 17:49 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Chao Gao, isaku.yamahata, kvm, linux-kernel, isaku.yamahata,
	Paolo Bonzini, Kai Huang, Will Deacon

On Thu, Sep 01, 2022 at 02:12:56PM +0000,
Sean Christopherson <seanjc@google.com> wrote:

> On Thu, Sep 01, 2022, Chao Gao wrote:
> > On Tue, Aug 30, 2022 at 05:01:16AM -0700, isaku.yamahata@intel.com wrote:
> > >From: Isaku Yamahata <isaku.yamahata@intel.com>
> > >
> > >KVM/X86 uses user return notifier to switch MSR for guest or user space.
> > >Snapshot host values on CPU online, change MSR values for guest, and
> > >restore them on returning to user space.  The current code abuses
> > >kvm_arch_hardware_enable() which is called on kvm module initialization or
> > >CPU online.
> > >
> > >Remove this abuse of kvm_arch_hardware_enable() by capturing the host
> > >value on the first change of the MSR value for the guest instead of on
> > >CPU online.
> > >
> > >Suggested-by: Sean Christopherson <seanjc@google.com>
> > >Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> > >---
> > > arch/x86/kvm/x86.c | 43 ++++++++++++++++++++++++-------------------
> > > 1 file changed, 24 insertions(+), 19 deletions(-)
> > >
> > >diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > >index 205ebdc2b11b..16104a2f7d8e 100644
> > >--- a/arch/x86/kvm/x86.c
> > >+++ b/arch/x86/kvm/x86.c
> > >@@ -200,6 +200,7 @@ struct kvm_user_return_msrs {
> > > 	struct kvm_user_return_msr_values {
> > > 		u64 host;
> > > 		u64 curr;
> > >+		bool initialized;
> > > 	} values[KVM_MAX_NR_USER_RETURN_MSRS];
> > 
> > The benefit of having an "initialized" state for each user return MSR on
> > each CPU is small. A per-CPU state looks sufficient. With it, you can keep
> > kvm_user_return_msr_cpu_online() and simply call the function from
> > kvm_set_user_return_msr() if initialized is false on the current CPU.
> 
> Yep, a per-CPU flag is what I intended.  This is the completely untested patch
> that's sitting in a development branch of mine.

With the following fix, it worked.  I'll replace this patch with yours.

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 205ebdc2b11b..0e200fe44b35 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9212,7 +9217,12 @@ int kvm_arch_init(void *opaque)
                return -ENOMEM;
        }
 
-       user_return_msrs = alloc_percpu(struct kvm_user_return_msrs);
+       /*
+        * __GFP_ZERO to ensure the per-CPU msrs->initialized starts false.
+        * See kvm_user_return_msr_init_cpu().
+        */
+       user_return_msrs = alloc_percpu_gfp(struct kvm_user_return_msrs,
+                                           GFP_KERNEL | __GFP_ZERO);
        if (!user_return_msrs) {
                printk(KERN_ERR "kvm: failed to allocate percpu kvm_user_return_msrs\n");
                r = -ENOMEM;
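
The reasoning behind the fix, shown as a tiny userspace analogy (not kernel code): a lazy "initialize once" flag only works if the backing memory is known to start zeroed. The sketch uses calloc() as the zero-filling allocator; struct cpu_state, alloc_cpu_states() and init_once() are names made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct cpu_state {
	bool initialized;		/* must read false before first use */
	unsigned long host_value;
};

/* calloc() zero-fills the array, so every 'initialized' flag reads false. */
static struct cpu_state *alloc_cpu_states(size_t nr_cpus)
{
	return calloc(nr_cpus, sizeof(struct cpu_state));
}

/* Returns true if this call took the snapshot, false if it was a no-op. */
static bool init_once(struct cpu_state *st, unsigned long host_value)
{
	assert(st != NULL);
	if (st->initialized)
		return false;
	st->host_value = host_value;
	st->initialized = true;
	return true;
}
```

With an allocator that does not clear memory, the first init_once() call could see a stale true flag and skip the snapshot entirely.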

-- 
Isaku Yamahata <isaku.yamahata@gmail.com>


* Re: [PATCH v2 04/19] Partially revert "KVM: Pass kvm_init()'s opaque param to additional arch funcs"
  2022-08-30 22:39   ` Huang, Kai
@ 2022-09-01 18:01     ` Isaku Yamahata
  0 siblings, 0 replies; 30+ messages in thread
From: Isaku Yamahata @ 2022-09-01 18:01 UTC (permalink / raw)
  To: Huang, Kai
  Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	Yamahata, Isaku, Gao, Chao, Christopherson, Sean,
	suzuki.poulose@arm.com, isaku.yamahata@gmail.com, will@kernel.org,
	anup@brainfault.org, pbonzini@redhat.com, imbrenda@linux.ibm.com

On Tue, Aug 30, 2022 at 10:39:48PM +0000,
"Huang, Kai" <kai.huang@intel.com> wrote:

> On Tue, 2022-08-30 at 05:01 -0700, isaku.yamahata@intel.com wrote:
> > From: Chao Gao <chao.gao@intel.com>
> > 
> > This partially reverts commit b99040853738 ("KVM: Pass kvm_init()'s opaque
> > param to additional arch funcs") to remove opaque from
> > kvm_arch_check_processor_compat() because no one uses this opaque now.
> > Address conflicts for ARM (due to file movement) and manually handle RISC-V
> > which comes after the commit.  The changes to kvm_arch_hardware_setup()
> > in the original commit are still needed, so they are not reverted.
> > 
> > The current implementation enables hardware (e.g. enable VMX on all CPUs),
> > arch-specific initialization for the first VM creation, and disables
> > hardware (in x86, disable VMX on all CPUs) for last VM destruction.
> > 
> > To support TDX, hardware_enable_all() will be done during module loading
> > time.  As a result, CPU compatibility check will be opportunistically moved
> > to hardware_enable_nolock(), which doesn't take any argument.  Instead of
> > passing 'opaque' around to hardware_enable_nolock() and
> > hardware_enable_all(), just remove the unused 'opaque' argument from
> > kvm_arch_check_processor_compat().
> 
> This patch is no longer part of the TDX series, so it doesn't make a lot of
> sense to keep the last two paragraphs here (because the purpose is different).
> I think you can just use Chao's original patch.

Ok.
-- 
Isaku Yamahata <isaku.yamahata@gmail.com>


Thread overview: 30+ messages
2022-08-30 12:01 [PATCH v2 00/19] KVM hardware enable/disable reorganize isaku.yamahata
2022-08-30 12:01 ` [PATCH v2 01/19] KVM: x86: Drop kvm_user_return_msr_cpu_online() isaku.yamahata
2022-09-01  5:29   ` Chao Gao
2022-09-01 14:12     ` Sean Christopherson
2022-09-01 17:49       ` Isaku Yamahata
2022-08-30 12:01 ` [PATCH v2 02/19] KVM: x86: Use this_cpu_ptr() instead of per_cpu_ptr(smp_processor_id()) isaku.yamahata
2022-09-01  5:56   ` Chao Gao
2022-08-30 12:01 ` [PATCH v2 03/19] KVM: x86: Move check_processor_compatibility from init ops to runtime ops isaku.yamahata
2022-08-30 12:01 ` [PATCH v2 04/19] Partially revert "KVM: Pass kvm_init()'s opaque param to additional arch funcs" isaku.yamahata
2022-08-30 22:39   ` Huang, Kai
2022-09-01 18:01     ` Isaku Yamahata
2022-08-30 12:01 ` [PATCH v2 05/19] KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section isaku.yamahata
2022-09-01  5:59   ` Chao Gao
2022-09-01  6:18   ` Chao Gao
2022-09-01 10:58     ` Huang, Kai
2022-09-01 16:52       ` Isaku Yamahata
2022-08-30 12:01 ` [PATCH v2 06/19] KVM: Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock isaku.yamahata
2022-08-30 12:01 ` [PATCH v2 07/19] KVM: Add arch hooks for PM events with empty stub isaku.yamahata
2022-08-30 12:01 ` [PATCH v2 08/19] KVM: x86: Move TSC fixup logic to KVM arch resume callback isaku.yamahata
2022-08-30 12:01 ` [PATCH v2 09/19] KVM: Add arch hook when VM is added/deleted isaku.yamahata
2022-08-30 12:01 ` [PATCH v2 10/19] KVM: Move out KVM arch PM hooks and hardware enable/disable logic isaku.yamahata
2022-08-30 12:01 ` [PATCH v2 11/19] KVM: kvm_arch.c: Remove _nolock post fix isaku.yamahata
2022-08-30 12:01 ` [PATCH v2 12/19] KVM: kvm_arch.c: Remove a global variable, hardware_enable_failed isaku.yamahata
2022-08-30 12:01 ` [PATCH v2 13/19] KVM: Do processor compatibility check on cpu online and resume isaku.yamahata
2022-08-30 12:01 ` [PATCH v2 14/19] KVM: x86: Duplicate arch callbacks related to pm events isaku.yamahata
2022-08-30 12:01 ` [PATCH v2 15/19] KVM: Eliminate kvm_arch_post_init_vm() isaku.yamahata
2022-08-30 12:01 ` [PATCH v2 16/19] KVM: x86: Delete kvm_arch_hardware_enable/disable() isaku.yamahata
2022-08-30 12:01 ` [PATCH v2 17/19] KVM: Add config to not compile kvm_arch.c isaku.yamahata
2022-08-30 12:01 ` [PATCH v2 18/19] RFC: KVM: x86: Remove cpus_hardware_enabled and related sanity check isaku.yamahata
2022-08-30 12:01 ` [PATCH v2 19/19] RFC: KVM: " isaku.yamahata
