linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>,
	Marc Zyngier <maz@kernel.org>,
	 Huacai Chen <chenhuacai@kernel.org>,
	 Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>,
	Anup Patel <anup@brainfault.org>,
	 Paul Walmsley <paul.walmsley@sifive.com>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	 Albert Ou <aou@eecs.berkeley.edu>,
	Christian Borntraeger <borntraeger@linux.ibm.com>,
	 Janosch Frank <frankja@linux.ibm.com>,
	Claudio Imbrenda <imbrenda@linux.ibm.com>,
	 Matthew Rosato <mjrosato@linux.ibm.com>,
	Eric Farman <farman@linux.ibm.com>,
	 Sean Christopherson <seanjc@google.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: kvm@vger.kernel.org, David Hildenbrand <david@redhat.com>,
	Atish Patra <atishp@atishpatra.org>,
	linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org,
	kvmarm@lists.cs.columbia.edu, linux-s390@vger.kernel.org,
	Chao Gao <chao.gao@intel.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Yuan Yao <yuan.yao@intel.com>,
	kvmarm@lists.linux.dev, Thomas Gleixner <tglx@linutronix.de>,
	Alexandru Elisei <alexandru.elisei@arm.com>,
	linux-arm-kernel@lists.infradead.org,
	Isaku Yamahata <isaku.yamahata@intel.com>,
	Fabiano Rosas <farosas@linux.ibm.com>,
	linux-mips@vger.kernel.org, Oliver Upton <oliver.upton@linux.dev>,
	James Morse <james.morse@arm.com>,
	kvm-riscv@lists.infradead.org, linuxppc-dev@lists.ozlabs.org
Subject: [PATCH 39/44] KVM: Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock
Date: Wed,  2 Nov 2022 23:19:06 +0000	[thread overview]
Message-ID: <20221102231911.3107438-40-seanjc@google.com> (raw)
In-Reply-To: <20221102231911.3107438-1-seanjc@google.com>

From: Isaku Yamahata <isaku.yamahata@intel.com>

Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock now
that KVM hooks CPU hotplug during the ONLINE phase, which can sleep.
Previously, KVM hooked the STARTING phase, which is not allowed to sleep
and thus could not take kvm_lock (a mutex).

Explicitly disable preemptions/IRQs in the CPU hotplug paths as needed to
keep arch code happy, e.g. x86 expects IRQs to be disabled during hardware
enabling, and expects preemption to be disabled during hardware disabling.
There are no preemption/interrupt concerns in the hotplug path, i.e. the
extra disabling is done purely to allow x86 to keep its sanity checks,
which are targeted primiarily at the "enable/disable all" paths.

Opportunistically update KVM's locking documentation.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 Documentation/virt/kvm/locking.rst | 18 ++++++------
 virt/kvm/kvm_main.c                | 44 +++++++++++++++++++++---------
 2 files changed, 40 insertions(+), 22 deletions(-)

diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
index 845a561629f1..4feaf527575b 100644
--- a/Documentation/virt/kvm/locking.rst
+++ b/Documentation/virt/kvm/locking.rst
@@ -9,6 +9,8 @@ KVM Lock Overview
 
 The acquisition orders for mutexes are as follows:
 
+- cpus_read_lock() is taken outside kvm_lock
+
 - kvm->lock is taken outside vcpu->mutex
 
 - kvm->lock is taken outside kvm->slots_lock and kvm->irq_lock
@@ -29,6 +31,8 @@ The acquisition orders for mutexes are as follows:
 
 On x86:
 
+- kvm_lock is taken outside kvm->mmu_lock
+
 - vcpu->mutex is taken outside kvm->arch.hyperv.hv_lock
 
 - kvm->arch.mmu_lock is an rwlock.  kvm->arch.tdp_mmu_pages_lock and
@@ -216,15 +220,11 @@ time it will be set using the Dirty tracking mechanism described above.
 :Type:		mutex
 :Arch:		any
 :Protects:	- vm_list
-
-``kvm_count_lock``
-^^^^^^^^^^^^^^^^^^
-
-:Type:		raw_spinlock_t
-:Arch:		any
-:Protects:	- hardware virtualization enable/disable
-:Comment:	'raw' because hardware enabling/disabling must be atomic /wrt
-		migration.
+		- kvm_usage_count
+		- hardware virtualization enable/disable
+		- module probing (x86 only)
+:Comment:	KVM also disables CPU hotplug via cpus_read_lock() during
+		enable/disable.
 
 ``kvm->mn_invalidate_lock``
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 4e765ef9f4bd..c8d92e6c3922 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -100,7 +100,6 @@ EXPORT_SYMBOL_GPL(halt_poll_ns_shrink);
  */
 
 DEFINE_MUTEX(kvm_lock);
-static DEFINE_RAW_SPINLOCK(kvm_count_lock);
 LIST_HEAD(vm_list);
 
 static cpumask_var_t cpus_hardware_enabled;
@@ -5028,9 +5027,10 @@ static void hardware_enable_nolock(void *junk)
 
 static int kvm_online_cpu(unsigned int cpu)
 {
+	unsigned long flags;
 	int ret = 0;
 
-	raw_spin_lock(&kvm_count_lock);
+	mutex_lock(&kvm_lock);
 	/*
 	 * Abort the CPU online process if hardware virtualization cannot
 	 * be enabled. Otherwise running VMs would encounter unrecoverable
@@ -5039,13 +5039,16 @@ static int kvm_online_cpu(unsigned int cpu)
 	if (kvm_usage_count) {
 		WARN_ON_ONCE(atomic_read(&hardware_enable_failed));
 
+		local_irq_save(flags);
 		hardware_enable_nolock(NULL);
+		local_irq_restore(flags);
+
 		if (atomic_read(&hardware_enable_failed)) {
 			atomic_set(&hardware_enable_failed, 0);
 			ret = -EIO;
 		}
 	}
-	raw_spin_unlock(&kvm_count_lock);
+	mutex_unlock(&kvm_lock);
 	return ret;
 }
 
@@ -5061,10 +5064,13 @@ static void hardware_disable_nolock(void *junk)
 
 static int kvm_offline_cpu(unsigned int cpu)
 {
-	raw_spin_lock(&kvm_count_lock);
-	if (kvm_usage_count)
+	mutex_lock(&kvm_lock);
+	if (kvm_usage_count) {
+		preempt_disable();
 		hardware_disable_nolock(NULL);
-	raw_spin_unlock(&kvm_count_lock);
+		preempt_enable();
+	}
+	mutex_unlock(&kvm_lock);
 	return 0;
 }
 
@@ -5079,9 +5085,11 @@ static void hardware_disable_all_nolock(void)
 
 static void hardware_disable_all(void)
 {
-	raw_spin_lock(&kvm_count_lock);
+	cpus_read_lock();
+	mutex_lock(&kvm_lock);
 	hardware_disable_all_nolock();
-	raw_spin_unlock(&kvm_count_lock);
+	mutex_unlock(&kvm_lock);
+	cpus_read_unlock();
 }
 
 static int hardware_enable_all(void)
@@ -5097,7 +5105,7 @@ static int hardware_enable_all(void)
 	 * Disable CPU hotplug to prevent scenarios where KVM sees
 	 */
 	cpus_read_lock();
-	raw_spin_lock(&kvm_count_lock);
+	mutex_lock(&kvm_lock);
 
 	kvm_usage_count++;
 	if (kvm_usage_count == 1) {
@@ -5110,7 +5118,7 @@ static int hardware_enable_all(void)
 		}
 	}
 
-	raw_spin_unlock(&kvm_count_lock);
+	mutex_unlock(&kvm_lock);
 	cpus_read_unlock();
 
 	return r;
@@ -5716,6 +5724,15 @@ static void kvm_init_debug(void)
 
 static int kvm_suspend(void)
 {
+	/*
+	 * Secondary CPUs and CPU hotplug are disabled across the suspend/resume
+	 * callbacks, i.e. no need to acquire kvm_lock to ensure the usage count
+	 * is stable.  Assert that kvm_lock is not held as a paranoid sanity
+	 * check that the system isn't suspended when KVM is enabling hardware.
+	 */
+	lockdep_assert_not_held(&kvm_lock);
+	lockdep_assert_irqs_disabled();
+
 	if (kvm_usage_count)
 		hardware_disable_nolock(NULL);
 	return 0;
@@ -5723,10 +5740,11 @@ static int kvm_suspend(void)
 
 static void kvm_resume(void)
 {
-	if (kvm_usage_count) {
-		lockdep_assert_not_held(&kvm_count_lock);
+	lockdep_assert_not_held(&kvm_lock);
+	lockdep_assert_irqs_disabled();
+
+	if (kvm_usage_count)
 		hardware_enable_nolock(NULL);
-	}
 }
 
 static struct syscore_ops kvm_syscore_ops = {
-- 
2.38.1.431.g37b22c650d-goog


  parent reply	other threads:[~2022-11-02 23:55 UTC|newest]

Thread overview: 127+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-02 23:18 [PATCH 00/44] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
2022-11-02 23:18 ` [PATCH 01/44] KVM: Register /dev/kvm as the _very_ last thing during initialization Sean Christopherson
2022-11-02 23:18 ` [PATCH 02/44] KVM: Initialize IRQ FD after arch hardware setup Sean Christopherson
2022-11-04  0:41   ` Chao Gao
2022-11-04 20:15     ` Sean Christopherson
2022-11-02 23:18 ` [PATCH 03/44] KVM: Allocate cpus_hardware_enabled " Sean Christopherson
2022-11-04  5:37   ` Yuan Yao
2022-11-02 23:18 ` [PATCH 04/44] KVM: Teardown VFIO ops earlier in kvm_exit() Sean Christopherson
2022-11-03 12:46   ` Cornelia Huck
2022-11-07 17:56   ` Eric Farman
2022-11-02 23:18 ` [PATCH 05/44] KVM: s390: Unwind kvm_arch_init() piece-by-piece() if a step fails Sean Christopherson
2022-11-07 17:57   ` Eric Farman
2022-11-02 23:18 ` [PATCH 06/44] KVM: s390: Move hardware setup/unsetup to init/exit Sean Christopherson
2022-11-07 17:58   ` Eric Farman
2022-11-02 23:18 ` [PATCH 07/44] KVM: x86: Do timer initialization after XCR0 configuration Sean Christopherson
2022-11-02 23:18 ` [PATCH 08/44] KVM: x86: Move hardware setup/unsetup to init/exit Sean Christopherson
2022-11-04  6:22   ` Yuan Yao
2022-11-04 16:31     ` Sean Christopherson
2022-11-02 23:18 ` [PATCH 09/44] KVM: Drop arch hardware (un)setup hooks Sean Christopherson
2022-11-07  3:01   ` Anup Patel
2022-11-07 18:22   ` Eric Farman
2022-11-02 23:18 ` [PATCH 10/44] KVM: VMX: Clean up eVMCS enabling if KVM initialization fails Sean Christopherson
2022-11-03 14:01   ` Paolo Bonzini
2022-11-03 14:04     ` Paolo Bonzini
2022-11-03 14:28   ` Vitaly Kuznetsov
2022-11-11  1:38     ` Sean Christopherson
2022-11-15  9:30       ` Vitaly Kuznetsov
2022-11-02 23:18 ` [PATCH 11/44] KVM: x86: Move guts of kvm_arch_init() to standalone helper Sean Christopherson
2022-11-02 23:18 ` [PATCH 12/44] KVM: VMX: Do _all_ initialization before exposing /dev/kvm to userspace Sean Christopherson
2022-11-02 23:18 ` [PATCH 13/44] KVM: x86: Serialize vendor module initialization (hardware setup) Sean Christopherson
2022-11-16  1:46   ` Huang, Kai
2022-11-16 15:52     ` Sean Christopherson
2022-11-02 23:18 ` [PATCH 14/44] KVM: arm64: Simplify the CPUHP logic Sean Christopherson
2022-11-02 23:18 ` [PATCH 15/44] KVM: arm64: Free hypervisor allocations if vector slot init fails Sean Christopherson
2022-11-02 23:18 ` [PATCH 16/44] KVM: arm64: Unregister perf callbacks if hypervisor finalization fails Sean Christopherson
2022-11-02 23:18 ` [PATCH 17/44] KVM: arm64: Do arm/arch initialiation without bouncing through kvm_init() Sean Christopherson
2022-11-03  7:25   ` Philippe Mathieu-Daudé
2022-11-03 15:29     ` Sean Christopherson
2022-11-02 23:18 ` [PATCH 18/44] KVM: arm64: Mark kvm_arm_init() and its unique descendants as __init Sean Christopherson
2022-11-02 23:18 ` [PATCH 19/44] KVM: MIPS: Hardcode callbacks to hardware virtualization extensions Sean Christopherson
2022-11-02 23:18 ` [PATCH 20/44] KVM: MIPS: Setup VZ emulation? directly from kvm_mips_init() Sean Christopherson
2022-11-03  7:10   ` Philippe Mathieu-Daudé
2022-11-02 23:18 ` [PATCH 21/44] KVM: MIPS: Register die notifier prior to kvm_init() Sean Christopherson
2022-11-03  7:12   ` Philippe Mathieu-Daudé
2022-11-02 23:18 ` [PATCH 22/44] KVM: RISC-V: Do arch init directly in riscv_kvm_init() Sean Christopherson
2022-11-03  7:14   ` Philippe Mathieu-Daudé
2022-11-07  3:05   ` Anup Patel
2022-11-02 23:18 ` [PATCH 23/44] KVM: RISC-V: Tag init functions and data with __init, __ro_after_init Sean Christopherson
2022-11-07  3:10   ` Anup Patel
2022-11-02 23:18 ` [PATCH 24/44] KVM: PPC: Move processor compatibility check to module init Sean Christopherson
2022-11-02 23:18 ` [PATCH 25/44] KVM: s390: Do s390 specific init without bouncing through kvm_init() Sean Christopherson
2022-11-03  7:16   ` Philippe Mathieu-Daudé
2022-11-03 12:44   ` Claudio Imbrenda
2022-11-03 13:21     ` Claudio Imbrenda
2022-11-07 18:22   ` Eric Farman
2022-11-02 23:18 ` [PATCH 26/44] KVM: s390: Mark __kvm_s390_init() and its descendants as __init Sean Christopherson
2022-11-07 18:22   ` Eric Farman
2022-11-02 23:18 ` [PATCH 27/44] KVM: Drop kvm_arch_{init,exit}() hooks Sean Christopherson
2022-11-03  7:18   ` Philippe Mathieu-Daudé
2022-11-07  3:13   ` Anup Patel
2022-11-07 19:08   ` Eric Farman
2022-11-02 23:18 ` [PATCH 28/44] KVM: VMX: Make VMCS configuration/capabilities structs read-only after init Sean Christopherson
2022-11-02 23:18 ` [PATCH 29/44] KVM: x86: Do CPU compatibility checks in x86 code Sean Christopherson
2022-11-02 23:18 ` [PATCH 30/44] KVM: Drop kvm_arch_check_processor_compat() hook Sean Christopherson
2022-11-03  7:20   ` Philippe Mathieu-Daudé
2022-11-07  3:16   ` Anup Patel
2022-11-07 19:08   ` Eric Farman
2022-11-02 23:18 ` [PATCH 31/44] KVM: x86: Use KBUILD_MODNAME to specify vendor module name Sean Christopherson
2022-11-02 23:18 ` [PATCH 32/44] KVM: x86: Unify pr_fmt to use module name for all KVM modules Sean Christopherson
2022-11-10  7:31   ` Robert Hoo
2022-11-10 16:50     ` Sean Christopherson
2022-11-30 23:02       ` Sean Christopherson
2022-12-01  1:34         ` Robert Hoo
2022-11-02 23:19 ` [PATCH 33/44] KVM: x86: Do VMX/SVM support checks directly in vendor code Sean Christopherson
2022-11-03 15:08   ` Paolo Bonzini
2022-11-03 18:35     ` Sean Christopherson
2022-11-03 18:46       ` Paolo Bonzini
2022-11-03 18:58         ` Sean Christopherson
2022-11-04  8:02           ` Paolo Bonzini
2022-11-04 15:40             ` Sean Christopherson
2022-11-15 22:50   ` Huang, Kai
2022-11-16  1:56     ` Sean Christopherson
2022-11-02 23:19 ` [PATCH 34/44] KVM: VMX: Shuffle support checks and hardware enabling code around Sean Christopherson
2022-11-02 23:19 ` [PATCH 35/44] KVM: SVM: Check for SVM support in CPU compatibility checks Sean Christopherson
2022-11-02 23:19 ` [PATCH 36/44] KVM: x86: Do compatibility checks when onlining CPU Sean Christopherson
2022-11-03 15:17   ` Paolo Bonzini
2022-11-03 17:44     ` Sean Christopherson
2022-11-03 17:57       ` Paolo Bonzini
2022-11-03 21:04   ` Isaku Yamahata
2022-11-03 22:34     ` Sean Christopherson
2022-11-04  7:18       ` Isaku Yamahata
2022-11-11  0:06         ` Sean Christopherson
2022-11-02 23:19 ` [PATCH 37/44] KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section Sean Christopherson
2022-11-10  7:26   ` Robert Hoo
2022-11-10 16:49     ` Sean Christopherson
2022-11-02 23:19 ` [PATCH 38/44] KVM: Disable CPU hotplug during hardware enabling Sean Christopherson
2022-11-10  1:08   ` Huang, Kai
2022-11-10  2:20     ` Huang, Kai
2022-11-10  1:33   ` Huang, Kai
2022-11-10  2:11     ` Huang, Kai
2022-11-10 16:58       ` Sean Christopherson
2022-11-15 20:16       ` Sean Christopherson
2022-11-15 20:21         ` Sean Christopherson
2022-11-16 12:23         ` Huang, Kai
2022-11-16 17:11           ` Sean Christopherson
2022-11-17  1:39             ` Huang, Kai
2022-11-17 15:16               ` Sean Christopherson
2022-11-02 23:19 ` Sean Christopherson [this message]
2022-11-03 15:23   ` [PATCH 39/44] KVM: Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock Paolo Bonzini
2022-11-03 17:53     ` Sean Christopherson
2022-11-02 23:19 ` [PATCH 40/44] KVM: Remove on_each_cpu(hardware_disable_nolock) in kvm_exit() Sean Christopherson
2022-11-02 23:19 ` [PATCH 41/44] KVM: Use a per-CPU variable to track which CPUs have enabled virtualization Sean Christopherson
2022-11-02 23:19 ` [PATCH 42/44] KVM: Make hardware_enable_failed a local variable in the "enable all" path Sean Christopherson
2022-11-02 23:19 ` [PATCH 43/44] KVM: Register syscore (suspend/resume) ops early in kvm_init() Sean Christopherson
2022-11-02 23:19 ` [PATCH 44/44] KVM: Opt out of generic hardware enabling on s390 and PPC Sean Christopherson
2022-11-07  3:23   ` Anup Patel
2022-11-03 12:08 ` [PATCH 00/44] KVM: Rework kvm_init() and hardware enabling Christian Borntraeger
2022-11-03 15:27   ` Paolo Bonzini
2022-11-04  7:17 ` Isaku Yamahata
2022-11-04  7:59   ` Paolo Bonzini
2022-11-04 20:27   ` Sean Christopherson
2022-11-07 21:46     ` Isaku Yamahata
2022-11-08  1:09       ` Huang, Kai
2022-11-08  5:43         ` Isaku Yamahata
2022-11-08  8:56           ` Huang, Kai
2022-11-08 10:35             ` Huang, Kai
2022-11-08 17:46       ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221102231911.3107438-40-seanjc@google.com \
    --to=seanjc@google.com \
    --cc=aleksandar.qemu.devel@gmail.com \
    --cc=alexandru.elisei@arm.com \
    --cc=anup@brainfault.org \
    --cc=aou@eecs.berkeley.edu \
    --cc=atishp@atishpatra.org \
    --cc=borntraeger@linux.ibm.com \
    --cc=chao.gao@intel.com \
    --cc=chenhuacai@kernel.org \
    --cc=david@redhat.com \
    --cc=farman@linux.ibm.com \
    --cc=farosas@linux.ibm.com \
    --cc=frankja@linux.ibm.com \
    --cc=imbrenda@linux.ibm.com \
    --cc=isaku.yamahata@intel.com \
    --cc=james.morse@arm.com \
    --cc=kvm-riscv@lists.infradead.org \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=kvmarm@lists.linux.dev \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=maz@kernel.org \
    --cc=mjrosato@linux.ibm.com \
    --cc=oliver.upton@linux.dev \
    --cc=palmer@dabbelt.com \
    --cc=paul.walmsley@sifive.com \
    --cc=pbonzini@redhat.com \
    --cc=suzuki.poulose@arm.com \
    --cc=tglx@linutronix.de \
    --cc=vkuznets@redhat.com \
    --cc=yuan.yao@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).