From: "Huang, Kai" <kai.huang@intel.com>
To: "imbrenda@linux.ibm.com" <imbrenda@linux.ibm.com>,
"aou@eecs.berkeley.edu" <aou@eecs.berkeley.edu>,
"Christopherson, Sean" <seanjc@google.com>,
"mjrosato@linux.ibm.com" <mjrosato@linux.ibm.com>,
"vkuznets@redhat.com" <vkuznets@redhat.com>,
"farman@linux.ibm.com" <farman@linux.ibm.com>,
"chenhuacai@kernel.org" <chenhuacai@kernel.org>,
"paul.walmsley@sifive.com" <paul.walmsley@sifive.com>,
"palmer@dabbelt.com" <palmer@dabbelt.com>,
"maz@kernel.org" <maz@kernel.org>,
"anup@brainfault.org" <anup@brainfault.org>,
"pbonzini@redhat.com" <pbonzini@redhat.com>,
"borntraeger@linux.ibm.com" <borntraeger@linux.ibm.com>,
"aleksandar.qemu.devel@gmail.com"
<aleksandar.qemu.devel@gmail.com>,
"frankja@linux.ibm.com" <frankja@linux.ibm.com>
Cc: "oliver.upton@linux.dev" <oliver.upton@linux.dev>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"Yao, Yuan" <yuan.yao@intel.com>,
"farosas@linux.ibm.com" <farosas@linux.ibm.com>,
"david@redhat.com" <david@redhat.com>,
"james.morse@arm.com" <james.morse@arm.com>,
"mpe@ellerman.id.au" <mpe@ellerman.id.au>,
"alexandru.elisei@arm.com" <alexandru.elisei@arm.com>,
"linux-s390@vger.kernel.org" <linux-s390@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"tglx@linutronix.de" <tglx@linutronix.de>,
"Yamahata, Isaku" <isaku.yamahata@intel.com>,
"kvmarm@lists.linux.dev" <kvmarm@lists.linux.dev>,
"suzuki.poulose@arm.com" <suzuki.poulose@arm.com>,
"kvm-riscv@lists.infradead.org" <kvm-riscv@lists.infradead.org>,
"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>,
"linux-arm-kernel@lists.infradead.org"
<linux-arm-kernel@lists.infradead.org>,
"linux-mips@vger.kernel.org" <linux-mips@vger.kernel.org>,
"kvmarm@lists.cs.columbia.edu" <kvmarm@lists.cs.columbia.edu>,
"Gao, Chao" <chao.gao@intel.com>,
"atishp@atishpatra.org" <atishp@atishpatra.org>,
"linux-riscv@lists.infradead.org"
<linux-riscv@lists.infradead.org>
Subject: Re: [PATCH 13/44] KVM: x86: Serialize vendor module initialization (hardware setup)
Date: Wed, 16 Nov 2022 01:46:51 +0000
Message-ID: <e8e3b4c7bf3bd733c626618b57f9bf2f1835770e.camel@intel.com>
In-Reply-To: <20221102231911.3107438-14-seanjc@google.com>
On Wed, 2022-11-02 at 23:18 +0000, Sean Christopherson wrote:
> Acquire a new mutex, vendor_module_lock, in kvm_x86_vendor_init() while
> doing hardware setup to ensure that concurrent calls are fully serialized.
> KVM rejects attempts to load vendor modules if a different module has
> already been loaded, but doesn't handle the case where multiple vendor
> modules are loaded at the same time, and module_init() doesn't run under
> the global module_mutex.
>
> Note, in practice, this is likely a benign bug as no platform exists that
> supports both SVM and VMX, i.e. barring a weird VM setup, one of the
> vendor modules is guaranteed to fail a support check before modifying
> common KVM state.
>
> Alternatively, KVM could perform an atomic CMPXCHG on .hardware_enable,
> but that comes with its own ugliness as it would require setting
> .hardware_enable before success is guaranteed, e.g. attempting to load
> the "wrong" module could result in spurious failure to load the "right"
> module.
>
> Introduce a new mutex as using kvm_lock is extremely deadlock prone due
> to kvm_lock being taken under cpus_write_lock(), and in the future,
> under cpus_read_lock(). Any operation that takes cpus_read_lock() while
> holding kvm_lock would potentially deadlock, e.g. kvm_timer_init() takes
> cpus_read_lock() to register a callback. In theory, KVM could avoid
> such problematic paths, i.e. do less setup under kvm_lock, but avoiding
> all calls to cpus_read_lock() is subtly difficult and thus fragile. E.g.
> updating static calls also acquires cpus_read_lock().
>
> Inverting the lock ordering, i.e. always taking kvm_lock outside
> cpus_read_lock(), is not a viable option, e.g. kvm_online_cpu() takes
> kvm_lock and is called under cpus_write_lock().
"kvm_online_cpu() takes kvm_lock and is called under cpus_write_lock()" hasn't
happened yet.
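
For reference, the ordering concern itself can be sketched in userspace with a
toy order checker. This is only an illustration under my own assumptions:
toy_acquire(), toy_release() and the lock_id enum are made-up names, nothing
here is kernel or lockdep code, and it only catches a direct two-lock
inversion, not longer cycles.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock IDs, named after the two locks discussed above. */
enum lock_id { CPU_HOTPLUG_LOCK, KVM_LOCK, NR_LOCKS };

static bool held[NR_LOCKS];
/* order[a][b] becomes true once lock b has been taken while a was held. */
static bool order[NR_LOCKS][NR_LOCKS];

/*
 * Record a "held -> new" edge for every lock currently held, and return
 * true if taking 'id' now inverts an order that was recorded earlier.
 */
bool toy_acquire(enum lock_id id)
{
	bool inversion = false;
	int i;

	assert(id < NR_LOCKS && !held[id]);

	for (i = 0; i < NR_LOCKS; i++) {
		if (!held[i])
			continue;
		if (order[id][i])	/* 'id' was once held while taking i */
			inversion = true;
		order[i][id] = true;
	}
	held[id] = true;
	return inversion;
}

void toy_release(enum lock_id id)
{
	assert(id < NR_LOCKS && held[id]);
	held[id] = false;
}
```

Taking KVM_LOCK and then CPU_HOTPLUG_LOCK on one path records that order;
a later path that takes CPU_HOTPLUG_LOCK first and then KVM_LOCK makes
toy_acquire() return true, which is the same kind of inversion the changelog
is worried about.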
>
> The lockdep splat below is dependent on future patches to take
> cpus_read_lock() in hardware_enable_all(), but as above, deadlock is
> already possible.
IIUC kvm_lock is by design supposed to protect vm_list, so IMHO it is
naturally not a good fit for serializing vendor module loading.

The argument above looks good enough to me. I am not sure whether we need
additional justification that comes from future patches. :)

Also, do you want to update Documentation/virt/kvm/locking.rst in this
patch?
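
FWIW, the pattern the patch uses (a dedicated mutex wrapped around an
unlocked worker) can be sketched in userspace roughly as below. This is a
pthread-based illustration, not the kernel code: struct vendor_ops,
vendor_init(), vendor_init_locked(), vendor_exit() and the two setup hooks are
hypothetical names, and the error codes are only my choice for the example.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the vendor init ops; not the kernel struct. */
struct vendor_ops {
	const char *name;
	int (*hardware_setup)(void);	/* may fail, e.g. unsupported CPU */
};

/* Dedicated lock: serializes only vendor module load and unload. */
static pthread_mutex_t vendor_module_lock = PTHREAD_MUTEX_INITIALIZER;
static const struct vendor_ops *loaded_vendor;	/* protected by the lock */

/* Unlocked worker; the caller must hold vendor_module_lock. */
static int vendor_init_locked(const struct vendor_ops *ops)
{
	int r;

	if (loaded_vendor)
		return -EEXIST;	/* some vendor module is already loaded */

	r = ops->hardware_setup();
	if (r)
		return r;	/* nothing was published, nothing to undo */

	loaded_vendor = ops;	/* publish only after setup succeeded */
	return 0;
}

int vendor_init(const struct vendor_ops *ops)
{
	int r;

	assert(ops && ops->hardware_setup);

	pthread_mutex_lock(&vendor_module_lock);
	r = vendor_init_locked(ops);
	pthread_mutex_unlock(&vendor_module_lock);
	return r;
}

void vendor_exit(void)
{
	pthread_mutex_lock(&vendor_module_lock);
	loaded_vendor = NULL;
	pthread_mutex_unlock(&vendor_module_lock);
}

/* Example hooks: one that succeeds, one that fails its support check. */
static int setup_ok(void) { return 0; }
static int setup_unsupported(void) { return -ENODEV; }
```

Because the "already loaded" check and the publication both happen under the
same lock, a failed load of the "wrong" module never blocks a later load of
the "right" one, which is the drawback the changelog attributes to the
CMPXCHG alternative.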
>
>
> ======================================================
> WARNING: possible circular locking dependency detected
> 6.0.0-smp--7ec93244f194-init2 #27 Tainted: G O
> ------------------------------------------------------
> stable/251833 is trying to acquire lock:
> ffffffffc097ea28 (kvm_lock){+.+.}-{3:3}, at: hardware_enable_all+0x1f/0xc0 [kvm]
>
> but task is already holding lock:
> ffffffffa2456828 (cpu_hotplug_lock){++++}-{0:0}, at: hardware_enable_all+0xf/0xc0 [kvm]
>
> which lock already depends on the new lock.
>
> the existing dependency chain (in reverse order) is:
>
> -> #1 (cpu_hotplug_lock){++++}-{0:0}:
> cpus_read_lock+0x2a/0xa0
> __cpuhp_setup_state+0x2b/0x60
> __kvm_x86_vendor_init+0x16a/0x1870 [kvm]
> kvm_x86_vendor_init+0x23/0x40 [kvm]
> 0xffffffffc0a4d02b
> do_one_initcall+0x110/0x200
> do_init_module+0x4f/0x250
> load_module+0x1730/0x18f0
> __se_sys_finit_module+0xca/0x100
> __x64_sys_finit_module+0x1d/0x20
> do_syscall_64+0x3d/0x80
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
>
> -> #0 (kvm_lock){+.+.}-{3:3}:
> __lock_acquire+0x16f4/0x30d0
> lock_acquire+0xb2/0x190
> __mutex_lock+0x98/0x6f0
> mutex_lock_nested+0x1b/0x20
> hardware_enable_all+0x1f/0xc0 [kvm]
> kvm_dev_ioctl+0x45e/0x930 [kvm]
> __se_sys_ioctl+0x77/0xc0
> __x64_sys_ioctl+0x1d/0x20
> do_syscall_64+0x3d/0x80
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
>
> other info that might help us debug this:
>
> Possible unsafe locking scenario:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(cpu_hotplug_lock);
>                                lock(kvm_lock);
>                                lock(cpu_hotplug_lock);
>   lock(kvm_lock);
>
> *** DEADLOCK ***
>
> 1 lock held by stable/251833:
> #0: ffffffffa2456828 (cpu_hotplug_lock){++++}-{0:0}, at: hardware_enable_all+0xf/0xc0 [kvm]
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/kvm/x86.c | 18 ++++++++++++++++--
> 1 file changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index a0ca401d3cdf..218707597bea 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -128,6 +128,7 @@ static int kvm_vcpu_do_singlestep(struct kvm_vcpu *vcpu);
> static int __set_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2);
> static void __get_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2);
>
> +static DEFINE_MUTEX(vendor_module_lock);
> struct kvm_x86_ops kvm_x86_ops __read_mostly;
>
> #define KVM_X86_OP(func) \
> @@ -9280,7 +9281,7 @@ void kvm_arch_exit(void)
>
> }
>
> -int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
> +static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
> {
> u64 host_pat;
> int r;
> @@ -9413,6 +9414,17 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
>  	kmem_cache_destroy(x86_emulator_cache);
>  	return r;
>  }
> +
> +int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
> +{
> +	int r;
> +
> +	mutex_lock(&vendor_module_lock);
> +	r = __kvm_x86_vendor_init(ops);
> +	mutex_unlock(&vendor_module_lock);
> +
> +	return r;
> +}
> EXPORT_SYMBOL_GPL(kvm_x86_vendor_init);
>
> void kvm_x86_vendor_exit(void)
> @@ -9435,7 +9447,6 @@ void kvm_x86_vendor_exit(void)
> cancel_work_sync(&pvclock_gtod_work);
> #endif
> static_call(kvm_x86_hardware_unsetup)();
> - kvm_x86_ops.hardware_enable = NULL;
> kvm_mmu_vendor_module_exit();
> free_percpu(user_return_msrs);
> kmem_cache_destroy(x86_emulator_cache);
> @@ -9443,6 +9454,9 @@ void kvm_x86_vendor_exit(void)
> static_key_deferred_flush(&kvm_xen_enabled);
> WARN_ON(static_branch_unlikely(&kvm_xen_enabled.key));
> #endif
> +	mutex_lock(&vendor_module_lock);
> +	kvm_x86_ops.hardware_enable = NULL;
> +	mutex_unlock(&vendor_module_lock);
> }
> EXPORT_SYMBOL_GPL(kvm_x86_vendor_exit);
>