From mboxrd@z Thu Jan 1 00:00:00 1970 From: Sean Christopherson Date: Fri, 2 Dec 2022 16:31:18 +0000 Subject: [PATCH v2 42/50] KVM: Disable CPU hotplug during hardware enabling/disabling In-Reply-To: <8b1053781e859aa95a08c10b0e8a06912a2b42a2.camel@intel.com> References: <20221130230934.1014142-1-seanjc@google.com> <20221130230934.1014142-43-seanjc@google.com> <8b1053781e859aa95a08c10b0e8a06912a2b42a2.camel@intel.com> Message-ID: List-Id: To: kvm-riscv@lists.infradead.org MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit On Fri, Dec 02, 2022, Huang, Kai wrote: > On Wed, 2022-11-30 at 23:09 +0000, Sean Christopherson wrote: > > From: Chao Gao > > > > Disable CPU hotplug when enabling/disabling hardware to prevent the > > corner case where if the following sequence occurs: > > > > 1. A hotplugged CPU marks itself online in cpu_online_mask > > 2. The hotplugged CPU enables interrupt before invoking KVM's ONLINE > > callback > > 3 hardware_{en,dis}able_all() is invoked on another CPU > > > > the hotplugged CPU will be included in on_each_cpu() and thus get sent > > through hardware_{en,dis}able_nolock() before kvm_online_cpu() is called. > > Should we explicitly call out what is the consequence of such case, otherwise > it's hard to tell whether this truly is an issue? > > IIUC, since now the compatibility check has already been moved to > kvm_arch_hardware_enable(), the consequence is hardware_enable_all() will fail > if the now online cpu isn't compatible, which will results in failing to create > the first VM. This isn't ideal since the incompatible cpu should be rejected to > go online instead. Actually, in that specific scenario, KVM should not reject the CPU. E.g. if KVM is autoloaded (common with systemd and/or qemu-kvm installed), but not used by userspace, then KVM is overstepping by rejecting the incompatible CPU since the user likely cares more about onlining a CPU than they do about KVM. > > KVM currently fudges around this race by keeping track of which CPUs have > > done hardware enabling (see commit 1b6c016818a5 "KVM: Keep track of which > > cpus have virtualization enabled"), but that's an inefficient, convoluted, > > and hacky solution. ... > > + /* > > + * Compatibility checks are done when loading KVM and when enabling > > + * hardware, e.g. during CPU hotplug, to ensure all online CPUs are > > + * compatible, i.e. KVM should never perform a compatibility check on > > + * an offline CPU. > > + */ > > + WARN_ON(!cpu_online(cpu)); > > IMHO this chunk logically should belong to previous patch. IIUC disabling CPU > hotplug during hardware_enable_all() doesn't have relationship to this WARN(). Hmm, yeah, I agree. I'll move it. > > static int hardware_enable_all(void) > > { > > int r = 0; > > > > + /* > > + * When onlining a CPU, cpu_online_mask is set before kvm_online_cpu() > > + * is called, and so on_each_cpu() between them includes the CPU that > > + * is being onlined. As a result, hardware_enable_nolock() may get > > + * invoked before kvm_online_cpu(), which also enables hardware if the > > + * usage count is non-zero. Disable CPU hotplug to avoid attempting to > > + * enable hardware multiple times. > > It won't enable hardware multiple times, right? Since hardware_enable_nolock() > has below check: > > if (cpumask_test_cpu(cpu, cpus_hardware_enabled)) > return; > > cpumask_set_cpu(cpu, cpus_hardware_enabled); > > IIUC the only issue is the one that I replied in the changelog. > > Or perhaps I am missing something? You're not missing anything in terms of code. What the comment means by "attempting" in this case is calling hardware_enable_nolock(). As called out in the changelog, guarding against this race with cpus_hardware_enabled is a hack, i.e. KVM should not have to rely on a per-CPU flag. : KVM currently fudges around this race by keeping track of which CPUs have : done hardware enabling (see commit 1b6c016818a5 "KVM: Keep track of which : cpus have virtualization enabled"), but that's an inefficient, convoluted, : and hacky solution. I actually considered removing the per-CPU flag, but decided not to because it's simpler to blast on_each_cpu(hardware_disable_nolock, ...) in kvm_reboot() and if enabling hardware fails on one or more CPUs, and taking a #UD on VMXOFF in the latter case is really unnecessary, i.e. the flag is nice to have for other reasons. That said, after this patch, KVM should be able to WARN in the enable path. I'll test that and do a more thorough audit; unless I'm missing something, I'll add a patch to do: diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index b8c6bfb46066..ee896fa2f196 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -5027,7 +5027,7 @@ static int kvm_usage_count; static int __hardware_enable_nolock(void) { - if (__this_cpu_read(hardware_enabled)) + if (WARN_ON_ONCE(__this_cpu_read(hardware_enabled))) return 0; if (kvm_arch_hardware_enable()) { From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mm01.cs.columbia.edu (mm01.cs.columbia.edu [128.59.11.253]) by smtp.lore.kernel.org (Postfix) with ESMTP id 97854C4321E for ; Fri, 2 Dec 2022 16:31:28 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by mm01.cs.columbia.edu (Postfix) with ESMTP id 0095F4B1B7; Fri, 2 Dec 2022 11:31:28 -0500 (EST) X-Virus-Scanned: at lists.cs.columbia.edu Authentication-Results: mm01.cs.columbia.edu (amavisd-new); dkim=softfail (fail, message has been altered) header.i=@google.com Received: from mm01.cs.columbia.edu ([127.0.0.1]) by localhost (mm01.cs.columbia.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 9tbTSsJ+932y; Fri, 2 Dec 2022 11:31:26 -0500 (EST) Received: from mm01.cs.columbia.edu (localhost [127.0.0.1]) by mm01.cs.columbia.edu (Postfix) with ESMTP id 87D5849E34; Fri, 2 Dec 2022 11:31:26 -0500 (EST) Received: from localhost (localhost [127.0.0.1]) by mm01.cs.columbia.edu (Postfix) with ESMTP id 1F5CB411C7 for ; Fri, 2 Dec 2022 11:31:25 -0500 (EST) X-Virus-Scanned: at lists.cs.columbia.edu Received: from mm01.cs.columbia.edu ([127.0.0.1]) by localhost (mm01.cs.columbia.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 8PuSQtuDECT4 for ; Fri, 2 Dec 2022 11:31:23 -0500 (EST) Received: from mail-pg1-f181.google.com (mail-pg1-f181.google.com [209.85.215.181]) by mm01.cs.columbia.edu (Postfix) with ESMTPS id 8C4CF40E00 for ; Fri, 2 Dec 2022 11:31:23 -0500 (EST) Received: by mail-pg1-f181.google.com with SMTP id 136so4801593pga.1 for ; Fri, 02 Dec 2022 08:31:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=RsaZyzKJOuyFVoAfYZw39HCTNem7JcJUeEd5tgG0e8Y=; b=JuBGytq8t5tr5SVnzdgiUWfOnnv3YoaxJ/EPOkhREUxf0Lu7Q36UoNdircCDscdJe1 Ne3pPvJP46804SbaTNQC7CD0nyOf+6DUf7RttT6MnjeqYf2tKeLulPmMzXN6C0vzpI5a kmqZ1b2nEnQBpLkucgI+K9CNuj0M6idXU9Z/tpGiUmJvFfSAKZUEJ0tYDDD5QTm1XgZh KUg4c1fCmmWL1AfPqQe/Q+7M8gt59bLLwzR5wrgb2mkyqnFKW9rqK8zn1jiKosWg6tGN xvGvpF7x1Z6otoY0Ddi4ibaMdJj9vaSCpFDZW9nFUZDm9ecAOG+G1OAJWfl9eXVsVqrU 2Qew== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=RsaZyzKJOuyFVoAfYZw39HCTNem7JcJUeEd5tgG0e8Y=; b=WIqnIXg+W1pEnTCVj7azZKtEDM6EWdSqEK0KTh2oWHJL7FAiARSEaw3c3OjCfMlK4b QB3I/TSexSkpGwzqZQ6eUQcRQEf780D3by+D07mEncnRlXoDoP2S+WA0DrBwehWcYNLz QVes/DrYnlVZ1MLP3KfBbq5hjo6EXeL2L8iBHO9ncKbGj71eyqTxN7p8Z5Gez3PBJDcl R8rPybkEHDxoCEehWQi1pZPfuBx70rG2scLDVThhGVZKOdbw4VcEH4sDaJgEAnMzgFrf QNH6jNtlhJ9Ie8TeHJ/5MCstiMWF1mWFCyyRhOwCC6vPdyOGrctmhFdPBe5exBHnx1mq dYWQ== X-Gm-Message-State: ANoB5plW1Ctw9UVpDLmoyz19wF9gs2ISyo9rOTHE/hqQhcAxJXilrhti XJwxtEoSgVJGLeDfcKKaxeP/KQ== X-Google-Smtp-Source: AA0mqf5YDkgE2z83Sno75+zaFki0bhnK0sTw98u0yDu/SzUnzr69GggIhlhUBJVYlhlyDIQeZ9Y6nA== X-Received: by 2002:a65:5601:0:b0:43c:4eef:bac7 with SMTP id l1-20020a655601000000b0043c4eefbac7mr46619641pgs.356.1669998682344; Fri, 02 Dec 2022 08:31:22 -0800 (PST) Received: from google.com (7.104.168.34.bc.googleusercontent.com. [34.168.104.7]) by smtp.gmail.com with ESMTPSA id g204-20020a6252d5000000b005756a67e227sm5438977pfb.90.2022.12.02.08.31.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 02 Dec 2022 08:31:21 -0800 (PST) Date: Fri, 2 Dec 2022 16:31:18 +0000 From: Sean Christopherson To: "Huang, Kai" Subject: Re: [PATCH v2 42/50] KVM: Disable CPU hotplug during hardware enabling/disabling Message-ID: References: <20221130230934.1014142-1-seanjc@google.com> <20221130230934.1014142-43-seanjc@google.com> <8b1053781e859aa95a08c10b0e8a06912a2b42a2.camel@intel.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <8b1053781e859aa95a08c10b0e8a06912a2b42a2.camel@intel.com> Cc: "mjrosato@linux.ibm.com" , "paul@xen.org" , "Yao, Yuan" , "david@redhat.com" , "linux-mips@vger.kernel.org" , "atishp@atishpatra.org" , "mpe@ellerman.id.au" , "linux-riscv@lists.infradead.org" , "imbrenda@linux.ibm.com" , "kvmarm@lists.cs.columbia.edu" , "linux-s390@vger.kernel.org" , "frankja@linux.ibm.com" , "maz@kernel.org" , "chenhuacai@kernel.org" , "aleksandar.qemu.devel@gmail.com" , "borntraeger@linux.ibm.com" , "Gao, Chao" , "farman@linux.ibm.com" , "aou@eecs.berkeley.edu" , "kvm@vger.kernel.org" , "paul.walmsley@sifive.com" , "kvmarm@lists.linux.dev" , "tglx@linutronix.de" , "linux-arm-kernel@lists.infradead.org" , "Yamahata, Isaku" , "philmd@linaro.org" , "farosas@linux.ibm.com" , "linuxppc-dev@lists.ozlabs.org" , "cohuck@redhat.com" , "linux-kernel@vger.kernel.org" , "palmer@dabbelt.com" , "kvm-riscv@lists.infradead.org" , "pbonzini@redhat.com" , "vkuznets@redhat.com" , "dwmw2@infradead.org" X-BeenThere: kvmarm@lists.cs.columbia.edu X-Mailman-Version: 2.1.14 Precedence: list List-Id: Where KVM/ARM decisions are made List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: kvmarm-bounces@lists.cs.columbia.edu Sender: kvmarm-bounces@lists.cs.columbia.edu On Fri, Dec 02, 2022, Huang, Kai wrote: > On Wed, 2022-11-30 at 23:09 +0000, Sean Christopherson wrote: > > From: Chao Gao > > > > Disable CPU hotplug when enabling/disabling hardware to prevent the > > corner case where if the following sequence occurs: > > > > 1. A hotplugged CPU marks itself online in cpu_online_mask > > 2. The hotplugged CPU enables interrupt before invoking KVM's ONLINE > > callback > > 3 hardware_{en,dis}able_all() is invoked on another CPU > > > > the hotplugged CPU will be included in on_each_cpu() and thus get sent > > through hardware_{en,dis}able_nolock() before kvm_online_cpu() is called. > > Should we explicitly call out what is the consequence of such case, otherwise > it's hard to tell whether this truly is an issue? > > IIUC, since now the compatibility check has already been moved to > kvm_arch_hardware_enable(), the consequence is hardware_enable_all() will fail > if the now online cpu isn't compatible, which will results in failing to create > the first VM. This isn't ideal since the incompatible cpu should be rejected to > go online instead. Actually, in that specific scenario, KVM should not reject the CPU. E.g. if KVM is autoloaded (common with systemd and/or qemu-kvm installed), but not used by userspace, then KVM is overstepping by rejecting the incompatible CPU since the user likely cares more about onlining a CPU than they do about KVM. > > KVM currently fudges around this race by keeping track of which CPUs have > > done hardware enabling (see commit 1b6c016818a5 "KVM: Keep track of which > > cpus have virtualization enabled"), but that's an inefficient, convoluted, > > and hacky solution. ... > > + /* > > + * Compatibility checks are done when loading KVM and when enabling > > + * hardware, e.g. during CPU hotplug, to ensure all online CPUs are > > + * compatible, i.e. KVM should never perform a compatibility check on > > + * an offline CPU. > > + */ > > + WARN_ON(!cpu_online(cpu)); > > IMHO this chunk logically should belong to previous patch. IIUC disabling CPU > hotplug during hardware_enable_all() doesn't have relationship to this WARN(). Hmm, yeah, I agree. I'll move it. > > static int hardware_enable_all(void) > > { > > int r = 0; > > > > + /* > > + * When onlining a CPU, cpu_online_mask is set before kvm_online_cpu() > > + * is called, and so on_each_cpu() between them includes the CPU that > > + * is being onlined. As a result, hardware_enable_nolock() may get > > + * invoked before kvm_online_cpu(), which also enables hardware if the > > + * usage count is non-zero. Disable CPU hotplug to avoid attempting to > > + * enable hardware multiple times. > > It won't enable hardware multiple times, right? Since hardware_enable_nolock() > has below check: > > if (cpumask_test_cpu(cpu, cpus_hardware_enabled)) > return; > > cpumask_set_cpu(cpu, cpus_hardware_enabled); > > IIUC the only issue is the one that I replied in the changelog. > > Or perhaps I am missing something? You're not missing anything in terms of code. What the comment means by "attempting" in this case is calling hardware_enable_nolock(). As called out in the changelog, guarding against this race with cpus_hardware_enabled is a hack, i.e. KVM should not have to rely on a per-CPU flag. : KVM currently fudges around this race by keeping track of which CPUs have : done hardware enabling (see commit 1b6c016818a5 "KVM: Keep track of which : cpus have virtualization enabled"), but that's an inefficient, convoluted, : and hacky solution. I actually considered removing the per-CPU flag, but decided not to because it's simpler to blast on_each_cpu(hardware_disable_nolock, ...) in kvm_reboot() and if enabling hardware fails on one or more CPUs, and taking a #UD on VMXOFF in the latter case is really unnecessary, i.e. the flag is nice to have for other reasons. That said, after this patch, KVM should be able to WARN in the enable path. I'll test that and do a more thorough audit; unless I'm missing something, I'll add a patch to do: diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index b8c6bfb46066..ee896fa2f196 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -5027,7 +5027,7 @@ static int kvm_usage_count; static int __hardware_enable_nolock(void) { - if (__this_cpu_read(hardware_enabled)) + if (WARN_ON_ONCE(__this_cpu_read(hardware_enabled))) return 0; if (kvm_arch_hardware_enable()) { _______________________________________________ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-pg1-f174.google.com (mail-pg1-f174.google.com [209.85.215.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1ABC733F1 for ; Fri, 2 Dec 2022 16:31:23 +0000 (UTC) Received: by mail-pg1-f174.google.com with SMTP id h33so4770160pgm.9 for ; Fri, 02 Dec 2022 08:31:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=RsaZyzKJOuyFVoAfYZw39HCTNem7JcJUeEd5tgG0e8Y=; b=JuBGytq8t5tr5SVnzdgiUWfOnnv3YoaxJ/EPOkhREUxf0Lu7Q36UoNdircCDscdJe1 Ne3pPvJP46804SbaTNQC7CD0nyOf+6DUf7RttT6MnjeqYf2tKeLulPmMzXN6C0vzpI5a kmqZ1b2nEnQBpLkucgI+K9CNuj0M6idXU9Z/tpGiUmJvFfSAKZUEJ0tYDDD5QTm1XgZh KUg4c1fCmmWL1AfPqQe/Q+7M8gt59bLLwzR5wrgb2mkyqnFKW9rqK8zn1jiKosWg6tGN xvGvpF7x1Z6otoY0Ddi4ibaMdJj9vaSCpFDZW9nFUZDm9ecAOG+G1OAJWfl9eXVsVqrU 2Qew== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=RsaZyzKJOuyFVoAfYZw39HCTNem7JcJUeEd5tgG0e8Y=; b=GBQoe+I84R8m5iZLp194FDdCiPAy9yemU5AnPFN4I3C55WknbHcIpB59sLfmfldZix vTiijyqJSaUQJdmrCpg7VKUopnjKtCWZ7DLgKRRvETrTGZIQNYAoym7neNfK54Js3h5Z weKJp1igNyD1CIpnGkfabGTv3D2kJu8qxpnAfn9e9/jEu4YvQ7PXH0t6cEQ8TOf6a9ld wBVl/H4nLGliVhAonOW97aYrKshHb+/rDZoDc3q+Gx/+p+CfQKH4BlraGP7z4UyAjYMK cKiAowgApt+/M4LofnjJEJYaqtCVW2uudye+qIjq/3HMdm6PCWvUfckcqTTaQY2ZmTCm +ywA== X-Gm-Message-State: ANoB5pmef6mKCUchNVfLztDFDj3eXEl7fIlsbEq/Gnadmgd8b/7BZ/sW 84T4l6iSLzyPdKxxwx9RBWaKBA== X-Google-Smtp-Source: AA0mqf5YDkgE2z83Sno75+zaFki0bhnK0sTw98u0yDu/SzUnzr69GggIhlhUBJVYlhlyDIQeZ9Y6nA== X-Received: by 2002:a65:5601:0:b0:43c:4eef:bac7 with SMTP id l1-20020a655601000000b0043c4eefbac7mr46619641pgs.356.1669998682344; Fri, 02 Dec 2022 08:31:22 -0800 (PST) Received: from google.com (7.104.168.34.bc.googleusercontent.com. [34.168.104.7]) by smtp.gmail.com with ESMTPSA id g204-20020a6252d5000000b005756a67e227sm5438977pfb.90.2022.12.02.08.31.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 02 Dec 2022 08:31:21 -0800 (PST) Date: Fri, 2 Dec 2022 16:31:18 +0000 From: Sean Christopherson To: "Huang, Kai" Cc: "chenhuacai@kernel.org" , "maz@kernel.org" , "frankja@linux.ibm.com" , "borntraeger@linux.ibm.com" , "farman@linux.ibm.com" , "aou@eecs.berkeley.edu" , "palmer@dabbelt.com" , "paul.walmsley@sifive.com" , "pbonzini@redhat.com" , "dwmw2@infradead.org" , "aleksandar.qemu.devel@gmail.com" , "imbrenda@linux.ibm.com" , "paul@xen.org" , "mjrosato@linux.ibm.com" , "vkuznets@redhat.com" , "anup@brainfault.org" , "oliver.upton@linux.dev" , "kvm@vger.kernel.org" , "cohuck@redhat.com" , "farosas@linux.ibm.com" , "david@redhat.com" , "james.morse@arm.com" , "Yao, Yuan" , "alexandru.elisei@arm.com" , "linux-s390@vger.kernel.org" , "linux-kernel@vger.kernel.org" , "mpe@ellerman.id.au" , "Yamahata, Isaku" , "kvmarm@lists.linux.dev" , "tglx@linutronix.de" , "suzuki.poulose@arm.com" , "kvm-riscv@lists.infradead.org" , "linuxppc-dev@lists.ozlabs.org" , "linux-arm-kernel@lists.infradead.org" , "linux-mips@vger.kernel.org" , "kvmarm@lists.cs.columbia.edu" , "philmd@linaro.org" , "atishp@atishpatra.org" , "linux-riscv@lists.infradead.org" , "Gao, Chao" Subject: Re: [PATCH v2 42/50] KVM: Disable CPU hotplug during hardware enabling/disabling Message-ID: References: <20221130230934.1014142-1-seanjc@google.com> <20221130230934.1014142-43-seanjc@google.com> <8b1053781e859aa95a08c10b0e8a06912a2b42a2.camel@intel.com> Precedence: bulk X-Mailing-List: kvmarm@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <8b1053781e859aa95a08c10b0e8a06912a2b42a2.camel@intel.com> Message-ID: <20221202163118.HWkwy198_oIhlAkpcjjU4GRWH48Kcd1dKvnnHlLn0l4@z> On Fri, Dec 02, 2022, Huang, Kai wrote: > On Wed, 2022-11-30 at 23:09 +0000, Sean Christopherson wrote: > > From: Chao Gao > > > > Disable CPU hotplug when enabling/disabling hardware to prevent the > > corner case where if the following sequence occurs: > > > > 1. A hotplugged CPU marks itself online in cpu_online_mask > > 2. The hotplugged CPU enables interrupt before invoking KVM's ONLINE > > callback > > 3 hardware_{en,dis}able_all() is invoked on another CPU > > > > the hotplugged CPU will be included in on_each_cpu() and thus get sent > > through hardware_{en,dis}able_nolock() before kvm_online_cpu() is called. > > Should we explicitly call out what is the consequence of such case, otherwise > it's hard to tell whether this truly is an issue? > > IIUC, since now the compatibility check has already been moved to > kvm_arch_hardware_enable(), the consequence is hardware_enable_all() will fail > if the now online cpu isn't compatible, which will results in failing to create > the first VM. This isn't ideal since the incompatible cpu should be rejected to > go online instead. Actually, in that specific scenario, KVM should not reject the CPU. E.g. if KVM is autoloaded (common with systemd and/or qemu-kvm installed), but not used by userspace, then KVM is overstepping by rejecting the incompatible CPU since the user likely cares more about onlining a CPU than they do about KVM. > > KVM currently fudges around this race by keeping track of which CPUs have > > done hardware enabling (see commit 1b6c016818a5 "KVM: Keep track of which > > cpus have virtualization enabled"), but that's an inefficient, convoluted, > > and hacky solution. ... > > + /* > > + * Compatibility checks are done when loading KVM and when enabling > > + * hardware, e.g. during CPU hotplug, to ensure all online CPUs are > > + * compatible, i.e. KVM should never perform a compatibility check on > > + * an offline CPU. > > + */ > > + WARN_ON(!cpu_online(cpu)); > > IMHO this chunk logically should belong to previous patch. IIUC disabling CPU > hotplug during hardware_enable_all() doesn't have relationship to this WARN(). Hmm, yeah, I agree. I'll move it. > > static int hardware_enable_all(void) > > { > > int r = 0; > > > > + /* > > + * When onlining a CPU, cpu_online_mask is set before kvm_online_cpu() > > + * is called, and so on_each_cpu() between them includes the CPU that > > + * is being onlined. As a result, hardware_enable_nolock() may get > > + * invoked before kvm_online_cpu(), which also enables hardware if the > > + * usage count is non-zero. Disable CPU hotplug to avoid attempting to > > + * enable hardware multiple times. > > It won't enable hardware multiple times, right? Since hardware_enable_nolock() > has below check: > > if (cpumask_test_cpu(cpu, cpus_hardware_enabled)) > return; > > cpumask_set_cpu(cpu, cpus_hardware_enabled); > > IIUC the only issue is the one that I replied in the changelog. > > Or perhaps I am missing something? You're not missing anything in terms of code. What the comment means by "attempting" in this case is calling hardware_enable_nolock(). As called out in the changelog, guarding against this race with cpus_hardware_enabled is a hack, i.e. KVM should not have to rely on a per-CPU flag. : KVM currently fudges around this race by keeping track of which CPUs have : done hardware enabling (see commit 1b6c016818a5 "KVM: Keep track of which : cpus have virtualization enabled"), but that's an inefficient, convoluted, : and hacky solution. I actually considered removing the per-CPU flag, but decided not to because it's simpler to blast on_each_cpu(hardware_disable_nolock, ...) in kvm_reboot() and if enabling hardware fails on one or more CPUs, and taking a #UD on VMXOFF in the latter case is really unnecessary, i.e. the flag is nice to have for other reasons. That said, after this patch, KVM should be able to WARN in the enable path. I'll test that and do a more thorough audit; unless I'm missing something, I'll add a patch to do: diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index b8c6bfb46066..ee896fa2f196 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -5027,7 +5027,7 @@ static int kvm_usage_count; static int __hardware_enable_nolock(void) { - if (__this_cpu_read(hardware_enabled)) + if (WARN_ON_ONCE(__this_cpu_read(hardware_enabled))) return 0; if (kvm_arch_hardware_enable()) { From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4D8C8C4321E for ; Fri, 2 Dec 2022 16:31:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:In-Reply-To:MIME-Version:References: Message-ID:Subject:Cc:To:From:Date:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=0nWPNfIppnq7woXEqb3qYsC7WdfBhriv+Kpk4s8bSmw=; b=LDWyZM0bAnUjAa ST2zSQnJ5sC5hpEjBB5jx/agV2SJgSPgNH54Cb/7GDhWnYoMGdSI5LZZnxUSlZvhlzT7cd6AesZoW kzEWgBPEJ8B8bpQtgQF1jpM6dj9Bn0jpyGdF5TzFdnvknVSFpcCAz6oBRS53gvilKGaL7AwDWM7ev K9QlT+11JwBFDxZVH8EL+jijdo7iPjR4XQ71PLeH73csnZfaRhFa4brl4TSJYtWu6MXe0H4swbCyW ZBa7fu9ifwLBjSGIOagxPzGCQoskRVQMTKFyEtofvI3oJx9bH6/lqpiqRzK2WzH9C2NUCIP+tNQlz Up5e8gyaIcbPrZQXeGUg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p18wx-00HZ5Z-6R; Fri, 02 Dec 2022 16:31:43 +0000 Received: from mail-pg1-x52f.google.com ([2607:f8b0:4864:20::52f]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1p18wg-00HYvH-7D for linux-riscv@lists.infradead.org; Fri, 02 Dec 2022 16:31:30 +0000 Received: by mail-pg1-x52f.google.com with SMTP id 62so4764716pgb.13 for ; Fri, 02 Dec 2022 08:31:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=RsaZyzKJOuyFVoAfYZw39HCTNem7JcJUeEd5tgG0e8Y=; b=JuBGytq8t5tr5SVnzdgiUWfOnnv3YoaxJ/EPOkhREUxf0Lu7Q36UoNdircCDscdJe1 Ne3pPvJP46804SbaTNQC7CD0nyOf+6DUf7RttT6MnjeqYf2tKeLulPmMzXN6C0vzpI5a kmqZ1b2nEnQBpLkucgI+K9CNuj0M6idXU9Z/tpGiUmJvFfSAKZUEJ0tYDDD5QTm1XgZh KUg4c1fCmmWL1AfPqQe/Q+7M8gt59bLLwzR5wrgb2mkyqnFKW9rqK8zn1jiKosWg6tGN xvGvpF7x1Z6otoY0Ddi4ibaMdJj9vaSCpFDZW9nFUZDm9ecAOG+G1OAJWfl9eXVsVqrU 2Qew== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=RsaZyzKJOuyFVoAfYZw39HCTNem7JcJUeEd5tgG0e8Y=; b=y7cGZq2LAlRwK4jYDh6sn10A1QViw5C2u23o+yC9in3yxGupzg3E2XY8mL8vML59yW 5FlHF7D4K+7Jcoh7QCxej3dW9tVLFzh0FqsYqx/zXR8IPJZyPVgay4bCjofeTZ9dz5aa NM+KW4w510J/dYjSLZ7g/spVLk32euSyzFm+ACUASx7YoVJQJ29rVXywBu3u8CNbHHx9 Wpqp6xDEmfioYA5X7Lra9XtMnAz11pcd8WeotBi1ryOX1hsLNET8Q7W6dU4gE2H2LVzy TXYxARq1LUw9r8qo7yv6WEmtmQUjfdc7LihM73ftESdrDRpJPjLTak+NEYed+wUfCStS atCQ== X-Gm-Message-State: ANoB5pkokaBYNogwQUW9TjdQeV/7PktMDRAhNmpVDdPq2kHhD2qSsMdZ WaG+SFAHhKLEHijP67NT/p6/aA== X-Google-Smtp-Source: AA0mqf5YDkgE2z83Sno75+zaFki0bhnK0sTw98u0yDu/SzUnzr69GggIhlhUBJVYlhlyDIQeZ9Y6nA== X-Received: by 2002:a65:5601:0:b0:43c:4eef:bac7 with SMTP id l1-20020a655601000000b0043c4eefbac7mr46619641pgs.356.1669998682344; Fri, 02 Dec 2022 08:31:22 -0800 (PST) Received: from google.com (7.104.168.34.bc.googleusercontent.com. [34.168.104.7]) by smtp.gmail.com with ESMTPSA id g204-20020a6252d5000000b005756a67e227sm5438977pfb.90.2022.12.02.08.31.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 02 Dec 2022 08:31:21 -0800 (PST) Date: Fri, 2 Dec 2022 16:31:18 +0000 From: Sean Christopherson To: "Huang, Kai" Cc: "chenhuacai@kernel.org" , "maz@kernel.org" , "frankja@linux.ibm.com" , "borntraeger@linux.ibm.com" , "farman@linux.ibm.com" , "aou@eecs.berkeley.edu" , "palmer@dabbelt.com" , "paul.walmsley@sifive.com" , "pbonzini@redhat.com" , "dwmw2@infradead.org" , "aleksandar.qemu.devel@gmail.com" , "imbrenda@linux.ibm.com" , "paul@xen.org" , "mjrosato@linux.ibm.com" , "vkuznets@redhat.com" , "anup@brainfault.org" , "oliver.upton@linux.dev" , "kvm@vger.kernel.org" , "cohuck@redhat.com" , "farosas@linux.ibm.com" , "david@redhat.com" , "james.morse@arm.com" , "Yao, Yuan" , "alexandru.elisei@arm.com" , "linux-s390@vger.kernel.org" , "linux-kernel@vger.kernel.org" , "mpe@ellerman.id.au" , "Yamahata, Isaku" , "kvmarm@lists.linux.dev" , "tglx@linutronix.de" , "suzuki.poulose@arm.com" , "kvm-riscv@lists.infradead.org" , "linuxppc-dev@lists.ozlabs.org" , "linux-arm-kernel@lists.infradead.org" , "linux-mips@vger.kernel.org" , "kvmarm@lists.cs.columbia.edu" , "philmd@linaro.org" , "atishp@atishpatra.org" , "linux-riscv@lists.infradead.org" , "Gao, Chao" Subject: Re: [PATCH v2 42/50] KVM: Disable CPU hotplug during hardware enabling/disabling Message-ID: References: <20221130230934.1014142-1-seanjc@google.com> <20221130230934.1014142-43-seanjc@google.com> <8b1053781e859aa95a08c10b0e8a06912a2b42a2.camel@intel.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <8b1053781e859aa95a08c10b0e8a06912a2b42a2.camel@intel.com> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20221202_083126_338484_B9A6FC5E X-CRM114-Status: GOOD ( 33.16 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org On Fri, Dec 02, 2022, Huang, Kai wrote: > On Wed, 2022-11-30 at 23:09 +0000, Sean Christopherson wrote: > > From: Chao Gao > > > > Disable CPU hotplug when enabling/disabling hardware to prevent the > > corner case where if the following sequence occurs: > > > > 1. A hotplugged CPU marks itself online in cpu_online_mask > > 2. The hotplugged CPU enables interrupt before invoking KVM's ONLINE > > callback > > 3 hardware_{en,dis}able_all() is invoked on another CPU > > > > the hotplugged CPU will be included in on_each_cpu() and thus get sent > > through hardware_{en,dis}able_nolock() before kvm_online_cpu() is called. > > Should we explicitly call out what is the consequence of such case, otherwise > it's hard to tell whether this truly is an issue? > > IIUC, since now the compatibility check has already been moved to > kvm_arch_hardware_enable(), the consequence is hardware_enable_all() will fail > if the now online cpu isn't compatible, which will results in failing to create > the first VM. This isn't ideal since the incompatible cpu should be rejected to > go online instead. Actually, in that specific scenario, KVM should not reject the CPU. E.g. if KVM is autoloaded (common with systemd and/or qemu-kvm installed), but not used by userspace, then KVM is overstepping by rejecting the incompatible CPU since the user likely cares more about onlining a CPU than they do about KVM. > > KVM currently fudges around this race by keeping track of which CPUs have > > done hardware enabling (see commit 1b6c016818a5 "KVM: Keep track of which > > cpus have virtualization enabled"), but that's an inefficient, convoluted, > > and hacky solution. ... > > + /* > > + * Compatibility checks are done when loading KVM and when enabling > > + * hardware, e.g. during CPU hotplug, to ensure all online CPUs are > > + * compatible, i.e. KVM should never perform a compatibility check on > > + * an offline CPU. > > + */ > > + WARN_ON(!cpu_online(cpu)); > > IMHO this chunk logically should belong to previous patch. IIUC disabling CPU > hotplug during hardware_enable_all() doesn't have relationship to this WARN(). Hmm, yeah, I agree. I'll move it. > > static int hardware_enable_all(void) > > { > > int r = 0; > > > > + /* > > + * When onlining a CPU, cpu_online_mask is set before kvm_online_cpu() > > + * is called, and so on_each_cpu() between them includes the CPU that > > + * is being onlined. As a result, hardware_enable_nolock() may get > > + * invoked before kvm_online_cpu(), which also enables hardware if the > > + * usage count is non-zero. Disable CPU hotplug to avoid attempting to > > + * enable hardware multiple times. > > It won't enable hardware multiple times, right? Since hardware_enable_nolock() > has below check: > > if (cpumask_test_cpu(cpu, cpus_hardware_enabled)) > return; > > cpumask_set_cpu(cpu, cpus_hardware_enabled); > > IIUC the only issue is the one that I replied in the changelog. > > Or perhaps I am missing something? You're not missing anything in terms of code. What the comment means by "attempting" in this case is calling hardware_enable_nolock(). As called out in the changelog, guarding against this race with cpus_hardware_enabled is a hack, i.e. KVM should not have to rely on a per-CPU flag. : KVM currently fudges around this race by keeping track of which CPUs have : done hardware enabling (see commit 1b6c016818a5 "KVM: Keep track of which : cpus have virtualization enabled"), but that's an inefficient, convoluted, : and hacky solution. I actually considered removing the per-CPU flag, but decided not to because it's simpler to blast on_each_cpu(hardware_disable_nolock, ...) in kvm_reboot() and if enabling hardware fails on one or more CPUs, and taking a #UD on VMXOFF in the latter case is really unnecessary, i.e. the flag is nice to have for other reasons. That said, after this patch, KVM should be able to WARN in the enable path. I'll test that and do a more thorough audit; unless I'm missing something, I'll add a patch to do: diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index b8c6bfb46066..ee896fa2f196 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -5027,7 +5027,7 @@ static int kvm_usage_count; static int __hardware_enable_nolock(void) { - if (__this_cpu_read(hardware_enabled)) + if (WARN_ON_ONCE(__this_cpu_read(hardware_enabled))) return 0; if (kvm_arch_hardware_enable()) { _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.ozlabs.org (lists.ozlabs.org [112.213.38.117]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7A9C7C47089 for ; Fri, 2 Dec 2022 16:32:22 +0000 (UTC) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4NNz4N5mNJz3bj0 for ; Sat, 3 Dec 2022 03:32:20 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20210112 header.b=JuBGytq8; dkim-atps=neutral Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=google.com (client-ip=2607:f8b0:4864:20::535; helo=mail-pg1-x535.google.com; envelope-from=seanjc@google.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20210112 header.b=JuBGytq8; dkim-atps=neutral Received: from mail-pg1-x535.google.com (mail-pg1-x535.google.com [IPv6:2607:f8b0:4864:20::535]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4NNz3K4g9Wz2xl6 for ; Sat, 3 Dec 2022 03:31:24 +1100 (AEDT) Received: by mail-pg1-x535.google.com with SMTP id q1so4765834pgl.11 for ; Fri, 02 Dec 2022 08:31:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=RsaZyzKJOuyFVoAfYZw39HCTNem7JcJUeEd5tgG0e8Y=; b=JuBGytq8t5tr5SVnzdgiUWfOnnv3YoaxJ/EPOkhREUxf0Lu7Q36UoNdircCDscdJe1 Ne3pPvJP46804SbaTNQC7CD0nyOf+6DUf7RttT6MnjeqYf2tKeLulPmMzXN6C0vzpI5a kmqZ1b2nEnQBpLkucgI+K9CNuj0M6idXU9Z/tpGiUmJvFfSAKZUEJ0tYDDD5QTm1XgZh KUg4c1fCmmWL1AfPqQe/Q+7M8gt59bLLwzR5wrgb2mkyqnFKW9rqK8zn1jiKosWg6tGN xvGvpF7x1Z6otoY0Ddi4ibaMdJj9vaSCpFDZW9nFUZDm9ecAOG+G1OAJWfl9eXVsVqrU 2Qew== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=RsaZyzKJOuyFVoAfYZw39HCTNem7JcJUeEd5tgG0e8Y=; b=ta5OiTHlna7yshHghQpTDoGeuVAT52lU/UJzvb/GYyT0cMgnb03D5E8eGk3z+LDdtH SU0/h51Q/1mm6cwXR6nwJiMmj9dtYzjv8cFf0dUPI6+WyExF7P2sg3fLDaa2xFC9qpt/ 2dudb+uJITE2Y/wfmRBaDnSX+XUmWxIU11BlbPCm64OyHrLMT6UC/asa7RHpcAxbUsgH TW0YxsA0cXSQ948zRq/cYedMH+sAFK1OBzu9BdkNZYNJMog2wub0G6hOo8321WmhUmc1 ig1spHYJweC7qr2P2H87D+EADOjTMVb/iZiahZJDtz01ZdwojiI+4EuNWToUqM1fddDf Ja/w== X-Gm-Message-State: ANoB5pmhLqwzQUux8ldZj7jdCDxYoIQV8YBgAPB9TYsDPXCVgTyT7rTD Wce8LVBn4O104K4ZfHKL8avX+A== X-Google-Smtp-Source: AA0mqf5YDkgE2z83Sno75+zaFki0bhnK0sTw98u0yDu/SzUnzr69GggIhlhUBJVYlhlyDIQeZ9Y6nA== X-Received: by 2002:a65:5601:0:b0:43c:4eef:bac7 with SMTP id l1-20020a655601000000b0043c4eefbac7mr46619641pgs.356.1669998682344; Fri, 02 Dec 2022 08:31:22 -0800 (PST) Received: from google.com (7.104.168.34.bc.googleusercontent.com. [34.168.104.7]) by smtp.gmail.com with ESMTPSA id g204-20020a6252d5000000b005756a67e227sm5438977pfb.90.2022.12.02.08.31.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 02 Dec 2022 08:31:21 -0800 (PST) Date: Fri, 2 Dec 2022 16:31:18 +0000 From: Sean Christopherson To: "Huang, Kai" Subject: Re: [PATCH v2 42/50] KVM: Disable CPU hotplug during hardware enabling/disabling Message-ID: References: <20221130230934.1014142-1-seanjc@google.com> <20221130230934.1014142-43-seanjc@google.com> <8b1053781e859aa95a08c10b0e8a06912a2b42a2.camel@intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <8b1053781e859aa95a08c10b0e8a06912a2b42a2.camel@intel.com> X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "mjrosato@linux.ibm.com" , "paul@xen.org" , "Yao, Yuan" , "david@redhat.com" , "linux-mips@vger.kernel.org" , "atishp@atishpatra.org" , "linux-riscv@lists.infradead.org" , "imbrenda@linux.ibm.com" , "kvmarm@lists.cs.columbia.edu" , "linux-s390@vger.kernel.org" , "frankja@linux.ibm.com" , "maz@kernel.org" , "chenhuacai@kernel.org" , "aleksandar.qemu.devel@gmail.com" , "james.morse@arm.com" , "borntraeger@linux.ibm.com" , "Gao, Chao" , "farman@linux.ibm.com" , "aou@eecs.berkeley.edu" , "suzuki.poulose@arm.com" , "kvm @vger.kernel.org" , "paul.walmsley@sifive.com" , "kvmarm@lists.linux.dev" , "tglx@linutronix.de" , "alexandru.elisei@arm.com" , "linux-arm-kernel@lists.infradead.org" , "Yamahata, Isaku" , "philmd@linaro.org" , "farosas@linux.ibm.com" , "linuxppc-dev@lists.ozlabs.org" , "cohuck@redhat.com" , "linux-kernel@vger.kernel.org" , "oliver.upton@linux.dev" , "palmer@dabbelt.com" , "kvm-riscv@lists.infradead.org" , "anup@brainfault.org" , "pbonzini@redhat.com" , "vkuznets@redhat.com" , "dwmw2@infradead.org" Errors-To: linuxppc-dev-bounces+linuxppc-dev=archiver.kernel.org@lists.ozlabs.org Sender: "Linuxppc-dev" On Fri, Dec 02, 2022, Huang, Kai wrote: > On Wed, 2022-11-30 at 23:09 +0000, Sean Christopherson wrote: > > From: Chao Gao > > > > Disable CPU hotplug when enabling/disabling hardware to prevent the > > corner case where if the following sequence occurs: > > > > 1. A hotplugged CPU marks itself online in cpu_online_mask > > 2. The hotplugged CPU enables interrupt before invoking KVM's ONLINE > > callback > > 3 hardware_{en,dis}able_all() is invoked on another CPU > > > > the hotplugged CPU will be included in on_each_cpu() and thus get sent > > through hardware_{en,dis}able_nolock() before kvm_online_cpu() is called. > > Should we explicitly call out what is the consequence of such case, otherwise > it's hard to tell whether this truly is an issue? > > IIUC, since now the compatibility check has already been moved to > kvm_arch_hardware_enable(), the consequence is hardware_enable_all() will fail > if the now online cpu isn't compatible, which will results in failing to create > the first VM. This isn't ideal since the incompatible cpu should be rejected to > go online instead. Actually, in that specific scenario, KVM should not reject the CPU. E.g. if KVM is autoloaded (common with systemd and/or qemu-kvm installed), but not used by userspace, then KVM is overstepping by rejecting the incompatible CPU since the user likely cares more about onlining a CPU than they do about KVM. > > KVM currently fudges around this race by keeping track of which CPUs have > > done hardware enabling (see commit 1b6c016818a5 "KVM: Keep track of which > > cpus have virtualization enabled"), but that's an inefficient, convoluted, > > and hacky solution. ... > > + /* > > + * Compatibility checks are done when loading KVM and when enabling > > + * hardware, e.g. during CPU hotplug, to ensure all online CPUs are > > + * compatible, i.e. KVM should never perform a compatibility check on > > + * an offline CPU. > > + */ > > + WARN_ON(!cpu_online(cpu)); > > IMHO this chunk logically should belong to previous patch. IIUC disabling CPU > hotplug during hardware_enable_all() doesn't have relationship to this WARN(). Hmm, yeah, I agree. I'll move it. > > static int hardware_enable_all(void) > > { > > int r = 0; > > > > + /* > > + * When onlining a CPU, cpu_online_mask is set before kvm_online_cpu() > > + * is called, and so on_each_cpu() between them includes the CPU that > > + * is being onlined. As a result, hardware_enable_nolock() may get > > + * invoked before kvm_online_cpu(), which also enables hardware if the > > + * usage count is non-zero. Disable CPU hotplug to avoid attempting to > > + * enable hardware multiple times. > > It won't enable hardware multiple times, right? Since hardware_enable_nolock() > has below check: > > if (cpumask_test_cpu(cpu, cpus_hardware_enabled)) > return; > > cpumask_set_cpu(cpu, cpus_hardware_enabled); > > IIUC the only issue is the one that I replied in the changelog. > > Or perhaps I am missing something? You're not missing anything in terms of code. What the comment means by "attempting" in this case is calling hardware_enable_nolock(). As called out in the changelog, guarding against this race with cpus_hardware_enabled is a hack, i.e. KVM should not have to rely on a per-CPU flag. : KVM currently fudges around this race by keeping track of which CPUs have : done hardware enabling (see commit 1b6c016818a5 "KVM: Keep track of which : cpus have virtualization enabled"), but that's an inefficient, convoluted, : and hacky solution. I actually considered removing the per-CPU flag, but decided not to because it's simpler to blast on_each_cpu(hardware_disable_nolock, ...) in kvm_reboot() and if enabling hardware fails on one or more CPUs, and taking a #UD on VMXOFF in the latter case is really unnecessary, i.e. the flag is nice to have for other reasons. That said, after this patch, KVM should be able to WARN in the enable path. I'll test that and do a more thorough audit; unless I'm missing something, I'll add a patch to do: diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index b8c6bfb46066..ee896fa2f196 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -5027,7 +5027,7 @@ static int kvm_usage_count; static int __hardware_enable_nolock(void) { - if (__this_cpu_read(hardware_enabled)) + if (WARN_ON_ONCE(__this_cpu_read(hardware_enabled))) return 0; if (kvm_arch_hardware_enable()) { From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 1DB10C4321E for ; Fri, 2 Dec 2022 16:32:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:In-Reply-To:MIME-Version:References: Message-ID:Subject:Cc:To:From:Date:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=qXQnNGgbzZ16O2Qq/6XqMbU2KEX/DWr2l6jgwgbAlVI=; b=RMfGtC89z8WWCC ySfqGnSqfzlj6McGkt1Marfm3wBruPaAuwnYwoDLvH3pxvU7X4OW7oxtHx94LNMLrXKC8382LYov6 a/2EJKq74BZZpU/cm1jDs5nyr9ILnUc8ttN2Gd5QwAtX5T6FvU72DXXSMbhwJJQtBMX94aIDt7SGL lmE515QaspPm+538Y8Dh+u3g9m3Bv/Wodso0G4m1mmtVtcBgpjFV0gFVL2nuGGViQ50GvVWdvFhaO E5+Rntmxy6q6pfArBjF/30ycmIVBeDTqD9PKc9yonWiZ6KXQS4sqVMzWzUDNz7rmiKfBxXzNjElxV fwFYzuGnRFM2msDh8Zbw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1p18wl-00HZ01-9J; Fri, 02 Dec 2022 16:31:31 +0000 Received: from mail-pg1-x52f.google.com ([2607:f8b0:4864:20::52f]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1p18wg-00HYvJ-77 for linux-arm-kernel@lists.infradead.org; Fri, 02 Dec 2022 16:31:28 +0000 Received: by mail-pg1-x52f.google.com with SMTP id f9so4784506pgf.7 for ; Fri, 02 Dec 2022 08:31:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=RsaZyzKJOuyFVoAfYZw39HCTNem7JcJUeEd5tgG0e8Y=; b=JuBGytq8t5tr5SVnzdgiUWfOnnv3YoaxJ/EPOkhREUxf0Lu7Q36UoNdircCDscdJe1 Ne3pPvJP46804SbaTNQC7CD0nyOf+6DUf7RttT6MnjeqYf2tKeLulPmMzXN6C0vzpI5a kmqZ1b2nEnQBpLkucgI+K9CNuj0M6idXU9Z/tpGiUmJvFfSAKZUEJ0tYDDD5QTm1XgZh KUg4c1fCmmWL1AfPqQe/Q+7M8gt59bLLwzR5wrgb2mkyqnFKW9rqK8zn1jiKosWg6tGN xvGvpF7x1Z6otoY0Ddi4ibaMdJj9vaSCpFDZW9nFUZDm9ecAOG+G1OAJWfl9eXVsVqrU 2Qew== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=RsaZyzKJOuyFVoAfYZw39HCTNem7JcJUeEd5tgG0e8Y=; b=f4DEdhfzz5ZJgO+T4Ea5VJUO7jlTUpyZxsM9ADAzv6GR0ZRIyBm/7Q2lVWHSM9p+D9 flnuPQdud/sigCpDKZVmx1JcCcKrYKGdeWxzxiQ2Rhq6XPkKvck7fWXCZNIN1xFwpAf5 nkQrrs4QHYfp7Kki5u57w/QAaDO1k9lM0pq34AhvhB/3swWAXFuYB4KSYbL2NqVm7hLw 2kXV+/VXLpGQUnqbLPjipNIhAcVnQs/PaaIbQLN9agSc+GRxXoLkj/ZgaoHz5iUnj1pC TCjA2JFJvxlnieL7yRRrQE4oIcZyo+uh5L88Hie+NWqpOirwfg/zBW4ebPeLh3wZhjTg H8YA== X-Gm-Message-State: ANoB5pnErcg/132rNaaMEunPtzz3UNzjlTeUhMJ6XjhOGnSUKW7zw3iN hW9hyYqPH6J+/FyvZc5ynNI+0w== X-Google-Smtp-Source: AA0mqf5YDkgE2z83Sno75+zaFki0bhnK0sTw98u0yDu/SzUnzr69GggIhlhUBJVYlhlyDIQeZ9Y6nA== X-Received: by 2002:a65:5601:0:b0:43c:4eef:bac7 with SMTP id l1-20020a655601000000b0043c4eefbac7mr46619641pgs.356.1669998682344; Fri, 02 Dec 2022 08:31:22 -0800 (PST) Received: from google.com (7.104.168.34.bc.googleusercontent.com. [34.168.104.7]) by smtp.gmail.com with ESMTPSA id g204-20020a6252d5000000b005756a67e227sm5438977pfb.90.2022.12.02.08.31.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 02 Dec 2022 08:31:21 -0800 (PST) Date: Fri, 2 Dec 2022 16:31:18 +0000 From: Sean Christopherson To: "Huang, Kai" Cc: "chenhuacai@kernel.org" , "maz@kernel.org" , "frankja@linux.ibm.com" , "borntraeger@linux.ibm.com" , "farman@linux.ibm.com" , "aou@eecs.berkeley.edu" , "palmer@dabbelt.com" , "paul.walmsley@sifive.com" , "pbonzini@redhat.com" , "dwmw2@infradead.org" , "aleksandar.qemu.devel@gmail.com" , "imbrenda@linux.ibm.com" , "paul@xen.org" , "mjrosato@linux.ibm.com" , "vkuznets@redhat.com" , "anup@brainfault.org" , "oliver.upton@linux.dev" , "kvm@vger.kernel.org" , "cohuck@redhat.com" , "farosas@linux.ibm.com" , "david@redhat.com" , "james.morse@arm.com" , "Yao, Yuan" , "alexandru.elisei@arm.com" , "linux-s390@vger.kernel.org" , "linux-kernel@vger.kernel.org" , "mpe@ellerman.id.au" , "Yamahata, Isaku" , "kvmarm@lists.linux.dev" , "tglx@linutronix.de" , "suzuki.poulose@arm.com" , "kvm-riscv@lists.infradead.org" , "linuxppc-dev@lists.ozlabs.org" , "linux-arm-kernel@lists.infradead.org" , "linux-mips@vger.kernel.org" , "kvmarm@lists.cs.columbia.edu" , "philmd@linaro.org" , "atishp@atishpatra.org" , "linux-riscv@lists.infradead.org" , "Gao, Chao" Subject: Re: [PATCH v2 42/50] KVM: Disable CPU hotplug during hardware enabling/disabling Message-ID: References: <20221130230934.1014142-1-seanjc@google.com> <20221130230934.1014142-43-seanjc@google.com> <8b1053781e859aa95a08c10b0e8a06912a2b42a2.camel@intel.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <8b1053781e859aa95a08c10b0e8a06912a2b42a2.camel@intel.com> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20221202_083126_329468_81B753A7 X-CRM114-Status: GOOD ( 34.63 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org On Fri, Dec 02, 2022, Huang, Kai wrote: > On Wed, 2022-11-30 at 23:09 +0000, Sean Christopherson wrote: > > From: Chao Gao > > > > Disable CPU hotplug when enabling/disabling hardware to prevent the > > corner case where if the following sequence occurs: > > > > 1. A hotplugged CPU marks itself online in cpu_online_mask > > 2. The hotplugged CPU enables interrupt before invoking KVM's ONLINE > > callback > > 3 hardware_{en,dis}able_all() is invoked on another CPU > > > > the hotplugged CPU will be included in on_each_cpu() and thus get sent > > through hardware_{en,dis}able_nolock() before kvm_online_cpu() is called. > > Should we explicitly call out what is the consequence of such case, otherwise > it's hard to tell whether this truly is an issue? > > IIUC, since now the compatibility check has already been moved to > kvm_arch_hardware_enable(), the consequence is hardware_enable_all() will fail > if the now online cpu isn't compatible, which will results in failing to create > the first VM. This isn't ideal since the incompatible cpu should be rejected to > go online instead. Actually, in that specific scenario, KVM should not reject the CPU. E.g. if KVM is autoloaded (common with systemd and/or qemu-kvm installed), but not used by userspace, then KVM is overstepping by rejecting the incompatible CPU since the user likely cares more about onlining a CPU than they do about KVM. > > KVM currently fudges around this race by keeping track of which CPUs have > > done hardware enabling (see commit 1b6c016818a5 "KVM: Keep track of which > > cpus have virtualization enabled"), but that's an inefficient, convoluted, > > and hacky solution. ... > > + /* > > + * Compatibility checks are done when loading KVM and when enabling > > + * hardware, e.g. during CPU hotplug, to ensure all online CPUs are > > + * compatible, i.e. KVM should never perform a compatibility check on > > + * an offline CPU. > > + */ > > + WARN_ON(!cpu_online(cpu)); > > IMHO this chunk logically should belong to previous patch. IIUC disabling CPU > hotplug during hardware_enable_all() doesn't have relationship to this WARN(). Hmm, yeah, I agree. I'll move it. > > static int hardware_enable_all(void) > > { > > int r = 0; > > > > + /* > > + * When onlining a CPU, cpu_online_mask is set before kvm_online_cpu() > > + * is called, and so on_each_cpu() between them includes the CPU that > > + * is being onlined. As a result, hardware_enable_nolock() may get > > + * invoked before kvm_online_cpu(), which also enables hardware if the > > + * usage count is non-zero. Disable CPU hotplug to avoid attempting to > > + * enable hardware multiple times. > > It won't enable hardware multiple times, right? Since hardware_enable_nolock() > has below check: > > if (cpumask_test_cpu(cpu, cpus_hardware_enabled)) > return; > > cpumask_set_cpu(cpu, cpus_hardware_enabled); > > IIUC the only issue is the one that I replied in the changelog. > > Or perhaps I am missing something? You're not missing anything in terms of code. What the comment means by "attempting" in this case is calling hardware_enable_nolock(). As called out in the changelog, guarding against this race with cpus_hardware_enabled is a hack, i.e. KVM should not have to rely on a per-CPU flag. : KVM currently fudges around this race by keeping track of which CPUs have : done hardware enabling (see commit 1b6c016818a5 "KVM: Keep track of which : cpus have virtualization enabled"), but that's an inefficient, convoluted, : and hacky solution. I actually considered removing the per-CPU flag, but decided not to because it's simpler to blast on_each_cpu(hardware_disable_nolock, ...) in kvm_reboot() and if enabling hardware fails on one or more CPUs, and taking a #UD on VMXOFF in the latter case is really unnecessary, i.e. the flag is nice to have for other reasons. That said, after this patch, KVM should be able to WARN in the enable path. I'll test that and do a more thorough audit; unless I'm missing something, I'll add a patch to do: diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index b8c6bfb46066..ee896fa2f196 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -5027,7 +5027,7 @@ static int kvm_usage_count; static int __hardware_enable_nolock(void) { - if (__this_cpu_read(hardware_enabled)) + if (WARN_ON_ONCE(__this_cpu_read(hardware_enabled))) return 0; if (kvm_arch_hardware_enable()) { _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel