From: Chao Gao <chao.gao@intel.com>
To: Sean Christopherson <seanjc@google.com>
Cc: kvm@vger.kernel.org, pbonzini@redhat.com, kevin.tian@intel.com,
tglx@linutronix.de, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 6/6] KVM: Do compatibility checks on hotplugged CPUs
Date: Tue, 11 Jan 2022 13:32:06 +0800 [thread overview]
Message-ID: <20220111053205.GD2175@gao-cwp> (raw)
In-Reply-To: <YdzTfIEZ727L4g2R@google.com>
On Tue, Jan 11, 2022 at 12:46:52AM +0000, Sean Christopherson wrote:
>On Mon, Dec 27, 2021, Chao Gao wrote:
>> At init time, KVM does compatibility checks to ensure that all online
>> CPUs support hardware virtualization and a common set of features. But
>> KVM uses hotplugged CPUs without such compatibility checks. On Intel
>> CPUs, this leads to #GP if the hotplugged CPU doesn't support VMX or
>> vmentry failure if the hotplugged CPU doesn't meet minimal feature
>> requirements.
>>
>> Do compatibility checks when onlining a CPU. If any VM is running,
>> KVM hotplug callback returns an error to abort onlining incompatible
>> CPUs.
>>
>> But if no VM is running, onlining incompatible CPUs is allowed. Instead,
>> KVM is prohibited from creating VMs similar to the policy for init-time
>> compatibility checks.
>
>...
>
>> ---
>> virt/kvm/kvm_main.c | 36 ++++++++++++++++++++++++++++++++++--
>> 1 file changed, 34 insertions(+), 2 deletions(-)
>>
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> index c1054604d1e8..0ff80076d48d 100644
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -106,6 +106,8 @@ LIST_HEAD(vm_list);
>> static cpumask_var_t cpus_hardware_enabled;
>> static int kvm_usage_count;
>> static atomic_t hardware_enable_failed;
>> +/* Set if hardware becomes incompatible after CPU hotplug */
>> +static bool hardware_incompatible;
>>
>> static struct kmem_cache *kvm_vcpu_cache;
>>
>> @@ -4855,20 +4857,32 @@ static void hardware_enable_nolock(void *junk)
>>
>> static int kvm_online_cpu(unsigned int cpu)
>> {
>> - int ret = 0;
>> + int ret;
>>
>> + ret = kvm_arch_check_processor_compat();
>> raw_spin_lock(&kvm_count_lock);
>> /*
>> * Abort the CPU online process if hardware virtualization cannot
>> * be enabled. Otherwise running VMs would encounter unrecoverable
>> * errors when scheduled to this CPU.
>> */
>> - if (kvm_usage_count) {
>> + if (!ret && kvm_usage_count) {
>> hardware_enable_nolock(NULL);
>> if (atomic_read(&hardware_enable_failed)) {
>> ret = -EIO;
>> pr_info("kvm: abort onlining CPU%d", cpu);
>> }
>> + } else if (ret && !kvm_usage_count) {
>> + /*
>> + * Continue onlining an incompatible CPU if no VM is
>> + * running. KVM should reject creating any VM after this
>> + * point. Then this CPU can be still used to run non-VM
>> + * workload.
>> + */
>> + ret = 0;
>> + hardware_incompatible = true;
>
>This has a fairly big flaw in that it prevents KVM from creating VMs even if the
>offending CPU is offlined. That seems like a very reasonable thing to do, e.g.
>admin sees that hotplugging a CPU broke KVM and removes the CPU to remedy the
>problem. And if KVM is built-in, reloading KVM to wipe hardware_incompatible
>after offlining the CPU isn't an option.
Ideally, yes, creation VMs should be allowed after offending CPUs are offlined.
But the problem is kind of foundamental:
After kernel tries to online a CPU without VMX, boot_cpu_has(X86_FEATURE_VMX)
returns false. So, the current behavior is reloading KVM would fail if
kernel *tried* to bring up a CPU without VMX. So, it looks to me that
boot_cpu_has() doesn't do feature re-evalution either. Given that, I doubt
the value of making KVM able to create VM in this case.
>
>To make this approach work, I think kvm_offline_cpu() would have to reevaluate
>hardware_incompatible if the flag is set.
>
>And should there be a KVM module param to let the admin opt in/out of this
>behavior? E.g. if the primary use case for a system is to run VMs, disabling
>KVM just to online a CPU isn't very helpful.
>
>That said, I'm not convinced that continuing with the hotplug in this scenario
>is ever the right thing to do. Either the CPU being hotplugged really is a different
>CPU, or it's literally broken. In both cases, odds are very, very good that running
>on the dodgy CPU will hose the kernel sooner or later, i.e. KVM's compatibility checks
>are just the canary in the coal mine.
Ok. Then here are two options:
1. KVM always prevents incompatible CPUs from being brought up regardless of running VMs
2. make "disabling KVM on incompatible CPUs" an opt-in feature.
Which one do you think is better?
And as said above, even with option 1, KVM reloading would fail due to
boot_cpu_has(X86_FEATURE_VMX). I suppose it isn't necessary to be fixed in this series.
>
>TDX is a different beast as (a) that's purely a security restriction and (b) anyone
>trying to run TDX guests darn well better know that TDX doesn't allow hotplug.
>In other words, if TDX gets disabled due to hotplug, either someone majorly screwed
>up and is going to be unhappy no matter what, or there's no intention of using TDX
>and it's a complete don't care.
>
>> + pr_info("kvm: prohibit VM creation due to incompatible CPU%d",
>
>pr_info() is a bit weak, this should be at least pr_warn() and maybe even pr_err().
>
>> + cpu);
>
>Eh, I'd omit the newline and let that poke out.
Will do.
>
>> }
>> raw_spin_unlock(&kvm_count_lock);
>> return ret;
>> @@ -4913,8 +4927,24 @@ static int hardware_enable_all(void)
>> {
>> int r = 0;
>>
>> + /*
>> + * During onlining a CPU, cpu_online_mask is set before kvm_online_cpu()
>> + * is called. on_each_cpu() between them includes the CPU. As a result,
>> + * hardware_enable_nolock() may get invoked before kvm_online_cpu().
>> + * This would enable hardware virtualization on that cpu without
>> + * compatibility checks, which can potentially crash system or break
>> + * running VMs.
>> + *
>> + * Disable CPU hotplug to prevent this case from happening.
>> + */
>> + cpus_read_lock();
>> raw_spin_lock(&kvm_count_lock);
>>
>> + if (hardware_incompatible) {
>
>Another error message would likely be helpful here. Even better would be if KVM
>could provide some way for userspace to query which CPU(s) is bad.
If option 1 is chosen, this check will be removed.
For option 2, will add an error message. And how about a debugfs tunable to provide
the list of bad CPUs?
next prev parent reply other threads:[~2022-01-11 5:21 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-12-27 8:15 [PATCH 0/6] Improve KVM's interaction with CPU hotplug Chao Gao
2021-12-27 8:15 ` [PATCH 1/6] KVM: x86: Move check_processor_compatibility from init ops to runtime ops Chao Gao
2022-01-10 23:27 ` Sean Christopherson
2022-01-11 3:36 ` Chao Gao
2022-01-12 17:59 ` Sean Christopherson
2021-12-27 8:15 ` [PATCH 2/6] KVM: x86: Use kvm_x86_ops in kvm_arch_check_processor_compat Chao Gao
2022-01-10 21:10 ` Sean Christopherson
2022-01-11 3:06 ` Chao Gao
2021-12-27 8:15 ` [PATCH 3/6] KVM: Remove opaque from kvm_arch_check_processor_compat Chao Gao
2022-01-10 23:06 ` Sean Christopherson
2022-01-11 3:19 ` Chao Gao
2022-01-12 17:20 ` Sean Christopherson
2022-01-12 17:21 ` Sean Christopherson
2021-12-27 8:15 ` [PATCH 4/6] KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section Chao Gao
2021-12-27 8:15 ` [PATCH 5/6] KVM: x86: Remove WARN_ON in kvm_arch_check_processor_compat Chao Gao
2022-01-10 22:59 ` Sean Christopherson
2022-01-11 2:15 ` Tian, Kevin
2022-01-11 19:48 ` Sean Christopherson
2022-01-12 11:00 ` Chao Gao
2022-01-12 17:35 ` Sean Christopherson
2022-01-17 13:35 ` Chao Gao
2022-01-17 13:46 ` Chao Gao
2022-01-19 0:34 ` Sean Christopherson
2021-12-27 8:15 ` [PATCH 6/6] KVM: Do compatibility checks on hotplugged CPUs Chao Gao
2022-01-11 0:46 ` Sean Christopherson
2022-01-11 5:32 ` Chao Gao [this message]
2022-01-12 17:52 ` Sean Christopherson
2022-01-12 23:01 ` Jim Mattson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220111053205.GD2175@gao-cwp \
--to=chao.gao@intel.com \
--cc=kevin.tian@intel.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=pbonzini@redhat.com \
--cc=seanjc@google.com \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).