All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vitaly Kuznetsov <vkuznets@redhat.com>
To: "Michael Kelley (LINUX)" <mikelley@microsoft.com>
Cc: KY Srinivasan <kys@microsoft.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	"wei.liu@kernel.org" <wei.liu@kernel.org>,
	Dexuan Cui <decui@microsoft.com>,
	"catalin.marinas@arm.com" <catalin.marinas@arm.com>,
	"will@kernel.org" <will@kernel.org>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"bp@alien8.de" <bp@alien8.de>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	"x86@kernel.org" <x86@kernel.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>
Subject: RE: [PATCH 1/2] x86/hyperv: Fix hyperv_pcpu_input_arg handling when CPUs go online/offline
Date: Tue, 23 May 2023 10:04:47 +0200	[thread overview]
Message-ID: <87wn0zxylc.fsf@redhat.com> (raw)
In-Reply-To: <BYAPR21MB16889F38274F6F7691DB85F8D743A@BYAPR21MB1688.namprd21.prod.outlook.com>

"Michael Kelley (LINUX)" <mikelley@microsoft.com> writes:

> From: Vitaly Kuznetsov <vkuznets@redhat.com> Sent: Monday, May 22, 2023 1:56 AM
>> 
>> Michael Kelley <mikelley@microsoft.com> writes:
>> 
>
> [snip]
>
>> > diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
>> > index 0f1001d..696004a 100644
>> > --- a/include/linux/cpuhotplug.h
>> > +++ b/include/linux/cpuhotplug.h
>> > @@ -201,6 +201,7 @@ enum cpuhp_state {
>> >  	/* Online section invoked on the hotplugged CPU from the hotplug thread */
>> >  	CPUHP_AP_ONLINE_IDLE,
>> >  	CPUHP_AP_KVM_ONLINE,
>> > +	CPUHP_AP_HYPERV_ONLINE,
>> 
>> (Cc: KVM)
>> 
>> I would very much prefer we swap the order with KVM here: hv_cpu_init()
>> allocates and sets vCPU's VP assist page which is used by KVM on
>> CPUHP_AP_KVM_ONLINE:
>> 
>> kvm_online_cpu() -> __hardware_enable_nolock() ->
>> kvm_arch_hardware_enable() -> vmx_hardware_enable():
>> 
>>         /*
>>          * This can happen if we hot-added a CPU but failed to allocate
>>          * VP assist page for it.
>>          */
>> 	if (kvm_is_using_evmcs() && !hv_get_vp_assist_page(cpu))
>> 		return -EFAULT;
>> 
>> With the change, this is likely broken.
>> 
>> FWIF, KVM also needs current vCPU's VP index (also set by hv_cpu_init())
>> through __kvm_x86_vendor_init() -> set_hv_tscchange_cb() call chain but
>> this happens upon KVM module load so CPU hotplug ordering should not
>> matter as this always happens on a CPU which is already online.
>> 
>> Generally, as 'KVM on Hyper-V' is a supported scenario, doing Hyper-V
>> init before KVM's (and vice versa on teardown) makes sense.
>> 
>> >  	CPUHP_AP_SCHED_WAIT_EMPTY,
>> >  	CPUHP_AP_SMPBOOT_THREADS,
>> >  	CPUHP_AP_X86_VDSO_VMA_ONLINE,
>
> I have no objection to putting CPUHP_AP_HYPERV_ONLINE first.  I did
> not give any consideration to a possible dependency between the two. :-(
> But note that in current code, hv_cpu_init() is running on the
> CPUHP_AP_ONLINE_DYN state, which is also after KVM.  So this patch
> doesn't change the order w.r.t. KVM and the VP assist page.   Are things
> already broken for KVM, or is something else happening that makes it
> work anyway?

This looks like a currently present bug indeed so I had to refresh my
memory. 

KVM's CPUHP_AP_KVM_STARTIN is registered with
cpuhp_setup_state_nocalls() which means that kvm_starting_cpu() is not
called for all currently present CPUs. Moreover, kvm_init() is called
when KVM vendor module (e.g. kvm_intel) is loaded and as KVM is normally
built as module, this happens much later than Hyper-V's
hyperv_init(). vmx_hardware_enable() is actually called from
hardware_enable_all() which happens when the first KVM VM is created.

This all changes when a CPU is hotplugged. The order CPUHP_AP_* from
cpuhp_state is respected and KVM's kvm_starting_cpu() is called _before_
Hyper-V's hv_cpu_init() even before your patch. We don't see the bug
just because Hyper-V doesn't(?) support CPU hotplug. Just sending a CPU
offline with e.g. "echo 0 > /sys/devices/system/cpu/cpuX/online" is not
the same as once allocated, VP assist page persists for all non-root
Hyper-V partitions. I don't know if KVM is supported for Hyper-V root
partitions but in case it is, we may have a problem.

Let's put CPUHP_AP_HYPERV_ONLINE before KVM's CPUHP_AP_KVM_ONLINE
explicitly so CPU hotplug scenario is handled correctly, even if current
Hyper-V versions don't support it.

-- 
Vitaly


WARNING: multiple messages have this Message-ID (diff)
From: Vitaly Kuznetsov <vkuznets@redhat.com>
To: "Michael Kelley (LINUX)" <mikelley@microsoft.com>
Cc: KY Srinivasan <kys@microsoft.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	"wei.liu@kernel.org" <wei.liu@kernel.org>,
	Dexuan Cui <decui@microsoft.com>,
	"catalin.marinas@arm.com" <catalin.marinas@arm.com>,
	"will@kernel.org" <will@kernel.org>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"bp@alien8.de" <bp@alien8.de>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"x86@kernel.org" <x86@kernel.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>
Subject: RE: [PATCH 1/2] x86/hyperv: Fix hyperv_pcpu_input_arg handling when CPUs go online/offline
Date: Tue, 23 May 2023 10:04:47 +0200	[thread overview]
Message-ID: <87wn0zxylc.fsf@redhat.com> (raw)
In-Reply-To: <BYAPR21MB16889F38274F6F7691DB85F8D743A@BYAPR21MB1688.namprd21.prod.outlook.com>

"Michael Kelley (LINUX)" <mikelley@microsoft.com> writes:

> From: Vitaly Kuznetsov <vkuznets@redhat.com> Sent: Monday, May 22, 2023 1:56 AM
>> 
>> Michael Kelley <mikelley@microsoft.com> writes:
>> 
>
> [snip]
>
>> > diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
>> > index 0f1001d..696004a 100644
>> > --- a/include/linux/cpuhotplug.h
>> > +++ b/include/linux/cpuhotplug.h
>> > @@ -201,6 +201,7 @@ enum cpuhp_state {
>> >  	/* Online section invoked on the hotplugged CPU from the hotplug thread */
>> >  	CPUHP_AP_ONLINE_IDLE,
>> >  	CPUHP_AP_KVM_ONLINE,
>> > +	CPUHP_AP_HYPERV_ONLINE,
>> 
>> (Cc: KVM)
>> 
>> I would very much prefer we swap the order with KVM here: hv_cpu_init()
>> allocates and sets vCPU's VP assist page which is used by KVM on
>> CPUHP_AP_KVM_ONLINE:
>> 
>> kvm_online_cpu() -> __hardware_enable_nolock() ->
>> kvm_arch_hardware_enable() -> vmx_hardware_enable():
>> 
>>         /*
>>          * This can happen if we hot-added a CPU but failed to allocate
>>          * VP assist page for it.
>>          */
>> 	if (kvm_is_using_evmcs() && !hv_get_vp_assist_page(cpu))
>> 		return -EFAULT;
>> 
>> With the change, this is likely broken.
>> 
>> FWIF, KVM also needs current vCPU's VP index (also set by hv_cpu_init())
>> through __kvm_x86_vendor_init() -> set_hv_tscchange_cb() call chain but
>> this happens upon KVM module load so CPU hotplug ordering should not
>> matter as this always happens on a CPU which is already online.
>> 
>> Generally, as 'KVM on Hyper-V' is a supported scenario, doing Hyper-V
>> init before KVM's (and vice versa on teardown) makes sense.
>> 
>> >  	CPUHP_AP_SCHED_WAIT_EMPTY,
>> >  	CPUHP_AP_SMPBOOT_THREADS,
>> >  	CPUHP_AP_X86_VDSO_VMA_ONLINE,
>
> I have no objection to putting CPUHP_AP_HYPERV_ONLINE first.  I did
> not give any consideration to a possible dependency between the two. :-(
> But note that in current code, hv_cpu_init() is running on the
> CPUHP_AP_ONLINE_DYN state, which is also after KVM.  So this patch
> doesn't change the order w.r.t. KVM and the VP assist page.   Are things
> already broken for KVM, or is something else happening that makes it
> work anyway?

This looks like a currently present bug indeed so I had to refresh my
memory. 

KVM's CPUHP_AP_KVM_STARTIN is registered with
cpuhp_setup_state_nocalls() which means that kvm_starting_cpu() is not
called for all currently present CPUs. Moreover, kvm_init() is called
when KVM vendor module (e.g. kvm_intel) is loaded and as KVM is normally
built as module, this happens much later than Hyper-V's
hyperv_init(). vmx_hardware_enable() is actually called from
hardware_enable_all() which happens when the first KVM VM is created.

This all changes when a CPU is hotplugged. The order CPUHP_AP_* from
cpuhp_state is respected and KVM's kvm_starting_cpu() is called _before_
Hyper-V's hv_cpu_init() even before your patch. We don't see the bug
just because Hyper-V doesn't(?) support CPU hotplug. Just sending a CPU
offline with e.g. "echo 0 > /sys/devices/system/cpu/cpuX/online" is not
the same as once allocated, VP assist page persists for all non-root
Hyper-V partitions. I don't know if KVM is supported for Hyper-V root
partitions but in case it is, we may have a problem.

Let's put CPUHP_AP_HYPERV_ONLINE before KVM's CPUHP_AP_KVM_ONLINE
explicitly so CPU hotplug scenario is handled correctly, even if current
Hyper-V versions don't support it.

-- 
Vitaly


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2023-05-23  8:13 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-19 14:33 [PATCH 1/2] x86/hyperv: Fix hyperv_pcpu_input_arg handling when CPUs go online/offline Michael Kelley
2023-05-19 14:33 ` Michael Kelley
2023-05-19 14:33 ` [PATCH 2/2] arm64/hyperv: Use CPUHP_AP_HYPERV_ONLINE state to fix CPU online sequencing Michael Kelley
2023-05-19 14:33   ` Michael Kelley
2023-05-23  4:05   ` Dexuan Cui
2023-05-23  4:05     ` Dexuan Cui
2023-05-22  8:56 ` [PATCH 1/2] x86/hyperv: Fix hyperv_pcpu_input_arg handling when CPUs go online/offline Vitaly Kuznetsov
2023-05-22  8:56   ` Vitaly Kuznetsov
2023-05-22 14:13   ` Michael Kelley (LINUX)
2023-05-22 14:13     ` Michael Kelley (LINUX)
2023-05-23  8:04     ` Vitaly Kuznetsov [this message]
2023-05-23  8:04       ` Vitaly Kuznetsov
2023-05-23 14:34       ` Michael Kelley (LINUX)
2023-05-23 14:34         ` Michael Kelley (LINUX)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87wn0zxylc.fsf@redhat.com \
    --to=vkuznets@redhat.com \
    --cc=bp@alien8.de \
    --cc=catalin.marinas@arm.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=decui@microsoft.com \
    --cc=haiyangz@microsoft.com \
    --cc=hpa@zytor.com \
    --cc=kvm@vger.kernel.org \
    --cc=kys@microsoft.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-hyperv@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mikelley@microsoft.com \
    --cc=mingo@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=wei.liu@kernel.org \
    --cc=will@kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.