public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] x86/hyperv: Set X86_FEATURE_TSC_RELIABLE unconditionally
@ 2024-11-12 18:18 Stanislav Kinsburskii
  2024-11-12 19:48 ` Michael Kelley
  0 siblings, 1 reply; 8+ messages in thread
From: Stanislav Kinsburskii @ 2024-11-12 18:18 UTC (permalink / raw)
  To: tglx, mingo, bp, dave.hansen
  Cc: kys, haiyangz, wei.liu, decui, x86, hpa, linux-hyperv,
	linux-kernel

Enable X86_FEATURE_TSC_RELIABLE by default as X86_FEATURE_TSC_RELIABLE is
independent from invariant TSC and should have never been gated by the
HV_ACCESS_TSC_INVARIANT privilege.

To elaborate, the HV_ACCESS_TSC_INVARIANT privilege allows certain types of
guests to opt-in to invariant TSC by writing the
HV_X64_MSR_TSC_INVARIANT_CONTROL register. Not all guests will have this
privilege and the hypervisor will automatically opt-in certain types of
guests (e.g. EXO partitions) to invariant TSC, but this functionality is
unrelated to the TSC reliability.

Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
---
 arch/x86/kernel/cpu/mshyperv.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index d18078834ded..14412afcc398 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -515,7 +515,7 @@ static void __init ms_hyperv_init_platform(void)
 	machine_ops.crash_shutdown = hv_machine_crash_shutdown;
 #endif
 #endif
-	if (ms_hyperv.features & HV_ACCESS_TSC_INVARIANT) {
+	if (ms_hyperv.features & HV_ACCESS_TSC_INVARIANT)
 		/*
 		 * Writing to synthetic MSR 0x40000118 updates/changes the
 		 * guest visible CPUIDs. Setting bit 0 of this MSR  enables
@@ -526,8 +526,8 @@ static void __init ms_hyperv_init_platform(void)
 		 * is called.
 		 */
 		wrmsrl(HV_X64_MSR_TSC_INVARIANT_CONTROL, HV_EXPOSE_INVARIANT_TSC);
-		setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
-	}
+
+	setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
 
 	/*
 	 * Generation 2 instances don't support reading the NMI status from



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* RE: [PATCH] x86/hyperv: Set X86_FEATURE_TSC_RELIABLE unconditionally
  2024-11-12 18:18 [PATCH] x86/hyperv: Set X86_FEATURE_TSC_RELIABLE unconditionally Stanislav Kinsburskii
@ 2024-11-12 19:48 ` Michael Kelley
  2024-11-20  0:51   ` Stanislav Kinsburskii
  0 siblings, 1 reply; 8+ messages in thread
From: Michael Kelley @ 2024-11-12 19:48 UTC (permalink / raw)
  To: Stanislav Kinsburskii, tglx@linutronix.de, mingo@redhat.com,
	bp@alien8.de, dave.hansen@linux.intel.com
  Cc: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
	decui@microsoft.com, x86@kernel.org, hpa@zytor.com,
	linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org

From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Sent: Tuesday, November 12, 2024 10:18 AM
> 
> Enable X86_FEATURE_TSC_RELIABLE by default as X86_FEATURE_TSC_RELIABLE is
> independent from invariant TSC and should have never been gated by the
> HV_ACCESS_TSC_INVARIANT privilege.

I think originally X86_FEATURE_TSC_RELIABLE was gated by the Hyper-V
TSC Invariant feature because otherwise VM live migration may cause
the TSC value reported by the RDTSC/RDTSCP instruction in the guest
to abruptly change frequency and value. In such cases, the TSC isn't
useable by the kernel or user space.

Enabling the Hyper-V TSC Invariant feature fixes that by using the
hardware scaling available in more recent processors to automatically
fixup the TSC value returned by RDTSC/RDTSCP in the guest.

Is there a practical problem that is fixed by always enabling
X86_FEATURE_TSC_RELIABLE?

Michael

> 
> To elaborate, the HV_ACCESS_TSC_INVARIANT privilege allows certain types of
> guests to opt-in to invariant TSC by writing the
> HV_X64_MSR_TSC_INVARIANT_CONTROL register. Not all guests will have this
> privilege and the hypervisor will automatically opt-in certain types of
> guests (e.g. EXO partitions) to invariant TSC, but this functionality is
> unrelated to the TSC reliability.
> 
> Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
> ---
>  arch/x86/kernel/cpu/mshyperv.c |    6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index d18078834ded..14412afcc398 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -515,7 +515,7 @@ static void __init ms_hyperv_init_platform(void)
>  	machine_ops.crash_shutdown = hv_machine_crash_shutdown;
>  #endif
>  #endif
> -	if (ms_hyperv.features & HV_ACCESS_TSC_INVARIANT) {
> +	if (ms_hyperv.features & HV_ACCESS_TSC_INVARIANT)
>  		/*
>  		 * Writing to synthetic MSR 0x40000118 updates/changes the
>  		 * guest visible CPUIDs. Setting bit 0 of this MSR  enables
> @@ -526,8 +526,8 @@ static void __init ms_hyperv_init_platform(void)
>  		 * is called.
>  		 */
>  		wrmsrl(HV_X64_MSR_TSC_INVARIANT_CONTROL, HV_EXPOSE_INVARIANT_TSC);
> -		setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
> -	}
> +
> +	setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
> 
>  	/*
>  	 * Generation 2 instances don't support reading the NMI status from
> 
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] x86/hyperv: Set X86_FEATURE_TSC_RELIABLE unconditionally
  2024-11-12 19:48 ` Michael Kelley
@ 2024-11-20  0:51   ` Stanislav Kinsburskii
  2024-11-22 18:33     ` Michael Kelley
  0 siblings, 1 reply; 8+ messages in thread
From: Stanislav Kinsburskii @ 2024-11-20  0:51 UTC (permalink / raw)
  To: Michael Kelley
  Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, kys@microsoft.com,
	haiyangz@microsoft.com, wei.liu@kernel.org, decui@microsoft.com,
	x86@kernel.org, hpa@zytor.com, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org

On Tue, Nov 12, 2024 at 07:48:06PM +0000, Michael Kelley wrote:
> From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Sent: Tuesday, November 12, 2024 10:18 AM
> > 
> > Enable X86_FEATURE_TSC_RELIABLE by default as X86_FEATURE_TSC_RELIABLE is
> > independent from invariant TSC and should have never been gated by the
> > HV_ACCESS_TSC_INVARIANT privilege.
> 
> I think originally X86_FEATURE_TSC_RELIABLE was gated by the Hyper-V
> TSC Invariant feature because otherwise VM live migration may cause
> the TSC value reported by the RDTSC/RDTSCP instruction in the guest
> to abruptly change frequency and value. In such cases, the TSC isn't
> useable by the kernel or user space.
> 
> Enabling the Hyper-V TSC Invariant feature fixes that by using the
> hardware scaling available in more recent processors to automatically
> fixup the TSC value returned by RDTSC/RDTSCP in the guest.
> 
> Is there a practical problem that is fixed by always enabling
> X86_FEATURE_TSC_RELIABLE?
> 

The particular problem is that HV_ACCESS_TSC_INVARIANT is not set for the
nested root, which in turn leads to keeping tsc clocksource watchdog
thread and TSC sycn check timer around.

But the live migration concern you raised is indeed still out there.

Thank you for the input Michael, I'll think more about it.

Stanislav

> Michael
> 
> > 
> > To elaborate, the HV_ACCESS_TSC_INVARIANT privilege allows certain types of
> > guests to opt-in to invariant TSC by writing the
> > HV_X64_MSR_TSC_INVARIANT_CONTROL register. Not all guests will have this
> > privilege and the hypervisor will automatically opt-in certain types of
> > guests (e.g. EXO partitions) to invariant TSC, but this functionality is
> > unrelated to the TSC reliability.
> > 
> > Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
> > ---
> >  arch/x86/kernel/cpu/mshyperv.c |    6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> > index d18078834ded..14412afcc398 100644
> > --- a/arch/x86/kernel/cpu/mshyperv.c
> > +++ b/arch/x86/kernel/cpu/mshyperv.c
> > @@ -515,7 +515,7 @@ static void __init ms_hyperv_init_platform(void)
> >  	machine_ops.crash_shutdown = hv_machine_crash_shutdown;
> >  #endif
> >  #endif
> > -	if (ms_hyperv.features & HV_ACCESS_TSC_INVARIANT) {
> > +	if (ms_hyperv.features & HV_ACCESS_TSC_INVARIANT)
> >  		/*
> >  		 * Writing to synthetic MSR 0x40000118 updates/changes the
> >  		 * guest visible CPUIDs. Setting bit 0 of this MSR  enables
> > @@ -526,8 +526,8 @@ static void __init ms_hyperv_init_platform(void)
> >  		 * is called.
> >  		 */
> >  		wrmsrl(HV_X64_MSR_TSC_INVARIANT_CONTROL, HV_EXPOSE_INVARIANT_TSC);
> > -		setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
> > -	}
> > +
> > +	setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
> > 
> >  	/*
> >  	 * Generation 2 instances don't support reading the NMI status from
> > 
> > 
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH] x86/hyperv: Set X86_FEATURE_TSC_RELIABLE unconditionally
  2024-11-20  0:51   ` Stanislav Kinsburskii
@ 2024-11-22 18:33     ` Michael Kelley
  2024-11-25 22:24       ` Stanislav Kinsburskii
  0 siblings, 1 reply; 8+ messages in thread
From: Michael Kelley @ 2024-11-22 18:33 UTC (permalink / raw)
  To: Stanislav Kinsburskii
  Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, kys@microsoft.com,
	haiyangz@microsoft.com, wei.liu@kernel.org, decui@microsoft.com,
	x86@kernel.org, hpa@zytor.com, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org

From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Sent: Tuesday, November 19, 2024 4:51 PM
> 
> On Tue, Nov 12, 2024 at 07:48:06PM +0000, Michael Kelley wrote:
> > From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Sent: Tuesday,
> November 12, 2024 10:18 AM
> > >
> > > Enable X86_FEATURE_TSC_RELIABLE by default as X86_FEATURE_TSC_RELIABLE is
> > > independent from invariant TSC and should have never been gated by the
> > > HV_ACCESS_TSC_INVARIANT privilege.
> >
> > I think originally X86_FEATURE_TSC_RELIABLE was gated by the Hyper-V
> > TSC Invariant feature because otherwise VM live migration may cause
> > the TSC value reported by the RDTSC/RDTSCP instruction in the guest
> > to abruptly change frequency and value. In such cases, the TSC isn't
> > useable by the kernel or user space.
> >
> > Enabling the Hyper-V TSC Invariant feature fixes that by using the
> > hardware scaling available in more recent processors to automatically
> > fixup the TSC value returned by RDTSC/RDTSCP in the guest.
> >
> > Is there a practical problem that is fixed by always enabling
> > X86_FEATURE_TSC_RELIABLE?
> >
> 
> The particular problem is that HV_ACCESS_TSC_INVARIANT is not set for the
> nested root, which in turn leads to keeping tsc clocksource watchdog
> thread and TSC sycn check timer around.

I have trouble keeping all the different TSC "features" conceptually
separate. :-( The TSC frequency not changing (and the value not
abruptly jumping?) should already be represented by
X86_FEATURE_TSC_CONSTANT.  In the kernel, X86_FEATURE_TSC_RELIABLE
effectively only controls whether the TSC clocksource watchdog is
enabled, and in spite of the live migration foibles, I don't see a need
for that watchdog in a Hyper-V VM. So maybe it's OK to always set
X86_FEATURE_TSC_RELIABLE in a Hyper-V VM, as you have
proposed.

The "tsc_reliable" flag is also exposed to user space as part of the
/proc/cpuinfo "flags" output, so theoretically some user space
program could change behavior based on that flag. But that seems
a bit far-fetched. I know there are user space programs that check
the CPUID INVARIANT_TSC flag to know whether they can use
the raw RDTSC instruction output to do start/stop timing. The
Hyper-V TSC Invariant feature makes that work correctly, even
across live migrations.

Michael

> 
> But the live migration concern you raised is indeed still out there.
> 
> Thank you for the input Michael, I'll think more about it.
> 
> Stanislav
> 
> > Michael
> >
> > >
> > > To elaborate, the HV_ACCESS_TSC_INVARIANT privilege allows certain types of
> > > guests to opt-in to invariant TSC by writing the
> > > HV_X64_MSR_TSC_INVARIANT_CONTROL register. Not all guests will have this
> > > privilege and the hypervisor will automatically opt-in certain types of
> > > guests (e.g. EXO partitions) to invariant TSC, but this functionality is
> > > unrelated to the TSC reliability.
> > >
> > > Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
> > > ---
> > >  arch/x86/kernel/cpu/mshyperv.c |    6 +++---
> > >  1 file changed, 3 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> > > index d18078834ded..14412afcc398 100644
> > > --- a/arch/x86/kernel/cpu/mshyperv.c
> > > +++ b/arch/x86/kernel/cpu/mshyperv.c
> > > @@ -515,7 +515,7 @@ static void __init ms_hyperv_init_platform(void)
> > >  	machine_ops.crash_shutdown = hv_machine_crash_shutdown;
> > >  #endif
> > >  #endif
> > > -	if (ms_hyperv.features & HV_ACCESS_TSC_INVARIANT) {
> > > +	if (ms_hyperv.features & HV_ACCESS_TSC_INVARIANT)
> > >  		/*
> > >  		 * Writing to synthetic MSR 0x40000118 updates/changes the
> > >  		 * guest visible CPUIDs. Setting bit 0 of this MSR  enables
> > > @@ -526,8 +526,8 @@ static void __init ms_hyperv_init_platform(void)
> > >  		 * is called.
> > >  		 */
> > >  		wrmsrl(HV_X64_MSR_TSC_INVARIANT_CONTROL,
> HV_EXPOSE_INVARIANT_TSC);
> > > -		setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
> > > -	}
> > > +
> > > +	setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
> > >
> > >  	/*
> > >  	 * Generation 2 instances don't support reading the NMI status from
> > >
> > >
> >

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] x86/hyperv: Set X86_FEATURE_TSC_RELIABLE unconditionally
  2024-11-22 18:33     ` Michael Kelley
@ 2024-11-25 22:24       ` Stanislav Kinsburskii
  2024-11-26  6:11         ` Michael Kelley
  0 siblings, 1 reply; 8+ messages in thread
From: Stanislav Kinsburskii @ 2024-11-25 22:24 UTC (permalink / raw)
  To: Michael Kelley
  Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, kys@microsoft.com,
	haiyangz@microsoft.com, wei.liu@kernel.org, decui@microsoft.com,
	x86@kernel.org, hpa@zytor.com, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org

On Fri, Nov 22, 2024 at 06:33:12PM +0000, Michael Kelley wrote:
> From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Sent: Tuesday, November 19, 2024 4:51 PM
> > 
> > On Tue, Nov 12, 2024 at 07:48:06PM +0000, Michael Kelley wrote:
> > > From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Sent: Tuesday,
> > November 12, 2024 10:18 AM
> > > >
> > > > Enable X86_FEATURE_TSC_RELIABLE by default as X86_FEATURE_TSC_RELIABLE is
> > > > independent from invariant TSC and should have never been gated by the
> > > > HV_ACCESS_TSC_INVARIANT privilege.
> > >
> > > I think originally X86_FEATURE_TSC_RELIABLE was gated by the Hyper-V
> > > TSC Invariant feature because otherwise VM live migration may cause
> > > the TSC value reported by the RDTSC/RDTSCP instruction in the guest
> > > to abruptly change frequency and value. In such cases, the TSC isn't
> > > useable by the kernel or user space.
> > >
> > > Enabling the Hyper-V TSC Invariant feature fixes that by using the
> > > hardware scaling available in more recent processors to automatically
> > > fixup the TSC value returned by RDTSC/RDTSCP in the guest.
> > >
> > > Is there a practical problem that is fixed by always enabling
> > > X86_FEATURE_TSC_RELIABLE?
> > >
> > 
> > The particular problem is that HV_ACCESS_TSC_INVARIANT is not set for the
> > nested root, which in turn leads to keeping tsc clocksource watchdog
> > thread and TSC sycn check timer around.
> 
> I have trouble keeping all the different TSC "features" conceptually
> separate. :-( The TSC frequency not changing (and the value not
> abruptly jumping?) should already be represented by
> X86_FEATURE_TSC_CONSTANT.  In the kernel, X86_FEATURE_TSC_RELIABLE
> effectively only controls whether the TSC clocksource watchdog is
> enabled, and in spite of the live migration foibles, I don't see a need
> for that watchdog in a Hyper-V VM. So maybe it's OK to always set
> X86_FEATURE_TSC_RELIABLE in a Hyper-V VM, as you have
> proposed.
> 
> The "tsc_reliable" flag is also exposed to user space as part of the
> /proc/cpuinfo "flags" output, so theoretically some user space
> program could change behavior based on that flag. But that seems
> a bit far-fetched. I know there are user space programs that check
> the CPUID INVARIANT_TSC flag to know whether they can use
> the raw RDTSC instruction output to do start/stop timing. The
> Hyper-V TSC Invariant feature makes that work correctly, even
> across live migrations.
> 

It sounds to me that if X86_FEATURE_TSC_CONSTANT is available on Hyper-V, then we
can set X86_FEATURE_TSC_RELIABLE.
Is it what you are saying?

Stanislav

> Michael
> 
> > 
> > But the live migration concern you raised is indeed still out there.
> > 
> > Thank you for the input Michael, I'll think more about it.
> > 
> > Stanislav
> > 
> > > Michael
> > >
> > > >
> > > > To elaborate, the HV_ACCESS_TSC_INVARIANT privilege allows certain types of
> > > > guests to opt-in to invariant TSC by writing the
> > > > HV_X64_MSR_TSC_INVARIANT_CONTROL register. Not all guests will have this
> > > > privilege and the hypervisor will automatically opt-in certain types of
> > > > guests (e.g. EXO partitions) to invariant TSC, but this functionality is
> > > > unrelated to the TSC reliability.
> > > >
> > > > Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
> > > > ---
> > > >  arch/x86/kernel/cpu/mshyperv.c |    6 +++---
> > > >  1 file changed, 3 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> > > > index d18078834ded..14412afcc398 100644
> > > > --- a/arch/x86/kernel/cpu/mshyperv.c
> > > > +++ b/arch/x86/kernel/cpu/mshyperv.c
> > > > @@ -515,7 +515,7 @@ static void __init ms_hyperv_init_platform(void)
> > > >  	machine_ops.crash_shutdown = hv_machine_crash_shutdown;
> > > >  #endif
> > > >  #endif
> > > > -	if (ms_hyperv.features & HV_ACCESS_TSC_INVARIANT) {
> > > > +	if (ms_hyperv.features & HV_ACCESS_TSC_INVARIANT)
> > > >  		/*
> > > >  		 * Writing to synthetic MSR 0x40000118 updates/changes the
> > > >  		 * guest visible CPUIDs. Setting bit 0 of this MSR  enables
> > > > @@ -526,8 +526,8 @@ static void __init ms_hyperv_init_platform(void)
> > > >  		 * is called.
> > > >  		 */
> > > >  		wrmsrl(HV_X64_MSR_TSC_INVARIANT_CONTROL,
> > HV_EXPOSE_INVARIANT_TSC);
> > > > -		setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
> > > > -	}
> > > > +
> > > > +	setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
> > > >
> > > >  	/*
> > > >  	 * Generation 2 instances don't support reading the NMI status from
> > > >
> > > >
> > >

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH] x86/hyperv: Set X86_FEATURE_TSC_RELIABLE unconditionally
  2024-11-25 22:24       ` Stanislav Kinsburskii
@ 2024-11-26  6:11         ` Michael Kelley
  2024-12-11 18:00           ` Stanislav Kinsburskii
  0 siblings, 1 reply; 8+ messages in thread
From: Michael Kelley @ 2024-11-26  6:11 UTC (permalink / raw)
  To: Stanislav Kinsburskii
  Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, kys@microsoft.com,
	haiyangz@microsoft.com, wei.liu@kernel.org, decui@microsoft.com,
	x86@kernel.org, hpa@zytor.com, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org

From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Sent: Monday, November 25, 2024 2:25 PM
> 
> On Fri, Nov 22, 2024 at 06:33:12PM +0000, Michael Kelley wrote:
> > From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Sent: Tuesday,
> November 19, 2024 4:51 PM
> > >
> > > On Tue, Nov 12, 2024 at 07:48:06PM +0000, Michael Kelley wrote:
> > > > From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Sent: Tuesday,
> > > November 12, 2024 10:18 AM
> > > > >
> > > > > Enable X86_FEATURE_TSC_RELIABLE by default as X86_FEATURE_TSC_RELIABLE
> is
> > > > > independent from invariant TSC and should have never been gated by the
> > > > > HV_ACCESS_TSC_INVARIANT privilege.
> > > >
> > > > I think originally X86_FEATURE_TSC_RELIABLE was gated by the Hyper-V
> > > > TSC Invariant feature because otherwise VM live migration may cause
> > > > the TSC value reported by the RDTSC/RDTSCP instruction in the guest
> > > > to abruptly change frequency and value. In such cases, the TSC isn't
> > > > useable by the kernel or user space.
> > > >
> > > > Enabling the Hyper-V TSC Invariant feature fixes that by using the
> > > > hardware scaling available in more recent processors to automatically
> > > > fixup the TSC value returned by RDTSC/RDTSCP in the guest.
> > > >
> > > > Is there a practical problem that is fixed by always enabling
> > > > X86_FEATURE_TSC_RELIABLE?
> > > >
> > >
> > > The particular problem is that HV_ACCESS_TSC_INVARIANT is not set for the
> > > nested root, which in turn leads to keeping tsc clocksource watchdog
> > > thread and TSC sycn check timer around.
> >
> > I have trouble keeping all the different TSC "features" conceptually
> > separate. :-( The TSC frequency not changing (and the value not
> > abruptly jumping?) should already be represented by
> > X86_FEATURE_TSC_CONSTANT.  In the kernel, X86_FEATURE_TSC_RELIABLE
> > effectively only controls whether the TSC clocksource watchdog is
> > enabled, and in spite of the live migration foibles, I don't see a need
> > for that watchdog in a Hyper-V VM. So maybe it's OK to always set
> > X86_FEATURE_TSC_RELIABLE in a Hyper-V VM, as you have
> > proposed.
> >
> > The "tsc_reliable" flag is also exposed to user space as part of the
> > /proc/cpuinfo "flags" output, so theoretically some user space
> > program could change behavior based on that flag. But that seems
> > a bit far-fetched. I know there are user space programs that check
> > the CPUID INVARIANT_TSC flag to know whether they can use
> > the raw RDTSC instruction output to do start/stop timing. The
> > Hyper-V TSC Invariant feature makes that work correctly, even
> > across live migrations.
> >
> 
> It sounds to me that if X86_FEATURE_TSC_CONSTANT is available
> on Hyper-V, then we can set X86_FEATURE_TSC_RELIABLE.
> Is it what you are saying?
> 

No. Sorry I wasn't clear. X86_FEATURE_TSC_CONSTANT will
be set only when the Hyper-V TSC Invariant feature is enabled, so
tying X86_FEATURE_TSC_RELIABLE to that is what happens now.

What I'm suggesting is to take your patch "as is". In other words,
always enable X86_FEATURE_TSC_RELIABLE. From what I can tell,
TSC_RELIABLE is only used to disable the TSC watchdog. Since I
can't see a use for the TSC watchdog in a VM, always setting
TSC_RELIABLE probably makes sense. TSC_RELIABLE doesn't
say anything about whether the TSC frequency might change, such
as across a VM live migration. TSC_CONSTANT is what tells you that
the frequency won't change.

My caveat is that I don't know the history of TSC_RELIABLE. I
don't see any documentation on the details of what it is supposed
to convey, especially in a VM. Maybe someone on the "To:" list
who knows for sure can confirm what I'm thinking.

Michael





^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] x86/hyperv: Set X86_FEATURE_TSC_RELIABLE unconditionally
  2024-11-26  6:11         ` Michael Kelley
@ 2024-12-11 18:00           ` Stanislav Kinsburskii
  2024-12-11 22:22             ` Michael Kelley
  0 siblings, 1 reply; 8+ messages in thread
From: Stanislav Kinsburskii @ 2024-12-11 18:00 UTC (permalink / raw)
  To: Michael Kelley
  Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, kys@microsoft.com,
	haiyangz@microsoft.com, wei.liu@kernel.org, decui@microsoft.com,
	x86@kernel.org, hpa@zytor.com, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org

On Tue, Nov 26, 2024 at 06:11:10AM +0000, Michael Kelley wrote:
> From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Sent: Monday, November 25, 2024 2:25 PM
> > 
> > On Fri, Nov 22, 2024 at 06:33:12PM +0000, Michael Kelley wrote:
> > > From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Sent: Tuesday,
> > November 19, 2024 4:51 PM
> > > >
> > > > On Tue, Nov 12, 2024 at 07:48:06PM +0000, Michael Kelley wrote:
> > > > > From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Sent: Tuesday,
> > > > November 12, 2024 10:18 AM
> > > > > >
> > > > > > Enable X86_FEATURE_TSC_RELIABLE by default as X86_FEATURE_TSC_RELIABLE
> > is
> > > > > > independent from invariant TSC and should have never been gated by the
> > > > > > HV_ACCESS_TSC_INVARIANT privilege.
> > > > >
> > > > > I think originally X86_FEATURE_TSC_RELIABLE was gated by the Hyper-V
> > > > > TSC Invariant feature because otherwise VM live migration may cause
> > > > > the TSC value reported by the RDTSC/RDTSCP instruction in the guest
> > > > > to abruptly change frequency and value. In such cases, the TSC isn't
> > > > > useable by the kernel or user space.
> > > > >
> > > > > Enabling the Hyper-V TSC Invariant feature fixes that by using the
> > > > > hardware scaling available in more recent processors to automatically
> > > > > fixup the TSC value returned by RDTSC/RDTSCP in the guest.
> > > > >
> > > > > Is there a practical problem that is fixed by always enabling
> > > > > X86_FEATURE_TSC_RELIABLE?
> > > > >
> > > >
> > > > The particular problem is that HV_ACCESS_TSC_INVARIANT is not set for the
> > > > nested root, which in turn leads to keeping tsc clocksource watchdog
> > > > thread and TSC sycn check timer around.
> > >
> > > I have trouble keeping all the different TSC "features" conceptually
> > > separate. :-( The TSC frequency not changing (and the value not
> > > abruptly jumping?) should already be represented by
> > > X86_FEATURE_TSC_CONSTANT.  In the kernel, X86_FEATURE_TSC_RELIABLE
> > > effectively only controls whether the TSC clocksource watchdog is
> > > enabled, and in spite of the live migration foibles, I don't see a need
> > > for that watchdog in a Hyper-V VM. So maybe it's OK to always set
> > > X86_FEATURE_TSC_RELIABLE in a Hyper-V VM, as you have
> > > proposed.
> > >
> > > The "tsc_reliable" flag is also exposed to user space as part of the
> > > /proc/cpuinfo "flags" output, so theoretically some user space
> > > program could change behavior based on that flag. But that seems
> > > a bit far-fetched. I know there are user space programs that check
> > > the CPUID INVARIANT_TSC flag to know whether they can use
> > > the raw RDTSC instruction output to do start/stop timing. The
> > > Hyper-V TSC Invariant feature makes that work correctly, even
> > > across live migrations.
> > >
> > 
> > It sounds to me that if X86_FEATURE_TSC_CONSTANT is available
> > on Hyper-V, then we can set X86_FEATURE_TSC_RELIABLE.
> > Is it what you are saying?
> > 
> 
> No. Sorry I wasn't clear. X86_FEATURE_TSC_CONSTANT will
> be set only when the Hyper-V TSC Invariant feature is enabled, so
> tying X86_FEATURE_TSC_RELIABLE to that is what happens now.
> 
> What I'm suggesting is to take your patch "as is". In other words,
> always enable X86_FEATURE_TSC_RELIABLE. From what I can tell,
> TSC_RELIABLE is only used to disable the TSC watchdog. Since I
> can't see a use for the TSC watchdog in a VM, always setting
> TSC_RELIABLE probably makes sense. TSC_RELIABLE doesn't
> say anything about whether the TSC frequency might change, such
> as across a VM live migration. TSC_CONSTANT is what tells you that
> the frequency won't change.
> 
> My caveat is that I don't know the history of TSC_RELIABLE. I
> don't see any documentation on the details of what it is supposed
> to convey, especially in a VM. Maybe someone on the "To:" list
> who knows for sure can confirm what I'm thinking.
> 
> Michael

We had a long ionternal discussion with hypervisor folks and it looks
like we will propose a more robust solution to go forward.
The hypervisor will provide an additional CPUID bit, which guarantees
TSC reliability (including across live migration).

I'll prepare an updated version.

Thanks,
Stanislav

> 
> 
> 
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH] x86/hyperv: Set X86_FEATURE_TSC_RELIABLE unconditionally
  2024-12-11 18:00           ` Stanislav Kinsburskii
@ 2024-12-11 22:22             ` Michael Kelley
  0 siblings, 0 replies; 8+ messages in thread
From: Michael Kelley @ 2024-12-11 22:22 UTC (permalink / raw)
  To: Stanislav Kinsburskii
  Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, kys@microsoft.com,
	haiyangz@microsoft.com, wei.liu@kernel.org, decui@microsoft.com,
	x86@kernel.org, hpa@zytor.com, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org

From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Sent: Wednesday, December 11, 2024 10:01 AM
> 
> On Tue, Nov 26, 2024 at 06:11:10AM +0000, Michael Kelley wrote:
> > From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Sent: Monday, November 25, 2024 2:25 PM
> > >
> > > On Fri, Nov 22, 2024 at 06:33:12PM +0000, Michael Kelley wrote:
> > > > From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Sent: Tuesday, November 19, 2024 4:51 PM
> > > > >
> > > > > On Tue, Nov 12, 2024 at 07:48:06PM +0000, Michael Kelley wrote:
> > > > > > From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Sent: Tuesday,
> > > > > November 12, 2024 10:18 AM
> > > > > > >
> > > > > > > Enable X86_FEATURE_TSC_RELIABLE by default as X86_FEATURE_TSC_RELIABLE is
> > > > > > > independent from invariant TSC and should have never been gated by the
> > > > > > > HV_ACCESS_TSC_INVARIANT privilege.
> > > > > >
> > > > > > I think originally X86_FEATURE_TSC_RELIABLE was gated by the Hyper-V
> > > > > > TSC Invariant feature because otherwise VM live migration may cause
> > > > > > the TSC value reported by the RDTSC/RDTSCP instruction in the guest
> > > > > > to abruptly change frequency and value. In such cases, the TSC isn't
> > > > > > useable by the kernel or user space.
> > > > > >
> > > > > > Enabling the Hyper-V TSC Invariant feature fixes that by using the
> > > > > > hardware scaling available in more recent processors to automatically
> > > > > > fixup the TSC value returned by RDTSC/RDTSCP in the guest.
> > > > > >
> > > > > > Is there a practical problem that is fixed by always enabling
> > > > > > X86_FEATURE_TSC_RELIABLE?
> > > > > >
> > > > >
> > > > > The particular problem is that HV_ACCESS_TSC_INVARIANT is not set for the
> > > > > nested root, which in turn leads to keeping tsc clocksource watchdog
> > > > > thread and TSC sycn check timer around.
> > > >
> > > > I have trouble keeping all the different TSC "features" conceptually
> > > > separate. :-( The TSC frequency not changing (and the value not
> > > > abruptly jumping?) should already be represented by
> > > > X86_FEATURE_TSC_CONSTANT.  In the kernel, X86_FEATURE_TSC_RELIABLE
> > > > effectively only controls whether the TSC clocksource watchdog is
> > > > enabled, and in spite of the live migration foibles, I don't see a need
> > > > for that watchdog in a Hyper-V VM. So maybe it's OK to always set
> > > > X86_FEATURE_TSC_RELIABLE in a Hyper-V VM, as you have
> > > > proposed.
> > > >
> > > > The "tsc_reliable" flag is also exposed to user space as part of the
> > > > /proc/cpuinfo "flags" output, so theoretically some user space
> > > > program could change behavior based on that flag. But that seems
> > > > a bit far-fetched. I know there are user space programs that check
> > > > the CPUID INVARIANT_TSC flag to know whether they can use
> > > > the raw RDTSC instruction output to do start/stop timing. The
> > > > Hyper-V TSC Invariant feature makes that work correctly, even
> > > > across live migrations.
> > > >
> > >
> > > It sounds to me that if X86_FEATURE_TSC_CONSTANT is available
> > > on Hyper-V, then we can set X86_FEATURE_TSC_RELIABLE.
> > > Is it what you are saying?
> > >
> >
> > No. Sorry I wasn't clear. X86_FEATURE_TSC_CONSTANT will
> > be set only when the Hyper-V TSC Invariant feature is enabled, so
> > tying X86_FEATURE_TSC_RELIABLE to that is what happens now.
> >
> > What I'm suggesting is to take your patch "as is". In other words,
> > always enable X86_FEATURE_TSC_RELIABLE. From what I can tell,
> > TSC_RELIABLE is only used to disable the TSC watchdog. Since I
> > can't see a use for the TSC watchdog in a VM, always setting
> > TSC_RELIABLE probably makes sense. TSC_RELIABLE doesn't
> > say anything about whether the TSC frequency might change, such
> > as across a VM live migration. TSC_CONSTANT is what tells you that
> > the frequency won't change.
> >
> > My caveat is that I don't know the history of TSC_RELIABLE. I
> > don't see any documentation on the details of what it is supposed
> > to convey, especially in a VM. Maybe someone on the "To:" list
> > who knows for sure can confirm what I'm thinking.
> >
> > Michael
> 
> We had a long ionternal discussion with hypervisor folks and it looks
> like we will propose a more robust solution to go forward.
> The hypervisor will provide an additional CPUID bit, which guarantees
> TSC reliability (including across live migration).

Interesting. Is there a text description of what this new CPUID
bit means? I would be curious, and in particular how it is
different from the various existing CPUID bits describing TSC
behavior.

Michael

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2024-12-11 22:22 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-12 18:18 [PATCH] x86/hyperv: Set X86_FEATURE_TSC_RELIABLE unconditionally Stanislav Kinsburskii
2024-11-12 19:48 ` Michael Kelley
2024-11-20  0:51   ` Stanislav Kinsburskii
2024-11-22 18:33     ` Michael Kelley
2024-11-25 22:24       ` Stanislav Kinsburskii
2024-11-26  6:11         ` Michael Kelley
2024-12-11 18:00           ` Stanislav Kinsburskii
2024-12-11 22:22             ` Michael Kelley

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox