* Re: [PATCH v7 6/8] x86: tsc: Always Running Timer (ART) correlated clocksource
@ 2016-02-29 14:22 Christopher S. Hall
0 siblings, 0 replies; 9+ messages in thread
From: Christopher S. Hall @ 2016-02-29 14:22 UTC (permalink / raw)
To: tglx, luto, john.stultz
Cc: hpa, x86, linux-kernel, peterz, Christopher S. Hall
On Mon, 22 Feb 2016 18:49:10 -0800, Andy Lutomirski <luto@amacapital.net> wrote:
>> an earlier related thread Peter Zijlstra asserts that TSC adjust
>> "had better" be 0
>> (http://lkml.iu.edu/hypermail/linux/kernel/1507.3/03734.html).
The comment more or less reinforces your description below of such a
BIOS as "crappy".
> There are three interesting cases that I can think of:
>
> 1. Crappy BIOS that sets TSC_ADJUST. As the not-so-proud owner of a
> piece of crap motherboard that actively messes with the TSC, I don't
> trust BIOS.
I have added code that reads this offset MSR at initialization
time. This isn't bulletproof, but it covers the most common case where
the BIOS boots with TSC adjust set to something non-zero.
> 2. Hypervisors. What if we're running as a guest with an ART-using
> NIC passed through?
I don't see many use cases where virtualization would be
used. Currently this feature (ART) isn't available on server
platforms. Without explicit hypervisor support, there are many ways
this can go wrong. For now, I've added an extra check that disables
ART in a VM.
> 3. Hypothetical improved future kernel that politely uses TSC_ADJUST
> to keep the TSC from going backwards across suspend/resume.
This seems like a non-trivial change. Without knowing how this code is
going to work, it would be difficult to test the ART solution.
*** Commit message below:
On modern Intel systems TSC is derived from the new Always Running Timer
(ART). ART can be captured simultaneously with the capture of
audio and network device clocks, allowing a correlation between timebases
to be constructed. Upon capture, the driver converts the captured ART
value to the appropriate system clock using the correlated clocksource
mechanism.
On systems that support ART a new CPUID leaf (0x15) returns parameters
“m” and “n” such that:
TSC_value = (ART_value * m) / n + k [n >= 1]
[k is an offset that can be adjusted by a privileged agent. The
IA32_TSC_ADJUST MSR is an example of an interface to adjust k.
See 17.14.4 of the Intel SDM for more details]
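For a concrete feel for the formula: with hypothetical values m = 217,
n = 2 and k = 0, an ART reading of 1000 maps to
TSC_value = (1000 * 217) / 2 = 108500. These m/n values are made up
for illustration; real parts report their own ratio through CPUID
leaf 0x15.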
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
[jstultz: Tweaked to fix build issue, also reworked math for
64bit division on 32bit systems]
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
arch/x86/include/asm/cpufeature.h | 2 +-
arch/x86/include/asm/tsc.h | 2 ++
arch/x86/kernel/tsc.c | 59 +++++++++++++++++++++++++++++++++++++++
3 files changed, 62 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 7ad8c94..ff557b4 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -85,7 +85,7 @@
#define X86_FEATURE_P4 ( 3*32+ 7) /* "" P4 */
#define X86_FEATURE_CONSTANT_TSC ( 3*32+ 8) /* TSC ticks at a constant rate */
#define X86_FEATURE_UP ( 3*32+ 9) /* smp kernel running on up */
-/* free, was #define X86_FEATURE_FXSAVE_LEAK ( 3*32+10) * "" FXSAVE leaks FOP/FIP/FOP */
+#define X86_FEATURE_ART (3*32+10) /* Platform has always running timer (ART) */
#define X86_FEATURE_ARCH_PERFMON ( 3*32+11) /* Intel Architectural PerfMon */
#define X86_FEATURE_PEBS ( 3*32+12) /* Precise-Event Based Sampling */
#define X86_FEATURE_BTS ( 3*32+13) /* Branch Trace Store */
diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
index 6d7c547..174c421 100644
--- a/arch/x86/include/asm/tsc.h
+++ b/arch/x86/include/asm/tsc.h
@@ -29,6 +29,8 @@ static inline cycles_t get_cycles(void)
return rdtsc();
}
+extern struct system_counterval_t convert_art_to_tsc(cycle_t art);
+
extern void tsc_init(void);
extern void mark_tsc_unstable(char *reason);
extern int unsynchronized_tsc(void);
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 3d743da..8644b47 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -43,6 +43,11 @@ static DEFINE_STATIC_KEY_FALSE(__use_tsc);
int tsc_clocksource_reliable;
+static u32 art_to_tsc_numerator;
+static u32 art_to_tsc_denominator;
+static u64 art_to_tsc_offset;
+struct clocksource *art_related_clocksource;
+
/*
* Use a ring-buffer like data structure, where a writer advances the head by
* writing a new data entry and a reader advances the tail when it observes a
@@ -949,10 +954,41 @@ static struct notifier_block time_cpufreq_notifier_block = {
.notifier_call = time_cpufreq_notifier
};
+#define ART_CPUID_LEAF (0x15)
+#define ART_MIN_DENOMINATOR (1)
+
+
+/*
+ * If ART is present detect the numerator:denominator to convert to TSC
+ */
+static void detect_art(void)
+{
+ unsigned int unused[2];
+
+ if (boot_cpu_data.cpuid_level < ART_CPUID_LEAF)
+ return;
+
+ cpuid(ART_CPUID_LEAF, &art_to_tsc_denominator,
+ &art_to_tsc_numerator, unused, unused+1);
+
+ /* Don't enable ART in a VM, non-stop TSC required */
+ if (boot_cpu_has(X86_FEATURE_HYPERVISOR) ||
+ !boot_cpu_has(X86_FEATURE_NONSTOP_TSC) ||
+ art_to_tsc_denominator < ART_MIN_DENOMINATOR)
+ return;
+
+ if (rdmsrl_safe(MSR_IA32_TSC_ADJUST, &art_to_tsc_offset))
+ return;
+
+ /* Make this sticky over multiple CPU init calls */
+ setup_force_cpu_cap(X86_FEATURE_ART);
+}
+
static int __init cpufreq_tsc(void)
{
if (!cpu_has_tsc)
return 0;
+
if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC))
return 0;
cpufreq_register_notifier(&time_cpufreq_notifier_block,
@@ -1071,6 +1107,25 @@ int unsynchronized_tsc(void)
return 0;
}
+/*
+ * Convert ART to TSC given numerator/denominator found in detect_art()
+ */
+struct system_counterval_t convert_art_to_tsc(cycle_t art)
+{
+ u64 tmp, res, rem;
+
+ rem = do_div(art, art_to_tsc_denominator);
+
+ res = art * art_to_tsc_numerator;
+ tmp = rem * art_to_tsc_numerator;
+
+ do_div(tmp, art_to_tsc_denominator);
+ res += tmp + art_to_tsc_offset;
+
+ return (struct system_counterval_t) {.cs = art_related_clocksource,
+ .cycles = res};
+}
+EXPORT_SYMBOL(convert_art_to_tsc);
static void tsc_refine_calibration_work(struct work_struct *work);
static DECLARE_DELAYED_WORK(tsc_irqwork, tsc_refine_calibration_work);
@@ -1142,6 +1197,8 @@ static void tsc_refine_calibration_work(struct work_struct *work)
(unsigned long)tsc_khz % 1000);
out:
+ if (boot_cpu_has(X86_FEATURE_ART))
+ art_related_clocksource = &clocksource_tsc;
clocksource_register_khz(&clocksource_tsc, tsc_khz);
}
@@ -1235,6 +1292,8 @@ void __init tsc_init(void)
mark_tsc_unstable("TSCs unsynchronized");
check_system_tsc_reliable();
+
+ detect_art();
}
#ifdef CONFIG_SMP
--
2.1.4
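A note on the conversion math in convert_art_to_tsc() above: the ART
value is split into quotient and remainder before scaling, so the
fractional part is computed without truncation (rem < n and m both fit
in 32 bits, so rem * m cannot overflow a u64), and do_div() is used so
the 64-by-32 divisions also work on 32-bit kernels. Below is a minimal
userspace sketch of the same arithmetic, assuming made-up m/n/k values
(real values come from CPUID leaf 0x15 and the IA32_TSC_ADJUST MSR):

#include <stdint.h>
#include <stdio.h>

/* Hypothetical conversion parameters; real parts report m and n via
 * CPUID leaf 0x15 and k via IA32_TSC_ADJUST. */
#define ART_NUM 217ULL	/* m */
#define ART_DEN 2ULL	/* n */
#define ART_OFF 0ULL	/* k */

/* Same split as convert_art_to_tsc(): rem < ART_DEN <= 2^32 and
 * ART_NUM <= 2^32, so rem * ART_NUM cannot overflow a u64. */
static uint64_t art_to_tsc(uint64_t art)
{
	uint64_t quot = art / ART_DEN;
	uint64_t rem = art % ART_DEN;

	return quot * ART_NUM + (rem * ART_NUM) / ART_DEN + ART_OFF;
}

int main(void)
{
	/* 1000 ART cycles -> (1000 * 217) / 2 = 108500 TSC cycles */
	printf("%llu\n", (unsigned long long)art_to_tsc(1000));
	return 0;
}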
* Re: [PATCH v7 6/8] x86: tsc: Always Running Timer (ART) correlated clocksource
@ 2016-02-29 14:33 Christopher S. Hall
2016-03-02 1:11 ` Christopher Hall
0 siblings, 1 reply; 9+ messages in thread
From: Christopher S. Hall @ 2016-02-29 14:33 UTC (permalink / raw)
To: tglx, luto, john.stultz
Cc: hpa, x86, linux-kernel, peterz, Christopher S. Hall
On Mon, 22 Feb 2016 18:49:10 -0800, Andy Lutomirski <luto@amacapital.net> wrote:
>> an earlier related thread Peter Zijlstra asserts that TSC adjust
>> "had better" be 0
>> (http://lkml.iu.edu/hypermail/linux/kernel/1507.3/03734.html).
The comment more or less reinforces your description below of such a
BIOS as "crappy".
> There are three interesting cases that I can think of:
>
> 1. Crappy BIOS that sets TSC_ADJUST. As the not-so-proud owner of a
> piece of crap motherboard that actively messes with the TSC, I don't
> trust BIOS.
I have added code that reads this offset MSR at initialization
time. This isn't bulletproof, but it covers the most common case where
the BIOS boots with TSC adjust set to something non-zero.
> 2. Hypervisors. What if we're running as a guest with an ART-using
> NIC passed through?
I don't see many use cases where virtualization would be
used. Currently this feature (ART) isn't available on server
platforms. Without explicit hypervisor support, there are many ways
this can go wrong. For now, I've added an extra check that disables
ART in a VM.
> 3. Hypothetical improved future kernel that politely uses TSC_ADJUST
> to keep the TSC from going backwards across suspend/resume.
This seems like a non-trivial change. Without knowing how this code is
going to work, it would be difficult to test the ART solution.
[Removed errant carriage return]
*** Commit message below:
On modern Intel systems TSC is derived from the new Always Running Timer
(ART). ART can be captured simultaneously with the capture of
audio and network device clocks, allowing a correlation between timebases
to be constructed. Upon capture, the driver converts the captured ART
value to the appropriate system clock using the correlated clocksource
mechanism.
On systems that support ART a new CPUID leaf (0x15) returns parameters
“m” and “n” such that:
TSC_value = (ART_value * m) / n + k [n >= 1]
[k is an offset that can be adjusted by a privileged agent. The
IA32_TSC_ADJUST MSR is an example of an interface to adjust k.
See 17.14.4 of the Intel SDM for more details]
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
[jstultz: Tweaked to fix build issue, also reworked math for
64bit division on 32bit systems]
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
arch/x86/include/asm/cpufeature.h | 2 +-
arch/x86/include/asm/tsc.h | 2 ++
arch/x86/kernel/tsc.c | 58 +++++++++++++++++++++++++++++++++++++++
3 files changed, 61 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 7ad8c94..ff557b4 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -85,7 +85,7 @@
#define X86_FEATURE_P4 ( 3*32+ 7) /* "" P4 */
#define X86_FEATURE_CONSTANT_TSC ( 3*32+ 8) /* TSC ticks at a constant rate */
#define X86_FEATURE_UP ( 3*32+ 9) /* smp kernel running on up */
-/* free, was #define X86_FEATURE_FXSAVE_LEAK ( 3*32+10) * "" FXSAVE leaks FOP/FIP/FOP */
+#define X86_FEATURE_ART (3*32+10) /* Platform has always running timer (ART) */
#define X86_FEATURE_ARCH_PERFMON ( 3*32+11) /* Intel Architectural PerfMon */
#define X86_FEATURE_PEBS ( 3*32+12) /* Precise-Event Based Sampling */
#define X86_FEATURE_BTS ( 3*32+13) /* Branch Trace Store */
diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
index 6d7c547..174c421 100644
--- a/arch/x86/include/asm/tsc.h
+++ b/arch/x86/include/asm/tsc.h
@@ -29,6 +29,8 @@ static inline cycles_t get_cycles(void)
return rdtsc();
}
+extern struct system_counterval_t convert_art_to_tsc(cycle_t art);
+
extern void tsc_init(void);
extern void mark_tsc_unstable(char *reason);
extern int unsynchronized_tsc(void);
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 3d743da..a10cff1 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -43,6 +43,11 @@ static DEFINE_STATIC_KEY_FALSE(__use_tsc);
int tsc_clocksource_reliable;
+static u32 art_to_tsc_numerator;
+static u32 art_to_tsc_denominator;
+static u64 art_to_tsc_offset;
+struct clocksource *art_related_clocksource;
+
/*
* Use a ring-buffer like data structure, where a writer advances the head by
* writing a new data entry and a reader advances the tail when it observes a
@@ -949,6 +954,36 @@ static struct notifier_block time_cpufreq_notifier_block = {
.notifier_call = time_cpufreq_notifier
};
+#define ART_CPUID_LEAF (0x15)
+#define ART_MIN_DENOMINATOR (1)
+
+
+/*
+ * If ART is present detect the numerator:denominator to convert to TSC
+ */
+static void detect_art(void)
+{
+ unsigned int unused[2];
+
+ if (boot_cpu_data.cpuid_level < ART_CPUID_LEAF)
+ return;
+
+ cpuid(ART_CPUID_LEAF, &art_to_tsc_denominator,
+ &art_to_tsc_numerator, unused, unused+1);
+
+ /* Don't enable ART in a VM, non-stop TSC required */
+ if (boot_cpu_has(X86_FEATURE_HYPERVISOR) ||
+ !boot_cpu_has(X86_FEATURE_NONSTOP_TSC) ||
+ art_to_tsc_denominator < ART_MIN_DENOMINATOR)
+ return;
+
+ if (rdmsrl_safe(MSR_IA32_TSC_ADJUST, &art_to_tsc_offset))
+ return;
+
+ /* Make this sticky over multiple CPU init calls */
+ setup_force_cpu_cap(X86_FEATURE_ART);
+}
+
static int __init cpufreq_tsc(void)
{
if (!cpu_has_tsc)
@@ -1071,6 +1106,25 @@ int unsynchronized_tsc(void)
return 0;
}
+/*
+ * Convert ART to TSC given numerator/denominator found in detect_art()
+ */
+struct system_counterval_t convert_art_to_tsc(cycle_t art)
+{
+ u64 tmp, res, rem;
+
+ rem = do_div(art, art_to_tsc_denominator);
+
+ res = art * art_to_tsc_numerator;
+ tmp = rem * art_to_tsc_numerator;
+
+ do_div(tmp, art_to_tsc_denominator);
+ res += tmp + art_to_tsc_offset;
+
+ return (struct system_counterval_t) {.cs = art_related_clocksource,
+ .cycles = res};
+}
+EXPORT_SYMBOL(convert_art_to_tsc);
static void tsc_refine_calibration_work(struct work_struct *work);
static DECLARE_DELAYED_WORK(tsc_irqwork, tsc_refine_calibration_work);
@@ -1142,6 +1196,8 @@ static void tsc_refine_calibration_work(struct work_struct *work)
(unsigned long)tsc_khz % 1000);
out:
+ if (boot_cpu_has(X86_FEATURE_ART))
+ art_related_clocksource = &clocksource_tsc;
clocksource_register_khz(&clocksource_tsc, tsc_khz);
}
@@ -1235,6 +1291,8 @@ void __init tsc_init(void)
mark_tsc_unstable("TSCs unsynchronized");
check_system_tsc_reliable();
+
+ detect_art();
}
#ifdef CONFIG_SMP
--
2.1.4
* Re: [PATCH v7 6/8] x86: tsc: Always Running Timer (ART) correlated clocksource
2016-02-29 14:33 Christopher S. Hall
@ 2016-03-02 1:11 ` Christopher Hall
2016-03-02 23:59 ` Christopher Hall
2016-03-03 0:34 ` Andy Lutomirski
0 siblings, 2 replies; 9+ messages in thread
From: Christopher Hall @ 2016-03-02 1:11 UTC (permalink / raw)
To: tglx, luto, john.stultz, Christopher S. Hall
Cc: hpa, x86, linux-kernel, peterz
Andy,
On Mon, 29 Feb 2016 06:33:47 -0800, Christopher S. Hall
<christopher.s.hall@intel.com> wrote:
Do you have any comment on this? John needs your ACK. Thanks.
Chris
> *** Commit message below:
>
> On modern Intel systems TSC is derived from the new Always Running Timer
> (ART). ART can be captured simultaneously with the capture of
> audio and network device clocks, allowing a correlation between timebases
> to be constructed. Upon capture, the driver converts the captured ART
> value to the appropriate system clock using the correlated clocksource
> mechanism.
>
> On systems that support ART a new CPUID leaf (0x15) returns parameters
> “m” and “n” such that:
>
> TSC_value = (ART_value * m) / n + k [n >= 1]
>
> [k is an offset that can be adjusted by a privileged agent. The
> IA32_TSC_ADJUST MSR is an example of an interface to adjust k.
> See 17.14.4 of the Intel SDM for more details]
>
> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
> [jstultz: Tweaked to fix build issue, also reworked math for
> 64bit division on 32bit systems]
> Signed-off-by: John Stultz <john.stultz@linaro.org>
> ---
> arch/x86/include/asm/cpufeature.h | 2 +-
> arch/x86/include/asm/tsc.h | 2 ++
> arch/x86/kernel/tsc.c | 58
> +++++++++++++++++++++++++++++++++++++++
> 3 files changed, 61 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/cpufeature.h
> b/arch/x86/include/asm/cpufeature.h
> index 7ad8c94..ff557b4 100644
> --- a/arch/x86/include/asm/cpufeature.h
> +++ b/arch/x86/include/asm/cpufeature.h
> @@ -85,7 +85,7 @@
> #define X86_FEATURE_P4 ( 3*32+ 7) /* "" P4 */
> #define X86_FEATURE_CONSTANT_TSC ( 3*32+ 8) /* TSC ticks at a constant
> rate */
> #define X86_FEATURE_UP ( 3*32+ 9) /* smp kernel running on up */
> -/* free, was #define X86_FEATURE_FXSAVE_LEAK ( 3*32+10) * "" FXSAVE
> leaks FOP/FIP/FOP */
> +#define X86_FEATURE_ART (3*32+10) /* Platform has always running timer
> (ART) */
> #define X86_FEATURE_ARCH_PERFMON ( 3*32+11) /* Intel Architectural
> PerfMon */
> #define X86_FEATURE_PEBS ( 3*32+12) /* Precise-Event Based Sampling */
> #define X86_FEATURE_BTS ( 3*32+13) /* Branch Trace Store */
> diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
> index 6d7c547..174c421 100644
> --- a/arch/x86/include/asm/tsc.h
> +++ b/arch/x86/include/asm/tsc.h
> @@ -29,6 +29,8 @@ static inline cycles_t get_cycles(void)
> return rdtsc();
> }
> +extern struct system_counterval_t convert_art_to_tsc(cycle_t art);
> +
> extern void tsc_init(void);
> extern void mark_tsc_unstable(char *reason);
> extern int unsynchronized_tsc(void);
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index 3d743da..a10cff1 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -43,6 +43,11 @@ static DEFINE_STATIC_KEY_FALSE(__use_tsc);
> int tsc_clocksource_reliable;
> +static u32 art_to_tsc_numerator;
> +static u32 art_to_tsc_denominator;
> +static u64 art_to_tsc_offset;
> +struct clocksource *art_related_clocksource;
> +
> /*
> * Use a ring-buffer like data structure, where a writer advances the
> head by
> * writing a new data entry and a reader advances the tail when it
> observes a
> @@ -949,6 +954,36 @@ static struct notifier_block
> time_cpufreq_notifier_block = {
> .notifier_call = time_cpufreq_notifier
> };
> +#define ART_CPUID_LEAF (0x15)
> +#define ART_MIN_DENOMINATOR (1)
> +
> +
> +/*
> + * If ART is present detect the numerator:denominator to convert to TSC
> + */
> +static void detect_art(void)
> +{
> + unsigned int unused[2];
> +
> + if (boot_cpu_data.cpuid_level < ART_CPUID_LEAF)
> + return;
> +
> + cpuid(ART_CPUID_LEAF, &art_to_tsc_denominator,
> + &art_to_tsc_numerator, unused, unused+1);
> +
> + /* Don't enable ART in a VM, non-stop TSC required */
> + if (boot_cpu_has(X86_FEATURE_HYPERVISOR) ||
> + !boot_cpu_has(X86_FEATURE_NONSTOP_TSC) ||
> + art_to_tsc_denominator < ART_MIN_DENOMINATOR)
> + return;
> +
> + if (rdmsrl_safe(MSR_IA32_TSC_ADJUST, &art_to_tsc_offset))
> + return;
> +
> + /* Make this sticky over multiple CPU init calls */
> + setup_force_cpu_cap(X86_FEATURE_ART);
> +}
> +
> static int __init cpufreq_tsc(void)
> {
> if (!cpu_has_tsc)
> @@ -1071,6 +1106,25 @@ int unsynchronized_tsc(void)
> return 0;
> }
> +/*
> + * Convert ART to TSC given numerator/denominator found in detect_art()
> + */
> +struct system_counterval_t convert_art_to_tsc(cycle_t art)
> +{
> + u64 tmp, res, rem;
> +
> + rem = do_div(art, art_to_tsc_denominator);
> +
> + res = art * art_to_tsc_numerator;
> + tmp = rem * art_to_tsc_numerator;
> +
> + do_div(tmp, art_to_tsc_denominator);
> + res += tmp + art_to_tsc_offset;
> +
> + return (struct system_counterval_t) {.cs = art_related_clocksource,
> + .cycles = res};
> +}
> +EXPORT_SYMBOL(convert_art_to_tsc);
> static void tsc_refine_calibration_work(struct work_struct *work);
> static DECLARE_DELAYED_WORK(tsc_irqwork, tsc_refine_calibration_work);
> @@ -1142,6 +1196,8 @@ static void tsc_refine_calibration_work(struct
> work_struct *work)
> (unsigned long)tsc_khz % 1000);
> out:
> + if (boot_cpu_has(X86_FEATURE_ART))
> + art_related_clocksource = &clocksource_tsc;
> clocksource_register_khz(&clocksource_tsc, tsc_khz);
> }
> @@ -1235,6 +1291,8 @@ void __init tsc_init(void)
> mark_tsc_unstable("TSCs unsynchronized");
> check_system_tsc_reliable();
> +
> + detect_art();
> }
> #ifdef CONFIG_SMP
* Re: [PATCH v7 6/8] x86: tsc: Always Running Timer (ART) correlated clocksource
2016-03-02 1:11 ` Christopher Hall
@ 2016-03-02 23:59 ` Christopher Hall
2016-03-03 0:34 ` Andy Lutomirski
1 sibling, 0 replies; 9+ messages in thread
From: Christopher Hall @ 2016-03-02 23:59 UTC (permalink / raw)
To: tglx, luto, john.stultz, Christopher S. Hall
Cc: hpa, x86, linux-kernel, peterz
Hi John,
There was no response from Andy on this. Should I assume that this will
have to wait until 4.7? Is the plan to hold off on this plus the ptp and
e1000e patches? Thanks.
Chris
On Tue, 01 Mar 2016 17:11:13 -0800, Christopher Hall
<christopher.s.hall@intel.com> wrote:
> Andy,
>
> On Mon, 29 Feb 2016 06:33:47 -0800, Christopher S. Hall
> <christopher.s.hall@intel.com> wrote:
>
> Do you have any comment on this? John needs your ACK. Thanks.
>
> Chris
>
>> *** Commit message below:
>>
>> On modern Intel systems TSC is derived from the new Always Running Timer
>> (ART). ART can be captured simultaneously with the capture of
>> audio and network device clocks, allowing a correlation between
>> timebases
>> to be constructed. Upon capture, the driver converts the captured ART
>> value to the appropriate system clock using the correlated clocksource
>> mechanism.
>>
>> On systems that support ART a new CPUID leaf (0x15) returns parameters
>> “m” and “n” such that:
>>
>> TSC_value = (ART_value * m) / n + k [n >= 1]
>>
>> [k is an offset that can be adjusted by a privileged agent. The
>> IA32_TSC_ADJUST MSR is an example of an interface to adjust k.
>> See 17.14.4 of the Intel SDM for more details]
>>
>> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
>> Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
>> [jstultz: Tweaked to fix build issue, also reworked math for
>> 64bit division on 32bit systems]
>> Signed-off-by: John Stultz <john.stultz@linaro.org>
>> ---
>> arch/x86/include/asm/cpufeature.h | 2 +-
>> arch/x86/include/asm/tsc.h | 2 ++
>> arch/x86/kernel/tsc.c | 58
>> +++++++++++++++++++++++++++++++++++++++
>> 3 files changed, 61 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/include/asm/cpufeature.h
>> b/arch/x86/include/asm/cpufeature.h
>> index 7ad8c94..ff557b4 100644
>> --- a/arch/x86/include/asm/cpufeature.h
>> +++ b/arch/x86/include/asm/cpufeature.h
>> @@ -85,7 +85,7 @@
>> #define X86_FEATURE_P4 ( 3*32+ 7) /* "" P4 */
>> #define X86_FEATURE_CONSTANT_TSC ( 3*32+ 8) /* TSC ticks at a constant
>> rate */
>> #define X86_FEATURE_UP ( 3*32+ 9) /* smp kernel running on up */
>> -/* free, was #define X86_FEATURE_FXSAVE_LEAK ( 3*32+10) * "" FXSAVE
>> leaks FOP/FIP/FOP */
>> +#define X86_FEATURE_ART (3*32+10) /* Platform has always running
>> timer (ART) */
>> #define X86_FEATURE_ARCH_PERFMON ( 3*32+11) /* Intel Architectural
>> PerfMon */
>> #define X86_FEATURE_PEBS ( 3*32+12) /* Precise-Event Based Sampling */
>> #define X86_FEATURE_BTS ( 3*32+13) /* Branch Trace Store */
>> diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
>> index 6d7c547..174c421 100644
>> --- a/arch/x86/include/asm/tsc.h
>> +++ b/arch/x86/include/asm/tsc.h
>> @@ -29,6 +29,8 @@ static inline cycles_t get_cycles(void)
>> return rdtsc();
>> }
>> +extern struct system_counterval_t convert_art_to_tsc(cycle_t art);
>> +
>> extern void tsc_init(void);
>> extern void mark_tsc_unstable(char *reason);
>> extern int unsynchronized_tsc(void);
>> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
>> index 3d743da..a10cff1 100644
>> --- a/arch/x86/kernel/tsc.c
>> +++ b/arch/x86/kernel/tsc.c
>> @@ -43,6 +43,11 @@ static DEFINE_STATIC_KEY_FALSE(__use_tsc);
>> int tsc_clocksource_reliable;
>> +static u32 art_to_tsc_numerator;
>> +static u32 art_to_tsc_denominator;
>> +static u64 art_to_tsc_offset;
>> +struct clocksource *art_related_clocksource;
>> +
>> /*
>> * Use a ring-buffer like data structure, where a writer advances the
>> head by
>> * writing a new data entry and a reader advances the tail when it
>> observes a
>> @@ -949,6 +954,36 @@ static struct notifier_block
>> time_cpufreq_notifier_block = {
>> .notifier_call = time_cpufreq_notifier
>> };
>> +#define ART_CPUID_LEAF (0x15)
>> +#define ART_MIN_DENOMINATOR (1)
>> +
>> +
>> +/*
>> + * If ART is present detect the numerator:denominator to convert to TSC
>> + */
>> +static void detect_art(void)
>> +{
>> + unsigned int unused[2];
>> +
>> + if (boot_cpu_data.cpuid_level < ART_CPUID_LEAF)
>> + return;
>> +
>> + cpuid(ART_CPUID_LEAF, &art_to_tsc_denominator,
>> + &art_to_tsc_numerator, unused, unused+1);
>> +
>> + /* Don't enable ART in a VM, non-stop TSC required */
>> + if (boot_cpu_has(X86_FEATURE_HYPERVISOR) ||
>> + !boot_cpu_has(X86_FEATURE_NONSTOP_TSC) ||
>> + art_to_tsc_denominator < ART_MIN_DENOMINATOR)
>> + return;
>> +
>> + if (rdmsrl_safe(MSR_IA32_TSC_ADJUST, &art_to_tsc_offset))
>> + return;
>> +
>> + /* Make this sticky over multiple CPU init calls */
>> + setup_force_cpu_cap(X86_FEATURE_ART);
>> +}
>> +
>> static int __init cpufreq_tsc(void)
>> {
>> if (!cpu_has_tsc)
>> @@ -1071,6 +1106,25 @@ int unsynchronized_tsc(void)
>> return 0;
>> }
>> +/*
>> + * Convert ART to TSC given numerator/denominator found in detect_art()
>> + */
>> +struct system_counterval_t convert_art_to_tsc(cycle_t art)
>> +{
>> + u64 tmp, res, rem;
>> +
>> + rem = do_div(art, art_to_tsc_denominator);
>> +
>> + res = art * art_to_tsc_numerator;
>> + tmp = rem * art_to_tsc_numerator;
>> +
>> + do_div(tmp, art_to_tsc_denominator);
>> + res += tmp + art_to_tsc_offset;
>> +
>> + return (struct system_counterval_t) {.cs = art_related_clocksource,
>> + .cycles = res};
>> +}
>> +EXPORT_SYMBOL(convert_art_to_tsc);
>> static void tsc_refine_calibration_work(struct work_struct *work);
>> static DECLARE_DELAYED_WORK(tsc_irqwork, tsc_refine_calibration_work);
>> @@ -1142,6 +1196,8 @@ static void tsc_refine_calibration_work(struct
>> work_struct *work)
>> (unsigned long)tsc_khz % 1000);
>> out:
>> + if (boot_cpu_has(X86_FEATURE_ART))
>> + art_related_clocksource = &clocksource_tsc;
>> clocksource_register_khz(&clocksource_tsc, tsc_khz);
>> }
>> @@ -1235,6 +1291,8 @@ void __init tsc_init(void)
>> mark_tsc_unstable("TSCs unsynchronized");
>> check_system_tsc_reliable();
>> +
>> + detect_art();
>> }
>> #ifdef CONFIG_SMP
* Re: [PATCH v7 6/8] x86: tsc: Always Running Timer (ART) correlated clocksource
2016-03-02 1:11 ` Christopher Hall
2016-03-02 23:59 ` Christopher Hall
@ 2016-03-03 0:34 ` Andy Lutomirski
1 sibling, 0 replies; 9+ messages in thread
From: Andy Lutomirski @ 2016-03-03 0:34 UTC (permalink / raw)
To: Christopher S Hall
Cc: Peter Zijlstra, H. Peter Anvin, John Stultz,
linux-kernel@vger.kernel.org, X86 ML, Thomas Gleixner
On Mar 1, 2016 5:11 PM, "Christopher Hall" <christopher.s.hall@intel.com> wrote:
>
> Andy,
>
> On Mon, 29 Feb 2016 06:33:47 -0800, Christopher S. Hall <christopher.s.hall@intel.com> wrote:
>
> Do you have any comment on this? John needs your ACK. Thanks.
>
It's fine with me. I think Intel messed up the design of the feature
(there should have been an explicit way to read the offset directly),
but there's nothing you can do about that.
Sorry for the slow reply -- I had jury duty.
--Andy
* [PATCH v7 0/8] Patchset enabling hardware based cross-timestamps for next gen Intel platforms
@ 2016-02-12 20:25 Christopher S. Hall
2016-02-12 20:25 ` [PATCH v7 6/8] x86: tsc: Always Running Timer (ART) correlated clocksource Christopher S. Hall
0 siblings, 1 reply; 9+ messages in thread
From: Christopher S. Hall @ 2016-02-12 20:25 UTC (permalink / raw)
To: tglx, richardcochran, mingo, john.stultz, hpa, jeffrey.t.kirsher
Cc: Christopher S. Hall, x86, linux-kernel, intel-wired-lan, netdev,
kevin.b.stanton, kevin.j.clarke
Modern Intel hardware adds an Always Running Timer (ART) that allows
network and audio devices to precisely cross-timestamp their device
clock against the system clock, enabling a precise correlation of
device time and system time.
This patchset adds interfaces to the timekeeping code allowing drivers
to translate ART time to system time.
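As a rough sketch of how a driver consumes these interfaces (simplified
and hypothetical: foo_hw_crossread() stands in for device-specific
registers that latch the device clock and ART counter together, while
get_device_system_crosststamp() and struct system_device_crosststamp
come from the timekeeping patches in this series):

static int foo_get_syncdevicetime(ktime_t *device,
				  struct system_counterval_t *system,
				  void *ctx)
{
	struct foo_adapter *adapter = ctx;
	u64 art_value, device_ns;

	/* Latch the device clock and the ART counter atomically */
	foo_hw_crossread(adapter, &art_value, &device_ns);

	*device = ns_to_ktime(device_ns);
	*system = convert_art_to_tsc(art_value);	/* patch 6/8 */
	return 0;
}

static int foo_getcrosststamp(struct ptp_clock_info *ptp,
			      struct system_device_crosststamp *cts)
{
	struct foo_adapter *adapter =
		container_of(ptp, struct foo_adapter, ptp_clock_info);

	/* NULL: no history interpolation needed for a fast device */
	return get_device_system_crosststamp(foo_get_syncdevicetime,
					     adapter, NULL, cts);
}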
Changelog:
Changes from v6 to v7:
* Reordered several patches
* Removed correlated clocksource
* Fixed 32-bit compile issues
* Added multiplication overflow detection to history computation
* Added invariant tsc CPU feature - this is related to ART, but
is a separate feature
Changes from v5 to v6:
* Pulled supporting code for snapshotting, correlated
clocksource, and cycles to nanoseconds translation to separate
patches. Added patches are marked as NEW below. There is,
however, very little *actually* new code, just reorganized
code
* Renamed and moved clocksource change sequence to timekeeper
struct (out of tk_read_base)
* Renamed structs for system counter and synced device time
callback to system_counterval_t and sync_device_time_cb,
respectively
* Changed PTP cross-timestamp callback name to getcrosststamp
for consistency with the timekeeping code - corresponding
function name changes in e1000e driver
* Simplified PTP time calculations making use of ktime_to_* code
Changes from v4 to v5:
* Changed the history mechanism to interpolate system time using
a single historic system time pair (monotonic raw, realtime)
rather than implementing a precise history using shadow
timekeeper (see v4 changes). The advantage of this approach is
that the history can be arbitrarily long. This approach may
also be simpler in terms of coding. The major disadvantage is
that the realtime clock can be adjusted. When adjusted, the
realtime clock time (when interpolating from history) is
always approximate. In general, the longer the interpolation
period the larger the potential error. There isn't any error
interpolating the monotonic raw clock time.
* This patchset also addresses objections to the previous
patchset's overly complex correlated timestamp structure. This
patchset splits that structure into several smaller
structures. The correlated timestamp interface is renamed
cross timestamp to avoid any confusion with the correlated
clocksource.
* The correlated clocksource is separated from the cross
timestamp mechanism.
* Add monotonic raw to the PTP user interface
* Add e1000e driver configuration option that wraps Intel PCH
specific code
Changes from v3 to v4:
* Added a history mechanism to accommodate slower devices. In this
case the response time for timestamp reads from the Intel DSP
is too slow to be accommodated by the original correlated time
mechanism. The history mechanism turns shadow timekeeper into
an array where the history is stored.
Christopher S. Hall (8):
time: Add cycles to nanoseconds translation
time: Add timekeeping snapshot code capturing system time and counter
time: Remove duplicated code in ktime_get_raw_and_real()
time: Add driver cross timestamp interface for higher precision time
synchronization
time: Add history to cross timestamp interface supporting slower
devices
x86: tsc: Always Running Timer (ART) correlated clocksource
ptp: Add PTP_SYS_OFFSET_PRECISE for driver crosstimestamping
net: e1000e: Adds hardware supported cross timestamp on e1000e nic
Documentation/ptp/testptp.c | 6 +-
arch/x86/include/asm/cpufeature.h | 3 +-
arch/x86/include/asm/tsc.h | 2 +
arch/x86/kernel/cpu/scattered.c | 1 +
arch/x86/kernel/tsc.c | 50 +++++
drivers/net/ethernet/intel/Kconfig | 9 +
drivers/net/ethernet/intel/e1000e/defines.h | 5 +
drivers/net/ethernet/intel/e1000e/ptp.c | 85 ++++++++
drivers/net/ethernet/intel/e1000e/regs.h | 4 +
drivers/ptp/ptp_chardev.c | 27 +++
include/linux/pps_kernel.h | 17 +-
include/linux/ptp_clock_kernel.h | 8 +
include/linux/timekeeper_internal.h | 2 +
include/linux/timekeeping.h | 58 ++++++
include/uapi/linux/ptp_clock.h | 13 +-
kernel/time/timekeeping.c | 289 +++++++++++++++++++++++++---
16 files changed, 539 insertions(+), 40 deletions(-)
--
2.1.4
* [PATCH v7 6/8] x86: tsc: Always Running Timer (ART) correlated clocksource
2016-02-12 20:25 [PATCH v7 0/8] Patchset enabling hardware based cross-timestamps for next gen Intel platforms Christopher S. Hall
@ 2016-02-12 20:25 ` Christopher S. Hall
2016-02-18 21:11 ` Andy Lutomirski
0 siblings, 1 reply; 9+ messages in thread
From: Christopher S. Hall @ 2016-02-12 20:25 UTC (permalink / raw)
To: tglx, richardcochran, mingo, john.stultz, hpa, jeffrey.t.kirsher
Cc: Christopher S. Hall, x86, linux-kernel, intel-wired-lan, netdev,
kevin.b.stanton, kevin.j.clarke
On modern Intel systems TSC is derived from the new Always Running Timer
(ART). ART can be captured simultaneously with the capture of
audio and network device clocks, allowing a correlation between timebases
to be constructed. Upon capture, the driver converts the captured ART
value to the appropriate system clock using the correlated clocksource
mechanism.
On systems that support ART a new CPUID leaf (0x15) returns parameters
“m” and “n” such that:
TSC_value = (ART_value * m) / n + k [n >= 2]
[k is an offset that can be adjusted by a privileged agent. The
IA32_TSC_ADJUST MSR is an example of an interface to adjust k.
See 17.14.4 of the Intel SDM for more details]
Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
[jstultz: Tweaked to fix build issue, also reworked math for
64bit division on 32bit systems]
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
arch/x86/include/asm/cpufeature.h | 3 ++-
arch/x86/include/asm/tsc.h | 2 ++
arch/x86/kernel/cpu/scattered.c | 1 +
arch/x86/kernel/tsc.c | 50 +++++++++++++++++++++++++++++++++++++++
4 files changed, 55 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 7ad8c94..111b892 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -85,7 +85,7 @@
#define X86_FEATURE_P4 ( 3*32+ 7) /* "" P4 */
#define X86_FEATURE_CONSTANT_TSC ( 3*32+ 8) /* TSC ticks at a constant rate */
#define X86_FEATURE_UP ( 3*32+ 9) /* smp kernel running on up */
-/* free, was #define X86_FEATURE_FXSAVE_LEAK ( 3*32+10) * "" FXSAVE leaks FOP/FIP/FOP */
+#define X86_FEATURE_ART (3*32+10) /* Platform has always running timer (ART) */
#define X86_FEATURE_ARCH_PERFMON ( 3*32+11) /* Intel Architectural PerfMon */
#define X86_FEATURE_PEBS ( 3*32+12) /* Precise-Event Based Sampling */
#define X86_FEATURE_BTS ( 3*32+13) /* Branch Trace Store */
@@ -188,6 +188,7 @@
#define X86_FEATURE_CPB ( 7*32+ 2) /* AMD Core Performance Boost */
#define X86_FEATURE_EPB ( 7*32+ 3) /* IA32_ENERGY_PERF_BIAS support */
+#define X86_FEATURE_INVARIANT_TSC (7*32+4) /* Intel Invariant TSC */
#define X86_FEATURE_HW_PSTATE ( 7*32+ 8) /* AMD HW-PState */
#define X86_FEATURE_PROC_FEEDBACK ( 7*32+ 9) /* AMD ProcFeedbackInterface */
diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
index 6d7c547..174c421 100644
--- a/arch/x86/include/asm/tsc.h
+++ b/arch/x86/include/asm/tsc.h
@@ -29,6 +29,8 @@ static inline cycles_t get_cycles(void)
return rdtsc();
}
+extern struct system_counterval_t convert_art_to_tsc(cycle_t art);
+
extern void tsc_init(void);
extern void mark_tsc_unstable(char *reason);
extern int unsynchronized_tsc(void);
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index 8cb57df..af0ecd7 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -35,6 +35,7 @@ void init_scattered_cpuid_features(struct cpuinfo_x86 *c)
{ X86_FEATURE_APERFMPERF, CR_ECX, 0, 0x00000006, 0 },
{ X86_FEATURE_EPB, CR_ECX, 3, 0x00000006, 0 },
{ X86_FEATURE_HW_PSTATE, CR_EDX, 7, 0x80000007, 0 },
+ { X86_FEATURE_INVARIANT_TSC, CR_EDX, 8, 0x80000007, 0 },
{ X86_FEATURE_CPB, CR_EDX, 9, 0x80000007, 0 },
{ X86_FEATURE_PROC_FEEDBACK, CR_EDX,11, 0x80000007, 0 },
{ 0, 0, 0, 0, 0 }
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 3d743da..0ee3b62 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -43,6 +43,10 @@ static DEFINE_STATIC_KEY_FALSE(__use_tsc);
int tsc_clocksource_reliable;
+static u32 art_to_tsc_numerator;
+static u32 art_to_tsc_denominator;
+struct clocksource *art_related_clocksource;
+
/*
* Use a ring-buffer like data structure, where a writer advances the head by
* writing a new data entry and a reader advances the tail when it observes a
@@ -949,10 +953,35 @@ static struct notifier_block time_cpufreq_notifier_block = {
.notifier_call = time_cpufreq_notifier
};
+#define ART_CPUID_LEAF (0x15)
+/* The denominator will never be less than 2 */
+#define ART_MIN_DENOMINATOR (2)
+
+
+/*
+ * If ART is present detect the numerator:denominator to convert to TSC
+ */
+static void detect_art(void)
+{
+ unsigned int unused[2];
+
+ if (boot_cpu_data.cpuid_level >= ART_CPUID_LEAF) {
+ cpuid(ART_CPUID_LEAF, &art_to_tsc_denominator,
+ &art_to_tsc_numerator, unused, unused+1);
+
+ if (boot_cpu_has(X86_FEATURE_INVARIANT_TSC) &&
+ art_to_tsc_denominator >= ART_MIN_DENOMINATOR)
+ set_cpu_cap(&boot_cpu_data, X86_FEATURE_ART);
+ }
+}
+
static int __init cpufreq_tsc(void)
{
if (!cpu_has_tsc)
return 0;
+
+ detect_art();
+
if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC))
return 0;
cpufreq_register_notifier(&time_cpufreq_notifier_block,
@@ -1071,6 +1100,25 @@ int unsynchronized_tsc(void)
return 0;
}
+/*
+ * Convert ART to TSC given numerator/denominator found in detect_art()
+ */
+struct system_counterval_t convert_art_to_tsc(cycle_t art)
+{
+ u64 tmp, res, rem;
+
+ rem = do_div(art, art_to_tsc_denominator);
+
+ res = art * art_to_tsc_numerator;
+ tmp = rem * art_to_tsc_numerator;
+
+ do_div(tmp, art_to_tsc_denominator);
+ res += tmp;
+
+ return (struct system_counterval_t) {.cs = art_related_clocksource,
+ .cycles = res};
+}
+EXPORT_SYMBOL(convert_art_to_tsc);
static void tsc_refine_calibration_work(struct work_struct *work);
static DECLARE_DELAYED_WORK(tsc_irqwork, tsc_refine_calibration_work);
@@ -1142,6 +1190,8 @@ static void tsc_refine_calibration_work(struct work_struct *work)
(unsigned long)tsc_khz % 1000);
out:
+ if (boot_cpu_has(X86_FEATURE_ART))
+ art_related_clocksource = &clocksource_tsc;
clocksource_register_khz(&clocksource_tsc, tsc_khz);
}
--
2.1.4
* Re: [PATCH v7 6/8] x86: tsc: Always Running Timer (ART) correlated clocksource
2016-02-12 20:25 ` [PATCH v7 6/8] x86: tsc: Always Running Timer (ART) correlated clocksource Christopher S. Hall
@ 2016-02-18 21:11 ` Andy Lutomirski
2016-02-23 2:38 ` Christopher Hall
0 siblings, 1 reply; 9+ messages in thread
From: Andy Lutomirski @ 2016-02-18 21:11 UTC (permalink / raw)
To: Christopher S. Hall, tglx, richardcochran, mingo, john.stultz,
hpa, jeffrey.t.kirsher
Cc: x86, linux-kernel, intel-wired-lan, netdev, kevin.b.stanton,
kevin.j.clarke
On 02/12/2016 12:25 PM, Christopher S. Hall wrote:
> On modern Intel systems TSC is derived from the new Always Running Timer
> (ART). ART can be captured simultaneously with the capture of
> audio and network device clocks, allowing a correlation between timebases
> to be constructed. Upon capture, the driver converts the captured ART
> value to the appropriate system clock using the correlated clocksource
> mechanism.
>
> On systems that support ART a new CPUID leaf (0x15) returns parameters
> “m” and “n” such that:
>
> TSC_value = (ART_value * m) / n + k [n >= 2]
>
> [k is an offset that can be adjusted by a privileged agent. The
> IA32_TSC_ADJUST MSR is an example of an interface to adjust k.
> See 17.14.4 of the Intel SDM for more details]
>
> Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
> [jstultz: Tweaked to fix build issue, also reworked math for
> 64bit division on 32bit systems]
> Signed-off-by: John Stultz <john.stultz@linaro.org>
> ---
> arch/x86/include/asm/cpufeature.h | 3 ++-
> arch/x86/include/asm/tsc.h | 2 ++
> arch/x86/kernel/cpu/scattered.c | 1 +
> arch/x86/kernel/tsc.c | 50 +++++++++++++++++++++++++++++++++++++++
> 4 files changed, 55 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
> index 7ad8c94..111b892 100644
> --- a/arch/x86/include/asm/cpufeature.h
> +++ b/arch/x86/include/asm/cpufeature.h
> @@ -85,7 +85,7 @@
> #define X86_FEATURE_P4 ( 3*32+ 7) /* "" P4 */
> #define X86_FEATURE_CONSTANT_TSC ( 3*32+ 8) /* TSC ticks at a constant rate */
> #define X86_FEATURE_UP ( 3*32+ 9) /* smp kernel running on up */
> -/* free, was #define X86_FEATURE_FXSAVE_LEAK ( 3*32+10) * "" FXSAVE leaks FOP/FIP/FOP */
> +#define X86_FEATURE_ART (3*32+10) /* Platform has always running timer (ART) */
> #define X86_FEATURE_ARCH_PERFMON ( 3*32+11) /* Intel Architectural PerfMon */
> #define X86_FEATURE_PEBS ( 3*32+12) /* Precise-Event Based Sampling */
> #define X86_FEATURE_BTS ( 3*32+13) /* Branch Trace Store */
> @@ -188,6 +188,7 @@
>
> #define X86_FEATURE_CPB ( 7*32+ 2) /* AMD Core Performance Boost */
> #define X86_FEATURE_EPB ( 7*32+ 3) /* IA32_ENERGY_PERF_BIAS support */
> +#define X86_FEATURE_INVARIANT_TSC (7*32+4) /* Intel Invariant TSC */
How is this related to the rest of the patch?
> +/*
> + * Convert ART to TSC given numerator/denominator found in detect_art()
> + */
> +struct system_counterval_t convert_art_to_tsc(cycle_t art)
> +{
> + u64 tmp, res, rem;
> +
> + rem = do_div(art, art_to_tsc_denominator);
> +
> + res = art * art_to_tsc_numerator;
> + tmp = rem * art_to_tsc_numerator;
> +
> + do_div(tmp, art_to_tsc_denominator);
> + res += tmp;
> +
> + return (struct system_counterval_t) {.cs = art_related_clocksource,
> + .cycles = res};
The SDM and the patch description both mention an offset "k". Shouldn't
this code at least have a comment about how it deals with the k != 0 case?
--Andy
* Re: [PATCH v7 6/8] x86: tsc: Always Running Timer (ART) correlated clocksource
2016-02-18 21:11 ` Andy Lutomirski
@ 2016-02-23 2:38 ` Christopher Hall
2016-02-23 2:49 ` Andy Lutomirski
0 siblings, 1 reply; 9+ messages in thread
From: Christopher Hall @ 2016-02-23 2:38 UTC (permalink / raw)
To: tglx, richardcochran, mingo, john.stultz, hpa, jeffrey.t.kirsher,
Andy Lutomirski, Peter Zijlstra
Cc: x86, linux-kernel, intel-wired-lan, netdev, kevin.b.stanton,
kevin.j.clarke
On Thu, 18 Feb 2016 13:11:33 -0800, Andy Lutomirski <luto@kernel.org>
wrote:
>> +#define X86_FEATURE_INVARIANT_TSC (7*32+4) /* Intel Invariant TSC */
This is removed. It was basically an alias for NONSTOP_TSC and not needed.
>
>> +/*
>> + * Convert ART to TSC given numerator/denominator found in detect_art()
>> + */
>> +struct system_counterval_t convert_art_to_tsc(cycle_t art)
>> +{
>> + u64 tmp, res, rem;
>> +
>> + rem = do_div(art, art_to_tsc_denominator);
>> +
>> + res = art * art_to_tsc_numerator;
>> + tmp = rem * art_to_tsc_numerator;
>> +
>> + do_div(tmp, art_to_tsc_denominator);
>> + res += tmp;
>> +
>> + return (struct system_counterval_t) {.cs = art_related_clocksource,
>> + .cycles = res};
>
> The SDM and the patch description both mention an offset "k". Shouldn't
> this code at least have a comment about how it deals with the k != 0
> case?
I don't deal with the k != 0 case. I assume that the IA32 TSC adjust MSR
is 0 because it's almost always a *bad idea* to change it. I've discussed
this with a few other developers and there is some consensus on that.
From an earlier related thread, Peter Zijlstra asserts that TSC adjust
"had better" be 0
(http://lkml.iu.edu/hypermail/linux/kernel/1507.3/03734.html).

Do we really need to accommodate BIOSes that do this?
Chris
* Re: [PATCH v7 6/8] x86: tsc: Always Running Timer (ART) correlated clocksource
2016-02-23 2:38 ` Christopher Hall
@ 2016-02-23 2:49 ` Andy Lutomirski
0 siblings, 0 replies; 9+ messages in thread
From: Andy Lutomirski @ 2016-02-23 2:49 UTC (permalink / raw)
To: Christopher Hall
Cc: Thomas Gleixner, Richard Cochran, Ingo Molnar, John Stultz,
H. Peter Anvin, Kirsher, Jeffrey T, Andy Lutomirski,
Peter Zijlstra, X86 ML, linux-kernel@vger.kernel.org,
intel-wired-lan, Network Development, kevin.b.stanton,
kevin.j.clarke
On Mon, Feb 22, 2016 at 6:38 PM, Christopher Hall
<christopher.s.hall@intel.com> wrote:
> On Thu, 18 Feb 2016 13:11:33 -0800, Andy Lutomirski <luto@kernel.org> wrote:
>>>
>>> +#define X86_FEATURE_INVARIANT_TSC (7*32+4) /* Intel Invariant TSC */
>
>
> This is removed. It was basically an alias for NONSTOP_TSC and not needed.
>
>>
>>> +/*
>>> + * Convert ART to TSC given numerator/denominator found in detect_art()
>>> + */
>>> +struct system_counterval_t convert_art_to_tsc(cycle_t art)
>>> +{
>>> + u64 tmp, res, rem;
>>> +
>>> + rem = do_div(art, art_to_tsc_denominator);
>>> +
>>> + res = art * art_to_tsc_numerator;
>>> + tmp = rem * art_to_tsc_numerator;
>>> +
>>> + do_div(tmp, art_to_tsc_denominator);
>>> + res += tmp;
>>> +
>>> + return (struct system_counterval_t) {.cs =
>>> art_related_clocksource,
>>> + .cycles = res};
>>
>>
>> The SDM and the patch description both mention an offset "k". Shouldn't
>> this code at least have a comment about how it deals with the k != 0 case?
>
>
> I don't deal with the k != 0 case. I assume that IA32 TSC adjust MSR is 0
> because it's almost always a *bad idea* to change it. I've discussed this
> with a few other developers and there is some consensus agreeing. From an
> earlier related thread Peter Zijlstra asserts that TSC adjust "had
> better" be 0.(http://lkml.iu.edu/hypermail/linux/kernel/1507.3/03734.html).
I'm having trouble finding that in the link you sent.
>
> Do we really need to accommodate BIOS's that do this?
There are three interesting cases that I can think of:
1. Crappy BIOS that sets TSC_ADJUST. As the not-so-proud owner of a
piece of crap motherboard that actively messes with the TSC, I don't
trust BIOS.
2. Hypervisors. What if we're running as a guest with an ART-using
NIC passed through?
3. Hypothetical improved future kernel that politely uses TSC_ADJUST
to keep the TSC from going backwards across suspend/resume.
--Andy