public inbox for linux-kernel@vger.kernel.org
* Re: [PATCH] x86: Export tsc related information in sysfs
@ 2010-05-20 19:19 Brian Bloniarz
  2010-05-22  2:03 ` john stultz
From: Brian Bloniarz @ 2010-05-20 19:19 UTC
  To: Ingo Molnar
  Cc: Thomas Gleixner, Peter Zijlstra, Andi Kleen, H. Peter Anvin,
	Dan Magenheimer, Arjan van de Ven, Venkatesh Pallipadi,
	chris.mason, linux-kernel

Ingo Molnar wrote:
> The point is for the kernel to not be complicit in 
> practices that are technically not reliable.

One use case that hasn't been discussed is when userspace needs this info to
calibrate the TSC.

Take NTP as an example. It does a pretty good job of observing the drift in
gettimeofday() against a reference clock and correcting for it. This seems
to work well even when gettimeofday() uses the TSC. But it assumes that the
drift changes slowly.

That goes out the window on reboot, because the kernel only spends 25ms on
TSC<->PIT calibration and the value of tsc_khz can vary a lot from boot to
boot. Then NTP starts up and reads a drift value from /var/lib/ntp/ntp.drift
that it *thinks* is accurate. In our experience, it'll then spend up to 48
hours doing god knows what to the clock until it converges on the real
drift at the new tsc_khz. Initscripts could correct for the kernel's
recalibration, but tsc_khz isn't exported to userspace.

So it's too bad that it can't be exported somehow. The TSC on our
machines has proven to be stable for all intents and purposes: I just
checked 25 of my machines, most have uptimes of >200 days, and all of
them still have current_clocksource=tsc. After NTP or PTPd has been
running for a while, things converge, but being unable to reboot is a
headache.
Using the HPET for gettimeofday() would be impractical for performance
reasons.
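
The initscript correction mentioned above could be sketched roughly as
follows. This is only an illustration under assumptions that are not part
of this message: that the tsc_khz from the previous boot was saved
somewhere, that the stored drift is a frequency correction in ppm applied
on top of the kernel's clock, and that the hardware frequency itself did
not change across the reboot. rescale_drift_ppm() is a hypothetical
helper, not part of NTP or the kernel.

```c
/*
 * Hypothetical helper: rescale a saved NTP-style drift value after the
 * kernel has recalibrated tsc_khz on reboot.
 *
 * Assumed model: the uncorrected clock runs at f_true / f_est, and the
 * daemon's correction c (in ppm) satisfies
 *     (f_true / f_est) * (1 + c * 1e-6) == 1
 * Since f_true is unchanged across the reboot:
 *     1 + c_new * 1e-6 == (1 + c_old * 1e-6) * (new_khz / old_khz)
 * Flip the sign if your daemon records drift the other way round.
 */
double rescale_drift_ppm(double old_ppm, unsigned int old_khz,
			 unsigned int new_khz)
{
	double ratio = (double)new_khz / (double)old_khz;

	return ((1.0 + old_ppm * 1e-6) * ratio - 1.0) * 1e6;
}
```

For example, if the kernel measured 2000000 kHz on the previous boot and
2000020 kHz on this one, a stored drift of 0 ppm becomes roughly +10 ppm.
Note that tsc_khz only has 1 kHz resolution, which is about 0.5 ppm at
2 GHz, so the rescaled value is a starting point rather than an exact one.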

* [PATCH] x86: Export tsc related information in sysfs
@ 2010-05-15  1:40 Venkatesh Pallipadi
  2010-05-15  9:57 ` Andi Kleen
  2010-05-15 12:35 ` Jaswinder Singh Rajput
From: Venkatesh Pallipadi @ 2010-05-15  1:40 UTC
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin
  Cc: Chris Mason, linux-kernel, Venkatesh Pallipadi, Dan Magenheimer

From: Dan Magenheimer <dan.magenheimer@oracle.com>

The calibrated value of tsc_khz and the tsc_stable flag (the result of
the TSC warp test) are useful to any application that wants to use the
TSC directly. Export this read-only information in sysfs.

Signed-off-by: Venkatesh Pallipadi <venki@google.com>
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
---
 arch/x86/kernel/tsc.c |   76 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 76 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 9faf91a..24dd484 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -10,6 +10,7 @@
 #include <linux/clocksource.h>
 #include <linux/percpu.h>
 #include <linux/timex.h>
+#include <linux/sysdev.h>
 
 #include <asm/hpet.h>
 #include <asm/timer.h>
@@ -857,6 +858,81 @@ static void __init init_tsc_clocksource(void)
 	clocksource_register(&clocksource_tsc);
 }
 
+#ifdef CONFIG_SYSFS
+/*
+ * Export TSC related info to user land. This reflects kernel usage of TSC
+ * as hints to userspace users of TSC. The read_only info provided here:
+ * - tsc_stable: 1 implies system has TSC that always counts at a constant
+ *   rate, is synchronized across CPUs and has passed the kernel warp test.
+ * - tsc_khz: TSC frequency in kHz.
+ * - tsc_mult and tsc_shift: multiplier and shift to optimally convert
+ *   TSC delta to ns; ns = ((u64) delta * mult) >> shift
+ */
+
+#define define_show_var_function(_name, _var) \
+static ssize_t show_##_name( \
+	struct sys_device *dev, struct sysdev_attribute *attr, char *buf) \
+{ \
+	return sprintf(buf, "%u\n", (unsigned int) _var);\
+}
+
+define_show_var_function(tsc_stable, !tsc_unstable);
+define_show_var_function(tsc_khz, tsc_khz);
+define_show_var_function(tsc_mult, clocksource_tsc.mult);
+define_show_var_function(tsc_shift, clocksource_tsc.shift);
+
+static SYSDEV_ATTR(tsc_stable, 0444, show_tsc_stable, NULL);
+static SYSDEV_ATTR(tsc_khz, 0444, show_tsc_khz, NULL);
+static SYSDEV_ATTR(tsc_mult, 0444, show_tsc_mult, NULL);
+static SYSDEV_ATTR(tsc_shift, 0444, show_tsc_shift, NULL);
+
+static struct sysdev_attribute *tsc_attrs[] = {
+	&attr_tsc_stable,
+	&attr_tsc_khz,
+	&attr_tsc_mult,
+	&attr_tsc_shift,
+};
+
+static struct sysdev_class tsc_sysclass = {
+	.name = "tsc",
+};
+
+static struct sys_device device_tsc = {
+	.id = 0,
+	.cls = &tsc_sysclass,
+};
+
+static int __init init_tsc_sysfs(void)
+{
+	int err, i = 0;
+
+	err = sysdev_class_register(&tsc_sysclass);
+	if (err)
+		return err;
+
+	err = sysdev_register(&device_tsc);
+	if (err)
+		goto fail;
+
+	for (i = 0; i < ARRAY_SIZE(tsc_attrs); i++) {
+		err = sysdev_create_file(&device_tsc, tsc_attrs[i]);
+		if (err)
+			goto fail;
+	}
+
+	return 0;
+
+fail:
+	while (--i >= 0)
+		sysdev_remove_file(&device_tsc, tsc_attrs[i]);
+
+	sysdev_unregister(&device_tsc);
+	sysdev_class_unregister(&tsc_sysclass);
+	return err;
+}
+device_initcall(init_tsc_sysfs);
+#endif
+
 #ifdef CONFIG_X86_64
 /*
  * calibrate_cpu is used on systems with fixed rate TSCs to determine
-- 
1.7.0.1
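
The conversion described in the patch comment,
ns = ((u64) delta * mult) >> shift, could be used from userspace roughly
like this. This is a minimal sketch, not code from the patch:
khz_to_mult() and tsc_delta_to_ns() are hypothetical names, and the mult
and shift values are assumed to come from the tsc_mult and tsc_shift
files that the patch proposes. khz_to_mult() follows the same formula as
the kernel's clocksource_khz2mult() and is included only to show how
mult relates to tsc_khz.

```c
#include <stdint.h>

/* mult = (1000000 << shift) / khz, rounded to nearest. */
uint32_t khz_to_mult(uint32_t khz, uint32_t shift)
{
	uint64_t tmp = (uint64_t)1000000 << shift;

	tmp += khz / 2;
	return (uint32_t)(tmp / khz);
}

/*
 * Convert a TSC cycle delta to nanoseconds. The 64-bit product
 * overflows once delta exceeds UINT64_MAX / mult, so this is only
 * suitable for short intervals.
 */
uint64_t tsc_delta_to_ns(uint64_t delta, uint32_t mult, uint32_t shift)
{
	return (delta * mult) >> shift;
}
```

With tsc_khz = 2000000 and shift = 22, mult comes out as 2097152, and a
delta of 2000000 cycles converts to 1000000 ns (1 ms).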

