public inbox for linux-kernel@vger.kernel.org
* [RFC][patch 00/12] clocksource / timekeeping rework V2
@ 2009-07-29 13:41 Martin Schwidefsky
  2009-07-29 13:41 ` [RFC][patch 01/12] introduce timekeeping_leap_insert Martin Schwidefsky
                   ` (12 more replies)
  0 siblings, 13 replies; 32+ messages in thread
From: Martin Schwidefsky @ 2009-07-29 13:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Thomas Gleixner, john stultz, Daniel Walker

Greetings,
version 2 of the clocksource / timekeeping cleanup patches. The series
has grown quite a bit: what started as a simple idea to replace the
tick-based clocksource update with stop_machine is now a full-fledged
code rework.

The code is working on s390 and on my Athlon system at home, which has
a broken tsc clocksource:

[    0.000000] Fast TSC calibration using PIT
[    0.000341] hpet clockevent registered
[    0.000343] HPET: 3 timers in total, 0 timers will be used for per-cpu timer
[    0.204021] hpet0: at MMIO 0xfefff000, IRQs 2, 8, 31
[    0.204027] hpet0: 3 comparators, 32-bit 25.000000 MHz counter
[    0.208007] Switching to clock hpet
[    0.211544] Switched to high resolution mode on CPU 0
[    0.211960] Switched to high resolution mode on CPU 1
[    8.000020] Clocksource tsc unstable (delta = -172310085 ns)

So the clocksource switch via stop_machine and the clocksource watchdog
are working. I'm keeping my fingers crossed that nothing else breaks.

The patch set is based on today's upstream tree plus the patches from
the tip tree. If anyone wants to try the patches, pull from the
master branch of
    git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip

There is still more room for improvement. Some sore points are:

1) The cycle_last value is still in struct clocksource. It should
   be in struct timekeeper, but the check against cycle_last in the
   read function of the TSC clocksource makes that hard.
2) read_persistent_clock returns only whole seconds. With a really
   good initial time source this is not very precise.
   read_persistent_clock should return a struct timespec instead.
3) xtime, raw_time, total_sleep_time, timekeeping_suspended, jiffies,
   the ntp state and probably a few other values may be better located
   in struct timekeeper as well.

and a few more I forgot.
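Point 2 could look like this. A minimal userspace sketch of the interface change, with illustrative helper names and a made-up RTC value (neither is taken from the patches); only the granularity difference is shown:

```c
#include <assert.h>
#include <time.h>

/* Old style: the persistent clock (e.g. a battery-backed RTC) is read
 * as whole seconds, so any sub-second resolution the hardware has is
 * truncated at boot. */
static unsigned long read_persistent_clock_sec(void)
{
	return 1248874870UL;	/* illustrative RTC value */
}

/* Proposed style: fill in a struct timespec so a sub-second capable
 * time source keeps its precision. */
static void read_persistent_clock_ts(struct timespec *ts)
{
	ts->tv_sec = 1248874870L;
	ts->tv_nsec = 500000000L;	/* the 0.5 s the old API would drop */
}
```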

Many thanks to John, who pushed me in the right direction.
-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [RFC][patch 01/12] introduce timekeeping_leap_insert
  2009-07-29 13:41 [RFC][patch 00/12] clocksource / timekeeping rework V2 Martin Schwidefsky
@ 2009-07-29 13:41 ` Martin Schwidefsky
  2009-07-30 21:02   ` john stultz
  2009-07-29 13:41 ` [RFC][patch 02/12] remove clocksource inline functions Martin Schwidefsky
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 32+ messages in thread
From: Martin Schwidefsky @ 2009-07-29 13:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Thomas Gleixner, john stultz, Daniel Walker

[-- Attachment #1: leap_insert.diff --]
[-- Type: text/plain, Size: 2213 bytes --]

From: john stultz <johnstul@us.ibm.com>

---
 include/linux/time.h      |    1 +
 kernel/time/ntp.c         |    7 ++-----
 kernel/time/timekeeping.c |    7 +++++++
 3 files changed, 10 insertions(+), 5 deletions(-)

Index: linux-2.6/include/linux/time.h
===================================================================
--- linux-2.6.orig/include/linux/time.h
+++ linux-2.6/include/linux/time.h
@@ -147,6 +147,7 @@ extern struct timespec timespec_trunc(st
 extern int timekeeping_valid_for_hres(void);
 extern void update_wall_time(void);
 extern void update_xtime_cache(u64 nsec);
+extern void timekeeping_leap_insert(int leapsecond);
 
 struct tms;
 extern void do_sys_times(struct tms *);
Index: linux-2.6/kernel/time/ntp.c
===================================================================
--- linux-2.6.orig/kernel/time/ntp.c
+++ linux-2.6/kernel/time/ntp.c
@@ -194,8 +194,7 @@ static enum hrtimer_restart ntp_leap_sec
 	case TIME_OK:
 		break;
 	case TIME_INS:
-		xtime.tv_sec--;
-		wall_to_monotonic.tv_sec++;
+		timekeeping_leap_insert(-1);
 		time_state = TIME_OOP;
 		printk(KERN_NOTICE
 			"Clock: inserting leap second 23:59:60 UTC\n");
@@ -203,9 +202,8 @@ static enum hrtimer_restart ntp_leap_sec
 		res = HRTIMER_RESTART;
 		break;
 	case TIME_DEL:
-		xtime.tv_sec++;
+		timekeeping_leap_insert(1);
 		time_tai--;
-		wall_to_monotonic.tv_sec--;
 		time_state = TIME_WAIT;
 		printk(KERN_NOTICE
 			"Clock: deleting leap second 23:59:59 UTC\n");
@@ -219,7 +217,6 @@ static enum hrtimer_restart ntp_leap_sec
 			time_state = TIME_OK;
 		break;
 	}
-	update_vsyscall(&xtime, clock);
 
 	write_sequnlock(&xtime_lock);
 
Index: linux-2.6/kernel/time/timekeeping.c
===================================================================
--- linux-2.6.orig/kernel/time/timekeeping.c
+++ linux-2.6/kernel/time/timekeeping.c
@@ -58,6 +58,13 @@ void update_xtime_cache(u64 nsec)
 
 struct clocksource *clock;
 
+/* must hold xtime_lock */
+void timekeeping_leap_insert(int leapsecond)
+{
+	xtime.tv_sec += leapsecond;
+	wall_to_monotonic.tv_sec -= leapsecond;
+	update_vsyscall(&xtime, clock);
+}
 
 #ifdef CONFIG_GENERIC_TIME
 /**

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



* [RFC][patch 02/12] remove clocksource inline functions
  2009-07-29 13:41 [RFC][patch 00/12] clocksource / timekeeping rework V2 Martin Schwidefsky
  2009-07-29 13:41 ` [RFC][patch 01/12] introduce timekeeping_leap_insert Martin Schwidefsky
@ 2009-07-29 13:41 ` Martin Schwidefsky
  2009-07-29 14:15   ` Daniel Walker
  2009-07-30 21:05   ` john stultz
  2009-07-29 13:41 ` [RFC][patch 03/12] cleanup clocksource selection Martin Schwidefsky
                   ` (10 subsequent siblings)
  12 siblings, 2 replies; 32+ messages in thread
From: Martin Schwidefsky @ 2009-07-29 13:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Thomas Gleixner, john stultz, Daniel Walker,
	Martin Schwidefsky

[-- Attachment #1: clocksource-inline.diff --]
[-- Type: text/plain, Size: 5591 bytes --]

From: Martin Schwidefsky <schwidefsky@de.ibm.com>

Remove clocksource_read, clocksource_enable and clocksource_disable
inline functions. No functional change.
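For reference, what the removed wrappers did and what the call sites now open-code — a self-contained userspace sketch with simplified stand-in types, not the kernel's struct clocksource:

```c
#include <assert.h>
#include <stddef.h>

/* simplified stand-in for the kernel's struct clocksource */
struct clocksource {
	int (*enable)(struct clocksource *cs);
	void (*disable)(struct clocksource *cs);
	unsigned int mult;
	unsigned int mult_orig;
};

/* equivalent of the removed clocksource_enable() inline; the patch
 * open-codes the NULL check and the mult_orig save at each call site */
static int clocksource_enable_open(struct clocksource *cs)
{
	int ret = 0;

	if (cs->enable)
		ret = cs->enable(cs);
	/* save mult_orig on enable */
	cs->mult_orig = cs->mult;
	return ret;
}

/* equivalent of the removed clocksource_disable() inline */
static void clocksource_disable_open(struct clocksource *cs)
{
	if (cs->disable)
		cs->disable(cs);
}
```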

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Daniel Walker <dwalker@fifo99.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---
 include/linux/clocksource.h |   46 --------------------------------------------
 kernel/time/timekeeping.c   |   32 +++++++++++++++++-------------
 2 files changed, 18 insertions(+), 60 deletions(-)

Index: linux-2.6/kernel/time/timekeeping.c
===================================================================
--- linux-2.6.orig/kernel/time/timekeeping.c
+++ linux-2.6/kernel/time/timekeeping.c
@@ -79,7 +79,7 @@ static void clocksource_forward_now(void
 	cycle_t cycle_now, cycle_delta;
 	s64 nsec;
 
-	cycle_now = clocksource_read(clock);
+	cycle_now = clock->read(clock);
 	cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
 	clock->cycle_last = cycle_now;
 
@@ -114,7 +114,7 @@ void getnstimeofday(struct timespec *ts)
 		*ts = xtime;
 
 		/* read clocksource: */
-		cycle_now = clocksource_read(clock);
+		cycle_now = clock->read(clock);
 
 		/* calculate the delta since the last update_wall_time: */
 		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
@@ -146,7 +146,7 @@ ktime_t ktime_get(void)
 		nsecs = xtime.tv_nsec + wall_to_monotonic.tv_nsec;
 
 		/* read clocksource: */
-		cycle_now = clocksource_read(clock);
+		cycle_now = clock->read(clock);
 
 		/* calculate the delta since the last update_wall_time: */
 		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
@@ -186,7 +186,7 @@ void ktime_get_ts(struct timespec *ts)
 		tomono = wall_to_monotonic;
 
 		/* read clocksource: */
-		cycle_now = clocksource_read(clock);
+		cycle_now = clock->read(clock);
 
 		/* calculate the delta since the last update_wall_time: */
 		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
@@ -274,16 +274,18 @@ static void change_clocksource(void)
 
 	clocksource_forward_now();
 
-	if (clocksource_enable(new))
+	if (new->enable && ! new->enable(new))
 		return;
+	/* save mult_orig on enable */
+	new->mult_orig = new->mult;
 
 	new->raw_time = clock->raw_time;
 	old = clock;
 	clock = new;
-	clocksource_disable(old);
+	if (old->disable)
+		old->disable(old);
 
-	clock->cycle_last = 0;
-	clock->cycle_last = clocksource_read(clock);
+	clock->cycle_last = clock->read(clock);
 	clock->error = 0;
 	clock->xtime_nsec = 0;
 	clocksource_calculate_interval(clock, NTP_INTERVAL_LENGTH);
@@ -373,7 +375,7 @@ void getrawmonotonic(struct timespec *ts
 		seq = read_seqbegin(&xtime_lock);
 
 		/* read clocksource: */
-		cycle_now = clocksource_read(clock);
+		cycle_now = clock->read(clock);
 
 		/* calculate the delta since the last update_wall_time: */
 		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
@@ -435,9 +437,12 @@ void __init timekeeping_init(void)
 	ntp_init();
 
 	clock = clocksource_get_next();
-	clocksource_enable(clock);
+	if (clock->enable)
+		clock->enable(clock);
+	/* save mult_orig on enable */
+	clock->mult_orig = clock->mult;
 	clocksource_calculate_interval(clock, NTP_INTERVAL_LENGTH);
-	clock->cycle_last = clocksource_read(clock);
+	clock->cycle_last = clock->read(clock);
 
 	xtime.tv_sec = sec;
 	xtime.tv_nsec = 0;
@@ -477,8 +482,7 @@ static int timekeeping_resume(struct sys
 	}
 	update_xtime_cache(0);
 	/* re-base the last cycle value */
-	clock->cycle_last = 0;
-	clock->cycle_last = clocksource_read(clock);
+	clock->cycle_last = clock->read(clock);
 	clock->error = 0;
 	timekeeping_suspended = 0;
 	write_sequnlock_irqrestore(&xtime_lock, flags);
@@ -630,7 +634,7 @@ void update_wall_time(void)
 		return;
 
 #ifdef CONFIG_GENERIC_TIME
-	offset = (clocksource_read(clock) - clock->cycle_last) & clock->mask;
+	offset = (clock->read(clock) - clock->cycle_last) & clock->mask;
 #else
 	offset = clock->cycle_interval;
 #endif
Index: linux-2.6/include/linux/clocksource.h
===================================================================
--- linux-2.6.orig/include/linux/clocksource.h
+++ linux-2.6/include/linux/clocksource.h
@@ -268,52 +268,6 @@ static inline u32 clocksource_hz2mult(u3
 }
 
 /**
- * clocksource_read: - Access the clocksource's current cycle value
- * @cs:		pointer to clocksource being read
- *
- * Uses the clocksource to return the current cycle_t value
- */
-static inline cycle_t clocksource_read(struct clocksource *cs)
-{
-	return cs->read(cs);
-}
-
-/**
- * clocksource_enable: - enable clocksource
- * @cs:		pointer to clocksource
- *
- * Enables the specified clocksource. The clocksource callback
- * function should start up the hardware and setup mult and field
- * members of struct clocksource to reflect hardware capabilities.
- */
-static inline int clocksource_enable(struct clocksource *cs)
-{
-	int ret = 0;
-
-	if (cs->enable)
-		ret = cs->enable(cs);
-
-	/* save mult_orig on enable */
-	cs->mult_orig = cs->mult;
-
-	return ret;
-}
-
-/**
- * clocksource_disable: - disable clocksource
- * @cs:		pointer to clocksource
- *
- * Disables the specified clocksource. The clocksource callback
- * function should power down the now unused hardware block to
- * save power.
- */
-static inline void clocksource_disable(struct clocksource *cs)
-{
-	if (cs->disable)
-		cs->disable(cs);
-}
-
-/**
  * cyc2ns - converts clocksource cycles to nanoseconds
  * @cs:		Pointer to clocksource
  * @cycles:	Cycles

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



* [RFC][patch 03/12] cleanup clocksource selection
  2009-07-29 13:41 [RFC][patch 00/12] clocksource / timekeeping rework V2 Martin Schwidefsky
  2009-07-29 13:41 ` [RFC][patch 01/12] introduce timekeeping_leap_insert Martin Schwidefsky
  2009-07-29 13:41 ` [RFC][patch 02/12] remove clocksource inline functions Martin Schwidefsky
@ 2009-07-29 13:41 ` Martin Schwidefsky
  2009-07-29 13:41 ` [RFC][patch 04/12] clocksource watchdog highres enablement Martin Schwidefsky
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 32+ messages in thread
From: Martin Schwidefsky @ 2009-07-29 13:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Thomas Gleixner, john stultz, Daniel Walker,
	Martin Schwidefsky

[-- Attachment #1: clocksource-select.diff --]
[-- Type: text/plain, Size: 10364 bytes --]

From: Martin Schwidefsky <schwidefsky@de.ibm.com>

Sorting the list of clocksources according to their rating is not
necessary as the clocksource selection is not performance critical.
Simplify clocksource_register and sysfs_override_clocksource,
remove clocksource_enqueue and let clocksource_select do the job.
The check in clocksource_register for double registration of the
same clocksource is removed without replacement.

To find the initial clocksource a new weak function in jiffies.c is
defined that returns the jiffies clocksource. The architecture code
can then override the weak function with a more suitable clocksource,
e.g. the TOD clock on s390.
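The weak-function mechanism itself can be sketched in plain userspace C (the struct is a simplified stand-in, and no architecture override is linked in here, so the fallback wins):

```c
#include <assert.h>
#include <string.h>

/* simplified stand-in for the kernel's struct clocksource */
struct clocksource {
	const char *name;
};

static struct clocksource clocksource_jiffies = { .name = "jiffies" };

/* Generic weak fallback, as in kernel/time/jiffies.c. If another
 * object file defines a non-weak clocksource_default_clock() (as the
 * s390 patch hunk does for the TOD clock), the linker prefers that
 * strong definition over this one. */
__attribute__((weak)) struct clocksource *clocksource_default_clock(void)
{
	return &clocksource_jiffies;
}
```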

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Daniel Walker <dwalker@fifo99.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---
 arch/s390/kernel/time.c     |    4 +
 include/linux/clocksource.h |    2 
 kernel/time/clocksource.c   |  148 ++++++++++++++------------------------------
 kernel/time/jiffies.c       |    6 +
 kernel/time/timekeeping.c   |    4 -
 5 files changed, 62 insertions(+), 102 deletions(-)

Index: linux-2.6/kernel/time/clocksource.c
===================================================================
--- linux-2.6.orig/kernel/time/clocksource.c
+++ linux-2.6/kernel/time/clocksource.c
@@ -21,7 +21,6 @@
  *
  * TODO WishList:
  *   o Allow clocksource drivers to be unregistered
- *   o get rid of clocksource_jiffies extern
  */
 
 #include <linux/clocksource.h>
@@ -107,9 +106,6 @@ u64 timecounter_cyc2time(struct timecoun
 }
 EXPORT_SYMBOL(timecounter_cyc2time);
 
-/* XXX - Would like a better way for initializing curr_clocksource */
-extern struct clocksource clocksource_jiffies;
-
 /*[Clocksource internal variables]---------
  * curr_clocksource:
  *	currently selected clocksource. Initialized to clocksource_jiffies.
@@ -123,9 +119,8 @@ extern struct clocksource clocksource_ji
  * override_name:
  *	Name of the user-specified clocksource.
  */
-static struct clocksource *curr_clocksource = &clocksource_jiffies;
+static struct clocksource *curr_clocksource;
 static struct clocksource *next_clocksource;
-static struct clocksource *clocksource_override;
 static LIST_HEAD(clocksource_list);
 static DEFINE_SPINLOCK(clocksource_lock);
 static char override_name[32];
@@ -320,6 +315,7 @@ void clocksource_touch_watchdog(void)
 	clocksource_resume_watchdog();
 }
 
+#ifdef CONFIG_GENERIC_TIME
 /**
  * clocksource_get_next - Returns the selected clocksource
  *
@@ -339,77 +335,69 @@ struct clocksource *clocksource_get_next
 }
 
 /**
- * select_clocksource - Selects the best registered clocksource.
- *
- * Private function. Must hold clocksource_lock when called.
- *
- * Select the clocksource with the best rating, or the clocksource,
- * which is selected by userspace override.
+ * clocksource_select - Select the best clocksource available
  */
-static struct clocksource *select_clocksource(void)
+static int clocksource_select(void)
 {
-	struct clocksource *next;
-
-	if (list_empty(&clocksource_list))
-		return NULL;
-
-	if (clocksource_override)
-		next = clocksource_override;
-	else
-		next = list_entry(clocksource_list.next, struct clocksource,
-				  list);
+	struct clocksource *best, *cs;
+	int rc;
 
-	if (next == curr_clocksource)
-		return NULL;
-
-	return next;
-}
-
-/*
- * Enqueue the clocksource sorted by rating
- */
-static int clocksource_enqueue(struct clocksource *c)
-{
-	struct list_head *tmp, *entry = &clocksource_list;
-
-	list_for_each(tmp, &clocksource_list) {
-		struct clocksource *cs;
-
-		cs = list_entry(tmp, struct clocksource, list);
-		if (cs == c)
-			return -EBUSY;
-		/* Keep track of the place, where to insert */
-		if (cs->rating >= c->rating)
-			entry = tmp;
+	rc = 0;
+	best = NULL;
+	list_for_each_entry(cs, &clocksource_list, list) {
+		/* Check for the override clocksource. */
+		if (strcmp(cs->name, override_name) == 0) {
+			/*
+			 * Check to make sure we don't switch to a non-highres
+			 * capable clocksource if the tick code is in oneshot
+			 * mode (highres or nohz)
+			 */
+			if ((cs->flags & CLOCK_SOURCE_VALID_FOR_HRES) ||
+			    !tick_oneshot_mode_active()) {
+				/* Override clocksource can be used. */
+				best = cs;
+				break;
+			}
+			/* Override clocksource cannot be used. */
+			printk(KERN_WARNING "Override clocksource %s is not "
+			       "HRT compatible. Cannot switch while in "
+			       "HRT/NOHZ mode\n", cs->name);
+			override_name[0] = 0;
+		}
+		/* Check rating */
+		if (!best || cs->rating > best->rating)
+			best = cs;
 	}
-	list_add(&c->list, entry);
+	if (curr_clocksource != best)
+		next_clocksource = best;
+	return rc;
+}
 
-	if (strlen(c->name) == strlen(override_name) &&
-	    !strcmp(c->name, override_name))
-		clocksource_override = c;
+#else /* CONFIG_GENERIC_TIME */
 
+static inline int clocksource_select(void)
+{
 	return 0;
 }
 
+#endif
+
 /**
  * clocksource_register - Used to install new clocksources
  * @t:		clocksource to be registered
  *
  * Returns -EBUSY if registration fails, zero otherwise.
  */
-int clocksource_register(struct clocksource *c)
+int clocksource_register(struct clocksource *cs)
 {
 	unsigned long flags;
-	int ret;
 
 	spin_lock_irqsave(&clocksource_lock, flags);
-	ret = clocksource_enqueue(c);
-	if (!ret)
-		next_clocksource = select_clocksource();
+	list_add(&cs->list, &clocksource_list);
+	clocksource_select();
 	spin_unlock_irqrestore(&clocksource_lock, flags);
-	if (!ret)
-		clocksource_check_watchdog(c);
-	return ret;
+	clocksource_check_watchdog(cs);
+	return 0;
 }
 EXPORT_SYMBOL(clocksource_register);
 
@@ -422,10 +410,8 @@ void clocksource_change_rating(struct cl
 	unsigned long flags;
 
 	spin_lock_irqsave(&clocksource_lock, flags);
-	list_del(&cs->list);
 	cs->rating = rating;
-	clocksource_enqueue(cs);
-	next_clocksource = select_clocksource();
+	clocksource_select();
 	spin_unlock_irqrestore(&clocksource_lock, flags);
 }
 
@@ -438,9 +424,7 @@ void clocksource_unregister(struct clock
 
 	spin_lock_irqsave(&clocksource_lock, flags);
 	list_del(&cs->list);
-	if (clocksource_override == cs)
-		clocksource_override = NULL;
-	next_clocksource = select_clocksource();
+	clocksource_select();
 	spin_unlock_irqrestore(&clocksource_lock, flags);
 }
 
@@ -478,10 +462,6 @@ static ssize_t sysfs_override_clocksourc
 					  struct sysdev_attribute *attr,
 					  const char *buf, size_t count)
 {
-	struct clocksource *ovr = NULL;
-	size_t ret = count;
-	int len;
-
 	/* strings from sysfs write are not 0 terminated! */
 	if (count >= sizeof(override_name))
 		return -EINVAL;
@@ -495,41 +475,11 @@ static ssize_t sysfs_override_clocksourc
 	if (count > 0)
 		memcpy(override_name, buf, count);
 	override_name[count] = 0;
-
-	len = strlen(override_name);
-	if (len) {
-		struct clocksource *cs;
-
-		ovr = clocksource_override;
-		/* try to select it: */
-		list_for_each_entry(cs, &clocksource_list, list) {
-			if (strlen(cs->name) == len &&
-			    !strcmp(cs->name, override_name))
-				ovr = cs;
-		}
-	}
-
-	/*
-	 * Check to make sure we don't switch to a non-highres capable
-	 * clocksource if the tick code is in oneshot mode (highres or nohz)
-	 */
-	if (tick_oneshot_mode_active() && ovr &&
-	    !(ovr->flags & CLOCK_SOURCE_VALID_FOR_HRES)) {
-		printk(KERN_WARNING "%s clocksource is not HRT compatible. "
-			"Cannot switch while in HRT/NOHZ mode\n", ovr->name);
-		ovr = NULL;
-		override_name[0] = 0;
-	}
-
-	/* Reselect, when the override name has changed */
-	if (ovr != clocksource_override) {
-		clocksource_override = ovr;
-		next_clocksource = select_clocksource();
-	}
+	clocksource_select();
 
 	spin_unlock_irq(&clocksource_lock);
 
-	return ret;
+	return count;
 }
 
 /**
Index: linux-2.6/include/linux/clocksource.h
===================================================================
--- linux-2.6.orig/include/linux/clocksource.h
+++ linux-2.6/include/linux/clocksource.h
@@ -14,6 +14,7 @@
 #include <linux/list.h>
 #include <linux/cache.h>
 #include <linux/timer.h>
+#include <linux/init.h>
 #include <asm/div64.h>
 #include <asm/io.h>
 
@@ -322,6 +323,7 @@ extern void clocksource_touch_watchdog(v
 extern struct clocksource* clocksource_get_next(void);
 extern void clocksource_change_rating(struct clocksource *cs, int rating);
 extern void clocksource_resume(void);
+extern struct clocksource * __init __weak clocksource_default_clock(void);
 
 #ifdef CONFIG_GENERIC_TIME_VSYSCALL
 extern void update_vsyscall(struct timespec *ts, struct clocksource *c);
Index: linux-2.6/kernel/time/timekeeping.c
===================================================================
--- linux-2.6.orig/kernel/time/timekeeping.c
+++ linux-2.6/kernel/time/timekeeping.c
@@ -269,7 +269,7 @@ static void change_clocksource(void)
 
 	new = clocksource_get_next();
 
-	if (clock == new)
+	if (!new || clock == new)
 		return;
 
 	clocksource_forward_now();
@@ -436,7 +436,7 @@ void __init timekeeping_init(void)
 
 	ntp_init();
 
-	clock = clocksource_get_next();
+	clock = clocksource_default_clock();
 	if (clock->enable)
 		clock->enable(clock);
 	/* save mult_orig on enable */
Index: linux-2.6/kernel/time/jiffies.c
===================================================================
--- linux-2.6.orig/kernel/time/jiffies.c
+++ linux-2.6/kernel/time/jiffies.c
@@ -61,7 +61,6 @@ struct clocksource clocksource_jiffies =
 	.read		= jiffies_read,
 	.mask		= 0xffffffff, /*32bits*/
 	.mult		= NSEC_PER_JIFFY << JIFFIES_SHIFT, /* details above */
-	.mult_orig	= NSEC_PER_JIFFY << JIFFIES_SHIFT,
 	.shift		= JIFFIES_SHIFT,
 };
 
@@ -71,3 +70,8 @@ static int __init init_jiffies_clocksour
 }
 
 core_initcall(init_jiffies_clocksource);
+
+struct clocksource * __init __weak clocksource_default_clock(void)
+{
+	return &clocksource_jiffies;
+}
Index: linux-2.6/arch/s390/kernel/time.c
===================================================================
--- linux-2.6.orig/arch/s390/kernel/time.c
+++ linux-2.6/arch/s390/kernel/time.c
@@ -205,6 +205,10 @@ static struct clocksource clocksource_to
 	.flags		= CLOCK_SOURCE_IS_CONTINUOUS,
 };
 
+struct clocksource * __init clocksource_default_clock(void)
+{
+	return &clocksource_tod;
+}
 
 void update_vsyscall(struct timespec *wall_time, struct clocksource *clock)
 {

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



* [RFC][patch 04/12] clocksource watchdog highres enablement
  2009-07-29 13:41 [RFC][patch 00/12] clocksource / timekeeping rework V2 Martin Schwidefsky
                   ` (2 preceding siblings ...)
  2009-07-29 13:41 ` [RFC][patch 03/12] cleanup clocksource selection Martin Schwidefsky
@ 2009-07-29 13:41 ` Martin Schwidefsky
  2009-07-29 13:41 ` [RFC][patch 05/12] clocksource watchdog resume logic Martin Schwidefsky
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 32+ messages in thread
From: Martin Schwidefsky @ 2009-07-29 13:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Thomas Gleixner, john stultz, Daniel Walker,
	Martin Schwidefsky

[-- Attachment #1: watchdog-highres.diff --]
[-- Type: text/plain, Size: 3178 bytes --]

From: Martin Schwidefsky <schwidefsky@de.ibm.com>

The clocksource watchdog marks a clocksource as highres capable before
the deviation from the watchdog clocksource has been checked even a
single time. Make sure that the deviation is checked at least once
before doing the switch to highres mode.
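The reordered check boils down to the following. A userspace sketch with stand-in flags and types; the real watchdog function also reads the cycle counters and records wd_last, only the check-once ordering is shown here:

```c
#include <assert.h>
#include <stdint.h>

#define CS_WATCHDOG	0x1	/* wd_last has been initialized */
#define CS_VALID_HRES	0x2	/* passed at least one deviation check */

#define WATCHDOG_THRESHOLD (1000000000LL >> 4)	/* 0.0625 s in ns */

struct cs {
	unsigned int flags;
};

/* One watchdog pass for one clocksource, given the nanoseconds each
 * clock advanced since the last pass. Returns 1 if marked unstable. */
static int watchdog_check(struct cs *cs, int64_t cs_nsec, int64_t wd_nsec)
{
	if (!(cs->flags & CS_WATCHDOG)) {
		/* first sighting: record state (the real code saves
		 * wd_last here), no highres verdict yet */
		cs->flags |= CS_WATCHDOG;
		return 0;
	}
	if (cs_nsec - wd_nsec > WATCHDOG_THRESHOLD ||
	    wd_nsec - cs_nsec > WATCHDOG_THRESHOLD) {
		/* unstable: drop it from highres consideration */
		cs->flags &= ~(CS_WATCHDOG | CS_VALID_HRES);
		return 1;
	}
	/* deviation checked at least once: now highres is allowed */
	cs->flags |= CS_VALID_HRES;
	return 0;
}
```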

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Daniel Walker <dwalker@fifo99.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---
 kernel/time/clocksource.c |   47 +++++++++++++++++++++-------------------------
 1 file changed, 22 insertions(+), 25 deletions(-)

Index: linux-2.6/kernel/time/clocksource.c
===================================================================
--- linux-2.6.orig/kernel/time/clocksource.c
+++ linux-2.6/kernel/time/clocksource.c
@@ -153,11 +153,8 @@ static unsigned long watchdog_resumed;
 #define WATCHDOG_INTERVAL (HZ >> 1)
 #define WATCHDOG_THRESHOLD (NSEC_PER_SEC >> 4)
 
-static void clocksource_ratewd(struct clocksource *cs, int64_t delta)
+static void clocksource_unstable(struct clocksource *cs, int64_t delta)
 {
-	if (delta > -WATCHDOG_THRESHOLD && delta < WATCHDOG_THRESHOLD)
-		return;
-
 	printk(KERN_WARNING "Clocksource %s unstable (delta = %Ld ns)\n",
 	       cs->name, delta);
 	cs->flags &= ~(CLOCK_SOURCE_VALID_FOR_HRES | CLOCK_SOURCE_WATCHDOG);
@@ -183,31 +180,31 @@ static void clocksource_watchdog(unsigne
 	list_for_each_entry_safe(cs, tmp, &watchdog_list, wd_list) {
 		csnow = cs->read(cs);
 
-		if (unlikely(resumed)) {
+		/* Clocksource initialized ? */
+		if (!(cs->flags & CLOCK_SOURCE_WATCHDOG)) {
+			cs->flags |= CLOCK_SOURCE_WATCHDOG;
 			cs->wd_last = csnow;
 			continue;
 		}
 
-		/* Initialized ? */
-		if (!(cs->flags & CLOCK_SOURCE_WATCHDOG)) {
-			if ((cs->flags & CLOCK_SOURCE_IS_CONTINUOUS) &&
-			    (watchdog->flags & CLOCK_SOURCE_IS_CONTINUOUS)) {
-				cs->flags |= CLOCK_SOURCE_VALID_FOR_HRES;
-				/*
-				 * We just marked the clocksource as
-				 * highres-capable, notify the rest of the
-				 * system as well so that we transition
-				 * into high-res mode:
-				 */
-				tick_clock_notify();
-			}
-			cs->flags |= CLOCK_SOURCE_WATCHDOG;
-			cs->wd_last = csnow;
-		} else {
-			cs_nsec = cyc2ns(cs, (csnow - cs->wd_last) & cs->mask);
-			cs->wd_last = csnow;
-			/* Check the delta. Might remove from the list ! */
-			clocksource_ratewd(cs, cs_nsec - wd_nsec);
+		/* Check the deviation from the watchdog clocksource. */
+		cs_nsec = cyc2ns(cs, (csnow - cs->wd_last) & cs->mask);
+		cs->wd_last = csnow;
+		if (abs(cs_nsec - wd_nsec) > WATCHDOG_THRESHOLD) {
+			clocksource_unstable(cs, cs_nsec - wd_nsec);
+			continue;
+		}
+
+		if (!(cs->flags & CLOCK_SOURCE_VALID_FOR_HRES) &&
+		    (cs->flags & CLOCK_SOURCE_IS_CONTINUOUS) &&
+		    (watchdog->flags & CLOCK_SOURCE_IS_CONTINUOUS)) {
+			cs->flags |= CLOCK_SOURCE_VALID_FOR_HRES;
+			/*
+			 * We just marked the clocksource as highres-capable,
+			 * notify the rest of the system as well so that we
+			 * transition into high-res mode:
+			 */
+			tick_clock_notify();
 		}
 	}
 

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



* [RFC][patch 05/12] clocksource watchdog resume logic
  2009-07-29 13:41 [RFC][patch 00/12] clocksource / timekeeping rework V2 Martin Schwidefsky
                   ` (3 preceding siblings ...)
  2009-07-29 13:41 ` [RFC][patch 04/12] clocksource watchdog highres enablement Martin Schwidefsky
@ 2009-07-29 13:41 ` Martin Schwidefsky
  2009-07-29 13:41 ` [RFC][patch 06/12] clocksource watchdog refactoring Martin Schwidefsky
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 32+ messages in thread
From: Martin Schwidefsky @ 2009-07-29 13:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Thomas Gleixner, john stultz, Daniel Walker,
	Martin Schwidefsky

[-- Attachment #1: watchdog-resume.diff --]
[-- Type: text/plain, Size: 2478 bytes --]

From: Martin Schwidefsky <schwidefsky@de.ibm.com>

To resume the clocksource watchdog, just remove the CLOCK_SOURCE_WATCHDOG
bit from each watched clocksource.
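The effect of the flag clearing can be sketched like this (stand-in types, and an array instead of the kernel's wd_list): after resume, the next watchdog pass sees the bit cleared and re-reads wd_last instead of computing a bogus delta across the suspend.

```c
#include <assert.h>

#define CLOCK_SOURCE_WATCHDOG 0x1

/* simplified stand-in for a watched clocksource */
struct cs {
	unsigned int flags;
};

/* equivalent of clocksource_reset_watchdog(): clear the watchdog bit
 * on every watched clocksource so each gets re-initialized */
static void reset_watchdog(struct cs *watched, int n)
{
	for (int i = 0; i < n; i++)
		watched[i].flags &= ~CLOCK_SOURCE_WATCHDOG;
}
```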

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Daniel Walker <dwalker@fifo99.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---
 kernel/time/clocksource.c |   23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

Index: linux-2.6/kernel/time/clocksource.c
===================================================================
--- linux-2.6.orig/kernel/time/clocksource.c
+++ linux-2.6/kernel/time/clocksource.c
@@ -145,7 +145,6 @@ static struct clocksource *watchdog;
 static struct timer_list watchdog_timer;
 static DEFINE_SPINLOCK(watchdog_lock);
 static cycle_t watchdog_last;
-static unsigned long watchdog_resumed;
 
 /*
  * Interval: 0.5sec Threshold: 0.0625s
@@ -167,12 +166,9 @@ static void clocksource_watchdog(unsigne
 	struct clocksource *cs, *tmp;
 	cycle_t csnow, wdnow;
 	int64_t wd_nsec, cs_nsec;
-	int resumed;
 
 	spin_lock(&watchdog_lock);
 
-	resumed = test_and_clear_bit(0, &watchdog_resumed);
-
 	wdnow = watchdog->read(watchdog);
 	wd_nsec = cyc2ns(watchdog, (wdnow - watchdog_last) & watchdog->mask);
 	watchdog_last = wdnow;
@@ -223,14 +219,26 @@ static void clocksource_watchdog(unsigne
 	}
 	spin_unlock(&watchdog_lock);
 }
+
+static inline void clocksource_reset_watchdog(void)
+{
+	struct clocksource *cs;
+
+	list_for_each_entry(cs, &watchdog_list, wd_list)
+		cs->flags &= ~CLOCK_SOURCE_WATCHDOG;
+}
+
 static void clocksource_resume_watchdog(void)
 {
-	set_bit(0, &watchdog_resumed);
+	unsigned long flags;
+
+	spin_lock_irqsave(&watchdog_lock, flags);
+	clocksource_reset_watchdog();
+	spin_unlock_irqrestore(&watchdog_lock, flags);
 }
 
 static void clocksource_check_watchdog(struct clocksource *cs)
 {
-	struct clocksource *cse;
 	unsigned long flags;
 
 	spin_lock_irqsave(&watchdog_lock, flags);
@@ -256,8 +264,7 @@ static void clocksource_check_watchdog(s
 			watchdog_timer.function = clocksource_watchdog;
 
 			/* Reset watchdog cycles */
-			list_for_each_entry(cse, &watchdog_list, wd_list)
-				cse->flags &= ~CLOCK_SOURCE_WATCHDOG;
+			clocksource_reset_watchdog();
 			/* Start if list is not empty */
 			if (!list_empty(&watchdog_list)) {
 				watchdog_last = watchdog->read(watchdog);

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



* [RFC][patch 06/12] clocksource watchdog refactoring
  2009-07-29 13:41 [RFC][patch 00/12] clocksource / timekeeping rework V2 Martin Schwidefsky
                   ` (4 preceding siblings ...)
  2009-07-29 13:41 ` [RFC][patch 05/12] clocksource watchdog resume logic Martin Schwidefsky
@ 2009-07-29 13:41 ` Martin Schwidefsky
  2009-07-29 13:41 ` [RFC][patch 07/12] clocksource watchdog work Martin Schwidefsky
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 32+ messages in thread
From: Martin Schwidefsky @ 2009-07-29 13:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Thomas Gleixner, john stultz, Daniel Walker,
	Martin Schwidefsky

[-- Attachment #1: watchdog-refactor.diff --]
[-- Type: text/plain, Size: 6254 bytes --]

From: Martin Schwidefsky <schwidefsky@de.ibm.com>

Refactor the clocksource watchdog code to make it more readable. Add
clocksource_dequeue_watchdog to remove a clocksource from the
watchdog list when it is unregistered.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Daniel Walker <dwalker@fifo99.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---
 kernel/time/clocksource.c |   96 ++++++++++++++++++++++++++++++++--------------
 1 file changed, 68 insertions(+), 28 deletions(-)

Index: linux-2.6/kernel/time/clocksource.c
===================================================================
--- linux-2.6.orig/kernel/time/clocksource.c
+++ linux-2.6/kernel/time/clocksource.c
@@ -145,6 +145,7 @@ static struct clocksource *watchdog;
 static struct timer_list watchdog_timer;
 static DEFINE_SPINLOCK(watchdog_lock);
 static cycle_t watchdog_last;
+static int watchdog_running;
 
 /*
  * Interval: 0.5sec Threshold: 0.0625s
@@ -168,6 +169,8 @@ static void clocksource_watchdog(unsigne
 	int64_t wd_nsec, cs_nsec;
 
 	spin_lock(&watchdog_lock);
+	if (!watchdog_running)
+		goto out;
 
 	wdnow = watchdog->read(watchdog);
 	wd_nsec = cyc2ns(watchdog, (wdnow - watchdog_last) & watchdog->mask);
@@ -217,9 +220,30 @@ static void clocksource_watchdog(unsigne
 		watchdog_timer.expires += WATCHDOG_INTERVAL;
 		add_timer_on(&watchdog_timer, next_cpu);
 	}
+out:
 	spin_unlock(&watchdog_lock);
 }
 
+static inline void clocksource_start_watchdog(void)
+{
+	if (watchdog_running || !watchdog || list_empty(&watchdog_list))
+		return;
+	init_timer(&watchdog_timer);
+	watchdog_timer.function = clocksource_watchdog;
+	watchdog_last = watchdog->read(watchdog);
+	watchdog_timer.expires = jiffies + WATCHDOG_INTERVAL;
+	add_timer_on(&watchdog_timer, cpumask_first(cpu_online_mask));
+	watchdog_running = 1;
+}
+
+static inline void clocksource_stop_watchdog(void)
+{
+	if (!watchdog_running || (watchdog && !list_empty(&watchdog_list)))
+		return;
+	del_timer(&watchdog_timer);
+	watchdog_running = 0;
+}
+
 static inline void clocksource_reset_watchdog(void)
 {
 	struct clocksource *cs;
@@ -237,55 +261,69 @@ static void clocksource_resume_watchdog(
 	spin_unlock_irqrestore(&watchdog_lock, flags);
 }
 
-static void clocksource_check_watchdog(struct clocksource *cs)
+static void clocksource_enqueue_watchdog(struct clocksource *cs)
 {
 	unsigned long flags;
 
 	spin_lock_irqsave(&watchdog_lock, flags);
 	if (cs->flags & CLOCK_SOURCE_MUST_VERIFY) {
-		int started = !list_empty(&watchdog_list);
-
+		/* cs is a clocksource to be watched. */
 		list_add(&cs->wd_list, &watchdog_list);
-		if (!started && watchdog) {
-			watchdog_last = watchdog->read(watchdog);
-			watchdog_timer.expires = jiffies + WATCHDOG_INTERVAL;
-			add_timer_on(&watchdog_timer,
-				     cpumask_first(cpu_online_mask));
-		}
+		cs->flags &= ~CLOCK_SOURCE_WATCHDOG;
 	} else {
+		/* cs is a watchdog. */
 		if (cs->flags & CLOCK_SOURCE_IS_CONTINUOUS)
 			cs->flags |= CLOCK_SOURCE_VALID_FOR_HRES;
-
+		/* Pick the best watchdog. */
 		if (!watchdog || cs->rating > watchdog->rating) {
-			if (watchdog)
-				del_timer(&watchdog_timer);
 			watchdog = cs;
-			init_timer(&watchdog_timer);
-			watchdog_timer.function = clocksource_watchdog;
-
 			/* Reset watchdog cycles */
 			clocksource_reset_watchdog();
-			/* Start if list is not empty */
-			if (!list_empty(&watchdog_list)) {
-				watchdog_last = watchdog->read(watchdog);
-				watchdog_timer.expires =
-					jiffies + WATCHDOG_INTERVAL;
-				add_timer_on(&watchdog_timer,
-					     cpumask_first(cpu_online_mask));
-			}
 		}
 	}
+	/* Check if the watchdog timer needs to be started. */
+	clocksource_start_watchdog();
+	spin_unlock_irqrestore(&watchdog_lock, flags);
+}
+
+static void clocksource_dequeue_watchdog(struct clocksource *cs)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&watchdog_lock, flags);
+	if (cs->flags & CLOCK_SOURCE_MUST_VERIFY) {
+		/* cs is a watched clocksource. */
+		list_del_init(&cs->wd_list);
+	} else if (cs == watchdog) {
+		/* Reset watchdog cycles */
+		clocksource_reset_watchdog();
+		/* Current watchdog is removed. Find an alternative. */
+		watchdog = NULL;
+		list_for_each_entry(cs, &clocksource_list, list) {
+			if (cs->flags & CLOCK_SOURCE_MUST_VERIFY)
+				continue;
+			if (!watchdog || cs->rating > watchdog->rating)
+				watchdog = cs;
+		}
+	}
+	cs->flags &= ~CLOCK_SOURCE_WATCHDOG;
+	/* Check if the watchdog timer needs to be stopped. */
+	clocksource_stop_watchdog();
 	spin_unlock_irqrestore(&watchdog_lock, flags);
 }
-#else
-static void clocksource_check_watchdog(struct clocksource *cs)
+
+#else /* CONFIG_CLOCKSOURCE_WATCHDOG */
+
+static void clocksource_enqueue_watchdog(struct clocksource *cs)
 {
 	if (cs->flags & CLOCK_SOURCE_IS_CONTINUOUS)
 		cs->flags |= CLOCK_SOURCE_VALID_FOR_HRES;
 }
 
+static inline void clocksource_dequeue_watchdog(struct clocksource *cs) { }
 static inline void clocksource_resume_watchdog(void) { }
-#endif
+
+#endif /* CONFIG_CLOCKSOURCE_WATCHDOG */
 
 /**
  * clocksource_resume - resume the clocksource(s)
@@ -400,14 +438,13 @@ int clocksource_register(struct clocksou
 	list_add(&cs->list, &clocksource_list);
 	clocksource_select();
 	spin_unlock_irqrestore(&clocksource_lock, flags);
-	clocksource_check_watchdog(cs);
+	clocksource_enqueue_watchdog(cs);
 	return 0;
 }
 EXPORT_SYMBOL(clocksource_register);
 
 /**
  * clocksource_change_rating - Change the rating of a registered clocksource
- *
  */
 void clocksource_change_rating(struct clocksource *cs, int rating)
 {
@@ -418,6 +455,7 @@ void clocksource_change_rating(struct cl
 	clocksource_select();
 	spin_unlock_irqrestore(&clocksource_lock, flags);
 }
+EXPORT_SYMBOL(clocksource_change_rating);
 
 /**
  * clocksource_unregister - remove a registered clocksource
@@ -426,11 +464,13 @@ void clocksource_unregister(struct clock
 {
 	unsigned long flags;
 
+	clocksource_dequeue_watchdog(cs);
 	spin_lock_irqsave(&clocksource_lock, flags);
 	list_del(&cs->list);
 	clocksource_select();
 	spin_unlock_irqrestore(&clocksource_lock, flags);
 }
+EXPORT_SYMBOL(clocksource_unregister);
 
 #ifdef CONFIG_SYSFS
 /**

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



* [RFC][patch 07/12] clocksource watchdog work
  2009-07-29 13:41 [RFC][patch 00/12] clocksource / timekeeping rework V2 Martin Schwidefsky
                   ` (5 preceding siblings ...)
  2009-07-29 13:41 ` [RFC][patch 06/12] clocksource watchdog refactoring Martin Schwidefsky
@ 2009-07-29 13:41 ` Martin Schwidefsky
  2009-07-29 13:41 ` [RFC][patch 08/12] introduce struct timekeeper Martin Schwidefsky
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 32+ messages in thread
From: Martin Schwidefsky @ 2009-07-29 13:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Thomas Gleixner, john stultz, Daniel Walker,
	Martin Schwidefsky

[-- Attachment #1: watchdog-work.diff --]
[-- Type: text/plain, Size: 4769 bytes --]

From: Martin Schwidefsky <schwidefsky@de.ibm.com>

Move the downgrade of an unstable clocksource from timer interrupt
context into the process context of a workqueue thread. This is a
prerequisite for doing the clocksource switch with stop_machine.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Daniel Walker <dwalker@fifo99.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---
 include/linux/clocksource.h |    1 
 kernel/time/clocksource.c   |   56 ++++++++++++++++++++++++++++++--------------
 2 files changed, 40 insertions(+), 17 deletions(-)

Index: linux-2.6/include/linux/clocksource.h
===================================================================
--- linux-2.6.orig/include/linux/clocksource.h
+++ linux-2.6/include/linux/clocksource.h
@@ -213,6 +213,7 @@ extern struct clocksource *clock;	/* cur
 
 #define CLOCK_SOURCE_WATCHDOG			0x10
 #define CLOCK_SOURCE_VALID_FOR_HRES		0x20
+#define CLOCK_SOURCE_UNSTABLE			0x40
 
 /* simplify initialization of mask field */
 #define CLOCKSOURCE_MASK(bits) (cycle_t)((bits) < 64 ? ((1ULL<<(bits))-1) : -1)
Index: linux-2.6/kernel/time/clocksource.c
===================================================================
--- linux-2.6.orig/kernel/time/clocksource.c
+++ linux-2.6/kernel/time/clocksource.c
@@ -143,10 +143,13 @@ fs_initcall(clocksource_done_booting);
 static LIST_HEAD(watchdog_list);
 static struct clocksource *watchdog;
 static struct timer_list watchdog_timer;
+static struct work_struct watchdog_work;
 static DEFINE_SPINLOCK(watchdog_lock);
 static cycle_t watchdog_last;
 static int watchdog_running;
 
+static void clocksource_watchdog_work(struct work_struct *work);
+
 /*
  * Interval: 0.5sec Threshold: 0.0625s
  */
@@ -158,15 +161,16 @@ static void clocksource_unstable(struct 
 	printk(KERN_WARNING "Clocksource %s unstable (delta = %Ld ns)\n",
 	       cs->name, delta);
 	cs->flags &= ~(CLOCK_SOURCE_VALID_FOR_HRES | CLOCK_SOURCE_WATCHDOG);
-	clocksource_change_rating(cs, 0);
-	list_del(&cs->wd_list);
+	cs->flags |= CLOCK_SOURCE_UNSTABLE;
+	schedule_work(&watchdog_work);
 }
 
 static void clocksource_watchdog(unsigned long data)
 {
-	struct clocksource *cs, *tmp;
+	struct clocksource *cs;
 	cycle_t csnow, wdnow;
 	int64_t wd_nsec, cs_nsec;
+	int next_cpu;
 
 	spin_lock(&watchdog_lock);
 	if (!watchdog_running)
@@ -176,7 +180,12 @@ static void clocksource_watchdog(unsigne
 	wd_nsec = cyc2ns(watchdog, (wdnow - watchdog_last) & watchdog->mask);
 	watchdog_last = wdnow;
 
-	list_for_each_entry_safe(cs, tmp, &watchdog_list, wd_list) {
+	list_for_each_entry(cs, &watchdog_list, wd_list) {
+
+		/* Clocksource already marked unstable? */
+		if (cs->flags & CLOCK_SOURCE_UNSTABLE)
+			continue;
+
 		csnow = cs->read(cs);
 
 		/* Clocksource initialized ? */
@@ -207,19 +216,15 @@ static void clocksource_watchdog(unsigne
 		}
 	}
 
-	if (!list_empty(&watchdog_list)) {
-		/*
-		 * Cycle through CPUs to check if the CPUs stay
-		 * synchronized to each other.
-		 */
-		int next_cpu = cpumask_next(raw_smp_processor_id(),
-					    cpu_online_mask);
-
-		if (next_cpu >= nr_cpu_ids)
-			next_cpu = cpumask_first(cpu_online_mask);
-		watchdog_timer.expires += WATCHDOG_INTERVAL;
-		add_timer_on(&watchdog_timer, next_cpu);
-	}
+	/*
+	 * Cycle through CPUs to check if the CPUs stay synchronized
+	 * to each other.
+	 */
+	next_cpu = cpumask_next(raw_smp_processor_id(), cpu_online_mask);
+	if (next_cpu >= nr_cpu_ids)
+		next_cpu = cpumask_first(cpu_online_mask);
+	watchdog_timer.expires += WATCHDOG_INTERVAL;
+	add_timer_on(&watchdog_timer, next_cpu);
 out:
 	spin_unlock(&watchdog_lock);
 }
@@ -228,6 +233,7 @@ static inline void clocksource_start_wat
 {
 	if (watchdog_running || !watchdog || list_empty(&watchdog_list))
 		return;
+	INIT_WORK(&watchdog_work, clocksource_watchdog_work);
 	init_timer(&watchdog_timer);
 	watchdog_timer.function = clocksource_watchdog;
 	watchdog_last = watchdog->read(watchdog);
@@ -312,6 +318,22 @@ static void clocksource_dequeue_watchdog
 	spin_unlock_irqrestore(&watchdog_lock, flags);
 }
 
+static void clocksource_watchdog_work(struct work_struct *work)
+{
+	struct clocksource *cs, *tmp;
+	unsigned long flags;
+
+	spin_lock_irqsave(&watchdog_lock, flags);
+	list_for_each_entry_safe(cs, tmp, &watchdog_list, wd_list)
+		if (cs->flags & CLOCK_SOURCE_UNSTABLE) {
+			list_del_init(&cs->wd_list);
+			clocksource_change_rating(cs, 0);
+		}
+	/* Check if the watchdog timer needs to be stopped. */
+	clocksource_stop_watchdog();
+	spin_unlock_irqrestore(&watchdog_lock, flags);
+}
+
 #else /* CONFIG_CLOCKSOURCE_WATCHDOG */
 
 static void clocksource_enqueue_watchdog(struct clocksource *cs)

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



* [RFC][patch 08/12] introduce struct timekeeper
  2009-07-29 13:41 [RFC][patch 00/12] clocksource / timekeeping rework V2 Martin Schwidefsky
                   ` (6 preceding siblings ...)
  2009-07-29 13:41 ` [RFC][patch 07/12] clocksource watchdog work Martin Schwidefsky
@ 2009-07-29 13:41 ` Martin Schwidefsky
  2009-07-29 13:41 ` [RFC][patch 09/12] add xtime_shift and ntp_error_shift to " Martin Schwidefsky
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 32+ messages in thread
From: Martin Schwidefsky @ 2009-07-29 13:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Thomas Gleixner, john stultz, Daniel Walker,
	Martin Schwidefsky

[-- Attachment #1: timekeeper-struct.diff --]
[-- Type: text/plain, Size: 22103 bytes --]

From: Martin Schwidefsky <schwidefsky@de.ibm.com>

Add struct timekeeper to keep the internal values timekeeping.c needs
in regard to the currently selected clock source. This moves the
timekeeping intervals, xtime_nsec and the ntp error value from
struct clocksource to struct timekeeper. The raw_time is removed from
struct clocksource as well and becomes a global variable, treated like
xtime. Eventually both xtime and raw_time should be moved into struct
timekeeper.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Daniel Walker <dwalker@fifo99.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---
 arch/s390/kernel/time.c     |    1 
 include/linux/clocksource.h |   54 ----------
 kernel/time/clocksource.c   |    6 -
 kernel/time/timekeeping.c   |  223 ++++++++++++++++++++++++++++----------------
 4 files changed, 152 insertions(+), 132 deletions(-)

Index: linux-2.6/arch/s390/kernel/time.c
===================================================================
--- linux-2.6.orig/arch/s390/kernel/time.c
+++ linux-2.6/arch/s390/kernel/time.c
@@ -280,7 +280,6 @@ void __init time_init(void)
 	now = get_clock();
 	tod_to_timeval(now - TOD_UNIX_EPOCH, &xtime);
 	clocksource_tod.cycle_last = now;
-	clocksource_tod.raw_time = xtime;
 	tod_to_timeval(sched_clock_base_cc - TOD_UNIX_EPOCH, &ts);
 	set_normalized_timespec(&wall_to_monotonic, -ts.tv_sec, -ts.tv_nsec);
 	write_sequnlock_irqrestore(&xtime_lock, flags);
Index: linux-2.6/include/linux/clocksource.h
===================================================================
--- linux-2.6.orig/include/linux/clocksource.h
+++ linux-2.6/include/linux/clocksource.h
@@ -155,8 +155,6 @@ extern u64 timecounter_cyc2time(struct t
  * @flags:		flags describing special properties
  * @vread:		vsyscall based read
  * @resume:		resume function for the clocksource, if necessary
- * @cycle_interval:	Used internally by timekeeping core, please ignore.
- * @xtime_interval:	Used internally by timekeeping core, please ignore.
  */
 struct clocksource {
 	/*
@@ -182,19 +180,12 @@ struct clocksource {
 #define CLKSRC_FSYS_MMIO_SET(mmio, addr)      do { } while (0)
 #endif
 
-	/* timekeeping specific data, ignore */
-	cycle_t cycle_interval;
-	u64	xtime_interval;
-	u32	raw_interval;
 	/*
 	 * Second part is written at each timer interrupt
 	 * Keep it in a different cache line to dirty no
 	 * more than one cache line.
 	 */
 	cycle_t cycle_last ____cacheline_aligned_in_smp;
-	u64 xtime_nsec;
-	s64 error;
-	struct timespec raw_time;
 
 #ifdef CONFIG_CLOCKSOURCE_WATCHDOG
 	/* Watchdog related data, used by the framework */
@@ -203,8 +194,6 @@ struct clocksource {
 #endif
 };
 
-extern struct clocksource *clock;	/* current clocksource */
-
 /*
  * Clock source flags bits::
  */
@@ -270,50 +259,15 @@ static inline u32 clocksource_hz2mult(u3
 }
 
 /**
- * cyc2ns - converts clocksource cycles to nanoseconds
- * @cs:		Pointer to clocksource
- * @cycles:	Cycles
+ * clocksource_cyc2ns - converts clocksource cycles to nanoseconds
  *
- * Uses the clocksource and ntp ajdustment to convert cycle_ts to nanoseconds.
+ * Converts cycles to nanoseconds, using the given mult and shift.
  *
  * XXX - This could use some mult_lxl_ll() asm optimization
  */
-static inline s64 cyc2ns(struct clocksource *cs, cycle_t cycles)
-{
-	u64 ret = (u64)cycles;
-	ret = (ret * cs->mult) >> cs->shift;
-	return ret;
-}
-
-/**
- * clocksource_calculate_interval - Calculates a clocksource interval struct
- *
- * @c:		Pointer to clocksource.
- * @length_nsec: Desired interval length in nanoseconds.
- *
- * Calculates a fixed cycle/nsec interval for a given clocksource/adjustment
- * pair and interval request.
- *
- * Unless you're the timekeeping code, you should not be using this!
- */
-static inline void clocksource_calculate_interval(struct clocksource *c,
-					  	  unsigned long length_nsec)
+static inline s64 clocksource_cyc2ns(cycle_t cycles, u32 mult, u32 shift)
 {
-	u64 tmp;
-
-	/* Do the ns -> cycle conversion first, using original mult */
-	tmp = length_nsec;
-	tmp <<= c->shift;
-	tmp += c->mult_orig/2;
-	do_div(tmp, c->mult_orig);
-
-	c->cycle_interval = (cycle_t)tmp;
-	if (c->cycle_interval == 0)
-		c->cycle_interval = 1;
-
-	/* Go back from cycles -> shifted ns, this time use ntp adjused mult */
-	c->xtime_interval = (u64)c->cycle_interval * c->mult;
-	c->raw_interval = ((u64)c->cycle_interval * c->mult_orig) >> c->shift;
+	return ((u64) cycles * mult) >> shift;
 }
 
 
Index: linux-2.6/kernel/time/clocksource.c
===================================================================
--- linux-2.6.orig/kernel/time/clocksource.c
+++ linux-2.6/kernel/time/clocksource.c
@@ -177,7 +177,8 @@ static void clocksource_watchdog(unsigne
 		goto out;
 
 	wdnow = watchdog->read(watchdog);
-	wd_nsec = cyc2ns(watchdog, (wdnow - watchdog_last) & watchdog->mask);
+	wd_nsec = clocksource_cyc2ns((wdnow - watchdog_last) & watchdog->mask,
+				     watchdog->mult, watchdog->shift);
 	watchdog_last = wdnow;
 
 	list_for_each_entry(cs, &watchdog_list, wd_list) {
@@ -196,7 +197,8 @@ static void clocksource_watchdog(unsigne
 		}
 
 		/* Check the deviation from the watchdog clocksource. */
-		cs_nsec = cyc2ns(cs, (csnow - cs->wd_last) & cs->mask);
+		cs_nsec = clocksource_cyc2ns((csnow - cs->wd_last) &
+					     cs->mask, cs->mult, cs->shift);
 		cs->wd_last = csnow;
 		if (abs(cs_nsec - wd_nsec) > WATCHDOG_THRESHOLD) {
 			clocksource_unstable(cs, cs_nsec - wd_nsec);
Index: linux-2.6/kernel/time/timekeeping.c
===================================================================
--- linux-2.6.orig/kernel/time/timekeeping.c
+++ linux-2.6/kernel/time/timekeeping.c
@@ -19,6 +19,56 @@
 #include <linux/time.h>
 #include <linux/tick.h>
 
+/* Structure holding internal timekeeping values. */
+struct timekeeper {
+	struct clocksource *clock;
+	cycle_t cycle_interval;
+	u64	xtime_interval;
+	u32	raw_interval;
+	u64	xtime_nsec;
+	s64	ntp_error;
+};
+
+struct timekeeper timekeeper;
+
+/**
+ * timekeeper_setup_internals - Set up internals to use clocksource clock.
+ *
+ * @clock:		Pointer to clocksource.
+ *
+ * Calculates a fixed cycle/nsec interval for a given clocksource/adjustment
+ * pair and interval request.
+ *
+ * Unless you're the timekeeping code, you should not be using this!
+ */
+static void timekeeper_setup_internals(struct clocksource *clock)
+{
+	cycle_t interval;
+	u64 tmp;
+
+	timekeeper.clock = clock;
+	clock->cycle_last = clock->read(clock);
+
+	/* Do the ns -> cycle conversion first, using original mult */
+	tmp = NTP_INTERVAL_LENGTH;
+	tmp <<= clock->shift;
+	tmp += clock->mult_orig/2;
+	do_div(tmp, clock->mult_orig);
+	if (tmp == 0)
+		tmp = 1;
+
+	interval = (cycle_t) tmp;
+	timekeeper.cycle_interval = interval;
+
+	/* Go back from cycles -> shifted ns */
+	timekeeper.xtime_interval = (u64) interval * clock->mult;
+	timekeeper.raw_interval =
+		((u64) interval * clock->mult_orig) >> clock->shift;
+
+	timekeeper.xtime_nsec = 0;
+
+	timekeeper.ntp_error = 0;
+}
 
 /*
  * This read-write spinlock protects us from races in SMP while
@@ -46,6 +96,11 @@ struct timespec xtime __attribute__ ((al
 struct timespec wall_to_monotonic __attribute__ ((aligned (16)));
 static unsigned long total_sleep_time;		/* seconds */
 
+/*
+ * The raw monotonic time for the CLOCK_MONOTONIC_RAW posix clock.
+ */
+struct timespec raw_time;
+
 /* flag for if timekeeping is suspended */
 int __read_mostly timekeeping_suspended;
 
@@ -56,42 +111,42 @@ void update_xtime_cache(u64 nsec)
 	timespec_add_ns(&xtime_cache, nsec);
 }
 
-struct clocksource *clock;
-
 /* must hold xtime_lock */
 void timekeeping_leap_insert(int leapsecond)
 {
 	xtime.tv_sec += leapsecond;
 	wall_to_monotonic.tv_sec -= leapsecond;
-	update_vsyscall(&xtime, clock);
+	update_vsyscall(&xtime, timekeeper.clock);
 }
 
 #ifdef CONFIG_GENERIC_TIME
 /**
- * clocksource_forward_now - update clock to the current time
+ * timekeeping_forward_now - update clock to the current time
  *
  * Forward the current clock to update its state since the last call to
  * update_wall_time(). This is useful before significant clock changes,
  * as it avoids having to deal with this time offset explicitly.
  */
-static void clocksource_forward_now(void)
+static void timekeeping_forward_now(void)
 {
 	cycle_t cycle_now, cycle_delta;
+	struct clocksource *clock;
 	s64 nsec;
 
+	clock = timekeeper.clock;
 	cycle_now = clock->read(clock);
 	cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
 	clock->cycle_last = cycle_now;
 
-	nsec = cyc2ns(clock, cycle_delta);
+	nsec = clocksource_cyc2ns(cycle_delta, clock->mult, clock->shift);
 
 	/* If arch requires, add in gettimeoffset() */
 	nsec += arch_gettimeoffset();
 
 	timespec_add_ns(&xtime, nsec);
 
-	nsec = ((s64)cycle_delta * clock->mult_orig) >> clock->shift;
-	clock->raw_time.tv_nsec += nsec;
+	nsec = clocksource_cyc2ns(cycle_delta, clock->mult_orig, clock->shift);
+	timespec_add_ns(&raw_time, nsec);
 }
 
 /**
@@ -103,6 +158,7 @@ static void clocksource_forward_now(void
 void getnstimeofday(struct timespec *ts)
 {
 	cycle_t cycle_now, cycle_delta;
+	struct clocksource *clock;
 	unsigned long seq;
 	s64 nsecs;
 
@@ -114,13 +170,15 @@ void getnstimeofday(struct timespec *ts)
 		*ts = xtime;
 
 		/* read clocksource: */
+		clock = timekeeper.clock;
 		cycle_now = clock->read(clock);
 
 		/* calculate the delta since the last update_wall_time: */
 		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
 
 		/* convert to nanoseconds: */
-		nsecs = cyc2ns(clock, cycle_delta);
+		nsecs = clocksource_cyc2ns(cycle_delta, clock->mult,
+					   clock->shift);
 
 		/* If arch requires, add in gettimeoffset() */
 		nsecs += arch_gettimeoffset();
@@ -135,6 +193,7 @@ EXPORT_SYMBOL(getnstimeofday);
 ktime_t ktime_get(void)
 {
 	cycle_t cycle_now, cycle_delta;
+	struct clocksource *clock;
 	unsigned int seq;
 	s64 secs, nsecs;
 
@@ -146,13 +205,15 @@ ktime_t ktime_get(void)
 		nsecs = xtime.tv_nsec + wall_to_monotonic.tv_nsec;
 
 		/* read clocksource: */
+		clock = timekeeper.clock;
 		cycle_now = clock->read(clock);
 
 		/* calculate the delta since the last update_wall_time: */
 		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
 
 		/* convert to nanoseconds: */
-		nsecs += cyc2ns(clock, cycle_delta);
+		nsecs += clocksource_cyc2ns(cycle_delta, clock->mult,
+					    clock->shift);
 
 	} while (read_seqretry(&xtime_lock, seq));
 	/*
@@ -174,6 +235,7 @@ EXPORT_SYMBOL_GPL(ktime_get);
 void ktime_get_ts(struct timespec *ts)
 {
 	cycle_t cycle_now, cycle_delta;
+	struct clocksource *clock;
 	struct timespec tomono;
 	unsigned int seq;
 	s64 nsecs;
@@ -186,13 +248,15 @@ void ktime_get_ts(struct timespec *ts)
 		tomono = wall_to_monotonic;
 
 		/* read clocksource: */
+		clock = timekeeper.clock;
 		cycle_now = clock->read(clock);
 
 		/* calculate the delta since the last update_wall_time: */
 		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
 
 		/* convert to nanoseconds: */
-		nsecs = cyc2ns(clock, cycle_delta);
+		nsecs = clocksource_cyc2ns(cycle_delta, clock->mult,
+					   clock->shift);
 
 	} while (read_seqretry(&xtime_lock, seq));
 
@@ -233,7 +297,7 @@ int do_settimeofday(struct timespec *tv)
 
 	write_seqlock_irqsave(&xtime_lock, flags);
 
-	clocksource_forward_now();
+	timekeeping_forward_now();
 
 	ts_delta.tv_sec = tv->tv_sec - xtime.tv_sec;
 	ts_delta.tv_nsec = tv->tv_nsec - xtime.tv_nsec;
@@ -243,10 +307,10 @@ int do_settimeofday(struct timespec *tv)
 
 	update_xtime_cache(0);
 
-	clock->error = 0;
+	timekeeper.ntp_error = 0;
 	ntp_clear();
 
-	update_vsyscall(&xtime, clock);
+	update_vsyscall(&xtime, timekeeper.clock);
 
 	write_sequnlock_irqrestore(&xtime_lock, flags);
 
@@ -269,38 +333,25 @@ static void change_clocksource(void)
 
 	new = clocksource_get_next();
 
-	if (!new || clock == new)
+	if (!new || timekeeper.clock == new)
 		return;
 
-	clocksource_forward_now();
+	timekeeping_forward_now();
 
 	if (new->enable && ! new->enable(new))
 		return;
 	/* save mult_orig on enable */
 	new->mult_orig = new->mult;
 
-	new->raw_time = clock->raw_time;
-	old = clock;
-	clock = new;
+	old = timekeeper.clock;
+	timekeeper_setup_internals(new);
 	if (old->disable)
 		old->disable(old);
 
-	clock->cycle_last = clock->read(clock);
-	clock->error = 0;
-	clock->xtime_nsec = 0;
-	clocksource_calculate_interval(clock, NTP_INTERVAL_LENGTH);
-
 	tick_clock_notify();
-
-	/*
-	 * We're holding xtime lock and waking up klogd would deadlock
-	 * us on enqueue.  So no printing!
-	printk(KERN_INFO "Time: %s clocksource has been installed.\n",
-	       clock->name);
-	 */
 }
 #else /* GENERIC_TIME */
-static inline void clocksource_forward_now(void) { }
+static inline void timekeeping_forward_now(void) { }
 static inline void change_clocksource(void) { }
 
 /**
@@ -370,20 +421,23 @@ void getrawmonotonic(struct timespec *ts
 	unsigned long seq;
 	s64 nsecs;
 	cycle_t cycle_now, cycle_delta;
+	struct clocksource *clock;
 
 	do {
 		seq = read_seqbegin(&xtime_lock);
 
 		/* read clocksource: */
+		clock = timekeeper.clock;
 		cycle_now = clock->read(clock);
 
 		/* calculate the delta since the last update_wall_time: */
 		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
 
 		/* convert to nanoseconds: */
-		nsecs = ((s64)cycle_delta * clock->mult_orig) >> clock->shift;
+		nsecs = clocksource_cyc2ns(cycle_delta, clock->mult_orig,
+					   clock->shift);
 
-		*ts = clock->raw_time;
+		*ts = raw_time;
 
 	} while (read_seqretry(&xtime_lock, seq));
 
@@ -403,7 +457,7 @@ int timekeeping_valid_for_hres(void)
 	do {
 		seq = read_seqbegin(&xtime_lock);
 
-		ret = clock->flags & CLOCK_SOURCE_VALID_FOR_HRES;
+		ret = timekeeper.clock->flags & CLOCK_SOURCE_VALID_FOR_HRES;
 
 	} while (read_seqretry(&xtime_lock, seq));
 
@@ -429,6 +483,7 @@ unsigned long __attribute__((weak)) read
  */
 void __init timekeeping_init(void)
 {
+	struct clocksource *clock;
 	unsigned long flags;
 	unsigned long sec = read_persistent_clock();
 
@@ -441,11 +496,13 @@ void __init timekeeping_init(void)
 		clock->enable(clock);
 	/* save mult_orig on enable */
 	clock->mult_orig = clock->mult;
-	clocksource_calculate_interval(clock, NTP_INTERVAL_LENGTH);
-	clock->cycle_last = clock->read(clock);
+
+	timekeeper_setup_internals(clock);
 
 	xtime.tv_sec = sec;
 	xtime.tv_nsec = 0;
+	raw_time.tv_sec = 0;
+	raw_time.tv_nsec = 0;
 	set_normalized_timespec(&wall_to_monotonic,
 		-xtime.tv_sec, -xtime.tv_nsec);
 	update_xtime_cache(0);
@@ -482,8 +539,8 @@ static int timekeeping_resume(struct sys
 	}
 	update_xtime_cache(0);
 	/* re-base the last cycle value */
-	clock->cycle_last = clock->read(clock);
-	clock->error = 0;
+	timekeeper.clock->cycle_last = timekeeper.clock->read(timekeeper.clock);
+	timekeeper.ntp_error = 0;
 	timekeeping_suspended = 0;
 	write_sequnlock_irqrestore(&xtime_lock, flags);
 
@@ -504,7 +561,7 @@ static int timekeeping_suspend(struct sy
 	timekeeping_suspend_time = read_persistent_clock();
 
 	write_seqlock_irqsave(&xtime_lock, flags);
-	clocksource_forward_now();
+	timekeeping_forward_now();
 	timekeeping_suspended = 1;
 	write_sequnlock_irqrestore(&xtime_lock, flags);
 
@@ -539,7 +596,7 @@ device_initcall(timekeeping_init_device)
  * If the error is already larger, we look ahead even further
  * to compensate for late or lost adjustments.
  */
-static __always_inline int clocksource_bigadjust(s64 error, s64 *interval,
+static __always_inline int timekeeping_bigadjust(s64 error, s64 *interval,
 						 s64 *offset)
 {
 	s64 tick_error, i;
@@ -555,7 +612,7 @@ static __always_inline int clocksource_b
 	 * here.  This is tuned so that an error of about 1 msec is adjusted
 	 * within about 1 sec (or 2^20 nsec in 2^SHIFT_HZ ticks).
 	 */
-	error2 = clock->error >> (NTP_SCALE_SHIFT + 22 - 2 * SHIFT_HZ);
+	error2 = timekeeper.ntp_error >> (NTP_SCALE_SHIFT + 22 - 2 * SHIFT_HZ);
 	error2 = abs(error2);
 	for (look_ahead = 0; error2 > 0; look_ahead++)
 		error2 >>= 2;
@@ -564,8 +621,9 @@ static __always_inline int clocksource_b
 	 * Now calculate the error in (1 << look_ahead) ticks, but first
 	 * remove the single look ahead already included in the error.
 	 */
-	tick_error = tick_length >> (NTP_SCALE_SHIFT - clock->shift + 1);
-	tick_error -= clock->xtime_interval >> 1;
+	tick_error = tick_length >>
+			(NTP_SCALE_SHIFT - timekeeper.clock->shift + 1);
+	tick_error -= timekeeper.xtime_interval >> 1;
 	error = ((error - tick_error) >> look_ahead) + tick_error;
 
 	/* Finally calculate the adjustment shift value.  */
@@ -590,18 +648,19 @@ static __always_inline int clocksource_b
  * this is optimized for the most common adjustments of -1,0,1,
  * for other values we can do a bit more work.
  */
-static void clocksource_adjust(s64 offset)
+static void timekeeping_adjust(s64 offset)
 {
-	s64 error, interval = clock->cycle_interval;
+	s64 error, interval = timekeeper.cycle_interval;
 	int adj;
 
-	error = clock->error >> (NTP_SCALE_SHIFT - clock->shift - 1);
+	error = timekeeper.ntp_error >>
+		(NTP_SCALE_SHIFT - timekeeper.clock->shift - 1);
 	if (error > interval) {
 		error >>= 2;
 		if (likely(error <= interval))
 			adj = 1;
 		else
-			adj = clocksource_bigadjust(error, &interval, &offset);
+			adj = timekeeping_bigadjust(error, &interval, &offset);
 	} else if (error < -interval) {
 		error >>= 2;
 		if (likely(error >= -interval)) {
@@ -609,15 +668,15 @@ static void clocksource_adjust(s64 offse
 			interval = -interval;
 			offset = -offset;
 		} else
-			adj = clocksource_bigadjust(error, &interval, &offset);
+			adj = timekeeping_bigadjust(error, &interval, &offset);
 	} else
 		return;
 
-	clock->mult += adj;
-	clock->xtime_interval += interval;
-	clock->xtime_nsec -= offset;
-	clock->error -= (interval - offset) <<
-			(NTP_SCALE_SHIFT - clock->shift);
+	timekeeper.clock->mult += adj;
+	timekeeper.xtime_interval += interval;
+	timekeeper.xtime_nsec -= offset;
+	timekeeper.ntp_error -= (interval - offset) <<
+			(NTP_SCALE_SHIFT - timekeeper.clock->shift);
 }
 
 /**
@@ -627,53 +686,57 @@ static void clocksource_adjust(s64 offse
  */
 void update_wall_time(void)
 {
+	struct clocksource *clock;
 	cycle_t offset;
+	s64 nsecs;
 
 	/* Make sure we're fully resumed: */
 	if (unlikely(timekeeping_suspended))
 		return;
 
+	clock = timekeeper.clock;
 #ifdef CONFIG_GENERIC_TIME
 	offset = (clock->read(clock) - clock->cycle_last) & clock->mask;
 #else
-	offset = clock->cycle_interval;
+	offset = timekeeper.cycle_interval;
 #endif
-	clock->xtime_nsec = (s64)xtime.tv_nsec << clock->shift;
+	timekeeper.xtime_nsec = (s64)xtime.tv_nsec << clock->shift;
 
 	/* normally this loop will run just once, however in the
 	 * case of lost or late ticks, it will accumulate correctly.
 	 */
-	while (offset >= clock->cycle_interval) {
+	while (offset >= timekeeper.cycle_interval) {
 		/* accumulate one interval */
-		offset -= clock->cycle_interval;
-		clock->cycle_last += clock->cycle_interval;
+		offset -= timekeeper.cycle_interval;
+		clock->cycle_last += timekeeper.cycle_interval;
 
-		clock->xtime_nsec += clock->xtime_interval;
-		if (clock->xtime_nsec >= (u64)NSEC_PER_SEC << clock->shift) {
-			clock->xtime_nsec -= (u64)NSEC_PER_SEC << clock->shift;
+		timekeeper.xtime_nsec += timekeeper.xtime_interval;
+		if (timekeeper.xtime_nsec >= (u64)NSEC_PER_SEC << clock->shift) {
+			timekeeper.xtime_nsec -= (u64)NSEC_PER_SEC << clock->shift;
 			xtime.tv_sec++;
 			second_overflow();
 		}
 
-		clock->raw_time.tv_nsec += clock->raw_interval;
-		if (clock->raw_time.tv_nsec >= NSEC_PER_SEC) {
-			clock->raw_time.tv_nsec -= NSEC_PER_SEC;
-			clock->raw_time.tv_sec++;
+		raw_time.tv_nsec += timekeeper.raw_interval;
+		if (raw_time.tv_nsec >= NSEC_PER_SEC) {
+			raw_time.tv_nsec -= NSEC_PER_SEC;
+			raw_time.tv_sec++;
 		}
 
 		/* accumulate error between NTP and clock interval */
-		clock->error += tick_length;
-		clock->error -= clock->xtime_interval << (NTP_SCALE_SHIFT - clock->shift);
+		timekeeper.ntp_error += tick_length;
+		timekeeper.ntp_error -= timekeeper.xtime_interval <<
+					(NTP_SCALE_SHIFT - clock->shift);
 	}
 
 	/* correct the clock when NTP error is too big */
-	clocksource_adjust(offset);
+	timekeeping_adjust(offset);
 
 	/*
 	 * Since in the loop above, we accumulate any amount of time
 	 * in xtime_nsec over a second into xtime.tv_sec, its possible for
 	 * xtime_nsec to be fairly small after the loop. Further, if we're
-	 * slightly speeding the clocksource up in clocksource_adjust(),
+	 * slightly speeding the clocksource up in timekeeping_adjust(),
 	 * its possible the required corrective factor to xtime_nsec could
 	 * cause it to underflow.
 	 *
@@ -685,24 +748,26 @@ void update_wall_time(void)
 	 * We'll correct this error next time through this function, when
 	 * xtime_nsec is not as small.
 	 */
-	if (unlikely((s64)clock->xtime_nsec < 0)) {
-		s64 neg = -(s64)clock->xtime_nsec;
-		clock->xtime_nsec = 0;
-		clock->error += neg << (NTP_SCALE_SHIFT - clock->shift);
+	if (unlikely((s64)timekeeper.xtime_nsec < 0)) {
+		s64 neg = -(s64)timekeeper.xtime_nsec;
+		timekeeper.xtime_nsec = 0;
+		timekeeper.ntp_error += neg << (NTP_SCALE_SHIFT - clock->shift);
 	}
 
 	/* store full nanoseconds into xtime after rounding it up and
 	 * add the remainder to the error difference.
 	 */
-	xtime.tv_nsec = ((s64)clock->xtime_nsec >> clock->shift) + 1;
-	clock->xtime_nsec -= (s64)xtime.tv_nsec << clock->shift;
-	clock->error += clock->xtime_nsec << (NTP_SCALE_SHIFT - clock->shift);
+	xtime.tv_nsec = ((s64)timekeeper.xtime_nsec >> clock->shift) + 1;
+	timekeeper.xtime_nsec -= (s64)xtime.tv_nsec << clock->shift;
+	timekeeper.ntp_error += timekeeper.xtime_nsec <<
+				(NTP_SCALE_SHIFT - clock->shift);
 
-	update_xtime_cache(cyc2ns(clock, offset));
+	nsecs = clocksource_cyc2ns(offset, clock->mult, clock->shift);
+	update_xtime_cache(nsecs);
 
 	/* check to see if there is a new clocksource to use */
 	change_clocksource();
-	update_vsyscall(&xtime, clock);
+	update_vsyscall(&xtime, timekeeper.clock);
 }
 
 /**

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [RFC][patch 09/12] add xtime_shift and ntp_error_shift to struct timekeeper
  2009-07-29 13:41 [RFC][patch 00/12] clocksource / timekeeping rework V2 Martin Schwidefsky
                   ` (7 preceding siblings ...)
  2009-07-29 13:41 ` [RFC][patch 08/12] introduce struct timekeeper Martin Schwidefsky
@ 2009-07-29 13:41 ` Martin Schwidefsky
  2009-07-30 22:15   ` john stultz
  2009-07-29 13:41 ` [RFC][patch 10/12] move NTP adjusted clock multiplier " Martin Schwidefsky
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 32+ messages in thread
From: Martin Schwidefsky @ 2009-07-29 13:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Thomas Gleixner, john stultz, Daniel Walker,
	Martin Schwidefsky

[-- Attachment #1: timekeeper-shift.diff --]
[-- Type: text/plain, Size: 4897 bytes --]

From: Martin Schwidefsky <schwidefsky@de.ibm.com>

The xtime_nsec value in the timekeeper structure is shifted by a few
bits to improve precision. This shift happens to be the same value as
clock->shift. To improve readability add xtime_shift to the timekeeper
and use it instead of clock->shift. Likewise add ntp_error_shift and
replace all (NTP_SCALE_SHIFT - clock->shift) expressions with it.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Daniel Walker <dwalker@fifo99.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---
 kernel/time/timekeeping.c |   33 ++++++++++++++++++---------------
 1 file changed, 18 insertions(+), 15 deletions(-)

Index: linux-2.6/kernel/time/timekeeping.c
===================================================================
--- linux-2.6.orig/kernel/time/timekeeping.c
+++ linux-2.6/kernel/time/timekeeping.c
@@ -27,6 +27,8 @@ struct timekeeper {
 	u32	raw_interval;
 	u64	xtime_nsec;
 	s64	ntp_error;
+	int	xtime_shift;
+	int	ntp_error_shift;
 };
 
 struct timekeeper timekeeper;
@@ -66,8 +68,10 @@ static void timekeeper_setup_internals(s
 		((u64) interval * clock->mult_orig) >> clock->shift;
 
 	timekeeper.xtime_nsec = 0;
+	timekeeper.xtime_shift = clock->shift;
 
 	timekeeper.ntp_error = 0;
+	timekeeper.ntp_error_shift = NTP_SCALE_SHIFT - clock->shift;
 }
 
 /*
@@ -621,8 +625,7 @@ static __always_inline int timekeeping_b
 	 * Now calculate the error in (1 << look_ahead) ticks, but first
 	 * remove the single look ahead already included in the error.
 	 */
-	tick_error = tick_length >>
-			(NTP_SCALE_SHIFT - timekeeper.clock->shift + 1);
+	tick_error = tick_length >> (timekeeper.ntp_error_shift + 1);
 	tick_error -= timekeeper.xtime_interval >> 1;
 	error = ((error - tick_error) >> look_ahead) + tick_error;
 
@@ -653,8 +656,7 @@ static void timekeeping_adjust(s64 offse
 	s64 error, interval = timekeeper.cycle_interval;
 	int adj;
 
-	error = timekeeper.ntp_error >>
-		(NTP_SCALE_SHIFT - timekeeper.clock->shift - 1);
+	error = timekeeper.ntp_error >> (timekeeper.ntp_error_shift - 1);
 	if (error > interval) {
 		error >>= 2;
 		if (likely(error <= interval))
@@ -676,7 +678,7 @@ static void timekeeping_adjust(s64 offse
 	timekeeper.xtime_interval += interval;
 	timekeeper.xtime_nsec -= offset;
 	timekeeper.ntp_error -= (interval - offset) <<
-			(NTP_SCALE_SHIFT - timekeeper.clock->shift);
+ 				timekeeper.ntp_error_shift;
 }
 
 /**
@@ -688,7 +690,7 @@ void update_wall_time(void)
 {
 	struct clocksource *clock;
 	cycle_t offset;
-	s64 nsecs;
+	u64 nsecs;
 
 	/* Make sure we're fully resumed: */
 	if (unlikely(timekeeping_suspended))
@@ -700,7 +702,7 @@ void update_wall_time(void)
 #else
 	offset = timekeeper.cycle_interval;
 #endif
-	timekeeper.xtime_nsec = (s64)xtime.tv_nsec << clock->shift;
+	timekeeper.xtime_nsec = (s64)xtime.tv_nsec << timekeeper.xtime_shift;
 
 	/* normally this loop will run just once, however in the
 	 * case of lost or late ticks, it will accumulate correctly.
@@ -711,8 +713,9 @@ void update_wall_time(void)
 		clock->cycle_last += timekeeper.cycle_interval;
 
 		timekeeper.xtime_nsec += timekeeper.xtime_interval;
-		if (timekeeper.xtime_nsec >= (u64)NSEC_PER_SEC << clock->shift) {
-			timekeeper.xtime_nsec -= (u64)NSEC_PER_SEC << clock->shift;
+		nsecs = (u64) NSEC_PER_SEC << timekeeper.xtime_shift;
+		if (timekeeper.xtime_nsec >= nsecs) {
+			timekeeper.xtime_nsec -= nsecs;
 			xtime.tv_sec++;
 			second_overflow();
 		}
@@ -726,7 +729,7 @@ void update_wall_time(void)
 		/* accumulate error between NTP and clock interval */
 		timekeeper.ntp_error += tick_length;
 		timekeeper.ntp_error -= timekeeper.xtime_interval <<
-					(NTP_SCALE_SHIFT - clock->shift);
+					timekeeper.ntp_error_shift;
 	}
 
 	/* correct the clock when NTP error is too big */
@@ -751,16 +754,16 @@ void update_wall_time(void)
 	if (unlikely((s64)timekeeper.xtime_nsec < 0)) {
 		s64 neg = -(s64)timekeeper.xtime_nsec;
 		timekeeper.xtime_nsec = 0;
-		timekeeper.ntp_error += neg << (NTP_SCALE_SHIFT - clock->shift);
+		timekeeper.ntp_error += neg << timekeeper.ntp_error_shift;
 	}
 
 	/* store full nanoseconds into xtime after rounding it up and
 	 * add the remainder to the error difference.
 	 */
-	xtime.tv_nsec = ((s64)timekeeper.xtime_nsec >> clock->shift) + 1;
-	timekeeper.xtime_nsec -= (s64)xtime.tv_nsec << clock->shift;
-	timekeeper.ntp_error += timekeeper.xtime_nsec <<
-				(NTP_SCALE_SHIFT - clock->shift);
+	xtime.tv_nsec =	((s64) timekeeper.xtime_nsec >> timekeeper.xtime_shift) + 1;
+	timekeeper.xtime_nsec -= (s64) xtime.tv_nsec << timekeeper.xtime_shift;
+	timekeeper.ntp_error +=	timekeeper.xtime_nsec <<
+				timekeeper.ntp_error_shift;
 
 	nsecs = clocksource_cyc2ns(offset, clock->mult, clock->shift);
 	update_xtime_cache(nsecs);

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



* [RFC][patch 10/12] move NTP adjusted clock multiplier to struct timekeeper
  2009-07-29 13:41 [RFC][patch 00/12] clocksource / timekeeping rework V2 Martin Schwidefsky
                   ` (8 preceding siblings ...)
  2009-07-29 13:41 ` [RFC][patch 09/12] add xtime_shift and ntp_error_shift to " Martin Schwidefsky
@ 2009-07-29 13:41 ` Martin Schwidefsky
  2009-07-30 22:04   ` john stultz
  2009-08-13 11:15   ` Linus Walleij
  2009-07-29 13:41 ` [RFC][patch 11/12] timekeeper read clock helper functions Martin Schwidefsky
                   ` (2 subsequent siblings)
  12 siblings, 2 replies; 32+ messages in thread
From: Martin Schwidefsky @ 2009-07-29 13:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Thomas Gleixner, john stultz, Daniel Walker,
	Martin Schwidefsky

[-- Attachment #1: timekeeper-mult.diff --]
[-- Type: text/plain, Size: 7157 bytes --]

From: Martin Schwidefsky <schwidefsky@de.ibm.com>

The clocksource structure has two multipliers, the unmodified multiplier
clock->mult_orig and the NTP corrected multiplier clock->mult. The NTP
corrected multiplier does not belong in struct clocksource; it is
private information of the timekeeping code. Add a mult field to struct
timekeeper to contain the NTP corrected value, keep the unmodified
multiplier in clock->mult and remove clock->mult_orig. For consistency
add the shift value associated with the NTP corrected mult value to
struct timekeeper as well.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Daniel Walker <dwalker@fifo99.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---
 arch/arm/plat-omap/common.c |    2 +-
 include/linux/clocksource.h |    4 +---
 kernel/time/timekeeping.c   |   43 +++++++++++++++++++++++++------------------
 3 files changed, 27 insertions(+), 22 deletions(-)

Index: linux-2.6/include/linux/clocksource.h
===================================================================
--- linux-2.6.orig/include/linux/clocksource.h
+++ linux-2.6/include/linux/clocksource.h
@@ -149,8 +149,7 @@ extern u64 timecounter_cyc2time(struct t
  * @disable:		optional function to disable the clocksource
  * @mask:		bitmask for two's complement
  *			subtraction of non 64 bit counters
- * @mult:		cycle to nanosecond multiplier (adjusted by NTP)
- * @mult_orig:		cycle to nanosecond multiplier (unadjusted by NTP)
+ * @mult:		cycle to nanosecond multiplier
  * @shift:		cycle to nanosecond divisor (power of two)
  * @flags:		flags describing special properties
  * @vread:		vsyscall based read
@@ -168,7 +167,6 @@ struct clocksource {
 	void (*disable)(struct clocksource *cs);
 	cycle_t mask;
 	u32 mult;
-	u32 mult_orig;
 	u32 shift;
 	unsigned long flags;
 	cycle_t (*vread)(void);
Index: linux-2.6/kernel/time/timekeeping.c
===================================================================
--- linux-2.6.orig/kernel/time/timekeeping.c
+++ linux-2.6/kernel/time/timekeeping.c
@@ -29,6 +29,8 @@ struct timekeeper {
 	s64	ntp_error;
 	int	xtime_shift;
 	int	ntp_error_shift;
+	u32	mult;			/* NTP adjusted clock multiplier */
+	u32	shift;			/* NTP adjusted clock shift */
 };
 
 struct timekeeper timekeeper;
@@ -54,8 +56,8 @@ static void timekeeper_setup_internals(s
 	/* Do the ns -> cycle conversion first, using original mult */
 	tmp = NTP_INTERVAL_LENGTH;
 	tmp <<= clock->shift;
-	tmp += clock->mult_orig/2;
-	do_div(tmp, clock->mult_orig);
+	tmp += clock->mult/2;
+	do_div(tmp, clock->mult);
 	if (tmp == 0)
 		tmp = 1;
 
@@ -65,13 +67,21 @@ static void timekeeper_setup_internals(s
 	/* Go back from cycles -> shifted ns */
 	timekeeper.xtime_interval = (u64) interval * clock->mult;
 	timekeeper.raw_interval =
-		((u64) interval * clock->mult_orig) >> clock->shift;
+		((u64) interval * clock->mult) >> clock->shift;
 
 	timekeeper.xtime_nsec = 0;
 	timekeeper.xtime_shift = clock->shift;
 
 	timekeeper.ntp_error = 0;
 	timekeeper.ntp_error_shift = NTP_SCALE_SHIFT - clock->shift;
+
+	/*
+	 * The timekeeper keeps its own mult & shift values for the
+	 * currently active clocksource. These values will be adjusted
+	 * via NTP to counteract clock drift.
+	 */
+	timekeeper.mult = clock->mult;
+	timekeeper.shift = clock->shift;
 }
 
 /*
@@ -142,14 +152,15 @@ static void timekeeping_forward_now(void
 	cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
 	clock->cycle_last = cycle_now;
 
-	nsec = clocksource_cyc2ns(cycle_delta, clock->mult, clock->shift);
+	nsec = clocksource_cyc2ns(cycle_delta, timekeeper.mult,
+				  timekeeper.shift);
 
 	/* If arch requires, add in gettimeoffset() */
 	nsec += arch_gettimeoffset();
 
 	timespec_add_ns(&xtime, nsec);
 
-	nsec = clocksource_cyc2ns(cycle_delta, clock->mult_orig, clock->shift);
+	nsec = clocksource_cyc2ns(cycle_delta, clock->mult, clock->shift);
 	timespec_add_ns(&raw_time, nsec);
 }
 
@@ -181,8 +192,8 @@ void getnstimeofday(struct timespec *ts)
 		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
 
 		/* convert to nanoseconds: */
-		nsecs = clocksource_cyc2ns(cycle_delta, clock->mult,
-					   clock->shift);
+		nsecs = clocksource_cyc2ns(cycle_delta, timekeeper.mult,
+					   timekeeper.shift);
 
 		/* If arch requires, add in gettimeoffset() */
 		nsecs += arch_gettimeoffset();
@@ -216,8 +227,8 @@ ktime_t ktime_get(void)
 		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
 
 		/* convert to nanoseconds: */
-		nsecs += clocksource_cyc2ns(cycle_delta, clock->mult,
-					    clock->shift);
+		nsecs += clocksource_cyc2ns(cycle_delta, timekeeper.mult,
+					    timekeeper.shift);
 
 	} while (read_seqretry(&xtime_lock, seq));
 	/*
@@ -259,8 +270,8 @@ void ktime_get_ts(struct timespec *ts)
 		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
 
 		/* convert to nanoseconds: */
-		nsecs = clocksource_cyc2ns(cycle_delta, clock->mult,
-					   clock->shift);
+		nsecs = clocksource_cyc2ns(cycle_delta, timekeeper.mult,
+					   timekeeper.shift);
 
 	} while (read_seqretry(&xtime_lock, seq));
 
@@ -344,8 +355,6 @@ static void change_clocksource(void)
 
 	if (new->enable && ! new->enable(new))
 		return;
-	/* save mult_orig on enable */
-	new->mult_orig = new->mult;
 
 	old = timekeeper.clock;
 	timekeeper_setup_internals(new);
@@ -438,7 +447,7 @@ void getrawmonotonic(struct timespec *ts
 		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
 
 		/* convert to nanoseconds: */
-		nsecs = clocksource_cyc2ns(cycle_delta, clock->mult_orig,
+		nsecs = clocksource_cyc2ns(cycle_delta, clock->mult,
 					   clock->shift);
 
 		*ts = raw_time;
@@ -498,8 +507,6 @@ void __init timekeeping_init(void)
 	clock = clocksource_default_clock();
 	if (clock->enable)
 		clock->enable(clock);
-	/* save mult_orig on enable */
-	clock->mult_orig = clock->mult;
 
 	timekeeper_setup_internals(clock);
 
@@ -674,7 +681,7 @@ static void timekeeping_adjust(s64 offse
 	} else
 		return;
 
-	timekeeper.clock->mult += adj;
+	timekeeper.mult += adj;
 	timekeeper.xtime_interval += interval;
 	timekeeper.xtime_nsec -= offset;
 	timekeeper.ntp_error -= (interval - offset) <<
@@ -765,7 +772,7 @@ void update_wall_time(void)
 	timekeeper.ntp_error +=	timekeeper.xtime_nsec <<
 				timekeeper.ntp_error_shift;
 
-	nsecs = clocksource_cyc2ns(offset, clock->mult, clock->shift);
+	nsecs = clocksource_cyc2ns(offset, timekeeper.mult, timekeeper.shift);
 	update_xtime_cache(nsecs);
 
 	/* check to see if there is a new clocksource to use */
Index: linux-2.6/arch/arm/plat-omap/common.c
===================================================================
--- linux-2.6.orig/arch/arm/plat-omap/common.c
+++ linux-2.6/arch/arm/plat-omap/common.c
@@ -256,7 +256,7 @@ unsigned long long sched_clock(void)
 	unsigned long long ret;
 
 	ret = (unsigned long long)clocksource_32k.read(&clocksource_32k);
-	ret = (ret * clocksource_32k.mult_orig) >> clocksource_32k.shift;
+	ret = (ret * clocksource_32k.mult) >> clocksource_32k.shift;
 	return ret;
 }
 

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



* [RFC][patch 11/12] timekeeper read clock helper functions
  2009-07-29 13:41 [RFC][patch 00/12] clocksource / timekeeping rework V2 Martin Schwidefsky
                   ` (9 preceding siblings ...)
  2009-07-29 13:41 ` [RFC][patch 10/12] move NTP adjusted clock multiplier " Martin Schwidefsky
@ 2009-07-29 13:41 ` Martin Schwidefsky
  2009-07-30 21:39   ` john stultz
  2009-07-29 13:41 ` [RFC][patch 12/12] update clocksource with stop_machine Martin Schwidefsky
  2009-07-29 15:10 ` [RFC][patch 00/12] clocksource / timekeeping rework V2 Daniel Walker
  12 siblings, 1 reply; 32+ messages in thread
From: Martin Schwidefsky @ 2009-07-29 13:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Thomas Gleixner, john stultz, Daniel Walker,
	Martin Schwidefsky

[-- Attachment #1: timekeeper-helper.diff --]
[-- Type: text/plain, Size: 5067 bytes --]

From: Martin Schwidefsky <schwidefsky@de.ibm.com>

Add timekeeper_read_clock_ntp and timekeeper_read_clock_raw and use
them for getnstimeofday, ktime_get, ktime_get_ts and getrawmonotonic.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Daniel Walker <dwalker@fifo99.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---
 kernel/time/timekeeping.c |   91 +++++++++++++++++++---------------------------
 1 file changed, 38 insertions(+), 53 deletions(-)

Index: linux-2.6/kernel/time/timekeeping.c
===================================================================
--- linux-2.6.orig/kernel/time/timekeeping.c
+++ linux-2.6/kernel/time/timekeeping.c
@@ -84,6 +84,40 @@ static void timekeeper_setup_internals(s
 	timekeeper.shift = clock->shift;
 }
 
+/* Timekeeper helper functions. */
+static inline s64 timekeeper_read_clock_ntp(void)
+{
+	cycle_t cycle_now, cycle_delta;
+	struct clocksource *clock;
+
+	/* read clocksource: */
+	clock = timekeeper.clock;
+	cycle_now = clock->read(clock);
+
+	/* calculate the delta since the last update_wall_time: */
+	cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
+
+	/* return the delta converted to nanoseconds using the NTP adjusted mult. */
+	return clocksource_cyc2ns(cycle_delta, timekeeper.mult,
+				  timekeeper.shift);
+}
+
+static inline s64 timekeeper_read_clock_raw(void)
+{
+	cycle_t cycle_now, cycle_delta;
+	struct clocksource *clock;
+
+	/* read clocksource: */
+	clock = timekeeper.clock;
+	cycle_now = clock->read(clock);
+
+	/* calculate the delta since the last update_wall_time: */
+	cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
+
+	/* return the delta converted to nanoseconds using the raw (unadjusted) mult. */
+	return clocksource_cyc2ns(cycle_delta, clock->mult, clock->shift);
+}
+
 /*
  * This read-write spinlock protects us from races in SMP while
  * playing with xtime.
@@ -172,8 +206,6 @@ static void timekeeping_forward_now(void
  */
 void getnstimeofday(struct timespec *ts)
 {
-	cycle_t cycle_now, cycle_delta;
-	struct clocksource *clock;
 	unsigned long seq;
 	s64 nsecs;
 
@@ -183,17 +215,7 @@ void getnstimeofday(struct timespec *ts)
 		seq = read_seqbegin(&xtime_lock);
 
 		*ts = xtime;
-
-		/* read clocksource: */
-		clock = timekeeper.clock;
-		cycle_now = clock->read(clock);
-
-		/* calculate the delta since the last update_wall_time: */
-		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
-
-		/* convert to nanoseconds: */
-		nsecs = clocksource_cyc2ns(cycle_delta, timekeeper.mult,
-					   timekeeper.shift);
+		nsecs = timekeeper_read_clock_ntp();
 
 		/* If arch requires, add in gettimeoffset() */
 		nsecs += arch_gettimeoffset();
@@ -207,8 +229,6 @@ EXPORT_SYMBOL(getnstimeofday);
 
 ktime_t ktime_get(void)
 {
-	cycle_t cycle_now, cycle_delta;
-	struct clocksource *clock;
 	unsigned int seq;
 	s64 secs, nsecs;
 
@@ -218,17 +238,7 @@ ktime_t ktime_get(void)
 		seq = read_seqbegin(&xtime_lock);
 		secs = xtime.tv_sec + wall_to_monotonic.tv_sec;
 		nsecs = xtime.tv_nsec + wall_to_monotonic.tv_nsec;
-
-		/* read clocksource: */
-		clock = timekeeper.clock;
-		cycle_now = clock->read(clock);
-
-		/* calculate the delta since the last update_wall_time: */
-		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
-
-		/* convert to nanoseconds: */
-		nsecs += clocksource_cyc2ns(cycle_delta, timekeeper.mult,
-					    timekeeper.shift);
+		nsecs += timekeeper_read_clock_ntp();
 
 	} while (read_seqretry(&xtime_lock, seq));
 	/*
@@ -249,8 +259,6 @@ EXPORT_SYMBOL_GPL(ktime_get);
  */
 void ktime_get_ts(struct timespec *ts)
 {
-	cycle_t cycle_now, cycle_delta;
-	struct clocksource *clock;
 	struct timespec tomono;
 	unsigned int seq;
 	s64 nsecs;
@@ -261,17 +269,7 @@ void ktime_get_ts(struct timespec *ts)
 		seq = read_seqbegin(&xtime_lock);
 		*ts = xtime;
 		tomono = wall_to_monotonic;
-
-		/* read clocksource: */
-		clock = timekeeper.clock;
-		cycle_now = clock->read(clock);
-
-		/* calculate the delta since the last update_wall_time: */
-		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
-
-		/* convert to nanoseconds: */
-		nsecs = clocksource_cyc2ns(cycle_delta, timekeeper.mult,
-					   timekeeper.shift);
+		nsecs = timekeeper_read_clock_ntp();
 
 	} while (read_seqretry(&xtime_lock, seq));
 
@@ -433,23 +431,10 @@ void getrawmonotonic(struct timespec *ts
 {
 	unsigned long seq;
 	s64 nsecs;
-	cycle_t cycle_now, cycle_delta;
-	struct clocksource *clock;
 
 	do {
 		seq = read_seqbegin(&xtime_lock);
-
-		/* read clocksource: */
-		clock = timekeeper.clock;
-		cycle_now = clock->read(clock);
-
-		/* calculate the delta since the last update_wall_time: */
-		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
-
-		/* convert to nanoseconds: */
-		nsecs = clocksource_cyc2ns(cycle_delta, clock->mult,
-					   clock->shift);
-
+		nsecs = timekeeper_read_clock_raw();
 		*ts = raw_time;
 
 	} while (read_seqretry(&xtime_lock, seq));

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



* [RFC][patch 12/12] update clocksource with stop_machine
  2009-07-29 13:41 [RFC][patch 00/12] clocksource / timekeeping rework V2 Martin Schwidefsky
                   ` (10 preceding siblings ...)
  2009-07-29 13:41 ` [RFC][patch 11/12] timekeeper read clock helper functions Martin Schwidefsky
@ 2009-07-29 13:41 ` Martin Schwidefsky
  2009-07-29 15:10 ` [RFC][patch 00/12] clocksource / timekeeping rework V2 Daniel Walker
  12 siblings, 0 replies; 32+ messages in thread
From: Martin Schwidefsky @ 2009-07-29 13:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Thomas Gleixner, john stultz, Daniel Walker,
	Martin Schwidefsky

[-- Attachment #1: timekeeper-stop-machine.diff --]
[-- Type: text/plain, Size: 10385 bytes --]

From: Martin Schwidefsky <schwidefsky@de.ibm.com>

update_wall_time calls change_clocksource HZ times per second to check
if a new clock source is available. In close to 100% of all calls there
is no new clock. Replace the tick-based check with a one-off update done
via stop_machine.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Daniel Walker <dwalker@fifo99.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---
 include/linux/clocksource.h |    2 
 kernel/time/clocksource.c   |  117 ++++++++++++++++----------------------------
 kernel/time/timekeeping.c   |   40 +++++++++------
 3 files changed, 73 insertions(+), 86 deletions(-)

Index: linux-2.6/include/linux/clocksource.h
===================================================================
--- linux-2.6.orig/include/linux/clocksource.h
+++ linux-2.6/include/linux/clocksource.h
@@ -291,4 +291,6 @@ static inline void update_vsyscall_tz(vo
 }
 #endif
 
+extern void timekeeping_notify(struct clocksource *clock);
+
 #endif /* _LINUX_CLOCKSOURCE_H */
Index: linux-2.6/kernel/time/clocksource.c
===================================================================
--- linux-2.6.orig/kernel/time/clocksource.c
+++ linux-2.6/kernel/time/clocksource.c
@@ -109,35 +109,17 @@ EXPORT_SYMBOL(timecounter_cyc2time);
 /*[Clocksource internal variables]---------
  * curr_clocksource:
  *	currently selected clocksource. Initialized to clocksource_jiffies.
- * next_clocksource:
- *	pending next selected clocksource.
  * clocksource_list:
  *	linked list with the registered clocksources
- * clocksource_lock:
- *	protects manipulations to curr_clocksource and next_clocksource
- *	and the clocksource_list
+ * clocksource_mutex:
+ *	protects manipulations to curr_clocksource and the clocksource_list
  * override_name:
  *	Name of the user-specified clocksource.
  */
 static struct clocksource *curr_clocksource;
-static struct clocksource *next_clocksource;
 static LIST_HEAD(clocksource_list);
-static DEFINE_SPINLOCK(clocksource_lock);
+static DEFINE_MUTEX(clocksource_mutex);
 static char override_name[32];
-static int finished_booting;
-
-/* clocksource_done_booting - Called near the end of core bootup
- *
- * Hack to avoid lots of clocksource churn at boot time.
- * We use fs_initcall because we want this to start before
- * device_initcall but after subsys_initcall.
- */
-static int __init clocksource_done_booting(void)
-{
-	finished_booting = 1;
-	return 0;
-}
-fs_initcall(clocksource_done_booting);
 
 #ifdef CONFIG_CLOCKSOURCE_WATCHDOG
 static LIST_HEAD(watchdog_list);
@@ -355,18 +337,16 @@ static inline void clocksource_resume_wa
 void clocksource_resume(void)
 {
 	struct clocksource *cs;
-	unsigned long flags;
 
-	spin_lock_irqsave(&clocksource_lock, flags);
+	mutex_lock(&clocksource_mutex);
 
-	list_for_each_entry(cs, &clocksource_list, list) {
+	list_for_each_entry(cs, &clocksource_list, list)
 		if (cs->resume)
 			cs->resume();
-	}
 
 	clocksource_resume_watchdog();
 
-	spin_unlock_irqrestore(&clocksource_lock, flags);
+	mutex_unlock(&clocksource_mutex);
 }
 
 /**
@@ -382,33 +362,19 @@ void clocksource_touch_watchdog(void)
 }
 
 #ifdef CONFIG_GENERIC_TIME
-/**
- * clocksource_get_next - Returns the selected clocksource
- *
- */
-struct clocksource *clocksource_get_next(void)
-{
-	unsigned long flags;
-
-	spin_lock_irqsave(&clocksource_lock, flags);
-	if (next_clocksource && finished_booting) {
-		curr_clocksource = next_clocksource;
-		next_clocksource = NULL;
-	}
-	spin_unlock_irqrestore(&clocksource_lock, flags);
 
-	return curr_clocksource;
-}
+static int finished_booting;
 
 /**
  * clocksource_select - Select the best clocksource available
  */
-static int clocksource_select(void)
+static void clocksource_select(void)
 {
 	struct clocksource *best, *cs;
-	int rc;
 
-	rc = 0;
+	if (!finished_booting)
+		return;
+
 	best = NULL;
 	list_for_each_entry(cs, &clocksource_list, list) {
 		/* Check for the override clocksource. */
@@ -434,17 +400,31 @@ static int clocksource_select(void)
 		if (!best || cs->rating > best->rating)
 			best = cs;
 	}
-	if (curr_clocksource != best)
-		next_clocksource = best;
-	return rc;
+	if (curr_clocksource != best) {
+		printk(KERN_INFO "Switching to clocksource %s\n", best->name);
+		curr_clocksource = best;
+		timekeeping_notify(curr_clocksource);
+	}
 }
 
-#else /* CONFIG_GENERIC_TIME */
-
-static inline int clocksource_select(void)
+/*
+ * clocksource_done_booting - Called near the end of core bootup
+ *
+ * Hack to avoid lots of clocksource churn at boot time.
+ * We use fs_initcall because we want this to start before
+ * device_initcall but after subsys_initcall.
+ */
+static int __init clocksource_done_booting(void)
 {
+	finished_booting = 1;
+	clocksource_select();
 	return 0;
 }
+fs_initcall(clocksource_done_booting);
+
+#else /* CONFIG_GENERIC_TIME */
+
+static inline void clocksource_select(void) { }
 
 #endif
 
@@ -456,13 +436,11 @@ static inline int clocksource_select(voi
  */
 int clocksource_register(struct clocksource *cs)
 {
-	unsigned long flags;
-
-	spin_lock_irqsave(&clocksource_lock, flags);
+	mutex_lock(&clocksource_mutex);
 	list_add(&cs->list, &clocksource_list);
 	clocksource_select();
-	spin_unlock_irqrestore(&clocksource_lock, flags);
 	clocksource_enqueue_watchdog(cs);
+	mutex_unlock(&clocksource_mutex);
 	return 0;
 }
 EXPORT_SYMBOL(clocksource_register);
@@ -472,12 +450,10 @@ EXPORT_SYMBOL(clocksource_register);
  */
 void clocksource_change_rating(struct clocksource *cs, int rating)
 {
-	unsigned long flags;
-
-	spin_lock_irqsave(&clocksource_lock, flags);
+	mutex_lock(&clocksource_mutex);
 	cs->rating = rating;
 	clocksource_select();
-	spin_unlock_irqrestore(&clocksource_lock, flags);
+	mutex_unlock(&clocksource_mutex);
 }
 EXPORT_SYMBOL(clocksource_change_rating);
 
@@ -486,13 +462,11 @@ EXPORT_SYMBOL(clocksource_change_rating)
  */
 void clocksource_unregister(struct clocksource *cs)
 {
-	unsigned long flags;
-
+	mutex_lock(&clocksource_mutex);
 	clocksource_dequeue_watchdog(cs);
-	spin_lock_irqsave(&clocksource_lock, flags);
 	list_del(&cs->list);
 	clocksource_select();
-	spin_unlock_irqrestore(&clocksource_lock, flags);
+	mutex_unlock(&clocksource_mutex);
 }
 EXPORT_SYMBOL(clocksource_unregister);
 
@@ -510,9 +484,9 @@ sysfs_show_current_clocksources(struct s
 {
 	ssize_t count = 0;
 
-	spin_lock_irq(&clocksource_lock);
+	mutex_lock(&clocksource_mutex);
 	count = snprintf(buf, PAGE_SIZE, "%s\n", curr_clocksource->name);
-	spin_unlock_irq(&clocksource_lock);
+	mutex_unlock(&clocksource_mutex);
 
 	return count;
 }
@@ -538,14 +512,14 @@ static ssize_t sysfs_override_clocksourc
 	if (buf[count-1] == '\n')
 		count--;
 
-	spin_lock_irq(&clocksource_lock);
+	mutex_lock(&clocksource_mutex);
 
 	if (count > 0)
 		memcpy(override_name, buf, count);
 	override_name[count] = 0;
 	clocksource_select();
 
-	spin_unlock_irq(&clocksource_lock);
+	mutex_unlock(&clocksource_mutex);
 
 	return count;
 }
@@ -565,7 +539,7 @@ sysfs_show_available_clocksources(struct
 	struct clocksource *src;
 	ssize_t count = 0;
 
-	spin_lock_irq(&clocksource_lock);
+	mutex_lock(&clocksource_mutex);
 	list_for_each_entry(src, &clocksource_list, list) {
 		/*
 		 * Don't show non-HRES clocksource if the tick code is
@@ -577,7 +551,7 @@ sysfs_show_available_clocksources(struct
 				  max((ssize_t)PAGE_SIZE - count, (ssize_t)0),
 				  "%s ", src->name);
 	}
-	spin_unlock_irq(&clocksource_lock);
+	mutex_unlock(&clocksource_mutex);
 
 	count += snprintf(buf + count,
 			  max((ssize_t)PAGE_SIZE - count, (ssize_t)0), "\n");
@@ -632,11 +606,10 @@ device_initcall(init_clocksource_sysfs);
  */
 static int __init boot_override_clocksource(char* str)
 {
-	unsigned long flags;
-	spin_lock_irqsave(&clocksource_lock, flags);
+	mutex_lock(&clocksource_mutex);
 	if (str)
 		strlcpy(override_name, str, sizeof(override_name));
-	spin_unlock_irqrestore(&clocksource_lock, flags);
+	mutex_unlock(&clocksource_mutex);
 	return 1;
 }
 
Index: linux-2.6/kernel/time/timekeeping.c
===================================================================
--- linux-2.6.orig/kernel/time/timekeeping.c
+++ linux-2.6/kernel/time/timekeeping.c
@@ -18,6 +18,7 @@
 #include <linux/jiffies.h>
 #include <linux/time.h>
 #include <linux/tick.h>
+#include <linux/stop_machine.h>
 
 /* Structure holding internal timekeeping values. */
 struct timekeeper {
@@ -168,6 +169,7 @@ void timekeeping_leap_insert(int leapsec
 }
 
 #ifdef CONFIG_GENERIC_TIME
+
 /**
  * timekeeping_forward_now - update clock to the current time
  *
@@ -340,30 +342,40 @@ EXPORT_SYMBOL(do_settimeofday);
  *
  * Accumulates current time interval and initializes new clocksource
  */
-static void change_clocksource(void)
+static int change_clocksource(void *data)
 {
 	struct clocksource *new, *old;
 
-	new = clocksource_get_next();
-
-	if (!new || timekeeper.clock == new)
-		return;
+	new = (struct clocksource *) data;
 
 	timekeeping_forward_now();
+	if (!new->enable || new->enable(new) == 0) {
+		old = timekeeper.clock;
+		timekeeper_setup_internals(new);
+		if (old->disable)
+			old->disable(old);
+	}
+	return 0;
+}
 
-	if (new->enable && ! new->enable(new))
+/**
+ * timekeeping_notify - Install a new clock source
+ * @clock:		pointer to the clock source
+ *
+ * This function is called from clocksource.c after a new, better clock
+ * source has been registered. The caller holds the clocksource_mutex.
+ */
+void timekeeping_notify(struct clocksource *clock)
+{
+	if (timekeeper.clock == clock)
 		return;
-
-	old = timekeeper.clock;
-	timekeeper_setup_internals(new);
-	if (old->disable)
-		old->disable(old);
-
+	stop_machine(change_clocksource, clock, NULL);
 	tick_clock_notify();
 }
+
 #else /* GENERIC_TIME */
+
 static inline void timekeeping_forward_now(void) { }
-static inline void change_clocksource(void) { }
 
 /**
  * ktime_get - get the monotonic time in ktime_t format
@@ -404,6 +416,7 @@ void ktime_get_ts(struct timespec *ts)
 				ts->tv_nsec + tomono.tv_nsec);
 }
 EXPORT_SYMBOL_GPL(ktime_get_ts);
+
 #endif /* !GENERIC_TIME */
 
 /**
@@ -761,7 +774,6 @@ void update_wall_time(void)
 	update_xtime_cache(nsecs);
 
 	/* check to see if there is a new clocksource to use */
-	change_clocksource();
 	update_vsyscall(&xtime, timekeeper.clock);
 }
 

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RFC][patch 02/12] remove clocksource inline functions
  2009-07-29 13:41 ` [RFC][patch 02/12] remove clocksource inline functions Martin Schwidefsky
@ 2009-07-29 14:15   ` Daniel Walker
  2009-07-30 21:46     ` Christoph Hellwig
  2009-07-30 21:05   ` john stultz
  1 sibling, 1 reply; 32+ messages in thread
From: Daniel Walker @ 2009-07-29 14:15 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, john stultz

On Wed, 2009-07-29 at 15:41 +0200, Martin Schwidefsky wrote:
> plain text document attachment (clocksource-inline.diff)
> From: Martin Schwidefsky <schwidefsky@de.ibm.com>
> 
> Remove clocksource_read, clocksource_enable and clocksource_disable
> inline functions. No functional change.
> 

You're still not really explaining this one. Is this supposed to be
cleaner? Or is it related to some other part of your cleanup?

Daniel



* Re: [RFC][patch 00/12] clocksource / timekeeping rework V2
  2009-07-29 13:41 [RFC][patch 00/12] clocksource / timekeeping rework V2 Martin Schwidefsky
                   ` (11 preceding siblings ...)
  2009-07-29 13:41 ` [RFC][patch 12/12] update clocksource with stop_machine Martin Schwidefsky
@ 2009-07-29 15:10 ` Daniel Walker
  12 siblings, 0 replies; 32+ messages in thread
From: Daniel Walker @ 2009-07-29 15:10 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, john stultz

On Wed, 2009-07-29 at 15:41 +0200, Martin Schwidefsky wrote:
> There is still more room for improvement. Some sore points are:
> 
> 1) The cycle_last value still is in the struct clocksource. It should
>    be in the struct timekeeper but the check against cycles_last in
> the
>    read function of the TSC clock source makes it hard.
> 2) read_persistent_clock returns seconds. With a really good initial
>    time source this is not very precise. read_persistent_clock should
>    return a struct timespec.
> 3) xtime, raw_time, total_sleep_time, timekeeping_suspended, jiffies,
>    the ntp state and probably a few other values may be better located
>    in the struct timekeeper as well.


You could also consolidate the clocksource_unregister() path and the
clocksource_change_rating(0) path, both are basically doing the same
thing. Neither one is heavily used.

Daniel



* Re: [RFC][patch 01/12] introduce timekeeping_leap_insert
  2009-07-29 13:41 ` [RFC][patch 01/12] introduce timekeeping_leap_insert Martin Schwidefsky
@ 2009-07-30 21:02   ` john stultz
  0 siblings, 0 replies; 32+ messages in thread
From: john stultz @ 2009-07-30 21:02 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Daniel Walker

On Wed, 2009-07-29 at 15:41 +0200, Martin Schwidefsky wrote:
> plain text document attachment (leap_insert.diff)
> From: john stultz <johnstul@us.ibm.com>
> 
> ---
>  include/linux/time.h      |    1 +
>  kernel/time/ntp.c         |    7 ++-----
>  kernel/time/timekeeping.c |    7 +++++++
>  3 files changed, 10 insertions(+), 5 deletions(-)


Signed-off-by: John Stultz <johnstul@us.ibm.com>

-john


> Index: linux-2.6/include/linux/time.h
> ===================================================================
> --- linux-2.6.orig/include/linux/time.h
> +++ linux-2.6/include/linux/time.h
> @@ -147,6 +147,7 @@ extern struct timespec timespec_trunc(st
>  extern int timekeeping_valid_for_hres(void);
>  extern void update_wall_time(void);
>  extern void update_xtime_cache(u64 nsec);
> +extern void timekeeping_leap_insert(int leapsecond);
> 
>  struct tms;
>  extern void do_sys_times(struct tms *);
> Index: linux-2.6/kernel/time/ntp.c
> ===================================================================
> --- linux-2.6.orig/kernel/time/ntp.c
> +++ linux-2.6/kernel/time/ntp.c
> @@ -194,8 +194,7 @@ static enum hrtimer_restart ntp_leap_sec
>  	case TIME_OK:
>  		break;
>  	case TIME_INS:
> -		xtime.tv_sec--;
> -		wall_to_monotonic.tv_sec++;
> +		timekeeping_leap_insert(-1);
>  		time_state = TIME_OOP;
>  		printk(KERN_NOTICE
>  			"Clock: inserting leap second 23:59:60 UTC\n");
> @@ -203,9 +202,8 @@ static enum hrtimer_restart ntp_leap_sec
>  		res = HRTIMER_RESTART;
>  		break;
>  	case TIME_DEL:
> -		xtime.tv_sec++;
> +		timekeeping_leap_insert(1);
>  		time_tai--;
> -		wall_to_monotonic.tv_sec--;
>  		time_state = TIME_WAIT;
>  		printk(KERN_NOTICE
>  			"Clock: deleting leap second 23:59:59 UTC\n");
> @@ -219,7 +217,6 @@ static enum hrtimer_restart ntp_leap_sec
>  			time_state = TIME_OK;
>  		break;
>  	}
> -	update_vsyscall(&xtime, clock);
> 
>  	write_sequnlock(&xtime_lock);
> 
> Index: linux-2.6/kernel/time/timekeeping.c
> ===================================================================
> --- linux-2.6.orig/kernel/time/timekeeping.c
> +++ linux-2.6/kernel/time/timekeeping.c
> @@ -58,6 +58,13 @@ void update_xtime_cache(u64 nsec)
> 
>  struct clocksource *clock;
> 
> +/* must hold xtime_lock */
> +void timekeeping_leap_insert(int leapsecond)
> +{
> +	xtime.tv_sec += leapsecond;
> +	wall_to_monotonic.tv_sec -= leapsecond;
> +	update_vsyscall(&xtime, clock);
> +}
> 
>  #ifdef CONFIG_GENERIC_TIME
>  /**
> 



* Re: [RFC][patch 02/12] remove clocksource inline functions
  2009-07-29 13:41 ` [RFC][patch 02/12] remove clocksource inline functions Martin Schwidefsky
  2009-07-29 14:15   ` Daniel Walker
@ 2009-07-30 21:05   ` john stultz
  1 sibling, 0 replies; 32+ messages in thread
From: john stultz @ 2009-07-30 21:05 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Daniel Walker

On Wed, 2009-07-29 at 15:41 +0200, Martin Schwidefsky wrote:
> plain text document attachment (clocksource-inline.diff)
> From: Martin Schwidefsky <schwidefsky@de.ibm.com>
> 
> Remove clocksource_read, clocksource_enable and clocksource_disable
> inline functions. No functional change.
> 
> Cc: Ingo Molnar <mingo@elte.hu>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: john stultz <johnstul@us.ibm.com>
> Cc: Daniel Walker <dwalker@fifo99.com>
> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

Again, Magnus's fix for mult_orig that Thomas has (or *should* have)
queued at this point for 2.6.31 will collide with this, but the fix is
trivial.

Acked-by: John Stultz <johnstul@us.ibm.com>


> ---
>  include/linux/clocksource.h |   46 --------------------------------------------
>  kernel/time/timekeeping.c   |   32 +++++++++++++++++-------------
>  2 files changed, 18 insertions(+), 60 deletions(-)
> 
> Index: linux-2.6/kernel/time/timekeeping.c
> ===================================================================
> --- linux-2.6.orig/kernel/time/timekeeping.c
> +++ linux-2.6/kernel/time/timekeeping.c
> @@ -79,7 +79,7 @@ static void clocksource_forward_now(void
>  	cycle_t cycle_now, cycle_delta;
>  	s64 nsec;
> 
> -	cycle_now = clocksource_read(clock);
> +	cycle_now = clock->read(clock);
>  	cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
>  	clock->cycle_last = cycle_now;
> 
> @@ -114,7 +114,7 @@ void getnstimeofday(struct timespec *ts)
>  		*ts = xtime;
> 
>  		/* read clocksource: */
> -		cycle_now = clocksource_read(clock);
> +		cycle_now = clock->read(clock);
> 
>  		/* calculate the delta since the last update_wall_time: */
>  		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
> @@ -146,7 +146,7 @@ ktime_t ktime_get(void)
>  		nsecs = xtime.tv_nsec + wall_to_monotonic.tv_nsec;
> 
>  		/* read clocksource: */
> -		cycle_now = clocksource_read(clock);
> +		cycle_now = clock->read(clock);
> 
>  		/* calculate the delta since the last update_wall_time: */
>  		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
> @@ -186,7 +186,7 @@ void ktime_get_ts(struct timespec *ts)
>  		tomono = wall_to_monotonic;
> 
>  		/* read clocksource: */
> -		cycle_now = clocksource_read(clock);
> +		cycle_now = clock->read(clock);
> 
>  		/* calculate the delta since the last update_wall_time: */
>  		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
> @@ -274,16 +274,18 @@ static void change_clocksource(void)
> 
>  	clocksource_forward_now();
> 
> -	if (clocksource_enable(new))
> +	if (new->enable && ! new->enable(new))
>  		return;
> +	/* save mult_orig on enable */
> +	new->mult_orig = new->mult;
> 
>  	new->raw_time = clock->raw_time;
>  	old = clock;
>  	clock = new;
> -	clocksource_disable(old);
> +	if (old->disable)
> +		old->disable(old);
> 
> -	clock->cycle_last = 0;
> -	clock->cycle_last = clocksource_read(clock);
> +	clock->cycle_last = clock->read(clock);
>  	clock->error = 0;
>  	clock->xtime_nsec = 0;
>  	clocksource_calculate_interval(clock, NTP_INTERVAL_LENGTH);
> @@ -373,7 +375,7 @@ void getrawmonotonic(struct timespec *ts
>  		seq = read_seqbegin(&xtime_lock);
> 
>  		/* read clocksource: */
> -		cycle_now = clocksource_read(clock);
> +		cycle_now = clock->read(clock);
> 
>  		/* calculate the delta since the last update_wall_time: */
>  		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
> @@ -435,9 +437,12 @@ void __init timekeeping_init(void)
>  	ntp_init();
> 
>  	clock = clocksource_get_next();
> -	clocksource_enable(clock);
> +	if (clock->enable)
> +		clock->enable(clock);
> +	/* save mult_orig on enable */
> +	clock->mult_orig = clock->mult;
>  	clocksource_calculate_interval(clock, NTP_INTERVAL_LENGTH);
> -	clock->cycle_last = clocksource_read(clock);
> +	clock->cycle_last = clock->read(clock);
> 
>  	xtime.tv_sec = sec;
>  	xtime.tv_nsec = 0;
> @@ -477,8 +482,7 @@ static int timekeeping_resume(struct sys
>  	}
>  	update_xtime_cache(0);
>  	/* re-base the last cycle value */
> -	clock->cycle_last = 0;
> -	clock->cycle_last = clocksource_read(clock);
> +	clock->cycle_last = clock->read(clock);
>  	clock->error = 0;
>  	timekeeping_suspended = 0;
>  	write_sequnlock_irqrestore(&xtime_lock, flags);
> @@ -630,7 +634,7 @@ void update_wall_time(void)
>  		return;
> 
>  #ifdef CONFIG_GENERIC_TIME
> -	offset = (clocksource_read(clock) - clock->cycle_last) & clock->mask;
> +	offset = (clock->read(clock) - clock->cycle_last) & clock->mask;
>  #else
>  	offset = clock->cycle_interval;
>  #endif
> Index: linux-2.6/include/linux/clocksource.h
> ===================================================================
> --- linux-2.6.orig/include/linux/clocksource.h
> +++ linux-2.6/include/linux/clocksource.h
> @@ -268,52 +268,6 @@ static inline u32 clocksource_hz2mult(u3
>  }
> 
>  /**
> - * clocksource_read: - Access the clocksource's current cycle value
> - * @cs:		pointer to clocksource being read
> - *
> - * Uses the clocksource to return the current cycle_t value
> - */
> -static inline cycle_t clocksource_read(struct clocksource *cs)
> -{
> -	return cs->read(cs);
> -}
> -
> -/**
> - * clocksource_enable: - enable clocksource
> - * @cs:		pointer to clocksource
> - *
> - * Enables the specified clocksource. The clocksource callback
> - * function should start up the hardware and setup mult and field
> - * members of struct clocksource to reflect hardware capabilities.
> - */
> -static inline int clocksource_enable(struct clocksource *cs)
> -{
> -	int ret = 0;
> -
> -	if (cs->enable)
> -		ret = cs->enable(cs);
> -
> -	/* save mult_orig on enable */
> -	cs->mult_orig = cs->mult;
> -
> -	return ret;
> -}
> -
> -/**
> - * clocksource_disable: - disable clocksource
> - * @cs:		pointer to clocksource
> - *
> - * Disables the specified clocksource. The clocksource callback
> - * function should power down the now unused hardware block to
> - * save power.
> - */
> -static inline void clocksource_disable(struct clocksource *cs)
> -{
> -	if (cs->disable)
> -		cs->disable(cs);
> -}
> -
> -/**
>   * cyc2ns - converts clocksource cycles to nanoseconds
>   * @cs:		Pointer to clocksource
>   * @cycles:	Cycles
> 



* Re: [RFC][patch 11/12] timekeeper read clock helper functions
  2009-07-29 13:41 ` [RFC][patch 11/12] timekeeper read clock helper functions Martin Schwidefsky
@ 2009-07-30 21:39   ` john stultz
  2009-07-31  7:45     ` Martin Schwidefsky
  0 siblings, 1 reply; 32+ messages in thread
From: john stultz @ 2009-07-30 21:39 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Daniel Walker

On Wed, 2009-07-29 at 15:41 +0200, Martin Schwidefsky wrote:
> plain text document attachment (timekeeper-helper.diff)
> From: Martin Schwidefsky <schwidefsky@de.ibm.com>
> 
> Add timekeeper_read_clock_ntp and timekeeper_read_clock_raw and use
> them for getnstimeofday, ktime_get, ktime_get_ts and getrawmonotonic.
>
> Cc: Ingo Molnar <mingo@elte.hu>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: john stultz <johnstul@us.ibm.com>
> Cc: Daniel Walker <dwalker@fifo99.com>
> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> ---
>  kernel/time/timekeeping.c |   91 +++++++++++++++++++---------------------------
>  1 file changed, 38 insertions(+), 53 deletions(-)
> 
> Index: linux-2.6/kernel/time/timekeeping.c
> ===================================================================
> --- linux-2.6.orig/kernel/time/timekeeping.c
> +++ linux-2.6/kernel/time/timekeeping.c
> @@ -84,6 +84,40 @@ static void timekeeper_setup_internals(s
>  	timekeeper.shift = clock->shift;
>  }
> 
> +/* Timekeeper helper functions. */
> +static inline s64 timekeeper_read_clock_ntp(void)
> +{
> +	cycle_t cycle_now, cycle_delta;
> +	struct clocksource *clock;
> +
> +	/* read clocksource: */
> +	clock = timekeeper.clock;
> +	cycle_now = clock->read(clock);
> +

I know it seems nice to have it here, but I think these helpers would be
more reusable in other contexts if they took the cycle_now value as an
argument. Also I'd drop the ntp bit, just to avoid confusing it with
some ntp specific function. So:

 timekeeping_get_ns(cycle_t now);
 timekeeping_get_ns_raw(cycle_t now);

That way in some situations we don't have to make two accesses to the
hardware if we want to get both values at the same point.

Seem reasonable?

thanks
-john



> +	/* calculate the delta since the last update_wall_time: */
> +	cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
> +
> > +	/* return delta converted to nanoseconds using ntp adjusted mult. */
> +	return clocksource_cyc2ns(cycle_delta, timekeeper.mult,
> +				  timekeeper.shift);
> +}
> +
> +static inline s64 timekeeper_read_clock_raw(void)
> +{
> +	cycle_t cycle_now, cycle_delta;
> +	struct clocksource *clock;
> +
> +	/* read clocksource: */
> +	clock = timekeeper.clock;
> +	cycle_now = clock->read(clock);
> +
> +	/* calculate the delta since the last update_wall_time: */
> +	cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
> +
> > +	/* return delta converted to nanoseconds using the raw clock mult. */
> +	return clocksource_cyc2ns(cycle_delta, clock->mult, clock->shift);
> +}
> +
>  /*
>   * This read-write spinlock protects us from races in SMP while
>   * playing with xtime.
> @@ -172,8 +206,6 @@ static void timekeeping_forward_now(void
>   */
>  void getnstimeofday(struct timespec *ts)
>  {
> -	cycle_t cycle_now, cycle_delta;
> -	struct clocksource *clock;
>  	unsigned long seq;
>  	s64 nsecs;
> 
> @@ -183,17 +215,7 @@ void getnstimeofday(struct timespec *ts)
>  		seq = read_seqbegin(&xtime_lock);
> 
>  		*ts = xtime;
> -
> -		/* read clocksource: */
> -		clock = timekeeper.clock;
> -		cycle_now = clock->read(clock);
> -
> -		/* calculate the delta since the last update_wall_time: */
> -		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
> -
> -		/* convert to nanoseconds: */
> -		nsecs = clocksource_cyc2ns(cycle_delta, timekeeper.mult,
> -					   timekeeper.shift);
> +		nsecs = timekeeper_read_clock_ntp();
> 
>  		/* If arch requires, add in gettimeoffset() */
>  		nsecs += arch_gettimeoffset();
> @@ -207,8 +229,6 @@ EXPORT_SYMBOL(getnstimeofday);
> 
>  ktime_t ktime_get(void)
>  {
> -	cycle_t cycle_now, cycle_delta;
> -	struct clocksource *clock;
>  	unsigned int seq;
>  	s64 secs, nsecs;
> 
> @@ -218,17 +238,7 @@ ktime_t ktime_get(void)
>  		seq = read_seqbegin(&xtime_lock);
>  		secs = xtime.tv_sec + wall_to_monotonic.tv_sec;
>  		nsecs = xtime.tv_nsec + wall_to_monotonic.tv_nsec;
> -
> -		/* read clocksource: */
> -		clock = timekeeper.clock;
> -		cycle_now = clock->read(clock);
> -
> -		/* calculate the delta since the last update_wall_time: */
> -		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
> -
> -		/* convert to nanoseconds: */
> -		nsecs += clocksource_cyc2ns(cycle_delta, timekeeper.mult,
> -					    timekeeper.shift);
> +		nsecs += timekeeper_read_clock_ntp();
> 
>  	} while (read_seqretry(&xtime_lock, seq));
>  	/*
> @@ -249,8 +259,6 @@ EXPORT_SYMBOL_GPL(ktime_get);
>   */
>  void ktime_get_ts(struct timespec *ts)
>  {
> -	cycle_t cycle_now, cycle_delta;
> -	struct clocksource *clock;
>  	struct timespec tomono;
>  	unsigned int seq;
>  	s64 nsecs;
> @@ -261,17 +269,7 @@ void ktime_get_ts(struct timespec *ts)
>  		seq = read_seqbegin(&xtime_lock);
>  		*ts = xtime;
>  		tomono = wall_to_monotonic;
> -
> -		/* read clocksource: */
> -		clock = timekeeper.clock;
> -		cycle_now = clock->read(clock);
> -
> -		/* calculate the delta since the last update_wall_time: */
> -		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
> -
> -		/* convert to nanoseconds: */
> -		nsecs = clocksource_cyc2ns(cycle_delta, timekeeper.mult,
> -					   timekeeper.shift);
> +		nsecs = timekeeper_read_clock_ntp();
> 
>  	} while (read_seqretry(&xtime_lock, seq));
> 
> @@ -433,23 +431,10 @@ void getrawmonotonic(struct timespec *ts
>  {
>  	unsigned long seq;
>  	s64 nsecs;
> -	cycle_t cycle_now, cycle_delta;
> -	struct clocksource *clock;
> 
>  	do {
>  		seq = read_seqbegin(&xtime_lock);
> -
> -		/* read clocksource: */
> -		clock = timekeeper.clock;
> -		cycle_now = clock->read(clock);
> -
> -		/* calculate the delta since the last update_wall_time: */
> -		cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
> -
> -		/* convert to nanoseconds: */
> -		nsecs = clocksource_cyc2ns(cycle_delta, clock->mult,
> -					   clock->shift);
> -
> +		nsecs = timekeeper_read_clock_raw();
>  		*ts = raw_time;
> 
>  	} while (read_seqretry(&xtime_lock, seq));
> 



* Re: [RFC][patch 02/12] remove clocksource inline functions
  2009-07-29 14:15   ` Daniel Walker
@ 2009-07-30 21:46     ` Christoph Hellwig
  0 siblings, 0 replies; 32+ messages in thread
From: Christoph Hellwig @ 2009-07-30 21:46 UTC (permalink / raw)
  To: Daniel Walker
  Cc: Martin Schwidefsky, linux-kernel, Ingo Molnar, Thomas Gleixner,
	john stultz

On Wed, Jul 29, 2009 at 07:15:30AM -0700, Daniel Walker wrote:
> On Wed, 2009-07-29 at 15:41 +0200, Martin Schwidefsky wrote:
> > plain text document attachment (clocksource-inline.diff)
> > From: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > 
> > Remove clocksource_read, clocksource_enable and clocksource_disable
> > inline functions. No functional change.
> > 
> 
> You're still not really explaining this one. Is this supposed to be
> cleaner? Or is it related to some other part of your cleanup?

It removes an entirely useless layer of obfuscation.



* Re: [RFC][patch 10/12] move NTP adjusted clock multiplier to struct timekeeper
  2009-07-29 13:41 ` [RFC][patch 10/12] move NTP adjusted clock multiplier " Martin Schwidefsky
@ 2009-07-30 22:04   ` john stultz
  2009-07-31  7:52     ` Martin Schwidefsky
  2009-08-13 11:15   ` Linus Walleij
  1 sibling, 1 reply; 32+ messages in thread
From: john stultz @ 2009-07-30 22:04 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Daniel Walker

On Wed, 2009-07-29 at 15:41 +0200, Martin Schwidefsky wrote:
> plain text document attachment (timekeeper-mult.diff)
> From: Martin Schwidefsky <schwidefsky@de.ibm.com>
> 
> The clocksource structure has two multipliers, the unmodified multiplier
> clock->mult_orig and the NTP corrected multiplier clock->mult. The NTP
> multiplier is misplaced in the struct clocksource, this is private
> information of the timekeeping code. Add the mult field to the struct
> timekeeper to contain the NTP corrected value, keep the unmodified
> multiplier in clock->mult and remove clock->mult_orig. For consistency
> add the shift value associated with the NTP corrected mult value to
> struct timekeeper as well.
> 
> Cc: Ingo Molnar <mingo@elte.hu>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: john stultz <johnstul@us.ibm.com>
> Cc: Daniel Walker <dwalker@fifo99.com>
> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> ---
>  arch/arm/plat-omap/common.c |    2 +-
>  include/linux/clocksource.h |    4 +---
>  kernel/time/timekeeping.c   |   43 +++++++++++++++++++++++++------------------
>  3 files changed, 27 insertions(+), 22 deletions(-)
> 
> Index: linux-2.6/include/linux/clocksource.h
> ===================================================================
> --- linux-2.6.orig/include/linux/clocksource.h
> +++ linux-2.6/include/linux/clocksource.h
> @@ -149,8 +149,7 @@ extern u64 timecounter_cyc2time(struct t
>   * @disable:		optional function to disable the clocksource
>   * @mask:		bitmask for two's complement
>   *			subtraction of non 64 bit counters
> - * @mult:		cycle to nanosecond multiplier (adjusted by NTP)
> - * @mult_orig:		cycle to nanosecond multiplier (unadjusted by NTP)
> + * @mult:		cycle to nanosecond multiplier
>   * @shift:		cycle to nanosecond divisor (power of two)
>   * @flags:		flags describing special properties
>   * @vread:		vsyscall based read
> @@ -168,7 +167,6 @@ struct clocksource {
>  	void (*disable)(struct clocksource *cs);
>  	cycle_t mask;
>  	u32 mult;
> -	u32 mult_orig;
>  	u32 shift;
>  	unsigned long flags;
>  	cycle_t (*vread)(void);
> Index: linux-2.6/kernel/time/timekeeping.c
> ===================================================================
> --- linux-2.6.orig/kernel/time/timekeeping.c
> +++ linux-2.6/kernel/time/timekeeping.c
> @@ -29,6 +29,8 @@ struct timekeeper {
>  	s64	ntp_error;
>  	int	xtime_shift;
>  	int	ntp_error_shift;
> +	u32	mult;			/* NTP adjusted clock multiplier */
> +	u32	shift;			/* NTP adjusted clock shift */

I'm a little confused here. NTP doesn't adjust the shift value. This
seems redundant with the xtime_shift value introduced in [patch 9/12].

thanks
-john




* Re: [RFC][patch 09/12] add xtime_shift and ntp_error_shift to struct timekeeper
  2009-07-29 13:41 ` [RFC][patch 09/12] add xtime_shift and ntp_error_shift to " Martin Schwidefsky
@ 2009-07-30 22:15   ` john stultz
  2009-07-31  8:13     ` Martin Schwidefsky
  0 siblings, 1 reply; 32+ messages in thread
From: john stultz @ 2009-07-30 22:15 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Daniel Walker

On Wed, 2009-07-29 at 15:41 +0200, Martin Schwidefsky wrote:
> plain text document attachment (timekeeper-shift.diff)
> From: Martin Schwidefsky <schwidefsky@de.ibm.com>
> 
> The xtime_nsec value in the timekeeper structure is shifted by a few
> bits to improve precision. This happens to be the same value as the
> clock->shift. To improve readability add xtime_shift to the timekeeper
> and use it instead of the clock->shift. Likewise add ntp_error_shift
> and replace all (NTP_SCALE_SHIFT - clock->shift) expressions.
> 
> Cc: Ingo Molnar <mingo@elte.hu>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: john stultz <johnstul@us.ibm.com>
> Cc: Daniel Walker <dwalker@fifo99.com>
> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> ---
>  kernel/time/timekeeping.c |   33 ++++++++++++++++++---------------
>  1 file changed, 18 insertions(+), 15 deletions(-)
> 
> Index: linux-2.6/kernel/time/timekeeping.c
> ===================================================================
> --- linux-2.6.orig/kernel/time/timekeeping.c
> +++ linux-2.6/kernel/time/timekeeping.c
> @@ -27,6 +27,8 @@ struct timekeeper {
>  	u32	raw_interval;
>  	u64	xtime_nsec;
>  	s64	ntp_error;
> +	int	xtime_shift;
> +	int	ntp_error_shift;
>  };

I suspect ntp_error_shift is one of the least intuitive values in the
timekeeper structure. They all probably need nice comments explaining
what they store, but especially this one.

Something like:

struct clocksource *clock; /* current clocksource used for timekeeping*/
cycle_t cycle_interval;	   /* fixed chunk of cycles used in accumulation */
u64     xtime_interval;    /* number clock shifted nsecs accumulated per cycle_interval */
u32	raw_interval;      /* raw nsecs accumulated per cycle_interval */
u64	xtime_nsec;        /* clock shifted nsecs remainder not stored in xtime.tv_nsec */
s64	ntp_error;         /* Difference between accumulated time and NTP time in ntp shifted nsecs */
int	xtime_shift;       /* The shift value of the current clocksource */
int	ntp_error_shift;   /* Shift conversion between clock shifted nsecs and ntp shifted nsecs */


Other than that this patch looks OK, but it will need rigorous testing,
as it's very complicated and difficult-to-understand code that's being
changed.

thanks
-john




* Re: [RFC][patch 11/12] timekeeper read clock helper functions
  2009-07-30 21:39   ` john stultz
@ 2009-07-31  7:45     ` Martin Schwidefsky
  2009-07-31  8:11       ` john stultz
  0 siblings, 1 reply; 32+ messages in thread
From: Martin Schwidefsky @ 2009-07-31  7:45 UTC (permalink / raw)
  To: john stultz; +Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Daniel Walker

On Thu, 30 Jul 2009 14:39:54 -0700
john stultz <johnstul@us.ibm.com> wrote:

> On Wed, 2009-07-29 at 15:41 +0200, Martin Schwidefsky wrote:
> > plain text document attachment (timekeeper-helper.diff)
> > From: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > 
> > Add timekeeper_read_clock_ntp and timekeeper_read_clock_raw and use
> > them for getnstimeofday, ktime_get, ktime_get_ts and getrawmonotonic.
> >
> > Cc: Ingo Molnar <mingo@elte.hu>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: john stultz <johnstul@us.ibm.com>
> > Cc: Daniel Walker <dwalker@fifo99.com>
> > Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > ---
> >  kernel/time/timekeeping.c |   91 +++++++++++++++++++---------------------------
> >  1 file changed, 38 insertions(+), 53 deletions(-)
> > 
> > Index: linux-2.6/kernel/time/timekeeping.c
> > ===================================================================
> > --- linux-2.6.orig/kernel/time/timekeeping.c
> > +++ linux-2.6/kernel/time/timekeeping.c
> > @@ -84,6 +84,40 @@ static void timekeeper_setup_internals(s
> >  	timekeeper.shift = clock->shift;
> >  }
> > 
> > +/* Timekeeper helper functions. */
> > +static inline s64 timekeeper_read_clock_ntp(void)
> > +{
> > +	cycle_t cycle_now, cycle_delta;
> > +	struct clocksource *clock;
> > +
> > +	/* read clocksource: */
> > +	clock = timekeeper.clock;
> > +	cycle_now = clock->read(clock);
> > +
> 
> I know it seems nice to have it here, but I think these helpers would be
> more reusable in other contexts if they took the cycle_now value as an
> argument. Also I'd drop the ntp bit, just to avoid confusing it with
> some ntp specific function. So:
> 
>  timekeeping_get_ns(cycle_t now);
>  timekeeping_get_ns_raw(cycle_t now);
> 
> That way in some situations we don't have to make two accesses to the
> hardware if we want to get both values at the same point.
> 
> Seem reasonable?

The new names are fine, but if we pull the ->read call out to the
caller we again have a rather strange mix: the caller gets the cycle
value using some clock, while the helper uses the value of the
timekeeper clock or the timekeeper mult/shift. I would like to keep the
->read call in the helper. Is there a situation where we need both
calculations for the same cycles value? There is none in the current
code as far as I can see.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



* Re: [RFC][patch 10/12] move NTP adjusted clock multiplier to struct timekeeper
  2009-07-30 22:04   ` john stultz
@ 2009-07-31  7:52     ` Martin Schwidefsky
  2009-07-31  8:12       ` john stultz
  0 siblings, 1 reply; 32+ messages in thread
From: Martin Schwidefsky @ 2009-07-31  7:52 UTC (permalink / raw)
  To: john stultz; +Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Daniel Walker

On Thu, 30 Jul 2009 15:04:48 -0700
john stultz <johnstul@us.ibm.com> wrote:

> On Wed, 2009-07-29 at 15:41 +0200, Martin Schwidefsky wrote:
> > plain text document attachment (timekeeper-mult.diff)
> > From: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > 
> > The clocksource structure has two multipliers, the unmodified multiplier
> > clock->mult_orig and the NTP corrected multiplier clock->mult. The NTP
> > multiplier is misplaced in the struct clocksource, this is private
> > information of the timekeeping code. Add the mult field to the struct
> > timekeeper to contain the NTP corrected value, keep the unmodified
> > multiplier in clock->mult and remove clock->mult_orig. For consistency
> > add the shift value associated with the NTP corrected mult value to
> > struct timekeeper as well.
> > 
> > Cc: Ingo Molnar <mingo@elte.hu>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: john stultz <johnstul@us.ibm.com>
> > Cc: Daniel Walker <dwalker@fifo99.com>
> > Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > ---
> >  arch/arm/plat-omap/common.c |    2 +-
> >  include/linux/clocksource.h |    4 +---
> >  kernel/time/timekeeping.c   |   43 +++++++++++++++++++++++++------------------
> >  3 files changed, 27 insertions(+), 22 deletions(-)
> > 
> > Index: linux-2.6/include/linux/clocksource.h
> > ===================================================================
> > --- linux-2.6.orig/include/linux/clocksource.h
> > +++ linux-2.6/include/linux/clocksource.h
> > @@ -149,8 +149,7 @@ extern u64 timecounter_cyc2time(struct t
> >   * @disable:		optional function to disable the clocksource
> >   * @mask:		bitmask for two's complement
> >   *			subtraction of non 64 bit counters
> > - * @mult:		cycle to nanosecond multiplier (adjusted by NTP)
> > - * @mult_orig:		cycle to nanosecond multiplier (unadjusted by NTP)
> > + * @mult:		cycle to nanosecond multiplier
> >   * @shift:		cycle to nanosecond divisor (power of two)
> >   * @flags:		flags describing special properties
> >   * @vread:		vsyscall based read
> > @@ -168,7 +167,6 @@ struct clocksource {
> >  	void (*disable)(struct clocksource *cs);
> >  	cycle_t mask;
> >  	u32 mult;
> > -	u32 mult_orig;
> >  	u32 shift;
> >  	unsigned long flags;
> >  	cycle_t (*vread)(void);
> > Index: linux-2.6/kernel/time/timekeeping.c
> > ===================================================================
> > --- linux-2.6.orig/kernel/time/timekeeping.c
> > +++ linux-2.6/kernel/time/timekeeping.c
> > @@ -29,6 +29,8 @@ struct timekeeper {
> >  	s64	ntp_error;
> >  	int	xtime_shift;
> >  	int	ntp_error_shift;
> > +	u32	mult;			/* NTP adjusted clock multiplier */
> > +	u32	shift;			/* NTP adjusted clock shift */
> 
> I'm a little confused here. NTP doesn't adjust the shift value. This
> seems redundant with the xtime_shift value introduced in [patch 9/12].

True, the NTP code does not adjust the shift value. But mult and shift
form a pair, no? I think it is more readable if we use timekeeper.mult
and timekeeper.shift instead of timekeeper.mult and clock->shift.
And xtime_shift does not need to be the same value as the clock
shift - at least in theory. It is just a shift to increase precision.
Therefore I would like to use the xtime_shift and ntp_error_shift values
exclusively as the shifts for the respective xtime_nsec and ntp_error
values. Makes sense?
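As a stand-alone sketch of the pairing argued for above (all names here
are hypothetical, not the kernel's): mult and shift convert cycles to
nanoseconds together, and keeping xtime_nsec left-shifted by the same
shift value preserves the sub-nanosecond remainder across accumulation
steps instead of truncating it each time.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t cycle_t;

/* Illustrative stand-in for struct timekeeper; mult/shift form a pair. */
struct tk_sketch {
	uint32_t mult;        /* cycle to shifted-nanosecond multiplier */
	int      shift;       /* precision bits: ns are kept << shift   */
	uint64_t xtime_nsec;  /* shifted-nanosecond remainder           */
};

/* Accumulate a cycle delta; return the whole nanoseconds carried out
 * while the sub-nanosecond fraction stays behind in xtime_nsec. */
static uint64_t tk_accumulate(struct tk_sketch *tk, cycle_t delta)
{
	uint64_t ns;

	tk->xtime_nsec += (uint64_t)delta * tk->mult;
	ns = tk->xtime_nsec >> tk->shift;
	tk->xtime_nsec -= ns << tk->shift;
	return ns;
}
```

With mult = 384 and shift = 8 (1.5 ns per cycle), three cycles yield
4 ns and the remaining 0.5 ns stays in xtime_nsec rather than being
lost.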

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RFC][patch 11/12] timekeeper read clock helper functions
  2009-07-31  7:45     ` Martin Schwidefsky
@ 2009-07-31  8:11       ` john stultz
  0 siblings, 0 replies; 32+ messages in thread
From: john stultz @ 2009-07-31  8:11 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Daniel Walker

On Fri, 2009-07-31 at 09:45 +0200, Martin Schwidefsky wrote:
> On Thu, 30 Jul 2009 14:39:54 -0700
> john stultz <johnstul@us.ibm.com> wrote:
> 
> > On Wed, 2009-07-29 at 15:41 +0200, Martin Schwidefsky wrote:
> > > plain text document attachment (timekeeper-helper.diff)
> > > From: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > > 
> > > Add timekeeper_read_clock_ntp and timekeeper_read_clock_raw and use
> > > them for getnstimeofday, ktime_get, ktime_get_ts and getrawmonotonic.
> > >
> > > Cc: Ingo Molnar <mingo@elte.hu>
> > > Cc: Thomas Gleixner <tglx@linutronix.de>
> > > Cc: john stultz <johnstul@us.ibm.com>
> > > Cc: Daniel Walker <dwalker@fifo99.com>
> > > Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > > ---
> > >  kernel/time/timekeeping.c |   91 +++++++++++++++++++---------------------------
> > >  1 file changed, 38 insertions(+), 53 deletions(-)
> > > 
> > > Index: linux-2.6/kernel/time/timekeeping.c
> > > ===================================================================
> > > --- linux-2.6.orig/kernel/time/timekeeping.c
> > > +++ linux-2.6/kernel/time/timekeeping.c
> > > @@ -84,6 +84,40 @@ static void timekeeper_setup_internals(s
> > >  	timekeeper.shift = clock->shift;
> > >  }
> > > 
> > > +/* Timekeeper helper functions. */
> > > +static inline s64 timekeeper_read_clock_ntp(void)
> > > +{
> > > +	cycle_t cycle_now, cycle_delta;
> > > +	struct clocksource *clock;
> > > +
> > > +	/* read clocksource: */
> > > +	clock = timekeeper.clock;
> > > +	cycle_now = clock->read(clock);
> > > +
> > 
> > I know it seems nice to have it here, but I think these helpers would be
> > more reusable in other contexts if they took the cycle_now value as an
> > argument. Also I'd drop the ntp bit, just to avoid confusing it with
> > some ntp specific function. So:
> > 
> >  timekeeping_get_ns(cycle_t now);
> >  timekeeping_get_ns_raw(cycle_t now);
> > 
> > That way in some situations we don't have to make two accesses to the
> > hardware if we want to get both values at the same point.
> > 
> > Seem reasonable?
> 
> The new names are fine but if we pull out the ->read call to the
> caller we again have a rather strange mix. The caller gets the cycle
> value using some clock, the helper uses the value of the timekeeper
> clock or the timerkeeper mult/shift. I would like to keep the ->read
> call in the helper. Is there a situation where we need both
> calculations for the same cycles value? There is none in the current
> code as far as I can see.

One instance: changing it would allow us to use this code in
timekeeping_forward(), as we want to be able to accumulate the current
cycles into xtime and then set cycles_last equal to the read value at
that time. Embedding the read into timekeeping_read_clock_x() prevents
us from using the function with any of the management calls. If
you look at the mega-patch I sent out earlier, I think there were a few
uses for such helper functions in update_wall_time and other spots.

Also I have had some discussions with folks who would like to have the
ability to generate multiple CLOCK_ID values at the same "instant".
There isn't a good interface to userland for such a feature, but I'm
hesitant to make it difficult to compute on the kernel side.
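A sketch of the interface split discussed here (field and function
names are illustrative, not the kernel's): the caller performs the one
hardware read, and both helpers scale that same cycle value, one with
the NTP adjusted multiplier and one with the raw multiplier.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t cycle_t;

/* Illustrative subset of the timekeeper state. */
struct tk_sketch {
	cycle_t  cycle_last;  /* cycle value at the last accumulation */
	uint32_t mult;        /* NTP adjusted cycle-to-ns multiplier  */
	uint32_t mult_raw;    /* unadjusted cycle-to-ns multiplier    */
	int      shift;
};

/* NTP corrected nanoseconds since cycle_last for a caller-read cycle. */
static int64_t timekeeping_get_ns(const struct tk_sketch *tk, cycle_t now)
{
	cycle_t delta = now - tk->cycle_last;
	return (int64_t)(((uint64_t)delta * tk->mult) >> tk->shift);
}

/* Raw nanoseconds for the very same cycle value: no second hw read. */
static int64_t timekeeping_get_ns_raw(const struct tk_sketch *tk, cycle_t now)
{
	cycle_t delta = now - tk->cycle_last;
	return (int64_t)(((uint64_t)delta * tk->mult_raw) >> tk->shift);
}
```

Because both helpers take the cycle value as an argument, one
clock->read() can feed both computations, so the NTP and raw timestamps
refer to the same instant.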

thanks
-john




* Re: [RFC][patch 10/12] move NTP adjusted clock multiplier to struct timekeeper
  2009-07-31  7:52     ` Martin Schwidefsky
@ 2009-07-31  8:12       ` john stultz
  2009-07-31  8:27         ` Martin Schwidefsky
  2009-07-31  9:00         ` Martin Schwidefsky
  0 siblings, 2 replies; 32+ messages in thread
From: john stultz @ 2009-07-31  8:12 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Daniel Walker

On Fri, 2009-07-31 at 09:52 +0200, Martin Schwidefsky wrote:
> On Thu, 30 Jul 2009 15:04:48 -0700
> john stultz <johnstul@us.ibm.com> wrote:
> 
> > On Wed, 2009-07-29 at 15:41 +0200, Martin Schwidefsky wrote:
> > > plain text document attachment (timekeeper-mult.diff)
> > > From: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > > 
> > > The clocksource structure has two multipliers, the unmodified multiplier
> > > clock->mult_orig and the NTP corrected multiplier clock->mult. The NTP
> > > multiplier is misplaced in the struct clocksource, this is private
> > > information of the timekeeping code. Add the mult field to the struct
> > > timekeeper to contain the NTP corrected value, keep the unmodified
> > > multiplier in clock->mult and remove clock->mult_orig. For consistency
> > > add the shift value associated with the NTP corrected mult value to
> > > struct timekeeper as well.
> > > 
> > > Cc: Ingo Molnar <mingo@elte.hu>
> > > Cc: Thomas Gleixner <tglx@linutronix.de>
> > > Cc: john stultz <johnstul@us.ibm.com>
> > > Cc: Daniel Walker <dwalker@fifo99.com>
> > > Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > > ---
> > >  arch/arm/plat-omap/common.c |    2 +-
> > >  include/linux/clocksource.h |    4 +---
> > >  kernel/time/timekeeping.c   |   43 +++++++++++++++++++++++++------------------
> > >  3 files changed, 27 insertions(+), 22 deletions(-)
> > > 
> > > Index: linux-2.6/include/linux/clocksource.h
> > > ===================================================================
> > > --- linux-2.6.orig/include/linux/clocksource.h
> > > +++ linux-2.6/include/linux/clocksource.h
> > > @@ -149,8 +149,7 @@ extern u64 timecounter_cyc2time(struct t
> > >   * @disable:		optional function to disable the clocksource
> > >   * @mask:		bitmask for two's complement
> > >   *			subtraction of non 64 bit counters
> > > - * @mult:		cycle to nanosecond multiplier (adjusted by NTP)
> > > - * @mult_orig:		cycle to nanosecond multiplier (unadjusted by NTP)
> > > + * @mult:		cycle to nanosecond multiplier
> > >   * @shift:		cycle to nanosecond divisor (power of two)
> > >   * @flags:		flags describing special properties
> > >   * @vread:		vsyscall based read
> > > @@ -168,7 +167,6 @@ struct clocksource {
> > >  	void (*disable)(struct clocksource *cs);
> > >  	cycle_t mask;
> > >  	u32 mult;
> > > -	u32 mult_orig;
> > >  	u32 shift;
> > >  	unsigned long flags;
> > >  	cycle_t (*vread)(void);
> > > Index: linux-2.6/kernel/time/timekeeping.c
> > > ===================================================================
> > > --- linux-2.6.orig/kernel/time/timekeeping.c
> > > +++ linux-2.6/kernel/time/timekeeping.c
> > > @@ -29,6 +29,8 @@ struct timekeeper {
> > >  	s64	ntp_error;
> > >  	int	xtime_shift;
> > >  	int	ntp_error_shift;
> > > +	u32	mult;			/* NTP adjusted clock multiplier */
> > > +	u32	shift;			/* NTP adjusted clock shift */
> > 
> > I'm a little confused here. NTP doesn't adjust the shift value. This
> > seems redundant with the xtime_shift value introduced in [patch 9/12].
> 
> True the NTP code does not adjust the shift value. But mult and shift
> build a pair, no? I think it is more readable if we use timekeeper.mult 
> and timekeeper.shift instead of timekeeper.mult and clock->shift.
> And the xtime_shift does not need to be the same value as the clock
> shift - at least in theory. It is just a shift to increase precision. 
> Therefore I would like to use the xtime_shift and ntp_error_shift values
> exclusively as the shifts for the respective xtime_nsec and ntp_error
> values. Makes sense ?

But can't we drop the xtime_shift if we have the clocksource shift kept
internally? Unless I'm missing something more subtle in your patches,
they should be the same at all times.

thanks
-john





* Re: [RFC][patch 09/12] add xtime_shift and ntp_error_shift to struct timekeeper
  2009-07-30 22:15   ` john stultz
@ 2009-07-31  8:13     ` Martin Schwidefsky
  0 siblings, 0 replies; 32+ messages in thread
From: Martin Schwidefsky @ 2009-07-31  8:13 UTC (permalink / raw)
  To: john stultz; +Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Daniel Walker

On Thu, 30 Jul 2009 15:15:23 -0700
john stultz <johnstul@us.ibm.com> wrote:

> On Wed, 2009-07-29 at 15:41 +0200, Martin Schwidefsky wrote:
> > plain text document attachment (timekeeper-shift.diff)
> > From: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > 
> > The xtime_nsec value in the timekeeper structure is shifted by a few
> > bits to improve precision. This happens to be the same value as the
> > clock->shift. To improve readability add xtime_shift to the timekeeper
> > and use it instead of the clock->shift. Likewise add ntp_error_shift
> > and replace all (NTP_SCALE_SHIFT - clock->shift) expressions.
> > 
> > Cc: Ingo Molnar <mingo@elte.hu>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: john stultz <johnstul@us.ibm.com>
> > Cc: Daniel Walker <dwalker@fifo99.com>
> > Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > ---
> >  kernel/time/timekeeping.c |   33 ++++++++++++++++++---------------
> >  1 file changed, 18 insertions(+), 15 deletions(-)
> > 
> > Index: linux-2.6/kernel/time/timekeeping.c
> > ===================================================================
> > --- linux-2.6.orig/kernel/time/timekeeping.c
> > +++ linux-2.6/kernel/time/timekeeping.c
> > @@ -27,6 +27,8 @@ struct timekeeper {
> >  	u32	raw_interval;
> >  	u64	xtime_nsec;
> >  	s64	ntp_error;
> > +	int	xtime_shift;
> > +	int	ntp_error_shift;
> >  };
> 
> I suspect ntp_error_shift is one of the least intuitive values in the
> timekeeper structure. They all probably need nice comments explaining
> what they store, but especially this one. 
> 
> Something like:
> 
> struct clocksource *clock; /* current clocksource used for timekeeping */
> cycle_t cycle_interval;	   /* fixed chunk of cycles used in accumulation */
> u64     xtime_interval;    /* number of clock shifted nsecs accumulated per cycle_interval */
> u32	raw_interval;      /* raw nsecs accumulated per cycle_interval */
> u64	xtime_nsec;        /* clock shifted nsecs remainder not stored in xtime.tv_nsec */
> s64	ntp_error;         /* Difference between accumulated time and NTP time in ntp shifted nsecs */
> int	xtime_shift;       /* The shift value of the current clocksource */
> int	ntp_error_shift;   /* Shift conversion between clock shifted nsecs and ntp shifted nsecs */

For xtime_shift I would like to use a different comment. As I think
about it, the value is the number of bits the xtime_nsec value is
shifted to improve precision. It is the same as clock->shift, but the
semantics are different. How about:

        /* Shift conversion between the xtime_nsec remainder and nsecs */
        int     xtime_shift;

The other comments are good, I've added them.


> Other than that this patch looks ok, but will need rigorous testing, as
> it's very complicated and difficult-to-understand code that's being
> changed.

This patch in particular is rather easy to verify, but the overall series
will definitely need some time in linux-next.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



* Re: [RFC][patch 10/12] move NTP adjusted clock multiplier to struct timekeeper
  2009-07-31  8:12       ` john stultz
@ 2009-07-31  8:27         ` Martin Schwidefsky
  2009-07-31  9:00         ` Martin Schwidefsky
  1 sibling, 0 replies; 32+ messages in thread
From: Martin Schwidefsky @ 2009-07-31  8:27 UTC (permalink / raw)
  To: john stultz; +Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Daniel Walker

On Fri, 31 Jul 2009 01:12:59 -0700
john stultz <johnstul@us.ibm.com> wrote:

> On Fri, 2009-07-31 at 09:52 +0200, Martin Schwidefsky wrote:
> > On Thu, 30 Jul 2009 15:04:48 -0700
> > john stultz <johnstul@us.ibm.com> wrote:
> > 
> > > On Wed, 2009-07-29 at 15:41 +0200, Martin Schwidefsky wrote:
> > > > plain text document attachment (timekeeper-mult.diff)
> > > > From: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > > > 
> > > > The clocksource structure has two multipliers, the unmodified multiplier
> > > > clock->mult_orig and the NTP corrected multiplier clock->mult. The NTP
> > > > multiplier is misplaced in the struct clocksource, this is private
> > > > information of the timekeeping code. Add the mult field to the struct
> > > > timekeeper to contain the NTP corrected value, keep the unmodified
> > > > multiplier in clock->mult and remove clock->mult_orig. For consistency
> > > > add the shift value associated with the NTP corrected mult value to
> > > > struct timekeeper as well.
> > > > 
> > > > Cc: Ingo Molnar <mingo@elte.hu>
> > > > Cc: Thomas Gleixner <tglx@linutronix.de>
> > > > Cc: john stultz <johnstul@us.ibm.com>
> > > > Cc: Daniel Walker <dwalker@fifo99.com>
> > > > Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > > > ---
> > > >  arch/arm/plat-omap/common.c |    2 +-
> > > >  include/linux/clocksource.h |    4 +---
> > > >  kernel/time/timekeeping.c   |   43 +++++++++++++++++++++++++------------------
> > > >  3 files changed, 27 insertions(+), 22 deletions(-)
> > > > 
> > > > Index: linux-2.6/include/linux/clocksource.h
> > > > ===================================================================
> > > > --- linux-2.6.orig/include/linux/clocksource.h
> > > > +++ linux-2.6/include/linux/clocksource.h
> > > > @@ -149,8 +149,7 @@ extern u64 timecounter_cyc2time(struct t
> > > >   * @disable:		optional function to disable the clocksource
> > > >   * @mask:		bitmask for two's complement
> > > >   *			subtraction of non 64 bit counters
> > > > - * @mult:		cycle to nanosecond multiplier (adjusted by NTP)
> > > > - * @mult_orig:		cycle to nanosecond multiplier (unadjusted by NTP)
> > > > + * @mult:		cycle to nanosecond multiplier
> > > >   * @shift:		cycle to nanosecond divisor (power of two)
> > > >   * @flags:		flags describing special properties
> > > >   * @vread:		vsyscall based read
> > > > @@ -168,7 +167,6 @@ struct clocksource {
> > > >  	void (*disable)(struct clocksource *cs);
> > > >  	cycle_t mask;
> > > >  	u32 mult;
> > > > -	u32 mult_orig;
> > > >  	u32 shift;
> > > >  	unsigned long flags;
> > > >  	cycle_t (*vread)(void);
> > > > Index: linux-2.6/kernel/time/timekeeping.c
> > > > ===================================================================
> > > > --- linux-2.6.orig/kernel/time/timekeeping.c
> > > > +++ linux-2.6/kernel/time/timekeeping.c
> > > > @@ -29,6 +29,8 @@ struct timekeeper {
> > > >  	s64	ntp_error;
> > > >  	int	xtime_shift;
> > > >  	int	ntp_error_shift;
> > > > +	u32	mult;			/* NTP adjusted clock multiplier */
> > > > +	u32	shift;			/* NTP adjusted clock shift */
> > > 
> > > I'm a little confused here. NTP doesn't adjust the shift value. This
> > > seems redundant with the xtime_shift value introduced in [patch 9/12].
> > 
> > True the NTP code does not adjust the shift value. But mult and shift
> > build a pair, no? I think it is more readable if we use timekeeper.mult 
> > and timekeeper.shift instead of timekeeper.mult and clock->shift.
> > And the xtime_shift does not need to be the same value as the clock
> > shift - at least in theory. It is just a shift to increase precision. 
> > Therefore I would like to use the xtime_shift and ntp_error_shift values
> > exclusively as the shifts for the respective xtime_nsec and ntp_error
> > values. Makes sense ?
> 
> But can't we drop the xtime_shift if we have the clocksource shift kept
> internally? Unless I'm missing something more subtle in your patches,
> they should be the same at all times.

They are the same. Perhaps a good comment in the timekeeper struct is
good enough. Then the patch that introduces the shift values would add
timekeeper.shift, timekeeper.xtime_shift would be dropped, and the patch
that adds timekeeper.mult would just use timekeeper.shift as it is.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



* Re: [RFC][patch 10/12] move NTP adjusted clock multiplier to struct timekeeper
  2009-07-31  8:12       ` john stultz
  2009-07-31  8:27         ` Martin Schwidefsky
@ 2009-07-31  9:00         ` Martin Schwidefsky
  2009-07-31 23:32           ` john stultz
  1 sibling, 1 reply; 32+ messages in thread
From: Martin Schwidefsky @ 2009-07-31  9:00 UTC (permalink / raw)
  To: john stultz; +Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Daniel Walker

On Fri, 31 Jul 2009 01:12:59 -0700
john stultz <johnstul@us.ibm.com> wrote:

> On Fri, 2009-07-31 at 09:52 +0200, Martin Schwidefsky wrote:
> > On Thu, 30 Jul 2009 15:04:48 -0700
> > john stultz <johnstul@us.ibm.com> wrote:
> > 
> > > On Wed, 2009-07-29 at 15:41 +0200, Martin Schwidefsky wrote:
> > > > plain text document attachment (timekeeper-mult.diff)
> > > > From: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > > > 
> > > > The clocksource structure has two multipliers, the unmodified multiplier
> > > > clock->mult_orig and the NTP corrected multiplier clock->mult. The NTP
> > > > multiplier is misplaced in the struct clocksource, this is private
> > > > information of the timekeeping code. Add the mult field to the struct
> > > > timekeeper to contain the NTP corrected value, keep the unmodified
> > > > multiplier in clock->mult and remove clock->mult_orig. For consistency
> > > > add the shift value associated with the NTP corrected mult value to
> > > > struct timekeeper as well.
> > > > 
> > > > Cc: Ingo Molnar <mingo@elte.hu>
> > > > Cc: Thomas Gleixner <tglx@linutronix.de>
> > > > Cc: john stultz <johnstul@us.ibm.com>
> > > > Cc: Daniel Walker <dwalker@fifo99.com>
> > > > Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > > > ---
> > > >  arch/arm/plat-omap/common.c |    2 +-
> > > >  include/linux/clocksource.h |    4 +---
> > > >  kernel/time/timekeeping.c   |   43 +++++++++++++++++++++++++------------------
> > > >  3 files changed, 27 insertions(+), 22 deletions(-)
> > > > 
> > > > Index: linux-2.6/include/linux/clocksource.h
> > > > ===================================================================
> > > > --- linux-2.6.orig/include/linux/clocksource.h
> > > > +++ linux-2.6/include/linux/clocksource.h
> > > > @@ -149,8 +149,7 @@ extern u64 timecounter_cyc2time(struct t
> > > >   * @disable:		optional function to disable the clocksource
> > > >   * @mask:		bitmask for two's complement
> > > >   *			subtraction of non 64 bit counters
> > > > - * @mult:		cycle to nanosecond multiplier (adjusted by NTP)
> > > > - * @mult_orig:		cycle to nanosecond multiplier (unadjusted by NTP)
> > > > + * @mult:		cycle to nanosecond multiplier
> > > >   * @shift:		cycle to nanosecond divisor (power of two)
> > > >   * @flags:		flags describing special properties
> > > >   * @vread:		vsyscall based read
> > > > @@ -168,7 +167,6 @@ struct clocksource {
> > > >  	void (*disable)(struct clocksource *cs);
> > > >  	cycle_t mask;
> > > >  	u32 mult;
> > > > -	u32 mult_orig;
> > > >  	u32 shift;
> > > >  	unsigned long flags;
> > > >  	cycle_t (*vread)(void);
> > > > Index: linux-2.6/kernel/time/timekeeping.c
> > > > ===================================================================
> > > > --- linux-2.6.orig/kernel/time/timekeeping.c
> > > > +++ linux-2.6/kernel/time/timekeeping.c
> > > > @@ -29,6 +29,8 @@ struct timekeeper {
> > > >  	s64	ntp_error;
> > > >  	int	xtime_shift;
> > > >  	int	ntp_error_shift;
> > > > +	u32	mult;			/* NTP adjusted clock multiplier */
> > > > +	u32	shift;			/* NTP adjusted clock shift */
> > > 
> > > I'm a little confused here. NTP doesn't adjust the shift value. This
> > > seems redundant with the xtime_shift value introduced in [patch 9/12].
> > 
> > True the NTP code does not adjust the shift value. But mult and shift
> > build a pair, no? I think it is more readable if we use timekeeper.mult 
> > and timekeeper.shift instead of timekeeper.mult and clock->shift.
> > And the xtime_shift does not need to be the same value as the clock
> > shift - at least in theory. It is just a shift to increase precision. 
> > Therefore I would like to use the xtime_shift and ntp_error_shift values
> > exclusively as the shifts for the respective xtime_nsec and ntp_error
> > values. Makes sense ?
> 
> But can't we drop the xtime_shift if we have the clocksource shift kept
> internally? Unless I'm missing something more subtle in your patches,
> they should be the same at all times.

How about this definition for struct timekeeper?

/* Structure holding internal timekeeping values. */
struct timekeeper {
        /* Current clocksource used for timekeeping. */
        struct clocksource *clock;
        /* The shift value of the current clocksource. */
        int     shift;

        /* Number of clock cycles in one NTP interval. */
        cycle_t cycle_interval;
        /* Number of clock shifted nano seconds in one NTP interval. */
        u64     xtime_interval;
        /* Raw nano seconds accumulated per NTP interval. */
        u32     raw_interval;

        /* Clock shifted nano seconds remainder not stored in xtime.tv_nsec. */
        u64     xtime_nsec;
        /* Difference between accumulated time and NTP time in ntp
         * shifted nano seconds. */
        s64     ntp_error;
        /* Shift conversion between clock shifted nano seconds and
         * ntp shifted nano seconds. */
        int     ntp_error_shift;
        /* NTP adjusted clock multiplier */
        u32     mult;
};
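The ntp_error / ntp_error_shift fields above can be illustrated with a
small sketch of the error bookkeeping (constants below are made up; the
structure follows what update_wall_time does today): tick_length is in
NTP-shifted nanoseconds, xtime_interval in clock-shifted nanoseconds,
and ntp_error_shift rescales the latter into the former's fixed point.

```c
#include <assert.h>
#include <stdint.h>

/* One accumulation step of the error bookkeeping: add the NTP-shifted
 * interval length, subtract the clock-shifted interval rescaled by
 * ntp_error_shift.  A nonzero result drives the mult adjustment. */
static int64_t ntp_error_step(int64_t ntp_error, int64_t tick_length,
			      uint64_t xtime_interval, int ntp_error_shift)
{
	ntp_error += tick_length;
	ntp_error -= (int64_t)(xtime_interval << ntp_error_shift);
	return ntp_error;
}
```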

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



* Re: [RFC][patch 10/12] move NTP adjusted clock multiplier to struct timekeeper
  2009-07-31  9:00         ` Martin Schwidefsky
@ 2009-07-31 23:32           ` john stultz
  2009-08-03  8:02             ` Martin Schwidefsky
  0 siblings, 1 reply; 32+ messages in thread
From: john stultz @ 2009-07-31 23:32 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Daniel Walker

On Fri, 2009-07-31 at 11:00 +0200, Martin Schwidefsky wrote:
> On Fri, 31 Jul 2009 01:12:59 -0700
> john stultz <johnstul@us.ibm.com> wrote:
> 
> > On Fri, 2009-07-31 at 09:52 +0200, Martin Schwidefsky wrote:
> > > On Thu, 30 Jul 2009 15:04:48 -0700
> > > john stultz <johnstul@us.ibm.com> wrote:
> > > 
> > > > On Wed, 2009-07-29 at 15:41 +0200, Martin Schwidefsky wrote:
> > > > > plain text document attachment (timekeeper-mult.diff)
> > > > > From: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > > > > 
> > > > > The clocksource structure has two multipliers, the unmodified multiplier
> > > > > clock->mult_orig and the NTP corrected multiplier clock->mult. The NTP
> > > > > multiplier is misplaced in the struct clocksource, this is private
> > > > > information of the timekeeping code. Add the mult field to the struct
> > > > > timekeeper to contain the NTP corrected value, keep the unmodified
> > > > > multiplier in clock->mult and remove clock->mult_orig. For consistency
> > > > > add the shift value associated with the NTP corrected mult value to
> > > > > struct timekeeper as well.
> > > > > 
> > > > > Cc: Ingo Molnar <mingo@elte.hu>
> > > > > Cc: Thomas Gleixner <tglx@linutronix.de>
> > > > > Cc: john stultz <johnstul@us.ibm.com>
> > > > > Cc: Daniel Walker <dwalker@fifo99.com>
> > > > > Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > > > > ---
> > > > >  arch/arm/plat-omap/common.c |    2 +-
> > > > >  include/linux/clocksource.h |    4 +---
> > > > >  kernel/time/timekeeping.c   |   43 +++++++++++++++++++++++++------------------
> > > > >  3 files changed, 27 insertions(+), 22 deletions(-)
> > > > > 
> > > > > Index: linux-2.6/include/linux/clocksource.h
> > > > > ===================================================================
> > > > > --- linux-2.6.orig/include/linux/clocksource.h
> > > > > +++ linux-2.6/include/linux/clocksource.h
> > > > > @@ -149,8 +149,7 @@ extern u64 timecounter_cyc2time(struct t
> > > > >   * @disable:		optional function to disable the clocksource
> > > > >   * @mask:		bitmask for two's complement
> > > > >   *			subtraction of non 64 bit counters
> > > > > - * @mult:		cycle to nanosecond multiplier (adjusted by NTP)
> > > > > - * @mult_orig:		cycle to nanosecond multiplier (unadjusted by NTP)
> > > > > + * @mult:		cycle to nanosecond multiplier
> > > > >   * @shift:		cycle to nanosecond divisor (power of two)
> > > > >   * @flags:		flags describing special properties
> > > > >   * @vread:		vsyscall based read
> > > > > @@ -168,7 +167,6 @@ struct clocksource {
> > > > >  	void (*disable)(struct clocksource *cs);
> > > > >  	cycle_t mask;
> > > > >  	u32 mult;
> > > > > -	u32 mult_orig;
> > > > >  	u32 shift;
> > > > >  	unsigned long flags;
> > > > >  	cycle_t (*vread)(void);
> > > > > Index: linux-2.6/kernel/time/timekeeping.c
> > > > > ===================================================================
> > > > > --- linux-2.6.orig/kernel/time/timekeeping.c
> > > > > +++ linux-2.6/kernel/time/timekeeping.c
> > > > > @@ -29,6 +29,8 @@ struct timekeeper {
> > > > >  	s64	ntp_error;
> > > > >  	int	xtime_shift;
> > > > >  	int	ntp_error_shift;
> > > > > +	u32	mult;			/* NTP adjusted clock multiplier */
> > > > > +	u32	shift;			/* NTP adjusted clock shift */
> > > > 
> > > > I'm a little confused here. NTP doesn't adjust the shift value. This
> > > > seems redundant with the xtime_shift value introduced in [patch 9/12].
> > > 
> > > True the NTP code does not adjust the shift value. But mult and shift
> > > build a pair, no? I think it is more readable if we use timekeeper.mult 
> > > and timekeeper.shift instead of timekeeper.mult and clock->shift.
> > > And the xtime_shift does not need to be the same value as the clock
> > > shift - at least in theory. It is just a shift to increase precision. 
> > > Therefore I would like to use the xtime_shift and ntp_error_shift values
> > > exclusively as the shifts for the respective xtime_nsec and ntp_error
> > > values. Makes sense ?
> > 
> > But can't we drop the xtime_shift if we have the clocksource shift kept
> > internally? Unless I'm missing something more subtle in your patches,
> > they should be the same at all times.
> 
> How about this definition for struct timekeeper?
> 
> /* Structure holding internal timekeeping values. */
> struct timekeeper {
>         /* Current clocksource used for timekeeping. */
>         struct clocksource *clock;
>         /* The shift value of the current clocksource. */
>         int     shift;
> 
>         /* Number of clock cycles in one NTP interval. */
>         cycle_t cycle_interval;
>         /* Number of clock shifted nano seconds in one NTP interval. */
>         u64     xtime_interval;
>         /* Raw nano seconds accumulated per NTP interval. */

I might just use the term "accumulation interval" instead of NTP
interval. This isn't something that's strictly connected to NTP, other
than that they both just need some chunk of time that we use as a
comparative interval.


>         u32     raw_interval;
> 
>         /* Clock shifted nano seconds remainder not stored in xtime.tv_nsec. */
>         u64     xtime_nsec;
>         /* Difference between accumulated time and NTP time in ntp
>          * shifted nano seconds. */
>         s64     ntp_error;
>         /* Shift conversion between clock shifted nano seconds and
>          * ntp shifted nano seconds. */
>         int     ntp_error_shift;
>         /* NTP adjusted clock multiplier */
>         u32     mult;
> };

Other than that it looks good to me!

thanks
-john




* Re: [RFC][patch 10/12] move NTP adjusted clock multiplier to struct timekeeper
  2009-07-31 23:32           ` john stultz
@ 2009-08-03  8:02             ` Martin Schwidefsky
  0 siblings, 0 replies; 32+ messages in thread
From: Martin Schwidefsky @ 2009-08-03  8:02 UTC (permalink / raw)
  To: john stultz; +Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Daniel Walker

On Fri, 31 Jul 2009 16:32:58 -0700
john stultz <johnstul@us.ibm.com> wrote:

> On Fri, 2009-07-31 at 11:00 +0200, Martin Schwidefsky wrote:
> > How about this definition for struct timekeeper?
> > 
> > /* Structure holding internal timekeeping values. */
> > struct timekeeper {
> >         /* Current clocksource used for timekeeping. */
> >         struct clocksource *clock;
> >         /* The shift value of the current clocksource. */
> >         int     shift;
> > 
> >         /* Number of clock cycles in one NTP interval. */
> >         cycle_t cycle_interval;
> >         /* Number of clock shifted nano seconds in one NTP interval. */
> >         u64     xtime_interval;
> >         /* Raw nano seconds accumulated per NTP interval. */
> 
> I might just use the term "accumulation interval"  instead of NTP
> interval. This isn't something that's strictly connected to NTP, other
> then they both just need to have some chunk of time that we use as an
> comparative interval.

Accumulation interval is more generic, but it IS the NTP interval, no?
Do we ever plan to use something other than the NTP interval?

> >         u32     raw_interval;
> > 
> >         /* Clock shifted nano seconds remainder not stored in xtime.tv_nsec. */
> >         u64     xtime_nsec;
> >         /* Difference between accumulated time and NTP time in ntp
> >          * shifted nano seconds. */
> >         s64     ntp_error;
> >         /* Shift conversion between clock shifted nano seconds and
> >          * ntp shifted nano seconds. */
> >         int     ntp_error_shift;
> >         /* NTP adjusted clock multiplier */
> >         u32     mult;
> > };
> 
> Other than that it looks good to me!

Ok, I'll update the patches.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



* Re: [RFC][patch 10/12] move NTP adjusted clock multiplier to struct  timekeeper
  2009-07-29 13:41 ` [RFC][patch 10/12] move NTP adjusted clock multiplier " Martin Schwidefsky
  2009-07-30 22:04   ` john stultz
@ 2009-08-13 11:15   ` Linus Walleij
  2009-08-13 11:23     ` Martin Schwidefsky
  1 sibling, 1 reply; 32+ messages in thread
From: Linus Walleij @ 2009-08-13 11:15 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, john stultz,
	Daniel Walker

2009/7/29 Martin Schwidefsky <schwidefsky@de.ibm.com>:

> Index: linux-2.6/arch/arm/plat-omap/common.c
> ===================================================================
> --- linux-2.6.orig/arch/arm/plat-omap/common.c
> +++ linux-2.6/arch/arm/plat-omap/common.c
> @@ -256,7 +256,7 @@ unsigned long long sched_clock(void)
>        unsigned long long ret;
>
>        ret = (unsigned long long)clocksource_32k.read(&clocksource_32k);
> -       ret = (ret * clocksource_32k.mult_orig) >> clocksource_32k.shift;
> +       ret = (ret * clocksource_32k.mult) >> clocksource_32k.shift;

Doesn't the overall effect of your patch mean that the above can simply be
replaced by

ret = cyc2ns(&clocksource_32k, ret);

Or clocksource_cyc2ns() if that was renamed (I don't have the entire
picture here...)

>        return ret;
>  }

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RFC][patch 10/12] move NTP adjusted clock multiplier to struct timekeeper
  2009-08-13 11:15   ` Linus Walleij
@ 2009-08-13 11:23     ` Martin Schwidefsky
  0 siblings, 0 replies; 32+ messages in thread
From: Martin Schwidefsky @ 2009-08-13 11:23 UTC (permalink / raw)
  To: Linus Walleij
  Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, john stultz,
	Daniel Walker

On Thu, 13 Aug 2009 13:15:06 +0200
Linus Walleij <linus.ml.walleij@gmail.com> wrote:

> 2009/7/29 Martin Schwidefsky <schwidefsky@de.ibm.com>:
> 
> > Index: linux-2.6/arch/arm/plat-omap/common.c
> > ===================================================================
> > --- linux-2.6.orig/arch/arm/plat-omap/common.c
> > +++ linux-2.6/arch/arm/plat-omap/common.c
> > @@ -256,7 +256,7 @@ unsigned long long sched_clock(void)
> >        unsigned long long ret;
> >
> >        ret = (unsigned long long)clocksource_32k.read(&clocksource_32k);
> > -       ret = (ret * clocksource_32k.mult_orig) >> clocksource_32k.shift;
> > +       ret = (ret * clocksource_32k.mult) >> clocksource_32k.shift;
> 
> Doesn't the overall effect of your patch mean that the above can simply be
> replaced by
> 
> ret = cyc2ns(&clocksource_32k, ret);
> 
> Or clocksource_cyc2ns() if that was renamed (I don't have the entire
> picture here...)
> 
> >        return ret;
> >  }

Yes, you can do that. More specifically it will be

  clocksource_cyc2ns(ret, clocksource_32k.mult, clocksource_32k.shift)

after all patches are applied.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2009-08-13 11:23 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-07-29 13:41 [RFC][patch 00/12] clocksource / timekeeping rework V2 Martin Schwidefsky
2009-07-29 13:41 ` [RFC][patch 01/12] introduce timekeeping_leap_insert Martin Schwidefsky
2009-07-30 21:02   ` john stultz
2009-07-29 13:41 ` [RFC][patch 02/12] remove clocksource inline functions Martin Schwidefsky
2009-07-29 14:15   ` Daniel Walker
2009-07-30 21:46     ` Christoph Hellwig
2009-07-30 21:05   ` john stultz
2009-07-29 13:41 ` [RFC][patch 03/12] cleanup clocksource selection Martin Schwidefsky
2009-07-29 13:41 ` [RFC][patch 04/12] clocksource watchdog highres enablement Martin Schwidefsky
2009-07-29 13:41 ` [RFC][patch 05/12] clocksource watchdog resume logic Martin Schwidefsky
2009-07-29 13:41 ` [RFC][patch 06/12] clocksource watchdog refactoring Martin Schwidefsky
2009-07-29 13:41 ` [RFC][patch 07/12] clocksource watchdog work Martin Schwidefsky
2009-07-29 13:41 ` [RFC][patch 08/12] introduce struct timekeeper Martin Schwidefsky
2009-07-29 13:41 ` [RFC][patch 09/12] add xtime_shift and ntp_error_shift to " Martin Schwidefsky
2009-07-30 22:15   ` john stultz
2009-07-31  8:13     ` Martin Schwidefsky
2009-07-29 13:41 ` [RFC][patch 10/12] move NTP adjusted clock multiplier " Martin Schwidefsky
2009-07-30 22:04   ` john stultz
2009-07-31  7:52     ` Martin Schwidefsky
2009-07-31  8:12       ` john stultz
2009-07-31  8:27         ` Martin Schwidefsky
2009-07-31  9:00         ` Martin Schwidefsky
2009-07-31 23:32           ` john stultz
2009-08-03  8:02             ` Martin Schwidefsky
2009-08-13 11:15   ` Linus Walleij
2009-08-13 11:23     ` Martin Schwidefsky
2009-07-29 13:41 ` [RFC][patch 11/12] timekeeper read clock helper functions Martin Schwidefsky
2009-07-30 21:39   ` john stultz
2009-07-31  7:45     ` Martin Schwidefsky
2009-07-31  8:11       ` john stultz
2009-07-29 13:41 ` [RFC][patch 12/12] update clocksource with stop_machine Martin Schwidefsky
2009-07-29 15:10 ` [RFC][patch 00/12] clocksource / timekeeping rework V2 Daniel Walker

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox