linuxppc-dev.lists.ozlabs.org archive mirror
* Tickless Hz/hrtimers/etc. on PowerPC
@ 2007-07-11 18:06 Matt Sealey
  2007-07-11 18:17 ` Sergei Shtylyov
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Matt Sealey @ 2007-07-11 18:06 UTC (permalink / raw)
  To: ppc-dev

Does anyone have the definitive patchset to enable the tickless hz,
some kind of hrtimer and the other related improvements in the
PowerPC tree?

-- 
Matt Sealey <matt@genesi-usa.com>
Genesi, Manager, Developer Relations


* Re: Tickless Hz/hrtimers/etc. on PowerPC
  2007-07-11 18:06 Tickless Hz/hrtimers/etc. on PowerPC Matt Sealey
@ 2007-07-11 18:17 ` Sergei Shtylyov
  2007-07-11 18:33 ` Michael Neuling
  2007-07-12  6:51 ` Domen Puncer
  2 siblings, 0 replies; 19+ messages in thread
From: Sergei Shtylyov @ 2007-07-11 18:17 UTC (permalink / raw)
  To: Matt Sealey; +Cc: ppc-dev

Hello.

Matt Sealey wrote:

> Does anyone have the definitive patchset to enable the tickless hz,
> some kind of hrtimer and the other related improvements in the
> PowerPC tree?

    Look into the -rt patch, which has all of this (minus TOD vsyscalls).

MBR, Sergei


* Re: Tickless Hz/hrtimers/etc. on PowerPC
  2007-07-11 18:06 Tickless Hz/hrtimers/etc. on PowerPC Matt Sealey
  2007-07-11 18:17 ` Sergei Shtylyov
@ 2007-07-11 18:33 ` Michael Neuling
  2007-07-11 22:15   ` Matt Sealey
  2007-07-12  6:51 ` Domen Puncer
  2 siblings, 1 reply; 19+ messages in thread
From: Michael Neuling @ 2007-07-11 18:33 UTC (permalink / raw)
  To: Matt Sealey; +Cc: ppc-dev

> Does anyone have the definitive patchset to enable the tickless hz,
> some kind of hrtimer and the other related improvements in the PowerPC
> tree?

Tony Breeds has been looking at this.  I think he wanted to clean his
patch set up before he posted it.

Mikey


* Re: Tickless Hz/hrtimers/etc. on PowerPC
  2007-07-11 18:33 ` Michael Neuling
@ 2007-07-11 22:15   ` Matt Sealey
  2007-07-12  6:41     ` Tony Breeds
                       ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Matt Sealey @ 2007-07-11 22:15 UTC (permalink / raw)
  To: Michael Neuling; +Cc: ppc-dev

Okay.

What I didn't want to do is spend a day sifting some other development
tree picking out what I think might be possibly sort of the right patches
for it.

I'd get it wrong because having not worked on it, I don't know what I am
even looking for.

And I don't want to run -rt or wireless-dev for the benefit of a single
feature. What I am after is something like Ingo Molnar throws out..
single patches done the old way, not git trees. It's so much easier to
handle and integrate for example into a Gentoo ebuild or to make a
tarball of accumulated patches from a certain release kernel.

-- 
Matt Sealey <matt@genesi-usa.com>
Genesi, Manager, Developer Relations

Michael Neuling wrote:
>> Does anyone have the definitive patchset to enable the tickless hz,
>> some kind of hrtimer and the other related improvements in the PowerPC
>> tree?
> 
> Tony Breeds has been looking at this.  I think he wanted to clean his
> patch set up before he posted it.
> 
> Mikey


* Re: Tickless Hz/hrtimers/etc. on PowerPC
  2007-07-11 22:15   ` Matt Sealey
@ 2007-07-12  6:41     ` Tony Breeds
  2007-07-12 12:07       ` Matt Sealey
  2007-07-12 15:49     ` Michael Neuling
  2007-07-12 16:32     ` Sergei Shtylyov
  2 siblings, 1 reply; 19+ messages in thread
From: Tony Breeds @ 2007-07-12  6:41 UTC (permalink / raw)
  To: Matt Sealey; +Cc: ppc-dev, Michael Neuling

On Wed, Jul 11, 2007 at 11:15:16PM +0100, Matt Sealey wrote:
 
> And I don't want to run -rt or wireless-dev for the benefit of a single
> feature. What I am after is something like Ingo Molnar throws out..
> single patches done the old way, not git trees. It's so much easier to
> handle and integrate for example into a Gentoo ebuild or to make a
> tarball of accumulated patches from a certain release kernel.

Hi Matt,
	In the near future I will have something that I can pass around
for review.  Which will be a quilt series of about 5 patches (based on
mainline).  I'll make sure to include you in the reviewers list.  At
this stage I'd hope they'll be in 2.6.24.

I have HRT in a state where you can enable it and it works, but NO_HZ
isn't quite right yet.

Yours Tony

  linux.conf.au        http://linux.conf.au/ || http://lca2008.linux.org.au/
  Jan 28 - Feb 02 2008 The Australian Linux Technical Conference!


* Re: Tickless Hz/hrtimers/etc. on PowerPC
  2007-07-11 18:06 Tickless Hz/hrtimers/etc. on PowerPC Matt Sealey
  2007-07-11 18:17 ` Sergei Shtylyov
  2007-07-11 18:33 ` Michael Neuling
@ 2007-07-12  6:51 ` Domen Puncer
  2007-07-12 12:07   ` Matt Sealey
  2007-07-12 14:11   ` Sergei Shtylyov
  2 siblings, 2 replies; 19+ messages in thread
From: Domen Puncer @ 2007-07-12  6:51 UTC (permalink / raw)
  To: Matt Sealey; +Cc: ppc-dev

[-- Attachment #1: Type: text/plain, Size: 472 bytes --]

On 11/07/07 19:06 +0100, Matt Sealey wrote:
> Does anyone have the definitive patchset to enable the tickless hz,
> some kind of hrtimer and the other related improvements in the
> PowerPC tree?

I use attached patches for tickless.
Order in which they're applied:

PowerPC_GENERIC_CLOCKEVENTS.patch
PowerPC_GENERIC_TIME.linux-2.6.18-rc6_timeofday-arch-ppc_C6.patch
PowerPC_enable_HRT_and_dynticks_support.patch
PowerPC_no_hz_fix.patch
tickless-enable.patch

HTH


	Domen

[-- Attachment #2: PowerPC_GENERIC_CLOCKEVENTS.patch --]
[-- Type: text/plain, Size: 6823 bytes --]

===================================================================
---
 arch/powerpc/Kconfig       |   12 +++-
 arch/powerpc/kernel/time.c |  124 ++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 134 insertions(+), 2 deletions(-)

Index: work-powerpc.git/arch/powerpc/Kconfig
===================================================================
--- work-powerpc.git.orig/arch/powerpc/Kconfig
+++ work-powerpc.git/arch/powerpc/Kconfig
@@ -347,7 +347,7 @@ config PPC_MM_SLICES
 
 config VIRT_CPU_ACCOUNTING
 	bool "Deterministic task and CPU time accounting"
-	depends on PPC64
+	depends on PPC64 && !GENERIC_CLOCKEVENTS
 	default y
 	help
 	  Select this option to enable more accurate task and CPU time
@@ -406,6 +406,16 @@ config HIGHMEM
 	depends on PPC32
 
 source kernel/Kconfig.hz
+
+config GENERIC_CLOCKEVENTS
+	bool "Clock event devices support"
+	default n
+	help
+	  Enable support for the clock event devices necessary for the
+	  high-resolution timers and the tickless system support.
+	  NOTE: This is not compatible with the deterministic time accounting
+	  option on PPC64.
+
 source kernel/Kconfig.preempt
 source "fs/Kconfig.binfmt"
 
Index: work-powerpc.git/arch/powerpc/kernel/time.c
===================================================================
--- work-powerpc.git.orig/arch/powerpc/kernel/time.c
+++ work-powerpc.git/arch/powerpc/kernel/time.c
@@ -52,6 +52,7 @@
 #include <linux/jiffies.h>
 #include <linux/posix-timers.h>
 #include <linux/irq.h>
+#include <linux/clockchips.h>
 
 #include <asm/io.h>
 #include <asm/processor.h>
@@ -127,6 +128,83 @@ unsigned long ppc_tb_freq;
 static u64 tb_last_jiffy __cacheline_aligned_in_smp;
 static DEFINE_PER_CPU(u64, last_jiffy);
 
+#ifdef CONFIG_GENERIC_CLOCKEVENTS
+
+#if defined(CONFIG_40x) || defined(CONFIG_BOOKE)
+#define DECREMENTER_MAX 0xffffffff
+#else
+#define DECREMENTER_MAX 0x7fffffff /* setting MSB triggers an interrupt */
+#endif
+
+static int decrementer_set_next_event(unsigned long evt,
+				      struct clock_event_device *dev)
+{
+#if defined(CONFIG_40x)
+	mtspr(SPRN_PIT, evt);	/* 40x has a hidden PIT auto-reload register */
+#elif defined(CONFIG_BOOKE)
+	mtspr(SPRN_DECAR, evt); /* Book E has  separate auto-reload register */
+	set_dec(evt);
+#else
+	set_dec(evt - 1);	/* Classic decrementer interrupts at -1 */
+#endif
+	return 0;
+}
+
+static void decrementer_set_mode(enum	clock_event_mode   mode,
+				 struct clock_event_device *dev)
+{
+#if defined(CONFIG_40x) || defined(CONFIG_BOOKE)
+	u32 tcr = mfspr(SPRN_TCR);
+
+	tcr |= TCR_DIE;
+	switch (mode) {
+	case CLOCK_EVT_MODE_PERIODIC:
+		tcr |=  TCR_ARE;
+		break;
+	case CLOCK_EVT_MODE_ONESHOT:
+		tcr &= ~TCR_ARE;
+		break;
+	case CLOCK_EVT_MODE_UNUSED:
+	case CLOCK_EVT_MODE_SHUTDOWN:
+		tcr &= ~TCR_DIE;
+		break;
+	}
+	mtspr(SPRN_TCR, tcr);
+#endif
+	if (mode == CLOCK_EVT_MODE_PERIODIC)
+		decrementer_set_next_event(tb_ticks_per_jiffy, dev);
+}
+
+static struct clock_event_device decrementer_clockevent = {
+	.name		= "decrementer",
+#if defined(CONFIG_40x) || defined(CONFIG_BOOKE)
+	.features	= CLOCK_EVT_FEAT_ONESHOT | CLOCK_EVT_FEAT_PERIODIC,
+#else
+	.features	= CLOCK_EVT_FEAT_ONESHOT,
+#endif
+	.shift		= 32,
+	.rating		= 200,
+	.irq		= -1,
+	.set_next_event	= decrementer_set_next_event,
+	.set_mode	= decrementer_set_mode,
+};
+
+static DEFINE_PER_CPU(struct clock_event_device, decrementers);
+
+static void register_decrementer(void)
+{
+	int cpu = smp_processor_id();
+	struct clock_event_device *decrementer = &per_cpu(decrementers, cpu);
+
+	memcpy(decrementer, &decrementer_clockevent, sizeof(*decrementer));
+
+	decrementer->cpumask = cpumask_of_cpu(cpu);
+
+	clockevents_register_device(decrementer);
+}
+
+#endif /* CONFIG_GENERIC_CLOCKEVENTS */
+
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING
 /*
  * Factors for converting from cputime_t (timebase ticks) to
@@ -312,6 +390,9 @@ void snapshot_timebase(void)
 {
 	__get_cpu_var(last_jiffy) = get_tb();
 	snapshot_purr();
+#ifdef CONFIG_GENERIC_CLOCKEVENTS
+	register_decrementer();
+#endif
 }
 
 void __delay(unsigned long loops)
@@ -627,7 +708,31 @@ void timer_interrupt(struct pt_regs * re
 	old_regs = set_irq_regs(regs);
 	irq_enter();
 
+#ifdef CONFIG_GENERIC_CLOCKEVENTS
+#ifdef CONFIG_PPC_MULTIPLATFORM
+	/*
+	 * We must write a positive value to the decrementer to clear
+	 * the interrupt on the IBM 970 CPU series.  In periodic mode,
+	 * this happens when the decrementer gets reloaded later, but
+	 * in one-shot mode, we have to do it here since an event handler
+	 * may skip loading the new value...
+	 */
+	if (per_cpu(decrementers, cpu).mode != CLOCK_EVT_MODE_PERIODIC)
+		set_dec(DECREMENTER_MAX);
+#endif
+	/*
+	 * We can't disable the decrementer, so in the period between
+	 * CPU being marked offline and calling stop-self, it's taking
+	 * timer interrupts...
+	 */
+	if (!cpu_is_offline(cpu)) {
+		struct clock_event_device *dev = &per_cpu(decrementers, cpu);
+
+		dev->event_handler(dev);
+	}
+#else
 	profile_tick(CPU_PROFILING);
+#endif
 	calculate_steal_time();
 
 #ifdef CONFIG_PPC_ISERIES
@@ -643,6 +748,7 @@ void timer_interrupt(struct pt_regs * re
 		if (__USE_RTC() && per_cpu(last_jiffy, cpu) >= 1000000000)
 			per_cpu(last_jiffy, cpu) -= 1000000000;
 
+#ifndef CONFIG_GENERIC_CLOCKEVENTS
 		/*
 		 * We cannot disable the decrementer, so in the period
 		 * between this cpu's being marked offline in cpu_online_map
@@ -652,6 +758,7 @@ void timer_interrupt(struct pt_regs * re
 		 */
 		if (!cpu_is_offline(cpu))
 			account_process_time(regs);
+#endif
 
 		/*
 		 * No need to check whether cpu is offline here; boot_cpuid
@@ -664,15 +771,19 @@ void timer_interrupt(struct pt_regs * re
 		tb_next_jiffy = tb_last_jiffy + tb_ticks_per_jiffy;
 		if (per_cpu(last_jiffy, cpu) >= tb_next_jiffy) {
 			tb_last_jiffy = tb_next_jiffy;
+#ifndef CONFIG_GENERIC_CLOCKEVENTS
 			do_timer(1);
+#endif
 			timer_recalc_offset(tb_last_jiffy);
 			timer_check_rtc();
 		}
 		write_sequnlock(&xtime_lock);
 	}
-	
+
+#ifndef CONFIG_GENERIC_CLOCKEVENTS
 	next_dec = tb_ticks_per_jiffy - ticks;
 	set_dec(next_dec);
+#endif
 
 #ifdef CONFIG_PPC_ISERIES
 	if (firmware_has_feature(FW_FEATURE_ISERIES) && hvlpevent_is_pending())
@@ -996,8 +1107,19 @@ void __init time_init(void)
 	                        -xtime.tv_sec, -xtime.tv_nsec);
 	write_sequnlock_irqrestore(&xtime_lock, flags);
 
+#ifdef CONFIG_GENERIC_CLOCKEVENTS
+	decrementer_clockevent.mult = div_sc(ppc_tb_freq, NSEC_PER_SEC,
+					     decrementer_clockevent.shift);
+	decrementer_clockevent.max_delta_ns =
+		clockevent_delta2ns(DECREMENTER_MAX, &decrementer_clockevent);
+	decrementer_clockevent.min_delta_ns =
+		clockevent_delta2ns(0xf, &decrementer_clockevent);
+
+	register_decrementer();
+#else
 	/* Not exact, but the timer interrupt takes care of this */
 	set_dec(tb_ticks_per_jiffy);
+#endif
 }
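
As a rough illustration of the arithmetic in the time_init() hunk above:
div_sc() produces a fixed-point "ticks per nanosecond" multiplier, and
clockevent_delta2ns() turns a tick count back into nanoseconds.  The
user-space C sketch below mirrors that scaling under the assumption that
both are plain shift-and-divide operations.  The helper names
(ticks_per_ns_scaled, ns_to_ticks, ticks_to_ns) are hypothetical and are
not kernel APIs; the in-kernel helpers may also clamp their results,
which this sketch does not.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Fixed-point multiplier: (timebase ticks per nanosecond) * 2^shift.
 * Assumption: this matches what the patch stores in .mult via div_sc().
 */
static uint64_t ticks_per_ns_scaled(uint32_t tb_freq_hz, unsigned shift)
{
	assert(shift >= 1 && shift <= 32);	/* keeps the shift within 64 bits */
	return ((uint64_t)tb_freq_hz << shift) / NSEC_PER_SEC;
}

/*
 * Nanoseconds -> decrementer ticks.  The caller must keep ns small
 * enough that ns * mult does not overflow 64 bits.
 */
static uint64_t ns_to_ticks(uint64_t ns, uint64_t mult, unsigned shift)
{
	return (ns * mult) >> shift;
}

/* Decrementer ticks -> nanoseconds (the inverse scaling). */
static uint64_t ticks_to_ns(uint64_t ticks, uint64_t mult, unsigned shift)
{
	assert(mult != 0);
	assert(ticks <= UINT32_MAX && shift <= 32);
	return (ticks << shift) / mult;
}
```

With a 500 MHz time base and shift = 32 the multiplier comes out as 2^31,
so a 1000 ns delta maps to 500 decrementer ticks and back again.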
 
 

[-- Attachment #3: PowerPC_GENERIC_TIME.linux-2.6.18-rc6_timeofday-arch-ppc_C6.patch --]
[-- Type: text/plain, Size: 10567 bytes --]

Early pass on powerpc conversion to generic timekeeping.

Signed-off-by: John Stultz <johnstul@us.ibm.com>

 arch/powerpc/Kconfig       |    4 
 arch/powerpc/kernel/time.c |  278 +++++----------------------------------------
 2 files changed, 37 insertions(+), 245 deletions(-)

linux-2.6.18-rc6_timeofday-arch-ppc_C6.patch
============================================
Index: work-powerpc.git/arch/powerpc/Kconfig
===================================================================
--- work-powerpc.git.orig/arch/powerpc/Kconfig
+++ work-powerpc.git/arch/powerpc/Kconfig
@@ -31,6 +31,10 @@ config MMU
 	bool
 	default y
 
+config GENERIC_TIME
+	bool
+	default y
+
 config GENERIC_HARDIRQS
 	bool
 	default y
Index: work-powerpc.git/arch/powerpc/kernel/time.c
===================================================================
--- work-powerpc.git.orig/arch/powerpc/kernel/time.c
+++ work-powerpc.git/arch/powerpc/kernel/time.c
@@ -117,8 +117,6 @@ EXPORT_SYMBOL_GPL(rtc_lock);
 u64 tb_to_ns_scale;
 unsigned tb_to_ns_shift;
 
-struct gettimeofday_struct do_gtod;
-
 extern struct timezone sys_tz;
 static long timezone_offset;
 
@@ -456,160 +454,6 @@ static __inline__ void timer_check_rtc(v
         }
 }
 
-/*
- * This version of gettimeofday has microsecond resolution.
- */
-static inline void __do_gettimeofday(struct timeval *tv)
-{
-	unsigned long sec, usec;
-	u64 tb_ticks, xsec;
-	struct gettimeofday_vars *temp_varp;
-	u64 temp_tb_to_xs, temp_stamp_xsec;
-
-	/*
-	 * These calculations are faster (gets rid of divides)
-	 * if done in units of 1/2^20 rather than microseconds.
-	 * The conversion to microseconds at the end is done
-	 * without a divide (and in fact, without a multiply)
-	 */
-	temp_varp = do_gtod.varp;
-
-	/* Sampling the time base must be done after loading
-	 * do_gtod.varp in order to avoid racing with update_gtod.
-	 */
-	data_barrier(temp_varp);
-	tb_ticks = get_tb() - temp_varp->tb_orig_stamp;
-	temp_tb_to_xs = temp_varp->tb_to_xs;
-	temp_stamp_xsec = temp_varp->stamp_xsec;
-	xsec = temp_stamp_xsec + mulhdu(tb_ticks, temp_tb_to_xs);
-	sec = xsec / XSEC_PER_SEC;
-	usec = (unsigned long)xsec & (XSEC_PER_SEC - 1);
-	usec = SCALE_XSEC(usec, 1000000);
-
-	tv->tv_sec = sec;
-	tv->tv_usec = usec;
-}
-
-void do_gettimeofday(struct timeval *tv)
-{
-	if (__USE_RTC()) {
-		/* do this the old way */
-		unsigned long flags, seq;
-		unsigned int sec, nsec, usec;
-
-		do {
-			seq = read_seqbegin_irqsave(&xtime_lock, flags);
-			sec = xtime.tv_sec;
-			nsec = xtime.tv_nsec + tb_ticks_since(tb_last_jiffy);
-		} while (read_seqretry_irqrestore(&xtime_lock, seq, flags));
-		usec = nsec / 1000;
-		while (usec >= 1000000) {
-			usec -= 1000000;
-			++sec;
-		}
-		tv->tv_sec = sec;
-		tv->tv_usec = usec;
-		return;
-	}
-	__do_gettimeofday(tv);
-}
-
-EXPORT_SYMBOL(do_gettimeofday);
-
-/*
- * There are two copies of tb_to_xs and stamp_xsec so that no
- * lock is needed to access and use these values in
- * do_gettimeofday.  We alternate the copies and as long as a
- * reasonable time elapses between changes, there will never
- * be inconsistent values.  ntpd has a minimum of one minute
- * between updates.
- */
-static inline void update_gtod(u64 new_tb_stamp, u64 new_stamp_xsec,
-			       u64 new_tb_to_xs)
-{
-	unsigned temp_idx;
-	struct gettimeofday_vars *temp_varp;
-
-	temp_idx = (do_gtod.var_idx == 0);
-	temp_varp = &do_gtod.vars[temp_idx];
-
-	temp_varp->tb_to_xs = new_tb_to_xs;
-	temp_varp->tb_orig_stamp = new_tb_stamp;
-	temp_varp->stamp_xsec = new_stamp_xsec;
-	smp_mb();
-	do_gtod.varp = temp_varp;
-	do_gtod.var_idx = temp_idx;
-
-	/*
-	 * tb_update_count is used to allow the userspace gettimeofday code
-	 * to assure itself that it sees a consistent view of the tb_to_xs and
-	 * stamp_xsec variables.  It reads the tb_update_count, then reads
-	 * tb_to_xs and stamp_xsec and then reads tb_update_count again.  If
-	 * the two values of tb_update_count match and are even then the
-	 * tb_to_xs and stamp_xsec values are consistent.  If not, then it
-	 * loops back and reads them again until this criteria is met.
-	 * We expect the caller to have done the first increment of
-	 * vdso_data->tb_update_count already.
-	 */
-	vdso_data->tb_orig_stamp = new_tb_stamp;
-	vdso_data->stamp_xsec = new_stamp_xsec;
-	vdso_data->tb_to_xs = new_tb_to_xs;
-	vdso_data->wtom_clock_sec = wall_to_monotonic.tv_sec;
-	vdso_data->wtom_clock_nsec = wall_to_monotonic.tv_nsec;
-	smp_wmb();
-	++(vdso_data->tb_update_count);
-}
-
-/*
- * When the timebase - tb_orig_stamp gets too big, we do a manipulation
- * between tb_orig_stamp and stamp_xsec. The goal here is to keep the
- * difference tb - tb_orig_stamp small enough to always fit inside a
- * 32 bits number. This is a requirement of our fast 32 bits userland
- * implementation in the vdso. If we "miss" a call to this function
- * (interrupt latency, CPU locked in a spinlock, ...) and we end up
- * with a too big difference, then the vdso will fallback to calling
- * the syscall
- */
-static __inline__ void timer_recalc_offset(u64 cur_tb)
-{
-	unsigned long offset;
-	u64 new_stamp_xsec;
-	u64 tlen, t2x;
-	u64 tb, xsec_old, xsec_new;
-	struct gettimeofday_vars *varp;
-
-	if (__USE_RTC())
-		return;
-	tlen = current_tick_length();
-	offset = cur_tb - do_gtod.varp->tb_orig_stamp;
-	if (tlen == last_tick_len && offset < 0x80000000u)
-		return;
-	if (tlen != last_tick_len) {
-		t2x = mulhdu(tlen << TICKLEN_SHIFT, ticklen_to_xs);
-		last_tick_len = tlen;
-	} else
-		t2x = do_gtod.varp->tb_to_xs;
-	new_stamp_xsec = (u64) xtime.tv_nsec * XSEC_PER_SEC;
-	do_div(new_stamp_xsec, 1000000000);
-	new_stamp_xsec += (u64) xtime.tv_sec * XSEC_PER_SEC;
-
-	++vdso_data->tb_update_count;
-	smp_mb();
-
-	/*
-	 * Make sure time doesn't go backwards for userspace gettimeofday.
-	 */
-	tb = get_tb();
-	varp = do_gtod.varp;
-	xsec_old = mulhdu(tb - varp->tb_orig_stamp, varp->tb_to_xs)
-		+ varp->stamp_xsec;
-	xsec_new = mulhdu(tb - cur_tb, t2x) + new_stamp_xsec;
-	if (xsec_new < xsec_old)
-		new_stamp_xsec += xsec_old - xsec_new;
-
-	update_gtod(cur_tb, new_stamp_xsec, t2x);
-}
-
 #ifdef CONFIG_SMP
 unsigned long profile_pc(struct pt_regs *regs)
 {
@@ -659,11 +503,7 @@ static void iSeries_tb_recal(void)
 				tb_ticks_per_sec   = new_tb_ticks_per_sec;
 				calc_cputime_factors();
 				div128_by_32( XSEC_PER_SEC, 0, tb_ticks_per_sec, &divres );
-				do_gtod.tb_ticks_per_sec = tb_ticks_per_sec;
 				tb_to_xs = divres.result_low;
-				do_gtod.varp->tb_to_xs = tb_to_xs;
-				vdso_data->tb_ticks_per_sec = tb_ticks_per_sec;
-				vdso_data->tb_to_xs = tb_to_xs;
 			}
 			else {
 				printk( "Titan recalibrate: FAILED (difference > 4 percent)\n"
@@ -849,76 +689,6 @@ unsigned long long sched_clock(void)
 	return mulhdu(get_tb(), tb_to_ns_scale) << tb_to_ns_shift;
 }
 
-int do_settimeofday(struct timespec *tv)
-{
-	time_t wtm_sec, new_sec = tv->tv_sec;
-	long wtm_nsec, new_nsec = tv->tv_nsec;
-	unsigned long flags;
-	u64 new_xsec;
-	unsigned long tb_delta;
-
-	if ((unsigned long)tv->tv_nsec >= NSEC_PER_SEC)
-		return -EINVAL;
-
-	write_seqlock_irqsave(&xtime_lock, flags);
-
-	/*
-	 * Updating the RTC is not the job of this code. If the time is
-	 * stepped under NTP, the RTC will be updated after STA_UNSYNC
-	 * is cleared.  Tools like clock/hwclock either copy the RTC
-	 * to the system time, in which case there is no point in writing
-	 * to the RTC again, or write to the RTC but then they don't call
-	 * settimeofday to perform this operation.
-	 */
-#ifdef CONFIG_PPC_ISERIES
-	if (firmware_has_feature(FW_FEATURE_ISERIES) && first_settimeofday) {
-		iSeries_tb_recal();
-		first_settimeofday = 0;
-	}
-#endif
-
-	/* Make userspace gettimeofday spin until we're done. */
-	++vdso_data->tb_update_count;
-	smp_mb();
-
-	/*
-	 * Subtract off the number of nanoseconds since the
-	 * beginning of the last tick.
-	 */
-	tb_delta = tb_ticks_since(tb_last_jiffy);
-	tb_delta = mulhdu(tb_delta, do_gtod.varp->tb_to_xs); /* in xsec */
-	new_nsec -= SCALE_XSEC(tb_delta, 1000000000);
-
-	wtm_sec  = wall_to_monotonic.tv_sec + (xtime.tv_sec - new_sec);
-	wtm_nsec = wall_to_monotonic.tv_nsec + (xtime.tv_nsec - new_nsec);
-
- 	set_normalized_timespec(&xtime, new_sec, new_nsec);
-	set_normalized_timespec(&wall_to_monotonic, wtm_sec, wtm_nsec);
-
-	/* In case of a large backwards jump in time with NTP, we want the 
-	 * clock to be updated as soon as the PLL is again in lock.
-	 */
-	last_rtc_update = new_sec - 658;
-
-	ntp_clear();
-
-	new_xsec = xtime.tv_nsec;
-	if (new_xsec != 0) {
-		new_xsec *= XSEC_PER_SEC;
-		do_div(new_xsec, NSEC_PER_SEC);
-	}
-	new_xsec += (u64)xtime.tv_sec * XSEC_PER_SEC;
-	update_gtod(tb_last_jiffy, new_xsec, do_gtod.varp->tb_to_xs);
-
-	vdso_data->tz_minuteswest = sys_tz.tz_minuteswest;
-	vdso_data->tz_dsttime = sys_tz.tz_dsttime;
-
-	write_sequnlock_irqrestore(&xtime_lock, flags);
-	clock_was_set();
-	return 0;
-}
-
-EXPORT_SYMBOL(do_settimeofday);
 
 static int __init get_freq(char *name, int cells, unsigned long *val)
 {
@@ -1085,20 +855,6 @@ void __init time_init(void)
 
 	xtime.tv_sec = tm;
 	xtime.tv_nsec = 0;
-	do_gtod.varp = &do_gtod.vars[0];
-	do_gtod.var_idx = 0;
-	do_gtod.varp->tb_orig_stamp = tb_last_jiffy;
-	__get_cpu_var(last_jiffy) = tb_last_jiffy;
-	do_gtod.varp->stamp_xsec = (u64) xtime.tv_sec * XSEC_PER_SEC;
-	do_gtod.tb_ticks_per_sec = tb_ticks_per_sec;
-	do_gtod.varp->tb_to_xs = tb_to_xs;
-	do_gtod.tb_to_us = tb_to_us;
-
-	vdso_data->tb_orig_stamp = tb_last_jiffy;
-	vdso_data->tb_update_count = 0;
-	vdso_data->tb_ticks_per_sec = tb_ticks_per_sec;
-	vdso_data->stamp_xsec = (u64) xtime.tv_sec * XSEC_PER_SEC;
-	vdso_data->tb_to_xs = tb_to_xs;
 
 	time_freq = 0;
 
@@ -1122,7 +878,6 @@ void __init time_init(void)
 #endif
 }
 
-
 #define FEBRUARY	2
 #define	STARTOFTIME	1970
 #define SECDAY		86400L
@@ -1267,3 +1022,36 @@ void div128_by_32(u64 dividend_high, u64
 	dr->result_low  = ((u64)y << 32) + z;
 
 }
+
+
+/* powerpc clocksource code */
+
+#include <linux/clocksource.h>
+static cycle_t timebase_read(void)
+{
+	return (cycle_t)get_tb();
+}
+
+struct clocksource clocksource_timebase = {
+	.name = "timebase",
+	.rating = 200,
+	.read = timebase_read,
+	.mask = (cycle_t)-1,
+	.mult = 0,
+	.shift = 22,
+};
+
+
+/* XXX - this should be calculated or properly externed! */
+static int __init init_timebase_clocksource(void)
+{
+	if (__USE_RTC())
+		return -ENODEV;
+
+	clocksource_timebase.mult = clocksource_hz2mult(tb_ticks_per_sec,
+					clocksource_timebase.shift);
+	return clocksource_register(&clocksource_timebase);
+}
+
+module_init(init_timebase_clocksource);
+
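
For readers unfamiliar with the mult/shift pair in clocksource_timebase
above: init_timebase_clocksource() fills in .mult from the time base
frequency so that (cycles * mult) >> shift yields nanoseconds.  The
following user-space C sketch shows that relationship.  It assumes the
multiplier is computed as a rounded (10^9 << shift) / hz; hz_to_mult and
cycles_to_ns are hypothetical names for illustration only, and the range
check is an addition of this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Nanoseconds per cycle, scaled by 2^shift and rounded to nearest.
 * Assumption: this is the quantity the patch stores in .mult.
 */
static uint32_t hz_to_mult(uint32_t hz, unsigned shift)
{
	uint64_t tmp;

	assert(hz != 0 && shift <= 32);
	tmp = NSEC_PER_SEC << shift;
	tmp += hz / 2;			/* round instead of truncating */
	tmp /= hz;
	assert(tmp <= UINT32_MAX);	/* sketch-only check: must fit in 32 bits */
	return (uint32_t)tmp;
}

/*
 * Time base cycles -> nanoseconds.  The caller must keep cycles small
 * enough that cycles * mult does not overflow 64 bits.
 */
static uint64_t cycles_to_ns(uint64_t cycles, uint32_t mult, unsigned shift)
{
	return (cycles * mult) >> shift;
}
```

For a 250 MHz time base and shift = 22, the multiplier is 4 * 2^22, so
250,000,000 cycles convert to exactly one second.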

[-- Attachment #4: PowerPC_enable_HRT_and_dynticks_support.patch --]
[-- Type: text/plain, Size: 1350 bytes --]

---
 arch/powerpc/Kconfig       |    1 +
 arch/powerpc/kernel/idle.c |    3 +++
 2 files changed, 4 insertions(+)

Index: work-powerpc.git/arch/powerpc/Kconfig
===================================================================
--- work-powerpc.git.orig/arch/powerpc/Kconfig
+++ work-powerpc.git/arch/powerpc/Kconfig
@@ -416,6 +416,7 @@ config GENERIC_CLOCKEVENTS
 	  NOTE: This is not compatible with the deterministic time accounting
 	  option on PPC64.
 
+source kernel/time/Kconfig
 source kernel/Kconfig.preempt
 source "fs/Kconfig.binfmt"
 
Index: work-powerpc.git/arch/powerpc/kernel/idle.c
===================================================================
--- work-powerpc.git.orig/arch/powerpc/kernel/idle.c
+++ work-powerpc.git/arch/powerpc/kernel/idle.c
@@ -24,6 +24,7 @@
 #include <linux/smp.h>
 #include <linux/cpu.h>
 #include <linux/sysctl.h>
+#include <linux/tick.h>
 
 #include <asm/system.h>
 #include <asm/processor.h>
@@ -59,6 +60,7 @@ void cpu_idle(void)
 
 	set_thread_flag(TIF_POLLING_NRFLAG);
 	while (1) {
+		tick_nohz_stop_sched_tick();
 		while (!need_resched() && !cpu_should_die()) {
 			ppc64_runlatch_off();
 
@@ -92,6 +94,7 @@ void cpu_idle(void)
 		ppc64_runlatch_on();
 		if (cpu_should_die())
 			cpu_die();
+		tick_nohz_restart_sched_tick();
 		preempt_enable_no_resched();
 		schedule();
 		preempt_disable();

[-- Attachment #5: PowerPC_no_hz_fix.patch --]
[-- Type: text/plain, Size: 577 bytes --]

---
 arch/powerpc/kernel/time.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: work-powerpc.git/arch/powerpc/kernel/time.c
===================================================================
--- work-powerpc.git.orig/arch/powerpc/kernel/time.c
+++ work-powerpc.git/arch/powerpc/kernel/time.c
@@ -614,7 +614,7 @@ void timer_interrupt(struct pt_regs * re
 #ifndef CONFIG_GENERIC_CLOCKEVENTS
 			do_timer(1);
 #endif
-			timer_recalc_offset(tb_last_jiffy);
+			/*timer_recalc_offset(tb_last_jiffy);*/
 			timer_check_rtc();
 		}
 		write_sequnlock(&xtime_lock);

[-- Attachment #6: tickless-enable.patch --]
[-- Type: text/plain, Size: 701 bytes --]

This is needed for hrtimer_switch_to_hres() to get called.
hrtimer_run_queues()
|-tick_check_oneshot_change()
| \-timekeeping_is_continuous()
|   \- flags check
\-hrtimer_switch_to_hres()

Signed-off-by: Domen Puncer <domen.puncer@telargo.com>

---
 arch/powerpc/kernel/time.c |    1 +
 1 file changed, 1 insertion(+)

Index: work-powerpc.git/arch/powerpc/kernel/time.c
===================================================================
--- work-powerpc.git.orig/arch/powerpc/kernel/time.c
+++ work-powerpc.git/arch/powerpc/kernel/time.c
@@ -1039,6 +1039,7 @@ struct clocksource clocksource_timebase 
 	.mask = (cycle_t)-1,
 	.mult = 0,
 	.shift = 22,
+	.flags = CLOCK_SOURCE_VALID_FOR_HRES,
 };
 
 


* Re: Tickless Hz/hrtimers/etc. on PowerPC
  2007-07-12  6:41     ` Tony Breeds
@ 2007-07-12 12:07       ` Matt Sealey
  2007-07-16  0:45         ` Tony Breeds
  0 siblings, 1 reply; 19+ messages in thread
From: Matt Sealey @ 2007-07-12 12:07 UTC (permalink / raw)
  To: Tony Breeds; +Cc: ppc-dev, Michael Neuling

Hi Tony,

What does "isn't quite right yet" mean? Broken, acts funny, or just
a messy patch?

-- 
Matt Sealey <matt@genesi-usa.com>
Genesi, Manager, Developer Relations

Tony Breeds wrote:
> On Wed, Jul 11, 2007 at 11:15:16PM +0100, Matt Sealey wrote:
>  
>> And I don't want to run -rt or wireless-dev for the benefit of a single
>> feature. What I am after is something like Ingo Molnar throws out..
>> single patches done the old way, not git trees. It's so much easier to
>> handle and integrate for example into a Gentoo ebuild or to make a
>> tarball of accumulated patches from a certain release kernel.
> 
> Hi Matt,
> 	In the near future I will have something that I can pass around
> for review.  Which will be a quilt series of about 5 patches (based on
> mainline).  I'll make sure to include you in the reviewers list.  At
> this stage I'd hope they'll be in 2.6.24.
> 
> I have HRT in a state where you can enable it and it works, but NO_HZ
> isn't quite right yet.
> 
> Yours Tony
> 
>   linux.conf.au        http://linux.conf.au/ || http://lca2008.linux.org.au/
>   Jan 28 - Feb 02 2008 The Australian Linux Technical Conference!
> 


* Re: Tickless Hz/hrtimers/etc. on PowerPC
  2007-07-12  6:51 ` Domen Puncer
@ 2007-07-12 12:07   ` Matt Sealey
  2007-07-13  8:41     ` Domen Puncer
  2007-07-12 14:11   ` Sergei Shtylyov
  1 sibling, 1 reply; 19+ messages in thread
From: Matt Sealey @ 2007-07-12 12:07 UTC (permalink / raw)
  To: Domen Puncer; +Cc: ppc-dev

Domen,

You wouldn't have tried these on the Efika yet, would you?

-- 
Matt Sealey <matt@genesi-usa.com>
Genesi, Manager, Developer Relations

Domen Puncer wrote:
> On 11/07/07 19:06 +0100, Matt Sealey wrote:
>> Does anyone have the definitive patchset to enable the tickless hz,
>> some kind of hrtimer and the other related improvements in the
>> PowerPC tree?
> 
> I use attached patches for tickless.
> Order in which they're applied:
> 
> PowerPC_GENERIC_CLOCKEVENTS.patch
> PowerPC_GENERIC_TIME.linux-2.6.18-rc6_timeofday-arch-ppc_C6.patch
> PowerPC_enable_HRT_and_dynticks_support.patch
> PowerPC_no_hz_fix.patch
> tickless-enable.patch
> 
> HTH
> 
> 
> 	Domen


* Re: Tickless Hz/hrtimers/etc. on PowerPC
  2007-07-12  6:51 ` Domen Puncer
  2007-07-12 12:07   ` Matt Sealey
@ 2007-07-12 14:11   ` Sergei Shtylyov
  2007-07-12 16:41     ` Sergei Shtylyov
  2007-07-13  8:49     ` Domen Puncer
  1 sibling, 2 replies; 19+ messages in thread
From: Sergei Shtylyov @ 2007-07-12 14:11 UTC (permalink / raw)
  To: Matt Sealey; +Cc: linuxppc-dev, Domen Puncer

[-- Attachment #1: Type: text/plain, Size: 2290 bytes --]

Hello.

Domen Puncer wrote:

>>Does anyone have the definitive patchset to enable the tickless hz,
>>some kind of hrtimer and the other related improvements in the
>>PowerPC tree?

> I use attached patches for tickless.
> Order in which they're applied:

> PowerPC_GENERIC_CLOCKEVENTS.patch

    That's my patch which used to have both description and signoff that I'm 
not seeing in the attached version...

> PowerPC_GENERIC_TIME.linux-2.6.18-rc6_timeofday-arch-ppc_C6.patch

    This one should come first of all, I'd say...
    Note that it breaks TOD vsyscalls, so you need my patch that removes 
support for those for the time being (i.e. until Tony hopefully fixes this 
:-).  Also, there was a patch implementing read_persistent_clock() and getting 
rid of the code setting xtime in time_init().  Attaching them both...

> PowerPC_enable_HRT_and_dynticks_support.patch

    Again looks like my patch with description/signoff missing for whatever
reason...

> PowerPC_no_hz_fix.patch

    This has nothing to do with CONFIG_NO_HZ per se -- it fixes the 
compilation error introduced by John's patch.

> tickless-enable.patch

    That one doesn't look quite right...

> HTH

> 	Domen

[...]

> ------------------------------------------------------------------------
> 
> This is needed for hrtimer_switch_to_hres() to get called.
> hrtimer_run_queues()
> |-tick_check_oneshot_change()
> | \-timekeeping_is_continuous()
> |   \- flags check
> \-hrtimer_switch_to_hres()

> Signed-off-by: Domen Puncer <domen.puncer@telargo.com>

> Index: work-powerpc.git/arch/powerpc/kernel/time.c
> ===================================================================
> --- work-powerpc.git.orig/arch/powerpc/kernel/time.c
> +++ work-powerpc.git/arch/powerpc/kernel/time.c
> @@ -1039,6 +1039,7 @@ struct clocksource clocksource_timebase 
>  	.mask = (cycle_t)-1,
>  	.mult = 0,
>  	.shift = 22,
> +	.flags = CLOCK_SOURCE_VALID_FOR_HRES,
>  };

     Hm, has the flag name changed from CLOCK_SOURCE_IS_CONTINUOUS? I'm seeing
both of these flags, therefore it must be CLOCK_SOURCE_IS_CONTINUOUS, not
CLOCK_SOURCE_VALID_FOR_HRES.

WBR, Sergei

PS: All attached patches are against 2.6.21-rt2 -- fitting them into the 
current (or whatever) version of the kernel is left as an exercise to the 
readers. ;-)
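
The flag question raised in this message can be pictured with a toy model.
It assumes a division of labour in which a driver only declares its
counter continuous and the core grants the "valid for hres" bit after its
own checks, which is then what the flags check in the call chain quoted
above tests.  Everything here (the SRC_* constants, struct
toy_clocksource, both functions) is hypothetical and is not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum {
	SRC_IS_CONTINUOUS  = 1 << 0,	/* declared by the driver */
	SRC_VALID_FOR_HRES = 1 << 1,	/* granted by the core, never by the driver */
};

struct toy_clocksource {
	const char *name;
	unsigned int flags;
};

/* Core-side step: promote a continuous source once it passed validation. */
static void toy_core_validate(struct toy_clocksource *cs, bool passed_checks)
{
	assert(cs != NULL);
	if ((cs->flags & SRC_IS_CONTINUOUS) && passed_checks)
		cs->flags |= SRC_VALID_FOR_HRES;
}

/* The check that gates switching to high-resolution mode in this model. */
static bool toy_can_switch_to_hres(const struct toy_clocksource *cs)
{
	assert(cs != NULL);
	return (cs->flags & SRC_VALID_FOR_HRES) != 0;
}
```

In this model a source registered with only SRC_IS_CONTINUOUS cannot
switch to high-resolution mode until toy_core_validate() has run and
succeeded, and a source without that flag is never promoted.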


[-- Attachment #2: ppc-remove-broken-vsyscalls.patch --]
[-- Type: text/x-patch, Size: 23295 bytes --]

Remove PowerPC vsyscalls that were broken by the generic TOD patch.

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>

---
Since there's still no working PowerPC TOD vsyscalls fix, and they continue to
be broken in the RT patch, I've respun this patch again...

 arch/powerpc/kernel/vdso32/gettimeofday.S |  322 ------------------------------
 arch/powerpc/kernel/vdso64/gettimeofday.S |  254 -----------------------
 arch/powerpc/kernel/asm-offsets.c         |   15 -
 arch/powerpc/kernel/smp.c                 |    2 
 arch/powerpc/kernel/vdso32/Makefile       |    2 
 arch/powerpc/kernel/vdso32/datapage.S     |   18 -
 arch/powerpc/kernel/vdso32/vdso32.lds.S   |    4 
 arch/powerpc/kernel/vdso64/Makefile       |    2 
 arch/powerpc/kernel/vdso64/datapage.S     |   18 -
 arch/powerpc/kernel/vdso64/vdso64.lds.S   |    4 
 include/asm-powerpc/time.h                |   20 -
 include/asm-powerpc/vdso_datapage.h       |   14 -
 12 files changed, 2 insertions(+), 673 deletions(-)

Index: linux-2.6/arch/powerpc/kernel/asm-offsets.c
===================================================================
--- linux-2.6.orig/arch/powerpc/kernel/asm-offsets.c
+++ linux-2.6/arch/powerpc/kernel/asm-offsets.c
@@ -267,16 +267,7 @@ int main(void)
 #endif /* ! CONFIG_PPC64 */
 
 	/* datapage offsets for use by vdso */
-	DEFINE(CFG_TB_ORIG_STAMP, offsetof(struct vdso_data, tb_orig_stamp));
-	DEFINE(CFG_TB_TICKS_PER_SEC, offsetof(struct vdso_data, tb_ticks_per_sec));
-	DEFINE(CFG_TB_TO_XS, offsetof(struct vdso_data, tb_to_xs));
-	DEFINE(CFG_STAMP_XSEC, offsetof(struct vdso_data, stamp_xsec));
-	DEFINE(CFG_TB_UPDATE_COUNT, offsetof(struct vdso_data, tb_update_count));
-	DEFINE(CFG_TZ_MINUTEWEST, offsetof(struct vdso_data, tz_minuteswest));
-	DEFINE(CFG_TZ_DSTTIME, offsetof(struct vdso_data, tz_dsttime));
 	DEFINE(CFG_SYSCALL_MAP32, offsetof(struct vdso_data, syscall_map_32));
-	DEFINE(WTOM_CLOCK_SEC, offsetof(struct vdso_data, wtom_clock_sec));
-	DEFINE(WTOM_CLOCK_NSEC, offsetof(struct vdso_data, wtom_clock_nsec));
 #ifdef CONFIG_PPC64
 	DEFINE(CFG_SYSCALL_MAP64, offsetof(struct vdso_data, syscall_map_64));
 	DEFINE(TVAL64_TV_SEC, offsetof(struct timeval, tv_sec));
@@ -297,12 +288,6 @@ int main(void)
 	DEFINE(TZONE_TZ_MINWEST, offsetof(struct timezone, tz_minuteswest));
 	DEFINE(TZONE_TZ_DSTTIME, offsetof(struct timezone, tz_dsttime));
 
-	/* Other bits used by the vdso */
-	DEFINE(CLOCK_REALTIME, CLOCK_REALTIME);
-	DEFINE(CLOCK_MONOTONIC, CLOCK_MONOTONIC);
-	DEFINE(NSEC_PER_SEC, NSEC_PER_SEC);
-	DEFINE(CLOCK_REALTIME_RES, TICK_NSEC);
-
 #ifdef CONFIG_BUG
 	DEFINE(BUG_ENTRY_SIZE, sizeof(struct bug_entry));
 #endif
Index: linux-2.6/arch/powerpc/kernel/smp.c
===================================================================
--- linux-2.6.orig/arch/powerpc/kernel/smp.c
+++ linux-2.6/arch/powerpc/kernel/smp.c
@@ -308,8 +308,6 @@ void smp_call_function_interrupt(void)
 	}
 }
 
-extern struct gettimeofday_struct do_gtod;
-
 struct thread_info *current_set[NR_CPUS];
 
 DECLARE_PER_CPU(unsigned int, pvr);
Index: linux-2.6/arch/powerpc/kernel/vdso32/Makefile
===================================================================
--- linux-2.6.orig/arch/powerpc/kernel/vdso32/Makefile
+++ linux-2.6/arch/powerpc/kernel/vdso32/Makefile
@@ -1,7 +1,7 @@
 
 # List of files in the vdso, has to be asm only for now
 
-obj-vdso32 = sigtramp.o gettimeofday.o datapage.o cacheflush.o note.o
+obj-vdso32 = sigtramp.o datapage.o cacheflush.o note.o
 
 # Build rules
 
Index: linux-2.6/arch/powerpc/kernel/vdso32/datapage.S
===================================================================
--- linux-2.6.orig/arch/powerpc/kernel/vdso32/datapage.S
+++ linux-2.6/arch/powerpc/kernel/vdso32/datapage.S
@@ -65,21 +65,3 @@ V_FUNCTION_BEGIN(__kernel_get_syscall_ma
 	blr
   .cfi_endproc
 V_FUNCTION_END(__kernel_get_syscall_map)
-
-/*
- * void unsigned long long  __kernel_get_tbfreq(void);
- *
- * returns the timebase frequency in HZ
- */
-V_FUNCTION_BEGIN(__kernel_get_tbfreq)
-  .cfi_startproc
-	mflr	r12
-  .cfi_register lr,r12
-	bl	__get_datapage@local
-	lwz	r4,(CFG_TB_TICKS_PER_SEC + 4)(r3)
-	lwz	r3,CFG_TB_TICKS_PER_SEC(r3)
-	mtlr	r12
-	crclr	cr0*4+so
-	blr
-  .cfi_endproc
-V_FUNCTION_END(__kernel_get_tbfreq)
Index: linux-2.6/arch/powerpc/kernel/vdso32/gettimeofday.S
===================================================================
--- linux-2.6.orig/arch/powerpc/kernel/vdso32/gettimeofday.S
+++ /dev/null
@@ -1,322 +0,0 @@
-/*
- * Userland implementation of gettimeofday() for 32 bits processes in a
- * ppc64 kernel for use in the vDSO
- *
- * Copyright (C) 2004 Benjamin Herrenschmuidt (benh@kernel.crashing.org,
- *                    IBM Corp.
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License
- * as published by the Free Software Foundation; either version
- * 2 of the License, or (at your option) any later version.
- */
-#include <asm/processor.h>
-#include <asm/ppc_asm.h>
-#include <asm/vdso.h>
-#include <asm/asm-offsets.h>
-#include <asm/unistd.h>
-
-	.text
-/*
- * Exact prototype of gettimeofday
- *
- * int __kernel_gettimeofday(struct timeval *tv, struct timezone *tz);
- *
- */
-V_FUNCTION_BEGIN(__kernel_gettimeofday)
-  .cfi_startproc
-	mflr	r12
-  .cfi_register lr,r12
-
-	mr	r10,r3			/* r10 saves tv */
-	mr	r11,r4			/* r11 saves tz */
-	bl	__get_datapage@local	/* get data page */
-	mr	r9, r3			/* datapage ptr in r9 */
-	bl	__do_get_xsec@local	/* get xsec from tb & kernel */
-	bne-	2f			/* out of line -> do syscall */
-
-	/* seconds are xsec >> 20 */
-	rlwinm	r5,r4,12,20,31
-	rlwimi	r5,r3,12,0,19
-	stw	r5,TVAL32_TV_SEC(r10)
-
-	/* get remaining xsec and convert to usec. we scale
-	 * up remaining xsec by 12 bits and get the top 32 bits
-	 * of the multiplication
-	 */
-	rlwinm	r5,r4,12,0,19
-	lis	r6,1000000@h
-	ori	r6,r6,1000000@l
-	mulhwu	r5,r5,r6
-	stw	r5,TVAL32_TV_USEC(r10)
-
-	cmpli	cr0,r11,0		/* check if tz is NULL */
-	beq	1f
-	lwz	r4,CFG_TZ_MINUTEWEST(r9)/* fill tz */
-	lwz	r5,CFG_TZ_DSTTIME(r9)
-	stw	r4,TZONE_TZ_MINWEST(r11)
-	stw	r5,TZONE_TZ_DSTTIME(r11)
-
-1:	mtlr	r12
-	crclr	cr0*4+so
-	li	r3,0
-	blr
-
-2:
-	mtlr	r12
-	mr	r3,r10
-	mr	r4,r11
-	li	r0,__NR_gettimeofday
-	sc
-	blr
-  .cfi_endproc
-V_FUNCTION_END(__kernel_gettimeofday)
-
-/*
- * Exact prototype of clock_gettime()
- *
- * int __kernel_clock_gettime(clockid_t clock_id, struct timespec *tp);
- *
- */
-V_FUNCTION_BEGIN(__kernel_clock_gettime)
-  .cfi_startproc
-	/* Check for supported clock IDs */
-	cmpli	cr0,r3,CLOCK_REALTIME
-	cmpli	cr1,r3,CLOCK_MONOTONIC
-	cror	cr0*4+eq,cr0*4+eq,cr1*4+eq
-	bne	cr0,99f
-
-	mflr	r12			/* r12 saves lr */
-  .cfi_register lr,r12
-	mr	r10,r3			/* r10 saves id */
-	mr	r11,r4			/* r11 saves tp */
-	bl	__get_datapage@local	/* get data page */
-	mr	r9,r3			/* datapage ptr in r9 */
-	beq	cr1,50f			/* if monotonic -> jump there */
-
-	/*
-	 * CLOCK_REALTIME
-	 */
-
-	bl	__do_get_xsec@local	/* get xsec from tb & kernel */
-	bne-	98f			/* out of line -> do syscall */
-
-	/* seconds are xsec >> 20 */
-	rlwinm	r5,r4,12,20,31
-	rlwimi	r5,r3,12,0,19
-	stw	r5,TSPC32_TV_SEC(r11)
-
-	/* get remaining xsec and convert to nsec. we scale
-	 * up remaining xsec by 12 bits and get the top 32 bits
-	 * of the multiplication, then we multiply by 1000
-	 */
-	rlwinm	r5,r4,12,0,19
-	lis	r6,1000000@h
-	ori	r6,r6,1000000@l
-	mulhwu	r5,r5,r6
-	mulli	r5,r5,1000
-	stw	r5,TSPC32_TV_NSEC(r11)
-	mtlr	r12
-	crclr	cr0*4+so
-	li	r3,0
-	blr
-
-	/*
-	 * CLOCK_MONOTONIC
-	 */
-
-50:	bl	__do_get_xsec@local	/* get xsec from tb & kernel */
-	bne-	98f			/* out of line -> do syscall */
-
-	/* seconds are xsec >> 20 */
-	rlwinm	r6,r4,12,20,31
-	rlwimi	r6,r3,12,0,19
-
-	/* get remaining xsec and convert to nsec. we scale
-	 * up remaining xsec by 12 bits and get the top 32 bits
-	 * of the multiplication, then we multiply by 1000
-	 */
-	rlwinm	r7,r4,12,0,19
-	lis	r5,1000000@h
-	ori	r5,r5,1000000@l
-	mulhwu	r7,r7,r5
-	mulli	r7,r7,1000
-
-	/* now we must fixup using wall to monotonic. We need to snapshot
-	 * that value and do the counter trick again. Fortunately, we still
-	 * have the counter value in r8 that was returned by __do_get_xsec.
-	 * At this point, r6,r7 contain our sec/nsec values, r3,r4 and r5
-	 * can be used
-	 */
-
-	lwz	r3,WTOM_CLOCK_SEC(r9)
-	lwz	r4,WTOM_CLOCK_NSEC(r9)
-
-	/* We now have our result in r3,r4. We create a fake dependency
-	 * on that result and re-check the counter
-	 */
-	or	r5,r4,r3
-	xor	r0,r5,r5
-	add	r9,r9,r0
-#ifdef CONFIG_PPC64
-	lwz	r0,(CFG_TB_UPDATE_COUNT+4)(r9)
-#else
-	lwz	r0,(CFG_TB_UPDATE_COUNT)(r9)
-#endif
-        cmpl    cr0,r8,r0		/* check if updated */
-	bne-	50b
-
-	/* Calculate and store result. Note that this mimmics the C code,
-	 * which may cause funny results if nsec goes negative... is that
-	 * possible at all ?
-	 */
-	add	r3,r3,r6
-	add	r4,r4,r7
-	lis	r5,NSEC_PER_SEC@h
-	ori	r5,r5,NSEC_PER_SEC@l
-	cmpl	cr0,r4,r5
-	cmpli	cr1,r4,0
-	blt	1f
-	subf	r4,r5,r4
-	addi	r3,r3,1
-1:	bge	cr1,1f
-	addi	r3,r3,-1
-	add	r4,r4,r5
-1:	stw	r3,TSPC32_TV_SEC(r11)
-	stw	r4,TSPC32_TV_NSEC(r11)
-
-	mtlr	r12
-	crclr	cr0*4+so
-	li	r3,0
-	blr
-
-	/*
-	 * syscall fallback
-	 */
-98:
-	mtlr	r12
-	mr	r3,r10
-	mr	r4,r11
-99:
-	li	r0,__NR_clock_gettime
-	sc
-	blr
-  .cfi_endproc
-V_FUNCTION_END(__kernel_clock_gettime)
-
-
-/*
- * Exact prototype of clock_getres()
- *
- * int __kernel_clock_getres(clockid_t clock_id, struct timespec *res);
- *
- */
-V_FUNCTION_BEGIN(__kernel_clock_getres)
-  .cfi_startproc
-	/* Check for supported clock IDs */
-	cmpwi	cr0,r3,CLOCK_REALTIME
-	cmpwi	cr1,r3,CLOCK_MONOTONIC
-	cror	cr0*4+eq,cr0*4+eq,cr1*4+eq
-	bne	cr0,99f
-
-	li	r3,0
-	cmpli	cr0,r4,0
-	crclr	cr0*4+so
-	beqlr
-	lis	r5,CLOCK_REALTIME_RES@h
-	ori	r5,r5,CLOCK_REALTIME_RES@l
-	stw	r3,TSPC32_TV_SEC(r4)
-	stw	r5,TSPC32_TV_NSEC(r4)
-	blr
-
-	/*
-	 * syscall fallback
-	 */
-99:
-	li	r0,__NR_clock_getres
-	sc
-	blr
-  .cfi_endproc
-V_FUNCTION_END(__kernel_clock_getres)
-
-
-/*
- * This is the core of gettimeofday() & friends, it returns the xsec
- * value in r3 & r4 and expects the datapage ptr (non clobbered)
- * in r9. clobbers r0,r4,r5,r6,r7,r8.
- * When returning, r8 contains the counter value that can be reused
- * by the monotonic clock implementation
- */
-__do_get_xsec:
-  .cfi_startproc
-	/* Check for update count & load values. We use the low
-	 * order 32 bits of the update count
-	 */
-#ifdef CONFIG_PPC64
-1:	lwz	r8,(CFG_TB_UPDATE_COUNT+4)(r9)
-#else
-1:	lwz	r8,(CFG_TB_UPDATE_COUNT)(r9)
-#endif
-	andi.	r0,r8,1			/* pending update ? loop */
-	bne-	1b
-	xor	r0,r8,r8		/* create dependency */
-	add	r9,r9,r0
-
-	/* Load orig stamp (offset to TB) */
-	lwz	r5,CFG_TB_ORIG_STAMP(r9)
-	lwz	r6,(CFG_TB_ORIG_STAMP+4)(r9)
-
-	/* Get a stable TB value */
-2:	mftbu	r3
-	mftbl	r4
-	mftbu	r0
-	cmpl	cr0,r3,r0
-	bne-	2b
-
-	/* Substract tb orig stamp. If the high part is non-zero, we jump to
-	 * the slow path which call the syscall.
-	 * If it's ok, then we have our 32 bits tb_ticks value in r7
-	 */
-	subfc	r7,r6,r4
-	subfe.	r0,r5,r3
-	bne-	3f
-
-	/* Load scale factor & do multiplication */
-	lwz	r5,CFG_TB_TO_XS(r9)	/* load values */
-	lwz	r6,(CFG_TB_TO_XS+4)(r9)
-	mulhwu	r4,r7,r5
-	mulhwu	r6,r7,r6
-	mullw	r0,r7,r5
-	addc	r6,r6,r0
-
-	/* At this point, we have the scaled xsec value in r4 + XER:CA
-	 * we load & add the stamp since epoch
-	 */
-	lwz	r5,CFG_STAMP_XSEC(r9)
-	lwz	r6,(CFG_STAMP_XSEC+4)(r9)
-	adde	r4,r4,r6
-	addze	r3,r5
-
-	/* We now have our result in r3,r4. We create a fake dependency
-	 * on that result and re-check the counter
-	 */
-	or	r6,r4,r3
-	xor	r0,r6,r6
-	add	r9,r9,r0
-#ifdef CONFIG_PPC64
-	lwz	r0,(CFG_TB_UPDATE_COUNT+4)(r9)
-#else
-	lwz	r0,(CFG_TB_UPDATE_COUNT)(r9)
-#endif
-        cmpl    cr0,r8,r0		/* check if updated */
-	bne-	1b
-
-	/* Warning ! The caller expects CR:EQ to be set to indicate a
-	 * successful calculation (so it won't fallback to the syscall
-	 * method). We have overriden that CR bit in the counter check,
-	 * but fortunately, the loop exit condition _is_ CR:EQ set, so
-	 * we can exit safely here. If you change this code, be careful
-	 * of that side effect.
-	 */
-3:	blr
-  .cfi_endproc
Index: linux-2.6/arch/powerpc/kernel/vdso32/vdso32.lds.S
===================================================================
--- linux-2.6.orig/arch/powerpc/kernel/vdso32/vdso32.lds.S
+++ linux-2.6/arch/powerpc/kernel/vdso32/vdso32.lds.S
@@ -117,10 +117,6 @@ VERSION
     global:
 	__kernel_datapage_offset; /* Has to be there for the kernel to find */
 	__kernel_get_syscall_map;
-	__kernel_gettimeofday;
-	__kernel_clock_gettime;
-	__kernel_clock_getres;
-	__kernel_get_tbfreq;
 	__kernel_sync_dicache;
 	__kernel_sync_dicache_p5;
 	__kernel_sigtramp32;
Index: linux-2.6/arch/powerpc/kernel/vdso64/Makefile
===================================================================
--- linux-2.6.orig/arch/powerpc/kernel/vdso64/Makefile
+++ linux-2.6/arch/powerpc/kernel/vdso64/Makefile
@@ -1,6 +1,6 @@
 # List of files in the vdso, has to be asm only for now
 
-obj-vdso64 = sigtramp.o gettimeofday.o datapage.o cacheflush.o note.o
+obj-vdso64 = sigtramp.o datapage.o cacheflush.o note.o
 
 # Build rules
 
Index: linux-2.6/arch/powerpc/kernel/vdso64/datapage.S
===================================================================
--- linux-2.6.orig/arch/powerpc/kernel/vdso64/datapage.S
+++ linux-2.6/arch/powerpc/kernel/vdso64/datapage.S
@@ -65,21 +65,3 @@ V_FUNCTION_BEGIN(__kernel_get_syscall_ma
 	blr
   .cfi_endproc
 V_FUNCTION_END(__kernel_get_syscall_map)
-
-
-/*
- * void unsigned long  __kernel_get_tbfreq(void);
- *
- * returns the timebase frequency in HZ
- */
-V_FUNCTION_BEGIN(__kernel_get_tbfreq)
-  .cfi_startproc
-	mflr	r12
-  .cfi_register lr,r12
-	bl	V_LOCAL_FUNC(__get_datapage)
-	ld	r3,CFG_TB_TICKS_PER_SEC(r3)
-	mtlr	r12
-	crclr	cr0*4+so
-	blr
-  .cfi_endproc
-V_FUNCTION_END(__kernel_get_tbfreq)
Index: linux-2.6/arch/powerpc/kernel/vdso64/gettimeofday.S
===================================================================
--- linux-2.6.orig/arch/powerpc/kernel/vdso64/gettimeofday.S
+++ /dev/null
@@ -1,254 +0,0 @@
-
-	/*
- * Userland implementation of gettimeofday() for 64 bits processes in a
- * ppc64 kernel for use in the vDSO
- *
- * Copyright (C) 2004 Benjamin Herrenschmuidt (benh@kernel.crashing.org),
- *                    IBM Corp.
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License
- * as published by the Free Software Foundation; either version
- * 2 of the License, or (at your option) any later version.
- */
-#include <asm/processor.h>
-#include <asm/ppc_asm.h>
-#include <asm/vdso.h>
-#include <asm/asm-offsets.h>
-#include <asm/unistd.h>
-
-	.text
-/*
- * Exact prototype of gettimeofday
- *
- * int __kernel_gettimeofday(struct timeval *tv, struct timezone *tz);
- *
- */
-V_FUNCTION_BEGIN(__kernel_gettimeofday)
-  .cfi_startproc
-	mflr	r12
-  .cfi_register lr,r12
-
-	mr	r11,r3			/* r11 holds tv */
-	mr	r10,r4			/* r10 holds tz */
-	bl	V_LOCAL_FUNC(__get_datapage)	/* get data page */
-	bl	V_LOCAL_FUNC(__do_get_xsec)	/* get xsec from tb & kernel */
-	lis     r7,15			/* r7 = 1000000 = USEC_PER_SEC */
-	ori     r7,r7,16960
-	rldicl  r5,r4,44,20		/* r5 = sec = xsec / XSEC_PER_SEC */
-	rldicr  r6,r5,20,43		/* r6 = sec * XSEC_PER_SEC */
-	std	r5,TVAL64_TV_SEC(r11)	/* store sec in tv */
-	subf	r0,r6,r4		/* r0 = xsec = (xsec - r6) */
-	mulld   r0,r0,r7		/* usec = (xsec * USEC_PER_SEC) /
-					 * XSEC_PER_SEC
-					 */
-	rldicl  r0,r0,44,20
-	cmpldi	cr0,r10,0		/* check if tz is NULL */
-	std	r0,TVAL64_TV_USEC(r11)	/* store usec in tv */
-	beq	1f
-	lwz	r4,CFG_TZ_MINUTEWEST(r3)/* fill tz */
-	lwz	r5,CFG_TZ_DSTTIME(r3)
-	stw	r4,TZONE_TZ_MINWEST(r10)
-	stw	r5,TZONE_TZ_DSTTIME(r10)
-1:	mtlr	r12
-	crclr	cr0*4+so
-	li	r3,0			/* always success */
-	blr
-  .cfi_endproc
-V_FUNCTION_END(__kernel_gettimeofday)
-
-
-/*
- * Exact prototype of clock_gettime()
- *
- * int __kernel_clock_gettime(clockid_t clock_id, struct timespec *tp);
- *
- */
-V_FUNCTION_BEGIN(__kernel_clock_gettime)
-  .cfi_startproc
-	/* Check for supported clock IDs */
-	cmpwi	cr0,r3,CLOCK_REALTIME
-	cmpwi	cr1,r3,CLOCK_MONOTONIC
-	cror	cr0*4+eq,cr0*4+eq,cr1*4+eq
-	bne	cr0,99f
-
-	mflr	r12			/* r12 saves lr */
-  .cfi_register lr,r12
-	mr	r10,r3			/* r10 saves id */
-	mr	r11,r4			/* r11 saves tp */
-	bl	V_LOCAL_FUNC(__get_datapage)	/* get data page */
-	beq	cr1,50f			/* if monotonic -> jump there */
-
-	/*
-	 * CLOCK_REALTIME
-	 */
-
-	bl	V_LOCAL_FUNC(__do_get_xsec)	/* get xsec from tb & kernel */
-
-	lis     r7,15			/* r7 = 1000000 = USEC_PER_SEC */
-	ori     r7,r7,16960
-	rldicl  r5,r4,44,20		/* r5 = sec = xsec / XSEC_PER_SEC */
-	rldicr  r6,r5,20,43		/* r6 = sec * XSEC_PER_SEC */
-	std	r5,TSPC64_TV_SEC(r11)	/* store sec in tv */
-	subf	r0,r6,r4		/* r0 = xsec = (xsec - r6) */
-	mulld   r0,r0,r7		/* usec = (xsec * USEC_PER_SEC) /
-					 * XSEC_PER_SEC
-					 */
-	rldicl  r0,r0,44,20
-	mulli	r0,r0,1000		/* nsec = usec * 1000 */
-	std	r0,TSPC64_TV_NSEC(r11)	/* store nsec in tp */
-
-	mtlr	r12
-	crclr	cr0*4+so
-	li	r3,0
-	blr
-
-	/*
-	 * CLOCK_MONOTONIC
-	 */
-
-50:	bl	V_LOCAL_FUNC(__do_get_xsec)	/* get xsec from tb & kernel */
-
-	lis     r7,15			/* r7 = 1000000 = USEC_PER_SEC */
-	ori     r7,r7,16960
-	rldicl  r5,r4,44,20		/* r5 = sec = xsec / XSEC_PER_SEC */
-	rldicr  r6,r5,20,43		/* r6 = sec * XSEC_PER_SEC */
-	subf	r0,r6,r4		/* r0 = xsec = (xsec - r6) */
-	mulld   r0,r0,r7		/* usec = (xsec * USEC_PER_SEC) /
-					 * XSEC_PER_SEC
-					 */
-	rldicl  r6,r0,44,20
-	mulli	r6,r6,1000		/* nsec = usec * 1000 */
-
-	/* now we must fixup using wall to monotonic. We need to snapshot
-	 * that value and do the counter trick again. Fortunately, we still
-	 * have the counter value in r8 that was returned by __do_get_xsec.
-	 * At this point, r5,r6 contain our sec/nsec values.
-	 * can be used
-	 */
-
-	lwa	r4,WTOM_CLOCK_SEC(r3)
-	lwa	r7,WTOM_CLOCK_NSEC(r3)
-
-	/* We now have our result in r4,r7. We create a fake dependency
-	 * on that result and re-check the counter
-	 */
-	or	r9,r4,r7
-	xor	r0,r9,r9
-	add	r3,r3,r0
-	ld	r0,CFG_TB_UPDATE_COUNT(r3)
-        cmpld   cr0,r0,r8		/* check if updated */
-	bne-	50b
-
-	/* Calculate and store result. Note that this mimmics the C code,
-	 * which may cause funny results if nsec goes negative... is that
-	 * possible at all ?
-	 */
-	add	r4,r4,r5
-	add	r7,r7,r6
-	lis	r9,NSEC_PER_SEC@h
-	ori	r9,r9,NSEC_PER_SEC@l
-	cmpl	cr0,r7,r9
-	cmpli	cr1,r7,0
-	blt	1f
-	subf	r7,r9,r7
-	addi	r4,r4,1
-1:	bge	cr1,1f
-	addi	r4,r4,-1
-	add	r7,r7,r9
-1:	std	r4,TSPC64_TV_SEC(r11)
-	std	r7,TSPC64_TV_NSEC(r11)
-
-	mtlr	r12
-	crclr	cr0*4+so
-	li	r3,0
-	blr
-
-	/*
-	 * syscall fallback
-	 */
-98:
-	mtlr	r12
-	mr	r3,r10
-	mr	r4,r11
-99:
-	li	r0,__NR_clock_gettime
-	sc
-	blr
-  .cfi_endproc
-V_FUNCTION_END(__kernel_clock_gettime)
-
-
-/*
- * Exact prototype of clock_getres()
- *
- * int __kernel_clock_getres(clockid_t clock_id, struct timespec *res);
- *
- */
-V_FUNCTION_BEGIN(__kernel_clock_getres)
-  .cfi_startproc
-	/* Check for supported clock IDs */
-	cmpwi	cr0,r3,CLOCK_REALTIME
-	cmpwi	cr1,r3,CLOCK_MONOTONIC
-	cror	cr0*4+eq,cr0*4+eq,cr1*4+eq
-	bne	cr0,99f
-
-	li	r3,0
-	cmpli	cr0,r4,0
-	crclr	cr0*4+so
-	beqlr
-	lis	r5,CLOCK_REALTIME_RES@h
-	ori	r5,r5,CLOCK_REALTIME_RES@l
-	std	r3,TSPC64_TV_SEC(r4)
-	std	r5,TSPC64_TV_NSEC(r4)
-	blr
-
-	/*
-	 * syscall fallback
-	 */
-99:
-	li	r0,__NR_clock_getres
-	sc
-	blr
-  .cfi_endproc
-V_FUNCTION_END(__kernel_clock_getres)
-
-
-/*
- * This is the core of gettimeofday(), it returns the xsec
- * value in r4 and expects the datapage ptr (non clobbered)
- * in r3. clobbers r0,r4,r5,r6,r7,r8
- * When returning, r8 contains the counter value that can be reused
- */
-V_FUNCTION_BEGIN(__do_get_xsec)
-  .cfi_startproc
-	/* check for update count & load values */
-1:	ld	r8,CFG_TB_UPDATE_COUNT(r3)
-	andi.	r0,r8,1			/* pending update ? loop */
-	bne-	1b
-	xor	r0,r8,r8		/* create dependency */
-	add	r3,r3,r0
-
-	/* Get TB & offset it. We use the MFTB macro which will generate
-	 * workaround code for Cell.
-	 */
-	MFTB(r7)
-	ld	r9,CFG_TB_ORIG_STAMP(r3)
-	subf	r7,r9,r7
-
-	/* Scale result */
-	ld	r5,CFG_TB_TO_XS(r3)
-	mulhdu	r7,r7,r5
-
-	/* Add stamp since epoch */
-	ld	r6,CFG_STAMP_XSEC(r3)
-	add	r4,r6,r7
-
-	xor	r0,r4,r4
-	add	r3,r3,r0
-	ld	r0,CFG_TB_UPDATE_COUNT(r3)
-        cmpld   cr0,r0,r8		/* check if updated */
-	bne-	1b
-	blr
-  .cfi_endproc
-V_FUNCTION_END(__do_get_xsec)
Index: linux-2.6/arch/powerpc/kernel/vdso64/vdso64.lds.S
===================================================================
--- linux-2.6.orig/arch/powerpc/kernel/vdso64/vdso64.lds.S
+++ linux-2.6/arch/powerpc/kernel/vdso64/vdso64.lds.S
@@ -115,10 +115,6 @@ VERSION
     global:
 	__kernel_datapage_offset; /* Has to be there for the kernel to find */
 	__kernel_get_syscall_map;
-    	__kernel_gettimeofday;
-	__kernel_clock_gettime;
-	__kernel_clock_getres;
-	__kernel_get_tbfreq;
 	__kernel_sync_dicache;
 	__kernel_sync_dicache_p5;
 	__kernel_sigtramp_rt64;
Index: linux-2.6/include/asm-powerpc/time.h
===================================================================
--- linux-2.6.orig/include/asm-powerpc/time.h
+++ linux-2.6/include/asm-powerpc/time.h
@@ -48,26 +48,6 @@ extern unsigned long ppc_proc_freq;
 extern unsigned long ppc_tb_freq;
 #define DEFAULT_TB_FREQ		125000000UL
 
-/*
- * By putting all of this stuff into a single struct we 
- * reduce the number of cache lines touched by do_gettimeofday.
- * Both by collecting all of the data in one cache line and
- * by touching only one TOC entry on ppc64.
- */
-struct gettimeofday_vars {
-	u64 tb_to_xs;
-	u64 stamp_xsec;
-	u64 tb_orig_stamp;
-};
-
-struct gettimeofday_struct {
-	unsigned long tb_ticks_per_sec;
-	struct gettimeofday_vars vars[2];
-	struct gettimeofday_vars * volatile varp;
-	unsigned      var_idx;
-	unsigned      tb_to_us;
-};
-
 struct div_result {
 	u64 result_high;
 	u64 result_low;
Index: linux-2.6/include/asm-powerpc/vdso_datapage.h
===================================================================
--- linux-2.6.orig/include/asm-powerpc/vdso_datapage.h
+++ linux-2.6/include/asm-powerpc/vdso_datapage.h
@@ -74,11 +74,6 @@ struct vdso_data {
 	__u32 icache_size;		/* L1 i-cache size		0x68 */
 	__u32 icache_line_size;		/* L1 i-cache line size		0x6C */
 
-	/* those additional ones don't have to be located anywhere
-	 * special as they were not part of the original systemcfg
-	 */
-	__s32 wtom_clock_sec;			/* Wall to monotonic clock */
-	__s32 wtom_clock_nsec;
    	__u32 syscall_map_64[SYSCALL_MAP_SIZE]; /* map of syscalls  */
    	__u32 syscall_map_32[SYSCALL_MAP_SIZE]; /* map of syscalls */
 };
@@ -89,15 +84,6 @@ struct vdso_data {
  * And here is the simpler 32 bits version
  */
 struct vdso_data {
-	__u64 tb_orig_stamp;		/* Timebase at boot		0x30 */
-	__u64 tb_ticks_per_sec;		/* Timebase tics / sec		0x38 */
-	__u64 tb_to_xs;			/* Inverse of TB to 2^20	0x40 */
-	__u64 stamp_xsec;		/*				0x48 */
-	__u32 tb_update_count;		/* Timebase atomicity ctr	0x50 */
-	__u32 tz_minuteswest;		/* Minutes west of Greenwich	0x58 */
-	__u32 tz_dsttime;		/* Type of dst correction	0x5C */
-	__s32 wtom_clock_sec;			/* Wall to monotonic clock */
-	__s32 wtom_clock_nsec;
    	__u32 syscall_map_32[SYSCALL_MAP_SIZE]; /* map of syscalls */
 };
 

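The removed assembly works in "xsec" units of 2^-20 seconds ("seconds are
xsec >> 20" in its comments). A minimal C sketch of the same split into seconds
and microseconds, following the formula in the 64-bit path's comments
(usec = (xsec * USEC_PER_SEC) / XSEC_PER_SEC); struct tv_sketch and
xsec_to_timeval() are hypothetical names used only here:

```c
#include <assert.h>
#include <stdint.h>

#define XSEC_SHIFT    20
#define XSEC_PER_SEC  (1ULL << XSEC_SHIFT)
#define USEC_PER_SEC  1000000ULL

/* Hypothetical result type for this sketch. */
struct tv_sketch {
	uint64_t sec;
	uint64_t usec;
};

/* Split an xsec count into whole seconds and microseconds. */
struct tv_sketch xsec_to_timeval(uint64_t xsec)
{
	struct tv_sketch tv;
	uint64_t frac = xsec & (XSEC_PER_SEC - 1);      /* xsec left over after whole seconds */

	tv.sec = xsec >> XSEC_SHIFT;                    /* sec = xsec / XSEC_PER_SEC */
	tv.usec = (frac * USEC_PER_SEC) >> XSEC_SHIFT;  /* scale the fraction to usec */
	assert(tv.usec < USEC_PER_SEC);
	return tv;
}
```

The nanosecond value used by the clock_gettime() paths is then simply
usec * 1000, which is why those paths never resolve finer than a microsecond.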

[-- Attachment #3: gtod-persistent-clock-support-ppc.patch --]
[-- Type: text/x-patch, Size: 2969 bytes --]

Here's the read_persistent_clock() implementation for PowerPC.

I'm deliberately renaming get_boot_time() even though it's not static, as it
doesn't get called from anywhere else.

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>

---
Have almost forgotten about this one... :-)
This patch hasn't received much testing though -- at least it doesn't break
without RTC... ;-)

 arch/powerpc/kernel/time.c |   62 ++++++++++++++++++++-------------------------
 1 files changed, 28 insertions(+), 34 deletions(-)

Index: linux-2.6/arch/powerpc/kernel/time.c
===================================================================
--- linux-2.6.orig/arch/powerpc/kernel/time.c
+++ linux-2.6/arch/powerpc/kernel/time.c
@@ -762,31 +762,46 @@ void __init generic_calibrate_decr(void)
 #endif
 }
 
-unsigned long get_boot_time(void)
+unsigned long read_persistent_clock(void)
 {
-	struct rtc_time tm;
+	unsigned long time = 0;
+	static int first = 1;
+
+	if (first && ppc_md.time_init) {
+		timezone_offset = ppc_md.time_init();
+
+		/* If platform provided a timezone (pmac), we correct the time */
+		if (timezone_offset) {
+			sys_tz.tz_minuteswest = -timezone_offset / 60;
+			sys_tz.tz_dsttime = 0;
+		}
+	}
 
 	if (ppc_md.get_boot_time)
-		return ppc_md.get_boot_time();
-	if (!ppc_md.get_rtc_time)
-		return 0;
-	ppc_md.get_rtc_time(&tm);
-	return mktime(tm.tm_year+1900, tm.tm_mon+1, tm.tm_mday,
-		      tm.tm_hour, tm.tm_min, tm.tm_sec);
+		time = ppc_md.get_boot_time();
+	else if (ppc_md.get_rtc_time) {
+		struct rtc_time tm;
+
+		ppc_md.get_rtc_time(&tm);
+		time = mktime(tm.tm_year+1900, tm.tm_mon+1, tm.tm_mday,
+			      tm.tm_hour, tm.tm_min, tm.tm_sec);
+	}
+	time -= timezone_offset;
+
+	if (first) {
+		last_rtc_update = time;
+		first = 0;
+	}
+	return time;
 }
 
 /* This function is only called on the boot processor */
 void __init time_init(void)
 {
-	unsigned long flags;
-	unsigned long tm = 0;
 	struct div_result res;
 	u64 scale, x;
 	unsigned shift;
 
-        if (ppc_md.time_init != NULL)
-                timezone_offset = ppc_md.time_init();
-
 	if (__USE_RTC()) {
 		/* 601 processor: dec counts down by 128 every 128ns */
 		ppc_tb_freq = 1000000000;
@@ -860,27 +875,6 @@ void __init time_init(void)
 	tb_to_ns_scale = scale;
 	tb_to_ns_shift = shift;
 
-	tm = get_boot_time();
-
-	write_seqlock_irqsave(&xtime_lock, flags);
-
-	/* If platform provided a timezone (pmac), we correct the time */
-        if (timezone_offset) {
-		sys_tz.tz_minuteswest = -timezone_offset / 60;
-		sys_tz.tz_dsttime = 0;
-		tm -= timezone_offset;
-        }
-
-	xtime.tv_sec = tm;
-	xtime.tv_nsec = 0;
-
-	time_freq = 0;
-
-	last_rtc_update = xtime.tv_sec;
-	set_normalized_timespec(&wall_to_monotonic,
-	                        -xtime.tv_sec, -xtime.tv_nsec);
-	write_sequnlock_irqrestore(&xtime_lock, flags);
-
 #ifdef CONFIG_GENERIC_CLOCKEVENTS
 	decrementer_clockevent.mult = div_sc(ppc_tb_freq, NSEC_PER_SEC,
 					     decrementer_clockevent.shift);
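
The timezone handling that this patch moves into read_persistent_clock() can be
sketched in user space as follows. This assumes timezone_offset is the number
of seconds by which the RTC's local time is ahead of UTC; struct tz_sketch and
rtc_local_to_utc() are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two sys_tz fields the patch touches. */
struct tz_sketch {
	int minuteswest;
	int dsttime;
};

/*
 * Convert an RTC reading kept in local time to UTC seconds and, if the
 * platform reported an offset, record it in minutes west of Greenwich.
 */
unsigned long rtc_local_to_utc(unsigned long rtc_seconds, long timezone_offset,
			       struct tz_sketch *tz)
{
	assert(tz != NULL);
	if (timezone_offset) {
		tz->minuteswest = (int)(-timezone_offset / 60);
		tz->dsttime = 0;
	}
	/* A zero offset leaves the reading untouched. */
	return rtc_seconds - (unsigned long)timezone_offset;
}
```

With an offset of 3600 (one hour east), a reading of 1000000 becomes 996400 and
minuteswest becomes -60.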


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Tickless Hz/hrtimers/etc. on PowerPC
  2007-07-11 22:15   ` Matt Sealey
  2007-07-12  6:41     ` Tony Breeds
@ 2007-07-12 15:49     ` Michael Neuling
  2007-07-12 20:04       ` Matt Sealey
  2007-07-12 16:32     ` Sergei Shtylyov
  2 siblings, 1 reply; 19+ messages in thread
From: Michael Neuling @ 2007-07-12 15:49 UTC (permalink / raw)
  To: Matt Sealey; +Cc: ppc-dev

> Okay.
> 
> What I didn't want to do is spend a day sifting some other development
> tree picking out what I think might be possibly sort of the right patches
> for it.
> 
> I'd get it wrong because having not worked on it, I don't know what I am
> even looking for.
> 
> And I don't want to run -rt or wireless-dev for the benefit of a single
> feature. What I am after is something like Ingo Molnar throws out..
> single patches done the old way, not git trees. It's so much easier to
> handle and integrate for example into a Gentoo ebuild or to make a
> tarball of accumulated patches from a certain release kernel.

I'm sooo with ya.  I like my patches alphabetised, but no one ever does
it for me.

:-)

Mikey

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Tickless Hz/hrtimers/etc. on PowerPC
  2007-07-11 22:15   ` Matt Sealey
  2007-07-12  6:41     ` Tony Breeds
  2007-07-12 15:49     ` Michael Neuling
@ 2007-07-12 16:32     ` Sergei Shtylyov
  2007-07-12 20:08       ` Matt Sealey
  2 siblings, 1 reply; 19+ messages in thread
From: Sergei Shtylyov @ 2007-07-12 16:32 UTC (permalink / raw)
  To: Matt Sealey; +Cc: ppc-dev, Michael Neuling

Hello.

Matt Sealey wrote:

> And I don't want to run -rt or wireless-dev for the benefit of a single

    BTW, the latest -rt has also been released in broken-out form (at last!),
which has a "PPC gtod and highres support" section in its series file.

> feature. What I am after is something like Ingo Molnar throws out..

    One 1.5 Megabyte patch?! Bleh... thank goodness he (or Thomas) has 
finally changed his mind about this. :-)

> single patches done the old way, not git trees. It's so much easier to
> handle and integrate for example into a Gentoo ebuild or to make a
> tarball of accumulated patches from a certain release kernel.

    This can also become a nightmare if you only intend to pick up some later
fixes.

WBR, Sergei

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Tickless Hz/hrtimers/etc. on PowerPC
  2007-07-12 14:11   ` Sergei Shtylyov
@ 2007-07-12 16:41     ` Sergei Shtylyov
  2007-07-13  8:49     ` Domen Puncer
  1 sibling, 0 replies; 19+ messages in thread
From: Sergei Shtylyov @ 2007-07-12 16:41 UTC (permalink / raw)
  To: Matt Sealey; +Cc: linuxppc-dev, Domen Puncer

[-- Attachment #1: Type: text/plain, Size: 691 bytes --]

Sergei Shtylyov wrote:
> Hello.
> 
> Domen Puncer wrote:
> 
>>> Does anyone have the definitive patchset to enable the tickless hz,
>>> some kind of hrtimer and the other related improvements in the
>>> PowerPC tree?
> 
> 
>> I use attached patches for tickless.
>> Order in which they're applied:

>> PowerPC_GENERIC_CLOCKEVENTS.patch

    Argh! I had forgotten to attach a fixlet for this one, which somewhat
adjusts the code for the classic PPC decrementer.

[...]

WBR, Sergei

> PS: All attached patches are against 2.6.21-rt2 -- fitting them into the 
> current (or whatever) version of the kernel is left as an exercise to 
> the readers. ;-)

    This one is for 2.6.21-rt2 as well.

[-- Attachment #2: PowerPC-fix-clockevents-for-classic-CPU.patch --]
[-- Type: text/x-patch, Size: 2054 bytes --]

Unconditionally set the decrementer to the maximum positive value before
calling an event handler on all "classic" PPC CPUs (although this is only
necessary to clear the interrupt on POWER4+, I've been asked to do it this way)
-- otherwise it wouldn't be done for an offline CPU in periodic mode, since the
event reprogramming has been delegated to the timer subsystem.
Also, as the classic decrementer doesn't have a periodic mode, make the
set_mode() method completely empty for this case.
While at it, add a switch case for CLOCK_EVT_MODE_RESUME to hush the warning.

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>

 arch/powerpc/kernel/time.c |   15 +++++++--------
 1 files changed, 7 insertions(+), 8 deletions(-)

Index: linux-2.6/arch/powerpc/kernel/time.c
===================================================================
--- linux-2.6.orig/arch/powerpc/kernel/time.c
+++ linux-2.6/arch/powerpc/kernel/time.c
@@ -166,11 +166,14 @@ static void decrementer_set_mode(enum	cl
 	case CLOCK_EVT_MODE_SHUTDOWN:
 		tcr &= ~TCR_DIE;
 		break;
+	case CLOCK_EVT_MODE_RESUME:
+		break;
 	}
 	mtspr(SPRN_TCR, tcr);
-#endif
+
 	if (mode == CLOCK_EVT_MODE_PERIODIC)
 		decrementer_set_next_event(tb_ticks_per_jiffy, dev);
+#endif
 }
 
 static struct clock_event_device decrementer_clockevent = {
@@ -549,16 +552,12 @@ void timer_interrupt(struct pt_regs * re
 	irq_enter();
 
 #ifdef CONFIG_GENERIC_CLOCKEVENTS
-#ifdef CONFIG_PPC_MULTIPLATFORM
+#if !defined(CONFIG_40x) && !defined(CONFIG_BOOKE)
 	/*
 	 * We must write a positive value to the decrementer to clear
-	 * the interrupt on the IBM 970 CPU series.  In periodic mode,
-	 * this happens when the decrementer gets reloaded later, but
-	 * in one-shot mode, we have to do it here since an event handler
-	 * may skip loading the new value...
+	 * the interrupt on POWER4+ compatible CPUs.
 	 */
-	if (per_cpu(decrementers, cpu).mode != CLOCK_EVT_MODE_PERIODIC)
-		set_dec(DECREMENTER_MAX);
+	set_dec(DECREMENTER_MAX);
 #endif
 	/*
 	 * We can't disable the decrementer, so in the period between
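
Why the unconditional reload matters can be modelled in a few lines, assuming
(as the patch comment states for POWER4+ compatible CPUs) that the decrementer
exception stays asserted until a positive value is written. DECREMENTER_MAX is
assumed here to be the largest positive 32-bit count, and the function names
are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value: largest positive 32-bit decrementer count. */
#define DECREMENTER_MAX 0x7fffffff

/* Model: the exception is asserted while the DEC sign bit is set. */
bool dec_irq_asserted(int32_t dec)
{
	return dec < 0;
}

/* Hypothetical handler entry: reload first, then run the event handler. */
int32_t sketch_timer_interrupt(int32_t dec)
{
	assert(dec_irq_asserted(dec));  /* entered only after DEC went negative */
	dec = DECREMENTER_MAX;          /* unconditional reload, as in the patch */
	/* An event handler would run here and may or may not reprogram DEC. */
	return dec;
}
```

In this model the interrupt is deasserted on return even if the event handler
never reprograms the decrementer, which is the offline-CPU case described in
the patch description.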

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Tickless Hz/hrtimers/etc. on PowerPC
  2007-07-12 15:49     ` Michael Neuling
@ 2007-07-12 20:04       ` Matt Sealey
  2007-07-12 20:12         ` Geert Uytterhoeven
  2007-07-13  8:34         ` Benjamin Herrenschmidt
  0 siblings, 2 replies; 19+ messages in thread
From: Matt Sealey @ 2007-07-12 20:04 UTC (permalink / raw)
  To: Michael Neuling; +Cc: ppc-dev

Michael Neuling wrote:
>> Okay.
>>
>> What I didn't want to do is spend a day sifting some other development
>> tree picking out what I think might be possibly sort of the right patches
>> for it.
>>
>> I'd get it wrong because having not worked on it, I don't know what I am
>> even looking for.
>>
>> And I don't want to run -rt or wireless-dev for the benefit of a single
>> feature. What I am after is something like Ingo Molnar throws out..
>> single patches done the old way, not git trees. It's so much easier to
>> handle and integrate for example into a Gentoo ebuild or to make a
>> tarball of accumulated patches from a certain release kernel.
> 
> I'm sooo with ya.  I like my patches alphabetised, but no one ever does
> it for me.

Well, my only requirement is numbering them so they patch in order; there
are some cute little requirements on that, but I'd rather patch the bare
minimum and bring in as few quirks and new features as possible than just
grab an entire -rt tree with what amounts to 60 or 70 individual patches
and start renumbering them.

We already have some ~20 for Efika support on top of 2.6.22 including
minor bugfixes and stuff, and the Gentoo genpatches stuff. I really want
to get a good start on CFS, hrtimers, dynticks and so on though and see
if we can push it to users and get out some decent testing and bug
reports. I think it will help everyone if it is not just a feature which
hits mainline after 6 months through supposed maturity (when a lack of
bug reports may well also be down to lack of interest).

I think the Efika, as a low-power and fairly average performance board,
would benefit (and does benefit!) from features like dynticks. The 5200B
has a bunch of hrtimers, and SLUB gave me some insane speed improvement,
exactly as the best benchmarks predicted (I also got the same under VMware
on my x86 laptop, never going back to SLAB now!). The more processor
time we can eke out, and the less unnecessary work the processor does
on housekeeping, the better the board will run for users. Everyone wins :)

-- 
Matt Sealey <matt@genesi-usa.com>
Genesi, Manager, Developer Relations

* Re: Tickless Hz/hrtimers/etc. on PowerPC
  2007-07-12 16:32     ` Sergei Shtylyov
@ 2007-07-12 20:08       ` Matt Sealey
  0 siblings, 0 replies; 19+ messages in thread
From: Matt Sealey @ 2007-07-12 20:08 UTC (permalink / raw)
  To: Sergei Shtylyov; +Cc: ppc-dev, Michael Neuling


Sergei Shtylyov wrote:
> Hello.
> 
> Matt Sealey wrote:
> 
>> And I don't want to run -rt or wireless-dev for the benefit of a single
> 
>    BTW, the latest -rt has also been released in broken-out form
> (at last!), which has a "PPC gtod and highres support" section in its
> series file.

Hm okay.

>> feature. What I am after is something like Ingo Molnar throws out..
> 
>    One 1.5 Megabyte patch?! Bleh... thank goodness he (or Thomas) has
> finally changed his mind about this. :-)

Good point, but at least it was nice and self-contained and not mindlessly
interdependent on other minor fixes. I think if you are going to patch in
a feature, that feature should be a feature. Not 8 bugfixes and 8 separate
minifeatures which cooperate to bring in a whole, especially as a lot of
them are so interdependent that they would not have any effect or maybe
even break things on their own :)

>> single patches done the old way, not git trees. It's so much easier to
>> handle and integrate for example into a Gentoo ebuild or to make a
>> tarball of accumulated patches from a certain release kernel.
> 
>    This also can become a nightmare if you intend to pick up some later
> fixes only.

Indeed, but you trade off one for the other in everything :)

-- 
Matt Sealey <matt@genesi-usa.com>
Genesi, Manager, Developer Relations

* Re: Tickless Hz/hrtimers/etc. on PowerPC
  2007-07-12 20:04       ` Matt Sealey
@ 2007-07-12 20:12         ` Geert Uytterhoeven
  2007-07-13  8:34         ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 19+ messages in thread
From: Geert Uytterhoeven @ 2007-07-12 20:12 UTC (permalink / raw)
  To: Matt Sealey; +Cc: ppc-dev, Michael Neuling

On Thu, 12 Jul 2007, Matt Sealey wrote:
> Michael Neuling wrote:
> >> And I don't want to run -rt or wireless-dev for the benefit of a single
> >> feature. What I am after is something like Ingo Molnar throws out..
> >> single patches done the old way, not git trees. It's so much easier to
> >> handle and integrate for example into a Gentoo ebuild or to make a
> >> tarball of accumulated patches from a certain release kernel.
> > 
> > I'm sooo with ya.  I like my patches alphabetised, but no one ever does
> > it for me.
> 
> Well, my only requirement is numbering them so they patch in order; there
> are some cute little requirements on that, but I'd rather patch the bare
> minimum and bring in as few quirks and new features as possible than just
> grab an entire -rt tree with what amounts to 60 or 70 individual patches
> and start renumbering them.

That's why quilt uses a series file, so the patch names don't have to be
ordered.
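
[Editor's sketch.] As an illustration only, a quilt series file for the five
patches named elsewhere in this thread could look like the fragment below.
quilt applies the patches in the order the lines appear, so the file names
need no numeric prefixes. The order shown puts the GENERIC_TIME patch first,
following the suggestion made in this thread; it is an assumption and has
not been tested against any tree.

```
# patches/series: one patch file name per line, applied top to bottom
PowerPC_GENERIC_TIME.linux-2.6.18-rc6_timeofday-arch-ppc_C6.patch
PowerPC_no_hz_fix.patch
PowerPC_GENERIC_CLOCKEVENTS.patch
PowerPC_enable_HRT_and_dynticks_support.patch
tickless-enable.patch
```

Reordering the set then means moving lines in this one file rather than
renaming the patches.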

With kind regards,
 
Geert Uytterhoeven
Software Architect

Sony Network and Software Technology Center Europe
The Corporate Village · Da Vincilaan 7-D1 · B-1935 Zaventem · Belgium
 
Phone:    +32 (0)2 700 8453	
Fax:      +32 (0)2 700 8622	
E-mail:   Geert.Uytterhoeven@sonycom.com	
Internet: http://www.sony-europe.com/
 	
Sony Network and Software Technology Center Europe	
A division of Sony Service Centre (Europe) N.V.	
Registered office: Technologielaan 7 · B-1840 Londerzeel · Belgium	
VAT BE 0413.825.160 · RPR Brussels	
Fortis Bank Zaventem · Swift GEBABEBB08A · IBAN BE39001382358619

* Re: Tickless Hz/hrtimers/etc. on PowerPC
  2007-07-12 20:04       ` Matt Sealey
  2007-07-12 20:12         ` Geert Uytterhoeven
@ 2007-07-13  8:34         ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 19+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-13  8:34 UTC (permalink / raw)
  To: Matt Sealey; +Cc: ppc-dev, Michael Neuling

On Thu, 2007-07-12 at 21:04 +0100, Matt Sealey wrote:
> We already have some ~20 for Efika support on top of 2.6.22 including
> minor bugfixes and stuff, and the Gentoo genpatches stuff. I really want
> to get a good start on CFS, hrtimers, dynticks and so on though and see
> if we can push it to users and get out some decent testing and bug
> reports. I think it will help everyone if it is not just a feature which
> hits mainline after 6 months through supposed maturity (when a lack of
> bug reports may well also be down to lack of interest).

Why haven't you submitted those patches? The merge window for 2.6.23 is
open _now_.

Ben

* Re: Tickless Hz/hrtimers/etc. on PowerPC
  2007-07-12 12:07   ` Matt Sealey
@ 2007-07-13  8:41     ` Domen Puncer
  0 siblings, 0 replies; 19+ messages in thread
From: Domen Puncer @ 2007-07-13  8:41 UTC (permalink / raw)
  To: Matt Sealey; +Cc: ppc-dev

On 12/07/07 13:07 +0100, Matt Sealey wrote:
> Domen,
> 
> You wouldn't have tried these on the Efika yet, would you?

Not yet, but they worked on a mpc5200b based board (lite5200b).


	Domen

> 
> -- 
> Matt Sealey <matt@genesi-usa.com>
> Genesi, Manager, Developer Relations
> 
> Domen Puncer wrote:
> > On 11/07/07 19:06 +0100, Matt Sealey wrote:
> >> Does anyone have the definitive patchset to enable the tickless hz,
> >> some kind of hrtimer and the other related improvements in the
> >> PowerPC tree?
> > 
> > I use attached patches for tickless.
> > Order in which they're applied:
> > 
> > PowerPC_GENERIC_CLOCKEVENTS.patch
> > PowerPC_GENERIC_TIME.linux-2.6.18-rc6_timeofday-arch-ppc_C6.patch
> > PowerPC_enable_HRT_and_dynticks_support.patch
> > PowerPC_no_hz_fix.patch
> > tickless-enable.patch
> > 
> > HTH
> > 
> > 
> > 	Domen

* Re: Tickless Hz/hrtimers/etc. on PowerPC
  2007-07-12 14:11   ` Sergei Shtylyov
  2007-07-12 16:41     ` Sergei Shtylyov
@ 2007-07-13  8:49     ` Domen Puncer
  1 sibling, 0 replies; 19+ messages in thread
From: Domen Puncer @ 2007-07-13  8:49 UTC (permalink / raw)
  To: Sergei Shtylyov; +Cc: linuxppc-dev

On 12/07/07 18:11 +0400, Sergei Shtylyov wrote:
> Hello.
> 
> Domen Puncer wrote:
> 
> >>Does anyone have the definitive patchset to enable the tickless hz,
> >>some kind of hrtimer and the other related improvements in the
> >>PowerPC tree?
> 
> >I use attached patches for tickless.
> >Order in which they're applied:
> 
> >PowerPC_GENERIC_CLOCKEVENTS.patch
> 
>    That's my patch which used to have both description and signoff that I'm 
> not seeing in the attached version...

Err, yes, sorry, I don't remember anymore where I picked them up,
but I'm pretty sure I didn't delete the description and signoff
by hand.

> 
> >PowerPC_GENERIC_TIME.linux-2.6.18-rc6_timeofday-arch-ppc_C6.patch
> 
>    This one should come first of all, I'd say...
>    Note that it breaks TOD vsyscalls, so you need my patch that removes 
> support for those for the time being (i.e. until Tony hopefully fixes this 
> :-).  Also, there was a patch implementing read_persistent_clock() and 
> getting rid of the code setting xtime in time_init().  Attaching them 
> both...
> 
> >PowerPC_enable_HRT_and_dynticks_support.patch
> 
>    Again looks like my patch with description/signoff missing for whatever
> reason...
> 
> >PowerPC_no_hz_fix.patch
> 
>    This has nothing to do with CONFIG_NO_HZ per se -- it fixes the 
> compilation error introduced by John's patch.
> 
> >tickless-enable.patch
> 
>    That one doesn't look quite right...

That's because I made it ;-)

It did the trick of enabling tickless though.


	Domen

* Re: Tickless Hz/hrtimers/etc. on PowerPC
  2007-07-12 12:07       ` Matt Sealey
@ 2007-07-16  0:45         ` Tony Breeds
  0 siblings, 0 replies; 19+ messages in thread
From: Tony Breeds @ 2007-07-16  0:45 UTC (permalink / raw)
  To: Matt Sealey; +Cc: ppc-dev, Michael Neuling

On Thu, Jul 12, 2007 at 01:07:26PM +0100, Matt Sealey wrote:

Hi Matt,
 
> What does "isn't quite right yet" mean? Broken, acts funny, or just
> a messy patch?

If you compile with CONFIG_NO_HZ=y, the kernel is broken.

Yours Tony

  linux.conf.au        http://linux.conf.au/ || http://lca2008.linux.org.au/
  Jan 28 - Feb 02 2008 The Australian Linux Technical Conference!
