public inbox for linux-ia64@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] fix "timer tick before it's due"
@ 2004-05-12 23:21 Bjorn Helgaas
  2004-05-12 23:42 ` David Mosberger
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Bjorn Helgaas @ 2004-05-12 23:21 UTC (permalink / raw)
  To: linux-ia64

Fix the "timer tick before it's due" complaint from timer_interrupt().
The problem was that smp_callin() turned on the periodic timer tick
before syncing the ITC with the BP.

Syncing the ITC happens with interrupts disabled, and if you're
unlucky enough to (1) pend a timer interrupt, and (2) set the ITC
back before the ITM value that caused the timer interrupt, you
can get stuck for several iterations in the following cycle
(assume 100 clocks per tick):

    ITC     ITM
    ---	    ---
                    ia64_init_itm()
    100	    200	        schedule first tick at 200
                    ia64_sync_itc()
                        disable interrupts
    200     200         ITC = ITM; pend IT interrupt
    150                 set ITC to sync with BP
                        enable interrupts
                        recognize pending IT interrupt
                        disable IT interrupts
                    timer_interrupt()
    160     200         notice that 160 < 200,
                            printk "timer tick before it's due")
    200     200         ITC = ITM; pend IT interrupt
	    300         set ITM for next tick
                        re-enable IT interrupt
                        recognize pending IT interrupt
                        disable IT interrupts
                    timer_interrupt()
    260     300         notice that 260 < 300,
                            printk "timer tick before it's due")

    ...	            repeat until you're tired or timer_interrupt()
                    takes long enough that the ITC lands after the
                    ITM

This patch syncs the ITC with the BP before starting up the
periodic tick, so the above scenario should never happen.

This doesn't change how the timer tick on the BP is started;
that happens quite early (and must be early because things
like calibrate_delay() depend on jiffies updates).

(This is against 2.6.6, but I think it will conflict slightly
with my iobase init change that is in your tree but not Linus'.)

=== arch/ia64/kernel/smpboot.c 1.49 vs edited ==--- 1.49/arch/ia64/kernel/smpboot.c	Thu Mar 25 12:53:03 2004
+++ edited/arch/ia64/kernel/smpboot.c	Wed May 12 14:36:17 2004
@@ -289,11 +289,6 @@
 	smp_setup_percpu_timer();
 
 	/*
-	 * Get our bogomips.
-	 */
-	ia64_init_itm();
-
-	/*
 	 * Set I/O port base per CPU
 	 */
 	ia64_set_kr(IA64_KR_IO_BASE, __pa(ia64_iobase));
@@ -305,11 +300,6 @@
 #endif
 
 	local_irq_enable();
-	calibrate_delay();
-	local_cpu_data->loops_per_jiffy = loops_per_jiffy;
-#ifdef CONFIG_IA32_SUPPORT
-	ia32_gdt_init();
-#endif
 
 	if (!(sal_platform_features & IA64_SAL_PLATFORM_FEATURE_ITC_DRIFT)) {
 		/*
@@ -321,6 +311,17 @@
 		Dprintk("Going to syncup ITC with BP.\n");
 		ia64_sync_itc(0);
 	}
+
+	/*
+	 * Get our bogomips.
+	 */
+	ia64_init_itm();
+	calibrate_delay();
+	local_cpu_data->loops_per_jiffy = loops_per_jiffy;
+
+#ifdef CONFIG_IA32_SUPPORT
+	ia32_gdt_init();
+#endif
 
 	/*
 	 * Allow the master to continue.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] fix "timer tick before it's due"
  2004-05-12 23:21 [PATCH] fix "timer tick before it's due" Bjorn Helgaas
@ 2004-05-12 23:42 ` David Mosberger
  2004-05-13 19:02 ` Bjorn Helgaas
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: David Mosberger @ 2004-05-12 23:42 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Wed, 12 May 2004 17:21:30 -0600, Bjorn Helgaas <bjorn.helgaas@hp.com> said:

  Bjorn> Fix the "timer tick before it's due" complaint from timer_interrupt().
  Bjorn> The problem was that smp_callin() turned on the periodic timer tick
  Bjorn> before syncing the ITC with the BP.

  Bjorn> Syncing the ITC happens with interrupts disabled, and if you're
  Bjorn> unlucky enough to (1) pend a timer interrupt, and (2) set the ITC
  Bjorn> back before the ITM value that caused the timer interrupt, you
  Bjorn> can get stuck for several iterations in the following cycle
  Bjorn> (assume 100 clocks per tick):

The original idea was for ia64_sync_itc() to be safe to be called any
time (e.g., for CPU hotplug or in case we ever want to support
periodic re-syncing between CPUs).  Indeed, ia64_sync_itc() has this
code already:

	/*
	 * Check whether we sync'd the itc ahead of the next timer interrupt.
	 * If so, just reset it.
	 */
	if (time_after(ia64_get_itc(), local_cpu_data->itm_next)) {
		Dprintk("CPU %d: oops, jumped a timer tick; resetting timer.\n",
			smp_processor_id());
		ia64_cpu_local_tick();
	}

However, it's kind of hard to imagine a clean solution to the timer
tick problem you observed so I think it's reasonable to require that
timer ticks be disabled before syncing the ITCs.  If so, we should
state that assumption in a comment at least (a BUG_ON might be even
better).  Also, the above code can then be deleted from
ia64_sync_itc().

	--david

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] fix "timer tick before it's due"
  2004-05-12 23:21 [PATCH] fix "timer tick before it's due" Bjorn Helgaas
  2004-05-12 23:42 ` David Mosberger
@ 2004-05-13 19:02 ` Bjorn Helgaas
  2004-05-13 20:33 ` David Mosberger
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Bjorn Helgaas @ 2004-05-13 19:02 UTC (permalink / raw)
  To: linux-ia64

On Wednesday 12 May 2004 5:42 pm, David Mosberger wrote:
> However, it's kind of hard to imagine a clean solution to the timer
> tick problem you observed so I think it's reasonable to require that
> timer ticks be disabled before syncing the ITCs.

Isn't the real problem the fact that the printk in timer_interrupt()
takes a large fraction of a tick?  If we just eliminate the printk,
that problem goes away, and then we'd just have to make sure we
aren't syncing the ITC too far backwards.

Actually, shouldn't we check that we don't set it too far backwards
anyway?  If the new CPU came online with an ITC way ahead of the
BP, we could schedule a tick, sync with the BP, and wait for a
looong time.

=== arch/ia64/kernel/smpboot.c 1.49 vs edited ==--- 1.49/arch/ia64/kernel/smpboot.c	Thu Mar 25 12:53:03 2004
+++ edited/arch/ia64/kernel/smpboot.c	Thu May 13 12:46:06 2004
@@ -249,12 +249,15 @@
 	       "maxerr %lu cycles)\n", smp_processor_id(), master, delta, rt);
 
 	/*
-	 * Check whether we sync'd the itc ahead of the next timer interrupt.  If so, just
-	 * reset it.
+	 * Check whether we sync'd the itc too far from the next timer
+	 * interrupt.  If so, just reschedule the next tick.
 	 */
-	if (time_after(ia64_get_itc(), local_cpu_data->itm_next)) {
-		Dprintk("CPU %d: oops, jumped a timer tick; resetting timer.\n",
-			smp_processor_id());
+	if (time_after(ia64_get_itc(), local_cpu_data->itm_next) ||
+	    time_before(ia64_get_itc(), local_cpu_data->itm_next - local_cpu_data->itm_delta)) {
+		Dprintk("CPU %d: oops, timer tick too far away; resetting "
+			"timer (itc=0x%lx,itm=0x%lx).\n",
+			smp_processor_id(), ia64_get_itc(),
+			local_cpu_data->itm_next);
 		ia64_cpu_local_tick();
 	}
 }
=== arch/ia64/kernel/time.c 1.39 vs edited ==--- 1.39/arch/ia64/kernel/time.c	Thu Mar 25 12:53:03 2004
+++ edited/arch/ia64/kernel/time.c	Thu May 13 12:46:21 2004
@@ -248,10 +248,6 @@
 
 	new_itm = local_cpu_data->itm_next;
 
-	if (!time_after(ia64_get_itc(), new_itm))
-		printk(KERN_ERR "Oops: timer tick before it's due (itc=%lx,itm=%lx)\n",
-		       ia64_get_itc(), new_itm);
-
 	ia64_do_profile(regs);
 
 	while (1) {

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] fix "timer tick before it's due"
  2004-05-12 23:21 [PATCH] fix "timer tick before it's due" Bjorn Helgaas
  2004-05-12 23:42 ` David Mosberger
  2004-05-13 19:02 ` Bjorn Helgaas
@ 2004-05-13 20:33 ` David Mosberger
  2004-05-13 21:25 ` Bjorn Helgaas
  2004-05-13 22:36 ` David Mosberger
  4 siblings, 0 replies; 6+ messages in thread
From: David Mosberger @ 2004-05-13 20:33 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Thu, 13 May 2004 13:02:42 -0600, Bjorn Helgaas <bjorn.helgaas@hp.com> said:

  Bjorn> Isn't the real problem the fact that the printk in timer_interrupt()
  Bjorn> takes a large fraction of a tick?  If we just eliminate the printk,
  Bjorn> that problem goes away, and then we'd just have to make sure we
  Bjorn> aren't syncing the ITC too far backwards.

I think it may really be necessary to distinguish between "I'm a new
CPU, tell me what to set the ITC to" and "my CPU's may haved drifted,
let me resync".  The big difference is that in the latter case, you
may have pending timers which you'd have to preserve in some fashion
(the timer tick itself is just a special case, all scheduled timers
would have to be preserved).  The way to do that would be with an
NTP-like protocol, where you move to the target value in a gradual
fashion.  That would be fairly complicated and quite different from
what's there now.

IMHO, it makes sense to limit the syncing code to the former case for
the time being (there hasn't been a real need for the latter).  Now,
the printk() does take a lot of time.  I think it would be a mistake
to remove it entirely, because it does catch important classes of
bugs.  It may make sense to rate-limit it though, so the printk itself
doesn't (persistently) make the situation worse in and of itself.

	--david

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] fix "timer tick before it's due"
  2004-05-12 23:21 [PATCH] fix "timer tick before it's due" Bjorn Helgaas
                   ` (2 preceding siblings ...)
  2004-05-13 20:33 ` David Mosberger
@ 2004-05-13 21:25 ` Bjorn Helgaas
  2004-05-13 22:36 ` David Mosberger
  4 siblings, 0 replies; 6+ messages in thread
From: Bjorn Helgaas @ 2004-05-13 21:25 UTC (permalink / raw)
  To: linux-ia64

On Thursday 13 May 2004 2:33 pm, David Mosberger wrote:
> I think it may really be necessary to distinguish between "I'm a new
> CPU, tell me what to set the ITC to" and "my CPU's may haved drifted,
> let me resync". ...
> 
> IMHO, it makes sense to limit the syncing code to the former case for
> the time being (there hasn't been a real need for the latter).

I agree.  How about this (same as the first patch, plus the BUG_ON()
and removal of the "jumped tick" check):

=== arch/ia64/kernel/smpboot.c 1.49 vs edited ==--- 1.49/arch/ia64/kernel/smpboot.c	Thu Mar 25 12:53:03 2004
+++ edited/arch/ia64/kernel/smpboot.c	Thu May 13 15:15:57 2004
@@ -202,6 +202,14 @@
 	} t[NUM_ROUNDS];
 #endif
 
+	/*
+	 * Make sure local timer ticks are disabled while we sync.  If
+	 * they were enabled, we'd have to worry about nasty issues
+	 * like setting the ITC ahead of (or a long time before) the
+	 * next scheduled tick.
+	 */
+	BUG_ON((ia64_get_itv() & (1 << 16)) = 0);
+
 	go[MASTER] = 1;
 
 	if (smp_call_function_single(master, sync_master, NULL, 1, 0) < 0) {
@@ -247,16 +255,6 @@
 
 	printk(KERN_INFO "CPU %d: synchronized ITC with CPU %u (last diff %ld cycles, "
 	       "maxerr %lu cycles)\n", smp_processor_id(), master, delta, rt);
-
-	/*
-	 * Check whether we sync'd the itc ahead of the next timer interrupt.  If so, just
-	 * reset it.
-	 */
-	if (time_after(ia64_get_itc(), local_cpu_data->itm_next)) {
-		Dprintk("CPU %d: oops, jumped a timer tick; resetting timer.\n",
-			smp_processor_id());
-		ia64_cpu_local_tick();
-	}
 }
 
 /*
@@ -289,11 +287,6 @@
 	smp_setup_percpu_timer();
 
 	/*
-	 * Get our bogomips.
-	 */
-	ia64_init_itm();
-
-	/*
 	 * Set I/O port base per CPU
 	 */
 	ia64_set_kr(IA64_KR_IO_BASE, __pa(ia64_iobase));
@@ -305,11 +298,6 @@
 #endif
 
 	local_irq_enable();
-	calibrate_delay();
-	local_cpu_data->loops_per_jiffy = loops_per_jiffy;
-#ifdef CONFIG_IA32_SUPPORT
-	ia32_gdt_init();
-#endif
 
 	if (!(sal_platform_features & IA64_SAL_PLATFORM_FEATURE_ITC_DRIFT)) {
 		/*
@@ -321,6 +309,17 @@
 		Dprintk("Going to syncup ITC with BP.\n");
 		ia64_sync_itc(0);
 	}
+
+	/*
+	 * Get our bogomips.
+	 */
+	ia64_init_itm();
+	calibrate_delay();
+	local_cpu_data->loops_per_jiffy = loops_per_jiffy;
+
+#ifdef CONFIG_IA32_SUPPORT
+	ia32_gdt_init();
+#endif
 
 	/*
 	 * Allow the master to continue.
=== include/asm-ia64/delay.h 1.8 vs edited ==--- 1.8/include/asm-ia64/delay.h	Wed Oct 15 15:07:14 2003
+++ edited/include/asm-ia64/delay.h	Thu May 13 11:31:29 2004
@@ -44,6 +44,16 @@
 	ia64_srlz_d();
 }
 
+static __inline__ unsigned long
+ia64_get_itv (void)
+{
+	unsigned long result;
+
+	result = ia64_getreg(_IA64_REG_CR_ITV);
+	ia64_srlz_d();
+	return result;
+}
+
 static __inline__ void
 ia64_set_itc (unsigned long val)
 {

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] fix "timer tick before it's due"
  2004-05-12 23:21 [PATCH] fix "timer tick before it's due" Bjorn Helgaas
                   ` (3 preceding siblings ...)
  2004-05-13 21:25 ` Bjorn Helgaas
@ 2004-05-13 22:36 ` David Mosberger
  4 siblings, 0 replies; 6+ messages in thread
From: David Mosberger @ 2004-05-13 22:36 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Thu, 13 May 2004 15:25:31 -0600, Bjorn Helgaas <bjorn.helgaas@hp.com> said:

  Bjorn> I agree.  How about this (same as the first patch, plus the
  Bjorn> BUG_ON() and removal of the "jumped tick" check):

Looks great to me (modulo a change to ia64_get_itv() to drop ia64_srlz_d()).

Thanks!

	--david

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2004-05-13 22:36 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-05-12 23:21 [PATCH] fix "timer tick before it's due" Bjorn Helgaas
2004-05-12 23:42 ` David Mosberger
2004-05-13 19:02 ` Bjorn Helgaas
2004-05-13 20:33 ` David Mosberger
2004-05-13 21:25 ` Bjorn Helgaas
2004-05-13 22:36 ` David Mosberger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox