* Re: [PATCH] fix "timer tick before it's due"
2004-05-12 23:21 [PATCH] fix "timer tick before it's due" Bjorn Helgaas
@ 2004-05-12 23:42 ` David Mosberger
2004-05-13 19:02 ` Bjorn Helgaas
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: David Mosberger @ 2004-05-12 23:42 UTC (permalink / raw)
To: linux-ia64
>>>>> On Wed, 12 May 2004 17:21:30 -0600, Bjorn Helgaas <bjorn.helgaas@hp.com> said:
Bjorn> Fix the "timer tick before it's due" complaint from timer_interrupt().
Bjorn> The problem was that smp_callin() turned on the periodic timer tick
Bjorn> before syncing the ITC with the BP.
Bjorn> Syncing the ITC happens with interrupts disabled, and if you're
Bjorn> unlucky enough to (1) pend a timer interrupt, and (2) set the ITC
Bjorn> back before the ITM value that caused the timer interrupt, you
Bjorn> can get stuck for several iterations in the following cycle
Bjorn> (assume 100 clocks per tick):
The original idea was for ia64_sync_itc() to be safe to be called any
time (e.g., for CPU hotplug or in case we ever want to support
periodic re-syncing between CPUs). Indeed, ia64_sync_itc() has this
code already:
/*
* Check whether we sync'd the itc ahead of the next timer interrupt.
* If so, just reset it.
*/
if (time_after(ia64_get_itc(), local_cpu_data->itm_next)) {
Dprintk("CPU %d: oops, jumped a timer tick; resetting timer.\n",
smp_processor_id());
ia64_cpu_local_tick();
}
However, it's kind of hard to imagine a clean solution to the timer
tick problem you observed so I think it's reasonable to require that
timer ticks be disabled before syncing the ITCs. If so, we should
state that assumption in a comment at least (a BUG_ON might be even
better). Also, the above code can then be deleted from
ia64_sync_itc().
--david
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCH] fix "timer tick before it's due"
2004-05-12 23:21 [PATCH] fix "timer tick before it's due" Bjorn Helgaas
2004-05-12 23:42 ` David Mosberger
@ 2004-05-13 19:02 ` Bjorn Helgaas
2004-05-13 20:33 ` David Mosberger
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Bjorn Helgaas @ 2004-05-13 19:02 UTC (permalink / raw)
To: linux-ia64
On Wednesday 12 May 2004 5:42 pm, David Mosberger wrote:
> However, it's kind of hard to imagine a clean solution to the timer
> tick problem you observed so I think it's reasonable to require that
> timer ticks be disabled before syncing the ITCs.
Isn't the real problem the fact that the printk in timer_interrupt()
takes a large fraction of a tick? If we just eliminate the printk,
that problem goes away, and then we'd just have to make sure we
aren't syncing the ITC too far backwards.
Actually, shouldn't we check that we don't set it too far backwards
anyway? If the new CPU came online with an ITC way ahead of the
BP, we could schedule a tick, sync with the BP, and wait for a
looong time.
=== arch/ia64/kernel/smpboot.c 1.49 vs edited ==--- 1.49/arch/ia64/kernel/smpboot.c Thu Mar 25 12:53:03 2004
+++ edited/arch/ia64/kernel/smpboot.c Thu May 13 12:46:06 2004
@@ -249,12 +249,15 @@
"maxerr %lu cycles)\n", smp_processor_id(), master, delta, rt);
/*
- * Check whether we sync'd the itc ahead of the next timer interrupt. If so, just
- * reset it.
+ * Check whether we sync'd the itc too far from the next timer
+ * interrupt. If so, just reschedule the next tick.
*/
- if (time_after(ia64_get_itc(), local_cpu_data->itm_next)) {
- Dprintk("CPU %d: oops, jumped a timer tick; resetting timer.\n",
- smp_processor_id());
+ if (time_after(ia64_get_itc(), local_cpu_data->itm_next) ||
+ time_before(ia64_get_itc(), local_cpu_data->itm_next - local_cpu_data->itm_delta)) {
+ Dprintk("CPU %d: oops, timer tick too far away; resetting "
+ "timer (itc=0x%lx,itm=0x%lx).\n",
+ smp_processor_id(), ia64_get_itc(),
+ local_cpu_data->itm_next);
ia64_cpu_local_tick();
}
}
=== arch/ia64/kernel/time.c 1.39 vs edited ==--- 1.39/arch/ia64/kernel/time.c Thu Mar 25 12:53:03 2004
+++ edited/arch/ia64/kernel/time.c Thu May 13 12:46:21 2004
@@ -248,10 +248,6 @@
new_itm = local_cpu_data->itm_next;
- if (!time_after(ia64_get_itc(), new_itm))
- printk(KERN_ERR "Oops: timer tick before it's due (itc=%lx,itm=%lx)\n",
- ia64_get_itc(), new_itm);
-
ia64_do_profile(regs);
while (1) {
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCH] fix "timer tick before it's due"
2004-05-12 23:21 [PATCH] fix "timer tick before it's due" Bjorn Helgaas
2004-05-12 23:42 ` David Mosberger
2004-05-13 19:02 ` Bjorn Helgaas
@ 2004-05-13 20:33 ` David Mosberger
2004-05-13 21:25 ` Bjorn Helgaas
2004-05-13 22:36 ` David Mosberger
4 siblings, 0 replies; 6+ messages in thread
From: David Mosberger @ 2004-05-13 20:33 UTC (permalink / raw)
To: linux-ia64
>>>>> On Thu, 13 May 2004 13:02:42 -0600, Bjorn Helgaas <bjorn.helgaas@hp.com> said:
Bjorn> Isn't the real problem the fact that the printk in timer_interrupt()
Bjorn> takes a large fraction of a tick? If we just eliminate the printk,
Bjorn> that problem goes away, and then we'd just have to make sure we
Bjorn> aren't syncing the ITC too far backwards.
I think it may really be necessary to distinguish between "I'm a new
CPU, tell me what to set the ITC to" and "my CPU's may haved drifted,
let me resync". The big difference is that in the latter case, you
may have pending timers which you'd have to preserve in some fashion
(the timer tick itself is just a special case, all scheduled timers
would have to be preserved). The way to do that would be with an
NTP-like protocol, where you move to the target value in a gradual
fashion. That would be fairly complicated and quite different from
what's there now.
IMHO, it makes sense to limit the syncing code to the former case for
the time being (there hasn't been a real need for the latter). Now,
the printk() does take a lot of time. I think it would be a mistake
to remove it entirely, because it does catch important classes of
bugs. It may make sense to rate-limit it though, so the printk itself
doesn't (persistently) make the situation worse in and of itself.
--david
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] fix "timer tick before it's due"
2004-05-12 23:21 [PATCH] fix "timer tick before it's due" Bjorn Helgaas
` (2 preceding siblings ...)
2004-05-13 20:33 ` David Mosberger
@ 2004-05-13 21:25 ` Bjorn Helgaas
2004-05-13 22:36 ` David Mosberger
4 siblings, 0 replies; 6+ messages in thread
From: Bjorn Helgaas @ 2004-05-13 21:25 UTC (permalink / raw)
To: linux-ia64
On Thursday 13 May 2004 2:33 pm, David Mosberger wrote:
> I think it may really be necessary to distinguish between "I'm a new
> CPU, tell me what to set the ITC to" and "my CPU's may haved drifted,
> let me resync". ...
>
> IMHO, it makes sense to limit the syncing code to the former case for
> the time being (there hasn't been a real need for the latter).
I agree. How about this (same as the first patch, plus the BUG_ON()
and removal of the "jumped tick" check):
=== arch/ia64/kernel/smpboot.c 1.49 vs edited ==--- 1.49/arch/ia64/kernel/smpboot.c Thu Mar 25 12:53:03 2004
+++ edited/arch/ia64/kernel/smpboot.c Thu May 13 15:15:57 2004
@@ -202,6 +202,14 @@
} t[NUM_ROUNDS];
#endif
+ /*
+ * Make sure local timer ticks are disabled while we sync. If
+ * they were enabled, we'd have to worry about nasty issues
+ * like setting the ITC ahead of (or a long time before) the
+ * next scheduled tick.
+ */
+ BUG_ON((ia64_get_itv() & (1 << 16)) = 0);
+
go[MASTER] = 1;
if (smp_call_function_single(master, sync_master, NULL, 1, 0) < 0) {
@@ -247,16 +255,6 @@
printk(KERN_INFO "CPU %d: synchronized ITC with CPU %u (last diff %ld cycles, "
"maxerr %lu cycles)\n", smp_processor_id(), master, delta, rt);
-
- /*
- * Check whether we sync'd the itc ahead of the next timer interrupt. If so, just
- * reset it.
- */
- if (time_after(ia64_get_itc(), local_cpu_data->itm_next)) {
- Dprintk("CPU %d: oops, jumped a timer tick; resetting timer.\n",
- smp_processor_id());
- ia64_cpu_local_tick();
- }
}
/*
@@ -289,11 +287,6 @@
smp_setup_percpu_timer();
/*
- * Get our bogomips.
- */
- ia64_init_itm();
-
- /*
* Set I/O port base per CPU
*/
ia64_set_kr(IA64_KR_IO_BASE, __pa(ia64_iobase));
@@ -305,11 +298,6 @@
#endif
local_irq_enable();
- calibrate_delay();
- local_cpu_data->loops_per_jiffy = loops_per_jiffy;
-#ifdef CONFIG_IA32_SUPPORT
- ia32_gdt_init();
-#endif
if (!(sal_platform_features & IA64_SAL_PLATFORM_FEATURE_ITC_DRIFT)) {
/*
@@ -321,6 +309,17 @@
Dprintk("Going to syncup ITC with BP.\n");
ia64_sync_itc(0);
}
+
+ /*
+ * Get our bogomips.
+ */
+ ia64_init_itm();
+ calibrate_delay();
+ local_cpu_data->loops_per_jiffy = loops_per_jiffy;
+
+#ifdef CONFIG_IA32_SUPPORT
+ ia32_gdt_init();
+#endif
/*
* Allow the master to continue.
=== include/asm-ia64/delay.h 1.8 vs edited ==--- 1.8/include/asm-ia64/delay.h Wed Oct 15 15:07:14 2003
+++ edited/include/asm-ia64/delay.h Thu May 13 11:31:29 2004
@@ -44,6 +44,16 @@
ia64_srlz_d();
}
+static __inline__ unsigned long
+ia64_get_itv (void)
+{
+ unsigned long result;
+
+ result = ia64_getreg(_IA64_REG_CR_ITV);
+ ia64_srlz_d();
+ return result;
+}
+
static __inline__ void
ia64_set_itc (unsigned long val)
{
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCH] fix "timer tick before it's due"
2004-05-12 23:21 [PATCH] fix "timer tick before it's due" Bjorn Helgaas
` (3 preceding siblings ...)
2004-05-13 21:25 ` Bjorn Helgaas
@ 2004-05-13 22:36 ` David Mosberger
4 siblings, 0 replies; 6+ messages in thread
From: David Mosberger @ 2004-05-13 22:36 UTC (permalink / raw)
To: linux-ia64
>>>>> On Thu, 13 May 2004 15:25:31 -0600, Bjorn Helgaas <bjorn.helgaas@hp.com> said:
Bjorn> I agree. How about this (same as the first patch, plus the
Bjorn> BUG_ON() and removal of the "jumped tick" check):
Looks great to me (modulo a change to ia64_get_itv() to drop ia64_srlz_d()).
Thanks!
--david
^ permalink raw reply [flat|nested] 6+ messages in thread