* hrtimer problem on AT91RM9200 @ 2009-08-19 12:24 Bo Hansen 2009-08-19 15:33 ` Remy Bohmer 2009-08-20 19:52 ` Uwe Kleine-König 0 siblings, 2 replies; 19+ messages in thread From: Bo Hansen @ 2009-08-19 12:24 UTC (permalink / raw) To: rt-users [-- Attachment #1: Type: text/plain, Size: 1118 bytes --] Hi, I have tested 2.6.29.6 with/without rt23 and 2.6.31-rc6 with/without rt2, but I keep getting kernel panics sometimes soon after I start cyclictest (v0.5), but sometimes it runs for hours without problems. I have tested with different options and the only one making a difference is disabling high resolution timers. Whenever this is disabled (RT patch or not) the system does not panic. This of course does not make the latency as good as I would like it to be. A kernel panic from both an RT and non-RT kernel with high resolution timer has been attached. The source seems to vary from time to time. I noticed than when I stress the system I sometimes get the following message: hrtimer: interrupt too slow, forcing clock min delta to 52482 ns And often (but no always) the system panics a little while later. I guess it tells the system is too busy? but should it really end in a kernel panic? Using the RT patch I understand as we can set the priority as we like and stall the system completely, but on a non-RT kernel this should not be possible? Any thoughts on how to proceed? Thanks in advance, Bo [-- Attachment #2: 190809_kernel_panic_2.6.29.6-highres_2 --] [-- Type: text/plain, Size: 4836 bytes --] hrtimer: interrupt too slow, forcing clock min delta to 52482 ns Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = c3a8c000 [00000000] *pgd=23a27031, *pte=00000000, *ppte=00000000 Internal error: Oops: 817 [#1] Modules linked in: CPU: 0 Not tainted (2.6.29.6 #1) PC is at clkevt32k_next_event+0x70/0xac LR is at clockevents_program_event+0xfc/0x168 pc : [<c002fb90>] lr : [<c005687c>] psr: 60000093 sp : c3a35ca0 ip : c3a35cc0 fp : c3a35cbc r10: 00000000 r9 : 00000000 r8 : c02da8a8 r7 : 00000041 r6 : 00000000 r5 : 00000001 r4 : 00000000 r3 : 00000000 r2 : 00000000 r1 : c02da8a8 r0 : 00000001 Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user Control: c000717f Table: 23a8c000 DAC: 00000015 Process cyclictest (pid: 498, stack limit = 0xc3a34268) Stack: (0xc3a35ca0 to 0xc3a36000) 5ca0: 00000041 00000000 000225c1 00000000 c3a35ce4 c3a35cc0 c005687c c002fb30 5cc0: 00000041 23cc4bf5 00000041 23cbf12f 00000041 c02da8a8 c3a35d34 c3a35ce8 5ce0: c00572b8 c0056790 23cbf12f 00000041 c02f8620 c3a34000 c3a35d2c c3a35d08 5d00: 23cbf12f 00000041 c004f9e0 23cc4bf5 00000041 00000041 00000000 00000000 5d20: c02dd1f8 c02dd1f8 c3a35d54 c3a35d38 c005735c c00571fc 00000001 c02da8a8 5d40: c02dd1c8 23cbde68 c3a35db4 c3a35d58 c004fc04 c005733c c00572b8 c02da8a8 5d60: 23cbde68 00000041 23cc4bf5 00000041 00000000 00000001 23cbde68 00000041 5d80: c3a35d9c 23cc4bf5 00000041 c02da888 00000000 00000001 00000000 c02f10f8 5da0: c02f8620 c3a34000 c3a35dd4 c3a35db8 c002fc80 c004fa34 c02dd1c8 c02da888 5dc0: 00000000 00000001 c3a35df4 c3a35dd8 c005f604 c002fbdc c02dd754 00000001 5de0: 00000000 c38165e0 c3a35e0c c3a35df8 c0061130 c005f5d0 00000001 c02faab8 5e00: c3a35e2c c3a35e10 c002606c c00610c0 c3a35e34 ffffffff fefff000 00000001 5e20: c3a35ea4 c3a35e30 c00269a8 c0026010 c02f0b64 00000002 c3946000 4a96e930 5e40: c394601c c3816e20 00000015 c38165e0 c02f10f8 c02f8620 c3a34000 c3a35ea4 5e60: c3a34044 c3a35e78 c0026d00 c0050a80 a0000013 ffffffff c3a35e9c 7fffffff 5e80: 7fffffff c3a34000 00000000 c3a35f78 c3a34000 c3a35ef0 c3a35edc c3a35ea8 5ea0: c023cd54 c023c868 00000000 60000093 60000093 60000013 0000000e 7fffffff 5ec0: 7fffffff 00000000 c3a34000 00000000 c3a35eec c3a35ee0 c023cde0 c023ccd4 5ee0: c3a35fa4 c3a35ef0 c004437c c023cdcc 0000000e 00000000 0001fffe 00000011 5f00: c3a35f4c c3a35f10 c0053fd0 c01b4b58 00000000 00000000 c3a35f88 0000419e 5f20: 00000000 0000419e c3a35f88 c02f7400 c02f73e0 c02f73e0 c3a34000 000172f4 5f40: c3a35f74 c3a35f50 c004f9e0 c003aeb4 00000000 c3a35f88 4916ddf4 40029770 5f60: 00000107 4916ddf4 40029770 00000107 c0026ec4 4916ddf4 ffffdfff ffffffff 5f80: 4916ddfc 00000000 4916ddfc 000000b1 c0026ec4 40040000 00000000 c3a35fa8 5fa0: c0026d40 c00441bc 4916ddfc 00000000 4916dcfc 00000000 00000000 00000008 5fc0: 4916ddfc 00000000 4916ddfc 000000b1 00017978 00015c60 40040000 4916ddf4 5fe0: 4916dcfc 4916dbc0 40033904 40033864 60000010 4916dcfc 00000000 00000000 Backtrace: [<c002fb20>] (clkevt32k_next_event+0x0/0xac) from [<c005687c>] (clockevents_program_e) r6:00000000 r5:000225c1 r4:00000000 [<c0056780>] (clockevents_program_event+0x0/0x168) from [<c00572b8>] (tick_dev_progra) r8:c02da8a8 r7:00000041 r6:23cbf12f r5:00000041 r4:23cc4bf5 [<c00571ec>] (tick_dev_program_event+0x0/0xf8) from [<c005735c>] (tick_program_event+) [<c005732c>] (tick_program_event+0x0/0x3c) from [<c004fc04>] (hrtimer_interrupt+0x1e0) r5:23cbde68 r4:c02dd1c8 [<c004fa24>] (hrtimer_interrupt+0x0/0x284) from [<c002fc80>] (at91rm9200_timer_interr) [<c002fbcc>] (at91rm9200_timer_interrupt+0x0/0xc8) from [<c005f604>] (handle_IRQ_even) r6:00000001 r5:00000000 r4:c02da888 [<c005f5c0>] (handle_IRQ_event+0x0/0x80) from [<c0061130>] (handle_level_irq+0x80/0x1) r7:c38165e0 r6:00000000 r5:00000001 r4:c02dd754 [<c00610b0>] (handle_level_irq+0x0/0x11c) from [<c002606c>] (__exception_text_start+0) r5:c02faab8 r4:00000001 [<c0026000>] (__exception_text_start+0x0/0x8c) from [<c00269a8>] (__irq_svc+0x28/0x60) Exception stack(0xc3a35e30 to 0xc3a35e78) 5e20: c02f0b64 00000002 c3946000 4a96e930 5e40: c394601c c3816e20 00000015 c38165e0 c02f10f8 c02f8620 c3a34000 c3a35ea4 5e60: c3a34044 c3a35e78 c0026d00 c0050a80 a0000013 ffffffff r6:00000001 r5:fefff000 r4:ffffffff [<c023c858>] (schedule+0x0/0x2ac) from [<c023cd54>] (schedule_timeout+0x90/0xd0) [<c023ccc4>] (schedule_timeout+0x0/0xd0) from [<c023cde0>] (schedule_timeout_interrup) r7:00000000 r6:c3a34000 r5:00000000 r4:7fffffff [<c023cdbc>] (schedule_timeout_interruptible+0x0/0x28) from [<c004437c>] (sys_rt_sigt) [<c00441ac>] (sys_rt_sigtimedwait+0x0/0x270) from [<c0026d40>] (ret_fast_syscall+0x0/) Code: e121f004 e3a00000 e89da878 e3a03000 (e5833000) Kernel panic - not syncing: Fatal exception in interrupt [-- Attachment #3: 180809_kernel_panic_2.6.29.6-rt23_5 --] [-- Type: text/plain, Size: 4082 bytes --] Unable to handle kernel NULL pointer dereference at virtual a0 pgd = c0004000 [00000000] *pgd=00000000 Internal error: Oops: 817 [#1] PREEMPT Modules linked in: CPU: 0 Not tainted (2.6.29.6-rt23 #1) PC is at clkevt32k_next_event+0x70/0xac LR is at clockevents_program_event+0xfc/0x168 pc : [<c002ef88>] lr : [<c005a1e0>] psr: 60000093 sp : c02f3d98 ip : c02f3db8 fp : c02f3db4 r10: 00000000 r9 : 00000000 r8 : c02f6a30 r7 : 00000045 r6 : 00000000 r5 : 00000001 r4 : 00000000 r3 : 00000000 r2 : 00000000 r1 : c02f6a30 r0 : 00000001 Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel Control: c000717f Table: 23970000 DAC: 00000017 Process swapper (pid: 0, stack limit = 0xc02f2270) Stack: (0xc02f3d98 to 0xc02f4000) 3d80: 2d2c08e3 00000000 3da0: 000225c1 00000000 c02f3ddc c02f3db8 c005a1e0 c002ef28 00000045 2d484309 3dc0: 00000045 2d483fe8 00000045 c02f6a30 c02f3e2c c02f3de0 c005aea8 c005a0f4 3de0: 2d483fe8 00000045 68e2da6f 00000000 00000000 c02f3e70 2d483fe8 00000045 3e00: 689e360d 2d484309 00000045 00000045 c02f9de0 2d484309 00000045 c02f2000 3e20: c02f3e4c c02f3e30 c005af4c c005adec 00000001 00000000 c02f9de0 2d48106f 3e40: c02f3eac c02f3e50 c005215c c005af2c 3fd06258 c02f6a30 2d48106f 00000045 3e60: 00000001 c02f9de0 00000001 00000000 2d48106f 00000045 00000001 00000000 3e80: 00000000 c02f6a08 00010002 c02f2000 00000001 00000000 00000000 00000001 3ea0: c02f3ecc c02f3eb0 c002f078 c0051f28 c385f7a0 c02f6a08 00010002 c02f2000 3ec0: c02f3efc c02f3ed0 c006513c c002efd4 c02f3ef4 c02fa4f8 c02f2000 c02f6a08 3ee0: 00000001 00000001 c02f2000 2001fc9c c02f3f1c c02f3f00 c00678dc c00650f8 3f00: 00000001 c0302d40 00000000 00000002 c02f3f3c c02f3f20 c002506c c00677e8 3f20: c02f3f54 ffffffff fefff000 00000001 c02f3f94 c02f3f40 c00259dc c0025010 3f40: 00000000 00000001 c02f2000 60000013 c0026eb0 c02f2000 c02f5eb0 c03104e0 3f60: 2001fcd0 41129200 2001fc9c c02f3f94 c02f3f98 c02f3f88 c0026ef0 c0026efc 3f80: 60000013 ffffffff c02f3fb4 c02f3f98 c0026db0 c0026ec0 c02f2000 c031048c 3fa0: c00222c0 c02f5ce0 c02f3fcc c02f3fb8 c02540dc c0026d70 c031048c c03195fc 3fc0: c02f3ff4 c02f3fd0 c0008b38 c0254068 c0008594 00000000 00000000 c0021ebc 3fe0: c0007175 c0310518 00000000 c02f3ff8 20008034 c0008970 00000000 00000000 Backtrace: [<c002ef18>] (clkevt32k_next_event+0x0/0xac) from [<c005a1e0>] (clockevents_pro) r6:00000000 r5:000225c1 r4:00000000 [<c005a0e4>] (clockevents_program_event+0x0/0x168) from [<c005aea8>] (tick_dev_) r8:c02f6a30 r7:00000045 r6:2d483fe8 r5:00000045 r4:2d484309 [<c005addc>] (tick_dev_program_event+0x0/0xf8) from [<c005af4c>] (tick_program_) [<c005af1c>] (tick_program_event+0x0/0x3c) from [<c005215c>] (hrtimer_interrupt) r5:2d48106f r4:c02f9de0 [<c0051f18>] (hrtimer_interrupt+0x0/0x2e8) from [<c002f078>] (at91rm9200_timer_) [<c002efc4>] (at91rm9200_timer_interrupt+0x0/0xc8) from [<c006513c>] (handle_IR) r6:c02f2000 r5:00010002 r4:c02f6a08 [<c00650e8>] (handle_IRQ_event+0x0/0xec) from [<c00678dc>] (handle_level_irq+0x) [<c00677d8>] (handle_level_irq+0x0/0x174) from [<c002506c>] (_text+0x6c/0x8c) r7:00000002 r6:00000000 r5:c0302d40 r4:00000001 [<c0025000>] (_text+0x0/0x8c) from [<c00259dc>] (__irq_svc+0x3c/0x80) Exception stack(0xc02f3f40 to 0xc02f3f88) 3f40: 00000000 00000001 c02f2000 60000013 c0026eb0 c02f2000 c02f5eb0 c03104e0 3f60: 2001fcd0 41129200 2001fc9c c02f3f94 c02f3f98 c02f3f88 c0026ef0 c0026efc 3f80: 60000013 ffffffff r6:00000001 r5:fefff000 r4:ffffffff [<c0026eb0>] (default_idle+0x0/0x54) from [<c0026db0>] (cpu_idle+0x50/0xac) [<c0026d60>] (cpu_idle+0x0/0xac) from [<c02540dc>] (rest_init+0x84/0x9c) r7:c02f5ce0 r6:c00222c0 r5:c031048c r4:c02f2000 [<c0254058>] (rest_init+0x0/0x9c) from [<c0008b38>] (start_kernel+0x1d8/0x2c8) r4:c03195fc [<c0008960>] (start_kernel+0x0/0x2c8) from [<20008034>] (0x20008034) r5:c0310518 r4:c0007175 Code: e121f004 e3a00000 e89da878 e3a03000 (e5833000) Kernel panic - not syncing: Fatal exception in interrupt ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: hrtimer problem on AT91RM9200 2009-08-19 12:24 hrtimer problem on AT91RM9200 Bo Hansen @ 2009-08-19 15:33 ` Remy Bohmer 2009-08-20 19:52 ` Uwe Kleine-König 1 sibling, 0 replies; 19+ messages in thread From: Remy Bohmer @ 2009-08-19 15:33 UTC (permalink / raw) To: Bo Hansen; +Cc: rt-users Hi Bo, > I have tested with different options and the only one making a > difference is disabling high resolution timers. Whenever this is disabled > (RT patch or not) the system does not panic. This of course does not > make the latency as good as I would like it to be. Reading this I guess you want a timer accuracy better than about 1 millisecond on this processor.(at91rm9200) Setting the CONFIG_HZ to 1000 and disabling HRT would give you about 1 millisecond accuracy. But, timers in the order of 1 millisecond magnitude can easily have an system (interrupt+softirqs) overhead up to 250us while using HRT on this processor. So, if you want sub-millisecond timers, then to me it seems that things are somewhat out of balance here. But, If you want an accurate period time for your realtime application, you can also use the TC-library to provide you a periodic interrupt on which you schedule your RT-task. This will provide you much better latencies, and a much lower CPU-load. > A kernel panic from both an RT and non-RT kernel with high resolution > timer has been attached. The source seems to vary from time to time. > > I noticed than when I stress the system I sometimes get the > following message: > hrtimer: interrupt too slow, forcing clock min delta to 52482 ns This is exactly what I would expect, except that these 52us is still quite fast for this processor... > And often (but no always) the system panics a little while later. I guess > it tells the system is too busy? but should it really end in a kernel > panic? Using the RT patch I understand as we can set the priority > as we like and stall the system completely, but on a non-RT kernel this > should not be possible? But anyway, it should not panic... Kind regards, Remy ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: hrtimer problem on AT91RM9200 2009-08-19 12:24 hrtimer problem on AT91RM9200 Bo Hansen 2009-08-19 15:33 ` Remy Bohmer @ 2009-08-20 19:52 ` Uwe Kleine-König 2009-08-21 12:57 ` Bo Hansen 1 sibling, 1 reply; 19+ messages in thread From: Uwe Kleine-König @ 2009-08-20 19:52 UTC (permalink / raw) To: Bo Hansen; +Cc: rt-users Hello, > PC is at clkevt32k_next_event+0x70/0xac This function has a WARN_ON_ONCE(...). Can you check if this triggers? Can you provide me a kernel image with CONFIG_DEBUG_INFO=y and a matching oops? Best regards Uwe -- Pengutronix e.K. | Uwe Kleine-König | Industrial Linux Solutions | http://www.pengutronix.de/ | -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: hrtimer problem on AT91RM9200 2009-08-20 19:52 ` Uwe Kleine-König @ 2009-08-21 12:57 ` Bo Hansen [not found] ` <fae6a9700908232327u65fca65bt97bb8c43a91345cd@mail.gmail.com> 2009-09-03 14:12 ` Uwe Kleine-König 0 siblings, 2 replies; 19+ messages in thread From: Bo Hansen @ 2009-08-21 12:57 UTC (permalink / raw) To: Uwe Kleine-König; +Cc: rt-users Hi Uwe, I cannot find anything in the dmesg or serial output regarding the WARN_ON_ONCE(...) I guess the WARN_ON_ONCE(...) should show up in dmesg output? What puzzles me is that I tried the CONFIG_DEBUG_BUGVERBOSE=y also when I enabled CONFIG_DEBUG_INFO. When I do this I don't get the kernel panic. Is this somehow related to the WARN_ON_ONCE? Attachments only sent to Uwe - anybody are welcome to get a copy. Best regards, Bo Uwe Kleine-König wrote: > Hello, > > >> PC is at clkevt32k_next_event+0x70/0xac >> > This function has a WARN_ON_ONCE(...). Can you check if this triggers? > > Can you provide me a kernel image with CONFIG_DEBUG_INFO=y and a > matching oops? > > Best regards > Uwe > > -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 19+ messages in thread
[parent not found: <fae6a9700908232327u65fca65bt97bb8c43a91345cd@mail.gmail.com>]
* Re: hrtimer problem on AT91RM9200 [not found] ` <fae6a9700908232327u65fca65bt97bb8c43a91345cd@mail.gmail.com> @ 2009-08-24 6:30 ` Vivek Satpute 0 siblings, 0 replies; 19+ messages in thread From: Vivek Satpute @ 2009-08-24 6:30 UTC (permalink / raw) To: Bo Hansen; +Cc: Uwe Kleine-König, rt-users Adding RT-mailing list.. 2009/8/24 Vivek Satpute <vivek@linsyssoft.com>: > Hi Bo, > > 2009/8/21 Bo Hansen <bh@newtec.dk> >> >> Hi Uwe, >> >> I cannot find anything in the dmesg or serial output regarding the >> WARN_ON_ONCE(...) >> I guess the WARN_ON_ONCE(...) should show up in dmesg output? > > I am not sure that it is directly shown up in dmesg output, but you can > add printks to verify whether WARN_ON_ONCE() code path hits or not. >> >> What puzzles me is that I tried the CONFIG_DEBUG_BUGVERBOSE=y >> also when I enabled CONFIG_DEBUG_INFO. When I do >> this I don't get the kernel panic. Is this somehow related to the >> WARN_ON_ONCE? > > I have encountered this scenario. If there is any race condition in code > then it might get suppressed when debugging options are enabled and > kernel does not panic. >> >> Attachments only sent to Uwe - anybody are welcome to get a copy. >> >> Best regards, >> Bo > > Thanks, > Vivek Satpute >> >> >> Uwe Kleine-König wrote: >>> >>> Hello, >>> >>> >>>> >>>> PC is at clkevt32k_next_event+0x70/0xac >>>> >>> >>> This function has a WARN_ON_ONCE(...). Can you check if this triggers? >>> >>> Can you provide me a kernel image with CONFIG_DEBUG_INFO=y and a >>> matching oops? >>> >>> Best regards >>> Uwe >>> >>> >> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-rt-users" >> in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > > -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: hrtimer problem on AT91RM9200 2009-08-21 12:57 ` Bo Hansen [not found] ` <fae6a9700908232327u65fca65bt97bb8c43a91345cd@mail.gmail.com> @ 2009-09-03 14:12 ` Uwe Kleine-König 2009-09-10 6:44 ` Bo Hansen 1 sibling, 1 reply; 19+ messages in thread From: Uwe Kleine-König @ 2009-09-03 14:12 UTC (permalink / raw) To: Bo Hansen; +Cc: rt-users Hello, On Fri, Aug 21, 2009 at 02:57:43PM +0200, Bo Hansen wrote: > Hi Uwe, > > I cannot find anything in the dmesg or serial output regarding the > WARN_ON_ONCE(...) > I guess the WARN_ON_ONCE(...) should show up in dmesg output? > > What puzzles me is that I tried the CONFIG_DEBUG_BUGVERBOSE=y > also when I enabled CONFIG_DEBUG_INFO. When I do > this I don't get the kernel panic. Is this somehow related to the > WARN_ON_ONCE? > > Attachments only sent to Uwe - anybody are welcome to get a copy. Hhhhmm, clkevt32k_next_event is called with delta=1 which hits BUG_ON(delta < 2). I calculated the clockeventdevice parameters by hand, and the numbers look good: clkevt.shift = 32 clkevt.mult = div_sc(AT91_SLOW_CLOCK, NSEC_PER_SEC, clkevt.shift) = (32768 << 32) / 1000000000 = 140737 clkevt.min_delta_ns = clockevent_delta2ns(2, &clkevt) + 1 = (2 << clkevt.shift) / clkevt.mult + 1 = (2 << 32) / 140737 + 1 = 61036 min_delta = (clkevt.min_delta_ns * mult) >> clkevt.shift = (61036 * 140737) >> 32 = 2 Can you please apply the patch below, reproduce and provide the two lines of output resulting from the two added printks? Best regards Uwe diff --git a/arch/arm/mach-at91/at91rm9200_time.c b/arch/arm/mach-at91/at91rm9200_time.c index 1ff1bda..83a7379 100644 --- a/arch/arm/mach-at91/at91rm9200_time.c +++ b/arch/arm/mach-at91/at91rm9200_time.c @@ -136,7 +136,13 @@ clkevt32k_next_event(unsigned long delta, struct clock_event_device *dev) u32 alm; int status = 0; - BUG_ON(delta < 2); + if (unlikely(delta < 2)) { + pr_emerg("************** %s: shift=%d, mult=%lu, " + "min_delta_ns=%lu, dev=%pS\n", __func__, + dev->shift, dev->mult, + dev->min_delta_ns, dev); + BUG(); + } /* Use "raw" primitives so we behave correctly on RT kernels. */ raw_local_irq_save(flags); @@ -205,6 +211,18 @@ void __init at91rm9200_timer_init(void) clkevt.mult = div_sc(AT91_SLOW_CLOCK, NSEC_PER_SEC, clkevt.shift); clkevt.max_delta_ns = clockevent_delta2ns(AT91_ST_ALMV, &clkevt); clkevt.min_delta_ns = clockevent_delta2ns(2, &clkevt) + 1; + + { + unsigned long long clc = (int64_t)clkevt.min_delta_ns * + clkevt.mult; + + clc >>= clkevt.shift; + + pr_info("************** %s: shift=%d, mult=%lu, " + "min_delta_ns=%lu, minclc=%llu\n", __func__, + clkevt.shift, clkevt.mult, + clkevt.min_delta_ns, clc); + } clkevt.cpumask = cpumask_of(0); clockevents_register_device(&clkevt); -- Pengutronix e.K. | Uwe Kleine-König | Industrial Linux Solutions | http://www.pengutronix.de/ | -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: hrtimer problem on AT91RM9200 2009-09-03 14:12 ` Uwe Kleine-König @ 2009-09-10 6:44 ` Bo Hansen 2009-09-10 9:51 ` Uwe Kleine-König 0 siblings, 1 reply; 19+ messages in thread From: Bo Hansen @ 2009-09-10 6:44 UTC (permalink / raw) To: Uwe Kleine-König; +Cc: rt-users [-- Attachment #1: Type: text/plain, Size: 2841 bytes --] Hi Uwe, Sorry for my late reply. I only get the initialization line: ************** at91rm9200_timer_init: shift=32, mult=140737, min_delta_ns=61036, minclc=2 I have attached the kernel panic. Best regards, Bo Uwe Kleine-König wrote: > Hello, > > On Fri, Aug 21, 2009 at 02:57:43PM +0200, Bo Hansen wrote: > >> Hi Uwe, >> >> I cannot find anything in the dmesg or serial output regarding the >> WARN_ON_ONCE(...) >> I guess the WARN_ON_ONCE(...) should show up in dmesg output? >> >> What puzzles me is that I tried the CONFIG_DEBUG_BUGVERBOSE=y >> also when I enabled CONFIG_DEBUG_INFO. When I do >> this I don't get the kernel panic. Is this somehow related to the >> WARN_ON_ONCE? >> >> Attachments only sent to Uwe - anybody are welcome to get a copy. >> > Hhhhmm, clkevt32k_next_event is called with delta=1 which hits > BUG_ON(delta < 2). > > I calculated the clockeventdevice parameters by hand, and the numbers > look good: > > clkevt.shift = 32 > clkevt.mult = div_sc(AT91_SLOW_CLOCK, NSEC_PER_SEC, clkevt.shift) > = (32768 << 32) / 1000000000 > = 140737 > clkevt.min_delta_ns = clockevent_delta2ns(2, &clkevt) + 1 > = (2 << clkevt.shift) / clkevt.mult + 1 > = (2 << 32) / 140737 + 1 > = 61036 > min_delta = (clkevt.min_delta_ns * mult) >> clkevt.shift > = (61036 * 140737) >> 32 > = 2 > > Can you please apply the patch below, reproduce and provide the two > lines of output resulting from the two added printks? > > Best regards > Uwe > > diff --git a/arch/arm/mach-at91/at91rm9200_time.c b/arch/arm/mach-at91/at91rm9200_time.c > index 1ff1bda..83a7379 100644 > --- a/arch/arm/mach-at91/at91rm9200_time.c > +++ b/arch/arm/mach-at91/at91rm9200_time.c > @@ -136,7 +136,13 @@ clkevt32k_next_event(unsigned long delta, struct clock_event_device *dev) > u32 alm; > int status = 0; > > - BUG_ON(delta < 2); > + if (unlikely(delta < 2)) { > + pr_emerg("************** %s: shift=%d, mult=%lu, " > + "min_delta_ns=%lu, dev=%pS\n", __func__, > + dev->shift, dev->mult, > + dev->min_delta_ns, dev); > + BUG(); > + } > > /* Use "raw" primitives so we behave correctly on RT kernels. */ > raw_local_irq_save(flags); > @@ -205,6 +211,18 @@ void __init at91rm9200_timer_init(void) > clkevt.mult = div_sc(AT91_SLOW_CLOCK, NSEC_PER_SEC, clkevt.shift); > clkevt.max_delta_ns = clockevent_delta2ns(AT91_ST_ALMV, &clkevt); > clkevt.min_delta_ns = clockevent_delta2ns(2, &clkevt) + 1; > + > + { > + unsigned long long clc = (int64_t)clkevt.min_delta_ns * > + clkevt.mult; > + > + clc >>= clkevt.shift; > + > + pr_info("************** %s: shift=%d, mult=%lu, " > + "min_delta_ns=%lu, minclc=%llu\n", __func__, > + clkevt.shift, clkevt.mult, > + clkevt.min_delta_ns, clc); > + } > clkevt.cpumask = cpumask_of(0); > clockevents_register_device(&clkevt); > > > > > > [-- Attachment #2: 100909_kernel_panic_2.6.29.6-rt23-debug-uwe --] [-- Type: text/plain, Size: 7414 bytes --] Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = c3930000 [00000000] *pgd=2386e031, *pte=00000000, *ppte=00000000 Internal error: Oops: 817 [#1] PREEMPT Modules linked in: CPU: 0 Not tainted (2.6.29.6-rt23 #1) PC is at clkevt32k_next_event+0x94/0xd8 LR is at rt_mutex_unlock+0x10/0x14 pc : [<c002efac>] lr : [<c0256e40>] psr: 00000093 sp : c3a6fc20 ip : c3a6fb30 fp : c3a6fc44 r10: 00000001 r9 : 00000000 r8 : c02f6a30 r7 : 00000032 r6 : 00000000 r5 : 00000001 r4 : 00000000 r3 : 00000000 r2 : 00010002 r1 : 60000093 r0 : 00000062 Flags: nzcv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user Control: c000717f Table: 23930000 DAC: 00000015 Process cyclictest (pid: 579, stack limit = 0xc3a6e270) Stack: (0xc3a6fc20 to 0xc3a70000) fc20: 0000dcaa c02f6a30 00013bc9 00000000 000225c1 00000000 c3a6fc6c c3a6fc48 fc40: c005a20c c002ef28 00000000 0083ad27 00000032 0082d07d 00000032 c02f6a30 fc60: c3a6fcbc c3a6fc70 c005aed4 c005a120 0082d07d 00000032 0083ad27 00000032 fc80: 0082d07d 00000032 0082bdb6 00000032 3bccb551 00829826 00000032 00000032 fca0: c02f9de0 00829826 00000032 c3a6e000 c3a6fcdc c3a6fcc0 c005af78 c005ae18 fcc0: 00000001 00000000 c02f9de0 0082947f c3a6fd3c c3a6fce0 c0052188 c005af58 fce0: 3fd06258 c02f6a30 0082947f 00000032 c3a6fd1c c02f9de0 00000001 00000000 fd00: 0082947f 00000032 c02f9de0 198461bc c3a6fd7c c02f6a08 00010002 c3a6e000 fd20: 00000001 00000000 00000000 00000001 c3a6fd5c c3a6fd40 c002f0a4 c0051f54 fd40: 198461bc c02f6a08 00010002 c3a6e000 c3a6fd8c c3a6fd60 c0065168 c002f000 fd60: 00000001 c02fa4f8 c3a6e000 c02f6a08 00000001 00000001 c3a6e000 c3a6e000 fd80: c3a6fdac c3a6fd90 c0067908 c0065124 00000001 c0302d40 00000000 00000002 fda0: c3a6fdcc c3a6fdb0 c002506c c0067814 00000000 ffffffff fefff000 00000001 fdc0: c3a6fe74 c3a6fdd0 c00259dc c0025010 00000000 c3a67ef8 c3a5fef8 40000013 fde0: 40000013 00000000 0233a240 00000032 c3a6fef8 c02f9de0 c3a6e000 c3a6fe74 fe00: c3a67ef8 c3a6fe18 c3a47ef8 c00528a4 60000013 ffffffff 00000000 00000000 fe20: 00000000 40000013 0233a240 00000032 00000000 00000000 c00540a8 c0054044 fe40: 00000000 c3a6e000 c3a6fe7c 3b9aca00 00000032 00000000 00000000 00000000 fe60: 00000000 c3a6fef8 c3a6fe94 c3a6fe78 c00529cc c0052758 00000000 00000000 fe80: 00000001 3b9aca00 c3a6fed4 c3a6fe98 c02569ac c00529b4 00000000 00000000 fea0: c3a6e000 00000000 c02f9de0 00000000 00000000 00000000 c3a6ff80 c3a6e000 fec0: 00000000 00000000 c3a6ff64 c3a6fed8 c0052a8c c0256904 00000000 00000000 fee0: 38a09550 00000000 0233a240 00000032 00000000 00000000 c0310700 00000000 ff00: 00000000 00000000 0233a240 00000032 0233a240 00000032 c0051e68 c02f9de0 ff20: 00000001 c3a6ff24 c3a6ff24 00000001 c3a63680 c003bf78 c3a6ff88 00000001 ff40: 00000001 00000000 c3a6ff80 c0025f44 c3a6e000 00017690 c3a6ff7c c3a6ff68 ff60: c004c8d0 c00529e4 c0025f44 00000001 c3a6ffa4 c3a6ff80 c004ca00 c004c8b4 ff80: 00000032 0233a240 00000001 00000000 5396ddec 00000109 00000000 c3a6ffa8 ffa0: c0025dc0 c004c8e4 00000001 00000000 00000001 00000001 5396ddec 00000000 ffc0: 00000001 00000000 5396ddec 00000109 00000001 00015c60 00017690 5396ddf4 ffe0: 00000000 5396dc50 4004680c 4004682c 60000010 00000001 00000000 00000000 Backtrace: [<c002ef18>] (clkevt32k_next_event+0x0/0xd8) from [<c005a20c>] (clockevents_prog ram_event+0xfc/0x168) r6:00000000 r5:000225c1 r4:00000000 [<c005a110>] (clockevents_program_event+0x0/0x168) from [<c005aed4>] (tick_dev_p rogram_event+0xcc/0xf8) r8:c02f6a30 r7:00000032 r6:0082d07d r5:00000032 r4:0083ad27 [<c005ae08>] (tick_dev_program_event+0x0/0xf8) from [<c005af78>] (tick_program_e vent+0x30/0x3c) [<c005af48>] (tick_program_event+0x0/0x3c) from [<c0052188>] (hrtimer_interrupt+ 0x244/0x2e8) r5:0082947f r4:c02f9de0 [<c0051f44>] (hrtimer_interrupt+0x0/0x2e8) from [<c002f0a4>] (at91rm9200_timer_i nterrupt+0xb4/0xc8) [<c002eff0>] (at91rm9200_timer_interrupt+0x0/0xc8) from [<c0065168>] (handle_IRQ _event+0x54/0xec) r6:c3a6e000 r5:00010002 r4:c02f6a08 [<c0065114>] (handle_IRQ_event+0x0/0xec) from [<c0067908>] (handle_level_irq+0x1 04/0x174) [<c0067804>] (handle_level_irq+0x0/0x174) from [<c002506c>] (_text+0x6c/0x8c) r7:00000002 r6:00000000 r5:c0302d40 r4:00000001 [<c0025000>] (_text+0x0/0x8c) from [<c00259dc>] (__irq_svc+0x3c/0x80) Exception stack(0xc3a6fdd0 to 0xc3a6fe18) fdc0: 00000000 c3a67ef8 c3a5fef8 40000013 fde0: 40000013 00000000 0233a240 00000032 c3a6fef8 c02f9de0 c3a6e000 c3a6fe74 fe00: c3a67ef8 c3a6fe18 c3a47ef8 c00528a4 60000013 ffffffff r6:00000001 r5:fefff000 r4:ffffffff [<c0052748>] (__hrtimer_start_range_ns+0x0/0x228) from [<c00529cc>] (hrtimer_sta rt_range_ns+0x28/0x30) [<c00529a4>] (hrtimer_start_range_ns+0x0/0x30) from [<c02569ac>] (do_nanosleep+0 xb8/0x114) [<c02568f4>] (do_nanosleep+0x0/0x114) from [<c0052a8c>] (hrtimer_nanosleep+0xb8/ 0x140) [<c00529d4>] (hrtimer_nanosleep+0x0/0x140) from [<c004c8d0>] (common_nsleep+0x2c /0x30) [<c004c8a4>] (common_nsleep+0x0/0x30) from [<c004ca00>] (sys_clock_nanosleep+0x1 2c/0x150) r4:00000001 [<c004c8d4>] (sys_clock_nanosleep+0x0/0x150) from [<c0025dc0>] (ret_fast_syscall +0x0/0x2c) r7:00000109 r6:5396ddec r5:00000000 r4:00000001 Code: e59f1040 e88d5000 eb0024fc e3a03000 (e5833000) Kernel panic - not syncing: Fatal exception in interrupt [-- Attachment #3: 100909_kernel_panic_2.6.29.6-rt23-debug-uwe2 --] [-- Type: text/plain, Size: 5898 bytes --] Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = c0004000 [00000000] *pgd=00000000 Internal error: Oops: 817 [#1] PREEMPT Modules linked in: CPU: 0 Not tainted (2.6.29.6-rt23 #1) PC is at clkevt32k_next_event+0x94/0xd8 LR is at rt_mutex_unlock+0x10/0x14 pc : [<c002efac>] lr : [<c0256e40>] psr: 00000093 sp : c02f3d90 ip : c02f3ca0 fp : c02f3db4 r10: 00000000 r9 : 00000000 r8 : c02f6a30 r7 : 0000004a r6 : 00000000 r5 : 00000001 r4 : 00000000 r3 : 00000000 r2 : 00010002 r1 : 60000093 r0 : 00000062 Flags: nzcv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel Control: c000717f Table: 23890000 DAC: 00000017 Process swapper (pid: 0, stack limit = 0xc02f2270) Stack: (0xc02f3d90 to 0xc02f4000) 3d80: 0000c6c0 c02f6a30 131806f5 00000000 3da0: 000225c1 00000000 c02f3ddc c02f3db8 c005a20c c002ef28 0000004a 13797362 3dc0: 0000004a 13796120 0000004a c02f6a30 c02f3e2c c02f3de0 c005aed4 c005a120 3de0: 13796120 0000004a 4f140486 00000000 00000000 c02f3e70 13796120 0000004a 3e00: 4e6c7a07 13797362 0000004a 0000004a c02f9de0 13797362 0000004a c02f2000 3e20: c02f3e4c c02f3e30 c005af78 c005ae18 00000001 00000000 c02f9de0 13793a86 3e40: c02f3eac c02f3e50 c0052188 c005af58 3fd06258 c02f6a30 13793a86 0000004a 3e60: c0065168 c02f9de0 00000001 00000000 13793a86 0000004a c02f2000 c0302d40 3e80: c02f3e9c c02f6a08 00010002 c02f2000 00000001 00000000 00000000 00000001 3ea0: c02f3ecc c02f3eb0 c002f0a4 c0051f54 c38eb1e0 c02f6a08 00010002 c02f2000 3ec0: c02f3efc c02f3ed0 c0065168 c002f000 c02f3ef4 c02fa4f8 c02f2000 c02f6a08 3ee0: 00000001 00000001 c02f2000 2001fce8 c02f3f1c c02f3f00 c0067908 c0065124 3f00: 00000001 c0302d40 00000000 00000002 c02f3f3c c02f3f20 c002506c c0067814 3f20: c02f3f54 ffffffff fefff000 00000001 c02f3f94 c02f3f40 c00259dc c0025010 3f40: 00000000 00000001 c02f2000 60000013 c0026eb0 c02f2000 c02f5eb0 c03104e0 3f60: 2001fd1c 41129200 2001fce8 c02f3f94 c02f3f98 c02f3f88 c0026ef0 c0026efc 3f80: 60000013 ffffffff c02f3fb4 c02f3f98 c0026db0 c0026ec0 c02f2000 c031048c 3fa0: c00222c0 c02f5ce0 c02f3fcc c02f3fb8 c02540dc c0026d70 c031048c c03195fc 3fc0: c02f3ff4 c02f3fd0 c0008b38 c0254068 c0008594 00000000 00000000 c0021ebc 3fe0: c0007175 c0310518 00000000 c02f3ff8 20008034 c0008970 00000000 00000000 Backtrace: [<c002ef18>] (clkevt32k_next_event+0x0/0xd8) from [<c005a20c>] (clockevents_prog ram_event+0xfc/0x168) r6:00000000 r5:000225c1 r4:00000000 [<c005a110>] (clockevents_program_event+0x0/0x168) from [<c005aed4>] (tick_dev_p rogram_event+0xcc/0xf8) r8:c02f6a30 r7:0000004a r6:13796120 r5:0000004a r4:13797362 [<c005ae08>] (tick_dev_program_event+0x0/0xf8) from [<c005af78>] (tick_program_e vent+0x30/0x3c) [<c005af48>] (tick_program_event+0x0/0x3c) from [<c0052188>] (hrtimer_interrupt+ 0x244/0x2e8) r5:13793a86 r4:c02f9de0 [<c0051f44>] (hrtimer_interrupt+0x0/0x2e8) from [<c002f0a4>] (at91rm9200_timer_i nterrupt+0xb4/0xc8) [<c002eff0>] (at91rm9200_timer_interrupt+0x0/0xc8) from [<c0065168>] (handle_IRQ _event+0x54/0xec) r6:c02f2000 r5:00010002 r4:c02f6a08 [<c0065114>] (handle_IRQ_event+0x0/0xec) from [<c0067908>] (handle_level_irq+0x1 04/0x174) [<c0067804>] (handle_level_irq+0x0/0x174) from [<c002506c>] (_text+0x6c/0x8c) r7:00000002 r6:00000000 r5:c0302d40 r4:00000001 [<c0025000>] (_text+0x0/0x8c) from [<c00259dc>] (__irq_svc+0x3c/0x80) Exception stack(0xc02f3f40 to 0xc02f3f88) 3f40: 00000000 00000001 c02f2000 60000013 c0026eb0 c02f2000 c02f5eb0 c03104e0 3f60: 2001fd1c 41129200 2001fce8 c02f3f94 c02f3f98 c02f3f88 c0026ef0 c0026efc 3f80: 60000013 ffffffff r6:00000001 r5:fefff000 r4:ffffffff [<c0026eb0>] (default_idle+0x0/0x54) from [<c0026db0>] (cpu_idle+0x50/0xac) [<c0026d60>] (cpu_idle+0x0/0xac) from [<c02540dc>] (rest_init+0x84/0x9c) r7:c02f5ce0 r6:c00222c0 r5:c031048c r4:c02f2000 [<c0254058>] (rest_init+0x0/0x9c) from [<c0008b38>] (start_kernel+0x1d8/0x2c8) r4:c03195fc [<c0008960>] (start_kernel+0x0/0x2c8) from [<20008034>] (0x20008034) r5:c0310518 r4:c0007175 Code: e59f1040 e88d5000 eb0024fc e3a03000 (e5833000) Kernel panic - not syncing: Fatal exception in interrupt ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: hrtimer problem on AT91RM9200 2009-09-10 6:44 ` Bo Hansen @ 2009-09-10 9:51 ` Uwe Kleine-König 2009-09-11 15:20 ` Uwe Kleine-König 0 siblings, 1 reply; 19+ messages in thread From: Uwe Kleine-König @ 2009-09-10 9:51 UTC (permalink / raw) To: Bo Hansen, Thomas Gleixner, john stultz; +Cc: rt-users Hello Bo, On Thu, Sep 10, 2009 at 08:44:11AM +0200, Bo Hansen wrote: > Hi Uwe, > > Sorry for my late reply. > > I only get the initialization line: > ************** at91rm9200_timer_init: shift=32, mult=140737, > min_delta_ns=61036, minclc=2 OK, that corresponds to my calculation. Strange, with my patch lr is at rt_mutex_unlock in the oops, before it was at clockevents_program_event. I don't understand that. Can you try changing the pr_info to pr_emerg? If that doesn't help, do you have CONFIG_DEBUG_LL available? If so can you change the pr_info (now pr_emerg) to something like: void printch(char); void printhex8(unsigned); void printascii(const char *); printascii("**************"); printhex8(dev->shift); printch(':'); printhex8(dev->mult); printch(':'); printhex8(dev->min_delta_ns); printch('\n'); and enable CONFIG_DEBUG_LL of course? jstultz, tglx: there shouldn't be anything that fiddles around with shift, mult and min_delta, should it? (Well, increasing min_delta is OK.) Best regards Uwe -- Pengutronix e.K. | Uwe Kleine-König | Industrial Linux Solutions | http://www.pengutronix.de/ | -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: hrtimer problem on AT91RM9200 2009-09-10 9:51 ` Uwe Kleine-König @ 2009-09-11 15:20 ` Uwe Kleine-König 2009-09-11 23:07 ` Uwe Kleine-König 2009-09-14 9:09 ` Bo Hansen 0 siblings, 2 replies; 19+ messages in thread From: Uwe Kleine-König @ 2009-09-11 15:20 UTC (permalink / raw) To: Bo Hansen, Thomas Gleixner, john stultz; +Cc: rt-users Hello Bo, In the meantime I got access to an at91rm9200, too. To help me reproducing the problem: - which options do you use for cyclictest? - Is the system idle or do you run hackbench or similar in parallel? - anything else? Best regards Uwe -- Pengutronix e.K. | Uwe Kleine-König | Industrial Linux Solutions | http://www.pengutronix.de/ | -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: hrtimer problem on AT91RM9200 2009-09-11 15:20 ` Uwe Kleine-König @ 2009-09-11 23:07 ` Uwe Kleine-König 2009-09-14 10:50 ` Bo Hansen 2009-09-14 9:09 ` Bo Hansen 1 sibling, 1 reply; 19+ messages in thread From: Uwe Kleine-König @ 2009-09-11 23:07 UTC (permalink / raw) To: Bo Hansen, Thomas Gleixner, john stultz; +Cc: rt-users Hello, On Fri, Sep 11, 2009 at 05:20:44PM +0200, Uwe Kleine-König wrote: > Hello Bo, > > In the meantime I got access to an at91rm9200, too. To help me > reproducing the problem: I still cannot reproduce, but I found something anyhow. The problem is that hrtimer_interrupt_hanging decreases min_delta_ns. (Initially it's 61036.) I talked to jstultz on irc and both of us are unsure if asserting that min_delta_ns isn't decreased in hrtimer_interrupt_hanging is enough or if there is another problem. Bo, can you please apply the patch below, pass the kernel parameter ftrace_dump_on_oops (or alternatively do # echo 1 > /proc/sys/kernel/ftrace_dump_on_oops ), reproduce the problem and send the resulting oops? Best regards Uwe diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c index 51bd089..cd72ca9 100644 --- a/arch/arm/kernel/traps.c +++ b/arch/arm/kernel/traps.c @@ -18,6 +18,7 @@ #include <linux/personality.h> #include <linux/kallsyms.h> #include <linux/delay.h> +#include <linux/kdebug.h> #include <linux/hardirq.h> #include <linux/init.h> #include <linux/uaccess.h> @@ -223,6 +224,8 @@ static void __die(const char *str, int err, struct thread_info *thread, struct p dump_backtrace(regs, tsk); dump_instr(regs); } + + notify_die(DIE_OOPS, str, regs, err, current->thread.trap_no, SIGSEGV); } DEFINE_RAW_SPINLOCK(die_lock); diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c index 9e308ab..a0c05f3 100644 --- a/kernel/hrtimer.c +++ b/kernel/hrtimer.c @@ -1390,8 +1390,16 @@ void hrtimer_interrupt(struct clock_event_device *dev) /* Reprogramming necessary ? */ if (expires_next.tv64 != KTIME_MAX) { - if (tick_program_event(expires_next, force_clock_reprogram)) + if (tick_program_event(expires_next, force_clock_reprogram)) { + if (nr_retries > 1) + trace_printk("tick_program_event failed, " + "now=%lld, expires_next=%lld, " + "nr_retries=%d\n", + (long long)now.tv64, + (long long)expires_next.tv64, + nr_retries); goto retry; + } } if (raise) -- Pengutronix e.K. | Uwe Kleine-König | Industrial Linux Solutions | http://www.pengutronix.de/ | -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: hrtimer problem on AT91RM9200 2009-09-11 23:07 ` Uwe Kleine-König @ 2009-09-14 10:50 ` Bo Hansen 2009-09-14 11:10 ` Uwe Kleine-König 0 siblings, 1 reply; 19+ messages in thread From: Bo Hansen @ 2009-09-14 10:50 UTC (permalink / raw) To: Uwe Kleine-König; +Cc: Thomas Gleixner, john stultz, rt-users [-- Attachment #1: Type: text/plain, Size: 2919 bytes --] Hi Uwe, I tried to apply your patch and enabled ftrace but have not been able to reproduce the panic. It should also be noted that the panic is much more rare if I apply the first patch you sent. The only panic I got was using your first patch and enabling ftrace. I have not yet had the time to try changing pr_info/pr_emerg. It seems to run stable when something like: "hrtimer: interrupt too slow, forcing clock min delta to 480771 ns" is found in the dmesg output, but I cannot be sure as I don't know if this is also written to the log just before the panic. Unfortunately I don't have much time for testing this at the moment but I'll get back as soon as possible. Best regards, Bo Uwe Kleine-König wrote: > Hello, > > On Fri, Sep 11, 2009 at 05:20:44PM +0200, Uwe Kleine-König wrote: > >> Hello Bo, >> >> In the meantime I got access to an at91rm9200, too. To help me >> reproducing the problem: >> > I still cannot reproduce, but I found something anyhow. > > The problem is that hrtimer_interrupt_hanging decreases min_delta_ns. > (Initially it's 61036.) > > I talked to jstultz on irc and both of us are unsure if asserting that > min_delta_ns isn't decreased in hrtimer_interrupt_hanging is enough or > if there is another problem. > > Bo, can you please apply the patch below, pass the kernel parameter > ftrace_dump_on_oops (or alternatively do > > # echo 1 > /proc/sys/kernel/ftrace_dump_on_oops > > ), reproduce the problem and send the resulting oops? > > Best regards > Uwe > > diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c > index 51bd089..cd72ca9 100644 > --- a/arch/arm/kernel/traps.c > +++ b/arch/arm/kernel/traps.c > @@ -18,6 +18,7 @@ > #include <linux/personality.h> > #include <linux/kallsyms.h> > #include <linux/delay.h> > +#include <linux/kdebug.h> > #include <linux/hardirq.h> > #include <linux/init.h> > #include <linux/uaccess.h> > @@ -223,6 +224,8 @@ static void __die(const char *str, int err, struct thread_info *thread, struct p > dump_backtrace(regs, tsk); > dump_instr(regs); > } > + > + notify_die(DIE_OOPS, str, regs, err, current->thread.trap_no, SIGSEGV); > } > > DEFINE_RAW_SPINLOCK(die_lock); > diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c > index 9e308ab..a0c05f3 100644 > --- a/kernel/hrtimer.c > +++ b/kernel/hrtimer.c > @@ -1390,8 +1390,16 @@ void hrtimer_interrupt(struct clock_event_device *dev) > > /* Reprogramming necessary ? */ > if (expires_next.tv64 != KTIME_MAX) { > - if (tick_program_event(expires_next, force_clock_reprogram)) > + if (tick_program_event(expires_next, force_clock_reprogram)) { > + if (nr_retries > 1) > + trace_printk("tick_program_event failed, " > + "now=%lld, expires_next=%lld, " > + "nr_retries=%d\n", > + (long long)now.tv64, > + (long long)expires_next.tv64, > + nr_retries); > goto retry; > + } > } > > if (raise) > > [-- Attachment #2: 140909_kernel_panic_2.6.29.6-rt23-debug-uwe-ftrace --] [-- Type: text/plain, Size: 5275 bytes --] Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = c3a38000 [00000000] *pgd=23a0c031, *pte=00000000, *ppte=00000000 Internal error: Oops: 817 [#1] PREEMPT Modules linked in: CPU: 0 Not tainted (2.6.29.6-rt23 #1) PC is at clkevt32k_next_event+0x98/0xdc LR is at rt_mutex_unlock+0x14/0x18 pc : [<c0030580>] lr : [<c027e2cc>] psr: 00000093 sp : c3a3dc58 ip : c3a3db68 fp : c3a3dc7c r10: 00000000 r9 : 00000000 r8 : c0322dd8 r7 : 00000067 r6 : 00000000 r5 : 00000001 r4 : 00000000 r3 : 00000000 r2 : 00010003 r1 : 60000093 r0 : 00000062 Flags: nzcv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user Control: c000717f Table: 23a38000 DAC: 00000015 Process cyclictest (pid: 572, stack limit = 0xc3a3c270) Stack: (0xc3a3dc58 to 0xc3a3e000) dc40: 0000d026 c0322dd8 dc60: c3a3dce0 00000000 000225c1 00000000 c3a3dca4 c3a3dc80 c005d7f4 c00304f8 dc80: 00000067 38db444d 00000067 38db1ca8 00000067 c0322dd8 c3a3dcf4 c3a3dca8 dca0: c005e514 c005d704 38db1ca8 00000067 38daf15c 00000000 00000000 c3a3dd38 dcc0: 38db1ca8 00000067 38726a67 38db444d 00000067 00000067 c03271f8 38db444d dce0: 00000067 c3a3c000 c3a3dd14 c3a3dcf8 c005e5c0 c005e454 00000001 00000001 dd00: c03271f8 38daf15c c3a3dd74 c3a3dd18 c00551a0 c005e59c 3fcd8e40 c0322dd8 dd20: 38daf15c 00000067 c0378b90 c03271f8 00000001 00000000 38daf15c 00000067 dd40: 14f46b04 c0030cf0 00000000 c0322db0 c3a3c000 c0322db0 00010003 00000001 dd60: c3a3c000 00000000 c3a3dd94 c3a3dd78 c003067c c0054f68 c0033944 c0322db0 dd80: c3a3c000 c0322db0 c3a3ddcc c3a3dd98 c0068fbc c00305d4 c393601c 00000000 dda0: 00000000 c0328068 c3a3c000 c0322db0 00000001 00000002 c3a3c000 c03247a8 ddc0: c3a3ddec c3a3ddd0 c006b9b0 c0068f64 00000001 c0335330 00000000 00000003 dde0: c3a3de0c c3a3ddf0 c0026070 c006b8b8 c0033944 ffffffff fefff000 00000001 de00: c3a3de7c c3a3de10 c00269fc c0026010 c38ce120 00000001 00000013 0000086c de20: 00000000 c3a3c000 c3918800 c3918990 c38ce120 00000000 c03247a8 c3a3de7c de40: c3a3de30 c3a3de58 c0031c64 c027d0d0 00000013 ffffffff c3a3c000 00000000 de60: 00000000 00000000 00000000 c3a3def8 c3a3de94 c3a3de80 c027d5b8 c027cfa4 de80: 00000001 3b9aca00 c3a3ded4 c3a3de98 c027de28 c027d59c 00000000 00000000 dea0: c3a3c000 00000000 c003f52c 00000000 00000000 00000000 c3a3df80 c3a3c000 dec0: 00000000 00000000 c3a3df64 c3a3ded8 c0055ae8 c027dd60 00000000 00000000 dee0: 007610eb 00000000 39aff350 00000067 00000000 00000000 c3a77ef8 00000000 df00: 00000000 00000000 39aff350 00000067 39aff350 00000067 c0054e74 c03271f8 df20: 00000001 c3a3df24 c3a3df24 00000001 c3918800 c003de3c c3a3df88 00000001 df40: 00000001 00000000 c3a3df80 c0026fa8 c3a3c000 00017194 c3a3df7c c3a3df68 df60: c004f578 c0055a3c c0026fa8 00000001 c3a3dfa4 c3a3df80 c004f6ac c004f558 df80: 00000067 39aff350 00000001 00000000 4516ddec 00000109 00000000 c3a3dfa8 dfa0: c0026de0 c004f58c 00000001 00000000 00000001 00000001 4516ddec 00000000 dfc0: 00000001 00000000 4516ddec 00000109 00000001 00015c60 00017194 4516ddf4 dfe0: 00000000 4516dc50 4004680c 4004682c 60000010 00000001 00000000 00000000 Backtrace: [<c00304e8>] (clkevt32k_next_event+0x0/0xdc) from [<c005d7f4>] (clockevents_program_event+0x100/0x16c) r6:00000000 r5:000225c1 r4:00000000 [<c005d6f4>] (clockevents_program_event+0x0/0x16c) from [<c005e514>] (tick_dev_program_event+0xd0/0xfc) r8:c0322dd8 r7:00000067 r6:38db1ca8 r5:00000067 r4:38db444d [<c005e444>] (tick_dev_program_event+0x0/0xfc) from [<c005e5c0>] (tick_program_event+0x34/0x40) [<c005e58c>] (tick_program_event+0x0/0x40) from [<c00551a0>] (hrtimer_interrupt+0x248/0x2ec) r5:38daf15c r4:c03271f8 [<c0054f58>] (hrtimer_interrupt+0x0/0x2ec) from [<c003067c>] (at91rm9200_timer_interrupt+0xb8/0xcc) [<c00305c4>] (at91rm9200_timer_interrupt+0x0/0xcc) from [<c0068fbc>] (handle_IRQ_event+0x68/0x1d8) r6:c0322db0 r5:c3a3c000 r4:c0322db0 [<c0068f54>] (handle_IRQ_event+0x0/0x1d8) from [<c006b9b0>] (handle_level_irq+0x108/0x178) [<c006b8a8>] (handle_level_irq+0x0/0x178) from [<c0026070>] (_text+0x70/0x90) r7:00000003 r6:00000000 r5:c0335330 r4:00000001 [<c0026000>] (_text+0x0/0x90) from [<c00269fc>] (__irq_svc+0x3c/0x80) Exception stack(0xc3a3de10 to 0xc3a3de58) de00: c38ce120 00000001 00000013 0000086c de20: 00000000 c3a3c000 c3918800 c3918990 c38ce120 00000000 c03247a8 c3a3de7c de40: c3a3de30 c3a3de58 c0031c64 c027d0d0 00000013 ffffffff r6:00000001 r5:fefff000 r4:ffffffff [<c027cf94>] (__schedule+0x0/0x3b0) from [<c027d5b8>] (schedule+0x2c/0x48) [<c027d58c>] (schedule+0x0/0x48) from [<c027de28>] (do_nanosleep+0xd8/0x118) r4:3b9aca00 [<c027dd50>] (do_nanosleep+0x0/0x118) from [<c0055ae8>] (hrtimer_nanosleep+0xbc/0x144) [<c0055a2c>] (hrtimer_nanosleep+0x0/0x144) from [<c004f578>] (common_nsleep+0x30/0x34) [<c004f548>] (common_nsleep+0x0/0x34) from [<c004f6ac>] (sys_clock_nanosleep+0x130/0x154) r4:00000001 [<c004f57c>] (sys_clock_nanosleep+0x0/0x154) from [<c0026de0>] (ret_fast_syscall+0x0/0x2c) r7:00000109 r6:4516ddec r5:00000000 r4:00000001 Code: e59f1040 e88d5000 eb00269f e3a03000 (e5833000) Kernel panic - not syncing: Fatal exception in interrupt Dumping ftrace buffer: (ftrace buffer empty) ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: hrtimer problem on AT91RM9200 2009-09-14 10:50 ` Bo Hansen @ 2009-09-14 11:10 ` Uwe Kleine-König 2009-09-15 9:29 ` [SOLVED, RFC] " Uwe Kleine-König 0 siblings, 1 reply; 19+ messages in thread From: Uwe Kleine-König @ 2009-09-14 11:10 UTC (permalink / raw) To: Bo Hansen; +Cc: Thomas Gleixner, john stultz, rt-users Hello Bo, On Mon, Sep 14, 2009 at 12:50:32PM +0200, Bo Hansen wrote: > I tried to apply your patch and enabled ftrace but have not been able to > reproduce > the panic. > It should also be noted that the panic is much more rare if I apply the > first patch you > sent. Revert this patch, only use the second. And optimally add a BUG() in hrtimer_interrupt_hanging (in kernel/hrtimer.c) after the printk. > The only panic I got was using your first patch and enabling ftrace. I > have not yet had > the time to try changing pr_info/pr_emerg. In the meantime I understood why the 2nd printk doesn't appear on the serial console. That's because "non-atomic" consoles don't print anything in atomic context (with PREEMPT_RT=y) > It seems to run stable when something like: "hrtimer: interrupt too > slow, forcing clock > min delta to 480771 ns" is found in the dmesg output, but I cannot be > sure as I don't > know if this is also written to the log just before the panic. Actually this is a bug because then min_delta_ns is decreased. With the BUG as suggested above this message would be enough to debug this further. > Unfortunately I don't have much time for testing this at the moment but > I'll get back > as soon as possible. :-| Best regards Uwe -- Pengutronix e.K. | Uwe Kleine-König | Industrial Linux Solutions | http://www.pengutronix.de/ | -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 19+ messages in thread
* [SOLVED, RFC] Re: hrtimer problem on AT91RM9200 2009-09-14 11:10 ` Uwe Kleine-König @ 2009-09-15 9:29 ` Uwe Kleine-König 2009-09-15 15:29 ` Thomas Gleixner 2009-09-16 6:26 ` Bo Hansen 0 siblings, 2 replies; 19+ messages in thread From: Uwe Kleine-König @ 2009-09-15 9:29 UTC (permalink / raw) To: Bo Hansen; +Cc: Thomas Gleixner, john stultz, rt-users Hello, I'm able to reproduce this now. The problem is really that hrtimer_interrupt() tries to program the next timer event which fails 4 times in a row, because the expiry time is already in the past when tick_dev_program_event is called. This happens because there are so much timers with small expiry times. (cyclictest -t40 on an UP machine) Addionally the fourth interation was short (probably because then there were no expired timers to check for) and so the try_time argument to hrtimer_interrupt_hanging was small and so min_delta_ns was decreased instead of increased. I wonder why increasing min_delta_ns is supposed to fix something here. Actually some pressure that results in hrtimer_interrupt_hanging being called can always happen independently of min_delta_ns' value. And moreover, is it sensible to change the value of force_clock_reprogram? This makes the system harder to benchmark, because to guarantee a certain latency now you have to check both situations force_clock_reprogram==1 and force_clock_reprogram==0. My suggestion is to do the following: - fix hrtimer_interrupt_hanging not to decrease min_delta_ns - in clockevents_program_event instead of if (delta <= 0) return -ETIME; do if (delta <= -dev->min_delta_ns) return -ETIME; I'm not sure though if -dev->min_delta_ns is the right value or if that is a good idea at all. Best regards Uwe -- Pengutronix e.K. | Uwe Kleine-König | Industrial Linux Solutions | http://www.pengutronix.de/ | -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [SOLVED, RFC] Re: hrtimer problem on AT91RM9200 2009-09-15 9:29 ` [SOLVED, RFC] " Uwe Kleine-König @ 2009-09-15 15:29 ` Thomas Gleixner 2009-09-15 18:40 ` Uwe Kleine-König 2009-09-16 6:26 ` Bo Hansen 1 sibling, 1 reply; 19+ messages in thread From: Thomas Gleixner @ 2009-09-15 15:29 UTC (permalink / raw) To: Uwe Kleine-König; +Cc: Bo Hansen, john stultz, rt-users [-- Attachment #1: Type: TEXT/PLAIN, Size: 1870 bytes --] On Tue, 15 Sep 2009, Uwe Kleine-König wrote: > Hello, > > I'm able to reproduce this now. The problem is really that > hrtimer_interrupt() tries to program the next timer event which fails 4 > times in a row, because the expiry time is already in the past when > tick_dev_program_event is called. This happens because there are so > much timers with small expiry times. (cyclictest -t40 on an UP machine) Pretty much the case which triggers the hang mechanism. > Addionally the fourth interation was short (probably because then there > were no expired timers to check for) and so the try_time argument to > hrtimer_interrupt_hanging was small and so min_delta_ns was decreased > instead of increased. He, what is decreasing min_delta_ns ? There is no such code. > I wonder why increasing min_delta_ns is supposed to fix something here. > Actually some pressure that results in hrtimer_interrupt_hanging being > called can always happen independently of min_delta_ns' value. It ensures that the timer interrupts are delayed sufficiently that something else than the timer interrupt has a chance to get on the CPU. > And moreover, is it sensible to change the value of > force_clock_reprogram? This makes the system harder to benchmark, > because to guarantee a certain latency now you have to check both > situations force_clock_reprogram==1 and force_clock_reprogram==0. > > My suggestion is to do the following: > - fix hrtimer_interrupt_hanging not to decrease min_delta_ns It does not decrease anything, so what's to fix there ? > - in clockevents_program_event instead of > if (delta <= 0) > return -ETIME; > do > if (delta <= -dev->min_delta_ns) > return -ETIME; > > I'm not sure though if -dev->min_delta_ns is the right value or if > that is a good idea at all. You might expire timers early, which is wrong. Thanks, tglx ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [SOLVED, RFC] Re: hrtimer problem on AT91RM9200 2009-09-15 15:29 ` Thomas Gleixner @ 2009-09-15 18:40 ` Uwe Kleine-König 2009-09-15 19:06 ` Thomas Gleixner 0 siblings, 1 reply; 19+ messages in thread From: Uwe Kleine-König @ 2009-09-15 18:40 UTC (permalink / raw) To: Thomas Gleixner; +Cc: Bo Hansen, john stultz, rt-users Hello Thomas, On Tue, Sep 15, 2009 at 05:29:17PM +0200, Thomas Gleixner wrote: > On Tue, 15 Sep 2009, Uwe Kleine-König wrote: > > Hello, > > > > I'm able to reproduce this now. The problem is really that > > hrtimer_interrupt() tries to program the next timer event which fails 4 > > times in a row, because the expiry time is already in the past when > > tick_dev_program_event is called. This happens because there are so > > much timers with small expiry times. (cyclictest -t40 on an UP machine) > > Pretty much the case which triggers the hang mechanism. OK. > > Addionally the fourth interation was short (probably because then there > > were no expired timers to check for) and so the try_time argument to > > hrtimer_interrupt_hanging was small and so min_delta_ns was decreased > > instead of increased. > > He, what is decreasing min_delta_ns ? There is no such code. hrtimer_interrupt_hanging() can decrease min_delta_ns. Look at Bo's reports: min_delta_ns is initially 61036 and he gets hrtimer: interrupt too slow, forcing clock min delta to 52482 ns If you want I can add a WARN_ON_ONCE((unsigned long)try_time.tv64 * 3 < dev->min_delta_ns) and send you the output of dmesg. > > I wonder why increasing min_delta_ns is supposed to fix something here. > > Actually some pressure that results in hrtimer_interrupt_hanging being > > called can always happen independently of min_delta_ns' value. > > It ensures that the timer interrupts are delayed sufficiently that > something else than the timer interrupt has a chance to get on the > CPU. Hhhhm, doesn't destroy that any upper bound for latency. If a process succeeds to install many timers that expire one after another with a short delta in between min_delta_ns goes up. This in turn mean that even high priority processes loose the ability to sleep a short period. > > And moreover, is it sensible to change the value of > > force_clock_reprogram? This makes the system harder to benchmark, > > because to guarantee a certain latency now you have to check both > > situations force_clock_reprogram==1 and force_clock_reprogram==0. > > > > My suggestion is to do the following: > > - fix hrtimer_interrupt_hanging not to decrease min_delta_ns > > It does not decrease anything, so what's to fix there ? http://git.pengutronix.de/?p=ukl/linux-2.6.git;a=commitdiff;h=a1b46573904d822ac5c7495ea1dbc8ef38200bf7 > > - in clockevents_program_event instead of > > if (delta <= 0) > > return -ETIME; > > do > > if (delta <= -dev->min_delta_ns) > > return -ETIME; > > > > I'm not sure though if -dev->min_delta_ns is the right value or if > > that is a good idea at all. > > You might expire timers early, which is wrong. The critic be right, I haven't thought much about that, didn't even try it. My thought was only that in the original case (delta <= 0) the event to program is in the past, so programming min_delta_ns for it isn't too early at all. Best regards Uwe -- Pengutronix e.K. | Uwe Kleine-König | Industrial Linux Solutions | http://www.pengutronix.de/ | -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [SOLVED, RFC] Re: hrtimer problem on AT91RM9200 2009-09-15 18:40 ` Uwe Kleine-König @ 2009-09-15 19:06 ` Thomas Gleixner 2009-09-15 20:54 ` Thomas Gleixner 0 siblings, 1 reply; 19+ messages in thread From: Thomas Gleixner @ 2009-09-15 19:06 UTC (permalink / raw) To: Uwe Kleine-König; +Cc: Bo Hansen, john stultz, rt-users [-- Attachment #1: Type: TEXT/PLAIN, Size: 1101 bytes --] On Tue, 15 Sep 2009, Uwe Kleine-König wrote: > On Tue, Sep 15, 2009 at 05:29:17PM +0200, Thomas Gleixner wrote: > > He, what is decreasing min_delta_ns ? There is no such code. > hrtimer_interrupt_hanging() can decrease min_delta_ns. > > Look at Bo's reports: min_delta_ns is initially 61036 and he gets > > hrtimer: interrupt too slow, forcing clock min delta to 52482 ns So the problem is that min delta is too small due to an unhandled 32bit overflow. But that's not a decrease by any means. It's simply an 32bit overflow bug which needs to be fixed. > > It ensures that the timer interrupts are delayed sufficiently that > > something else than the timer interrupt has a chance to get on the > > CPU. > Hhhhm, doesn't destroy that any upper bound for latency. If a process > succeeds to install many timers that expire one after another with a > short delta in between min_delta_ns goes up. This in turn mean that > even high priority processes loose the ability to sleep a short period. Yes, we know that already and it's on the list of things which need to be solved. Thanks, tglx ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [SOLVED, RFC] Re: hrtimer problem on AT91RM9200 2009-09-15 19:06 ` Thomas Gleixner @ 2009-09-15 20:54 ` Thomas Gleixner 0 siblings, 0 replies; 19+ messages in thread From: Thomas Gleixner @ 2009-09-15 20:54 UTC (permalink / raw) To: Uwe Kleine-König; +Cc: Bo Hansen, john stultz, rt-users [-- Attachment #1: Type: TEXT/PLAIN, Size: 1104 bytes --] On Tue, 15 Sep 2009, Thomas Gleixner wrote: > On Tue, 15 Sep 2009, Uwe Kleine-König wrote: > > On Tue, Sep 15, 2009 at 05:29:17PM +0200, Thomas Gleixner wrote: > > > He, what is decreasing min_delta_ns ? There is no such code. > > hrtimer_interrupt_hanging() can decrease min_delta_ns. > > > > Look at Bo's reports: min_delta_ns is initially 61036 and he gets > > > > hrtimer: interrupt too slow, forcing clock min delta to 52482 ns > > So the problem is that min delta is too small due to an unhandled > 32bit overflow. > > But that's not a decrease by any means. It's simply an 32bit overflow > bug which needs to be fixed. Ok. I read the whole thread back to avoid further confusion. Looking at the code it's pretty simple to explain. The time delta is multiplied times 3 and unconditionally stored in min_delta_ns. That needs to be fixed but the real problem is how the delta is calculated. We take it from the last iteration and not across the 5 iterations which trigger the hang check. I'm queuing a fix for that for maninline and the -rt11 release (I already uploaded -rt10). Thanks, tglx ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [SOLVED, RFC] Re: hrtimer problem on AT91RM9200 2009-09-15 9:29 ` [SOLVED, RFC] " Uwe Kleine-König 2009-09-15 15:29 ` Thomas Gleixner @ 2009-09-16 6:26 ` Bo Hansen 1 sibling, 0 replies; 19+ messages in thread From: Bo Hansen @ 2009-09-16 6:26 UTC (permalink / raw) To: Uwe Kleine-König; +Cc: Thomas Gleixner, john stultz, rt-users Hi Uwe, I'm happy to see you are able to reproduce the problem now. This is a bit over my head as a newbie in the kernel, but I will follow from the sideline ;-). Thank you for taking care of the problem. Best regards, Bo Uwe Kleine-König wrote: > Hello, > > I'm able to reproduce this now. The problem is really that > hrtimer_interrupt() tries to program the next timer event which fails 4 > times in a row, because the expiry time is already in the past when > tick_dev_program_event is called. This happens because there are so > much timers with small expiry times. (cyclictest -t40 on an UP machine) > > Addionally the fourth interation was short (probably because then there > were no expired timers to check for) and so the try_time argument to > hrtimer_interrupt_hanging was small and so min_delta_ns was decreased > instead of increased. > > I wonder why increasing min_delta_ns is supposed to fix something here. > Actually some pressure that results in hrtimer_interrupt_hanging being > called can always happen independently of min_delta_ns' value. > And moreover, is it sensible to change the value of > force_clock_reprogram? This makes the system harder to benchmark, > because to guarantee a certain latency now you have to check both > situations force_clock_reprogram==1 and force_clock_reprogram==0. > > My suggestion is to do the following: > - fix hrtimer_interrupt_hanging not to decrease min_delta_ns > - in clockevents_program_event instead of > if (delta <= 0) > return -ETIME; > do > if (delta <= -dev->min_delta_ns) > return -ETIME; > > I'm not sure though if -dev->min_delta_ns is the right value or if > that is a good idea at all. > > Best regards > Uwe > > -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: hrtimer problem on AT91RM9200 2009-09-11 15:20 ` Uwe Kleine-König 2009-09-11 23:07 ` Uwe Kleine-König @ 2009-09-14 9:09 ` Bo Hansen 1 sibling, 0 replies; 19+ messages in thread From: Bo Hansen @ 2009-09-14 9:09 UTC (permalink / raw) To: Uwe Kleine-König; +Cc: Thomas Gleixner, john stultz, rt-users Hi Uwe, I use cyclictest -p95 -n -i2000 -t40 and 3x ping -s65000. In the beginning I had hackbench 4 running also, but actually the cyclictest was enough to trigger the panic. Also I'm running from NFS rootfs and running cyclictest from an ssh connection. Best regards Bo Uwe Kleine-König wrote: > Hello Bo, > > In the meantime I got access to an at91rm9200, too. To help me > reproducing the problem: > > - which options do you use for cyclictest? > - Is the system idle or do you run hackbench or similar in parallel? > - anything else? > > Best regards > Uwe > > -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2009-09-16 6:26 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-08-19 12:24 hrtimer problem on AT91RM9200 Bo Hansen
2009-08-19 15:33 ` Remy Bohmer
2009-08-20 19:52 ` Uwe Kleine-König
2009-08-21 12:57 ` Bo Hansen
[not found] ` <fae6a9700908232327u65fca65bt97bb8c43a91345cd@mail.gmail.com>
2009-08-24 6:30 ` Vivek Satpute
2009-09-03 14:12 ` Uwe Kleine-König
2009-09-10 6:44 ` Bo Hansen
2009-09-10 9:51 ` Uwe Kleine-König
2009-09-11 15:20 ` Uwe Kleine-König
2009-09-11 23:07 ` Uwe Kleine-König
2009-09-14 10:50 ` Bo Hansen
2009-09-14 11:10 ` Uwe Kleine-König
2009-09-15 9:29 ` [SOLVED, RFC] " Uwe Kleine-König
2009-09-15 15:29 ` Thomas Gleixner
2009-09-15 18:40 ` Uwe Kleine-König
2009-09-15 19:06 ` Thomas Gleixner
2009-09-15 20:54 ` Thomas Gleixner
2009-09-16 6:26 ` Bo Hansen
2009-09-14 9:09 ` Bo Hansen
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.