Philippe Gerum wrote:
> On Sun, 2006-07-30 at 21:33 +0200, Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> On Sat, 2006-07-29 at 16:20 +0200, Jan Kiszka wrote:
>>>>>> :|func        6   xnintr_clock_handler (__ipipe_dispatch_wired)
>>>>>> :|func        6   xnintr_irq_handler (xnintr_clock_handler)
>>>>>> :|func        7   xnpod_announce_tick (xnintr_irq_handler)
>>>>>> :|func        8+  xntimer_do_tick_aperiodic (xnpod_announce_tick)
>>>>>> :|func        9   xnthread_periodic_handler (xntimer_do_tick_aperiodic)
>>>>>> :|func       10   xnpod_resume_thread (xnthread_periodic_handler)
>>>>>> :|[21559]    11+  xnpod_resume_thread (xnthread_periodic_handler)
>>>>>> :|func       13+  xnthread_periodic_handler (xntimer_do_tick_aperiodic)
>>>> ...
>>>>
>>>>>> :|func      363+  xnthread_periodic_handler (xntimer_do_tick_aperiodic)
>>>> That is a lot of overruns. I haven't counted, but it should be one
>>>> xnthread_periodic_handler per missed 100 us period (20000 / 100 = 200!).
>>>> [BTW, I think we should handle even this failure scenario without
>>>> looping.
>>> We need to loop in the aperiodic handler in order to catch timers that
>>> could have elapsed while processing the current tick. However,
>> No, that was not what I meant. I know that we need the timer loop. But I
>> was thinking of something like this for the tick handler's error path:
>>
>> if (unlikely((timer.date += timer.interval) < now))
>> 	timer.date = now + timer.interval -
>> 		(now - timer.date) % timer.interval;
>>
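To spell out what that snippet is meant to do, here is a minimal,
self-contained sketch. The struct and function names are made up for
illustration (they are not nucleus symbols), unlikely() is dropped so it
builds outside the kernel, and I assume dates and intervals share one
unsigned 64-bit time base. A late timer is moved in one step to the next
point on its original period grid that lies after "now", with no loop:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a periodic timer: only the two fields
 * used by the snippet above, both in the same time unit. */
struct sketch_timer {
	uint64_t date;     /* next expiry date */
	uint64_t interval; /* period length, must be non-zero */
};

/* Advance the timer by one period. If the new date is still in the
 * past, jump straight to the first grid point after 'now' instead of
 * stepping one period at a time. */
static void sketch_timer_forward(struct sketch_timer *t, uint64_t now)
{
	assert(t->interval > 0);

	t->date += t->interval;
	if (t->date < now)
		t->date = now + t->interval -
			(now - t->date) % t->interval;
}
```

For example, with date = 1000, interval = 100 and now = 21050, the
timer ends up at 21100 after a single call, still aligned to the
original 100-unit grid. The price is one 64-bit modulo, paid only on
the error path.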
>>> xnpod_wait_thread_period() - on which rt_task_wait_period() is based -
>>> does not loop, but rather computes the actual count of overruns by
>>> subtracting the current time from the deadline.
>> ...but by looping for some scenarios instead of dividing for all. Why
>> optimise the slow path here?
> 
> Division is utterly expensive, and jitter that would not fit
> in 32 bits is rare (and the definitive sign of serious brokenness anyway),
> so this is actually the fast error path which gets optimized.
> 
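If I read that argument correctly, the idea can be sketched as below.
This is only my interpretation, not the actual nucleus code: the
function name is hypothetical, and the fallback for delays that do not
fit in 32 bits is written as a plain 64-bit division purely to keep the
example short (I make no claim about what the real code does there).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of whole periods elapsed since
 * 'deadline'. All values use the same time unit. */
static uint64_t sketch_count_overruns(uint64_t now, uint64_t deadline,
				      uint32_t period)
{
	uint64_t late;

	assert(period > 0);

	if (now < deadline)
		return 0; /* not late at all */

	late = now - deadline;

	/* Common error case: the delay fits in 32 bits, so a cheap
	 * 32-bit division is enough. */
	if (late <= UINT32_MAX)
		return (uint32_t)late / period;

	/* Pathological case: the delay needs more than 32 bits. */
	return late / period;
}
```

With the numbers from the trace discussion (20000 us late, 100 us
period) this returns 200 without iterating once per missed period.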
>>> Which brings us an interesting question: why does the aperiodic handler
>>> loop frenetically in the first place? I would be pretty interested in
>>> checking the TSC values returned by xnarch_get_cpu_tsc() while spinning
>>> inside this deadly loop...
>> You can already read those TSCs: each trace point got recorded with the
>> current TSC value, fresh from the hardware.
>>
> 
> I'd like to understand why we don't see any routines other than
> xnthread_periodic_handler called from xntimer_do_tick_aperiodic in the
> call frame. Even in case of massive jitter (e.g. > 300 us late) in one
> shot, we should not spin in this code, due to the resync done in
> xnpod_wait_thread_timeout - assuming we only have a single outstanding
> timer (+ the host tick, but this should not be an issue).

xnpod_wait_thread_timeout? Do you mean xnpod_wait_thread_period? How
would that help us as long as we are in the tick handler?

> 
>> I rather think, also when looking at Julien's second trace, that we have
>> some issue with X in user-space here, probably in combination with weird
>> VIA hardware stalling IRQ delivery for a "few" microseconds. Let's see
>> if the irqbench gives similar results.
>>
> 
> The problem is that I can reproduce X-related jitter (> 2 ms in a row)
> on one of my test boxen when dragging windows over the screen, without
> triggering the NMI watchdog set to 100 us (and guess what, the chipset
> in question is from VIA).

Is NMI handling done entirely in the CPU, or does the chipset have any
influence as well? If so, I could imagine what VIA does here... Have you
already checked what irqbench records?

Jan