getnstimeofday stuck for several milliseconds?

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* getnstimeofday stuck for several milliseconds?
@ 2012-11-05  8:51 David Henningsson
  2012-11-12 23:53 ` John Stultz
  0 siblings, 1 reply; 5+ messages in thread
From: David Henningsson @ 2012-11-05  8:51 UTC (permalink / raw)
  To: linux-kernel; +Cc: tglx, johnstul

Hi LKML,

I'm trying to make audio more useful in everyday low-latency scenarios 
such as gaming or VOIP.

While doing so, I ran the wakeup_rt tracer, to track the time from 
PulseAudio requesting wakeup (through hrtimers), to the thread actually 
running.

I'm not sure how much overhead added by the wakeup_rt tracer itself, but 
I got 9 ms on one machine and 20 ms on another, which I consider to be 
quite a lot even for a standard kernel (i e without RT or other special 
configuration).

The 9 ms example is pastebinned at [1], and here's where we get stuck 
for most of the time:

   <idle>-0       3d... 1105us : ktime_get_real <-intel_idle
   <idle>-0       3d... 1106us!: getnstimeofday <-ktime_get_real
   <idle>-0       3d... 7823us : ktime_get_real <-intel_idle

   <idle>-0       3d... 7890us : ktime_get_real <-intel_idle
   <idle>-0       3d... 7891us!: getnstimeofday <-ktime_get_real
   <idle>-0       3d... 9023us : ktime_get_real <-intel_idle

It seems to me that sometimes we get stuck for several milliseconds 
inside the getnstimeofday function - this was seen on both the 9 ms and 
the 20 ms trace. This looks like a bug to me, and as I'm not sure on how 
to best debug it further, and therefore I'm asking for help (or a bug 
fix!) here.

For reference, the 9 ms trace was from a ~2 year old laptop (core i3 
cpu) running 3.7rc2 vanilla/mainline kernel, and the 20 ms trace was 
from an ~1 year old Atom-based machine running the 3.2-ubuntu kernel. 
While tracing was enabled, I was running a libSDL game for a minute or two.

Thanks in advance for looking into this, and let me know if you need 
further information, or anything else I can do to help sorting this one out.

-- 
David Henningsson, Canonical Ltd.
https://launchpad.net/~diwic

[1] http://pastebin.se/6iMRdDfR

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: getnstimeofday stuck for several milliseconds?
  2012-11-05  8:51 getnstimeofday stuck for several milliseconds? David Henningsson
@ 2012-11-12 23:53 ` John Stultz
  2012-11-13  0:09   ` John Stultz
  2012-11-13  0:15   ` Steven Rostedt
  0 siblings, 2 replies; 5+ messages in thread
From: John Stultz @ 2012-11-12 23:53 UTC (permalink / raw)
  To: David Henningsson; +Cc: linux-kernel, tglx, Rostedt, Steven

On 11/05/2012 12:51 AM, David Henningsson wrote:
> Hi LKML,
>
> I'm trying to make audio more useful in everyday low-latency scenarios 
> such as gaming or VOIP.
>
> While doing so, I ran the wakeup_rt tracer, to track the time from 
> PulseAudio requesting wakeup (through hrtimers), to the thread 
> actually running.
>
> I'm not sure how much overhead added by the wakeup_rt tracer itself, 
> but I got 9 ms on one machine and 20 ms on another, which I consider 
> to be quite a lot even for a standard kernel (i e without RT or other 
> special configuration).
>
> The 9 ms example is pastebinned at [1], and here's where we get stuck 
> for most of the time:
>
>   <idle>-0       3d... 1105us : ktime_get_real <-intel_idle
>   <idle>-0       3d... 1106us!: getnstimeofday <-ktime_get_real
>   <idle>-0       3d... 7823us : ktime_get_real <-intel_idle
>
>   <idle>-0       3d... 7890us : ktime_get_real <-intel_idle
>   <idle>-0       3d... 7891us!: getnstimeofday <-ktime_get_real
>   <idle>-0       3d... 9023us : ktime_get_real <-intel_idle
>
Its been awhile since I looked at wakeup_rt trace output, but that looks 
more like ~6.7ms and ~1.2ms latencies, not 9ms (are you adding these 
together?).

> It seems to me that sometimes we get stuck for several milliseconds 
> inside the getnstimeofday function - this was seen on both the 9 ms 
> and the 20 ms trace. This looks like a bug to me, and as I'm not sure 
> on how to best debug it further, and therefore I'm asking for help (or 
> a bug fix!) here.
>
> For reference, the 9 ms trace was from a ~2 year old laptop (core i3 
> cpu) running 3.7rc2 vanilla/mainline kernel, and the 20 ms trace was 
> from an ~1 year old Atom-based machine running the 3.2-ubuntu kernel. 
> While tracing was enabled, I was running a libSDL game for a minute or 
> two.
>
> Thanks in advance for looking into this, and let me know if you need 
> further information, or anything else I can do to help sorting this 
> one out.

Hrmm.. So 6.7ms is still a long time.

Looking at the trace you posted here: http://pastebin.se/6iMRdDfR

The trace also looks like its the cpuidle to interrupt transition where 
you're seeing this.  I sort of wonder if its mis-attributing the idle 
time to the getnstimeofday()? Mainly because you don't seem to spend 
much time in intel_idle() otherwise.

Or maybe we're both misreading it and its saying there's a delay between 
the first ktime_get_real() from intel_idle() to the second call of 
ktime_get_real(), between which we're in deep idle (which would make sense)?

Because unless the timekeeping lock is getting held for a long time, I 
don't know why else you'd see such long delays at getnstimeofday().

Cc'ing Steven to see if he can't help understand whats going on here.

thanks
-john




^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: getnstimeofday stuck for several milliseconds?
  2012-11-12 23:53 ` John Stultz
@ 2012-11-13  0:09   ` John Stultz
  2012-11-13  0:15   ` Steven Rostedt
  1 sibling, 0 replies; 5+ messages in thread
From: John Stultz @ 2012-11-13  0:09 UTC (permalink / raw)
  To: David Henningsson; +Cc: linux-kernel, tglx, Rostedt, Steven

On 11/12/2012 03:53 PM, John Stultz wrote:
> On 11/05/2012 12:51 AM, David Henningsson wrote:
>> Hi LKML,
>>
>> I'm trying to make audio more useful in everyday low-latency 
>> scenarios such as gaming or VOIP.
>>
>> While doing so, I ran the wakeup_rt tracer, to track the time from 
>> PulseAudio requesting wakeup (through hrtimers), to the thread 
>> actually running.
>>
>> I'm not sure how much overhead added by the wakeup_rt tracer itself, 
>> but I got 9 ms on one machine and 20 ms on another, which I consider 
>> to be quite a lot even for a standard kernel (i e without RT or other 
>> special configuration).
>>
>> The 9 ms example is pastebinned at [1], and here's where we get stuck 
>> for most of the time:
>>
>>   <idle>-0       3d... 1105us : ktime_get_real <-intel_idle
>>   <idle>-0       3d... 1106us!: getnstimeofday <-ktime_get_real
>>   <idle>-0       3d... 7823us : ktime_get_real <-intel_idle
>>
>>   <idle>-0       3d... 7890us : ktime_get_real <-intel_idle
>>   <idle>-0       3d... 7891us!: getnstimeofday <-ktime_get_real
>>   <idle>-0       3d... 9023us : ktime_get_real <-intel_idle
>>
>
> Looking at the trace you posted here: http://pastebin.se/6iMRdDfR
>
> The trace also looks like its the cpuidle to interrupt transition 
> where you're seeing this.  I sort of wonder if its mis-attributing the 
> idle time to the getnstimeofday()? Mainly because you don't seem to 
> spend much time in intel_idle() otherwise.
>
> Or maybe we're both misreading it and its saying there's a delay 
> between the first ktime_get_real() from intel_idle() to the second 
> call of ktime_get_real(), between which we're in deep idle (which 
> would make sense)?
>
The more I think about it, I'm pretty sure this is the case:
The full context you need is:
     <idle>-0       3d... 7890us : ktime_get_real <-intel_idle
     <idle>-0       3d... 7891us!: getnstimeofday <-ktime_get_real
     <idle>-0       3d... 9023us : ktime_get_real <-intel_idle
     <idle>-0       3d... 9024us : getnstimeofday <-ktime_get_real

Where intel_idle() is calling ktime_get_real twice in a row, and 
inbetween we see a large latency. Looking at intel_idle() the code in 
question is:

     kt_before = ktime_get_real();

     stop_critical_timings();
     if (!need_resched()) {

         __monitor((void *)&current_thread_info()->flags, 0, 0);
         smp_mb();
         if (!need_resched())
             __mwait(eax, ecx);
     }

     start_critical_timings();

     kt_after = ktime_get_real();


Where we're basically timing how long we were in idle for.

So I think the problem is just misreading the trace output.

thanks
-john





^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: getnstimeofday stuck for several milliseconds?
  2012-11-12 23:53 ` John Stultz
  2012-11-13  0:09   ` John Stultz
@ 2012-11-13  0:15   ` Steven Rostedt
  2012-11-13  8:26     ` David Henningsson
  1 sibling, 1 reply; 5+ messages in thread
From: Steven Rostedt @ 2012-11-13  0:15 UTC (permalink / raw)
  To: John Stultz; +Cc: David Henningsson, linux-kernel, tglx

On Mon, 2012-11-12 at 15:53 -0800, John Stultz wrote:

> Cc'ing Steven to see if he can't help understand whats going on here.

I don't trust the trace...


The wake up happened on CPU 2

pulseaud-2106    2d...    1us :   2106:109:R   + [003]  2115: 94:S <...>
pulseaud-2106    2d...    2us+: try_to_wake_up <-default_wake_function


    Xorg-1278    3d... 1038us : __switch_to_xtra <-__switch_to
    Xorg-1278    3d... 1039us : memset <-__switch_to_xtra
kworker/-37      3d... 1040us : finish_task_switch <-__schedule
kworker/-37      3.... 1040us : _raw_spin_lock_irq <-worker_thread
kworker/-37      3d... 1041us : need_more_worker <-worker_thread
kworker/-37      3d... 1041us : process_one_work <-worker_thread
kworker/-37      3.... 1042us : __wake_up <-wakeup_work_handler

A millisecond goes by, Xorg is schedules out for kworker (which I'm sure
is also less in priority). And we don't see any sign of need resched
being set (N).

kworker/-37      3d... 1082us : pick_next_task_idle <-__schedule
  <idle>-0       3d... 1083us : finish_task_switch <-__schedule
  <idle>-0       3.... 1083us : tick_nohz_idle_enter <-cpu_idle
  <idle>-0       3.... 1084us : set_cpu_sd_state_idle <-tick_nohz_idle_enter
  <idle>-0       3d... 1084us : __tick_nohz_idle_enter <-tick_nohz_idle_enter

Idle is even scheduled in here!

  <idle>-0       3d.h. 9047us : ttwu_do_wakeup <-ttwu_do_activate.constprop.86
  <idle>-0       3d.h. 9048us : check_preempt_curr <-ttwu_do_wakeup
  <idle>-0       3d.h. 9049us : resched_task <-check_preempt_curr
  <idle>-0       3dNh. 9049us : task_woken_rt <-ttwu_do_wakeup
  <idle>-0       3dNh. 9050us : ttwu_stat <-try_to_wake_up
  <idle>-0       3dNh. 9050us : _raw_spin_unlock_irqrestore <-try_to_wake_up
  <idle>-0       3dNh. 9051us : _raw_spin_lock <-__run_hrtimer

During an interrupt (h), we see another wakeup (not sure what this was
for), and the need resched is set (N).

  <idle>-0       3dNh. 9053us : lapic_next_event <-clockevents_program_event
  <idle>-0       3dNh. 9054us : irq_exit <-smp_apic_timer_interrupt
  <idle>-0       3dN.. 9055us : idle_cpu <-irq_exit
  <idle>-0       3dN.. 9055us : rcu_irq_exit <-irq_exit
  <idle>-0       3dN.. 9056us : rcu_eqs_enter_common.isra.40 <-rcu_irq_exit
  <idle>-0       3dN.. 9056us : rcu_prepare_for_idle <-rcu_eqs_enter_common.isra.40
  <idle>-0       3.N.. 9057us : menu_reflect <-cpuidle_idle_call
  <idle>-0       3.N.. 9058us : rcu_idle_exit <-cpu_idle
  <idle>-0       3dN.. 9058us : rcu_eqs_exit_common.isra.38 <-rcu_idle_exit
  <idle>-0       3dN.. 9059us : rcu_cleanup_after_idle <-rcu_eqs_exit_common.isra.38
  <idle>-0       3dN.. 9059us : del_timer <-rcu_cleanup_after_idle
  <idle>-0       3.N.. 9060us : tick_nohz_idle_exit <-cpu_idle
  <idle>-0       3dN.. 9060us : ktime_get <-tick_nohz_idle_exit
  <idle>-0       3dN.. 9061us : tick_do_update_jiffies64 <-tick_nohz_idle_exit
  <idle>-0       3dN.. 9061us : update_cpu_load_nohz <-tick_nohz_idle_exit
  <idle>-0       3dN.. 9061us : calc_load_exit_idle <-tick_nohz_idle_exit
  <idle>-0       3dN.. 9062us : touch_softlockup_watchdog <-tick_nohz_idle_exit
  <idle>-0       3dN.. 9062us : tick_nohz_restart <-tick_nohz_idle_exit
  <idle>-0       3dN.. 9063us : hrtimer_cancel <-tick_nohz_restart
  <idle>-0       3dN.. 9063us : hrtimer_try_to_cancel <-hrtimer_cancel

We exit the interrupt and then do a lot of stuff with interrupts
disabled? Oh, this looks like the exit from idle (rcu_idle_exit is
there). It's doing clean up stuff here.


  <idle>-0       3dN.. 9077us : _pick_next_task_rt <-pick_next_task_rt
  <idle>-0       3dN.. 9078us : dequeue_pushable_task <-pick_next_task_rt
  <idle>-0       3d... 9079us : probe_wakeup_sched_switch <-__schedule
  <idle>-0       3d... 9079us :      0:120:R ==> [003]  2115: 94:R <...>

Finally our task is scheduled. But this looks all screwed up to me. I
bet the clocks are not in sync. Do a:

echo global > /sys/kernel/debug/tracing/trace_clock

and this will use a slower clock, but one that is in sync between CPUs.

Also, enable the sched_switch and sched_wakeup trace points too.

cd /sys/kernel/debug/tracing/events/sched
echo 1 > sched_wakeup/enable
echo 1 > sched_switch/enable

-- Steve



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: getnstimeofday stuck for several milliseconds?
  2012-11-13  0:15   ` Steven Rostedt
@ 2012-11-13  8:26     ` David Henningsson
  0 siblings, 0 replies; 5+ messages in thread
From: David Henningsson @ 2012-11-13  8:26 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: John Stultz, linux-kernel, tglx, Arun Raghavan

On 11/13/2012 01:15 AM, Steven Rostedt wrote:
> On Mon, 2012-11-12 at 15:53 -0800, John Stultz wrote:
>
>> Cc'ing Steven to see if he can't help understand whats going on here.
>
> I don't trust the trace...

Thanks John and Steven!

I've redone the trace with the global clock and got a different stack 
trace, this time at 11 ms in total. (I don't know how much of these 11 
ms are caused from the tracing overhead?)

The result is here:

http://pastebin.se/jxxqf8pt

and most of the time it seems to be these lines repeating:

   compiz-1975    0.N.1   11us+: arch_flush_lazy_mmu_mode <-kmap_atomic_prot
   compiz-1975    0.N.1   16us : __kunmap_atomic <-drm_clflush_page
   compiz-1975    0.N.1   16us : native_flush_tlb_single <-__kunmap_atomic
   compiz-1975    0.N.1   17us : arch_flush_lazy_mmu_mode <-__kunmap_atomic
   compiz-1975    0.N..   18us : drm_clflush_page <-drm_clflush_sg
   compiz-1975    0.N..   18us : kmap_atomic <-drm_clflush_page
   compiz-1975    0.N..   19us : kmap_atomic_prot <-kmap_atomic

There are also occasionally sched* tasks going on at other CPUs. If you 
would excuse a layman's question - why can't we just schedule alsa-sink 
on another CPU, if this one is busy with doing graphics stuff?

For reference, test kernel and test case were the same this time around 
(3.7rc2, then playing a game for a few minutes).

-- 
David Henningsson, Canonical Ltd.
https://launchpad.net/~diwic

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2012-11-13  8:26 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-11-05  8:51 getnstimeofday stuck for several milliseconds? David Henningsson
2012-11-12 23:53 ` John Stultz
2012-11-13  0:09   ` John Stultz
2012-11-13  0:15   ` Steven Rostedt
2012-11-13  8:26     ` David Henningsson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox