linux-rt-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Large latencies with 4.4.1-rt6
@ 2016-02-19 17:02 Martin.Wirth
  2016-02-25 12:34 ` Tim Sander
  2016-02-25 15:29 ` Sebastian Andrzej Siewior
  0 siblings, 2 replies; 9+ messages in thread
From: Martin.Wirth @ 2016-02-19 17:02 UTC (permalink / raw)
  To: linux-rt-users

Hi,

I today switched from 4.1.15-rt17 to 4.4.1-rt6.
Now I'm seeing large latencies of more than 1 ms.
They occur sporadic ( every few minutes) and show up for example running 

cyclictest -l10000000 -m -Sp99 -i200

I tried to catch a trace and got the following: 


# tracer: preemptoff
#
# preemptoff latency trace v1.1.5 on 4.4.1-rt6+
# --------------------------------------------------------------------
# latency: 1099 us, #79/79, CPU#2 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
#    -----------------
#    | task: firefox-3736 (uid:1001 nice:0 policy:0 rt_prio:0)
#    -----------------
#  => started at: worker_thread
#  => ended at:   smp_apic_timer_interrupt
#
#
#                   _--------=> CPU#              
#                  / _-------=> irqs-off          
#                 | / _------=> need-resched      
#                 || / _-----=> need-resched_lazy 
#                 ||| / _----=> hardirq/softirq   
#                 |||| / _---=> preempt-depth     
#                 ||||| / _--=> preempt-lazy-depth
#                 |||||| / _-=> migrate-disable   
#                 ||||||| /     delay             
#  cmd     pid    |||||||| time  |   caller       
#     \   /      ||||||||  \   |   /            
   <...>-3709    2....1..    0us : schedule <-worker_thread
   <...>-3709    2d...1..    0us : preempt_count_add <-_raw_spin_lock_irq
   <...>-3709    2d...2..    0us : deactivate_task <-__schedule
   <...>-3709    2d...2..    0us : update_rq_clock.part.88 <-deactivate_task
   <...>-3709    2d...2..    1us : dequeue_task_fair <-deactivate_task
   <...>-3709    2d...2..    1us : update_curr <-dequeue_task_fair
   <...>-3709    2d...2..    1us : update_min_vruntime <-update_curr
   <...>-3709    2d...2..    1us : clear_buddies <-dequeue_task_fair
   <...>-3709    2d...2..    1us : account_entity_dequeue <-dequeue_task_fair
   <...>-3709    2d...2..    1us : update_min_vruntime <-dequeue_task_fair
   <...>-3709    2d...2..    1us : update_cfs_shares <-dequeue_task_fair
   <...>-3709    2d...2..    1us : hrtick_update <-dequeue_task_fair
   <...>-3709    2d...2..    2us : pick_next_task_fair <-__schedule
   <...>-3709    2d...2..    2us : pick_next_entity <-pick_next_task_fair
   <...>-3709    2d...2..    2us : clear_buddies <-pick_next_entity
   <...>-3709    2d...2..    2us : pick_next_entity <-pick_next_task_fair
   <...>-3709    2d...2..    2us : clear_buddies <-pick_next_entity
   <...>-3709    2d...2..    2us : set_next_entity <-pick_next_task_fair
   <...>-3709    2d...2..    2us : update_stats_wait_end <-set_next_entity
   <...>-3709    2d...2..    2us : put_prev_entity <-pick_next_task_fair
   <...>-3709    2d...2..    2us : set_next_entity <-pick_next_task_fair
   <...>-3709    2d...2..    2us : update_stats_wait_end <-set_next_entity
 firefox-3736    2d...2..    3us : finish_task_switch <-__schedule
 firefox-3736    2d...2..    3us#: _raw_spin_unlock_irq <-finish_task_switch
 firefox-3736    2d...... 1091us : smp_apic_timer_interrupt <-apic_timer_interrupt
 firefox-3736    2d...... 1092us : irq_enter <-smp_apic_timer_interrupt
 firefox-3736    2d...... 1092us : rcu_irq_enter <-irq_enter
 firefox-3736    2d...... 1092us : irqtime_account_irq <-irq_enter
 firefox-3736    2d...... 1092us : in_serving_softirq <-irqtime_account_irq
 firefox-3736    2d...... 1092us : preempt_count_add <-irq_enter
 firefox-3736    2d..h... 1093us : exit_idle <-smp_apic_timer_interrupt
 firefox-3736    2d..h... 1093us : local_apic_timer_interrupt <-smp_apic_timer_interrupt
 firefox-3736    2d..h... 1093us : hrtimer_interrupt <-local_apic_timer_interrupt
 firefox-3736    2d..h... 1093us : _raw_spin_lock <-hrtimer_interrupt
 firefox-3736    2d..h... 1093us : preempt_count_add <-_raw_spin_lock
 firefox-3736    2d..h1.. 1093us : ktime_get_update_offsets_now <-hrtimer_interrupt
 firefox-3736    2d..h1.. 1093us : __hrtimer_run_queues <-hrtimer_interrupt
 firefox-3736    2d..h1.. 1094us : __remove_hrtimer <-__hrtimer_run_queues
 firefox-3736    2d..h1.. 1094us : preempt_count_sub <-__hrtimer_run_queues
 firefox-3736    2d..h... 1094us : hrtimer_wakeup <-__hrtimer_run_queues
 firefox-3736    2d..h... 1094us : wake_up_process <-hrtimer_wakeup
 firefox-3736    2d..h... 1094us : try_to_wake_up <-wake_up_process
 firefox-3736    2d..h... 1094us : _raw_spin_lock_irqsave <-try_to_wake_up
 firefox-3736    2d..h... 1094us : preempt_count_add <-_raw_spin_lock_irqsave
 firefox-3736    2d..h1.. 1095us : _raw_spin_lock <-try_to_wake_up
 firefox-3736    2d..h1.. 1095us : preempt_count_add <-_raw_spin_lock
 firefox-3736    2d..h2.. 1095us : ttwu_do_activate.constprop.102 <-try_to_wake_up
 firefox-3736    2d..h2.. 1095us : activate_task <-ttwu_do_activate.constprop.102
 firefox-3736    2d..h2.. 1095us : update_rq_clock.part.88 <-activate_task
 firefox-3736    2d..h2.. 1095us : enqueue_task_rt <-activate_task
 firefox-3736    2d..h2.. 1095us : dequeue_rt_stack <-enqueue_task_rt
 firefox-3736    2d..h2.. 1095us : dequeue_top_rt_rq <-dequeue_rt_stack
 firefox-3736    2d..h2.. 1096us : cpupri_set <-enqueue_task_rt
 firefox-3736    2d..h2.. 1096us : update_rt_migration <-enqueue_task_rt
 firefox-3736    2d..h2.. 1096us : _raw_spin_lock <-enqueue_task_rt
 firefox-3736    2d..h2.. 1096us : preempt_count_add <-_raw_spin_lock
 firefox-3736    2d..h3.. 1096us : preempt_count_sub <-enqueue_task_rt
 firefox-3736    2d..h2.. 1096us : enqueue_top_rt_rq <-enqueue_task_rt
 firefox-3736    2d..h2.. 1096us : ttwu_do_wakeup <-ttwu_do_activate.constprop.102
 firefox-3736    2d..h2.. 1097us : check_preempt_curr <-ttwu_do_wakeup
 firefox-3736    2d..h2.. 1097us : resched_curr <-check_preempt_curr
 firefox-3736    2dN.h2.. 1097us : task_woken_rt <-ttwu_do_wakeup
 firefox-3736    2dN.h2.. 1097us : preempt_count_sub <-try_to_wake_up
 firefox-3736    2dN.h1.. 1097us : _raw_spin_unlock_irqrestore <-try_to_wake_up
 firefox-3736    2dN.h1.. 1097us : preempt_count_sub <-_raw_spin_unlock_irqrestore
 firefox-3736    2dN.h... 1097us : _raw_spin_lock <-__hrtimer_run_queues
 firefox-3736    2dN.h... 1097us : preempt_count_add <-_raw_spin_lock
 firefox-3736    2dN.h1.. 1098us : __hrtimer_get_next_event <-hrtimer_interrupt
 firefox-3736    2dN.h1.. 1098us : preempt_count_sub <-hrtimer_interrupt
 firefox-3736    2dN.h... 1098us : tick_program_event <-hrtimer_interrupt
 firefox-3736    2dN.h... 1098us : clockevents_program_event <-tick_program_event
 firefox-3736    2dN.h... 1098us : ktime_get <-clockevents_program_event
 firefox-3736    2dN.h... 1098us : lapic_next_deadline <-clockevents_program_event
 firefox-3736    2dN.h... 1099us : irq_exit <-smp_apic_timer_interrupt
 firefox-3736    2dN.h... 1099us : irqtime_account_irq <-irq_exit
 firefox-3736    2dN.h... 1099us : preempt_count_sub <-irq_exit
 firefox-3736    2dN.h... 1099us : irq_exit <-smp_apic_timer_interrupt
 firefox-3736    2dN.h... 1099us : trace_preempt_on <-smp_apic_timer_interrupt
 firefox-3736    2dN.h... 1106us : <stack trace>
 => preempt_count_sub
 => irq_exit
 => smp_apic_timer_interrupt
 => apic_timer_interrupt

This is a i7-4700EQ CPU @ 2.40GHz machine  

uname -a output is
Linux su02li 4.4.1-rt6+ #26 SMP PREEMPT RT Fri Feb 19 16:05:57 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Does the above trace ring a bell or is it useless?

Thanks,

Martin




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Large latencies with 4.4.1-rt6
  2016-02-19 17:02 Large latencies with 4.4.1-rt6 Martin.Wirth
@ 2016-02-25 12:34 ` Tim Sander
  2016-02-25 12:47   ` AW: " Martin.Wirth
  2016-02-25 15:29 ` Sebastian Andrzej Siewior
  1 sibling, 1 reply; 9+ messages in thread
From: Tim Sander @ 2016-02-25 12:34 UTC (permalink / raw)
  To: Martin.Wirth; +Cc: linux-rt-users

Hi Martin

Am Freitag, 19. Februar 2016, 17:02:20 schrieb Martin.Wirth@dlr.de:
> I today switched from 4.1.15-rt17 to 4.4.1-rt6.
> Now I'm seeing large latencies of more than 1 ms.
> They occur sporadic ( every few minutes) and show up for example running
It would be interesting to know the platform the rt kernel is running on?

I have also seen a latency increase with 4.4.1-rtX releases. I have the 
impression that some stable kernel updates made it worse. Care to test 4.4-rt3 
or 4.4-rt4? Had not the chance to pin them down by now.

Best regards
Tim

^ permalink raw reply	[flat|nested] 9+ messages in thread

* AW: Large latencies with 4.4.1-rt6
  2016-02-25 12:34 ` Tim Sander
@ 2016-02-25 12:47   ` Martin.Wirth
  2016-02-25 12:52     ` Tim Sander
  0 siblings, 1 reply; 9+ messages in thread
From: Martin.Wirth @ 2016-02-25 12:47 UTC (permalink / raw)
  To: tim; +Cc: linux-rt-users

Hi Tim

> -----Ursprüngliche Nachricht-----
> Von: Tim Sander [mailto:tim@krieglstein.org]
> Gesendet: Donnerstag, 25. Februar 2016 13:34
> An: Wirth, Martin
> Cc: linux-rt-users@vger.kernel.org
> Betreff: Re: Large latencies with 4.4.1-rt6
> 
> Hi Martin
> 
> Am Freitag, 19. Februar 2016, 17:02:20 schrieb Martin.Wirth@dlr.de:
> > I today switched from 4.1.15-rt17 to 4.4.1-rt6.
> > Now I'm seeing large latencies of more than 1 ms.
> > They occur sporadic ( every few minutes) and show up for example
> > running
> It would be interesting to know the platform the rt kernel is running on?

This is an intel  i7-4700EQ CPU @ 2.40GHz machine  on an ADLINK cPCI-6530 board.

> I have also seen a latency increase with 4.4.1-rtX releases. I have the
> impression that some stable kernel updates made it worse. Care to test 4.4-
> rt3 or 4.4-rt4? Had not the chance to pin them down by now.
> 
The worst case cyclictest latencies (running over serveral hours) with pre 4.4.x-rt versions 
were below 10 µs for an idle system and below 23 µs for a kernel compile with make -j4.
Since this is a machine with is in lab-use during the day I hesitate a bit to try too 
early 4.4.x version which might be instable enough to eat my disk.

Best regards,

Martin
--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: AW: Large latencies with 4.4.1-rt6
  2016-02-25 12:47   ` AW: " Martin.Wirth
@ 2016-02-25 12:52     ` Tim Sander
  0 siblings, 0 replies; 9+ messages in thread
From: Tim Sander @ 2016-02-25 12:52 UTC (permalink / raw)
  To: Martin.Wirth; +Cc: linux-rt-users

Hi Martin
Am Donnerstag, 25. Februar 2016, 12:47:31 schrieb Martin.Wirth@dlr.de:
> > I have also seen a latency increase with 4.4.1-rtX releases. I have the
> > impression that some stable kernel updates made it worse. Care to test
> > 4.4-
> > rt3 or 4.4-rt4? Had not the chance to pin them down by now.
> 
> The worst case cyclictest latencies (running over serveral hours) with pre
> 4.4.x-rt versions were below 10 µs for an idle system and below 23 µs for a
> kernel compile with make -j4. Since this is a machine with is in lab-use
> during the day I hesitate a bit to try too early 4.4.x version which might
> be instable enough to eat my disk.
Well the 4.4 releases have been pretty stable for me on ARM but your milage 
may by different...

Best regards
Tim
--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Large latencies with 4.4.1-rt6
  2016-02-19 17:02 Large latencies with 4.4.1-rt6 Martin.Wirth
  2016-02-25 12:34 ` Tim Sander
@ 2016-02-25 15:29 ` Sebastian Andrzej Siewior
  2016-02-25 16:08   ` AW: " Martin.Wirth
  1 sibling, 1 reply; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2016-02-25 15:29 UTC (permalink / raw)
  To: Martin.Wirth; +Cc: linux-rt-users

* Martin.Wirth@dlr.de | 2016-02-19 17:02:20 [+0000]:

>Hi,
Hi,

>I today switched from 4.1.15-rt17 to 4.4.1-rt6.
>Now I'm seeing large latencies of more than 1 ms.
>They occur sporadic ( every few minutes) and show up for example running 
>
>cyclictest -l10000000 -m -Sp99 -i200
>
>I tried to catch a trace and got the following: 
>
>
>   <...>-3709    2d...2..    2us : set_next_entity <-pick_next_task_fair
>   <...>-3709    2d...2..    2us : update_stats_wait_end <-set_next_entity
> firefox-3736    2d...2..    3us : finish_task_switch <-__schedule
> firefox-3736    2d...2..    3us#: _raw_spin_unlock_irq <-finish_task_switch
> firefox-3736    2d...... 1091us : smp_apic_timer_interrupt <-apic_timer_interrupt
> firefox-3736    2d...... 1092us : irq_enter <-smp_apic_timer_interrupt
>Does the above trace ring a bell or is it useless?

This does not ring a bell. According to the trace your CPU took a break
after the unlock until the next interrupt. If this a fucntion trace I
would expect some functions after that.
You could try the v4.4 line as of:
 v4.4-rt2
 v4.4-rt3
 v4.4.1-rt5

to see if I introduced this somewhere by accident. However I haven't seen
this on any of my boxes.

>Thanks,
>
>Martin

Sebastian

^ permalink raw reply	[flat|nested] 9+ messages in thread

* AW: Large latencies with 4.4.1-rt6
  2016-02-25 15:29 ` Sebastian Andrzej Siewior
@ 2016-02-25 16:08   ` Martin.Wirth
  2016-02-25 16:29     ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 9+ messages in thread
From: Martin.Wirth @ 2016-02-25 16:08 UTC (permalink / raw)
  To: bigeasy; +Cc: linux-rt-users

Hi Sebastian

> -----Ursprüngliche Nachricht-----
> Von: Sebastian Andrzej Siewior [mailto:bigeasy@linutronix.de]
> Gesendet: Donnerstag, 25. Februar 2016 16:30
> An: Wirth, Martin
> Cc: linux-rt-users@vger.kernel.org
> Betreff: Re: Large latencies with 4.4.1-rt6
> 
> You could try the v4.4 line as of:
>  v4.4-rt2
>  v4.4-rt3
>  v4.4.1-rt5
> 

I will have access to the machine for longer testing next Monday again. But I remember that v4.4.1-rt5 showed the same long latencies.
Will try v4.4-rt3 and v4.4-rt2 beginning of next week.

Martin

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: AW: Large latencies with 4.4.1-rt6
  2016-02-25 16:08   ` AW: " Martin.Wirth
@ 2016-02-25 16:29     ` Sebastian Andrzej Siewior
  2016-02-29 17:02       ` AW: " Martin.Wirth
  0 siblings, 1 reply; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2016-02-25 16:29 UTC (permalink / raw)
  To: Martin.Wirth; +Cc: linux-rt-users

On 02/25/2016 05:08 PM, Martin.Wirth@dlr.de wrote:
> Hi Sebastian

Hi Martin,

> I will have access to the machine for longer testing next Monday again. But I remember that v4.4.1-rt5 showed the same long latencies.
> Will try v4.4-rt3 and v4.4-rt2 beginning of next week.

If this would be your first try I would suggest to run hwlatdetect in
case BIOS is calling for attention. However since you don't see this in
v4.1 is looks unlikely. It also might be help to have a log to see what
the other CPUs are doing around that time. Are they busy doing
something or do they pause as well. Also do try some fancy PM things
(but then again why now and not in v4.1 as well).

> Martin

Sebastian

^ permalink raw reply	[flat|nested] 9+ messages in thread

* AW: AW: Large latencies with 4.4.1-rt6
  2016-02-25 16:29     ` Sebastian Andrzej Siewior
@ 2016-02-29 17:02       ` Martin.Wirth
  2016-02-29 21:05         ` Thomas Gleixner
  0 siblings, 1 reply; 9+ messages in thread
From: Martin.Wirth @ 2016-02-29 17:02 UTC (permalink / raw)
  To: bigeasy; +Cc: linux-rt-users

Hi Sebastian,

> Von: Sebastian Andrzej Siewior [mailto:bigeasy@linutronix.de]
> Gesendet: Donnerstag, 25. Februar 2016 17:29
> An: Wirth, Martin
> Cc: linux-rt-users@vger.kernel.org
> Betreff: Re: AW: Large latencies with 4.4.1-rt6
> 
> On 02/25/2016 05:08 PM, Martin.Wirth@dlr.de wrote:
> > Hi Sebastian
> 
> Hi Martin,
> 
> > I will have access to the machine for longer testing next Monday again. But
> I remember that v4.4.1-rt5 showed the same long latencies.
> > Will try v4.4-rt3 and v4.4-rt2 beginning of next week.
> 
> If this would be your first try I would suggest to run hwlatdetect in case BIOS
> is calling for attention. However since you don't see this in
> v4.1 is looks unlikely. It also might be help to have a log to see what the other
> CPUs are doing around that time. Are they busy doing something or do they
> pause as well. Also do try some fancy PM things (but then again why now
> and not in v4.1 as well).
> 

I tested a bunch of kernels today and the outcome is:

4.4.0-rt2:   good
4.4.0-rt3:   good
4.4.1-rt4:   good
4.4.1-rt5:   bad
4.4.1-rt6:   bad
4.4.2-rt7:   bad
4.4.3-rt9:   bad
4.4.3-rt9 with commit b3b36ee61c780407737a321007a24cfe9c3668a7 reverted:   good

(good means max cyclictest latencies < 25 µs,  bad means latencies > 500 µs)

I'm not a kernel developer, so I have not the slightest clue if the revert makes sense or only is plastering over the real problem.

Best wishes,

Martin

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: AW: AW: Large latencies with 4.4.1-rt6
  2016-02-29 17:02       ` AW: " Martin.Wirth
@ 2016-02-29 21:05         ` Thomas Gleixner
  0 siblings, 0 replies; 9+ messages in thread
From: Thomas Gleixner @ 2016-02-29 21:05 UTC (permalink / raw)
  To: Martin.Wirth; +Cc: bigeasy, linux-rt-users

[-- Attachment #1: Type: TEXT/PLAIN, Size: 657 bytes --]

On Mon, 29 Feb 2016, Martin.Wirth@dlr.de wrote:
> I tested a bunch of kernels today and the outcome is:
> 
> 4.4.0-rt2:   good
> 4.4.0-rt3:   good
> 4.4.1-rt4:   good
> 4.4.1-rt5:   bad
> 4.4.1-rt6:   bad
> 4.4.2-rt7:   bad
> 4.4.3-rt9:   bad
> 4.4.3-rt9 with commit b3b36ee61c780407737a321007a24cfe9c3668a7 reverted:   good
> 
> (good means max cyclictest latencies < 25 µs,  bad means latencies > 500 µs)
> 
> I'm not a kernel developer, so I have not the slightest clue if the revert
> makes sense or only is plastering over the real problem.

The latter :)

Can you please send me your .config and tell me your cyclictes command line ?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2016-02-29 21:07 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-02-19 17:02 Large latencies with 4.4.1-rt6 Martin.Wirth
2016-02-25 12:34 ` Tim Sander
2016-02-25 12:47   ` AW: " Martin.Wirth
2016-02-25 12:52     ` Tim Sander
2016-02-25 15:29 ` Sebastian Andrzej Siewior
2016-02-25 16:08   ` AW: " Martin.Wirth
2016-02-25 16:29     ` Sebastian Andrzej Siewior
2016-02-29 17:02       ` AW: " Martin.Wirth
2016-02-29 21:05         ` Thomas Gleixner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).