linux-rt-users.vger.kernel.org archive mirror
* hrtimer: interrupt took 6742 ns, then RT throttling and hung machine for nearly 2 seconds
@ 2013-04-15 11:02 Stanislav Meduna
  2013-04-15 11:54 ` Stanislav Meduna
  2013-04-15 12:22 ` Thomas Gleixner
  0 siblings, 2 replies; 19+ messages in thread
From: Stanislav Meduna @ 2013-04-15 11:02 UTC (permalink / raw)
  To: linux-rt-users@vger.kernel.org

Hi,

Apr 15 10:14:57 lnx kernel: [56281.700293] hrtimer: interrupt took 6742 ns
Apr 15 10:14:57 lnx kernel: [330740.000129] [sched_delayed] sched: RT throttling activated


From our application logs, the machine was basically hung for somewhere
between 1.71 and 1.73 seconds, then resumed normal operation. A 5 ms
timerfd_create timer returned 341 expirations; the 340 missed ones
correspond exactly to the 1.7 seconds.

None of the /sys/kernel/debug/tracing/latency_hist/* histograms report
anything unusual. They are all in the tens-of-microseconds range; only
wakeup shared prio is at 957 us between two application threads of the
same priority, which is expected.

It is not very probable that our application is the reason for the
throttling. We have our own monitoring of runaway tasks and it
did not kick in. Besides, the coincidence with the hrtimer
message looks very suspicious.
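
As background on the throttling itself: the RT budget is exposed through
the sched_rt_runtime_us and sched_rt_period_us sysctls. The sketch below
reads them and computes how long RT tasks can be held off per period. The
function names are made up for this example, and the values on the machine
in question were not checked; the mainline defaults are 950000 and 1000000,
and a runtime of -1 disables throttling.

```c
#include <stdio.h>

/* Read one integer from a procfs file. Returns 0 on success, -1 on failure. */
static int read_long(const char *path, long *out)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (!f)
        return -1;
    ok = (fscanf(f, "%ld", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* Microseconds per period during which RT tasks may be throttled.
 * Returns 0 when throttling is disabled (runtime -1) or cannot trigger. */
static long throttled_us(long runtime_us, long period_us)
{
    if (runtime_us < 0 || runtime_us >= period_us)
        return 0;
    return period_us - runtime_us;
}

/* Print the current RT budget. Returns 0 on success, -1 if unreadable. */
static int report_rt_budget(void)
{
    long runtime_us, period_us;

    if (read_long("/proc/sys/kernel/sched_rt_runtime_us", &runtime_us) < 0 ||
        read_long("/proc/sys/kernel/sched_rt_period_us", &period_us) < 0)
        return -1;
    printf("runtime %ld us, period %ld us, throttled up to %ld us per period\n",
           runtime_us, period_us, throttled_us(runtime_us, period_us));
    return 0;
}
```

With the default values, throttled_us(950000, 1000000) is 50000, so RT
tasks would be held off for up to 50 ms of each second once the budget is
exhausted.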


The kernel is 3.4.25-rt37 with full preempt on a 1 GHz Celeron M
industrial PC, ICH4 (ata_piix) used for ATA, Intel 82801DB PRO/100 VE
(e100) for ethernet.

Unfortunately it is not easily reproducible: it happens once
every several days and there is no obvious trigger.

Any hints?

Thanks
-- 
                                             Stano



Thread overview: 19+ messages
2013-04-15 11:02 hrtimer: interrupt took 6742 ns, then RT throttling and hung machine for nearly 2 seconds Stanislav Meduna
2013-04-15 11:54 ` Stanislav Meduna
2013-04-15 12:22 ` Thomas Gleixner
2013-04-15 12:56   ` Stanislav Meduna
2013-04-17 15:46     ` timerfd and softirqd [Was: Re: hrtimer: interrupt took 6742 ns, then RT throttling and hung machine for nearly 2 seconds] Stanislav Meduna
2013-04-18  9:11       ` timerfd read does not return [Was: Re: timerfd and softirqd] Stanislav Meduna
2013-04-19 19:53         ` Stanislav Meduna
2013-04-22  7:35           ` [PATCH] Re: timerfd read does not return Stanislav Meduna
2013-04-22  8:55             ` Stanislav Meduna
2013-04-27  8:34         ` timerfd read does not return - was probably fixed in 3.4.38 Stanislav Meduna
2013-04-28 11:53           ` Carsten Emde
2013-04-29  8:43             ` Stanislav Meduna
2013-05-02 20:02           ` Steven Rostedt
2013-05-10 12:42           ` timerfd read does not return - some traces Stanislav Meduna
2013-05-12 17:31             ` Stanislav Meduna
2013-05-12 23:20             ` timerfd read does not return - hangs inside put_user Stanislav Meduna
2013-05-13  8:05               ` timerfd read does not return - caused by MM fault Stanislav Meduna
2013-05-14  8:31                 ` Livelock in handle_pte_fault [Was: Re: timerfd read does not return] Stanislav Meduna
2013-11-25 10:36                   ` Vijay Katoch
