hrtimer: interrupt took 6742 ns, then RT throttling and hung machine for nearly 2 seconds

linux-rt-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* hrtimer: interrupt took 6742 ns, then RT throttling and hung machine for nearly 2 seconds
@ 2013-04-15 11:02 Stanislav Meduna
  2013-04-15 11:54 ` Stanislav Meduna
  2013-04-15 12:22 ` Thomas Gleixner
  0 siblings, 2 replies; 19+ messages in thread
From: Stanislav Meduna @ 2013-04-15 11:02 UTC (permalink / raw)
  To: linux-rt-users@vger.kernel.org

Hi,

Apr 15 10:14:57 lnx kernel: [56281.700293] hrtimer: interrupt took 6742 ns
Apr 15 10:14:57 lnx kernel: [330740.000129] [sched_delayed] sched: RT
throttling activated

>From our application logs the machine was basically hung for something
between 1.71 - 1.73 seconds, then resumed normal operation. A 5ms
timerfd_create timer returned 341 expirations, the 340 missed
exactly correspond to the 1.7 seconds.

None of the /sys/kernel/debug/tracing/latency_hist/* reports anything
unusual, they are all in the tens of microsecond range, only wakeup
shared prio is at 957 us between two same prio application threads,
which is expected.

It is not very probable that the reason for the throttling is our
application. We have own monitoring of the runaway tasks and this
did not kick in. Besides, the coincidence with the hrtimer
message looks very suspicious.

The kernel is 3.4.25-rt37 with full preempt on a 1 GHz Celeron M
industrial PC, ICH4 (ata_piix) used for ATA, Intel 82801DB PRO/100 VE
(e100) for ethernet.

Unfortunately it is not easily reproducible - it happens once
per several days and there is no obvious trigger.

Any hints?

Thanks
-- 
                                             Stano

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: hrtimer: interrupt took 6742 ns, then RT throttling and hung machine for nearly 2 seconds
  2013-04-15 11:02 hrtimer: interrupt took 6742 ns, then RT throttling and hung machine for nearly 2 seconds Stanislav Meduna
@ 2013-04-15 11:54 ` Stanislav Meduna
  2013-04-15 12:22 ` Thomas Gleixner
  1 sibling, 0 replies; 19+ messages in thread
From: Stanislav Meduna @ 2013-04-15 11:54 UTC (permalink / raw)
  To: linux-rt-users@vger.kernel.org

On 15.04.2013 13:02, Stanislav Meduna wrote:

> Apr 15 10:14:57 lnx kernel: [56281.700293] hrtimer: interrupt took 6742 ns

The /proc/timer_list follows. What worries me a bit are the retries -
they are slowly rising (~1 / sec for nr_retries) even if the machine
is basically idle. Is it normal?

The rise only happens if our application is running, but it is
basically the only thing in system that is actively using
the hrtimers, so this does not really say much. Tweaking
the priority of ksoftirqd has no effect.


Timer List Version: v0.6
HRTIMER_MAX_CLOCK_BASES: 3
now at 342575389945882 nsecs

cpu: 0
 clock 0:
  .base:       c04d7f68
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: tick_cpu_sched, tick_sched_timer, S:01
 # expires at 342575390000000-342575390000000 nsecs [in 54118 to 54118
nsecs]
 #1: <cd9bb680>, timerfd_tmrproc, S:01
 # expires at 342575390284003-342575390284003 nsecs [in 338121 to 338121
nsecs]
 #2: <cc6cff48>, hrtimer_wakeup, S:01
 # expires at 342575390421978-342575390421978 nsecs [in 476096 to 476096
nsecs]
... 16 more hrtimer_wakeup follow ...

 clock 1:
  .base:       c04d7fa0
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1365682957275692841 nsecs
active timers:
 clock 2:
  .base:       c04d7fd8
  .index:      2
  .resolution: 1 nsecs
  .get_time:   ktime_get_boottime
  .offset:     0 nsecs
active timers:
  .expires_next   : 342575390284003 nsecs
  .hres_active    : 1
  .nr_events      : 556415337
  .nr_retries     : 1124752
  .nr_hangs       : 1
  .max_hang_time  : 6742 nsecs
  .nohz_mode      : 2
  .idle_tick      : 342575389000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 342275388
  .idle_calls     : 514502119
  .idle_sleeps    : 510841382
  .idle_entrytime : 342575388177611 nsecs
  .idle_waketime  : 342575388177611 nsecs
  .idle_exittime  : 342575388184375 nsecs
  .idle_sleeptime : 301580528758450 nsecs
  .iowait_sleeptime: 21756268924 nsecs
  .last_jiffies   : 342275388
  .next_jiffies   : 342275421
  .idle_expires   : 342575421000000 nsecs
jiffies: 342275390


Tick Device: mode:     1
Per CPU device: 0
Clock Event Device: hpet
 max_delta_ns:   149983013276
 min_delta_ns:   13409
 mult:           61496111
 shift:          32
 mode:           3
 next_event:     342575390284003 nsecs
 set_next_event: hpet_legacy_next_event
 set_mode:       hpet_legacy_set_mode
 event_handler:  hrtimer_interrupt
 retries:        9359


Regards
-- 
                                      Stano


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: hrtimer: interrupt took 6742 ns, then RT throttling and hung machine for nearly 2 seconds
  2013-04-15 11:02 hrtimer: interrupt took 6742 ns, then RT throttling and hung machine for nearly 2 seconds Stanislav Meduna
  2013-04-15 11:54 ` Stanislav Meduna
@ 2013-04-15 12:22 ` Thomas Gleixner
  2013-04-15 12:56   ` Stanislav Meduna
  1 sibling, 1 reply; 19+ messages in thread
From: Thomas Gleixner @ 2013-04-15 12:22 UTC (permalink / raw)
  To: Stanislav Meduna; +Cc: linux-rt-users@vger.kernel.org

On Mon, 15 Apr 2013, Stanislav Meduna wrote:
> Unfortunately it is not easily reproducible - it happens once
> per several days and there is no obvious trigger.
> 
> Any hints?

Yes, we decoded that problem a week ago. Patch is available here:

http://www.spinics.net/lists/linux-tip-commits/msg19100.html

It's on the way to Linus tree, so it will come via the stable updates
to RT soon.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: hrtimer: interrupt took 6742 ns, then RT throttling and hung machine for nearly 2 seconds
  2013-04-15 12:22 ` Thomas Gleixner
@ 2013-04-15 12:56   ` Stanislav Meduna
  2013-04-17 15:46     ` timerfd and softirqd [Was: Re: hrtimer: interrupt took 6742 ns, then RT throttling and hung machine for nearly 2 seconds] Stanislav Meduna
  0 siblings, 1 reply; 19+ messages in thread
From: Stanislav Meduna @ 2013-04-15 12:56 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: linux-rt-users@vger.kernel.org

On 15.04.2013 14:22, Thomas Gleixner wrote:

> Yes, we decoded that problem a week ago. Patch is available here:
> 
> http://www.spinics.net/lists/linux-tip-commits/msg19100.html
> 
> It's on the way to Linus tree, so it will come via the stable updates
> to RT soon.

Hmm, this is a uniprocessor. Could this still happen via some interrupt
or something?

# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 13
model name      : Intel(R) Celeron(R) M processor         1.00GHz
stepping        : 8
cpu MHz         : 999.993
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov clflush dts acpi mmx fxsr sse sse2 ss tm pbe nx bts
bogomips        : 1999.98
clflush size    : 64
cache_alignment : 64
address sizes   : 32 bits physical, 32 bits virtual
power management:


Thanks
-- 
                                              Stano


^ permalink raw reply	[flat|nested] 19+ messages in thread

* timerfd and softirqd [Was: Re: hrtimer: interrupt took 6742 ns, then RT throttling and hung machine for nearly 2 seconds]
  2013-04-15 12:56   ` Stanislav Meduna
@ 2013-04-17 15:46     ` Stanislav Meduna
  2013-04-18  9:11       ` timerfd read does not return [Was: Re: timerfd and softirqd] Stanislav Meduna
  0 siblings, 1 reply; 19+ messages in thread
From: Stanislav Meduna @ 2013-04-17 15:46 UTC (permalink / raw)
  To: linux-rt-users@vger.kernel.org

Hi,

> Yes, we decoded that problem a week ago. Patch is available here:
>
> http://www.spinics.net/lists/linux-tip-commits/msg19100.html

hmm, I think this is something else. From my understanding this
cannot happen on a uniprocessor machine - it is only called
if (cpu != smp_processor_id()).

Could someone point me to a description how exactly the soft
interrupts work? - this is an area that I not really understand
yet. In the kernel I am using these were not yet (or again) handled
by single ksoftirqd.

I caught one of the instances with a trace. The core of the
setup is

1) Thread with a 5 ms timerfd loop, let's call it TimerT,
   RR, rtprio 45

2) ksoftirqd, FF, rtprio 37. Lower than the TimerT because I did
   not want the TimerT be blocked by e.g. net-tx/rx

3) Thread that gets triggered by TimerT, let's call him WorkT.
   RR, rtprio 25. The two use PI lock to synchronize.

4) networking and other irq threads at FF 50 default

I also hacked the throttler code to kill my task with SIGABRT
when RT throttling happens. I am not really sure whether this is
a sane thing to do, but the resulting core shows the thread
in the timerfd read(). Unfortunately I did not have syscall
tracing enabled at that point.

It is difficult to reproduce this - I got it down to once per one
or two days. I could of course try to twiddle with the priorities,
but I will never know whether it fixed the problem or just
made it less probable... Same with the kernel version - I am
using 3.4.25-rt37.

My understanding is that the HRTIMER woke TimerT, in turn the WorkT
was boosted to free the mutex, then the TimerT has done its work.
Now
- either I have a bug and the WorkT loops indefinitely. Not impossible,
  but the code is relatively simple
- or the kernel sometimes does not like a return onto timerfd read
  if the softirq that has woken it did not return yet

This is on a uniprocessor machine. I have edited the task names
and removed the PIDs and other clutter.

ksoftirqd/0 ...1.  1209.368437: softirq_entry: vec=8 [action=HRTIMER]
ksoftirqd/0 d..3.  1209.368444: sched_wakeup: comm=TimerT prio=54
ksoftirqd/0 d..3.  1209.368451: sched_switch: prev_comm=ksoftirqd/0 prev_pid=3 prev_prio=62 prev_state=R+ ==> next_comm=TimerT next_prio=54
     TimerT d..4.  1209.368458: sched_pi_setprio: comm=ksoftirqd/0 pid=3 oldprio=62 newprio=54
     TimerT d..3.  1209.368468: sched_switch: prev_comm=TimerT prev_prio=54 prev_state=D ==> next_comm=ksoftirqd/0 next_pid=3 next_prio=54
ksoftirqd/0 d..4.  1209.368475: sched_wakeup: comm=TimerT prio=54
ksoftirqd/0 d..3.  1209.368478: sched_pi_setprio: comm=ksoftirqd/0 pid=3 oldprio=54 newprio=62
ksoftirqd/0 d..3.  1209.368488: sched_switch: prev_comm=ksoftirqd/0 prev_pid=3 prev_prio=62 prev_state=R+ ==> next_comm=TimerT next_prio=54
     TimerT d..4.  1209.368506: sched_pi_setprio: comm=WorkT oldprio=74 newprio=54
     TimerT d..3.  1209.368517: sched_switch: prev_comm=TimerT prev_prio=54 prev_state=D ==> next_comm=WorkT next_prio=54
      WorkT d..4.  1209.368525: sched_wakeup: comm=TimerT prio=54
      WorkT d..3.  1209.368529: sched_pi_setprio: comm=WorkT oldprio=54 newprio=74
      WorkT d..3.  1209.368538: sched_switch: prev_comm=WorkT prev_prio=74 prev_state=R+ ==> next_comm=TimerT next_prio=54
     TimerT d.h1.  1209.369388: irq_handler_entry: irq=0 name=timer
     TimerT d.h1.  1209.369399: softirq_raise: vec=1 [action=TIMER]
     TimerT d.h1.  1209.369402: irq_handler_exit: irq=0 ret=handled
     TimerT d.h1.  1209.370388: irq_handler_entry: irq=0 name=timer
     TimerT d.h1.  1209.370398: softirq_raise: vec=1 [action=TIMER]
... only timers here, then a HRTIMER comes ...
     TimerT d.h1.  1209.373387: irq_handler_entry: irq=0 name=timer
     TimerT d.h1.  1209.373397: softirq_raise: vec=8 [action=HRTIMER]
     TimerT d.h1.  1209.373400: softirq_raise: vec=1 [action=TIMER]
     TimerT d.h1.  1209.373402: irq_handler_exit: irq=0 ret=handled
     TimerT d.h1.  1209.374387: irq_handler_entry: irq=0 name=timer
... only timers here ...
     TimerT d.h3.  1209.555024: sched_wakeup: comm=irq/5-eth0 prio=49
     TimerT d..3.  1209.555033: sched_switch: prev_comm=TimerT prev_prio=54 prev_state=R+ ==> next_comm=irq/5-eth0 next_pid=442 next_prio=49
 irq/5-eth0 d..1.  1209.555041: softirq_raise: vec=3 [action=NET_RX]
 irq/5-eth0 d..3.  1209.555054: sched_switch: prev_comm=irq/5-eth0 prev_prio=49 prev_state=S ==> next_comm=TimerT next_prio=54
... a HRTIMER much later than the 5 ms ...
     TimerT d.h1.  1210.253256: softirq_raise: vec=8 [action=HRTIMER]
... last entry ...
     TimerT d.h1.  1211.196095: irq_handler_entry: irq=0 name=timer

Note1: there is no
  softirq_exit: vec=8 [action=HRTIMER]
until the rt throttler kicked in. There are only two
  softirq_raise: vec=8 [action=HRTIMER]
although it should come once per 5 ms.

Note2: I call tracing_off() right in the throttling code in rt.c. The
last switch from a non-RT task was at 1209.365377 - i.e 1.8 seconds
before. Why did the throttler did not react sooner?

Thanks
-- 
                                           Stano

^ permalink raw reply	[flat|nested] 19+ messages in thread

* timerfd read does not return [Was: Re: timerfd and softirqd]
  2013-04-17 15:46     ` timerfd and softirqd [Was: Re: hrtimer: interrupt took 6742 ns, then RT throttling and hung machine for nearly 2 seconds] Stanislav Meduna
@ 2013-04-18  9:11       ` Stanislav Meduna
  2013-04-19 19:53         ` Stanislav Meduna
  2013-04-27  8:34         ` timerfd read does not return - was probably fixed in 3.4.38 Stanislav Meduna
  0 siblings, 2 replies; 19+ messages in thread
From: Stanislav Meduna @ 2013-04-18  9:11 UTC (permalink / raw)
  To: linux-rt-users@vger.kernel.org, Thomas Gleixner

Hi,

from a new log that includes syscall tracing it seems that a thread
waiting in a timerfd is woken, but does not return from the read:

     TimerT d....  3825.943136: sys_read -> 0x8
...
     TimerT d....  3825.943199: sys_read(fd: 6, buf: b7633158, count: 8)
     TimerT d..3.  3825.943208: sched_switch: prev_comm=TimerT prev_prio=54 prev_state=S ==> next_comm=ksoftirqd/0 next_prio=62
ksoftirqd/0 ...1.  3825.943213: softirq_exit: vec=8 [action=HRTIMER]
...
     OtherT d.h1.  3825.947983: irq_handler_entry: irq=0 name=timer
     OtherT d.h3.  3825.947997: sched_wakeup: comm=SomeT1 prio=84 success=1
     OtherT d.h3.  3825.948005: sched_wakeup: comm=SomeT2 prio=69 success=1
     OtherT dNh1.  3825.948009: softirq_raise: vec=8 [action=HRTIMER]
     OtherT dNh1.  3825.948011: softirq_raise: vec=1 [action=TIMER]
     OtherT dNh1.  3825.948014: irq_handler_exit: irq=0 ret=handled
...
ksoftirqd/0 ...1.  3825.948034: softirq_entry: vec=1 [action=TIMER]
ksoftirqd/0 ...1.  3825.948037: softirq_exit: vec=1 [action=TIMER]
ksoftirqd/0 ...1.  3825.948039: softirq_entry: vec=8 [action=HRTIMER]
ksoftirqd/0 d..3.  3825.948047: sched_wakeup: comm=TimerT prio=54 success=1
ksoftirqd/0 d..3.  3825.948054: sched_switch: prev_comm=ksoftirqd/0 prev_prio=62 prev_state=R+ ==> next_comm=TimerT next_prio=54
     TimerT d..4.  3825.948061: sched_pi_setprio: comm=ksoftirqd/0 oldprio=62 newprio=54
     TimerT d..3.  3825.948072: sched_switch: prev_comm=TimerT prev_prio=54 prev_state=D ==> next_comm=ksoftirqd/0 next_prio=54
ksoftirqd/0 d..4.  3825.948079: sched_wakeup: comm=TimerT prio=54 success=1
ksoftirqd/0 d..3.  3825.948083: sched_pi_setprio: comm=ksoftirqd/0 oldprio=54 newprio=62
ksoftirqd/0 d..3.  3825.948093: sched_switch: prev_comm=ksoftirqd/0 prev_prio=62 prev_state=R+ ==> next_comm=TimerT next_prio=54
     TimerT d.h1.  3825.948980: irq_handler_entry: irq=0 name=timer
     TimerT d.h1.  3825.948991: softirq_raise: vec=1 [action=TIMER]
...

There is no return from the read until the RT throttler got activated.
Kernel version 3.4.25-rt37 - I'll check whether anything relevant
has changed since then.


Regards
-- 
                                            Stano


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: timerfd read does not return [Was: Re: timerfd and softirqd]
  2013-04-18  9:11       ` timerfd read does not return [Was: Re: timerfd and softirqd] Stanislav Meduna
@ 2013-04-19 19:53         ` Stanislav Meduna
  2013-04-22  7:35           ` [PATCH] Re: timerfd read does not return Stanislav Meduna
  2013-04-27  8:34         ` timerfd read does not return - was probably fixed in 3.4.38 Stanislav Meduna
  1 sibling, 1 reply; 19+ messages in thread
From: Stanislav Meduna @ 2013-04-19 19:53 UTC (permalink / raw)
  To: linux-rt-users@vger.kernel.org

On 18.04.2013 11:11, Stanislav Meduna wrote:

> There is no return from the read until the RT throttler got activated.
> Kernel version 3.4.25-rt37 - I'll check whether anything relevant
> has changed since then.

I changed the RT throttler to do show_state_filter(0); (sysrq-t).
No idea whether this is the sane way to do in that context
and how relevant the results are, but from the failing thread
I am getting


TimerT     R running      0   671    636 0x10000000
 ceb76a70 b7633158 c01169b0 ceb51ed4 c0116acb 00000029 ceb51e6c c012385d
 ceb51edc cea6d248 cd8024b0 cea6d200 00000002 00000029 c012385d 00000001
 00000000 00000000 ceb51ea0 c0146476 00000030 ceb51ea8 c012385d ceb51ed4
Call Trace:
 [<c01169b0>] ? mm_fault_error+0x170/0x170
 [<c0116acb>] ? do_page_fault+0x11b/0x390
 [<c012385d>] ? irq_exit+0x6d/0x80
 [<c012385d>] ? irq_exit+0x6d/0x80
 [<c0146476>] ? sub_preempt_count+0x36/0x50
 [<c012385d>] ? irq_exit+0x6d/0x80
 [<c03a9718>] ? do_IRQ+0x48/0x94
 [<c01169b0>] ? mm_fault_error+0x170/0x170
last message repeated 2 times
 [<c03a8e3d>] ? error_code+0x5d/0x70
 [<c0130000>] ? sys_rt_sigqueueinfo+0x70/0x90
 [<c01169b0>] ? mm_fault_error+0x170/0x170
 [<c027cb51>] ? __put_user_8+0x11/0x19
 [<c01fa323>] ? timerfd_read+0x183/0x280
 [<c013aa40>] ? wake_up_bit+0x60/0x60
 [<c01bffcf>] ? vfs_read+0x9f/0x150
 [<c01c08e7>] ? fget_light+0x87/0xe0
 [<c01fa1a0>] ? sys_timerfd_gettime+0x140/0x140
 [<c01c0152>] ? sys_read+0x42/0x70
 [<c03a8b95>] ? syscall_call+0x7/0xb
 [<c03a0000>] ? init_memory_mapping+0x1a0/0x410


I have two instances of this trace and one another:

Call Trace:
 [<c027bffc>] ? trace_hardirqs_on_thunk+0xc/0x10
 [<c03a7dbd>] ? restore_all+0xf/0xf
 [<c027007b>] ? cfq_completed_request+0x58b/0x5f0
 [<c0119ea0>] ? __kunmap_atomic+0x50/0x60
 [<c01aadd1>] ? handle_pte_fault+0x141/0xa00
 [<c01ab710>] ? handle_mm_fault+0x80/0xc0
 [<c015b26d>] ? rt_up_read+0x1d/0x30
 [<c0116b18>] ? do_page_fault+0x168/0x390
 [<c012385d>] ? irq_exit+0x6d/0x80
 [<c0161684>] ? irq_to_desc+0x14/0x20
 [<c0103c3f>] ? handle_irq+0x1f/0x90
 [<c03a890f>] ? do_IRQ+0x3f/0x94
 [<c011007b>] ? x86_schedule_events+0x34b/0x4b0
 [<c03a882e>] ? common_interrupt+0x2e/0x40
 [<c0130000>] ? sys_rt_sigqueueinfo+0x70/0x90
 [<c027bd51>] ? __put_user_8+0x11/0x19
 [<c01f9523>] ? timerfd_read+0x183/0x280
 [<c013aa40>] ? wake_up_bit+0x60/0x60
 [<c01bf1cf>] ? vfs_read+0x9f/0x150
 [<c01bfae7>] ? fget_light+0x87/0xe0
 [<c01f93a0>] ? sys_timerfd_gettime+0x140/0x140
 [<c01bf352>] ? sys_read+0x42/0x70
 [<c03a7d95>] ? syscall_call+0x7/0xb


The __put_user_8 looks like the put_user at the end of the
timerfd_read. Where does the sys_rt_sigqueueinfo come from?

The read destination is just a normal 8-byte variable on stack,
I don't see how this can become invalid while the thread is
in kernel

  uint64_t exp;
  ...
  read(tmrFd, &exp, sizeof(exp))

The timer fd _does_ eventually return - after 200 ms to 5 seconds -
and works normally afterwards. Unfortunately I don't know whether
it returns with 8 or with -EFAULT. In any case there is no crash,
I do not catch SIGSEGV, SIGBUS and similar.

I'm puzzled. Does this make sense to anyone? Is this better suited
for the general Linux list?

Unrelated questions: I am not so familiar with Linux on this level,
but shouldn't interrupts be handled on their own stack?

Thanks
-- 
                                      Stano


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH] Re: timerfd read does not return
  2013-04-19 19:53         ` Stanislav Meduna
@ 2013-04-22  7:35           ` Stanislav Meduna
  2013-04-22  8:55             ` Stanislav Meduna
  0 siblings, 1 reply; 19+ messages in thread
From: Stanislav Meduna @ 2013-04-22  7:35 UTC (permalink / raw)
  To: linux-rt-users@vger.kernel.org, Thomas Gleixner

On 19.04.2013 21:53, Stanislav Meduna wrote:

>> There is no return from the read until the RT throttler got activated.
>> Kernel version 3.4.25-rt37 - I'll check whether anything relevant
>> has changed since then.

FWIW, the following patch seems to solve the issue for me.
Unfortunately I have no idea whether this is a sane way to
do and whether it solves or just masks or delays the issue,
I just decided to try it after seeing http://lkml.org/lkml/2012/2/8/487

Note that in my case the ksoftirqd has lower priority than
the thread reading the timerfd, so the latter gets woken up
immediately and it is well possible and usual that it returns
into the kernel to do another read before the hrtimer softirq
ends. Would it be possible that there is some race in this
situation?

diff --git a/fs/timerfd.c b/fs/timerfd.c
index 57f0e4e..b06ba86 100644
--- a/fs/timerfd.c
+++ b/fs/timerfd.c
@@ -222,6 +222,7 @@ static ssize_t timerfd_read(struct file *file, char __user *buf, size_t count,
                ctx->ticks = 0;
        }
        spin_unlock_irq(&ctx->wqh.lock);
+       hrtimer_wait_for_timer(&ctx->tmr);
        if (ticks)
                res = put_user(ticks, (u64 __user *) buf) ? -EFAULT: sizeof(ticks);
        return res;

Thanks
-- 
                                         Stano

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH] Re: timerfd read does not return
  2013-04-22  7:35           ` [PATCH] Re: timerfd read does not return Stanislav Meduna
@ 2013-04-22  8:55             ` Stanislav Meduna
  0 siblings, 0 replies; 19+ messages in thread
From: Stanislav Meduna @ 2013-04-22  8:55 UTC (permalink / raw)
  To: linux-rt-users@vger.kernel.org, Thomas Gleixner

On 22.04.2013 09:35, Stanislav Meduna wrote:

> FWIW, the following patch seems to solve the issue for me.
> Unfortunately I have no idea whether this is a sane way to
> do and whether it solves or just masks or delays the issue,

Unfortunately it only delays it (funny how it crashes minutes
after I write to the list, after running for hours and hours
before), so please, disregard :(

The dumped call trace at the time the throttler kicks in this time

restore_all+0xf/0xf
irq_exit+0x6d/0x80
sub_preempt_count+0x36/0x50
irq_exit+0x6d/0x80
do_IRQ+0x48/0x94
__kunmap_atomic+0x50/0x60
handle_pte_fault+0x141/0xa00
trace_hardirqs_on_thunk+0xc/0x10
restore_all+0xf/0xf
handle_mm_fault+0x80/0xc0
rt_up_read+0x1d/0x30
do_page_fault+0x168/0x390
irq_exit+0x6d/0x80
irq_to_desc+0x14/0x20
handle_irq+0x1f/0x90
do_IRQ+0x3f/0x94
common_interrupt+0x2e/0x40
__kill_pgrp_info+0x5b/0x60
__put_user_8+0x11/0x19
timerfd_read+0x18e/0x2a0
wake_up_bit+0x60/0x60
vfs_read+0x9f/0x150
fget_light+0x87/0xe0
sys_timerfd_gettime+0x140/0x140
sys_read+0x42/0x70
syscall_call+0x7/0xb
init_memory_mapping+0x230/0x410

and again the last logged trace is sched_switch to the timer thread
and there is no softirq_exit from HRTIMER


Again __put_user_8 and then something signal-related. Note: I do not
(knowingly) use signals in my application, neither explicitly nor
pthread_cancel or something like that via glibc.

The problem does not manifest if I use timer_create instead
of timerfd (at least such version did run for 2,5 days without
a problem).

Regards
-- 
                                   Stano


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: timerfd read does not return - was probably fixed in 3.4.38
  2013-04-18  9:11       ` timerfd read does not return [Was: Re: timerfd and softirqd] Stanislav Meduna
  2013-04-19 19:53         ` Stanislav Meduna
@ 2013-04-27  8:34         ` Stanislav Meduna
  2013-04-28 11:53           ` Carsten Emde
                             ` (2 more replies)
  1 sibling, 3 replies; 19+ messages in thread
From: Stanislav Meduna @ 2013-04-27  8:34 UTC (permalink / raw)
  To: linux-rt-users@vger.kernel.org, rostedt; +Cc: Thomas Gleixner

On 18.04.2013 11:11, Stanislav Meduna wrote:

> from a new log that includes syscall tracing it seems that a thread
> waiting in a timerfd is woken, but does not return from the read:

For the record, the problem seems to be the issue that the
http://git.kernel.org/cgit/linux/kernel/git/rt/linux-stable-rt.git/commit/?h=v3.4-rt&id=52cecaa
fixed in v3.4.38-rt52 (tracing: Fix free of probe entry by calling
call_rcu_sched()). A kernel with just this commit reverted stalled
once per ~4 hours, with this one it did not stall in ~24h (I will
surely try to run it longer).

Why did is always manifest itself by hanging in return from timerfd
read (and not e.g. when timer_create and friends were used) is beyond
me, but there will be a reason.

Sorry for the false alarm, it was already fixed at the time I reported
it - it is just that one wants to know the reason before updating
to a brand new kernel with 750+ other changes.

A general question: I am seeing a few recent changes in the tracing
code. Is it generally safe to have at least latency histogram enabled
in the production code, or is it advised not to compile these features
in? I'd like to see the latencies the customer is getting in the real
environment, but if the code is considered as work-in-progress,
it is better to play it safe.

Thanks
-- 
                                           Stano

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: timerfd read does not return - was probably fixed in 3.4.38
  2013-04-27  8:34         ` timerfd read does not return - was probably fixed in 3.4.38 Stanislav Meduna
@ 2013-04-28 11:53           ` Carsten Emde
  2013-04-29  8:43             ` Stanislav Meduna
  2013-05-02 20:02           ` Steven Rostedt
  2013-05-10 12:42           ` timerfd read does not return - some traces Stanislav Meduna
  2 siblings, 1 reply; 19+ messages in thread
From: Carsten Emde @ 2013-04-28 11:53 UTC (permalink / raw)
  To: Stanislav Meduna; +Cc: linux-rt-users@vger.kernel.org, rostedt, Thomas Gleixner

Stano,

> [..]
> A general question: I am seeing a few recent changes in the tracing
> code. Is it generally safe to have at least latency histogram
> enabled in the production code, or is it advised not to compile these
> features in? I'd like to see the latencies the customer is getting in
> the real environment, but if the code is considered as
> work-in-progress, it is better to play it safe.
There are a number of different built-in latency histograms in the
kernel, some of them are best used to trace a known problem (e.g.
INTERRUPT_OFF_HIST and PREEMPT_OFF_HIST) while others are intended for
continuous latency monitoring (CONFIG_WAKEUP_LATENCY_HIST and
CONFIG_MISSED_TIMER_OFFSETS_HIST). The latter are ideally suitable to
document the real-time capabilities of a production system and to
retrospectively analyze a failure that may be related to a prolonged
latency. We, therefore, configure and enable continuous latency
monitoring histograms (CONFIG_WAKEUP_LATENCY_HIST and
CONFIG_MISSED_TIMER_OFFSETS_HIST) by default in all our farm systems
since more than two years (https://www.osadl.org/?id=864), and we are
not aware of any problem that might be related to them. I, thus, would
like to say that they are save to be used in production systems. Since
we do not have much long-term experience with the other histograms and
they may introduce additional and longer latencies than the former, I
would not recommend to enable them in production systems.

BTW: Your question was whether it is "advised not to compile these
features in". There certainly is very little penalty or risk to
configure these features without enabling them. So I don't see any
reason not to compile them in. Only when you enable them
   cd /sys/kernel/debug/tracing/latency_hist/enable
   for i in wakeup missed_timer_offsets timerandwakeup
   do
     echo 1 >$i
   done
a certain latency penalty and risk of malfunction may apply.

	-Carsten.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: timerfd read does not return - was probably fixed in 3.4.38
  2013-04-28 11:53           ` Carsten Emde
@ 2013-04-29  8:43             ` Stanislav Meduna
  0 siblings, 0 replies; 19+ messages in thread
From: Stanislav Meduna @ 2013-04-29  8:43 UTC (permalink / raw)
  To: Carsten Emde; +Cc: linux-rt-users@vger.kernel.org

On 28.04.2013 13:53, Carsten Emde wrote:

Hi Carsten,

> We, therefore, configure and enable continuous latency
> monitoring histograms (CONFIG_WAKEUP_LATENCY_HIST and
> CONFIG_MISSED_TIMER_OFFSETS_HIST) by default in all our farm systems
> since more than two years (https://www.osadl.org/?id=864), and we are
> not aware of any problem that might be related to them.

The problem is our systems that started to exhibit this problem
did have just these enabled :( Many more were compiled in.

Your continuous test farm is impressive! A suggestion: would it
be possible to also add a dmesg and more of the /proc/timer_list
output to the profile pages? This would make comparing with
the own system even easier.

For example the Geode system at https://www.osadl.org/?id=948 :
how did you convince it to use highres timers? In my environment
the Geode LX / CS5536 needs tsc=reliable, but your kernel command
line does not contain it.

Thanks
-- 
                                        Stano

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: timerfd read does not return - was probably fixed in 3.4.38
  2013-04-27  8:34         ` timerfd read does not return - was probably fixed in 3.4.38 Stanislav Meduna
  2013-04-28 11:53           ` Carsten Emde
@ 2013-05-02 20:02           ` Steven Rostedt
  2013-05-10 12:42           ` timerfd read does not return - some traces Stanislav Meduna
  2 siblings, 0 replies; 19+ messages in thread
From: Steven Rostedt @ 2013-05-02 20:02 UTC (permalink / raw)
  To: Stanislav Meduna; +Cc: linux-rt-users@vger.kernel.org, Thomas Gleixner

On Sat, 2013-04-27 at 10:34 +0200, Stanislav Meduna wrote:

> A general question: I am seeing a few recent changes in the tracing
> code. Is it generally safe to have at least latency histogram enabled
> in the production code, or is it advised not to compile these features
> in? I'd like to see the latencies the customer is getting in the real
> environment, but if the code is considered as work-in-progress,
> it is better to play it safe.

Only the wakeup latency tracer does not cause a hit to performance. I
recommend irqsoff/preemptoff to not be enabled on production systems.
Not for safety reasons, but because they cause a noticeable overhead to
the kernel when configured in but not enabled.

As for the other tracers, the function tracer is usually fine. But some
have said that they see a slight performance decrease when configured
in. I haven't seen that, but it really does depend on the hardware.

The events should not be an issue to configure in.

-- Steve

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: timerfd read does not return - some traces
  2013-04-27  8:34         ` timerfd read does not return - was probably fixed in 3.4.38 Stanislav Meduna
  2013-04-28 11:53           ` Carsten Emde
  2013-05-02 20:02           ` Steven Rostedt
@ 2013-05-10 12:42           ` Stanislav Meduna
  2013-05-12 17:31             ` Stanislav Meduna
  2013-05-12 23:20             ` timerfd read does not return - hangs inside put_user Stanislav Meduna
  2 siblings, 2 replies; 19+ messages in thread
From: Stanislav Meduna @ 2013-05-10 12:42 UTC (permalink / raw)
  To: linux-rt-users@vger.kernel.org; +Cc: rostedt, Thomas Gleixner, Carsten Emde

On 27.04.2013 10:34, Stanislav Meduna wrote:

> For the record, the problem seems to be the issue that the
> http://git.kernel.org/cgit/linux/kernel/git/rt/linux-stable-rt.git/commit/?h=v3.4-rt&id=52cecaa
> fixed in v3.4.38-rt52 (tracing: Fix free of probe entry by calling
> call_rcu_sched()).

Hm.. it was not fixed, it just moved the timings so that it is
even less reproducible. I am now back at 3.4.25-rt37, with slightly
buggy application - due to a bad ioctl it tries to repeatedly
load some module that it does not find. This seems to push the
timings so that the problem does manifest sometimes.

Unfortunately I am unable to reproduce the problem with
function tracing enabled :( I will try to enable/disable
single interesting functions - any tips for what is
the most interesting?

I have uploaded the trace.dat captured by
  trace-cmd start -b 64000 -e irq -e sched -e syscalls -e timer
and an excerpt from it in textual form at
http://www.meduna.org/tmp/trace.timerfdhang.tar.gz

I would greatly appreciate it if someone could take a look.


The threads:

 659  - the thread running a 5 ms timerfd loop, RR, priority 45
   3  - ksoftirqd, FF, 37
1276  - a thread trying to do a non-blocking TCP connect. If it
        fails, it does getsockopt for error, setsockopt
        (the buggy one) and closes the socket. RR, 25
1271  - a thread doing the "real" work. It waits at the futex
        that is woken from the timerfd thread. RR, 25
 653  - a thread periodically (each ms) reading some device that
        does not exist at this platform (returns EBADF) and
        sleeps. RR, 25

The interrupts run at FF, 50.


The functionality:

49762.922986 the timerfd thread exits the read()
             successfully for the last time

49762.923028 it is back in the read

49762.927836 the HRTIMER for the next iteration is raised and
49762.927872 starts processing. After some PI juggling
49762.927934 arms it for the new interval, i.e. the hrtimer_restart
             in timerfd.c was reached

49762.927944 the thread with the socket is priority inherit. As far
             as I can tell the two do not share any userspace lock
             that is held at the moment, so this has to be some
             kernel one.
             Anyway, a network interrupt comes in and after handling
             the thread obviously finishes the lock and returns
             the priority

49762.928029 the timerfd thread gets the CPU, does not giveit back,
but the end of the syscall is not reached in a sane time. It eventually
is, this time after more than 2 seconds (I've seen 200 ms to 4 s);
however this is not captured as I stopped the tracing at the RT
throttler activation.

This is a 32-bit uniprocessor, so there can be neither races between
CPUs nor 64-bit atomic access problems.

I already tried:
- fixing the bug that caused the module request
- moving ksoftirqd priority above the timerfd thread
- enable/disable highres timers
- different machines - a Geode based one (needs tsc=reliable
  for the hrtimers) and a Celeron M based one (hpet=force needed)
- removing loadable modules and anything debug-related from the kernel
  config
- running with lockdep (unable to reproduce, so no reliable data)
- changing a suspicious (for me at least) code in e100_watchdog
  to use raw spin lock


Thanks
-- 
                                                 Stano


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: timerfd read does not return - some traces
  2013-05-10 12:42           ` timerfd read does not return - some traces Stanislav Meduna
@ 2013-05-12 17:31             ` Stanislav Meduna
  2013-05-12 23:20             ` timerfd read does not return - hangs inside put_user Stanislav Meduna
  1 sibling, 0 replies; 19+ messages in thread
From: Stanislav Meduna @ 2013-05-12 17:31 UTC (permalink / raw)
  To: linux-rt-users@vger.kernel.org; +Cc: rostedt, Thomas Gleixner, Carsten Emde

On 10.05.2013 14:42, Stanislav Meduna wrote:

> I have uploaded the trace.dat captured by
>   trace-cmd start -b 64000 -e irq -e sched -e syscalls -e timer

Actually the problem probably comes earlier. The thread with id 1276
normally tries to call setsockopt and then right after it
returns a select is called. See e.g. trace #914414 @ 49720.850080.
There is no way in the code it can wait for more than tens of
microseconds - no loop, no lock, just two function calls.

With -p function -n '*preempt_count' (which unfortunately changes
the timings so that the problem does not occur) there is nothing
logged between the calls as well, except the syscall_trace_entry
function.

However, at #1800630 @ 49762.927207 the setsockopt returns,
but the expected system call is not entered for 600 us
and no irq, sched or timer event is logged in between.

I also have another instance of the crash where the timing
is different - the thread runs for 4 ms with soft and hard
interrupts coming, so I am sure that this is not some SMI.
It really looks that something is spinning here.

At #1800649 the priority of the thread is boosted, i.e. there
most probably _is_ a PI lock held. It even looks to be unlocked
at #1800658 where the priority boost goes away.

Which one? What locking happens between sys_exit_socketcall
is logged and the time next syscall entry is logged?

The problem akso happens with the whole CONFIG_FTRACE subtree
disabled, so if there is some lock only present if the
tracers are present, it is not the (only) culprit.

Thanks
-- 
                                     Stano

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: timerfd read does not return - hangs inside put_user
  2013-05-10 12:42           ` timerfd read does not return - some traces Stanislav Meduna
  2013-05-12 17:31             ` Stanislav Meduna
@ 2013-05-12 23:20             ` Stanislav Meduna
  2013-05-13  8:05               ` timerfd read does not return - caused by MM fault Stanislav Meduna
  1 sibling, 1 reply; 19+ messages in thread
From: Stanislav Meduna @ 2013-05-12 23:20 UTC (permalink / raw)
  To: linux-rt-users@vger.kernel.org; +Cc: rostedt, Thomas Gleixner, Carsten Emde

On 10.05.2013 14:42, Stanislav Meduna wrote:

> 49762.928029 the timerfd thread gets the CPU, does not giveit back,
> but the end of the syscall is not reached in a sane time.

Hmm... I added some trace_printks into fs/timerfd.c:

static ssize_t timerfd_read(
{
...
	trace_printk("timerfd_read before unlock, res=%d", res);
	spin_unlock_irq(&ctx->wqh.lock);
	trace_printk("timerfd_read after unlock, res=%d", res);
	if (ticks)
		res = put_user(ticks, (u64 __user *) buf) ? -EFAULT: sizeof(ticks);
	trace_printk("timerfd_read return, res=%d, ticks=%llu, buf=%p", res, (unsigned long long) ticks, buf);
	return res;
}

The first two are printed. The last one is not.

0....0 timerfd_read: timerfd_read before unlock, res=0
0....0 timerfd_read: timerfd_read after unlock, res=0

The usual

0....0 timerfd_read: timerfd_read return, res=8, ticks=1, buf=0xb7600158

does _not_ happen here.

So it looks like the problem happens inside the put_user,
maybe a pagefault? The buf is an address on the stack:

while(alive) {
	u64 exp;

	if (read(tmrFd, &exp, sizeof(exp)) != sizeof(exp) || exp < 1)
		continue;
...

and the process is running with mlockall(MCL_CURRENT | MCL_FUTURE),
so the whole stack is forced into the RAM.

Viewing the ps -o min_flt,maj_flt for the task shows 969701
minor faults (that do not increment - I will check this when
the hang happens again) and 0 major ones.

This starts to look like some priority inversion where a realtime
thread is busy-waiting for something a SCHED_OTHER thread holds.
After the RT throttler kicks in, that other thread can proceed
and the system eventually recovers - unfortunately after
a 1+ second pause, which is what I am seeing.

Is there something that could cause stalling the whole realtime
thread for a significant time? Unfortunately I am not an expert
on memory management, pagefaults etc - but I guess this should
only produce a minor pagefault and that should never block?
Or am I seeing it wrong?

Regards
-- 
                                     Stano

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: timerfd read does not return - caused by MM fault
  2013-05-12 23:20             ` timerfd read does not return - hangs inside put_user Stanislav Meduna
@ 2013-05-13  8:05               ` Stanislav Meduna
  2013-05-14  8:31                 ` Livelock in handle_pte_fault [Was: Re: timerfd read does not return] Stanislav Meduna
  0 siblings, 1 reply; 19+ messages in thread
From: Stanislav Meduna @ 2013-05-13  8:05 UTC (permalink / raw)
  To: linux-rt-users@vger.kernel.org; +Cc: rostedt, Thomas Gleixner, Carsten Emde

On 13.05.2013 01:20, Stanislav Meduna wrote:

> So it looks like the problem happens inside the put_user,
> maybe a pagefault? The buf is an address on the stack:

Yep, this is a page fault. I added tracing of the
  mm_fault_error
  handle_mm_fault
  handle_pte_fault
  do_page_fault
functions


The result is

0....0 62811.755379: bprint:    timerfd_read: timerfd_read after unlock, res=0
0d...0 62811.755382: function:  do_page_fault
0....0 62811.755386: function:     handle_mm_fault
0....0 62811.755389: function:        handle_pte_fault
0d...0 62811.755394: function:  do_page_fault
0....0 62811.755396: function:     handle_mm_fault
0....0 62811.755398: function:        handle_pte_fault
0d...0 62811.755402: function:  do_page_fault
0....0 62811.755404: function:     handle_mm_fault
0....0 62811.755406: function:        handle_pte_fault


> Viewing the ps -o min_flt,maj_flt for the task shows 969701
> minor faults (that do not increment - I will check this when
> the hang happens again) and 0 major ones.

The number of minor faults of the thread jumped to 1433400 and
the process recovered after 2.77 seconds.

The trace file is at
  http://www.meduna.org/tmp/trace.mmfaulthang.dat.gz

Thanks
-- 
                                        Stano


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Livelock in handle_pte_fault [Was: Re: timerfd read does not return]
  2013-05-13  8:05               ` timerfd read does not return - caused by MM fault Stanislav Meduna
@ 2013-05-14  8:31                 ` Stanislav Meduna
  2013-11-25 10:36                   ` Vijay Katoch
  0 siblings, 1 reply; 19+ messages in thread
From: Stanislav Meduna @ 2013-05-14  8:31 UTC (permalink / raw)
  To: linux-rt-users@vger.kernel.org; +Cc: rostedt, Thomas Gleixner, Carsten Emde

On 13.05.2013 10:05, Stanislav Meduna wrote:

> 0d...0 62811.755394: function:  do_page_fault
> 0....0 62811.755396: function:     handle_mm_fault
> 0....0 62811.755398: function:        handle_pte_fault
> 0d...0 62811.755402: function:  do_page_fault
> 0....0 62811.755404: function:     handle_mm_fault
> 0....0 62811.755406: function:        handle_pte_fault

The flags in the pagefault handler are 0x28 - if I understand
it correctly, FAULT_FLAG_KILLABLE | FAULT_FLAG_ALLOW_RETRY.
The faulting address is indeed the one from stack that worked
for hours before, is mlockall()-ed and I have (of course)
no swap. I will add some code to print the content of
the offending pte.

The code in handle_pte_fault proceeds through the
  entry = pte_mkyoung(entry);
line and the following
  ptep_set_access_flags
returns zero. This repeats ad nauseum without anything run
in between. I will add some tracing prints to output
the content of the pte.

Adding flush_tlb_page(vma, address) at the beginning of
handle_pte_fault does not change anything.

The length of the hang could correlate with the time until
some SCHED_OTHER process is scheduled after the RT throttler
activates. There is a process running each 2 seconds and the
length of the hang is usually between 1 and 3 seconds. This
is not (yet) verified.

I am starting to think that the virtual memory mapping of the
process got somehow corrupted and is fixed at the next regular
context switch. There is no switch to other non-kernel process,
only to ksoftirqd, irq threads or other thread of the
same process afterwards. Shortly before there was some
switching between modprobe and kworker and sched_process_free
of kworker and modprobe in the RCU softirq.

The symptoms are similar to
http://lkml.indiana.edu/hypermail/linux/kernel/1103.0/01364.html

Regards
-- 
                                            Stano

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Livelock in handle_pte_fault [Was: Re: timerfd read does not return]
  2013-05-14  8:31                 ` Livelock in handle_pte_fault [Was: Re: timerfd read does not return] Stanislav Meduna
@ 2013-11-25 10:36                   ` Vijay Katoch
  0 siblings, 0 replies; 19+ messages in thread
From: Vijay Katoch @ 2013-11-25 10:36 UTC (permalink / raw)
  To: linux-rt-users

Hello,
I'm also facing timer related issue in 2.6.33.7.2-rt30.
I have an embedded board with PowerPC 5200 running Realtime Linux with 
version 2.6.33 (OSADL).
My application is using one high resolution timer in Linux for alarms. This 
timer sometimes didn’t expire. The problem happens very rarely, it may go 
many months between each time it happens on a systems.
The timer is set by function timer_settime with absolute time.
I have done some interesting observations when the timer didn’t expire: 
- Function timer_gettime returns remaining time 1ns.
- Active timers are checked by displaying file /proc/timer_list and the 
timer_list didn’t show this timer in the active timer list. 
I have looked into the Linux source and found a possible scenario:
The function timer_gettime ends up in function common_timer_get (posix-
timers.c).  Function common_timer_get returns it_value.tv_nsec = 1 if timer 
is active and remaining time is <= 0. This means that the timer has counted 
down and the timer state must be 'enqueued' or 'callback'.

I suppose that it is in state 'callback', that means it is running in 
function __run_hrtimer (hrtimer.c). Function __run_hrtimer is calling 
function__remove_hrtimer that remove the timer from timer active list before 
it changes timer state from 'enqueued' to 'callback'.

Several functions are called in function __run_hrtimer between changing 
timer state to 'callback' and the end of the function where the state 
'callback' is cleared.  If it is hanging here, the function timer_gettime 
may return 1ns while the timer is not in the active list. Here it is calling 
several functions in Linux kernel and the callback function in the 
application. 

I have checked the callback function in my application. It is signaling a 
semaphore and setting the timer again on the same thread. I can't see why 
that should not work. 

Have you observed a similar case?
Do you have any idea of what are going wrong here?

--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2013-11-25 10:40 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-04-15 11:02 hrtimer: interrupt took 6742 ns, then RT throttling and hung machine for nearly 2 seconds Stanislav Meduna
2013-04-15 11:54 ` Stanislav Meduna
2013-04-15 12:22 ` Thomas Gleixner
2013-04-15 12:56   ` Stanislav Meduna
2013-04-17 15:46     ` timerfd and softirqd [Was: Re: hrtimer: interrupt took 6742 ns, then RT throttling and hung machine for nearly 2 seconds] Stanislav Meduna
2013-04-18  9:11       ` timerfd read does not return [Was: Re: timerfd and softirqd] Stanislav Meduna
2013-04-19 19:53         ` Stanislav Meduna
2013-04-22  7:35           ` [PATCH] Re: timerfd read does not return Stanislav Meduna
2013-04-22  8:55             ` Stanislav Meduna
2013-04-27  8:34         ` timerfd read does not return - was probably fixed in 3.4.38 Stanislav Meduna
2013-04-28 11:53           ` Carsten Emde
2013-04-29  8:43             ` Stanislav Meduna
2013-05-02 20:02           ` Steven Rostedt
2013-05-10 12:42           ` timerfd read does not return - some traces Stanislav Meduna
2013-05-12 17:31             ` Stanislav Meduna
2013-05-12 23:20             ` timerfd read does not return - hangs inside put_user Stanislav Meduna
2013-05-13  8:05               ` timerfd read does not return - caused by MM fault Stanislav Meduna
2013-05-14  8:31                 ` Livelock in handle_pte_fault [Was: Re: timerfd read does not return] Stanislav Meduna
2013-11-25 10:36                   ` Vijay Katoch

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).