All of lore.kernel.org
 help / color / mirror / Atom feed
From: Stanislav Meduna <stano@meduna.org>
To: "linux-rt-users@vger.kernel.org" <linux-rt-users@vger.kernel.org>
Subject: timerfd and softirqd [Was: Re: hrtimer: interrupt took 6742 ns, then RT throttling and hung machine for nearly 2 seconds]
Date: Wed, 17 Apr 2013 17:46:59 +0200	[thread overview]
Message-ID: <516EC3F3.1080406@meduna.org> (raw)
In-Reply-To: <516BF8FD.2000700@meduna.org>

Hi,

> Yes, we decoded that problem a week ago. Patch is available here:
>
> http://www.spinics.net/lists/linux-tip-commits/msg19100.html

hmm, I think this is something else. From my understanding this
cannot happen on a uniprocessor machine - it is only called
if (cpu != smp_processor_id()).

Could someone point me to a description how exactly the soft
interrupts work? - this is an area that I not really understand
yet. In the kernel I am using these were not yet (or again) handled
by single ksoftirqd.

I caught one of the instances with a trace. The core of the
setup is

1) Thread with a 5 ms timerfd loop, let's call it TimerT,
   RR, rtprio 45

2) ksoftirqd, FF, rtprio 37. Lower than the TimerT because I did
   not want the TimerT be blocked by e.g. net-tx/rx

3) Thread that gets triggered by TimerT, let's call him WorkT.
   RR, rtprio 25. The two use PI lock to synchronize.

4) networking and other irq threads at FF 50 default

I also hacked the throttler code to kill my task with SIGABRT
when RT throttling happens. I am not really sure whether this is
a sane thing to do, but the resulting core shows the thread
in the timerfd read(). Unfortunately I did not have syscall
tracing enabled at that point.

It is difficult to reproduce this - I got it down to once per one
or two days. I could of course try to twiddle with the priorities,
but I will never know whether it fixed the problem or just
made it less probable... Same with the kernel version - I am
using 3.4.25-rt37.


My understanding is that the HRTIMER woke TimerT, in turn the WorkT
was boosted to free the mutex, then the TimerT has done its work.
Now
- either I have a bug and the WorkT loops indefinitely. Not impossible,
  but the code is relatively simple
- or the kernel sometimes does not like a return onto timerfd read
  if the softirq that has woken it did not return yet


This is on a uniprocessor machine. I have edited the task names
and removed the PIDs and other clutter.

ksoftirqd/0 ...1.  1209.368437: softirq_entry: vec=8 [action=HRTIMER]
ksoftirqd/0 d..3.  1209.368444: sched_wakeup: comm=TimerT prio=54
ksoftirqd/0 d..3.  1209.368451: sched_switch: prev_comm=ksoftirqd/0 prev_pid=3 prev_prio=62 prev_state=R+ ==> next_comm=TimerT next_prio=54
     TimerT d..4.  1209.368458: sched_pi_setprio: comm=ksoftirqd/0 pid=3 oldprio=62 newprio=54
     TimerT d..3.  1209.368468: sched_switch: prev_comm=TimerT prev_prio=54 prev_state=D ==> next_comm=ksoftirqd/0 next_pid=3 next_prio=54
ksoftirqd/0 d..4.  1209.368475: sched_wakeup: comm=TimerT prio=54
ksoftirqd/0 d..3.  1209.368478: sched_pi_setprio: comm=ksoftirqd/0 pid=3 oldprio=54 newprio=62
ksoftirqd/0 d..3.  1209.368488: sched_switch: prev_comm=ksoftirqd/0 prev_pid=3 prev_prio=62 prev_state=R+ ==> next_comm=TimerT next_prio=54
     TimerT d..4.  1209.368506: sched_pi_setprio: comm=WorkT oldprio=74 newprio=54
     TimerT d..3.  1209.368517: sched_switch: prev_comm=TimerT prev_prio=54 prev_state=D ==> next_comm=WorkT next_prio=54
      WorkT d..4.  1209.368525: sched_wakeup: comm=TimerT prio=54
      WorkT d..3.  1209.368529: sched_pi_setprio: comm=WorkT oldprio=54 newprio=74
      WorkT d..3.  1209.368538: sched_switch: prev_comm=WorkT prev_prio=74 prev_state=R+ ==> next_comm=TimerT next_prio=54
     TimerT d.h1.  1209.369388: irq_handler_entry: irq=0 name=timer
     TimerT d.h1.  1209.369399: softirq_raise: vec=1 [action=TIMER]
     TimerT d.h1.  1209.369402: irq_handler_exit: irq=0 ret=handled
     TimerT d.h1.  1209.370388: irq_handler_entry: irq=0 name=timer
     TimerT d.h1.  1209.370398: softirq_raise: vec=1 [action=TIMER]
... only timers here, then a HRTIMER comes ...
     TimerT d.h1.  1209.373387: irq_handler_entry: irq=0 name=timer
     TimerT d.h1.  1209.373397: softirq_raise: vec=8 [action=HRTIMER]
     TimerT d.h1.  1209.373400: softirq_raise: vec=1 [action=TIMER]
     TimerT d.h1.  1209.373402: irq_handler_exit: irq=0 ret=handled
     TimerT d.h1.  1209.374387: irq_handler_entry: irq=0 name=timer
... only timers here ...
     TimerT d.h3.  1209.555024: sched_wakeup: comm=irq/5-eth0 prio=49
     TimerT d..3.  1209.555033: sched_switch: prev_comm=TimerT prev_prio=54 prev_state=R+ ==> next_comm=irq/5-eth0 next_pid=442 next_prio=49
 irq/5-eth0 d..1.  1209.555041: softirq_raise: vec=3 [action=NET_RX]
 irq/5-eth0 d..3.  1209.555054: sched_switch: prev_comm=irq/5-eth0 prev_prio=49 prev_state=S ==> next_comm=TimerT next_prio=54
... a HRTIMER much later than the 5 ms ...
     TimerT d.h1.  1210.253256: softirq_raise: vec=8 [action=HRTIMER]
... last entry ...
     TimerT d.h1.  1211.196095: irq_handler_entry: irq=0 name=timer
	
Note1: there is no
  softirq_exit: vec=8 [action=HRTIMER]
until the rt throttler kicked in. There are only two
  softirq_raise: vec=8 [action=HRTIMER]
although it should come once per 5 ms.

Note2: I call tracing_off() right in the throttling code in rt.c. The
last switch from a non-RT task was at 1209.365377 - i.e 1.8 seconds
before. Why did the throttler did not react sooner?

Thanks
-- 
                                           Stano


  reply	other threads:[~2013-04-17 15:47 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-04-15 11:02 hrtimer: interrupt took 6742 ns, then RT throttling and hung machine for nearly 2 seconds Stanislav Meduna
2013-04-15 11:54 ` Stanislav Meduna
2013-04-15 12:22 ` Thomas Gleixner
2013-04-15 12:56   ` Stanislav Meduna
2013-04-17 15:46     ` Stanislav Meduna [this message]
2013-04-18  9:11       ` timerfd read does not return [Was: Re: timerfd and softirqd] Stanislav Meduna
2013-04-19 19:53         ` Stanislav Meduna
2013-04-22  7:35           ` [PATCH] Re: timerfd read does not return Stanislav Meduna
2013-04-22  8:55             ` Stanislav Meduna
2013-04-27  8:34         ` timerfd read does not return - was probably fixed in 3.4.38 Stanislav Meduna
2013-04-28 11:53           ` Carsten Emde
2013-04-29  8:43             ` Stanislav Meduna
2013-05-02 20:02           ` Steven Rostedt
2013-05-10 12:42           ` timerfd read does not return - some traces Stanislav Meduna
2013-05-12 17:31             ` Stanislav Meduna
2013-05-12 23:20             ` timerfd read does not return - hangs inside put_user Stanislav Meduna
2013-05-13  8:05               ` timerfd read does not return - caused by MM fault Stanislav Meduna
2013-05-14  8:31                 ` Livelock in handle_pte_fault [Was: Re: timerfd read does not return] Stanislav Meduna
2013-11-25 10:36                   ` Vijay Katoch

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=516EC3F3.1080406@meduna.org \
    --to=stano@meduna.org \
    --cc=linux-rt-users@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.