All of lore.kernel.org
 help / color / mirror / Atom feed
From: Gilles Chanteperdrix <gilles.chanteperdrix@xenomai.org>
To: JK.Behnke@web.de
Cc: xenomai@xenomai.org
Subject: Re: [Xenomai] "RT throttling" issue
Date: Fri, 4 Dec 2015 17:33:15 +0100	[thread overview]
Message-ID: <20151204163315.GD29398@hermes.click-hack.org> (raw)
In-Reply-To: <trinity-edf656b8-4eef-4967-92eb-85856ad3f8bd-1449246268211@3capp-webde-bs27>

On Fri, Dec 04, 2015 at 05:24:28PM +0100, JK.Behnke@web.de wrote:
> Hello Gilles,
> 
> sorry for bothering you again.
> 
> > >
> > > Using the I-pipe tracer, you may be able to understand what happens.
> > Should I patch my Linux kernel so that xntrace_user_freeze is called,
> > when Linux scheduler sends "RT throttling activated" message?
> 
> I now have been able to capture an I-pipe frozen file, when the
> "RT throttling" condition occurs.
> I have done this by adding the line
>   ipipe_trace_freeze(1);
> right before
>   printk_sched("sched: RT Throttling activated\n");
> inside "kernel/sched/rt.c".
> I have set back_trace_points to 10000 to get about 1 second of
> backtrace history.
> 
> Here is a fragment of my huge frozen file which contains
> the last 1.6 seconds before the freeze.
> 
> //================== frozen file start =======================
>  +----- Hard IRQs ('|': locked)
>  |+-- Xenomai
>  ||+- Linux ('*': domain stalled, '+': current, '#': current+stalled)
>  |||              +---------- Delay flag ('+': > 1 us, '!': > 10 us)
>  |||              |       +- NMI noise ('N')
>  |||              |       |
>       Type      User Val.   Time    Delay  Function (Parent)
> :|  # [ 4936] -<?>-    0 -1613869      0.852  xnpod_resume_thread+0x4e (xnthread_timeout_handler+0x1e)
> :|  # event   tick@-1611814-1613868+   3.538  xntimer_next_local_shot+0xb4 (xntimer_tick_aperiodic+0x1a0)
> :|  # [ 5066] prc6hmi -1 -1613864+   1.964  __xnpod_schedule+0x77 (xnintr_clock_handler+0x105)
> :|  # [ 4936] -<?>-    0 -1613862+   3.127  __xnpod_schedule+0x45d (xnpod_suspend_thread+0x197)
> :|  # [ 4936] -<?>-    0 -1613859+   1.192  __xnpod_schedule+0x77 (xnpod_suspend_thread+0x171)
> :|  # [ 5066] prc6hmi -1 -1613858!  27.286  __xnpod_schedule+0x45d (xnintr_clock_handler+0x105)
> :|  *+[ 4936] -<?>-    0 -1613831+   1.824  xnpod_resume_thread+0x4e (gatekeeper_thread+0xf1)
> :|  # [  456] gatekee -1 -1613829+   4.200  __xnpod_schedule+0x77 (xnpod_schedule_handler+0x29)
> :|  # [ 4936] -<?>-    0 -1613825      0.952  __xnpod_schedule+0x77 (xnpod_suspend_thread+0x171)
> :|  # [  456] gatekee -1 -1613824!  13.001  __xnpod_schedule+0x45d (xnpod_schedule_handler+0x29)
> :|  *+[ 4936] -<?>-    0 -1613811      0.882  xnpod_resume_thread+0x4e (gatekeeper_thread+0xf1)
> :|  # [  456] gatekee -1 -1613810+   2.506  __xnpod_schedule+0x77 (xnpod_schedule_handler+0x29)
> :|  # [ 4936] -<?>-    0 -1613807      0.952  __xnpod_schedule+0x77 (xnpod_suspend_thread+0x171)
> :|  # [  456] gatekee -1 -1613806!  11.588  __xnpod_schedule+0x45d (xnpod_schedule_handler+0x29)
> :|  *+[ 4936] -<?>-    0 -1613795      0.721  xnpod_resume_thread+0x4e (gatekeeper_thread+0xf1)
> :|  # [  456] gatekee -1 -1613794+   6.395  __xnpod_schedule+0x77 (xnpod_schedule_handler+0x29)
> :|  # [ 4936] -<?>-    0 -1613788      0.922  __xnpod_schedule+0x77 (xnpod_suspend_thread+0x171)
> :|  # [  456] gatekee -1 -1613787!  10.124  __xnpod_schedule+0x45d (xnpod_schedule_handler+0x29)
> :|  *+[ 4936] -<?>-    0 -1613777      0.711  xnpod_resume_thread+0x4e (gatekeeper_thread+0xf1)
> :|  # [  456] gatekee -1 -1613776+   2.686  __xnpod_schedule+0x77 (xnpod_schedule_handler+0x29)
> :|  # [ 4936] -<?>-    0 -1613773      0.892  __xnpod_schedule+0x77 (xnpod_suspend_thread+0x197)
> :|  # [  456] gatekee -1 -1613772! 1977.417  __xnpod_schedule+0x45d (xnpod_schedule_handler+0x29)
> :|  # [ 4956] -<?>-   50 -1611795+   1.794  xnpod_resume_thread+0x4e (xnthread_timeout_handler+0x1e)
> ...
> ...
> ...
> :|  # event   tick@-5129 -5382+   4.771  xntimer_next_local_shot+0xb4 (xntimer_tick_aperiodic+0x1a0)
> :|  # [ 4961] -<?>-   -1 -5377+   3.448  __xnpod_schedule+0x77 (xnintr_clock_handler+0x105)
> :|  # [ 4936] -<?>-    0 -5374+   7.027  __xnpod_schedule+0x45d (xnpod_suspend_thread+0x197)
> :|  # [ 4936] -<?>-    0 -5367+   1.172  __xnpod_schedule+0x77 (xnpod_suspend_thread+0x171)
> :|  # [ 4961] -<?>-   -1 -5366! 253.408  __xnpod_schedule+0x45d (xnintr_clock_handler+0x105)
> :|  # [ 4957] -<?>-   50 -5112+   1.884  xnpod_resume_thread+0x4e (xnthread_periodic_handler+0x28)
> :|  # event   tick@-4525 -5110+   3.548  xntimer_next_local_shot+0xb4 (xntimer_tick_aperiodic+0x1a0)
> :|  # [ 4961] -<?>-   -1 -5107+   2.476  __xnpod_schedule+0x77 (xnintr_clock_handler+0x105)
> :|  # [ 4957] -<?>-   50 -5104!  59.785  __xnpod_schedule+0x45d (xnpod_suspend_thread+0x197)
> :|  # [ 4957] -<?>-   50 -5045+   1.794  __xnpod_schedule+0x77 (xnpod_suspend_thread+0x197)
> :|  # [ 4961] -<?>-   -1 -5043! 538.795  __xnpod_schedule+0x45d (xnintr_clock_handler+0x105)
> :|  # [ 5012] -<?>-   99 -4504+   2.195  xnpod_resume_thread+0x4e (xnthread_periodic_handler+0x28)
> :|  # event   tick@-129  -4502+   4.180  xntimer_next_local_shot+0xb4 (xntimer_tick_aperiodic+0x1a0)
> :|  # [ 4961] -<?>-   -1 -4498+   2.596  __xnpod_schedule+0x77 (xnintr_clock_handler+0x105)
> :|  # [ 5012] -<?>-   99 -4495! 677.824  __xnpod_schedule+0x45d (xnpod_suspend_thread+0x197)
> :|  # [ 5012] -<?>-   99 -3817+   5.844  __xnpod_schedule+0x77 (xnpod_suspend_thread+0x197)
> :|  # [ 4961] -<?>-   -1 -3811! 3709.015  __xnpod_schedule+0x45d (xnintr_clock_handler+0x105)
> :|  # [ 4957] -<?>-   50  -102+   3.147  xnpod_resume_thread+0x4e (xnthread_periodic_handler+0x28)
> :|  # event   tick@474     -99+   4.210  xntimer_next_local_shot+0xb4 (xntimer_tick_aperiodic+0x1a0)
> :|  # [ 4961] -<?>-   -1   -95+   3.388  __xnpod_schedule+0x77 (xnintr_clock_handler+0x105)
> :|  # [ 4957] -<?>-   50   -92!  55.645  __xnpod_schedule+0x45d (xnpod_suspend_thread+0x197)
> :|  # [ 4957] -<?>-   50   -36+   2.065  __xnpod_schedule+0x77 (xnpod_suspend_thread+0x197)
> :|  # [ 4961] -<?>-   -1   -34!  34.474  __xnpod_schedule+0x45d (xnintr_clock_handler+0x105)
> <    #freeze  0x00000001     0    489.264  update_curr_rt+0x10f (task_tick_rt+0x15)
>  |  # [ 5012] -<?>-   99   489      1.724  xnpod_resume_thread+0x4e (xnthread_periodic_handler+0x28)
>  |  # event   tick@4870    490      3.147  xntimer_next_local_shot+0xb4 (xntimer_tick_aperiodic+0x1a0)
>  |  # [    0] -<?>-   -1   494      2.866  __xnpod_schedule+0x77 (xnintr_clock_handler+0x105)
>  |  # [ 5012] -<?>-   99   497    662.948  __xnpod_schedule+0x45d (xnpod_suspend_thread+0x197)
>  |  # [ 5012] -<?>-   99  1159      3.608  __xnpod_schedule+0x77 (xnpod_suspend_thread+0x197)
>  |  # [    0] -<?>-   -1  1163    3720.243  __xnpod_schedule+0x45d (xnintr_clock_handler+0x105)
>  |  # [ 4957] -<?>-   50  4883      2.085  xnpod_resume_thread+0x4e (xnthread_periodic_handler+0x28)
>  |  # event   tick@5474   4885      3.368  xntimer_next_local_shot+0xb4 (xntimer_tick_aperiodic+0x1a0)
>  |  # [    0] -<?>-   -1  4889      1.233  __xnpod_schedule+0x77 (xnintr_clock_handler+0x105)
>  |  # [ 4957] -<?>-   50  4890      0.000  __xnpod_schedule+0x45d (xnpod_suspend_thread+0x197)
> //================== frozen file end =======================
> 
> There is no process utilizing the CPU  for > 950000 us, which
> leads me to the conclusion, that the problem is not caused by
> an infinite loop.

The problem is caused by processes scheduled by the Linux scheduler
utilizing the CPU without never releasing it. These may be different
processes you are right.

> However I observe delay values of up to 3900 us multiple times which
> makes me think that there is a high CPU utilization which in the
> worst case exceeds the limit of 950000 us per 1 second interval.
> Probably I would have to sum up all delay values of processes in 
> secondary mode within 1 second time interval and see if it exceeds
> 950000 us.
> 
> Now I have the following questions
> 1. What is meant by "domain stalled", "current" and "current+stalled"?
> 2. What is that process having ID "[   0]"
> 3. How can I tell processes running in secondary mode from others?
>    (just by looking at the priority value?)
> 
> Any hint on interpreting my frozen file is appreciated.

This trace is worthless. To have a meaningful trace, enable tracing
on functions entry (the default when you enable the I-pipe tracer).

-- 
					    Gilles.
https://click-hack.org


  reply	other threads:[~2015-12-04 16:33 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-11-26 10:32 [Xenomai] "RT throttling" issue JK.Behnke
2015-11-28  6:43 ` Gilles Chanteperdrix
2015-11-29  9:30   ` Jochen Behnke
2015-12-01 13:09   ` JK.Behnke
2015-12-01 18:11     ` Gilles Chanteperdrix
2015-12-02 12:58       ` JK.Behnke
2015-12-04 16:24         ` JK.Behnke
2015-12-04 16:33           ` Gilles Chanteperdrix [this message]
2015-12-04 19:09             ` Jochen Behnke
2015-12-04 20:41               ` Gilles Chanteperdrix
2015-12-08 10:42                 ` Jochen Behnke
2015-12-08 11:05                   ` Gilles Chanteperdrix
2015-12-08 12:12                     ` Jochen Behnke

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151204163315.GD29398@hermes.click-hack.org \
    --to=gilles.chanteperdrix@xenomai.org \
    --cc=JK.Behnke@web.de \
    --cc=xenomai@xenomai.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.