* Question regarding 'sched: RT throttling activated'
@ 2012-09-11 11:38 Wolfgang Wallner
2012-09-11 12:10 ` John Kacur
2012-09-11 13:05 ` Mike Galbraith
0 siblings, 2 replies; 8+ messages in thread
From: Wolfgang Wallner @ 2012-09-11 11:38 UTC (permalink / raw)
To: linux-rt-users; +Cc: Josef Baumgartner
Hello all,
I have a few questions about the logging entry "sched: RT throttling activated" which I observe, and hope this is the correct place to ask :)
I will start by describing how i come to see this logging entry:
I use an industrial PC with an 2.6.33 RT PREEMP kernel to run a real time application (based on the openPOWERLINK stack [1]), which consists of a kernel part and a user space part.
My application uses the openPOWERLINK stack to exchange data with other real time nodes in the network (a good overview how this is done is given on wikipedia [2]).
When I let my application run overnight, I always see that it stops working after some hours, and the output of dmesg shows the following:
[ 622.722690] EdrvInitOne waiting for link up...
[ 624.439269] EdrvInitOne finished with 0
[ 624.439334] EdrvInit local MAC = 00 60 65 06 B2 25
[ 624.454662] EplApiProcessImageAlloc: Alloc(640, 1100, 2, 2)
[ 624.454672] EplApiProcessImageAlloc: Alloc(eeea2000, 640, eef36800, 1100)
[ 624.477242] ifconfig epl 192.168.100.240 returned 0
[ 635.058012] epl: no IPv6 routers present
[60382.945209] sched: RT throttling activated
[60382.945209] EplTimerHighResk: Continuous timer (handle 0x10000001) had to skip 836 interval(s)!
The first 7 entries are the normal logging output from the powerlink stack as it starts up. I just copied them into the mail for completeness.
The last line is debugging output of the powerlink stack as it complains that its timer was not handled for some time.
The line with "sched: RT throttling activated" is the one I'm interested in.
I don't know what it means, and searching for this term did not turn up much.
My questions are now:
* What does this logging entry mean?
Could you please point me to some information about RT throttling so that I can understand what's it about?
* How to debug this issue?
At the moment I don't know what is cause and what is effect.
I'm not sure if I have a misconfiguration somewhere on my linux system and this influences my application or vice versa.
Any tips on how to find out what's going on are greatly appreciated :)
Regards,
Wolfgang Wallner
[1] http://sourceforge.net/projects/openpowerlink/
[2] http://en.wikipedia.org/wiki/Ethernet_Powerlink
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Question regarding 'sched: RT throttling activated'
2012-09-11 11:38 Question regarding 'sched: RT throttling activated' Wolfgang Wallner
@ 2012-09-11 12:10 ` John Kacur
2012-09-11 13:20 ` Mike Galbraith
2012-09-11 13:05 ` Mike Galbraith
1 sibling, 1 reply; 8+ messages in thread
From: John Kacur @ 2012-09-11 12:10 UTC (permalink / raw)
To: Wolfgang Wallner; +Cc: linux-rt-users, Josef Baumgartner
On Tue, 11 Sep 2012, Wolfgang Wallner wrote:
> Hello all,
>
> I have a few questions about the logging entry "sched: RT throttling activated" which I observe, and hope this is the correct place to ask :)
> I will start by describing how i come to see this logging entry:
>
> I use an industrial PC with an 2.6.33 RT PREEMP kernel to run a real time application (based on the openPOWERLINK stack [1]), which consists of a kernel part and a user space part.
> My application uses the openPOWERLINK stack to exchange data with other real time nodes in the network (a good overview how this is done is given on wikipedia [2]).
>
> When I let my application run overnight, I always see that it stops working after some hours, and the output of dmesg shows the following:
>
> [ 622.722690] EdrvInitOne waiting for link up...
> [ 624.439269] EdrvInitOne finished with 0
> [ 624.439334] EdrvInit local MAC = 00 60 65 06 B2 25
> [ 624.454662] EplApiProcessImageAlloc: Alloc(640, 1100, 2, 2)
> [ 624.454672] EplApiProcessImageAlloc: Alloc(eeea2000, 640, eef36800, 1100)
> [ 624.477242] ifconfig epl 192.168.100.240 returned 0
> [ 635.058012] epl: no IPv6 routers present
> [60382.945209] sched: RT throttling activated
> [60382.945209] EplTimerHighResk: Continuous timer (handle 0x10000001) had to skip 836 interval(s)!
>
>
> The first 7 entries are the normal logging output from the powerlink stack as it starts up. I just copied them into the mail for completeness.
> The last line is debugging output of the powerlink stack as it complains that its timer was not handled for some time.
>
> The line with "sched: RT throttling activated" is the one I'm interested in.
> I don't know what it means, and searching for this term did not turn up much.
What kind of a search? A grep of "RT\ throttling" would have found it.
>
> My questions are now:
>
> * What does this logging entry mean?
> Could you please point me to some information about RT throttling so that I can understand what's it about?
in the kernel, look at Documentation/scheduler/sched-rt-group.txt
>
> * How to debug this issue?
> At the moment I don't know what is cause and what is effect.
> I'm not sure if I have a misconfiguration somewhere on my linux system and this influences my application or vice versa.
>
> Any tips on how to find out what's going on are greatly appreciated :)
>
It is considered a feature that allows non-RT tasks to make some progress,
so that poorly written user space programs can't totally freeze your
system - which would often be blamed on the rt kernel. You can shut this
off for a harder kind of real-time - in which case, if your system is freezing,
turning it back on could help to show whether the kernel is really
freezing, or if it is a result of a problem with the user-space program.
John
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Question regarding 'sched: RT throttling activated'
2012-09-11 11:38 Question regarding 'sched: RT throttling activated' Wolfgang Wallner
2012-09-11 12:10 ` John Kacur
@ 2012-09-11 13:05 ` Mike Galbraith
1 sibling, 0 replies; 8+ messages in thread
From: Mike Galbraith @ 2012-09-11 13:05 UTC (permalink / raw)
To: Wolfgang Wallner; +Cc: linux-rt-users, Josef Baumgartner
On Tue, 2012-09-11 at 13:38 +0200, Wolfgang Wallner wrote:
> Hello all,
>
> I have a few questions about the logging entry "sched: RT throttling
> activated" which I observe, and hope this is the correct place to
> ask :)
> I will start by describing how i come to see this logging entry:
>
> I use an industrial PC with an 2.6.33 RT PREEMP kernel to run a real
> time application (based on the openPOWERLINK stack [1]), which
> consists of a kernel part and a user space part.
> My application uses the openPOWERLINK stack to exchange data with
> other real time nodes in the network (a good overview how this is done
> is given on wikipedia [2]).
>
> When I let my application run overnight, I always see that it stops
> working after some hours, and the output of dmesg shows the following:
>
> [ 622.722690] EdrvInitOne waiting for link up...
> [ 624.439269] EdrvInitOne finished with 0
> [ 624.439334] EdrvInit local MAC = 00 60 65 06 B2 25
> [ 624.454662] EplApiProcessImageAlloc: Alloc(640, 1100, 2, 2)
> [ 624.454672] EplApiProcessImageAlloc: Alloc(eeea2000, 640, eef36800,
> 1100)
> [ 624.477242] ifconfig epl 192.168.100.240 returned 0
> [ 635.058012] epl: no IPv6 routers present
> [60382.945209] sched: RT throttling activated
> [60382.945209] EplTimerHighResk: Continuous timer (handle 0x10000001)
> had to skip 836 interval(s)!
Likely realtime task went gaga. Kernel or userspace doesn't matter.
> My questions are now:
>
> * What does this logging entry mean?
> Could you please point me to some information about RT throttling
> so that I can understand what's it about?
With stock settings, it means realtime task[s] consumed > 95% of the
throttle interval (1s), so the throttle activated, allowing SCHED_NORMAL
tasks to have a sip of CPU, to let you try to save the box from nutty RT
CPU hogs. See kernel/sched_rt.c.
>
> * How to debug this issue?
> At the moment I don't know what is cause and what is effect.
> I'm not sure if I have a misconfiguration somewhere on my linux system
> and this influences my application or vice versa.
>
> Any tips on how to find out what's going on are greatly appreciated :)
Hm, I would revoke godlike powers until the recipients can be trusted to
use them wisely. 100% CPU utilization being considered highly unwise.
-Mike
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Question regarding 'sched: RT throttling activated'
2012-09-11 12:10 ` John Kacur
@ 2012-09-11 13:20 ` Mike Galbraith
0 siblings, 0 replies; 8+ messages in thread
From: Mike Galbraith @ 2012-09-11 13:20 UTC (permalink / raw)
To: John Kacur; +Cc: Wolfgang Wallner, linux-rt-users, Josef Baumgartner
On Tue, 2012-09-11 at 14:10 +0200, John Kacur wrote:
>
> On Tue, 11 Sep 2012, Wolfgang Wallner wrote:
>
> > Hello all,
> >
> > I have a few questions about the logging entry "sched: RT throttling
> activated" which I observe, and hope this is the correct place to
> ask :)
> > I will start by describing how i come to see this logging entry:
> >
> > I use an industrial PC with an 2.6.33 RT PREEMP kernel...
BTW, there have been a few throttle fixes since then that you'll likely
want to backport. I don't have a list handy, but they're easy to find
with git.
-Mike
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Question regarding 'sched: RT throttling activated'
@ 2012-09-11 15:34 Wolfgang Wallner
2012-09-11 18:10 ` Mike Galbraith
0 siblings, 1 reply; 8+ messages in thread
From: Wolfgang Wallner @ 2012-09-11 15:34 UTC (permalink / raw)
To: linux-rt-users; +Cc: John Kacur, Josef Baumgartner, Mike Galbraith
Hello,
thanks for your feedback.
>> The line with "sched: RT throttling activated" is the one I'm interested in.
>> I don't know what it means, and searching for this term did not turn up much.
> What kind of a search? A grep of "RT\ throttling" would have found it.
Sorry, I was not aware of this (I'm not used to working with kernel sources), I did a web search for "sched: RT throttling activated".
I will consider greping the kernel sources next time (to be clear: you did mean a grep in the kernel sources?).
> Likely realtime task went gaga. Kernel or userspace doesn't matter.
>> My questions are now:
>>
>> * What does this logging entry mean?
>> Could you please point me to some information about RT throttling
>> so that I can understand what's it about?
> With stock settings, it means realtime task[s] consumed > 95% of the
> throttle interval (1s), so the throttle activated, allowing SCHED_NORMAL
> tasks to have a sip of CPU, to let you try to save the box from nutty RT
> CPU hogs. See kernel/sched_rt.c.
I think if the application turns into a cpu hog, this would be visible in a sched_switch ftrace.
Thus, I added the line tracing_off(); in kernel/sched_rt.c at the place where the "sched: RT throttling activated" string is printed and recompiled my kernel. I will let the sched_switch tracing run togheter with my application until tomorrow and hopefully see in the trace what's wrong.
I did not change the default throttling settings. Thus, the error should be visible in the resulting trace about 0.95 seconds before the end of the trace.
Is this approach reasonable? Any other suggestions?
> BTW, there have been a few throttle fixes since then that you'll likely
> want to backport. I don't have a list handy, but they're easy to find
> with git.
Thanks for the tip, I will have a look at it.
Will it be enough to watch out for changes in sched_rt.c or is throttling also handled elsewhere?
regards, Wolfgang
-----linux-rt-users-owner@vger.kernel.org schrieb: -----
An: John Kacur <jkacur@redhat.com>
Von: Mike Galbraith
Gesendet von: linux-rt-users-owner@vger.kernel.org
Datum: 11.09.2012 15:20
Kopie: Wolfgang Wallner <Wolfgang.Wallner@br-automation.com>, linux-rt-users@vger.kernel.org, Josef Baumgartner <Josef.Baumgartner@br-automation.com>
Betreff: Re: Question regarding 'sched: RT throttling activated'
On Tue, 2012-09-11 at 14:10 +0200, John Kacur wrote:
>
> On Tue, 11 Sep 2012, Wolfgang Wallner wrote:
>
> > Hello all,
> >
> > I have a few questions about the logging entry "sched: RT throttling
> activated" which I observe, and hope this is the correct place to
> ask :)
> > I will start by describing how i come to see this logging entry:
> >
> > I use an industrial PC with an 2.6.33 RT PREEMP kernel...
BTW, there have been a few throttle fixes since then that you'll likely
want to backport. I don't have a list handy, but they're easy to find
with git.
-Mike
--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Question regarding 'sched: RT throttling activated'
2012-09-11 15:34 Wolfgang Wallner
@ 2012-09-11 18:10 ` Mike Galbraith
2012-09-11 18:55 ` Sven-Thorsten Dietrich
0 siblings, 1 reply; 8+ messages in thread
From: Mike Galbraith @ 2012-09-11 18:10 UTC (permalink / raw)
To: Wolfgang Wallner; +Cc: linux-rt-users, John Kacur, Josef Baumgartner
On Tue, 2012-09-11 at 17:34 +0200, Wolfgang Wallner wrote:
> >> My questions are now:
> >>
> >> * What does this logging entry mean?
> >> Could you please point me to some information about RT throttling
> >> so that I can understand what's it about?
>
> > With stock settings, it means realtime task[s] consumed > 95% of the
> > throttle interval (1s), so the throttle activated, allowing
> SCHED_NORMAL
> > tasks to have a sip of CPU, to let you try to save the box from
> nutty RT
> > CPU hogs. See kernel/sched_rt.c.
>
> I think if the application turns into a cpu hog...
Where from comes 'if'? You presented evidence, so methinks there's not
a _lot_ of room for an 'if', there's just a missing 'why'.
-Mike
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Question regarding 'sched: RT throttling activated'
2012-09-11 18:10 ` Mike Galbraith
@ 2012-09-11 18:55 ` Sven-Thorsten Dietrich
2012-09-12 3:40 ` Mike Galbraith
0 siblings, 1 reply; 8+ messages in thread
From: Sven-Thorsten Dietrich @ 2012-09-11 18:55 UTC (permalink / raw)
To: Mike Galbraith
Cc: Wolfgang Wallner, linux-rt-users, John Kacur, Josef Baumgartner
On Sep 11, 2012, at 11:10 AM, Mike Galbraith <efault@gmx.de> wrote:
> On Tue, 2012-09-11 at 17:34 +0200, Wolfgang Wallner wrote:
>
>>>> My questions are now:
>>>>
>>>> * What does this logging entry mean?
>>>> Could you please point me to some information about RT throttling
>>>> so that I can understand what's it about?
>>
>>> With stock settings, it means realtime task[s] consumed > 95% of the
>>> throttle interval (1s), so the throttle activated, allowing
>> SCHED_NORMAL
>>> tasks to have a sip of CPU, to let you try to save the box from
>> nutty RT
>>> CPU hogs. See kernel/sched_rt.c.
>>
>> I think if the application turns into a cpu hog...
>
> Where from comes 'if'? You presented evidence, so methinks there's not
> a _lot_ of room for an 'if', there's just a missing 'why'.
Well, as you know a lot of folks do polling, but its important to make sure not ALL cores are doing that...
Need to look at the code, maybe we could add a mask to restrict throttling per cpu (e.g. to 0) ?
Sven
>
> -Mike
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Question regarding 'sched: RT throttling activated'
2012-09-11 18:55 ` Sven-Thorsten Dietrich
@ 2012-09-12 3:40 ` Mike Galbraith
0 siblings, 0 replies; 8+ messages in thread
From: Mike Galbraith @ 2012-09-12 3:40 UTC (permalink / raw)
To: Sven-Thorsten Dietrich
Cc: Wolfgang Wallner, linux-rt-users, John Kacur, Josef Baumgartner
On Tue, 2012-09-11 at 11:55 -0700, Sven-Thorsten Dietrich wrote:
> On Sep 11, 2012, at 11:10 AM, Mike Galbraith <efault@gmx.de> wrote:
>
> > On Tue, 2012-09-11 at 17:34 +0200, Wolfgang Wallner wrote:
> >
> >>>> My questions are now:
> >>>>
> >>>> * What does this logging entry mean?
> >>>> Could you please point me to some information about RT throttling
> >>>> so that I can understand what's it about?
> >>
> >>> With stock settings, it means realtime task[s] consumed > 95% of the
> >>> throttle interval (1s), so the throttle activated, allowing
> >> SCHED_NORMAL
> >>> tasks to have a sip of CPU, to let you try to save the box from
> >> nutty RT
> >>> CPU hogs. See kernel/sched_rt.c.
> >>
> >> I think if the application turns into a cpu hog...
> >
> > Where from comes 'if'? You presented evidence, so methinks there's not
> > a _lot_ of room for an 'if', there's just a missing 'why'.
>
> Well, as you know a lot of folks do polling, but its important to make
> sure not ALL cores are doing that...
> [60382.945209] EplTimerHighResk: Continuous timer (handle 0x10000001)
> had to skip 836 interval(s)!
That and 'stops working' (as in forever) made me suspect spinner rather
than transient cpu over-commit.
> Need to look at the code, maybe we could add a mask to restrict
> throttling per cpu (e.g. to 0) ?
IMHO it's a debugging tool you turn off for normal operation. Biggest
problem I've ever had with the thing is it allowing hogs to run over to
the neighbors to borrow a cup of CPU. That's fixed.
-Mike
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2012-09-12 3:41 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-09-11 11:38 Question regarding 'sched: RT throttling activated' Wolfgang Wallner
2012-09-11 12:10 ` John Kacur
2012-09-11 13:20 ` Mike Galbraith
2012-09-11 13:05 ` Mike Galbraith
-- strict thread matches above, loose matches on Subject: below --
2012-09-11 15:34 Wolfgang Wallner
2012-09-11 18:10 ` Mike Galbraith
2012-09-11 18:55 ` Sven-Thorsten Dietrich
2012-09-12 3:40 ` Mike Galbraith
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).