* sched: RT throttling activated, 3.12.3
From: Howard Chu @ 2013-12-11 2:59 UTC
To: Linux Kernel Mailing List
I just upgraded a system from a 3.5 kernel to 3.12.3 and attempted to run some
new benchmarks on it. I see my test program ramps up in CPU usage for a few
seconds and then it gradually tails off. There's nothing obvious in the user
code to trigger this behavior, so I check dmesg, and see this:
[ 55.037057] JFS: nTxBlock = 8192, nTxLock = 65536
[163591.807470] perf samples too long (2758 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[164061.362762] perf samples too long (5204 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
[167969.339513] [sched_delayed] sched: RT throttling activated
[182741.484637] perf samples too long (294588 > 10000), lowering kernel.perf_event_max_sample_rate to 12500
[182741.484726] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 36.665 msecs
[182822.633084] perf samples too long (292359 > 20000), lowering kernel.perf_event_max_sample_rate to 6250
[182905.606119] perf samples too long (290291 > 40000), lowering kernel.perf_event_max_sample_rate to 3250
[199384.293514] perf samples too long (288142 > 76923), lowering kernel.perf_event_max_sample_rate to 1750
[208507.301027] perf samples too long (285964 > 142857), lowering kernel.perf_event_max_sample_rate to 1000
[208528.976208] perf samples too long (283799 > 250000), lowering kernel.perf_event_max_sample_rate to 500
Why is the kernel throttling my server?
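For anyone wanting to line these messages up against a benchmark timeline, here is a rough Python sketch that pulls the numbers out of the "perf samples too long" lines. The regular expression is only inferred from the lines quoted in this message, and the function name and tuple layout are made up for illustration; other kernel versions may word the message differently.

```python
import re

# Pattern inferred from the dmesg lines quoted in this thread (an assumption,
# not a stable kernel interface). "\s+" after "lowering" tolerates the line
# wrap that mail clients introduce.
PERF_LINE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+perf samples too long "
    r"\((?P<sample>\d+) > (?P<limit>\d+)\), lowering\s+"
    r"kernel\.perf_event_max_sample_rate to (?P<rate>\d+)"
)

def parse_perf_throttle(log_text):
    """Return (timestamp_s, sample, limit, new_rate) tuples found in log_text."""
    events = []
    for m in PERF_LINE.finditer(log_text):
        events.append(
            (float(m["ts"]), int(m["sample"]), int(m["limit"]), int(m["rate"]))
        )
    return events

# Two lines copied from the log above, one wrapped and one unwrapped.
SAMPLE = """\
[163591.807470] perf samples too long (2758 > 2500), lowering
kernel.perf_event_max_sample_rate to 50000
[182741.484637] perf samples too long (294588 > 10000), lowering kernel.perf_event_max_sample_rate to 12500
"""

for ts, sample, limit, rate in parse_perf_throttle(SAMPLE):
    print(f"{ts:.6f}s sample={sample} limit={limit} new_rate={rate}")
```

The "RT throttling activated" and NMI handler lines are deliberately not matched; they would need their own patterns.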
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
* Re: sched: RT throttling activated, 3.12.3
From: Li Zefan @ 2013-12-11 3:03 UTC
To: Howard Chu; +Cc: Linux Kernel Mailing List
On 2013/12/11 10:59, Howard Chu wrote:
> I just upgraded a system from a 3.5 kernel to 3.12.3 and attempted to run some new benchmarks on it. I see my test program ramps up in CPU usage for a few seconds and then it gradually tails off. There's nothing obvious in the user code to trigger this behavior, so I check dmesg, and see this:
>
> [ 55.037057] JFS: nTxBlock = 8192, nTxLock = 65536
> [163591.807470] perf samples too long (2758 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
> [164061.362762] perf samples too long (5204 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
> [167969.339513] [sched_delayed] sched: RT throttling activated
> [182741.484637] perf samples too long (294588 > 10000), lowering kernel.perf_event_max_sample_rate to 12500
> [182741.484726] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 36.665 msecs
> [182822.633084] perf samples too long (292359 > 20000), lowering kernel.perf_event_max_sample_rate to 6250
> [182905.606119] perf samples too long (290291 > 40000), lowering kernel.perf_event_max_sample_rate to 3250
> [199384.293514] perf samples too long (288142 > 76923), lowering kernel.perf_event_max_sample_rate to 1750
> [208507.301027] perf samples too long (285964 > 142857), lowering kernel.perf_event_max_sample_rate to 1000
> [208528.976208] perf samples too long (283799 > 250000), lowering kernel.perf_event_max_sample_rate to 500
>
> Why is the kernel throttling my server?
>
Because that is the default setting of the kernel.
lxc34:/proc/sys/kernel # cat sched_rt_period_us
1000000
lxc34:/proc/sys/kernel # cat sched_rt_runtime_us
950000
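To spell out what those two values mean: realtime tasks may run for at most sched_rt_runtime_us microseconds out of every sched_rt_period_us microseconds, and a runtime of -1 removes the limit. A minimal Python sketch of that arithmetic follows; the function name is hypothetical and the inputs are simply the values shown above.

```python
def rt_runtime_share(period_us, runtime_us):
    """Fraction of each period available to realtime tasks.

    Returns None when runtime_us is negative, which the kernel treats
    as "no limit".
    """
    if runtime_us < 0:
        return None
    if period_us <= 0:
        raise ValueError("period must be positive")
    return runtime_us / period_us

# Values taken from the sysctl output above.
share = rt_runtime_share(1_000_000, 950_000)
print(f"RT tasks may use {share:.0%} of each period")
```

With these defaults, realtime tasks that try to consume a whole CPU are held back for the remaining 5% of each one-second period.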
* Re: sched: RT throttling activated, 3.12.3
From: Howard Chu @ 2013-12-11 3:09 UTC
To: Li Zefan; +Cc: Linux Kernel Mailing List
Li Zefan wrote:
> Because that is the default setting of the kernel.
Apparently a "new" default that didn't exist in 3.5? The code in question is
not a realtime process. This behavior also wasn't seen in 3.10 or any older
kernels.
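One way to verify the "not a realtime process" claim from userspace is to ask the kernel for the scheduling policy. The sketch below uses Python's os.sched_getscheduler (Linux only); the helper names are made up for illustration. Note the assumption that checking one ID is enough: on Linux the policy is per thread, so a multithreaded server would need each thread ID checked.

```python
import os

# Policies that belong to the realtime scheduling class on Linux.
REALTIME_POLICIES = {os.SCHED_FIFO, os.SCHED_RR}

# The reset-on-fork flag can be OR'ed into the returned policy value,
# so mask it off before comparing (0 if this Python build lacks it).
_RESET_ON_FORK = getattr(os, "SCHED_RESET_ON_FORK", 0)

def is_realtime_policy(policy):
    """True if the given policy value is SCHED_FIFO or SCHED_RR."""
    return (policy & ~_RESET_ON_FORK) in REALTIME_POLICIES

def uses_realtime_policy(pid=0):
    """Check a process or thread ID; 0 means the caller."""
    return is_realtime_policy(os.sched_getscheduler(pid))

print("caller is realtime:", uses_realtime_policy())
```

A False result for every thread of the benchmark would support the statement that RT throttling should not apply to it directly.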
> lxc34:/proc/sys/kernel # cat sched_rt_period_us
> 1000000
> lxc34:/proc/sys/kernel # cat sched_rt_runtime_us
> 950000
* Re: sched: RT throttling activated, 3.12.3
From: Howard Chu @ 2013-12-11 4:57 UTC
To: Li Zefan; +Cc: Linux Kernel Mailing List
Howard Chu wrote:
> Li Zefan wrote:
>> Because that is the default setting of the kernel.
>
> Apparently a "new" default that didn't exist in 3.5? The code in question is
> not a realtime process. This behavior also wasn't seen in 3.10 or any older
> kernels.
I just downgraded to 3.10.23 to double-check - everything is running normally
there, although a few percent slower than I expected. (Last time I tried 3.10
it was 3.10.11.)
* Re: sched: RT throttling activated, 3.12.3
From: Howard Chu @ 2013-12-11 6:06 UTC
To: Li Zefan; +Cc: Linux Kernel Mailing List
Howard Chu wrote:
> Howard Chu wrote:
>> Li Zefan wrote:
>>> Because that is the default setting of the kernel.
>>
>> Apparently a "new" default that didn't exist in 3.5? The code in question is
>> not a realtime process. This behavior also wasn't seen in 3.10 or any older
>> kernels.
>
> I just downgraded to 3.10.23 to double-check - everything is running normally
> there, although a few percent slower than I expected. (Last time I tried 3.10
> it was 3.10.11.)
>
For comparison, here's a "normally" behaving benchmark run:
http://highlandsun.com/hyc/linux3.10/
The result is a fairly steady 15,000 ops/sec and CPU usage is around 190%
(this is a quad-core machine).
On the 3.12.3 kernel:
http://highlandsun.com/hyc/linux3.12/
The CPU usage is initially around 180% but quickly plummets to about 7% and
stays there. This is a pretty major regression for a "default" kernel setting.
And given that the target process isn't running with realtime scheduling
priority, this can only be considered a bug. (Btw, setting both
sched_rt_period_us and sched_rt_runtime_us to -1 has no effect on this behavior.)
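To put the CPU figures above in terms of total machine capacity, a small Python sketch of the arithmetic. It assumes the percentages are top-style figures where 100% equals one core, so a quad-core machine tops out at 400%; the function name is hypothetical.

```python
def share_of_machine(cpu_percent, cores):
    """Convert a per-core-style CPU percentage into a fraction of all cores."""
    return cpu_percent / (100.0 * cores)

CORES = 4  # quad-core machine, per the message above

for label, pct in [("3.10 steady", 190), ("3.12.3 initial", 180), ("3.12.3 after drop", 7)]:
    print(f"{label}: {share_of_machine(pct, CORES):.2%} of the machine")

# How many times less CPU the process gets after the drop on 3.12.3.
print(f"drop factor: {180 / 7:.1f}x")
```

Under that assumption the process falls from a bit under half of the machine to under 2% of it.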