sched: RT throttling activated, 3.12.3

* sched: RT throttling activated, 3.12.3
@ 2013-12-11  2:59 Howard Chu
  2013-12-11  3:03 ` Li Zefan
  0 siblings, 1 reply; 5+ messages in thread
From: Howard Chu @ 2013-12-11  2:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List

I just upgraded a system from a 3.5 kernel to 3.12.3 and tried to run some 
new benchmarks on it. The test program's CPU usage ramps up for a few seconds 
and then gradually tails off. Nothing obvious in the user code triggers this 
behavior, so I checked dmesg and saw this:

[   55.037057] JFS: nTxBlock = 8192, nTxLock = 65536
[163591.807470] perf samples too long (2758 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[164061.362762] perf samples too long (5204 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
[167969.339513] [sched_delayed] sched: RT throttling activated
[182741.484637] perf samples too long (294588 > 10000), lowering kernel.perf_event_max_sample_rate to 12500
[182741.484726] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 36.665 msecs
[182822.633084] perf samples too long (292359 > 20000), lowering kernel.perf_event_max_sample_rate to 6250
[182905.606119] perf samples too long (290291 > 40000), lowering kernel.perf_event_max_sample_rate to 3250
[199384.293514] perf samples too long (288142 > 76923), lowering kernel.perf_event_max_sample_rate to 1750
[208507.301027] perf samples too long (285964 > 142857), lowering kernel.perf_event_max_sample_rate to 1000
[208528.976208] perf samples too long (283799 > 250000), lowering kernel.perf_event_max_sample_rate to 500

Why is the kernel throttling my server?
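
A note on the perf lines in the dmesg output above: each threshold in
parentheses works out to 25% of the sampling period at the rate in effect
before the message (2500 ns at 100000 Hz, 5000 ns at 50000 Hz, and so on).
The sketch below reproduces that arithmetic and parses one such line. It is
only an illustration: the 25% figure is assumed to be the default of
kernel.perf_cpu_time_max_percent, the starting rate of 100000 is inferred
from the first threshold, and the function names are hypothetical.

```python
import re

NSEC_PER_SEC = 1_000_000_000
# Assumption: default value of kernel.perf_cpu_time_max_percent.
CPU_TIME_MAX_PERCENT = 25

def allowed_sample_ns(sample_rate_hz, percent=CPU_TIME_MAX_PERCENT):
    """Longest sample (in ns) that fits in `percent` of one sampling period."""
    return NSEC_PER_SEC * percent // (sample_rate_hz * 100)

# Matches the "perf samples too long" lines quoted in this thread,
# whether or not the line was wrapped before "kernel.".
PERF_LINE = re.compile(
    r"perf samples too long \((\d+) > (\d+)\), lowering\s+"
    r"kernel\.perf_event_max_sample_rate to (\d+)"
)

def parse_perf_line(line):
    """Return (observed_ns, threshold_ns, new_rate_hz), or None if no match."""
    m = PERF_LINE.search(line)
    return tuple(int(g) for g in m.groups()) if m else None

if __name__ == "__main__":
    # Rates in effect before each message in the log above.
    for rate in (100000, 50000, 25000, 12500, 6250, 3250, 1750, 1000):
        print(rate, allowed_sample_ns(rate))
```

Under these assumptions the loop prints the same thresholds that appear in
the log: 2500, 5000, 10000, 20000, 40000, 76923, 142857 and 250000.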

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

* Re: sched: RT throttling activated, 3.12.3
  2013-12-11  2:59 sched: RT throttling activated, 3.12.3 Howard Chu
@ 2013-12-11  3:03 ` Li Zefan
  2013-12-11  3:09   ` Howard Chu
  0 siblings, 1 reply; 5+ messages in thread
From: Li Zefan @ 2013-12-11  3:03 UTC (permalink / raw)
  To: Howard Chu; +Cc: Linux Kernel Mailing List

On 2013/12/11 10:59, Howard Chu wrote:
> I just upgraded a system from a 3.5 kernel to 3.12.3 and tried to run some new benchmarks on it. The test program's CPU usage ramps up for a few seconds and then gradually tails off. Nothing obvious in the user code triggers this behavior, so I checked dmesg and saw this:
> 
> [   55.037057] JFS: nTxBlock = 8192, nTxLock = 65536
> [163591.807470] perf samples too long (2758 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
> [164061.362762] perf samples too long (5204 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
> [167969.339513] [sched_delayed] sched: RT throttling activated
> [182741.484637] perf samples too long (294588 > 10000), lowering kernel.perf_event_max_sample_rate to 12500
> [182741.484726] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 36.665 msecs
> [182822.633084] perf samples too long (292359 > 20000), lowering kernel.perf_event_max_sample_rate to 6250
> [182905.606119] perf samples too long (290291 > 40000), lowering kernel.perf_event_max_sample_rate to 3250
> [199384.293514] perf samples too long (288142 > 76923), lowering kernel.perf_event_max_sample_rate to 1750
> [208507.301027] perf samples too long (285964 > 142857), lowering kernel.perf_event_max_sample_rate to 1000
> [208528.976208] perf samples too long (283799 > 250000), lowering kernel.perf_event_max_sample_rate to 500
> 
> Why is the kernel throttling my server?
> 

Because that is the default setting of the kernel.

lxc34:/proc/sys/kernel # cat sched_rt_period_us
1000000
lxc34:/proc/sys/kernel # cat sched_rt_runtime_us
950000
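
Read together, these two values mean that realtime tasks may run for at most
950000 us out of every 1000000 us period, i.e. 95% of the time, before the
scheduler throttles them. A minimal Python sketch of that arithmetic follows.
The function names are hypothetical, and it assumes the usual
/proc/sys/kernel paths shown above; a runtime of -1 is treated as "no limit".

```python
from pathlib import Path

def rt_budget_fraction(period_us, runtime_us):
    """Share of each period available to realtime tasks.

    Returns None when runtime_us is negative (-1 means no limit).
    """
    if runtime_us < 0:
        return None
    return runtime_us / period_us

def read_rt_sysctl(name, base="/proc/sys/kernel"):
    """Read one integer sysctl, e.g. 'sched_rt_period_us' (Linux only)."""
    return int(Path(base, name).read_text())

if __name__ == "__main__":
    period = read_rt_sysctl("sched_rt_period_us")
    runtime = read_rt_sysctl("sched_rt_runtime_us")
    print(period, runtime, rt_budget_fraction(period, runtime))
```

With the defaults shown above, rt_budget_fraction(1000000, 950000) gives 0.95.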


* Re: sched: RT throttling activated, 3.12.3
  2013-12-11  3:03 ` Li Zefan
@ 2013-12-11  3:09   ` Howard Chu
  2013-12-11  4:57     ` Howard Chu
  0 siblings, 1 reply; 5+ messages in thread
From: Howard Chu @ 2013-12-11  3:09 UTC (permalink / raw)
  To: Li Zefan; +Cc: Linux Kernel Mailing List

Li Zefan wrote:
> Because that is the default setting of the kernel.

Apparently a "new" default that didn't exist in 3.5? The code in question is 
not a realtime process. This behavior also wasn't seen in 3.10 or any older 
kernels.
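
One way to check the "not a realtime process" claim is to query the
scheduling policy directly. The sketch below uses Python's
os.sched_getscheduler (Linux only); the helper names are hypothetical. It
inspects a single thread ID, so a multi-threaded server would need the check
repeated for each thread.

```python
import os

# Policies that count against the RT throttling budget.
REALTIME_POLICIES = {os.SCHED_FIFO, os.SCHED_RR}

def is_realtime_policy(policy):
    """True if `policy` is SCHED_FIFO or SCHED_RR."""
    # SCHED_RESET_ON_FORK may be OR-ed into the value; mask it off first.
    return (policy & ~os.SCHED_RESET_ON_FORK) in REALTIME_POLICIES

def thread_is_realtime(tid=0):
    """Check one thread; tid 0 means the calling thread."""
    return is_realtime_policy(os.sched_getscheduler(tid))

if __name__ == "__main__":
    print("calling thread is realtime:", thread_is_realtime())
```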

> lxc34:/proc/sys/kernel # cat sched_rt_period_us
> 1000000
> lxc34:/proc/sys/kernel # cat sched_rt_runtime_us
> 950000
>
>


-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

* Re: sched: RT throttling activated, 3.12.3
  2013-12-11  3:09   ` Howard Chu
@ 2013-12-11  4:57     ` Howard Chu
  2013-12-11  6:06       ` Howard Chu
  0 siblings, 1 reply; 5+ messages in thread
From: Howard Chu @ 2013-12-11  4:57 UTC (permalink / raw)
  To: Li Zefan; +Cc: Linux Kernel Mailing List

Howard Chu wrote:
> Li Zefan wrote:
>> Because that is the default setting of the kernel.
>
> Apparently a "new" default that didn't exist in 3.5? The code in question is
> not a realtime process. This behavior also wasn't seen in 3.10 or any older
> kernels.

I just downgraded to 3.10.23 to double-check - everything is running normally 
there, although a few percent slower than I expected. (The last 3.10 release I 
tried was 3.10.11.)

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

* Re: sched: RT throttling activated, 3.12.3
  2013-12-11  4:57     ` Howard Chu
@ 2013-12-11  6:06       ` Howard Chu
  0 siblings, 0 replies; 5+ messages in thread
From: Howard Chu @ 2013-12-11  6:06 UTC (permalink / raw)
  To: Li Zefan; +Cc: Linux Kernel Mailing List

Howard Chu wrote:
> Howard Chu wrote:
>> Li Zefan wrote:
>>> Because that is the default setting of the kernel.
>>
>> Apparently a "new" default that didn't exist in 3.5? The code in question is
>> not a realtime process. This behavior also wasn't seen in 3.10 or any older
>> kernels.
>
> I just downgraded to 3.10.23 to double-check - everything is running normally
> there, although a few percent slower than I expected. (The last 3.10 release I
> tried was 3.10.11.)
>
For comparison, here's a "normally" behaving benchmark run:
http://highlandsun.com/hyc/linux3.10/

The result is a fairly steady 15,000 ops/sec and CPU usage is around 190% 
(this is a quad-core machine).

On the 3.12.3 kernel:
http://highlandsun.com/hyc/linux3.12/

The CPU usage is initially around 180% but quickly plummets to about 7% and 
stays there. This is a pretty major regression for a "default" kernel setting. 
And given that the target process isn't running with realtime scheduling 
priority, this can only be considered a bug. (Btw, setting both 
sched_rt_period_us and sched_rt_runtime_us to -1 has no effect on this behavior.)

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/
