Re: How how latent should non-preemptive scheduling be?

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* Re: How how latent should non-preemptive scheduling be?
       [not found]       ` <fa.Jx/Ygtm46CVRawlA6OnfYNn6cN0@ifi.uio.no>
@ 2008-09-18  7:26         ` Sitsofe Wheeler
       [not found]         ` <fa.iIHgL48F3T5VAqFw3mqaf9Pzrs4@ifi.uio.no>
  1 sibling, 0 replies; 17+ messages in thread
From: Sitsofe Wheeler @ 2008-09-18  7:26 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Arjan van de Ven, linux-kernel, Ingo Molnar

Peter Zijlstra wrote:
> On Wed, 2008-09-17 at 14:54 -0700, Arjan van de Ven wrote:
>> On Wed, 17 Sep 2008 22:48:55 +0100
>>
>>>> Cause                                                Maximum
>>>>
>> Percentage 
>>
>> Scheduler: waiting for cpu                        208 msec         59.4 %
>>
>>
>> you're rather CPU bound, and your process was woken up but didn't run for over 200 milliseconds..
>> that sounds like a scheduler fairness issue!
> 
> Really hard subject. Perfect fairness requires 0 latency - which with a
> CPU only being able to run one thing at a time is impossible. So what
> latency ends up being is a measure for the convergence towards fairness.
> 
> Anyway - 200ms isn't too weird depending on the circumstances. We start
> out with a 20ms latency for UP, we then multiply with 1+log2(nr_cpus)
> which in say a quad core machine ends up with 60ms. That ought to mean
> that under light load the max latency should not exceed twice that
> (basically a consequence of the Nyquist-Shannon sampling theorem IIRC).
> 
> Now, if you get get under some load (by default: nr_running > 5) the
> expected latency starts to linearly grow with nr_running.
> 
>>From what I gather from the reply to this email the machine was not
> doing much (and after having looked up the original email I see its a
> eeeeeeeee atom - which is dual cpu iirc, so that yields 40ms default) -
> so 200 is definately on the high side.

No, it's not an eeeeee atom. It's an eee celeron M (900Mhz) so it's 
definitely a single CPU with no hyperthreading (and SMP is not enabled 
in the kernel config either). It has less grunt that the atom and can't 
do cpu scaling either (although it seems to have C states).

The load average is less than 0.5 but obviously I don't know if it is 
periodically spiking over 5 and then smoothing out.

> What you can do to investigate this, is use the sched_wakeup tracer from
> ftrace, that should give a function trace of the highest wakeup latency
> showing what the kernel is doing.

Thanks for the hint - I was wondering where to look next.

^ permalink raw reply	[flat|nested] 17+ messages in thread

[parent not found: <fa.iIHgL48F3T5VAqFw3mqaf9Pzrs4@ifi.uio.no>]

[parent not found: <fa.Td8xkKZKMSMghlJmEYefTRVF2kc@ifi.uio.no>]

* Re: How how latent should non-preemptive scheduling be?
       [not found]           ` <fa.Td8xkKZKMSMghlJmEYefTRVF2kc@ifi.uio.no>
@ 2008-09-19 11:54             ` Sitsofe Wheeler
  2008-09-19 14:20               ` Peter Zijlstra
  2008-09-22 11:57               ` Ingo Molnar
  0 siblings, 2 replies; 17+ messages in thread
From: Sitsofe Wheeler @ 2008-09-19 11:54 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Arjan van de Ven, linux-kernel, Steven Rostedt

Ingo Molnar wrote:
> here's two quick howtos:
> 
>   http://redhat.com/~mingo/sched-devel.git/readme-tracer.txt
>   http://redhat.com/~mingo/sched-devel.git/howto-trace-latencies.txt

These two files appear to be identical. Is this intentional?

Anyway after following your instructions a putting together a small 
script to dice the output, I collated the 10 switches which took the 
longest:

# Top ten longest switches
# Rel TS Process   Abs TS
0.122161 hald-3423 1867.821170 ***
0.039438 <idle>-0  1867.379054
0.036318 hald-3423 1867.669009
0.031362 <idle>-0  1868.002762
0.030000 hald-3423 1867.699009
0.028933 <idle>-0  1867.529238
0.028539 <idle>-0  1867.228861
0.028196 <idle>-0  1867.128731
0.027763 <idle>-0  1868.101449
0.027513 <idle>-0  1867.028606

# tracer: sched_switch from around longest switch
#
#           TASK-PID   CPU#    TIMESTAMP  FUNCTION
#              | |      |          |         |
           <idle>-0     [00]  1867.608017:      0:140:R   +     3:115:S
           <idle>-0     [00]  1867.608038:      0:140:R   +  3423:120:D
           <idle>-0     [00]  1867.608045:      0:140:R ==>     3:115:R
      ksoftirqd/0-3     [00]  1867.608048:      3:115:S ==>  3423:120:R
             hald-3423  [00]  1867.629350:   3423:120:R   +  6096:120:S
             hald-3423  [00]  1867.632691:   3423:120:R   +  3827:120:S
             hald-3423  [00]  1867.669009:   3423:120:R   +  3998:120:S
             hald-3423  [00]  1867.699009:   3423:120:R   +  6097:120:S
          ***hald-3423  [00]  1867.821170:   3423:120:R ==>  6096:120:R
        rhythmbox-6096  [00]  1867.821219:   6096:120:S ==>  6097:120:R
        rhythmbox-6097  [00]  1867.821262:   6097:120:R   +  3826:120:S
        rhythmbox-6097  [00]  1867.821289:   6097:120:S ==>  3826:120:R
       pulseaudio-3826  [00]  1867.821332:   3826:120:R   +  6097:120:S
       pulseaudio-3826  [00]  1867.821374:   3826:120:S ==>  6097:120:R
        rhythmbox-6097  [00]  1867.821380:   6097:120:S ==>  3998:120:R
        rhythmbox-3998  [00]  1867.821709:   3998:120:S ==>  3827:120:R
       pulseaudio-3827  [00]  1867.824041:   3827:120:R   +  3826:120:S


>    LatencyTOP version 0.4       (C) 2008 Intel Corporation
> 
> Cause                                                Maximum     Percentage
> Scheduler: waiting for cpu                        152.4 msec         13.8 %
> Userspace lock contention                           5.0 msec         68.0 %
> Waiting for event (poll)                            5.0 msec         14.3 %
> Waiting for event (select)                          4.9 msec          3.5 %
> msleep acpi_ec_wait acpi_ec_transaction acpi_ec_bu  1.9 msec          0.1 %
> msleep acpi_ec_wait acpi_ec_transaction acpi_ec_re  1.9 msec          0.2 %
> msleep acpi_ec_wait acpi_ec_transaction acpi_ec_bu  1.9 msec          0.1 %
> Executing raw SCSI command                          1.1 msec          0.0 %
> Waiting for TTY to finish sending                   0.4 msec          0.0 %
> 
> 
> Process rhythmbox (3998)                   Total: 328.3 msec
> Scheduler: waiting for cpu                        152.4 msec         80.0 %
> Userspace lock contention                           4.1 msec         15.7 %
> Waiting for event (poll)                            2.7 msec          4.1 %


> you need to enable:
> 
>   CONFIG_SCHED_TRACER=y
>   CONFIG_CONTEXT_SWITCH_TRACER=y

I actually have both of these enabled but there's still no wakeup tracer 
(as mentioned in the git kernel documentation http://tinyurl.com/4f9s4l 
). The good news is that your instructions don't need the wakeup tracer.

> it's not particularly well named though. Why doesnt it say 
> LATENCY_TRACER or something?

I agree it would be nicer if it had a better name.

>> Additionally I think I found a trigger - unplugging the power cable 
>> from the EeePC and having it run on battery seems to then set off this 
>> periodic stall every 30 seconds... There's no CPU frequency scaling 
>> enabled either (Celeron M's seemingly don't have P states and support 
>> for cpufreq is configured out).
> 
> sounds like potential SMM triggered latencies.

I have just gone away and read about the SMM ( 
http://blogs.msdn.com/carmencr/archive/2005/08/31/458609.aspx ). If 
you're right there is pretty much nothing that can be done about the 
problem : (

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: How how latent should non-preemptive scheduling be?
  2008-09-19 11:54             ` Sitsofe Wheeler
@ 2008-09-19 14:20               ` Peter Zijlstra
  2008-09-22 11:57               ` Ingo Molnar
  1 sibling, 0 replies; 17+ messages in thread
From: Peter Zijlstra @ 2008-09-19 14:20 UTC (permalink / raw)
  To: Sitsofe Wheeler
  Cc: Ingo Molnar, Arjan van de Ven, linux-kernel, Steven Rostedt

On Fri, 2008-09-19 at 12:54 +0100, Sitsofe Wheeler wrote:
> Ingo Molnar wrote:
> > here's two quick howtos:
> > 
> >   http://redhat.com/~mingo/sched-devel.git/readme-tracer.txt
> >   http://redhat.com/~mingo/sched-devel.git/howto-trace-latencies.txt
> 
> These two files appear to be identical. Is this intentional?
> 
> Anyway after following your instructions a putting together a small 
> script to dice the output, I collated the 10 switches which took the 
> longest:
> 
> # Top ten longest switches
> # Rel TS Process   Abs TS
> 0.122161 hald-3423 1867.821170 ***
> 0.039438 <idle>-0  1867.379054
> 0.036318 hald-3423 1867.669009
> 0.031362 <idle>-0  1868.002762
> 0.030000 hald-3423 1867.699009
> 0.028933 <idle>-0  1867.529238
> 0.028539 <idle>-0  1867.228861
> 0.028196 <idle>-0  1867.128731
> 0.027763 <idle>-0  1868.101449
> 0.027513 <idle>-0  1867.028606
> 
> # tracer: sched_switch from around longest switch
> #
> #           TASK-PID   CPU#    TIMESTAMP  FUNCTION
> #              | |      |          |         |
>            <idle>-0     [00]  1867.608017:      0:140:R   +     3:115:S
>            <idle>-0     [00]  1867.608038:      0:140:R   +  3423:120:D
>            <idle>-0     [00]  1867.608045:      0:140:R ==>     3:115:R
>       ksoftirqd/0-3     [00]  1867.608048:      3:115:S ==>  3423:120:R
>              hald-3423  [00]  1867.629350:   3423:120:R   +  6096:120:S
>              hald-3423  [00]  1867.632691:   3423:120:R   +  3827:120:S
>              hald-3423  [00]  1867.669009:   3423:120:R   +  3998:120:S
>              hald-3423  [00]  1867.699009:   3423:120:R   +  6097:120:S
>           ***hald-3423  [00]  1867.821170:   3423:120:R ==>  6096:120:R
>         rhythmbox-6096  [00]  1867.821219:   6096:120:S ==>  6097:120:R
>         rhythmbox-6097  [00]  1867.821262:   6097:120:R   +  3826:120:S
>         rhythmbox-6097  [00]  1867.821289:   6097:120:S ==>  3826:120:R
>        pulseaudio-3826  [00]  1867.821332:   3826:120:R   +  6097:120:S
>        pulseaudio-3826  [00]  1867.821374:   3826:120:S ==>  6097:120:R
>         rhythmbox-6097  [00]  1867.821380:   6097:120:S ==>  3998:120:R
>         rhythmbox-3998  [00]  1867.821709:   3998:120:S ==>  3827:120:R
>        pulseaudio-3827  [00]  1867.824041:   3827:120:R   +  3826:120:S
> 

Its actually function tracer output I'm interested in.. that shows what
all its doing to make it take 120+ms.

I thought we had a wakeup latency tracer exacty like we have preempt and
irq off latency tracers, Steve, where'd that go?

> > sounds like potential SMM triggered latencies.
> 
> I have just gone away and read about the SMM ( 
> http://blogs.msdn.com/carmencr/archive/2005/08/31/458609.aspx ). If 
> you're right there is pretty much nothing that can be done about the 
> problem : (

Yeah, SMM/SMI is nasty stuff :-(


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: How how latent should non-preemptive scheduling be?
  2008-09-19 11:54             ` Sitsofe Wheeler
  2008-09-19 14:20               ` Peter Zijlstra
@ 2008-09-22 11:57               ` Ingo Molnar
  2008-09-22 12:07                 ` Steven Rostedt
  2008-09-23  6:33                 ` Sitsofe Wheeler
  1 sibling, 2 replies; 17+ messages in thread
From: Ingo Molnar @ 2008-09-22 11:57 UTC (permalink / raw)
  To: Sitsofe Wheeler
  Cc: Peter Zijlstra, Arjan van de Ven, linux-kernel, Steven Rostedt


* Sitsofe Wheeler <sitsofe@yahoo.com> wrote:

> Ingo Molnar wrote:
>> here's two quick howtos:
>>
>>   http://redhat.com/~mingo/sched-devel.git/readme-tracer.txt
>>   http://redhat.com/~mingo/sched-devel.git/howto-trace-latencies.txt
>
> These two files appear to be identical. Is this intentional?
>
> Anyway after following your instructions a putting together a small  
> script to dice the output, I collated the 10 switches which took the  
> longest:
>
> # Top ten longest switches
> # Rel TS Process   Abs TS
> 0.122161 hald-3423 1867.821170 ***
> 0.039438 <idle>-0  1867.379054
> 0.036318 hald-3423 1867.669009
> 0.031362 <idle>-0  1868.002762
> 0.030000 hald-3423 1867.699009
> 0.028933 <idle>-0  1867.529238
> 0.028539 <idle>-0  1867.228861
> 0.028196 <idle>-0  1867.128731
> 0.027763 <idle>-0  1868.101449
> 0.027513 <idle>-0  1867.028606
>
> # tracer: sched_switch from around longest switch
> #
> #           TASK-PID   CPU#    TIMESTAMP  FUNCTION
> #              | |      |          |         |
>           <idle>-0     [00]  1867.608017:      0:140:R   +     3:115:S
>           <idle>-0     [00]  1867.608038:      0:140:R   +  3423:120:D
>           <idle>-0     [00]  1867.608045:      0:140:R ==>     3:115:R
>      ksoftirqd/0-3     [00]  1867.608048:      3:115:S ==>  3423:120:R
>             hald-3423  [00]  1867.629350:   3423:120:R   +  6096:120:S
>             hald-3423  [00]  1867.632691:   3423:120:R   +  3827:120:S
>             hald-3423  [00]  1867.669009:   3423:120:R   +  3998:120:S
>             hald-3423  [00]  1867.699009:   3423:120:R   +  6097:120:S
>          ***hald-3423  [00]  1867.821170:   3423:120:R ==>  6096:120:R
>        rhythmbox-6096  [00]  1867.821219:   6096:120:S ==>  6097:120:R
>        rhythmbox-6097  [00]  1867.821262:   6097:120:R   +  3826:120:S
>        rhythmbox-6097  [00]  1867.821289:   6097:120:S ==>  3826:120:R
>       pulseaudio-3826  [00]  1867.821332:   3826:120:R   +  6097:120:S
>       pulseaudio-3826  [00]  1867.821374:   3826:120:S ==>  6097:120:R
>        rhythmbox-6097  [00]  1867.821380:   6097:120:S ==>  3998:120:R
>        rhythmbox-3998  [00]  1867.821709:   3998:120:S ==>  3827:120:R
>       pulseaudio-3827  [00]  1867.824041:   3827:120:R   +  3826:120:S
>
>
>>    LatencyTOP version 0.4       (C) 2008 Intel Corporation
>>
>> Cause                                                Maximum     Percentage
>> Scheduler: waiting for cpu                        152.4 msec         13.8 %
>> Userspace lock contention                           5.0 msec         68.0 %
>> Waiting for event (poll)                            5.0 msec         14.3 %
>> Waiting for event (select)                          4.9 msec          3.5 %
>> msleep acpi_ec_wait acpi_ec_transaction acpi_ec_bu  1.9 msec          0.1 %
>> msleep acpi_ec_wait acpi_ec_transaction acpi_ec_re  1.9 msec          0.2 %
>> msleep acpi_ec_wait acpi_ec_transaction acpi_ec_bu  1.9 msec          0.1 %
>> Executing raw SCSI command                          1.1 msec          0.0 %
>> Waiting for TTY to finish sending                   0.4 msec          0.0 %
>>
>>
>> Process rhythmbox (3998)                   Total: 328.3 msec
>> Scheduler: waiting for cpu                        152.4 msec         80.0 %
>> Userspace lock contention                           4.1 msec         15.7 %
>> Waiting for event (poll)                            2.7 msec          4.1 %
>
>
>> you need to enable:
>>
>>   CONFIG_SCHED_TRACER=y
>>   CONFIG_CONTEXT_SWITCH_TRACER=y
>
> I actually have both of these enabled but there's still no wakeup tracer  
> (as mentioned in the git kernel documentation http://tinyurl.com/4f9s4l  
> ). The good news is that your instructions don't need the wakeup tracer.
>
>> it's not particularly well named though. Why doesnt it say  
>> LATENCY_TRACER or something?
>
> I agree it would be nicer if it had a better name.
>
>>> Additionally I think I found a trigger - unplugging the power cable  
>>> from the EeePC and having it run on battery seems to then set off 
>>> this periodic stall every 30 seconds... There's no CPU frequency 
>>> scaling enabled either (Celeron M's seemingly don't have P states and 
>>> support for cpufreq is configured out).
>>
>> sounds like potential SMM triggered latencies.
>
> I have just gone away and read about the SMM ( 
> http://blogs.msdn.com/carmencr/archive/2005/08/31/458609.aspx ). If 
> you're right there is pretty much nothing that can be done about the 
> problem : (

well, since they went away after you enabled CONFIG_PREEMPT=y, they are 
definitely in-kernel latencies, not any external SMM latencies.

I.e. they are inherently fixable. Could you enable:

  CONFIG_DYNAMIC_FTRACE=y
  CONFIG_FTRACE_MCOUNT_RECORD=y

that should make the traces a lot more verbose - every kernel function 
executed in the latency path will be logged. That way we'll be able to 
say which one takes that long.

note, you might have to increase /debug/tracing/trace_entries to get a 
long enough trace to capture the relevant portion of the latency.

	Ingo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: How how latent should non-preemptive scheduling be?
  2008-09-22 11:57               ` Ingo Molnar
@ 2008-09-22 12:07                 ` Steven Rostedt
  2008-09-23  6:33                 ` Sitsofe Wheeler
  1 sibling, 0 replies; 17+ messages in thread
From: Steven Rostedt @ 2008-09-22 12:07 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Sitsofe Wheeler, Peter Zijlstra, Arjan van de Ven, linux-kernel


On Mon, 22 Sep 2008, Ingo Molnar wrote:
> * Sitsofe Wheeler <sitsofe@yahoo.com> wrote:
> >>
> >> sounds like potential SMM triggered latencies.
> >
> > I have just gone away and read about the SMM ( 
> > http://blogs.msdn.com/carmencr/archive/2005/08/31/458609.aspx ). If 
> > you're right there is pretty much nothing that can be done about the 
> > problem : (
> 
> well, since they went away after you enabled CONFIG_PREEMPT=y, they are 
> definitely in-kernel latencies, not any external SMM latencies.
> 
> I.e. they are inherently fixable. Could you enable:
> 
>   CONFIG_DYNAMIC_FTRACE=y
>   CONFIG_FTRACE_MCOUNT_RECORD=y
> 
> that should make the traces a lot more verbose - every kernel function 
> executed in the latency path will be logged. That way we'll be able to 
> say which one takes that long.
> 
> note, you might have to increase /debug/tracing/trace_entries to get a 
> long enough trace to capture the relevant portion of the latency.

Also note, to modify trace_entries, you must be in the none (nop?) tracer, 
otherwise the size will not be effected.

If you find the trace is also too big, you can echo a list of functions 
into:

  /debug/tracing/set_ftrace_notrace

to not trace those functions. using '>' will remove any existing function 
in that file, but using '>>' will append functions to the file.

For a list of functions that you can add, see:

  /debug/tracing/available_filter_functions

-- Steve


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: How how latent should non-preemptive scheduling be?
  2008-09-22 11:57               ` Ingo Molnar
  2008-09-22 12:07                 ` Steven Rostedt
@ 2008-09-23  6:33                 ` Sitsofe Wheeler
  2008-09-23 11:53                   ` Ingo Molnar
  1 sibling, 1 reply; 17+ messages in thread
From: Sitsofe Wheeler @ 2008-09-23  6:33 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Arjan van de Ven, linux-kernel, Steven Rostedt

Ingo Molnar wrote:
> well, since they went away after you enabled CONFIG_PREEMPT=y, they are 
> definitely in-kernel latencies, not any external SMM latencies.
> 
> I.e. they are inherently fixable. Could you enable:
> 
>   CONFIG_DYNAMIC_FTRACE=y
>   CONFIG_FTRACE_MCOUNT_RECORD=y
> 
> that should make the traces a lot more verbose - every kernel function 
> executed in the latency path will be logged. That way we'll be able to 
> say which one takes that long.

I do not appear to have the CONFIG_FTRACE_MCOUNT_RECORD option in 
2.6.27rc7. Is it an option that is only in -tip ?

-- 
Sitsofe | http://sucs.org/~sits/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: How how latent should non-preemptive scheduling be?
  2008-09-23  6:33                 ` Sitsofe Wheeler
@ 2008-09-23 11:53                   ` Ingo Molnar
  2008-09-23 16:30                     ` Sitsofe Wheeler
  0 siblings, 1 reply; 17+ messages in thread
From: Ingo Molnar @ 2008-09-23 11:53 UTC (permalink / raw)
  To: Sitsofe Wheeler
  Cc: Peter Zijlstra, Arjan van de Ven, linux-kernel, Steven Rostedt

* Sitsofe Wheeler <sitsofe@yahoo.com> wrote:

> Ingo Molnar wrote:
>> well, since they went away after you enabled CONFIG_PREEMPT=y, they are 
>> definitely in-kernel latencies, not any external SMM latencies.
>>
>> I.e. they are inherently fixable. Could you enable:
>>
>>   CONFIG_DYNAMIC_FTRACE=y
>>   CONFIG_FTRACE_MCOUNT_RECORD=y
>>
>> that should make the traces a lot more verbose - every kernel function  
>> executed in the latency path will be logged. That way we'll be able to  
>> say which one takes that long.
>
> I do not appear to have the CONFIG_FTRACE_MCOUNT_RECORD option in 
> 2.6.27rc7. Is it an option that is only in -tip ?

yeah - it's a new ftrace feature queued up for v2.6.28.

	Ingo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: How how latent should non-preemptive scheduling be?
  2008-09-23 11:53                   ` Ingo Molnar
@ 2008-09-23 16:30                     ` Sitsofe Wheeler
  2008-09-23 19:39                       ` Sitsofe Wheeler
  0 siblings, 1 reply; 17+ messages in thread
From: Sitsofe Wheeler @ 2008-09-23 16:30 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Arjan van de Ven, linux-kernel, Steven Rostedt

Ingo Molnar wrote:
> * Sitsofe Wheeler <sitsofe@yahoo.com> wrote:
> 
>> Ingo Molnar wrote:
>>> well, since they went away after you enabled CONFIG_PREEMPT=y, they are 
>>> definitely in-kernel latencies, not any external SMM latencies.
>>>
>>> I.e. they are inherently fixable. Could you enable:
>>>
>>>   CONFIG_DYNAMIC_FTRACE=y
>>>   CONFIG_FTRACE_MCOUNT_RECORD=y
>>>
>>> that should make the traces a lot more verbose - every kernel function  
>>> executed in the latency path will be logged. That way we'll be able to  
>>> say which one takes that long.
>> I do not appear to have the CONFIG_FTRACE_MCOUNT_RECORD option in 
>> 2.6.27rc7. Is it an option that is only in -tip ?
> 
> yeah - it's a new ftrace feature queued up for v2.6.28.

I've been struggling to boot -tip/master - currently it blows up just 
after printing SLUB information saying:

BUG: unable to handle kernel NULL pointer dereference at 00000004
IP: [<c0120078>] account_system_time+0x48/0x120
*pde = 00000000
Thread overran stack, or stack corrupted
Oops: 0002 [#1] PREEMPT

-- 
Sitsofe | http://sucs.org/~sits/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: How how latent should non-preemptive scheduling be?
  2008-09-23 16:30                     ` Sitsofe Wheeler
@ 2008-09-23 19:39                       ` Sitsofe Wheeler
  2008-09-23 22:01                         ` Sitsofe Wheeler
  0 siblings, 1 reply; 17+ messages in thread
From: Sitsofe Wheeler @ 2008-09-23 19:39 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Peter Zijlstra, Arjan van de Ven, linux-kernel,
	Steven Rostedt

Sitsofe Wheeler wrote:
> I've been struggling to boot -tip/master - currently it blows up just 
> after printing SLUB information saying:
> 
> BUG: unable to handle kernel NULL pointer dereference at 00000004
> IP: [<c0120078>] account_system_time+0x48/0x120
> *pde = 00000000
> Thread overran stack, or stack corrupted
> Oops: 0002 [#1] PREEMPT

OK this BUG seems to have gone away in the past few hours.

Ingo when you were asking for the ftrace report I presume that would be 
for a non preempt kernel (as a preemptive one only showed a very worst 
latency of 19657 us in one exceptional case)?

-- 
Sitsofe | http://sucs.org/~sits/


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: How how latent should non-preemptive scheduling be?
  2008-09-23 19:39                       ` Sitsofe Wheeler
@ 2008-09-23 22:01                         ` Sitsofe Wheeler
  2008-09-27 20:48                           ` Ingo Molnar
  0 siblings, 1 reply; 17+ messages in thread
From: Sitsofe Wheeler @ 2008-09-23 22:01 UTC (permalink / raw)
  Cc: linux-kernel, Ingo Molnar, Peter Zijlstra, Arjan van de Ven,
	Steven Rostedt

Sitsofe Wheeler wrote:
> Ingo when you were asking for the ftrace report I presume that would be 
> for a non preempt kernel (as a preemptive one only showed a very worst 
> latency of 19657 us in one exceptional case)?

With a non preempt kernel I found the logs would simply become too big 
without filtering. On Peter's suggestion I used the following to remove 
the most frequently called functions:

echo acpi_os_release_object > set_ftrace_notrace && echo kmem_cache_* >> 
set_ftrace_notrace && echo acpi_ut_* >> set_ftrace_notrace

By doing counts across multiple runs I would say that the most 
frequently called functions are the following (in most frequently called 
order). The counts are definitely approximate but are reasonable 
relative to each other.

  475325 acpi_os_release_object (acpi_ut_delete_generic_state)
  406895 kmem_cache_free (acpi_os_release_object)
  402838 kmem_cache_alloc (acpi_ut_create_generic_state)
  132968 acpi_ut_update_ref_count (acpi_ut_update_object_reference)
  131041 acpi_ut_pop_generic_state (acpi_ut_update_object_reference)
  131036 acpi_ut_delete_generic_state (acpi_ut_update_object_reference)
  131025 acpi_ut_create_generic_state (acpi_ut_create_update_state)
  131023 acpi_ut_create_update_state_and_push 
(acpi_ut_update_object_reference)
  131020 acpi_ut_create_update_state (acpi_ut_create_update_state_and_push)
  131018 acpi_ut_push_generic_state (acpi_ut_create_update_state_and_push)
   60147 acpi_ns_get_next_node (acpi_ns_delete_namespace_by_owner)
   28974 acpi_ns_get_next_valid_node (acpi_ns_get_next_node)

Logs with the filtering on can be seen here (15Mbytes decompressed each):
http://sucs.org/~sits/test/eeepc-debug/20080923/latency_trace.gz
http://sucs.org/~sits/test/eeepc-debug/20080923/trace.txt.gz

It looks like lots of acpi state is created and deleted...

-- 
Sitsofe | http://sucs.org/~sits/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: How how latent should non-preemptive scheduling be?
  2008-09-23 22:01                         ` Sitsofe Wheeler
@ 2008-09-27 20:48                           ` Ingo Molnar
  2008-09-28 20:56                             ` Sitsofe Wheeler
  0 siblings, 1 reply; 17+ messages in thread
From: Ingo Molnar @ 2008-09-27 20:48 UTC (permalink / raw)
  To: Sitsofe Wheeler
  Cc: linux-kernel, Peter Zijlstra, Arjan van de Ven, Steven Rostedt


* Sitsofe Wheeler <sitsofe@yahoo.com> wrote:

> Logs with the filtering on can be seen here (15Mbytes decompressed each):
> http://sucs.org/~sits/test/eeepc-debug/20080923/latency_trace.gz
> http://sucs.org/~sits/test/eeepc-debug/20080923/trace.txt.gz
>
> It looks like lots of acpi state is created and deleted...

yeah. The latency starts here:

     cat-5901    0dNh.    1us : default_wake_function (__wake_up_common)
     cat-5901    0.Nh.    2us : kill_fasync (snd_pcm_period_elapsed)
     [...]

and ends here:

     [...]
     cat-5901    0.N.. 270501us+: mutex_lock (acpi_ec_transaction)
     cat-5901    0d... 270507us : __cond_resched (_cond_resched)

270 _milliseconds_ later. That's excessive.

The main overhead is starting here:

     cat-5901    0.N..  167us : acpi_ds_result_push (acpi_ds_exec_end_op)

lots of ACPI code executed ...

does it get better if you have CONFIG_PREEMPT_VOLUNTARY=y enabled? That 
_should_ break up this section neatly. If it doesnt then please add a 
might_sleep() check to kernel/kernel/semaphore.c's down_timeout() 
function - that is called a number of times in this trace.

	Ingo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: How how latent should non-preemptive scheduling be?
  2008-09-27 20:48                           ` Ingo Molnar
@ 2008-09-28 20:56                             ` Sitsofe Wheeler
  2008-09-29  8:37                               ` Ingo Molnar
  0 siblings, 1 reply; 17+ messages in thread
From: Sitsofe Wheeler @ 2008-09-28 20:56 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Peter Zijlstra, Arjan van de Ven, Steven Rostedt

Ingo Molnar wrote:
> does it get better if you have CONFIG_PREEMPT_VOLUNTARY=y enabled? That 

CONFIG_PREEMPT_VOLUNTARY=y has always been enabled in these results.

> _should_ break up this section neatly. If it doesnt then please add a 
> might_sleep() check to kernel/kernel/semaphore.c's down_timeout() 
> function - that is called a number of times in this trace.

I added might_sleep() to the start of down_timeout() but it neither 
printed anything to dmesg nor changed the latency issue...

Strangely stalling only seems to turn up in linux-tip kernels with very 
little debugging options set within them. I have a linux-tip that has 
lots of extra debugging options set and this problem doesn't show up...

I've also asked about this on the ACPI mailing list and had a huge 
amount of help from finding a good point for a cond_resched - 
http://marc.info/?l=linux-acpi&m=122236450105747&w=2 . Certain things 
(like games) still stutter when the battery is read but it's enough to 
play back sound in rhythmbox without stuttering.

Any ideas why the issue would go away with a debug kernel though?

-- 
Sitsofe | http://sucs.org/~sits/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: How how latent should non-preemptive scheduling be?
  2008-09-28 20:56                             ` Sitsofe Wheeler
@ 2008-09-29  8:37                               ` Ingo Molnar
  2008-09-29 23:11                                 ` Sitsofe Wheeler
  0 siblings, 1 reply; 17+ messages in thread
From: Ingo Molnar @ 2008-09-29  8:37 UTC (permalink / raw)
  To: Sitsofe Wheeler
  Cc: linux-kernel, Peter Zijlstra, Arjan van de Ven, Steven Rostedt


* Sitsofe Wheeler <sitsofe@yahoo.com> wrote:

> Ingo Molnar wrote:
>> does it get better if you have CONFIG_PREEMPT_VOLUNTARY=y enabled? That 
>> 
>
> CONFIG_PREEMPT_VOLUNTARY=y has always been enabled in these results.

okay - then this definitely needs fixed.

>> _should_ break up this section neatly. If it doesnt then please add a  
>> might_sleep() check to kernel/kernel/semaphore.c's down_timeout()  
>> function - that is called a number of times in this trace.
>
> I added might_sleep() to the start of down_timeout() but it neither 
> printed anything to dmesg nor changed the latency issue...
>
> Strangely stalling only seems to turn up in linux-tip kernels with 
> very little debugging options set within them. I have a linux-tip that 
> has lots of extra debugging options set and this problem doesn't show 
> up...
>
> I've also asked about this on the ACPI mailing list and had a huge 
> amount of help from finding a good point for a cond_resched - 
> http://marc.info/?l=linux-acpi&m=122236450105747&w=2 . Certain things 
> (like games) still stutter when the battery is read but it's enough to 
> play back sound in rhythmbox without stuttering.
>
> Any ideas why the issue would go away with a debug kernel though?

hm, that's weird indeed. You could try to trace the full battery readout 
itself, via a script like this:

  cat /debug/tracing/trace > /dev/null        # drain trace
  cat /proc/acpi/whatever  
  cat /debug/tracing/trace > trace.txt

this way you should be seeing a full trace of the whole thing. (with the 
'ftrace' tracer plugin set in current_tracer)

then, assuming the trace.txt is complete, you can check where it 
reschedules and why. Search for 'schedule' calls. Compare the debug 
versus non-debug traces.

You can even turn on stack backtrace tracing feature, via:

  echo stacktrace > /debug/tracing/iter_ctrl

this adds extra trace entries that show the call path of every 
reschedule. [this attribute is default-disabled.]

	Ingo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: How how latent should non-preemptive scheduling be?
  2008-09-29  8:37                               ` Ingo Molnar
@ 2008-09-29 23:11                                 ` Sitsofe Wheeler
  2008-09-30 11:22                                   ` Ingo Molnar
  0 siblings, 1 reply; 17+ messages in thread
From: Sitsofe Wheeler @ 2008-09-29 23:11 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Peter Zijlstra, Arjan van de Ven, Steven Rostedt

Ingo Molnar wrote:
> * Sitsofe Wheeler <sitsofe@yahoo.com> wrote:
>
>> Any ideas why the issue would go away with a debug kernel though?
> 
> hm, that's weird indeed. You could try to trace the full battery readout 
> itself, via a script like this:
> 
>   cat /debug/tracing/trace > /dev/null        # drain trace
>   cat /proc/acpi/whatever  
>   cat /debug/tracing/trace > trace.txt
> 
> You can even turn on stack backtrace tracing feature, via:
> 
>   echo stacktrace > /debug/tracing/iter_ctrl
> 
> this adds extra trace entries that show the call path of every 
> reschedule. [this attribute is default-disabled.]

(Something bad seems to be happening with my kernels as I have been 
finding strange problems like network-manager detecting the wifi as a 
wired interface unless I did a make clean on my kernel sources before 
compiling). The stalling issue seems to keep coming and going and is 
currently showing up in both debug and non debug kernels.

There are traces when looking at battery information on both AC (where 
the problem is never there) and battery (where the problem always is 
there) here:
http://sucs.org/~sits/test/eeepc-debug/20080929/trace-on-ac.txt.bz2
http://sucs.org/~sits/test/eeepc-debug/20080929/trace-on-battery.txt.bz2 
(15Mbytes uncompressed)

I wasn't quite sure how to tell when a reschedule was done. I simply did 
a grep schedule on the file and most of it seemed reasonable. One part 
that caught my eye was the following:

     speaker-test-3891  [000]  1392.751699: __switch_to <-schedule
           <idle>-0     [000]  1392.752070: account_scheduler_latency 
<-enqueue_task_fair
           <idle>-0     [000]  1392.752091: __switch_to <-schedule
              cat-7717  [000]  1392.752092: del_timer <-schedule_timeout
              cat-7717  [000]  1392.772263: account_scheduler_latency 
<-enqueue_task_fair
              cat-7717  [000]  1392.773224: __tasklet_schedule <-kbd_event
              cat-7717  [000]  1392.773225: schedule_console_callback 
<-kbd_event
              cat-7717  [000]  1392.773225: schedule_work 
<-schedule_console_callback
              cat-7717  [000]  1392.773226: queue_work <-schedule_work
              cat-7717  [000]  1392.773232: account_scheduler_latency 
<-enqueue_task_fair
              cat-7717  [000]  1392.773240: account_scheduler_latency 
<-enqueue_task_fair
              cat-7717  [000]  1392.773251: schedule_delayed_work 
<-put_queue
              cat-7717  [000]  1392.773251: queue_delayed_work 
<-schedule_delayed_work
              cat-7717  [000]  1392.773258: schedule_console_callback 
<-kbd_event
              cat-7717  [000]  1392.773258: schedule_work 
<-schedule_console_callback
              cat-7717  [000]  1392.773259: queue_work <-schedule_work
              cat-7717  [000]  1392.773267: schedule_console_callback 
<-kbd_event
              cat-7717  [000]  1392.773268: schedule_work 
<-schedule_console_callback
              cat-7717  [000]  1392.773268: queue_work <-schedule_work
              cat-7717  [000]  1392.807931: account_scheduler_latency 
<-enqueue_task_fair
              cat-7717  [000]  1392.839464: __tasklet_schedule <-ath5k_intr
              cat-7717  [000]  1392.839506: account_scheduler_latency 
<-enqueue_task_fair
              cat-7717  [000]  1392.941867: __tasklet_schedule <-ath5k_intr
              cat-7717  [000]  1393.005963: __tasklet_schedule <-ath5k_intr
              cat-7717  [000]  1393.033222: __switch_to <-schedule

Here a schedule seemingly doesn't happen for a few hundredths of a second...

-- 
Sitsofe | http://sucs.org/~sits/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: How how latent should non-preemptive scheduling be?
  2008-09-29 23:11                                 ` Sitsofe Wheeler
@ 2008-09-30 11:22                                   ` Ingo Molnar
  2008-09-30 13:18                                     ` Sitsofe Wheeler
  0 siblings, 1 reply; 17+ messages in thread
From: Ingo Molnar @ 2008-09-30 11:22 UTC (permalink / raw)
  To: Sitsofe Wheeler
  Cc: linux-kernel, Peter Zijlstra, Arjan van de Ven, Steven Rostedt


* Sitsofe Wheeler <sitsofe@yahoo.com> wrote:

>              cat-7717  [000]  1392.941867: __tasklet_schedule <-ath5k_intr
>              cat-7717  [000]  1393.005963: __tasklet_schedule <-ath5k_intr
>              cat-7717  [000]  1393.033222: __switch_to <-schedule
>
> Here a schedule seemingly doesn't happen for a few hundredths of a second...

that's about 60 msecs. ath5k_intr IRQ handling overhead? IRQ handlers 
are non-preemptible. (even under CONFIG_PREEMPT=y)

	Ingo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: How how latent should non-preemptive scheduling be?
  2008-09-30 11:22                                   ` Ingo Molnar
@ 2008-09-30 13:18                                     ` Sitsofe Wheeler
  2008-10-04 10:50                                       ` Reading EeePC900 battery info causes stalls when using SLUB (was Re: How how latent should non-preemptive scheduling be?) Sitsofe Wheeler
  0 siblings, 1 reply; 17+ messages in thread
From: Sitsofe Wheeler @ 2008-09-30 13:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Peter Zijlstra, Arjan van de Ven, Steven Rostedt

Ingo Molnar wrote:
> * Sitsofe Wheeler <sitsofe@yahoo.com> wrote:
> 
>>              cat-7717  [000]  1392.941867: __tasklet_schedule <-ath5k_intr
>>              cat-7717  [000]  1393.005963: __tasklet_schedule <-ath5k_intr
>>              cat-7717  [000]  1393.033222: __switch_to <-schedule
>>
>> Here a schedule seemingly doesn't happen for a few hundredths of a second...
> 
> that's about 60 msecs. ath5k_intr IRQ handling overhead? IRQ handlers 
> are non-preemptible. (even under CONFIG_PREEMPT=y)

What about the piece leading up to that:

              cat-7717  [000]  1392.752092: del_timer <-schedule_timeout
              cat-7717  [000]  1392.772263: account_scheduler_latency 
<-enqueue_task_fair
              cat-7717  [000]  1392.773224: __tasklet_schedule <-kbd_event
              cat-7717  [000]  1392.773225: schedule_console_callback 
<-kbd_event
              cat-7717  [000]  1392.773225: schedule_work 
<-schedule_console_callback
              cat-7717  [000]  1392.773226: queue_work <-schedule_work
              cat-7717  [000]  1392.773232: account_scheduler_latency 
<-enqueue_task_fair
              cat-7717  [000]  1392.773240: account_scheduler_latency 
<-enqueue_task_fair
              cat-7717  [000]  1392.773251: schedule_delayed_work 
<-put_queue
              cat-7717  [000]  1392.773251: queue_delayed_work 
<-schedule_delayed_work
              cat-7717  [000]  1392.773258: schedule_console_callback 
<-kbd_event
              cat-7717  [000]  1392.773258: schedule_work 
<-schedule_console_callback
              cat-7717  [000]  1392.773259: queue_work <-schedule_work
              cat-7717  [000]  1392.773267: schedule_console_callback 
<-kbd_event
              cat-7717  [000]  1392.773268: schedule_work 
<-schedule_console_callback
              cat-7717  [000]  1392.773268: queue_work <-schedule_work
              cat-7717  [000]  1392.807931: account_scheduler_latency 
<-enqueue_task_fair
              cat-7717  [000]  1392.839464: __tasklet_schedule <-ath5k_intr

Here's a snippet from 
http://sucs.org/~sits/test/eeepc-debug/20080929/trace-on-battery.txt.bz2 :
              cat-7717  [000]  1392.752092: del_timer <-schedule_timeout
              cat-7717  [000]  1392.752092: acpi_ec_check_status 
<-acpi_ec_wait
              cat-7717  [000]  1392.752093: constant_test_bit 
<-acpi_ec_check_status
              cat-7717  [000]  1392.752095: mutex_unlock 
<-acpi_ec_transaction
              cat-7717  [000]  1392.752096: acpi_ex_enter_interpreter 
<-acpi_ev_address_space_dispatch
              cat-7717  [000]  1392.752097: acpi_os_wait_semaphore 
<-acpi_ut_acquire_mutex
              cat-7717  [000]  1392.752097: down_timeout 
<-acpi_os_wait_semaphore
              cat-7717  [000]  1392.752098: acpi_ex_release_global_lock 
<-acpi_ex_read_data_from_field
              cat-7717  [000]  1392.752099: acpi_ex_release_mutex_object 
<-acpi_ex_release_global_lock
              cat-7717  [000]  1392.752100: acpi_ev_release_global_lock 
<-acpi_ex_release_mutex_object
              cat-7717  [000]  1392.752100: __acpi_release_global_lock 
<-acpi_ev_release_global_lock
              cat-7717  [000]  1392.752101: acpi_os_signal_semaphore 
<-acpi_ev_release_global_lock
              cat-7717  [000]  1392.752102: up <-acpi_os_signal_semaphore
...
              cat-7717  [000]  1392.839447: 
acpi_ex_truncate_for32bit_table <-acpi_ds_exec_end_op
              cat-7717  [000]  1392.839447: acpi_ps_next_parse_state 
<-acpi_ps_parse_loop
              cat-7717  [000]  1392.839448: acpi_ps_complete_op 
<-acpi_ps_parse_loop
              cat-7717  [000]  1392.839448: acpi_ps_complete_this_op 
<-acpi_ps_complete_op
              cat-7717  [000]  1392.839449: acpi_ps_delete_parse_tree 
<-acpi_ps_complete_this_op
              cat-7717  [000]  1392.839449: acpi_ps_get_arg 
<-acpi_ps_delete_parse_tree
              cat-7717  [000]  1392.839450: acpi_ps_get_opcode_info 
<-acpi_ps_get_arg
              cat-7717  [000]  1392.839450: acpi_ps_get_arg 
<-acpi_ps_delete_parse_tree
              cat-7717  [000]  1392.839451: acpi_ps_get_opcode_info 
<-acpi_ps_get_arg
              cat-7717  [000]  1392.839452: do_IRQ <-common_interrupt
              cat-7717  [000]  1392.839453: irq_enter <-do_IRQ
              cat-7717  [000]  1392.839453: irq_to_desc <-do_IRQ
              cat-7717  [000]  1392.839454: handle_fasteoi_irq <-do_IRQ
              cat-7717  [000]  1392.839455: handle_IRQ_event 
<-handle_fasteoi_irq
              cat-7717  [000]  1392.839455: usb_hcd_irq <-handle_IRQ_event
              cat-7717  [000]  1392.839456: uhci_irq <-usb_hcd_irq
              cat-7717  [000]  1392.839458: ath5k_intr <-handle_IRQ_event
              cat-7717  [000]  1392.839459: ath5k_hw_is_intr_pending 
<-ath5k_intr
              cat-7717  [000]  1392.839461: ath5k_hw_get_isr <-ath5k_intr
              cat-7717  [000]  1392.839464: __tasklet_schedule <-ath5k_intr


Why didn't another process cut in from 1392.752092 to 1392.839464? In 
the schedule grep normally runs for cat are half as long e.g.:

           <idle>-0     [000]  1392.747102: __switch_to <-schedule
              cat-7717  [000]  1392.747103: del_timer <-schedule_timeout
              cat-7717  [000]  1392.747107: 
schedule_timeout_uninterruptible <-msleep
              cat-7717  [000]  1392.747107: schedule_timeout 
<-schedule_timeout_uninterruptible
              cat-7717  [000]  1392.747108: init_timer <-schedule_timeout
              cat-7717  [000]  1392.747108: __mod_timer <-schedule_timeout
              cat-7717  [000]  1392.747111: __switch_to <-schedule
           <idle>-0     [000]  1392.749067: account_scheduler_latency 
<-enqueue_task_fair


Further why (after taking a break to process the ath5k interrupt) did it 
not schedule another process other than cat until 1393.033226:

              cat-7717  [000]  1393.033215: set_normalized_timespec 
<-ktime_get_ts
              cat-7717  [000]  1393.033216: ktime_add_safe 
<-hrtimer_start_range_ns
              cat-7717  [000]  1393.033217: ktime_add_safe 
<-hrtimer_start_range_ns
              cat-7717  [000]  1393.033217: enqueue_hrtimer 
<-hrtimer_start_range_ns
              cat-7717  [000]  1393.033222: __switch_to <-schedule
  hald-addon-inpu-3237  [000]  1393.033226: fget_light <-do_sys_poll

-- 
Sitsofe | http://sucs.org/~sits/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Reading EeePC900 battery info causes stalls when using SLUB (was Re: How how   latent should non-preemptive scheduling be?)
  2008-09-30 13:18                                     ` Sitsofe Wheeler
@ 2008-10-04 10:50                                       ` Sitsofe Wheeler
  0 siblings, 0 replies; 17+ messages in thread
From: Sitsofe Wheeler @ 2008-10-04 10:50 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Peter Zijlstra, Arjan van de Ven, Steven Rostedt,
	Christoph Lameter

Sitsofe Wheeler wrote:
> I have an EeePC 900 (Intel Celeron 900Mhz) and it seems to be
> skipping while playing sound through various desktop apps with a
> 2.6.27rc6 kernel. It is running off an SD card which really shows up

After weeks of head scratching as to why these latencies didn't occur
using the xandros 2.6.21.4 kernel (but keeping the same userspace) when
my own kernels would always show this problem I finally found the answer
after reading
http://tservice.net.ru/~s0mbre/blog/devel/other/2008_10_01 on kernel
planet - SLUB can cause regressions compared to SLAB. Switching from 
SLUB to SLAB made the problem more or less disappear (which I guess 
makes sense given the large number of kmem_* calls that are made when 
reading the battery).

My understanding with the memory allocators is that SLOB is the smallest 
and simplest. Its forté is that it uses very little memory which makes 
it ideal for embedded systems and is the easiest allocator to 
understand. The downside is that it might tend towards memory 
fragmentation the longer a system runs and is apparently a little bit 
slower than SLAB. The SLAB allocator is something that the linux kernel 
has had for many years and was a good performer until the pressures to 
perform with large SMP systems started to come in to play (at which 
point lots of memory would be used up on fixed structures - 
http://lwn.net/Articles/229984/ ). SLUB is the newest allocator, scales 
up well and has finer grained diagnostics that SLAB.

Prior to today my understanding was that SLUB would be able to replace 
SLAB in all scenarios and perform just as well or better. However now 
I'm not so sure (SLAB appears to be less latent than SLUB). Other than 
SLUB's newness are there cases where SLAB should be chosen instead of 
SLUB (e.g. uniprocessor desktop systems with hundreds of megabytes of 
RAM)? Will SLAB exist as an alternative to SLUB for certain setups?

-- 
Sitsofe | http://sucs.org/~sits

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2008-10-04 10:50 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <fa.ERZTl/6uH+mhNoef5fPJKTRjJag@ifi.uio.no>
     [not found] ` <fa.PtPFzP5kIJVCCov6YCewrh+o4z4@ifi.uio.no>
     [not found]   ` <fa.C6WSm5Rh2Nb+Qho7b0qDOZ9RPV8@ifi.uio.no>
     [not found]     ` <fa.ch6j4qXs/2sFpLkHz5fXrtjTR8c@ifi.uio.no>
     [not found]       ` <fa.Jx/Ygtm46CVRawlA6OnfYNn6cN0@ifi.uio.no>
2008-09-18  7:26         ` How how latent should non-preemptive scheduling be? Sitsofe Wheeler
     [not found]         ` <fa.iIHgL48F3T5VAqFw3mqaf9Pzrs4@ifi.uio.no>
     [not found]           ` <fa.Td8xkKZKMSMghlJmEYefTRVF2kc@ifi.uio.no>
2008-09-19 11:54             ` Sitsofe Wheeler
2008-09-19 14:20               ` Peter Zijlstra
2008-09-22 11:57               ` Ingo Molnar
2008-09-22 12:07                 ` Steven Rostedt
2008-09-23  6:33                 ` Sitsofe Wheeler
2008-09-23 11:53                   ` Ingo Molnar
2008-09-23 16:30                     ` Sitsofe Wheeler
2008-09-23 19:39                       ` Sitsofe Wheeler
2008-09-23 22:01                         ` Sitsofe Wheeler
2008-09-27 20:48                           ` Ingo Molnar
2008-09-28 20:56                             ` Sitsofe Wheeler
2008-09-29  8:37                               ` Ingo Molnar
2008-09-29 23:11                                 ` Sitsofe Wheeler
2008-09-30 11:22                                   ` Ingo Molnar
2008-09-30 13:18                                     ` Sitsofe Wheeler
2008-10-04 10:50                                       ` Reading EeePC900 battery info causes stalls when using SLUB (was Re: How how latent should non-preemptive scheduling be?) Sitsofe Wheeler

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox