public inbox for linux-kernel@vger.kernel.org
* Re: uninteruptable sleep
@ 2001-04-03 16:40 Manfred Spraul
  2001-04-04  7:47 ` uninteruptable sleep (D state => load_avrg++) christophe barbe
  2001-04-04 16:07 ` uninteruptable sleep christophe barbe
  0 siblings, 2 replies; 11+ messages in thread
From: Manfred Spraul @ 2001-04-03 16:40 UTC (permalink / raw)
  To: ocdi; +Cc: linux-kernel, Alan Cox

> ps xl:
>   F UID PID PPID PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
> 040 1000 1230 1 9 0 24320 4 down_w D ? 0:00
>           /home/data/mozilla/obj/dist/bin/mozi
>
down_w

Perhaps down_write_failed()? 2.4.3 converted the mmap semaphore to a
rw-sem.
Did you compile sysrq into your kernel? If so, enable it with

# echo 1 > /proc/sys/kernel/sysrq
and press <Alt>+<SysRq>+'t'.

It prints the complete backtrace, not just one function name.
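
Editorial sketch: when SysRq is not available, a rough userspace way to see
which tasks sit in D state is to read /proc directly. This assumes a Linux
/proc layout where /proc/<pid>/stat carries the state letter after the
parenthesised command name, and where /proc/<pid>/wchan exists (it may be
absent or unreadable on some kernels). The function names are hypothetical,
and this gives far less detail than the SysRq task dump.

```python
import os


def parse_stat_state(stat_line):
    """Return (pid, comm, state) from the text of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so locate the LAST ')' instead of splitting
    the whole line on whitespace.
    """
    open_paren = stat_line.index("(")
    close_paren = stat_line.rindex(")")
    pid = int(stat_line[:open_paren])
    comm = stat_line[open_paren + 1:close_paren]
    state = stat_line[close_paren + 1:].split()[0]
    return pid, comm, state


def read_text(path):
    with open(path) as f:
        return f.read()


def d_state_processes(proc_root="/proc"):
    """Return a list of (pid, comm, wchan) for tasks in uninterruptible sleep."""
    found = []
    if not os.path.isdir(proc_root):
        return found
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        base = os.path.join(proc_root, entry)
        try:
            pid, comm, state = parse_stat_state(read_text(os.path.join(base, "stat")))
        except (OSError, ValueError, IndexError):
            continue  # process exited or the line was not parseable
        if state != "D":
            continue
        try:
            wchan = read_text(os.path.join(base, "wchan")).strip()
        except OSError:
            wchan = "?"  # wchan missing or not readable
        found.append((pid, comm, wchan))
    return found
```

Calling d_state_processes() and printing each tuple gives roughly the
PID, COMMAND and WCHAN columns of the ps output quoted above.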

--
    Manfred





* Re: uninteruptable sleep (D state => load_avrg++)
  2001-04-03 16:40 uninteruptable sleep Manfred Spraul
@ 2001-04-04  7:47 ` christophe barbe
  2001-04-04 11:15   ` Alan Cox
  2001-04-04 16:07 ` uninteruptable sleep christophe barbe
  1 sibling, 1 reply; 11+ messages in thread
From: christophe barbe @ 2001-04-04  7:47 UTC (permalink / raw)
  To: linux-kernel

Sorry to fork the thread a bit, but I'm wondering why the load average is incremented for each process in D state.

I don't know whether the kernel itself uses this information (if it does, please let me know).
But some programs, such as sendmail, use it to back off when the load is too high (I believe from a load of 12 for sendmail).
That makes sense, but with D-state processes the load average gives a misleading picture of the load, because these processes don't use any CPU.

I use GFS to share a filesystem across several nodes.
File locking uses real I/O, so when you ask for a lock that is already owned, you end up in D state.
This differs from what a local filesystem does, but IMHO it makes sense for a distributed filesystem like GFS.
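
Editorial sketch of the kind of load-based throttling described above. It is
not sendmail's actual logic: the threshold of 12 is only the value recalled
in this message, and the function names are hypothetical. os.getloadavg()
is the standard Python call for reading the load averages on Unix.

```python
import os

# Threshold recalled in the message above; treat it as an assumption.
REFUSE_LOAD = 12.0


def should_defer(load_1min, threshold=REFUSE_LOAD):
    """Return True when the 1-minute load average is at or above the threshold."""
    return load_1min >= threshold


def current_load():
    """Read the 1-minute load average from the OS (raises OSError if unavailable)."""
    return os.getloadavg()[0]
```

With 13 processes blocked in D state on an otherwise idle machine, the load
average settles near 13, so should_defer() would return True even though
the CPU is free. That is the distortion this message is asking about.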

Christophe

-- 
Christophe Barbé
Software Engineer
Lineo High Availability Group
42-46, rue Médéric
92110 Clichy - France
phone (33).1.41.40.02.12
fax (33).1.41.40.02.01
www.lineo.com


* Re: uninteruptable sleep (D state => load_avrg++)
  2001-04-04  7:47 ` uninteruptable sleep (D state => load_avrg++) christophe barbe
@ 2001-04-04 11:15   ` Alan Cox
  2001-04-04 12:13     ` christophe barbe
  0 siblings, 1 reply; 11+ messages in thread
From: Alan Cox @ 2001-04-04 11:15 UTC (permalink / raw)
  To: christophe barbe; +Cc: linux-kernel

> The file locking use real IO and so when you ask for a lock, if the lock
> is already owned, you fall in a D state.

That seems odd. They should be using interruptible sleeps so you can interrupt
the task waiting for the lock, surely.



* Re: uninteruptable sleep (D state => load_avrg++)
  2001-04-04 11:15   ` Alan Cox
@ 2001-04-04 12:13     ` christophe barbe
  2001-04-04 12:53       ` Alan Cox
                         ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: christophe barbe @ 2001-04-04 12:13 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel

The sleep should certainly be interruptible, and that's what I said to the GFS guy.
But what is the reason to increment the load average for each D process?

Thanks,
Christophe


* Re: uninteruptable sleep (D state => load_avrg++)
  2001-04-04 12:13     ` christophe barbe
@ 2001-04-04 12:53       ` Alan Cox
  2001-04-04 14:20       ` Paul Jakma
  2001-04-04 22:39       ` Tim Wright
  2 siblings, 0 replies; 11+ messages in thread
From: Alan Cox @ 2001-04-04 12:53 UTC (permalink / raw)
  To: christophe barbe; +Cc: Alan Cox, linux-kernel

> The sleep should certainly be interruptible, and that's what I said to
> the GFS guy.
> But what is the reason to increment the load average for each D process?

D indicates short-term I/O wait. This is how Unix has always computed the
load average.



* Re: uninteruptable sleep (D state => load_avrg++)
  2001-04-04 12:13     ` christophe barbe
  2001-04-04 12:53       ` Alan Cox
@ 2001-04-04 14:20       ` Paul Jakma
  2001-04-04 14:48         ` christophe barbe
  2001-04-04 22:39       ` Tim Wright
  2 siblings, 1 reply; 11+ messages in thread
From: Paul Jakma @ 2001-04-04 14:20 UTC (permalink / raw)
  To: christophe barbe; +Cc: Alan Cox, linux-kernel

On Wed, 4 Apr 2001, christophe barbe wrote:

> The sleep should certainly be interruptible, and that's what I
> said to the GFS guy. But what is the reason to increment the load
> average for each D process?

from a philosophical POV: they are processes that will be runnable as
soon as the kernel returns to them.

no idea if there are technical reasons for it.

>
> Thanks,
> Christophe

--paulj



* Re: uninteruptable sleep (D state => load_avrg++)
  2001-04-04 14:20       ` Paul Jakma
@ 2001-04-04 14:48         ` christophe barbe
  2001-04-04 15:05           ` Paul Jakma
  0 siblings, 1 reply; 11+ messages in thread
From: christophe barbe @ 2001-04-04 14:48 UTC (permalink / raw)
  To: Paul Jakma; +Cc: Alan Cox, linux-kernel

<skip>
Unfortunately, I have no significant Unix background.
I'm certainly young enough to be excused, and luckily Linux is showing me the road to hacker heaven.
So now I'm moving in the right direction, trying to understand the POSIX stuff...
</skip>

For me, a POV without technical reasons is not a philosophical one but more likely a historical one.

Processes that will be runnable are not contributing to the load, so why increment the load average?
Moreover, if a process should be in D state only for a short time, the effect of the increment on an AVERAGE value should be close to nil.
So why do it (I mean load++) if it only matters when a process stays in D state for a long time (that is, when the only effect is to distort the load measurement)?

What's the technical reason behind this load_avrg++?

Christophe



* Re: uninteruptable sleep (D state => load_avrg++)
  2001-04-04 14:48         ` christophe barbe
@ 2001-04-04 15:05           ` Paul Jakma
  2001-04-04 15:15             ` christophe barbe
  0 siblings, 1 reply; 11+ messages in thread
From: Paul Jakma @ 2001-04-04 15:05 UTC (permalink / raw)
  To: christophe barbe; +Cc: Alan Cox, linux-kernel

On Wed, 4 Apr 2001, christophe barbe wrote:

> For me, a POV without technical reasons is not a philosophical one
> but more certainly an historical one.

there may be (and indeed probably are) good technical reasons, however
i am not well enough informed to say what they are.

> Process that will be runnable are not participating to the load so
> why incrementing the load average.

As i understand it:

load avg by nature is a measure of how many processes are 'runnable'
(ie waiting to run) over time.

a process waiting for the kernel to complete IO will indeed be
runnable as soon as the kernel is finished.

instead of waiting for CPU time (as with processes marked R),
these processes are waiting for the kernel to complete.

> Moreover if a process should be
> in state D only for a short time, the influence of the
> incrementation should be near null for an AVERAGE value.

because the number of processes asleep, waiting on kernel to complete
IO may reasonably be considered to be a load.

imagine a box with a bunch of processes that do almost nothing but
call on the kernel to do IO. If you only count the runnable state
towards load_avg then your load_avg will be very low, even though your
box is swamped - you are ignoring the work of the kernel.

if you count D towards load_avg then it will reflect this abstract
'load' concept more accurately.

Ie, counting D towards load_avg is a way of taking kernel IO work into
account when calculating the load average figures.
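
Editorial sketch of the argument above, under stated assumptions. The load
average is modelled as an exponentially damped average, sampled about every
five seconds, of the number of tasks counted as active. As far as I know
this matches the general shape of the Linux calculation, but the kernel
uses fixed-point arithmetic and this floating-point version is only an
illustration. All names here are hypothetical.

```python
import math

SAMPLE_INTERVAL = 5.0  # seconds between samples (assumed, roughly the kernel's period)
WINDOW = 60.0          # time constant of the 1-minute average


def update_load(load, active, interval=SAMPLE_INTERVAL, window=WINDOW):
    """One step of the damped average: the old value decays, the new sample is blended in."""
    decay = math.exp(-interval / window)
    return load * decay + active * (1.0 - decay)


def simulate(samples, count_d=True):
    """Run the average over (n_running, n_uninterruptible) samples, one per interval.

    With count_d=False only runnable tasks are counted, which is the
    alternative discussed in this thread.
    """
    load = 0.0
    for n_running, n_uninterruptible in samples:
        active = n_running + (n_uninterruptible if count_d else 0)
        load = update_load(load, active)
    return load


# Ten minutes with 8 tasks permanently blocked on I/O and nothing on the CPU.
io_bound = [(0, 8)] * 120
# A single 5-second D-state blip followed by 55 idle seconds.
short_blip = [(0, 1)] + [(0, 0)] * 11
```

simulate(io_bound) approaches 8 while simulate(io_bound, count_d=False)
stays at 0, which is the "swamped box with a low load_avg" case described
above. simulate(short_blip) stays close to zero, so a genuinely short
D-state wait barely moves the average.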

> What's the technical reason behind this load_avrg++ ???
>
> Christophe
>

--paulj



* Re: uninteruptable sleep (D state => load_avrg++)
  2001-04-04 15:05           ` Paul Jakma
@ 2001-04-04 15:15             ` christophe barbe
  0 siblings, 0 replies; 11+ messages in thread
From: christophe barbe @ 2001-04-04 15:15 UTC (permalink / raw)
  To: Paul Jakma; +Cc: linux-kernel

On Wed, 04 Apr 2001 17:05:05 Paul Jakma wrote:
> imagine a box with a bunch of processes that do almost nothing but
> call on the kernel to do IO. If you only count the runnable state
> towards load_avg then your load_avg will be very low, even though your
> box is swamped - you are ignoring the work of the kernel.
> 
> if you count D towards load_avg then it will reflect this abstract
> 'load' concept more accurately.
> 
> Ie, counting D towards load_avg is a way of taking kernel IO work into
> account when calculating the load average figures.

OK, I'm convinced.
And no measure can be perfect.

Thank you,
Christophe


* Re: uninteruptable sleep
  2001-04-03 16:40 uninteruptable sleep Manfred Spraul
  2001-04-04  7:47 ` uninteruptable sleep (D state => load_avrg++) christophe barbe
@ 2001-04-04 16:07 ` christophe barbe
  1 sibling, 0 replies; 11+ messages in thread
From: christophe barbe @ 2001-04-04 16:07 UTC (permalink / raw)
  To: linux-kernel

This problem seems to be related to the recent post from David Howells <dhowells@cambridge.redhat.com> with the subject "rw_semaphore bug".

Christophe


* Re: uninteruptable sleep (D state => load_avrg++)
  2001-04-04 12:13     ` christophe barbe
  2001-04-04 12:53       ` Alan Cox
  2001-04-04 14:20       ` Paul Jakma
@ 2001-04-04 22:39       ` Tim Wright
  2 siblings, 0 replies; 11+ messages in thread
From: Tim Wright @ 2001-04-04 22:39 UTC (permalink / raw)
  To: christophe barbe; +Cc: linux-kernel

On Wed, Apr 04, 2001 at 02:13:49PM +0200, christophe barbe wrote:
> The sleep should certainly be interruptible, and that's what I said to the GFS guy.
> But what is the reason to increment the load average for each D process?
> 

OK, the Unix history goes something like this. Synchronization was achieved
using two primitives, sleep() and wakeup(). These guys rendezvous'd on a
wait channel, which was simply an 'int', and by convention was actually the
address of a data structure (yes I know int and pointers aren't the same, this
is a long time ago, OK ? :-).
Anyway, when you called sleep, you also had an associated priority. Priority
values less than PZERO were "high" priority, and >= PZERO were "low" priority.
Sleeping above PZERO was interruptible, and processes sleeping at this priority
did not count towards the load. The idea was to use this for events that
potentially might never happen. Sleeping at a priority < PZERO was intended
to be used for things that are absolutely 100% guaranteed to happen, preferably
sometime very soon. Disk I/O (real disks, not NFS) fell into this category,
and hence it counts towards the load since this could be deemed a "fast wait"
state, and the process is nominally runnable. All a bit hand-wavy I know, but
it worked well enough.

The really important part of all this is that you should never sleep
uninterruptibly for anything that you cannot absolutely guarantee will happen,
otherwise you wind up with a stuck process.
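
Editorial sketch of the rule as described in this message, not taken from
any particular Unix source tree. The numeric values and the function name
are illustrative assumptions. The message does not say how a sleep at
exactly PZERO behaves, so this sketch treats it as low priority.

```python
# Illustrative values only; the real constants differed between Unix versions.
PZERO = 25
DISK_IO_PRIORITY = 20    # hypothetical "high" priority used for a disk wait
TTY_WAIT_PRIORITY = 30   # hypothetical "low" priority used for terminal input


def sleep_properties(priority, pzero=PZERO):
    """Classify a sleep priority according to the description above.

    Lower numbers mean higher priority. A sleep below PZERO is
    uninterruptible and counts toward the load; otherwise it can be
    interrupted by a signal and is ignored by the load average.
    """
    high_priority = priority < pzero
    return {
        "interruptible": not high_priority,
        "counts_toward_load": high_priority,
    }
```

Under these assumptions sleep_properties(DISK_IO_PRIORITY) reports an
uninterruptible sleep that counts toward the load, while
sleep_properties(TTY_WAIT_PRIORITY) reports an interruptible one that does
not.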

Regards,

Tim


-- 
Tim Wright - timw@splhi.com or timw@aracnet.com or twright@us.ibm.com
IBM Linux Technology Center, Beaverton, Oregon
Interested in Linux scalability ? Look at http://lse.sourceforge.net/
"Nobody ever said I was charming, they said "Rimmer, you're a git!"" RD VI

