public inbox for linux-kernel@vger.kernel.org
* scheduler went mad?
@ 2001-04-11 14:24 Priit Randla
  2001-04-11 18:46 ` Josh McKinney
  0 siblings, 1 reply; 19+ messages in thread
From: Priit Randla @ 2001-04-11 14:24 UTC
  To: linux-kernel



Hi,
   

   Yesterday I tried to start cdda2wav, but it didn't do anything.
   It didn't die on kill -9 either. The machine was slow but usable.
  vmstat 10 output:

   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 2  0  1   2972  40916    108  18292   0   0     0     0  121 12735   0 100   0
 2  0  1   2972  40492    108  18292   0   0     0     0  109 12740   1  99   0
 2  0  1   2972  40492    108  18292   0   0     0     0  103 12996   0 100   0
 3  0  0   2972  40492    108  18292   0   0     0     0  102 12932   0 100   0
 3  0  1   2972  40492    108  18292   0   0     0     0  131 12652   1  99   0
 2  0  0   2972  40496    108  18292   0   0     0     0  142 12562   1  99   0
 2  0  0   2972  40500    108  18292   0   0     0     0  120 12684   0 100   0
 2  0  1   2972  40496    108  18292   0   0     0     0  140 12480   1  99   0
 2  0  0   2972  39952    108  18292   0   0     0     0  160 11445   7  93   0
 3  0  0   2972  39952    108  18292   0   0     0     0  178 12295   2  98   0
 2  0  0   2972  39956    108  18292   0   0     0     0  214 11958   2  98   0
 3  0  1   2972  39952    108  18292   0   0     0     0  138 12579   1  99   0

The cs (context switches) field is absolutely ridiculous for my machine.

ps showed cdda2wav and kswapd eating all of the processor time. When I
tried to close netscape, it hung too and joined cdda2wav and kswapd:

  PID USER     PRI  NI  SIZE  RSS SHARE STAT  LIB %CPU %MEM   TIME COMMAND
 9990 priitr    17   0 42380  41M  9928 R       0 32.5 33.4  21:47 netscape-commun
    3 root      17   0     0    0     0 SW      0 32.3  0.0  11:12 kswapd
10538 priitr    16   0    84    8     0 R       0 32.3  0.0  11:09 cdda2wav
    5 root       9   0     0    0     0 SW      0  1.5  0.0   0:19 bdflush
10616 priitr    13   0   856  856   668 R       0  0.7  0.6   0:00 top
  657 root       9   0 21160  20M  1668 S       0  0.1 16.7  29:36 X


I couldn't leave X and had to kill it. After that, both netscape and
cdda2wav were gone, and everything has looked normal since then.
I'm running 2.4.3ac3 right now.

dmesg:

* Re: scheduler went mad?
@ 2001-04-12 14:57 Valdis.Kletnieks
  2001-04-12 15:12 ` Alan Cox
  2001-04-12 15:43 ` Hugh Dickins
  0 siblings, 2 replies; 19+ messages in thread
From: Valdis.Kletnieks @ 2001-04-12 14:57 UTC
  To: linux-kernel


I've seen the same scenario about 2-3 times a week.  kswapd and one or
more processes all CPU bound, totalling to 100%.  I've had 'esdplay' hung
on several occasions, and 2-3 times it's been xscreensaver (3.29) hung.
The 'hung' processes are consistently immune to kill -9, even as root, which
indicates to me that they're hung inside a kernel call or something.

Sometimes, something *else* will exit, and everything will 'break loose'
and return to normal after a minute or so.

It *may* not be related, but I also have a lot of this in 'dmesg':

__alloc_pages: 4-order allocation failed.
__alloc_pages: 3-order allocation failed.
i810_audio: DMA overrun on send

There was a recent posting re: the i810_audio driver amounting to "I've got
one bug to fix and then I'll put up a patch" for the 'dma overrun' message.
__alloc_pages doesn't give much information on who its caller was, so
that's somewhat of a dead end...

In page_alloc.c, __alloc_pages() has a 'goto try_again;' which will
cause it to loop around and try to get more memory.  I'm wondering if
the "hung" process is entering __alloc_pages(), and gets wedged in the
'try_again' loop - which has a call to wakeup_kswapd() inside it, which
would explain the high context-switch rate.  I'm not clear on how kswapd
can end up getting stuck and failing to free up something - unless it ends
up calling __alloc_pages itself indirectly and the PF_MEMALLOC bit isn't
enough to get it the memory it needs, causing a deadlock/loop between
kswapd and __alloc_pages/wakeup_kswapd().

Unfortunately, I've just exhausted my ability to debug this one here.. ;) 

I'm running the 2.4.3 kernel, with the following patches:

Reiserfs: 2.4.3-3.6.25.quota.bz2
linux-2.4.3-knfsd-6.g.patch.gz
linux-2.4.3-reiserfs-20010327.patch.bz2

IPv6: linux24-2.4.3-usagi-20010406.patch.gz
Crypto: patch-int-2.4.3.1

I am using ReiserFS-on-LVM for basically all filesystems, if that matters...

-- 
				Valdis Kletnieks
				Operating Systems Analyst
				Virginia Tech


