public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* 2.6.0-test4-mm5 dbench stuck in D state
@ 2003-09-05  2:38 rwhron
  2003-09-05  3:44 ` Andrew Morton
  0 siblings, 1 reply; 3+ messages in thread
From: rwhron @ 2003-09-05  2:38 UTC (permalink / raw)
  To: linux-kernel; +Cc: akpm

Uniprocessor x86 ext2 on IDE system running 2.6.0-test4-mm5
has dbench stuck in uninterruptible sleep.  -mm4 didn't 
have a problem with the same workload.

dbench 32 was hung.  I did pkill -9 dbench.  Some of the dbench
processes exited.  Some did not.  Below is "ps aux".   The hung
"dbench 32" processes have START time of 02:09.  

Apparently the mother of all dbench processes exited, because the
wrapper script moved on to the next iteration of dbench.  

A few hours later I noticed dbench wasn't making progress on 
another iteration of dbench 32.  pkill -9 ... one process with
start time of 07:49 didn't exit.

A few hours later "dbench 64" was stuck.  The disk access light
was out. The load average steady at 16.  In the ps below,
those processes have START time of 21:49.

ps aux
USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0  1312  168 ?        S    Sep03   0:04 init [3]
root         2  0.0  0.0     0    0 ?        SWN  Sep03   0:00 [ksoftirqd/0]
root         3  0.0  0.0     0    0 ?        SW<  Sep03   0:01 [events/0]
root         4  0.0  0.0     0    0 ?        SW<  Sep03   0:00 [kblockd/0]
root         7  0.0  0.0     0    0 ?        SW   Sep03   0:28 [kswapd0]
root         8  0.0  0.0     0    0 ?        SW<  Sep03   0:00 [aio/0]
root         9  0.0  0.0     0    0 ?        SW<  Sep03   0:00 [aio_fput/0]
root        10  0.0  0.0     0    0 ?        SW   Sep03   0:00 [jfsIO]
root        11  0.0  0.0     0    0 ?        SW   Sep03   0:00 [jfsCommit]
root        12  0.0  0.0     0    0 ?        SW   Sep03   0:00 [jfsSync]
root        13  0.0  0.0     0    0 ?        SW<  Sep03   0:00 [xfslogd/0]
root        14  0.0  0.0     0    0 ?        SW<  Sep03   0:00 [xfsdatad/0]
root        15  0.0  0.0     0    0 ?        SW   Sep03   0:00 [pagebufd]
root        16  0.0  0.0     0    0 ?        SW   Sep03   0:25 [kjournald]
root       403  0.0  0.0  1368  216 ?        S    Sep03   0:00 syslogd -m 0
root       408  0.0  0.0  1300  280 ?        S    Sep03   0:00 klogd -x
rpc        429  0.0  0.0  1448  104 ?        S    Sep03   0:00 portmap
root       542  0.0  0.0  2568  140 ?        S    Sep03   0:00 /usr/sbin/sshd
root       552  0.0  0.0  1284  116 tty1     S    Sep03   0:00 /sbin/mingetty tt
root       553  0.0  0.2  2208 1028 ?        S    Sep03   0:00 login -- root
root       556  0.0  0.1  3440  400 ?        S    Sep03   0:01 /usr/sbin/sshd
rwhron     557  0.0  0.0  2200  144 pts/0    S    Sep03   0:00 -bash
root       601  0.0  0.0  2000  144 pts/0    S    Sep03   0:00 su -
root       602  0.0  0.0  2400  144 pts/0    S    Sep03   0:00 -bash
root       652  0.0  0.1  2216  544 pts/0    S    Sep03   0:00 /bin/bash ./runte
root     16813  0.0  0.0  3352  160 ?        S    Sep03   0:00 /usr/sbin/sshd
rwhron   16814  0.0  0.0  2200  152 pts/1    S    Sep03   0:00 -bash
root     28785  0.0  0.2  2180 1020 pts/0    S    02:09   0:00 /bin/bash /usr/lo
root     28786  0.0  0.1  1568  544 pts/0    S    02:09   0:00 tee -a /root/2.6.
root     28787  0.0  0.2  2180 1040 pts/0    S    02:09   0:00 /bin/bash ./dbrun
root     28790  0.0  0.1  1568  544 pts/0    S    02:09   0:00 tee -a /root/2.6.
root     28797  0.0  0.1  1496  580 pts/0    S    02:09   0:13 ping localhost
root     28816  0.0  0.1  1372  512 pts/0    D    02:09   0:09 ./dbench 32
root     28821  0.0  0.1  1372  512 pts/0    D    02:09   0:03 ./dbench 32
root     28827  0.0  0.1  1372  512 pts/0    D    02:09   0:10 ./dbench 32
root     28831  0.0  0.1  1372  512 pts/0    D    02:09   0:10 ./dbench 32
root     28836  0.0  0.1  1372  512 pts/0    D    02:09   0:10 ./dbench 32
root     28842  0.0  0.1  1372  512 pts/0    D    02:09   0:10 ./dbench 32
root     28889  0.0  0.3  2408 1304 ttyS1    S    07:38   0:27 -bash
root     29133  0.0  0.1  1372  512 pts/0    D    07:49   0:09 ./dbench 32
root     29278  0.0  0.0     0    0 ?        SW   21:48   0:02 [pdflush]
root     29284  0.0  0.1  1364  424 pts/0    S    21:49   0:00 ./dbench 64
root     29290  0.4  0.1  1372  512 pts/0    D    21:49   0:09 ./dbench 64
root     29300  0.3  0.1  1372  512 pts/0    D    21:49   0:08 ./dbench 64
root     29304  0.3  0.1  1372  512 pts/0    D    21:49   0:08 ./dbench 64
root     29308  0.4  0.1  1372  512 pts/0    D    21:49   0:09 ./dbench 64
root     29311  0.4  0.1  1372  512 pts/0    D    21:49   0:09 ./dbench 64
root     29322  0.4  0.1  1372  512 pts/0    D    21:49   0:09 ./dbench 64
root     29324  0.4  0.1  1372  512 pts/0    D    21:49   0:09 ./dbench 64
root     29337  0.3  0.1  1372  512 pts/0    D    21:49   0:08 ./dbench 64
root     29345  0.3  0.1  1372  512 pts/0    D    21:49   0:08 ./dbench 64
root     29422  0.1  0.0     0    0 ?        SW   21:52   0:02 [pdflush]
root     29433  0.0  0.1  2556  680 ttyS1    R    22:26   0:00 ps aux


The behavior is not perfectly repeatable.  :(   To recap, the
sequence of events was like this:

dbench 32    # pkill -9 dbench
dbench 32    # exited normally
dbench 32    # pkill -9 dbench
dbench 32    # pkill -9 dbench
dbench 32    # appeared to exit normally
dbench 64    # some processes don't exit

There is a <sysrq T> trace at:
http://home.earthlink.net/~rwhron/2.6.0-test4-mm5.trace

Is there something else that would help track this down?
2.6.0-test4-mm4 did not exhibit this behavior.

-- 
Randy Hron
http://home.earthlink.net/~rwhron/kernel/bigbox.html


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: 2.6.0-test4-mm5 dbench stuck in D state
  2003-09-05  2:38 rwhron
@ 2003-09-05  3:44 ` Andrew Morton
  0 siblings, 0 replies; 3+ messages in thread
From: Andrew Morton @ 2003-09-05  3:44 UTC (permalink / raw)
  To: rwhron; +Cc: linux-kernel

rwhron@earthlink.net wrote:
>
> Uniprocessor x86 ext2 on IDE system running 2.6.0-test4-mm5
>  has dbench stuck in uninterruptible sleep.

Yes, sorry.  More dodgy Australian software I'm afraid.  You'll need to
revert

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.0-test4/2.6.0-test4-mm5/broken-out/elv-insertion-fix.patch

That patch is in Linus's tree now, so it should be a nice test of our bug
reporting processes.  Sigh.


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: 2.6.0-test4-mm5 dbench stuck in D state
@ 2003-09-05  4:32 rwhron
  0 siblings, 0 replies; 3+ messages in thread
From: rwhron @ 2003-09-05  4:32 UTC (permalink / raw)
  To: akpm, piggin; +Cc: linux-kernel

> revert

> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.0-test4/2.6.0-test4-mm5/broken-out/elv-insertion-fix.patch

> That patch is in Linus's tree now

Backing out elv-insertion-fix.patch does the trick.
The machine has run 9 iterations of dbench and she's
still happy.

--
Randy Hron
http://home.earthlink.net/~rwhron/kernel/bigbox.html


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2003-09-05  4:28 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2003-09-05  4:32 2.6.0-test4-mm5 dbench stuck in D state rwhron
  -- strict thread matches above, loose matches on Subject: below --
2003-09-05  2:38 rwhron
2003-09-05  3:44 ` Andrew Morton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox