* ext3 deadlock?
From: Paul Mackerras @ 2002-05-12 23:29 UTC
To: linux-kernel
I'm having a problem with 2.5.15 on an old, slow PowerBook 3400. It
gets stuck during boot at the point where it starts syslogd. At that
point show_state() reveals that kjournald and one of the two syslogd
processes are stuck in D state. The stack trace for kjournald is:
    schedule
    __wait_on_buffer
    journal_commit_transaction
    kjournald
The stack trace on the syslogd process looks like this:
    schedule
    sleep_on
    log_wait_commit
    journal_stop
    journal_force_commit
    ext3_force_commit
    ext3_sync_file
    sys_fsync
The machine will boot up quite happily with a 2.4.19-pre7 kernel.
If I boot with the ext3 filesystems dirty (i.e. stuff in the journal)
it will usually hang while recovering the journal for the /data
filesystem (I have two partitions, root and /data).
I have just tried booting again (with clean filesystems) and this time
I have two rc.sysinit processes stuck in D state, and kjournald is
also stuck in D state. The stack trace for kjournald is as above,
and the stack trace for both rc.sysinit processes is:
    schedule
    sleep_on
    sleep_on_buffer
    do_get_write_access
    journal_get_write_access
    ext3_reserve_inode_write
    ext3_mark_inode_dirty
    ext3_dirty_inode
    __mark_inode_dirty
    update_atime
    do_generic_file_read
    generic_file_read
    kernel_read
    prepare_binprm
    do_execve
    sys_execve
I don't see this problem on any of my other powermac systems, but that
could be because this powerbook still has an old LinuxPPC/2000
userland installed on it whereas all my other boxes are running Debian
sid. I have upgraded mount to 2.11r and e2fsprogs to 1.25 on this box
though.
Can anyone suggest where I could start looking to work out why
kjournald is getting stuck in __wait_on_buffer? Where in the code
does the corresponding wakeup happen?
Paul.
* Re: ext3 deadlock?
From: Andrew Morton @ 2002-05-13 4:08 UTC
To: Paul Mackerras; +Cc: linux-kernel
Paul Mackerras wrote:
>
> I'm having a problem with 2.5.15 on an old, slow PowerBook 3400. It
> gets stuck during boot at the point where it starts syslogd. At that
> point show_state() reveals that kjournald and one of the two syslogd
> processes are stuck in D state. The stack trace for kjournald is:
>
>     schedule
>     __wait_on_buffer
>     journal_commit_transaction
>     kjournald
kjournald is actually running a commit. So recovery was successful
and the filesystem is up and running.
It's this kjournald trace which is the problem. All the other
processes are blocked by kjournald.
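
The triage above can be sketched as a small script. This is only an illustration in Python, not kernel code: the trace data is copied from this thread, but the WAITS_FOR table is a hand-written assumption that encodes the reading given here (only __wait_on_buffer is a wait on block I/O; the other sleepers are waiting on the journal commit).

```python
# Toy triage of the D-state traces quoted in this thread.
# Frames are listed innermost first, as printed by show_state().
TRACES = {
    "kjournald": ["schedule", "__wait_on_buffer",
                  "journal_commit_transaction", "kjournald"],
    "syslogd": ["schedule", "sleep_on", "log_wait_commit", "journal_stop",
                "journal_force_commit", "ext3_force_commit",
                "ext3_sync_file", "sys_fsync"],
    "rc.sysinit": ["schedule", "sleep_on", "sleep_on_buffer",
                   "do_get_write_access", "journal_get_write_access",
                   "ext3_reserve_inode_write", "ext3_mark_inode_dirty",
                   "ext3_dirty_inode", "__mark_inode_dirty", "update_atime",
                   "do_generic_file_read", "generic_file_read", "kernel_read",
                   "prepare_binprm", "do_execve", "sys_execve"],
}

# ASSUMPTION: hand-written mapping from a wait function to what it waits
# for, following the interpretation in this thread. Not derived from source.
WAITS_FOR = {
    "__wait_on_buffer": "block I/O completion",
    "log_wait_commit": "kjournald commit",
    "do_get_write_access": "kjournald commit",
}


def blocked_on(trace):
    """Return what the innermost recognised wait function is waiting for."""
    for frame in trace:
        if frame in WAITS_FOR:
            return WAITS_FOR[frame]
    return "unknown"


def root_blockers(traces):
    """Tasks waiting on I/O itself rather than on another task."""
    return sorted(task for task, trace in traces.items()
                  if blocked_on(trace) == "block I/O completion")


if __name__ == "__main__":
    for task, trace in TRACES.items():
        print(f"{task}: waiting for {blocked_on(trace)}")
    print("root blockers:", root_blockers(TRACES))
```

Under these assumptions the only task waiting directly on I/O is kjournald, which is why its trace is the one worth chasing.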
kjournald is pretty much write-only. It will perform the
occasional read to load the indirect blocks which describe the
journal location. But that would show a different backtrace.
It appears that kjournald has submitted a block for writeout
(via submit_bh() or ll_rw_block()) and the interrupt which
signifies completion simply hasn't happened.
> ...
>
> Can anyone suggest where I could start looking to work out why
> kjournald is getting stuck in __wait_on_buffer? Where in the code
> does the corresponding wakeup happen?
journal_commit_transaction() writes blocks from several different
places, via ll_rw_block() or submit_bh(). It then waits for the
buffers to be unlocked, which happens in the end_buffer_io_sync() or
journal_end_buffer_io_sync() completion handlers.
Possibly the device driver has failed to deliver an interrupt.
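
The lock, submit, wait and wake sequence described above can be modelled in userspace. The following is a minimal sketch in Python, not the kernel implementation: the Buffer class, submit() and commit() are hypothetical names, a timer thread stands in for the completion interrupt, and the wait has a timeout only so that the demonstration terminates (the kernel wait has none, which is why the task stays in D state).

```python
import threading


class Buffer:
    """Userspace stand-in for a locked buffer; not the kernel's buffer_head."""

    def __init__(self):
        self._unlocked = threading.Event()
        self._unlocked.set()  # buffers start out unlocked

    def lock(self):
        self._unlocked.clear()

    def end_io(self):
        # Plays the role of the completion handler: unlock and wake waiters.
        self._unlocked.set()

    def wait_on(self, timeout):
        # Returns True if the buffer was unlocked, False if we gave up.
        return self._unlocked.wait(timeout)


def submit(buf, irq_delivered, delay=0.05):
    """Lock the buffer and 'start' a write.

    The completion callback runs only if irq_delivered is True, which
    models a driver that does or does not deliver its interrupt.
    """
    buf.lock()
    if irq_delivered:
        threading.Timer(delay, buf.end_io).start()


def commit(irq_delivered, timeout=0.5):
    """Submit one buffer and wait for it, as the commit path would."""
    buf = Buffer()
    submit(buf, irq_delivered)
    return buf.wait_on(timeout)


if __name__ == "__main__":
    print("interrupt delivered:", commit(True))
    print("interrupt lost:     ", commit(False))
```

With the completion delivered, the waiter is woken almost at once. With it withheld, nothing ever unlocks the buffer, and only the artificial timeout ends the wait.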
* Re: ext3 deadlock?
From: Paul Mackerras @ 2002-05-14 2:15 UTC
To: Andrew Morton; +Cc: linux-kernel
Andrew Morton writes:
> It appears that kjournald has submitted a block for writeout
> (via submit_bh() or ll_rw_block()) and the interrupt which
> signifies completion simply hasn't happened.
Ahhh... I had a "hdparm -m16 -d1 -u1 /dev/hda" command in
/etc/rc.d/rc.sysinit. If I take that out it doesn't lock up.
So it is an IDE bug, not an ext3 bug. Thanks for the clue.
Paul.
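
The thread does not say which of the three hdparm settings triggers the hang (-m sets the multiple-sector count, -d the DMA flag, -u the interrupt-unmask flag). One way to narrow it down would be to try the flags singly and in pairs. The sketch below, in Python, only generates the candidate command lines; it does not run hdparm, and candidate_commands() is a hypothetical helper, not something used in the thread.

```python
from itertools import combinations

# Flags and device taken from the command quoted in this thread.
FLAGS = ["-m16", "-d1", "-u1"]
DEVICE = "/dev/hda"


def candidate_commands(flags=FLAGS, device=DEVICE):
    """Return hdparm command lines for every non-empty subset of flags.

    Smaller subsets come first, so single-flag tests are tried before
    combinations. Nothing is executed here.
    """
    commands = []
    for size in range(1, len(flags) + 1):
        for subset in combinations(flags, size):
            commands.append(" ".join(["hdparm", *subset, device]))
    return commands


if __name__ == "__main__":
    for command in candidate_commands():
        print(command)
```

Each printed line could be placed in rc.sysinit in turn, rebooting between attempts, to see which smallest set of flags reproduces the lockup.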