public inbox for linux-kernel@vger.kernel.org
* ext3 deadlock?
@ 2002-05-12 23:29 Paul Mackerras
  2002-05-13  4:08 ` Andrew Morton
  0 siblings, 1 reply; 3+ messages in thread
From: Paul Mackerras @ 2002-05-12 23:29 UTC (permalink / raw)
  To: linux-kernel

I'm having a problem with 2.5.15 on an old slow powerbook 3400.  It
gets stuck during boot at the point where it starts syslogd.  At that
point show_state() reveals that kjournald and one of the two syslogd
processes are stuck in D state.  The stack trace for kjournald is:

schedule
__wait_on_buffer
journal_commit_transaction
kjournald

The stack trace on the syslogd process looks like this:

schedule
sleep_on
log_wait_commit
journal_stop
journal_force_commit
ext3_force_commit
ext3_sync_file
sys_fsync

The machine will boot up quite happily with a 2.4.19-pre7 kernel.
If I boot with the ext3 filesystems dirty (i.e. stuff in the journal)
it will usually hang while recovering the journal for the /data
filesystem (I have two partitions, root and /data).

I have just tried booting again (with clean filesystems) and this time
I have two rc.sysinit processes stuck in D state, and kjournald is
also stuck in D state.  The stack trace for kjournald is as above,
and the stack trace for both rc.sysinit processes is:

schedule
sleep_on
sleep_on_buffer
do_get_write_access
journal_get_write_access
ext3_reserve_inode_write
ext3_mark_inode_dirty
ext3_dirty_inode
__mark_inode_dirty
update_atime
do_generic_file_read
generic_file_read
kernel_read
prepare_binprm
do_execve
sys_execve

I don't see this problem on any of my other powermac systems, but that
could be because this powerbook still has an old LinuxPPC/2000
userland installed on it whereas all my other boxes are running Debian
sid.  I have upgraded mount to 2.11r and e2fsprogs to 1.25 on this box
though.

Can anyone suggest where I could start looking to work out why
kjournald is getting stuck in __wait_on_buffer?  Where in the code
does the corresponding wakeup happen?

Paul.


* Re: ext3 deadlock?
  2002-05-12 23:29 ext3 deadlock? Paul Mackerras
@ 2002-05-13  4:08 ` Andrew Morton
  2002-05-14  2:15   ` Paul Mackerras
  0 siblings, 1 reply; 3+ messages in thread
From: Andrew Morton @ 2002-05-13  4:08 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: linux-kernel

Paul Mackerras wrote:
> 
> I'm having a problem with 2.5.15 on an old slow powerbook 3400.  It
> gets stuck during boot at the point where it starts syslogd.  At that
> point show_state() reveals that kjournald and one of the two syslogd
> processes are stuck in D state.  The stack trace for kjournald is:
> 
> schedule
> __wait_on_buffer
> journal_commit_transaction
> kjournald

kjournald is actually running a commit.  So recovery was successful
and the filesystem is up and running.

It's this kjournald trace which is the problem.  All the other
processes are blocked by kjournald.

kjournald is pretty much write-only.  It will perform the
occasional read to load the indirect blocks which describe the
journal location.  But that would show a different backtrace.

It appears that kjournald has submitted a block for writeout
(via submit_bh() or ll_rw_block()) and the interrupt which
signifies completion simply hasn't happened.

> ...
> 
> Can anyone suggest where I could start looking to work out why
> kjournald is getting stuck in __wait_on_buffer?  Where in the code
> does the corresponding wakeup happen?

journal_commit_transaction() writes blocks from several different
places, via ll_rw_block() or submit_bh().  It waits for the
buffers to come unlocked in the end_buffer_io_sync() or
journal_end_buffer_io_sync() completion handlers.

Possibly the device driver has failed to deliver an interrupt.
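
To make the wait/wakeup pairing concrete, here is a rough userspace
analogy in C.  It is only a sketch and not the kernel code: the
fake_* names and struct fake_buffer are made up for illustration, a
pthread mutex and condition variable stand in for the buffer lock
bit and its wait queue, and the timeout exists only so the example
can report a lost completion instead of sleeping forever the way
__wait_on_buffer does.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for a kernel buffer_head (illustration only). */
struct fake_buffer {
    pthread_mutex_t mutex;
    pthread_cond_t  unlocked;  /* plays the role of the buffer wait queue */
    bool            locked;    /* plays the role of the buffer lock bit */
};

static void fake_buffer_init(struct fake_buffer *bh)
{
    pthread_mutex_init(&bh->mutex, NULL);
    pthread_cond_init(&bh->unlocked, NULL);
    bh->locked = false;
}

/* Analogue of submitting a buffer for I/O: the buffer becomes locked
 * and stays locked until the completion handler runs. */
static void fake_submit(struct fake_buffer *bh)
{
    pthread_mutex_lock(&bh->mutex);
    bh->locked = true;
    pthread_mutex_unlock(&bh->mutex);
}

/* Analogue of the I/O completion handler: unlock the buffer and wake
 * every waiter.  If this never runs, waiters never return. */
static void fake_end_io(struct fake_buffer *bh)
{
    pthread_mutex_lock(&bh->mutex);
    bh->locked = false;
    pthread_cond_broadcast(&bh->unlocked);
    pthread_mutex_unlock(&bh->mutex);
}

/* Analogue of waiting on a locked buffer, with a timeout added for the
 * demo.  Returns 0 once the buffer is unlocked, -1 if it is still
 * locked when the timeout expires (the "lost interrupt" case). */
static int fake_wait_on_buffer(struct fake_buffer *bh, int timeout_ms)
{
    struct timespec deadline;
    bool still_locked;
    int rc = 0;

    assert(timeout_ms >= 0);

    /* Build an absolute deadline on the realtime clock. */
    timespec_get(&deadline, TIME_UTC);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (long)(timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec  += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&bh->mutex);
    while (bh->locked && rc == 0)
        rc = pthread_cond_timedwait(&bh->unlocked, &bh->mutex, &deadline);
    still_locked = bh->locked;
    pthread_mutex_unlock(&bh->mutex);

    return still_locked ? -1 : 0;
}
```

Calling fake_submit() and then fake_wait_on_buffer() without anyone
ever calling fake_end_io() times out, which corresponds to kjournald
sitting in D state.  Once fake_end_io() has run (in a real program
it would be called from another thread, as the interrupt handler is
here), the wait returns immediately.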

-


* Re: ext3 deadlock?
  2002-05-13  4:08 ` Andrew Morton
@ 2002-05-14  2:15   ` Paul Mackerras
  0 siblings, 0 replies; 3+ messages in thread
From: Paul Mackerras @ 2002-05-14  2:15 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

Andrew Morton writes:

> It appears that kjournald has submitted a block for writeout
> (via submit_bh() or ll_rw_block()) and the interrupt which
> signifies completion simply hasn't happened.

Ahhh...  I had a "hdparm -m16 -d1 -u1 /dev/hda" command in
/etc/rc.d/rc.sysinit.  If I take that out it doesn't lock up.

So it is an IDE bug, not an ext3 bug.  Thanks for the clue.

Paul.

