2.5.60-BK reproducible oops, during LTP run

public inbox for linux-kernel@vger.kernel.org

* 2.5.60-BK reproducible oops, during LTP run
@ 2003-02-12 19:32 Jeff Garzik
  2003-02-13  7:07 ` Andrew Morton
  0 siblings, 1 reply; 3+ messages in thread
From: Jeff Garzik @ 2003-02-12 19:32 UTC (permalink / raw)
  To: lkml; +Cc: akpm

I have reproduced the BUG in fs/buffer.c:2533 twice now.  Test 
conditions exactly the same, fsx-linux in one window, LTP in another window.

The machine stays alive for pings and sysrq's, but I cannot ssh into it
nor log in at the console.  sysrq-s initiates a sync of the root
filesystem on the ATA disk, but the sync never finishes.  Other sysrq's and
pings continue to work after the sysrq-s invocation.

Call trace from BUG, each time I hit it:
EIP in submit_bh
ll_rw_block
journal_commit_transaction
schedule
default_wake_function
kjournald
commit_timeout
kjournald
kernel_thread_helper

sysrq-t bits (I note truncate and sys_sync as common elements):
* shows that LTP test "ftest08" is running
* fsx-linux stack trace:  io_schedule, __wait_on_buffer, 
free_hot_cold_page, autoremove_wake_function, autoremove_wake_function, 
journal_invalidatepage, ext2_invalidatepage, do_invalidatepage, 
truncate_complete_page, truncate_inode_pages, vmtruncate, inode_setattr, 
ext3_setattr, notify_change, do_truncate, do_sys_ftruncate, 
sys_ftruncate, syscall_call
* ftest08 stack trace, process 0: sys_wait4, do_fork, 
default_wake_function, default_wake_function, syscall_call
* ftest08 trace, proc 1: __down, default_wake_function, __down_failed, 
.text.lock.read_write, sys_llseek, sys_sync, syscall_call
* ftest08 trace, proc 2: do_readv_writev, __down, default_wake_function, 
__down_failed, .text.lock.read_write, sys_llseek, syscall_call
* ftest08 trace, proc 3: sleep_on, default_wake_function, 
log_wait_commit, journal_stop, journal_force_commit, write_inode, 
__sync_single_inode, sync_sb_inodes, sync_inodes_sb, sync_inodes, 
sys_sync, syscall_call
* ftest08 trace, proc 4: do_writepages, __down, default_wake_function, 
__down_failed, .text.lock.read_write, sync_blockdev, sys_llseek, 
sys_sync, syscall_call
* ftest08 trace, proc 5: sleep_on, default_wake_function, 
log_wait_commit, journal_stop, journal_force_commit, ext3_force_commit, 
ext3_sync_file, sys_fsync, syscall_call


sysrq-m bits:

free pages: 5552kB (0kB highmem)
active: 48451, inactive: 9478 dirty:10 writeback:0 free:1388
DMA free: 2368kB, min 128kB, low:256kB, high:384kB, active:3776kB, 
inactive:6688kB
Normal free: 3184kB min:1020 low:2040 high:3060 active:190028 inactive:31224
swap cache: add 93, delete 92, find 24/26, race 0+0
free swap: 2096776kB
63472 pages of RAM
1471 reserved pages
27697 pages shared
1 pages swap cached
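
As a sanity check on the sysrq-m figures above, the page counts can be converted to kilobytes. This is only a minimal sketch: it assumes 4 kB pages (the usual i386 page size, which the report does not state), and the helper names pages_to_kb() and check_meminfo() are hypothetical. Under that assumption, 1388 free pages is 5552 kB, which matches both the "free pages" line and the sum of the DMA and Normal zone figures, and 63472 pages of RAM is 253888 kB, roughly 248 MB of the 256 MB installed.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: 4 kB pages, the usual i386 page size (not stated in the report). */
#define ASSUMED_PAGE_KB 4UL

/* Hypothetical helper: convert a page count to kilobytes. */
unsigned long pages_to_kb(unsigned long pages)
{
	return pages * ASSUMED_PAGE_KB;
}

/* Cross-check the sysrq-m numbers quoted in the report against each other. */
void check_meminfo(void)
{
	unsigned long free_kb = pages_to_kb(1388);   /* "free:1388" */
	unsigned long zone_kb = 2368UL + 3184UL;     /* DMA free + Normal free */
	unsigned long ram_kb = pages_to_kb(63472);   /* "63472 pages of RAM" */

	/* The per-zone free figures should add up to the global free figure. */
	assert(free_kb == zone_kb);

	printf("free: %lu kB, zones: %lu kB, RAM: %lu kB\n",
	       free_kb, zone_kb, ram_kb);
}
```

Calling check_meminfo() from a main() prints the three totals. The numbers are internally consistent, and the box has only about 5.5 MB free at the time of the hang.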


hardware bits:

VIA C3 CPU
gcc 3.2.2
kernel 2.5.60-bk2
one 40GB ATA/100 drive, running at ATA/100
256 MB RAM



* Re: 2.5.60-BK reproducible oops, during LTP run
  2003-02-12 19:32 2.5.60-BK reproducible oops, during LTP run Jeff Garzik
@ 2003-02-13  7:07 ` Andrew Morton
  2003-02-13  7:25   ` Jeff Garzik
  0 siblings, 1 reply; 3+ messages in thread
From: Andrew Morton @ 2003-02-13  7:07 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: linux-kernel

Jeff Garzik <jgarzik@pobox.com> wrote:
>
> I have reproduced the BUG in fs/buffer.c:2533 twice now.  Test 
> conditions exactly the same, fsx-linux in one window, LTP in another window.
> 

Thanks.  It's a long-standing but benign bogon which was exposed by recent
ext3 simplifications.  The patch below needs lots of testing.

When ext3_writepage() races with truncate, block_write_full_page() will see
that the page is outside i_size and will bale out with -EIO.  But
ext3_writepage() will ignore this and will proceed to add the buffers to the
transaction.

Later, kjournald tries to write them out and goes BUG() because those buffers
are not mapped to disk.
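
The failure sequence described above can be modelled in user space. What follows is a toy sketch, not kernel code: every toy_* name, the four-buffers-per-page layout, and the "file length in whole pages" simplification are assumptions made for illustration. It shows only the shape of the problem: the write fails for a page beyond the end of the file, the error is not consulted, and unmapped buffers end up on the transaction that the commit thread later walks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_EIO        5
#define BUFS_PER_PAGE  4
#define TOY_TXN_CAP    16

/* Toy stand-in for a buffer_head: only the flag that matters here. */
struct toy_buffer {
	bool mapped;	/* has a disk block been assigned? */
};

struct toy_page {
	unsigned long index;
	struct toy_buffer bufs[BUFS_PER_PAGE];
};

struct toy_transaction {
	struct toy_buffer *bufs[TOY_TXN_CAP];
	size_t nr;
};

/* Stand-in for block_write_full_page(): a page beyond EOF is refused
 * and its buffers are never mapped. */
int toy_block_write_full_page(struct toy_page *page, unsigned long nr_pages)
{
	size_t i;

	if (page->index >= nr_pages)
		return -TOY_EIO;
	for (i = 0; i < BUFS_PER_PAGE; i++)
		page->bufs[i].mapped = true;
	return 0;
}

/* Model of the pre-patch behaviour: the buffers are attached to the
 * transaction whether or not the write succeeded. */
int toy_writepage_buggy(struct toy_transaction *txn, struct toy_page *page,
			unsigned long nr_pages)
{
	size_t i;
	int ret = toy_block_write_full_page(page, nr_pages);

	/* The error in ret is not checked before attaching. */
	for (i = 0; i < BUFS_PER_PAGE; i++) {
		assert(txn->nr < TOY_TXN_CAP);
		txn->bufs[txn->nr++] = &page->bufs[i];
	}
	return ret;
}

/* Stand-in for the commit thread: count the buffers that would trip
 * the "buffer must be mapped" check when submitted for I/O. */
size_t toy_commit_count_unmapped(const struct toy_transaction *txn)
{
	size_t i, bad = 0;

	for (i = 0; i < txn->nr; i++)
		if (!txn->bufs[i]->mapped)
			bad++;
	return bad;
}
```

In this model, writing page index 3 of a two-page file returns -TOY_EIO yet still leaves four unmapped buffers on the transaction, which is the state in which the commit path hits BUG().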

The fix is to not attach the buffers to the transaction in ext3_writepage()
if block_write_full_page() failed.

So far so good, but that page now has dirty, unmapped buffers (the buffers
were attached in a dirty state by ext3_writepage()).  So teach
block_write_full_page() to clean the buffers against the page if it is wholly
outside i_size.

(A simpler fix to all of this might be to just bale out of ext3_writepage()
if the page is outside i_size.  But that is racy against
block_write_full_page()'s subsequent execution of the same comparison).
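
Both halves of the fix can be sketched in the same user-space style. This is an illustrative toy under stated assumptions, not the patch itself: all toy_* names, the buffer layout, and the whole-page file length are hypothetical. The write path cleans the buffers when it refuses a page beyond the end of the file, and the caller attaches buffers to the transaction only when the write returned 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_EIO        5
#define BUFS_PER_PAGE  4
#define TOY_TXN_CAP    16

struct toy_buffer {
	bool mapped;	/* has a disk block been assigned? */
	bool dirty;	/* does it still need writing? */
};

struct toy_page {
	unsigned long index;
	struct toy_buffer bufs[BUFS_PER_PAGE];
};

struct toy_transaction {
	struct toy_buffer *bufs[TOY_TXN_CAP];
	size_t nr;
};

/* Second half of the fix: when the page is beyond EOF, clean its
 * buffers (standing in for block_invalidatepage()) before failing. */
int toy_block_write_full_page(struct toy_page *page, unsigned long nr_pages)
{
	size_t i;

	if (page->index >= nr_pages) {
		for (i = 0; i < BUFS_PER_PAGE; i++) {
			page->bufs[i].dirty = false;
			page->bufs[i].mapped = false;
		}
		return -TOY_EIO;
	}
	for (i = 0; i < BUFS_PER_PAGE; i++)
		page->bufs[i].mapped = true;
	return 0;
}

/* First half of the fix: attach to the transaction only on success. */
int toy_writepage_fixed(struct toy_transaction *txn, struct toy_page *page,
			unsigned long nr_pages)
{
	size_t i;
	int ret;

	/* Buffers start out dirty, as in the description above. */
	for (i = 0; i < BUFS_PER_PAGE; i++)
		page->bufs[i].dirty = true;

	ret = toy_block_write_full_page(page, nr_pages);
	if (ret == 0) {
		for (i = 0; i < BUFS_PER_PAGE; i++) {
			assert(txn->nr < TOY_TXN_CAP);
			txn->bufs[txn->nr++] = &page->bufs[i];
		}
	}
	return ret;
}
```

Note that the decision to attach is keyed on the return value of the write, not on a second size comparison made by the caller. That keeps a single authoritative check, which is the property the parenthetical above is concerned with.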

 buffer.c     |    6 ++++++
 ext3/inode.c |   13 ++++++++++---
 2 files changed, 16 insertions(+), 3 deletions(-)

diff -puN fs/ext3/inode.c~ext3-eio-fix fs/ext3/inode.c
--- 25/fs/ext3/inode.c~ext3-eio-fix	2003-02-12 22:32:07.000000000 -0800
+++ 25-akpm/fs/ext3/inode.c	2003-02-12 22:48:40.000000000 -0800
@@ -1357,10 +1357,17 @@ static int ext3_writepage(struct page *p
 	handle = ext3_journal_current_handle();
 	lock_kernel();

-	/* And attach them to the current transaction */
+	/*
+	 * And attach them to the current transaction.  But only if 
+	 * block_write_full_page() succeeded.  Otherwise they are unmapped,
+	 * and generally junk.
+	 */
 	if (order_data) {
-		err = walk_page_buffers(handle, page_bufs,
-			0, PAGE_CACHE_SIZE, NULL, ext3_journal_dirty_data);
+		if (ret == 0) {
+			err = walk_page_buffers(handle, page_bufs,
+				0, PAGE_CACHE_SIZE, NULL,
+				ext3_journal_dirty_data);
+		}
 		walk_page_buffers(handle, page_bufs, 0,
 				PAGE_CACHE_SIZE, NULL, bput_one);
 		if (!ret)
diff -puN fs/buffer.c~ext3-eio-fix fs/buffer.c
--- 25/fs/buffer.c~ext3-eio-fix	2003-02-12 22:55:03.000000000 -0800
+++ 25-akpm/fs/buffer.c	2003-02-12 22:59:23.000000000 -0800
@@ -2502,6 +2502,12 @@ int block_write_full_page(struct page *p
 	/* Is the page fully outside i_size? (truncate in progress) */
 	offset = inode->i_size & (PAGE_CACHE_SIZE-1);
 	if (page->index >= end_index+1 || !offset) {
+		/*
+		 * The page may have dirty, unmapped buffers.  For example,
+		 * they may have been added in ext3_writepage().  Make them
+		 * freeable here, so the page does not leak.
+		 */
+		block_invalidatepage(page, 0);
 		unlock_page(page);
 		return -EIO;
 	}

_
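
The "fully outside i_size" test in the fs/buffer.c hunk can be tried out on its own. This is a sketch under assumptions: the hunk does not show how end_index is computed or what precedes the test, so the code below assumes end_index is i_size shifted right by the page shift and that a "fully inside" check runs first. It also assumes 4 kB pages. The names classify_page, page_pos and TOY_PAGE_SHIFT are hypothetical.

```c
#include <assert.h>

/* Assumption: 4 kB pages. */
#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_SIZE  (1UL << TOY_PAGE_SHIFT)

enum page_pos { PAGE_INSIDE, PAGE_STRADDLES, PAGE_OUTSIDE };

/* Classify a page-cache index against the file size in bytes. */
enum page_pos classify_page(unsigned long index, unsigned long long i_size)
{
	/* Assumed derivation of end_index; not shown in the hunk. */
	unsigned long end_index = (unsigned long)(i_size >> TOY_PAGE_SHIFT);
	unsigned long offset = (unsigned long)(i_size & (TOY_PAGE_SIZE - 1));

	/* Assumed early check: the page lies entirely below i_size. */
	if (index < end_index)
		return PAGE_INSIDE;

	/* Same comparison as in the hunk: the page lies entirely beyond
	 * i_size, or i_size ends exactly on this page's start boundary. */
	if (index >= end_index + 1 || !offset)
		return PAGE_OUTSIDE;

	/* Otherwise i_size falls somewhere in the middle of this page. */
	return PAGE_STRADDLES;
}

/* Small self-check for a 10000-byte file (two full pages plus a tail). */
void classify_page_selfcheck(void)
{
	assert(classify_page(1, 10000) == PAGE_INSIDE);
	assert(classify_page(2, 10000) == PAGE_STRADDLES);
	assert(classify_page(3, 10000) == PAGE_OUTSIDE);
}
```

The !offset term covers the case where truncate leaves the file size an exact multiple of the page size: for an 8192-byte file, index 2 is already outside, even though index 2 is not greater than end_index.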


* Re: 2.5.60-BK reproducible oops, during LTP run
  2003-02-13  7:07 ` Andrew Morton
@ 2003-02-13  7:25   ` Jeff Garzik
  0 siblings, 0 replies; 3+ messages in thread
From: Jeff Garzik @ 2003-02-13  7:25 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

Thanks for chasing the bug down.  I'll beat it up overnight.
