From: Nick Piggin <npiggin@suse.de>
To: akpm@linux-foundation.org, xfs@oss.sgi.com
Cc: linux-fsdevel@vger.kernel.org, Dave Chinner <david@fromorbit.com>,
Chris Mason <chris.mason@oracle.com>
Subject: Re: [patch 0/9] writeback data integrity and other fixes (take 3)
Date: Tue, 28 Oct 2008 16:39:53 +0100
Message-ID: <20081028153953.GB3082@wotan.suse.de>
In-Reply-To: <20081028144715.683011000@suse.de>
On Wed, Oct 29, 2008 at 01:47:15AM +1100, npiggin@suse.de wrote:
> OK, I'm happier with this patchset now. Note that I've taken your patch
> and mangled it a bit at the end of the series.
>
> This one survives and seems to run OK here, but I'm mainly doing dumb
> stress testing with a handful of filesystems, and data-io error injection
> testing. There are a lot of combinations of ways this function can operate
> and interact obviously, so it would be helpful to get more review.
>
> Chris, would you possibly have time to run your btrfs tests that are
> sensitive to problems in this code? I could provide you a single patch
> rollup against mainline if it helps.
BTW, XFS seems to be doing something interesting with my simple sync
test case with IO error injection. I map a file MAP_SHARED into a number of
processes, each of which runs a loop that dirties the memory and then calls
msync(MS_SYNC) on the range.
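For readers who want to try something similar, here is a minimal sketch of a
test of that shape. It is not the program used for the results in this mail:
the function names, the file path and the sizes are all hypothetical, and it
simply counts how many msync(MS_SYNC) calls fail across the workers.

```c
/* Sketch only: a MAP_SHARED dirty + msync(MS_SYNC) loop run by several
 * processes. Names and parameters are illustrative assumptions. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Worker body: dirty the whole mapping, then sync it, `iters` times.
 * Returns the number of failed msync() calls, capped at 255 so that it
 * fits in a process exit status. */
static int dirty_and_sync(char *map, size_t len, int iters, int id)
{
    int failures = 0;

    for (int i = 0; i < iters; i++) {
        memset(map, (id + i) & 0xff, len);      /* dirty every page */
        if (msync(map, len, MS_SYNC) == -1) {   /* e.g. errno == EIO */
            fprintf(stderr, "worker %d: msync: %s\n", id, strerror(errno));
            if (failures < 255)
                failures++;
        }
    }
    return failures;
}

/* Maps `path` MAP_SHARED, forks `nproc` workers that share the mapping,
 * and returns the total number of msync() failures they reported,
 * or -1 if the setup itself failed. */
int run_msync_test(const char *path, size_t len, int nproc, int iters)
{
    assert(len > 0 && nproc > 0 && iters > 0);

    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd == -1)
        return -1;
    if (ftruncate(fd, (off_t)len) == -1) {
        close(fd);
        return -1;
    }

    char *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                       /* the mapping stays valid */
    if (map == MAP_FAILED)
        return -1;

    int started = 0;
    for (int id = 0; id < nproc; id++) {
        pid_t pid = fork();
        if (pid == 0)
            _exit(dirty_and_sync(map, len, iters, id));
        if (pid > 0)
            started++;
    }

    int total = 0;
    for (int i = 0; i < started; i++) {
        int status;
        if (wait(&status) > 0 && WIFEXITED(status))
            total += WEXITSTATUS(status);
    }

    munmap(map, len);
    return started == nproc ? total : -1;
}
```

Calling something like run_msync_test("/mnt/test/file", 1 << 20, 8, 1000) on
the filesystem under test (the path is a placeholder) and checking for a
nonzero return would show whether any injected failure made it back to
userspace as an msync() error.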
ext2 mostly reports -EIO back to userspace when a failure is injected AFAICS.
ext3 (ordered) doesn't until a lot of errors have been injected, but eventually
reports -EIO and shuts down the filesystem. reiserfs seems to report failure
more consistently.
I haven't seen any -EIO failures from XFS... maybe I'm just not doing the
right thing, or there is a caveat I'm not aware of.
All fault injections I noticed had a trace like this:
FAULT_INJECTION: forcing a failure
Call Trace:
9f9cd758: [<6019f1de>] random32+0xe/0x20
9f9cd768: [<601a31b9>] should_fail+0xd9/0x130
9f9cd798: [<6018d0c4>] generic_make_request+0x304/0x4e0
9f9cd7a8: [<60062301>] mempool_alloc+0x51/0x130
9f9cd858: [<6018e6bf>] submit_bio+0x4f/0xe0
9f9cd8a8: [<60165505>] xfs_submit_ioend_bio+0x25/0x40
9f9cd8c8: [<6016603c>] xfs_submit_ioend+0xbc/0xf0
9f9cd908: [<60166bf9>] xfs_page_state_convert+0x3d9/0x6a0
9f9cd928: [<6005d515>] delayacct_end+0x95/0xb0
9f9cda08: [<60166ffd>] xfs_vm_writepage+0x6d/0x110
9f9cda18: [<6006618b>] set_page_dirty+0x4b/0xd0
9f9cda58: [<60066115>] __writepage+0x15/0x40
9f9cda78: [<60066775>] write_cache_pages+0x255/0x470
9f9cda90: [<60066100>] __writepage+0x0/0x40
9f9cdb98: [<600669b0>] generic_writepages+0x20/0x30
9f9cdba8: [<60165ba3>] xfs_vm_writepages+0x53/0x70
9f9cdbd8: [<600669eb>] do_writepages+0x2b/0x40
9f9cdbf8: [<6006004c>] __filemap_fdatawrite_range+0x5c/0x70
9f9cdc58: [<6006026a>] filemap_fdatawrite+0x1a/0x20
9f9cdc68: [<600a7a05>] do_fsync+0x45/0xe0
9f9cdc98: [<6007794b>] sys_msync+0x14b/0x1d0
9f9cdcf8: [<60019a70>] handle_syscall+0x50/0x80
9f9cdd18: [<6002a10f>] userspace+0x44f/0x510
9f9cdfc8: [<60016792>] fork_handler+0x62/0x70
And the kernel would sometimes say this:
Buffer I/O error on device ram0, logical block 279
lost page write due to I/O error on ram0
Buffer I/O error on device ram0, logical block 379
lost page write due to I/O error on ram0
Buffer I/O error on device ram0, logical block 389
lost page write due to I/O error on ram0
I think I also saw a slab bug when running dbench with fault injection on.
This was on the latest Linus kernel.
bash-3.1# dbench -t10 -c ../client.txt 8
dbench version 3.04 - Copyright Andrew Tridgell 1999-2004
Running for 10 seconds with load '../client.txt' and minimum warmup 2 secs
8 clients started
FAULT_INJECTION: forcing a failure
Call Trace:
9e7bb548: [<601623ae>] random32+0xe/0x20
9e7bb558: [<60166389>] should_fail+0xd9/0x130
9e7bb588: [<60150294>] generic_make_request+0x304/0x4e0
9e7bb598: [<60062301>] mempool_alloc+0x51/0x130
9e7bb648: [<6015188f>] submit_bio+0x4f/0xe0
9e7bb698: [<6012b440>] _xfs_buf_ioapply+0x180/0x2a0
9e7bb6a0: [<6002f600>] default_wake_function+0x0/0x10
9e7bb6f8: [<6012bae1>] xfs_buf_iorequest+0x31/0x90
9e7bb718: [<60112f75>] xlog_bdstrat_cb+0x45/0x50
9e7bb738: [<60114135>] xlog_sync+0x195/0x440
9e7bb778: [<60114491>] xlog_state_release_iclog+0xb1/0xc0
9e7bb7a8: [<60114ca9>] xlog_write+0x539/0x550
9e7bb858: [<60114e60>] xfs_log_write+0x40/0x60
9e7bb888: [<6011fbaa>] _xfs_trans_commit+0x19a/0x360
9e7bb8b8: [<600838e2>] poison_obj+0x42/0x60
9e7bb8d0: [<60082cb3>] dbg_redzone1+0x13/0x30
9e7bb8e8: [<60083999>] cache_alloc_debugcheck_after+0x99/0x1c0
9e7bb918: [<6008517b>] kmem_cache_alloc+0x8b/0x100
9e7bb958: [<60128084>] kmem_zone_alloc+0x74/0xe0
9e7bb998: [<60082ad9>] kmem_cache_size+0x9/0x10
9e7bb9a8: [<60128124>] kmem_zone_zalloc+0x34/0x50
9e7bb9e8: [<60121e8b>] xfs_dir_ialloc+0x13b/0x2e0
9e7bba58: [<601f534b>] __down_write+0xb/0x10
9e7bbaa8: [<60125b9e>] xfs_mkdir+0x37e/0x4b0
9e7bbb38: [<601f5589>] _spin_unlock+0x9/0x10
9e7bbb78: [<601301a4>] xfs_vn_mknod+0xf4/0x1a0
9e7bbbd8: [<6013025e>] xfs_vn_mkdir+0xe/0x10
9e7bbbe8: [<60091010>] vfs_mkdir+0x90/0xc0
9e7bbc18: [<600934d6>] sys_mkdirat+0x106/0x120
9e7bbc88: [<6008629b>] filp_close+0x4b/0x80
9e7bbce8: [<60093503>] sys_mkdir+0x13/0x20
9e7bbcf8: [<60019a70>] handle_syscall+0x50/0x80
9e7bbd18: [<6002a10f>] userspace+0x44f/0x510
9e7bbfc8: [<60016792>] fork_handler+0x62/0x70
I/O error in filesystem ("ram0") meta-data dev ram0 block 0x8002c ("xlog_iodone") error 5 buf count 32768
xfs_force_shutdown(ram0,0x2) called from line 1056 of file /home/npiggin/usr/src/linux-2.6/fs/xfs/xfs_log.c. Return address = 0x000000006011370d
Filesystem "ram0": Log I/O Error Detected. Shutting down filesystem: ram0
Please umount the filesystem, and rectify the problem(s)
xfs_force_shutdown(ram0,0x2) called from line 818 of file /home/npiggin/usr/src/linux-2.6/fs/xfs/xfs_log.c. Return address = 0x0000000060114e7d
slab error in verify_redzone_free(): cache `xfs_log_ticket': double free detected
Call Trace:
9e7bb998: [<6008372f>] __slab_error+0x1f/0x30
9e7bb9a8: [<60083cae>] cache_free_debugcheck+0x1ee/0x240
9e7bb9b0: [<60112ef0>] xlog_ticket_put+0x10/0x20
9e7bb9e8: [<60083f70>] kmem_cache_free+0x50/0xc0
9e7bba18: [<60112ef0>] xlog_ticket_put+0x10/0x20
9e7bba28: [<60114dc9>] xfs_log_done+0x59/0xb0
9e7bba68: [<6011f5de>] xfs_trans_cancel+0x7e/0x140
9e7bbaa8: [<60125a1e>] xfs_mkdir+0x1fe/0x4b0
9e7bbb38: [<601f5589>] _spin_unlock+0x9/0x10
9e7bbb78: [<601301a4>] xfs_vn_mknod+0xf4/0x1a0
9e7bbbd8: [<6013025e>] xfs_vn_mkdir+0xe/0x10
9e7bbbe8: [<60091010>] vfs_mkdir+0x90/0xc0
9e7bbc18: [<600934d6>] sys_mkdirat+0x106/0x120
9e7bbc88: [<6008629b>] filp_close+0x4b/0x80
9e7bbce8: [<60093503>] sys_mkdir+0x13/0x20
9e7bbcf8: [<60019a70>] handle_syscall+0x50/0x80
9e7bbd18: [<6002a10f>] userspace+0x44f/0x510
9e7bbfc8: [<60016792>] fork_handler+0x62/0x70
000000009e0d4ec0: redzone 1:0x9f911029d74e35b, redzone 2:0x9f911029d74e35b.
(3) open ./clients/client1 failed for handle 16385 (No such file or directory)
(4) ERROR: handle 16385 was not found
Child failed with status 1
(kernel died soon afterwards)