* xfs and raid5 - "Structure needs cleaning for directory open"

From: Rainer Fuegenstein @ 2010-05-09 18:48 UTC
To: xfs; +Cc: linux-raid

This morning some daemon processes terminated because of errors in the XFS
file system on top of a software RAID5 consisting of 4*1.5TB WD Caviar Green
SATA disks. The current OS is CentOS 5.4; the kernel is:

Linux alfred 2.6.18-164.15.1.el5xen #1 SMP Wed Mar 17 12:04:23 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

The history: this RAID was originally created on an ASUS M2N-X Plus mainboard
with all 4 drives connected to the on-board controller (CentOS 5.4, current
i386 kernel). It worked fine at first, but after some months problems
occurred when copying files via SMB; in these situations dmesg showed a stack
trace, starting with an interrupt problem deep in the kernel and reaching up
into the XFS filesystem code. A few months ago the weekly RAID check
(/etc/cron.weekly/99-raid-check) started a re-sync of the RAID, which (on the
M2N-X board) took about 2.5 to 3 days to complete.

To overcome the interrupt problems, I recently bought an Intel D510 Atom
mainboard and a "Promise Technology, Inc. PDC40718 (SATA 300 TX4) (rev 02)"
SATA controller, reinstalled CentOS 5.4 from scratch (x86_64 version) and
attached the 4 SATA disks, which worked fine until this Sunday night.

The 99-raid-check started again at 4:00 in the morning and lasted until just
now (19:00). Around 12:00 noon (resync at about 50%) I noticed the first
problems, namely "Structure needs cleaning for directory open" messages. At
this time, a "du -sh *" revealed that around 50% of the data stored on the
XFS was lost (due to directories that couldn't be read because of the "needs
cleaning ..." error). A daring xfs_repair on the unmounted, but still
syncing, filesystem revealed & fixed no errors (see output below).

After painfully waiting 7 hours for the resync to complete, it looks like the
filesystem is OK and back to normal again: du shows the expected 3.5TB usage,
there are no more "needs cleaning ..." errors, and a quick check of the
previously lost directories suggests that the files contained within are OK.

I wonder what caused this behaviour (and how to prevent it in the future):

1) damage done to the XFS filesystem on the old board? Shouldn't xfs_repair
   find & repair it?
2) does a re-syncing RAID deliver bad/corrupt data to the filesystem layer
   above?
3) could this be a hardware/memory problem, since XFS reports "Corruption of
   in-memory data detected"?
4) is the Promise SATA controller to blame?

Here's some output that may help; please let me know if you need more:

*** this is where it started:

May 9 04:22:01 alfred kernel: md: syncing RAID array md0
May 9 04:22:01 alfred kernel: md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
May 9 04:22:01 alfred kernel: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reconstruction.
May 9 04:22:01 alfred kernel: md: using 128k window, over a total of 1465135936 blocks.
May 9 04:24:06 alfred kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 4565 of file fs/xfs/xfs_bmap.c.  Caller 0xffffffff8835dba8
May 9 04:24:06 alfred kernel:
May 9 04:24:06 alfred kernel: Call Trace:
May 9 04:24:06 alfred kernel: [<ffffffff8833f15e>] :xfs:xfs_bmap_read_extents+0x361/0x384
May 9 04:24:06 alfred kernel: [<ffffffff8835dba8>] :xfs:xfs_iread_extents+0xac/0xc8
May 9 04:24:06 alfred kernel: [<ffffffff883448c3>] :xfs:xfs_bmapi+0x226/0xe79
May 9 04:24:06 alfred kernel: [<ffffffff8021c4c6>] generic_make_request+0x211/0x228
May 9 04:24:06 alfred kernel: [<ffffffff882edd2e>] :raid456:handle_stripe+0x20a6/0x21ff
May 9 04:24:06 alfred kernel: [<ffffffff88361a2b>] :xfs:xfs_iomap+0x144/0x2a5
May 9 04:24:06 alfred kernel: [<ffffffff88376c38>] :xfs:__xfs_get_blocks+0x7a/0x1bf
May 9 04:24:06 alfred kernel: [<ffffffff882eebdb>] :raid456:make_request+0x4ba/0x4f4
May 9 04:24:06 alfred kernel: [<ffffffff8029bfc3>] autoremove_wake_function+0x0/0x2e
May 9 04:24:06 alfred kernel: [<ffffffff80228a95>] do_mpage_readpage+0x167/0x474
May 9 04:24:06 alfred kernel: [<ffffffff88376d8e>] :xfs:xfs_get_blocks+0x0/0xe
May 9 04:24:06 alfred kernel: [<ffffffff88376d8e>] :xfs:xfs_get_blocks+0x0/0xe
May 9 04:24:06 alfred kernel: [<ffffffff8020cc70>] add_to_page_cache+0xb9/0xc5
May 9 04:24:06 alfred kernel: [<ffffffff88376d8e>] :xfs:xfs_get_blocks+0x0/0xe
May 9 04:24:06 alfred kernel: [<ffffffff8023a3d8>] mpage_readpages+0x91/0xd9
May 9 04:24:06 alfred kernel: [<ffffffff88376d8e>] :xfs:xfs_get_blocks+0x0/0xe
May 9 04:24:06 alfred kernel: [<ffffffff8020f66f>] __alloc_pages+0x65/0x2ce
May 9 04:24:06 alfred kernel: [<ffffffff802137d1>] __do_page_cache_readahead+0x130/0x1ab
May 9 04:24:06 alfred kernel: [<ffffffff802336c8>] blockable_page_cache_readahead+0x53/0xb2
May 9 04:24:06 alfred kernel: [<ffffffff802147a4>] page_cache_readahead+0xd6/0x1af
May 9 04:24:06 alfred kernel: [<ffffffff8020c6d7>] do_generic_mapping_read+0xc6/0x38a
May 9 04:24:06 alfred kernel: [<ffffffff8020d693>] file_read_actor+0x0/0x101
May 9 04:24:06 alfred kernel: [<ffffffff8020cae7>] __generic_file_aio_read+0x14c/0x198
May 9 04:24:06 alfred kernel: [<ffffffff8837d7de>] :xfs:xfs_read+0x187/0x209
May 9 04:24:06 alfred kernel: [<ffffffff8837a4d8>] :xfs:xfs_file_aio_read+0x63/0x6b
May 9 04:24:06 alfred kernel: [<ffffffff8020d3d2>] do_sync_read+0xc7/0x104
May 9 04:24:06 alfred kernel: [<ffffffff8021ecec>] __dentry_open+0x101/0x1dc
May 9 04:24:06 alfred kernel: [<ffffffff8029bfc3>] autoremove_wake_function+0x0/0x2e
May 9 04:24:06 alfred kernel: [<ffffffff80227a40>] do_filp_open+0x2a/0x38
May 9 04:24:06 alfred kernel: [<ffffffff8020bbaf>] vfs_read+0xcb/0x171
May 9 04:24:06 alfred kernel: [<ffffffff80212495>] sys_read+0x45/0x6e
May 9 04:24:06 alfred kernel: [<ffffffff8026168d>] ia32_sysret+0x0/0x5
May 9 04:24:06 alfred kernel:
May 9 04:24:06 alfred kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 4565 of file fs/xfs/xfs_bmap.c.  Caller 0xffffffff8835dba8

*** (many, many more)

May 9 06:19:16 alfred kernel: Filesystem "md0": corrupt dinode 1610637790, (btree extents).  Unmount and run xfs_repair.
May 9 06:19:16 alfred kernel: Filesystem "md0": XFS internal error xfs_bmap_read_extents(1) at line 4560 of file fs/xfs/xfs_bmap.c.  Caller 0xffffffff8835dba8
May 9 06:19:16 alfred kernel:
May 9 06:19:16 alfred kernel: Call Trace:
May 9 06:19:16 alfred kernel: [<ffffffff8833f15e>] :xfs:xfs_bmap_read_extents+0x361/0x384
May 9 06:19:16 alfred kernel: [<ffffffff8835dba8>] :xfs:xfs_iread_extents+0xac/0xc8
May 9 06:19:16 alfred kernel: [<ffffffff883448c3>] :xfs:xfs_bmapi+0x226/0xe79
May 9 06:19:16 alfred kernel: [<ffffffff8866ac47>] :ip_conntrack:tcp_pkt_to_tuple+0x0/0x61
May 9 06:19:16 alfred kernel: [<ffffffff8866883d>] :ip_conntrack:__ip_conntrack_find+0xd/0xb7
May 9 06:19:16 alfred kernel: [<ffffffff8023f750>] lock_timer_base+0x1b/0x3c
May 9 06:19:16 alfred kernel: [<ffffffff8021ce99>] __mod_timer+0xb0/0xbe
May 9 06:19:16 alfred kernel: [<ffffffff88668e71>] :ip_conntrack:__ip_ct_refresh_acct+0x10f/0x152
May 9 06:19:16 alfred kernel: [<ffffffff8866b8a8>] :ip_conntrack:tcp_packet+0xa5f/0xa9f
May 9 06:19:16 alfred kernel: [<ffffffff88361a2b>] :xfs:xfs_iomap+0x144/0x2a5
May 9 06:19:16 alfred kernel: [<ffffffff88376c38>] :xfs:__xfs_get_blocks+0x7a/0x1bf
May 9 06:19:16 alfred kernel: [<ffffffff802235ae>] alloc_buffer_head+0x31/0x36
May 9 06:19:16 alfred kernel: [<ffffffff8022fa7a>] alloc_page_buffers+0x81/0xd3
May 9 06:19:16 alfred kernel: [<ffffffff8020ea95>] __block_prepare_write+0x1ad/0x375
May 9 06:19:16 alfred kernel: [<ffffffff88376d8e>] :xfs:xfs_get_blocks+0x0/0xe
May 9 06:19:16 alfred kernel: [<ffffffff802bda81>] add_to_page_cache_lru+0x1c/0x22
May 9 06:19:16 alfred kernel: [<ffffffff802d3456>] block_write_begin+0x80/0xcf
May 9 06:19:16 alfred kernel: [<ffffffff8837637d>] :xfs:xfs_vm_write_begin+0x19/0x1e
May 9 06:19:16 alfred kernel: [<ffffffff88376d8e>] :xfs:xfs_get_blocks+0x0/0xe
May 9 06:19:16 alfred kernel: [<ffffffff8021072e>] generic_file_buffered_write+0x14b/0x60c
May 9 06:19:16 alfred kernel: [<ffffffff80209e60>] __d_lookup+0xb0/0xff
May 9 06:19:16 alfred kernel: [<ffffffff80264931>] _spin_lock_irqsave+0x9/0x14
May 9 06:19:16 alfred kernel: [<ffffffff8837dcfe>] :xfs:xfs_write+0x49e/0x69e
May 9 06:19:16 alfred kernel: [<ffffffff8022d090>] mntput_no_expire+0x19/0x89
May 9 06:19:16 alfred kernel: [<ffffffff8020edf0>] link_path_walk+0xa6/0xb2
May 9 06:19:16 alfred kernel: [<ffffffff8837a470>] :xfs:xfs_file_aio_write+0x65/0x6a
May 9 06:19:16 alfred kernel: [<ffffffff802185e8>] do_sync_write+0xc7/0x104
May 9 06:19:16 alfred kernel: [<ffffffff8021ecec>] __dentry_open+0x101/0x1dc
May 9 06:19:16 alfred kernel: [<ffffffff8029bfc3>] autoremove_wake_function+0x0/0x2e
May 9 06:19:16 alfred kernel: [<ffffffff80227a40>] do_filp_open+0x2a/0x38
May 9 06:19:16 alfred kernel: [<ffffffff802171aa>] vfs_write+0xce/0x174
May 9 06:19:16 alfred kernel: [<ffffffff802179e2>] sys_write+0x45/0x6e
May 9 06:19:16 alfred kernel: [<ffffffff8026168d>] ia32_sysret+0x0/0x5

*** also many, many more, always the same dinode

May 9 12:53:32 alfred kernel: Filesystem "md0": XFS internal error xfs_btree_check_sblock at line 307 of file fs/xfs/xfs_btree.c.  Caller 0xffffffff88358eb7
May 9 12:53:32 alfred kernel:
May 9 12:53:32 alfred kernel: Call Trace:
May 9 12:53:32 alfred kernel: [<ffffffff88349bee>] :xfs:xfs_btree_check_sblock+0xaf/0xbe
May 9 12:53:32 alfred kernel: [<ffffffff88358eb7>] :xfs:xfs_inobt_increment+0x156/0x17e
May 9 12:53:32 alfred kernel: [<ffffffff88358920>] :xfs:xfs_dialloc+0x4d0/0x80c
May 9 12:53:32 alfred kernel: [<ffffffff802260ff>] find_or_create_page+0x3f/0xab
May 9 12:53:32 alfred kernel: [<ffffffff8835eafc>] :xfs:xfs_ialloc+0x5f/0x57f
May 9 12:53:32 alfred kernel: [<ffffffff8805c02a>] :ext3:ext3_get_acl+0x63/0x310
May 9 12:53:32 alfred kernel: [<ffffffff8020b242>] kmem_cache_alloc+0x62/0x6d
May 9 12:53:32 alfred kernel: [<ffffffff88370b23>] :xfs:xfs_dir_ialloc+0x86/0x2b7
May 9 12:53:32 alfred kernel: [<ffffffff883654c0>] :xfs:xlog_grant_log_space+0x204/0x25c
May 9 12:53:32 alfred kernel: [<ffffffff883735f8>] :xfs:xfs_create+0x237/0x45c
May 9 12:53:32 alfred kernel: [<ffffffff88338d47>] :xfs:xfs_attr_get+0x8e/0x9f
May 9 12:53:32 alfred kernel: [<ffffffff8837cd38>] :xfs:xfs_vn_mknod+0x144/0x215
May 9 12:53:32 alfred kernel: [<ffffffff8023bdcb>] vfs_create+0xe6/0x158
May 9 12:53:32 alfred kernel: [<ffffffff8021b38f>] open_namei+0x1a1/0x6ed
May 9 12:53:32 alfred kernel: [<ffffffff80227a32>] do_filp_open+0x1c/0x38
May 9 12:53:32 alfred kernel: [<ffffffff8021a1a0>] do_sys_open+0x44/0xbe
May 9 12:53:32 alfred kernel: [<ffffffff8026168d>] ia32_sysret+0x0/0x5
May 9 12:53:32 alfred kernel:

*** also many, many more

May 9 13:44:35 alfred kernel: 00000000: ff ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00  ÿÿÿÿÿÿÿÿ........
May 9 13:44:35 alfred kernel: Filesystem "md0": XFS internal error xfs_da_do_buf(2) at line 2112 of file fs/xfs/xfs_da_btree.c.  Caller 0xffffffff8834b82e
May 9 13:44:35 alfred kernel:
May 9 13:44:35 alfred kernel: Call Trace:
May 9 13:44:35 alfred kernel: [<ffffffff8834b72d>] :xfs:xfs_da_do_buf+0x503/0x5b1
May 9 13:44:35 alfred kernel: [<ffffffff8834b82e>] :xfs:xfs_da_read_buf+0x16/0x1b
May 9 13:44:35 alfred kernel: [<ffffffff8020cb6c>] _atomic_dec_and_lock+0x39/0x57
May 9 13:44:35 alfred kernel: [<ffffffff8834b82e>] :xfs:xfs_da_read_buf+0x16/0x1b
May 9 13:44:35 alfred kernel: [<ffffffff88350b0c>] :xfs:xfs_dir2_leaf_getdents+0x354/0x5ec
May 9 13:44:35 alfred kernel: [<ffffffff88350b0c>] :xfs:xfs_dir2_leaf_getdents+0x354/0x5ec
May 9 13:44:35 alfred kernel: [<ffffffff88379c6c>] :xfs:xfs_hack_filldir+0x0/0x5b
May 9 13:44:35 alfred kernel: [<ffffffff88379c6c>] :xfs:xfs_hack_filldir+0x0/0x5b
May 9 13:44:35 alfred kernel: [<ffffffff8834d868>] :xfs:xfs_readdir+0xa7/0xb6
May 9 13:44:35 alfred kernel: [<ffffffff8837a301>] :xfs:xfs_file_readdir+0xff/0x14c
May 9 13:44:35 alfred kernel: [<ffffffff80225d93>] filldir+0x0/0xb7
May 9 13:44:35 alfred kernel: [<ffffffff80225d93>] filldir+0x0/0xb7
May 9 13:44:35 alfred kernel: [<ffffffff802366f7>] vfs_readdir+0x77/0xa9
May 9 13:44:35 alfred kernel: [<ffffffff80239f2a>] sys_getdents+0x75/0xbd
May 9 13:44:35 alfred kernel: [<ffffffff80260295>] tracesys+0x47/0xb6
May 9 13:44:35 alfred kernel: [<ffffffff802602f9>] tracesys+0xab/0xb6
May 9 13:44:35 alfred kernel:
May 9 13:51:24 alfred kernel: Filesystem "md0": Disabling barriers, trial barrier write failed
May 9 13:51:24 alfred kernel: XFS mounting filesystem md0

*** these xfs_da_do_buf errors appeared at a rate of about 5 per second until 14:40, then stopped; the filesystem was still mounted, maybe one daemon was still accessing it.
*** xfs_repair performed when raid was at 50% resync and filesystem was corrupted:

[root@alfred ~]# xfs_repair /dev/md0
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
[...]
        - agno = 62
        - agno = 63
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
[...]
        - agno = 61
        - agno = 62
        - agno = 63
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done

raid output after sync was finished:

[root@alfred md]# cat /sys/block/md0/md/array_state
clean
[root@alfred md]# cat /sys/block/md0/md/mismatch_cnt
0

tnx & cu
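For reference, a minimal sketch of the checks discussed in the mail above: confirm whether md0 is still checking or resyncing before trusting repair output, and run xfs_repair in its read-only mode first so it only reports. The sysfs paths are the ones shown in the output; the mount point is a hypothetical placeholder.

    # is a check/resync still running on md0?
    cat /proc/mdstat
    cat /sys/block/md0/md/sync_action     # "idle" when no check/resync/repair is active
    cat /sys/block/md0/md/mismatch_cnt    # parity mismatches found by the last check

    # dry-run repair of the unmounted filesystem (-n = no modify, report only)
    umount /mnt/raid                      # hypothetical mount point
    xfs_repair -n /dev/md0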
* Re: xfs and raid5 - "Structure needs cleaning for directory open"

From: Dave Chinner @ 2010-05-10 2:20 UTC
To: Rainer Fuegenstein; +Cc: xfs, linux-raid

On Sun, May 09, 2010 at 08:48:00PM +0200, Rainer Fuegenstein wrote:
>
> This morning some daemon processes terminated because of errors in
> the XFS file system on top of a software RAID5 consisting of 4*1.5TB
> WD Caviar Green SATA disks.

Reminds me of a recent(-ish) md/dm readahead cancellation fix - that
would fit the symptoms of btree corruption showing up under heavy IO
load but no corruption on disk. However, I can't seem to find any
references to it at the moment (can't remember the bug title), but
perhaps your distro doesn't have the fix in it?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
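One way to follow up on the "perhaps your distro doesn't have the fix" suggestion, sketched under the assumption that the stock CentOS kernel-xen RPM is installed: search the running kernel package's changelog for readahead- or raid5-related fixes and compare against the newest errata kernel.

    # which kernel package is running?
    uname -r                    # e.g. 2.6.18-164.15.1.el5xen
    rpm -q kernel-xen

    # scan its changelog for readahead / raid5 fixes (the changelog wording may differ)
    rpm -q --changelog kernel-xen | grep -iE 'readahead|raid5'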
* Re: xfs and raid5 - "Structure needs cleaning for directory open"

From: Doug Ledford @ 2010-05-17 21:28 UTC
To: Dave Chinner; +Cc: Rainer Fuegenstein, xfs, linux-raid

On 05/09/2010 10:20 PM, Dave Chinner wrote:
> On Sun, May 09, 2010 at 08:48:00PM +0200, Rainer Fuegenstein wrote:
>>
>> This morning some daemon processes terminated because of errors in
>> the XFS file system on top of a software RAID5 consisting of 4*1.5TB
>> WD Caviar Green SATA disks.
>
> Reminds me of a recent(-ish) md/dm readahead cancellation fix - that
> would fit the symptoms of btree corruption showing up under heavy IO
> load but no corruption on disk. However, I can't seem to find any
> references to it at the moment (can't remember the bug title), but
> perhaps your distro doesn't have the fix in it?
>
> Cheers,
>
> Dave.

That sounds plausible, as does a hardware error. A memory bit flip under
heavy load would cause the in-memory data to be corrupt while the
on-disk data is good. By waiting to check it until later, the bad memory
was flushed at some point, and when the data was reloaded it came in OK
this time.

-- 
Doug Ledford <dledford@redhat.com>
GPG KeyID: CFBFF194
http://people.redhat.com/dledford

Infiniband specific RPMs available at
http://people.redhat.com/dledford/Infiniband
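If the hardware theory is to be tested rather than assumed, a rough sketch of the usual first checks; the tool and device names (/dev/sd[abcd]) are assumptions, and the packages may need installing on CentOS 5.

    # SMART health status and logged errors for each RAID member
    for d in /dev/sd[abcd]; do
        echo "== $d =="
        smartctl -H -l error "$d"
    done

    # quick userspace RAM test (a boot-time memtest86+ run is more thorough)
    memtester 512M 3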
* Re: xfs and raid5 - "Structure needs cleaning for directory open"

From: Dave Chinner @ 2010-05-17 21:45 UTC
To: Doug Ledford; +Cc: Rainer Fuegenstein, xfs, linux-raid

On Mon, May 17, 2010 at 05:28:30PM -0400, Doug Ledford wrote:
> On 05/09/2010 10:20 PM, Dave Chinner wrote:
> > On Sun, May 09, 2010 at 08:48:00PM +0200, Rainer Fuegenstein wrote:
> >>
> >> This morning some daemon processes terminated because of errors in
> >> the XFS file system on top of a software RAID5 consisting of 4*1.5TB
> >> WD Caviar Green SATA disks.
> >
> > Reminds me of a recent(-ish) md/dm readahead cancellation fix - that
> > would fit the symptoms of btree corruption showing up under heavy IO
> > load but no corruption on disk. However, I can't seem to find any
> > references to it at the moment (can't remember the bug title), but
> > perhaps your distro doesn't have the fix in it?
>
> That sounds plausible, as does a hardware error. A memory bit flip under
> heavy load would cause the in-memory data to be corrupt while the
> on-disk data is good.

The data dumps from the bad blocks weren't wrong by a single bit -
they were unrecognisable garbage - so it is very unlikely to be
a memory error causing the problem.

> By waiting to check it until later, the bad memory
> was flushed at some point, and when the data was reloaded it came in OK
> this time.

Yup - XFS needs to do a better job of catching this case - the
prototype metadata checksumming patch caught most of these cases...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
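To check the point that the on-disk copy was fine, one could dump the inode the kernel kept complaining about straight from disk with xfs_db; a read-only sketch using the dinode number from the earlier log excerpt:

    # print the on-disk contents of inode 1610637790 without modifying anything
    xfs_db -r -c "inode 1610637790" -c "print" /dev/md0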
* Re: xfs and raid5 - "Structure needs cleaning for directory open"

From: Doug Ledford @ 2010-05-17 22:18 UTC
To: Dave Chinner; +Cc: Rainer Fuegenstein, xfs, linux-raid

On 05/17/2010 05:45 PM, Dave Chinner wrote:
> On Mon, May 17, 2010 at 05:28:30PM -0400, Doug Ledford wrote:
>> On 05/09/2010 10:20 PM, Dave Chinner wrote:
>>> On Sun, May 09, 2010 at 08:48:00PM +0200, Rainer Fuegenstein wrote:
>>>>
>>>> This morning some daemon processes terminated because of errors in
>>>> the XFS file system on top of a software RAID5 consisting of 4*1.5TB
>>>> WD Caviar Green SATA disks.
>>>
>>> Reminds me of a recent(-ish) md/dm readahead cancellation fix - that
>>> would fit the symptoms of btree corruption showing up under heavy IO
>>> load but no corruption on disk. However, I can't seem to find any
>>> references to it at the moment (can't remember the bug title), but
>>> perhaps your distro doesn't have the fix in it?
>>
>> That sounds plausible, as does a hardware error. A memory bit flip under
>> heavy load would cause the in-memory data to be corrupt while the
>> on-disk data is good.
>
> The data dumps from the bad blocks weren't wrong by a single bit -
> they were unrecognisable garbage - so it is very unlikely to be
> a memory error causing the problem.

Not true. It can still be a single bit error, just a single bit error
higher up in the chain: e.g. a single bit error in the SCSI command to
read various sectors, and then you read in all sorts of wrong data and
everything from there is totally whacked.

>> By waiting to check it until later, the bad memory
>> was flushed at some point, and when the data was reloaded it came in OK
>> this time.
>
> Yup - XFS needs to do a better job of catching this case - the
> prototype metadata checksumming patch caught most of these cases...
>
> Cheers,
>
> Dave.

-- 
Doug Ledford <dledford@redhat.com>
GPG KeyID: CFBFF194
http://people.redhat.com/dledford

Infiniband specific RPMs available at
http://people.redhat.com/dledford/Infiniband
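Transport-level corruption of the kind described here usually leaves a trace in the drives' interface CRC counters; a quick, read-only way to look (attribute names vary by vendor, device names are assumptions):

    for d in /dev/sd[abcd]; do
        echo "== $d =="
        smartctl -A "$d" | grep -i crc    # e.g. UDMA_CRC_Error_Count
    done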
* Re: xfs and raid5 - "Structure needs cleaning for directory open"

From: Dave Chinner @ 2010-05-17 23:04 UTC
To: Doug Ledford; +Cc: Rainer Fuegenstein, xfs, linux-raid

On Mon, May 17, 2010 at 06:18:28PM -0400, Doug Ledford wrote:
> On 05/17/2010 05:45 PM, Dave Chinner wrote:
> > On Mon, May 17, 2010 at 05:28:30PM -0400, Doug Ledford wrote:
> >> On 05/09/2010 10:20 PM, Dave Chinner wrote:
> >>> On Sun, May 09, 2010 at 08:48:00PM +0200, Rainer Fuegenstein wrote:
> >>>>
> >>>> This morning some daemon processes terminated because of errors in
> >>>> the XFS file system on top of a software RAID5 consisting of 4*1.5TB
> >>>> WD Caviar Green SATA disks.
> >>>
> >>> Reminds me of a recent(-ish) md/dm readahead cancellation fix - that
> >>> would fit the symptoms of btree corruption showing up under heavy IO
> >>> load but no corruption on disk. However, I can't seem to find any
> >>> references to it at the moment (can't remember the bug title), but
> >>> perhaps your distro doesn't have the fix in it?
> >>
> >> That sounds plausible, as does a hardware error. A memory bit flip under
> >> heavy load would cause the in-memory data to be corrupt while the
> >> on-disk data is good.
> >
> > The data dumps from the bad blocks weren't wrong by a single bit -
> > they were unrecognisable garbage - so it is very unlikely to be
> > a memory error causing the problem.
>
> Not true. It can still be a single bit error, just a single bit error
> higher up in the chain: e.g. a single bit error in the SCSI command to
> read various sectors, and then you read in all sorts of wrong data and
> everything from there is totally whacked.

I didn't say it *couldn't be* a bit error, just that it was _very
unlikely_. Hardware errors that result only in repeated XFS btree
corruption in memory, without causing other errors in the system, are
something I've never seen, even on machines with known bad memory,
HBAs, interconnects, etc.

Applying Occam's Razor to this case indicates that it is going to be
caused by a software problem. Yes, it's still possible that it's a
hardware issue, just very, very unlikely. And if it is hardware and you
can prove that it was the cause, then I suggest we all buy a lottery
ticket.... ;)

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com