Date: Thu, 3 May 2007 16:45:21 +0200
From: Emmanuel Florac
Subject: XFS crash on linux raid
Message-ID: <20070503164521.16efe075@harpe.intellique.com>
To: xfs@oss.sgi.com

Hello,

Apparently quite a lot of people encounter this problem from time to time, but I couldn't find any solution.

Under heavy write load on the file server, the filesystem crashes once it has filled to about 2.5-3 TB (the exact point varies from run to run). The filesystems tested were always running on a software RAID 0, with barriers disabled. I tend to think the disabled write barriers are causing the crash, but I'll do some more tests to make sure (a rough test plan is sketched further down).

I first met this problem on 12/23 (yup... merry Christmas :) when a 13 TB filesystem went belly up:

Dec 23 01:38:10 storiq1 -- MARK --
Dec 23 01:58:10 storiq1 -- MARK --
Dec 23 02:10:29 storiq1 kernel: xfs_iunlink_remove: xfs_itobp() returned an error 990 on md0.  Returning error.
Dec 23 02:10:29 storiq1 kernel: xfs_inactive:^Ixfs_ifree() returned an error = 990 on md0
Dec 23 02:10:29 storiq1 kernel: xfs_force_shutdown(md0,0x1) called from line 1763 of file fs/xfs/xfs_vnodeops.c.  Return address = 0xc027f78b
Dec 23 02:38:11 storiq1 -- MARK --
Dec 23 02:58:11 storiq1 -- MARK --

When mounting, it did this:

Filesystem "md0": Disabling barriers, not supported by the underlying device
XFS mounting filesystem md0
Starting XFS recovery on filesystem: md0 (logdev: internal)
Filesystem "md0": xfs_inode_recover: Bad inode magic number, dino ptr = 0xf7196600, dino bp = 0xf718e980, ino = 119318
Filesystem "md0": XFS internal error xlog_recover_do_inode_trans(1) at line 2352 of file fs/xfs/xfs_log_recover.c.  Caller 0xc025d180
 xlog_recover_do_inode_trans+0x93d/0xa00
 xlog_recover_do_trans+0x140/0x160
 xfs_buf_delwri_queue+0x2b/0xb0
 xlog_recover_do_trans+0x140/0x160
 kmem_zalloc+0x1f/0x50
 xlog_recover_commit_trans+0x3f/0x50
 xlog_recover_process_data+0xea/0x240
 xlog_do_recovery_pass+0x39a/0xb70
 hrtimer_run_queues+0x29/0x110
 xlog_do_log_recovery+0x96/0xd0
 xlog_do_recover+0x3b/0x170
 xlog_recover+0xdd/0xf0
 xfs_log_mount+0xa1/0x110
 xfs_mountfs+0x825/0xf30
 xfs_fs_cmn_err+0x27/0x30
 xfs_ioinit+0x27/0x50
 xfs_mount+0x2ff/0x520
 vfs_mount+0x43/0x50
 xfs_fs_fill_super+0x9a/0x200
 debug_mutex_add_waiter+0x3d/0xd0
 snprintf+0x27/0x30
 disk_name+0xb4/0xc0
 sb_set_blocksize+0x1f/0x50
 get_sb_bdev+0x106/0x150
 xfs_fs_get_sb+0x30/0x40
 xfs_fs_fill_super+0x0/0x200
 do_kern_mount+0x5f/0xe0
 do_new_mount+0x77/0xc0
 do_mount+0x18d/0x1f0
 take_cpu_down+0xb/0x20
 copy_mount_options+0x63/0xc0
 sys_mount+0x9f/0xe0
 syscall_call+0x7/0xb
XFS: log mount/recovery failed: error 990
XFS: log mount failed

xfs_repair (too old a version...) hosed the filesystem and destroyed most of the 2.6 TB of data. Yes, there was no backup; I wrote a recovery tool to restore the video data from the raw device, but that is a different story.
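For what it's worth, here is roughly how I intend to proceed next time before letting any tool loose on a sick filesystem, and how I plan to check the barrier situation. This is only a sketch: /mnt/data is a made-up mount point, the device name is from my setup, and I still need to confirm how this kernel behaves.

  # Confirm whether XFS disabled barriers at mount time (md RAID 0 on
  # these kernels does not support them, so I expect the warning)
  dmesg | grep -i barrier

  # Ask for barriers explicitly; if the underlying device cannot honour
  # them, XFS logs "Disabling barriers, not supported by the underlying
  # device" and we know the write caches are unprotected
  umount /mnt/data
  mount -o barrier /dev/md0 /mnt/data

  # Before any destructive repair: run xfs_repair in no-modify mode
  # first, so it only reports what it would change and never writes
  umount /mnt/data
  xfs_repair -n /dev/md0

  # Only as a last resort, and only with an up-to-date xfsprogs, zero
  # the log with -L; that throws away any unreplayed log transactions
  # xfs_repair -L /dev/md0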
The system was running vanilla 2.6.17.9, and md0 was made of 3 striped hardware RAID-5 units on 3 3Ware 9550 cards, each unit made of 8 750 GB drives.

On similar hardware, with 2 3Ware 9550 units (16x750 GB each) striped together but running 2.6.17.13, I had a similar fs crash last week. Unfortunately I don't have the logs at hand, but we were able to reproduce the crash several times at home:

Filesystem "md0": XFS internal error xfs_btree_check_sblock at line 336 of file fs/xfs/xfs_btree.c.  Caller 0xc01fb282
 xfs_btree_check_sblock+0x58/0xe0
 xfs_alloc_lookup+0x142/0x400
 xfs_alloc_lookup+0x142/0x400
 kmem_zone_alloc+0x59/0xd0
 xfs_btree_init_cursor+0x23/0x190
 xfs_alloc_ag_vextent_near+0x54/0x9e0
 xfs_bmap_add_extent+0x383/0x430
 xfs_bmap_search_multi_extents+0x76/0xf0
 xfs_alloc_ag_vextent+0x119/0x120
 xfs_alloc_vextent+0x3db/0x4f0
 xfs_bmap_btalloc+0x3ee/0x890
 xfs_bmapi+0x1216/0x1690
 xfs_dir2_grow_inode+0xf6/0x400
 cache_alloc_refill+0xb6/0x1e0
 xfs_idata_realloc+0x3b/0x130
 xfs_dir2_sf_to_block+0xac/0x5d0
 xfs_dir2_lookup+0x129/0x130
 xfs_dir2_sf_addname+0x97/0x110
 xfs_dir2_createname+0x144/0x150
 xfs_trans_ijoin+0x2b/0x80
 xfs_rename+0x354/0x9f0
 xfs_access+0x3f/0x50
 xfs_vn_rename+0x48/0xa0
 __link_path_walk+0xc7c/0xc90
 xfs_getattr+0x23f/0x2f0
 mntput_no_expire+0x1b/0x80
 cache_alloc_refill+0xb6/0x1e0
 vfs_rename_other+0x96/0xd0
 vfs_rename+0x258/0x2d0
 do_rename+0x171/0x1a0
 cache_grow+0x10b/0x160
 cache_alloc_refill+0xb6/0x1e0
 do_getname+0x4b/0x80
 sys_renameat+0x47/0x80
 sys_rename+0x28/0x30
 syscall_call+0x7/0xb
Filesystem "md0": XFS internal error xfs_trans_cancel at line 1150 of file fs/xfs/xfs_trans.c.  Caller 0xc0245ec7
 xfs_trans_cancel+0xd0/0x100
 xfs_rename+0x6a7/0x9f0
 xfs_rename+0x6a7/0x9f0
 xfs_access+0x3f/0x50
 xfs_vn_rename+0x48/0xa0
 __link_path_walk+0xc7c/0xc90
 xfs_getattr+0x23f/0x2f0
 mntput_no_expire+0x1b/0x80
 cache_alloc_refill+0xb6/0x1e0
 vfs_rename_other+0x96/0xd0
 vfs_rename+0x258/0x2d0
 do_rename+0x171/0x1a0
 cache_grow+0x10b/0x160
 cache_alloc_refill+0xb6/0x1e0
 do_getname+0x4b/0x80
 sys_renameat+0x47/0x80
 sys_rename+0x28/0x30
 syscall_call+0x7/0xb
xfs_force_shutdown(md0,0x8) called from line 1151 of file fs/xfs/xfs_trans.c.  Return address = 0xc025f7b9
Filesystem "md0": Corruption of in-memory data detected.  Shutting down filesystem: md0
Please umount the filesystem, and rectify the problem(s)
xfs_force_shutdown(md0,0x1) called from line 338 of file fs/xfs/xfs_rw.c.  Return address = 0xc025f7b9
xfs_force_shutdown(md0,0x1) called from line 338 of file fs/xfs/xfs_rw.c.  Return address = 0xc025f7b9

After xfs_repair the fs is fine; however, it crashes again after writing another couple of GB of data. It crashes under 2.6.17.13, 2.6.17.13 SMP, 2.6.18.8, 2.6.16.36...

Out of curiosity, I tried reiserfs (just to see how it compares in this respect). Reiserfs crashed before even writing 100 MB! So I tend to believe this is a "write barrier" problem, and it looks really nasty!

To sort this out I've started a test on a single 3Ware RAID, without software RAID. Any idea on how to circumvent the problem and make software RAID/LVM usable? (A sketch of the stop-gap I'm considering is in the PS below.)

-- 
----------------------------------------
Emmanuel Florac       |   Intellique
----------------------------------------
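PS: for completeness, this is more or less how the array is put together, plus the only stop-gap I can think of while barriers cannot pass through md. Device names are illustrative: each sdX below stands for one hardware RAID-5 unit exported by a 3Ware 9550.

  # Stripe the three hardware RAID-5 units into md0 and put XFS on top
  mdadm --create /dev/md0 --level=0 --raid-devices=3 /dev/sda /dev/sdb /dev/sdc
  mkfs.xfs /dev/md0

  # Without working barriers, the volatile write caches below the
  # filesystem are the danger; for directly attached SATA drives they
  # can be switched off with hdparm
  hdparm -W 0 /dev/sda

  # For drives behind the 3Ware controllers the write cache has to be
  # disabled per unit with the controller's own tool (tw_cli); I'd have
  # to check the exact syntax, so I won't guess it here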