From mboxrd@z Thu Jan 1 00:00:00 1970
From: Pallai Roland
Subject: Re: raid5: I lost a XFS file system due to a minor IDE cable problem
Date: Fri, 25 May 2007 16:35:36 +0200
Message-ID: <200705251635.36533.dap@mail.index.hu>
References: <200705241318.30711.dap@mail.index.hu>
 <1180056948.6183.10.camel@daptopfc.localdomain>
 <20070525045500.GF86004887@sgi.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <20070525045500.GF86004887@sgi.com>
Content-Disposition: inline
Sender: linux-raid-owner@vger.kernel.org
To: David Chinner
Cc: Linux-Raid, xfs@oss.sgi.com
List-Id: linux-raid.ids

On Friday 25 May 2007 06:55:00 David Chinner wrote:
> Oh, did you look at your logs and find that XFS had spammed them
> about writes that were failing?
The first message after the incident:

May 24 01:53:50 hq kernel: Filesystem "loop1": XFS internal error xfs_btree_check_sblock at line 336 of file fs/xfs/xfs_btree.c.  Caller 0xf8ac14f8
May 24 01:53:50 hq kernel:  xfs_btree_check_sblock+0x4f/0xc2 [xfs]  xfs_alloc_lookup+0x34e/0x47b [xfs]
May 24 01:53:50 hq kernel:  xfs_alloc_lookup+0x34e/0x47b [xfs]  kmem_zone_zalloc+0x1b/0x43 [xfs]
May 24 01:53:50 hq kernel:  xfs_alloc_ag_vextent+0x24d/0x1110 [xfs]  xfs_alloc_vextent+0x3bd/0x53b [xfs]
May 24 01:53:50 hq kernel:  xfs_bmapi+0x1ac4/0x23cd [xfs]  xfs_bmap_search_multi_extents+0x8e/0xd8 [xfs]
May 24 01:53:50 hq kernel:  xlog_dealloc_log+0x49/0xea [xfs]  xfs_iomap_write_allocate+0x2d9/0x58b [xfs]
May 24 01:53:50 hq kernel:  xfs_iomap+0x60e/0x82d [xfs]  __wake_up_common+0x39/0x59
May 24 01:53:50 hq kernel:  xfs_map_blocks+0x39/0x6c [xfs]  xfs_page_state_convert+0x644/0xf9c [xfs]
May 24 01:53:50 hq kernel:  schedule+0x5d1/0xf4d  xfs_vm_writepage+0x0/0xe0 [xfs]
May 24 01:53:50 hq kernel:  xfs_vm_writepage+0x57/0xe0 [xfs]  mpage_writepages+0x1fb/0x3bb
May 24 01:53:50 hq kernel:  mpage_writepages+0x133/0x3bb  xfs_vm_writepage+0x0/0xe0 [xfs]
May 24 01:53:50 hq kernel:  do_writepages+0x35/0x3b  __writeback_single_inode+0x88/0x387
May 24 01:53:50 hq kernel:  sync_sb_inodes+0x1b4/0x2a8  writeback_inodes+0x63/0xdc
May 24 01:53:50 hq kernel:  background_writeout+0x66/0x9f  pdflush+0x0/0x1ad
May 24 01:53:50 hq kernel:  pdflush+0xef/0x1ad  background_writeout+0x0/0x9f
May 24 01:53:50 hq kernel:  kthread+0xc2/0xc6  kthread+0x0/0xc6
May 24 01:53:50 hq kernel:  kernel_thread_helper+0x5/0xb

..and my logs are spammed with such messages. Isn't this "internal
error" a good reason to shut down the file system? I think if there's
a sign of a corrupted file system, the first thing we should do is
stop writes (or the entire FS) and let the admin examine the
situation. I'm not talking about my case, where the md raid5 was
braindead; I'm talking about general situations.
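Something like the following is what I have in mind. This is only a
rough, untested sketch against 2.6-era fs/xfs/xfs_btree.c: I'm assuming
sblock_ok is still the result of the sanity checks inside
xfs_btree_check_sblock(), and the xfs_force_shutdown() call is the part
I'm proposing, not what the code does today:

	if (unlikely(!sblock_ok)) {
		XFS_ERROR_REPORT("xfs_btree_check_sblock",
				 XFS_ERRLEVEL_LOW, cur->bc_mp);
		/*
		 * Don't just hand EFSCORRUPTED back to the caller and
		 * keep accepting writes: shut the filesystem down so
		 * the admin can examine it before more damage is done.
		 */
		xfs_force_shutdown(cur->bc_mp, SHUTDOWN_CORRUPT_INCORE);
		return XFS_ERROR(EFSCORRUPTED);
	}

As far as I can tell, a forced shutdown with SHUTDOWN_CORRUPT_INCORE
would make subsequent writes fail fast instead of letting pdflush
scribble over a sick volume forever.

--
 d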