public inbox for linux-xfs@vger.kernel.org
From: David Chinner <dgc@sgi.com>
To: Pallai Roland <dap@mail.index.hu>
Cc: David Chinner <dgc@sgi.com>,
	Linux-Raid <linux-raid@vger.kernel.org>,
	xfs@oss.sgi.com
Subject: Re: raid5: I lost a XFS file system due to a minor IDE cable problem
Date: Mon, 28 May 2007 10:30:11 +1000
Message-ID: <20070528003010.GS85884050@sgi.com>
In-Reply-To: <200705251635.36533.dap@mail.index.hu>

On Fri, May 25, 2007 at 04:35:36PM +0200, Pallai Roland wrote:
> 
> On Friday 25 May 2007 06:55:00 David Chinner wrote:
> > Oh, did you look at your logs and find that XFS had spammed them
> > about writes that were failing?
> 
> The first message after the incident:
> 
> May 24 01:53:50 hq kernel: Filesystem "loop1": XFS internal error xfs_btree_check_sblock at line 336 of file fs/xfs/xfs_btree.c.  Caller 0xf8ac14f8
> May 24 01:53:50 hq kernel: <f8adae69> xfs_btree_check_sblock+0x4f/0xc2 [xfs]  <f8ac14f8> xfs_alloc_lookup+0x34e/0x47b [xfs]
> May 24 01:53:50 hq kernel: <f8ac14f8> xfs_alloc_lookup+0x34e/0x47b [xfs]  <f8b1a9c7> kmem_zone_zalloc+0x1b/0x43 [xfs]
> May 24 01:53:50 hq kernel: <f8abe645> xfs_alloc_ag_vextent+0x24d/0x1110 [xfs]  <f8ac0647> xfs_alloc_vextent+0x3bd/0x53b [xfs]
> May 24 01:53:50 hq kernel: <f8ad2f7e> xfs_bmapi+0x1ac4/0x23cd [xfs]  <f8acab97> xfs_bmap_search_multi_extents+0x8e/0xd8 [xfs]
> May 24 01:53:50 hq kernel: <f8b00001> xlog_dealloc_log+0x49/0xea [xfs]  <f8afdaee> xfs_iomap_write_allocate+0x2d9/0x58b [xfs]
> May 24 01:53:50 hq kernel: <f8afc3ae> xfs_iomap+0x60e/0x82d [xfs]  <c0113bc8> __wake_up_common+0x39/0x59
> May 24 01:53:50 hq kernel: <f8b1ae11> xfs_map_blocks+0x39/0x6c [xfs]  <f8b1bd7b> xfs_page_state_convert+0x644/0xf9c [xfs]
> May 24 01:53:50 hq kernel: <c036f384> schedule+0x5d1/0xf4d  <f8b1c780> xfs_vm_writepage+0x0/0xe0 [xfs]
> May 24 01:53:50 hq kernel: <f8b1c7d7> xfs_vm_writepage+0x57/0xe0 [xfs]  <c01830e8> mpage_writepages+0x1fb/0x3bb
> May 24 01:53:50 hq kernel: <c0183020> mpage_writepages+0x133/0x3bb  <f8b1c780> xfs_vm_writepage+0x0/0xe0 [xfs]
> May 24 01:53:50 hq kernel: <c0147bb3> do_writepages+0x35/0x3b  <c018135c> __writeback_single_inode+0x88/0x387
> May 24 01:53:50 hq kernel: <c01819b7> sync_sb_inodes+0x1b4/0x2a8  <c0181c63> writeback_inodes+0x63/0xdc
> May 24 01:53:50 hq kernel: <c0147943> background_writeout+0x66/0x9f  <c01482b3> pdflush+0x0/0x1ad
> May 24 01:53:50 hq kernel: <c01483a2> pdflush+0xef/0x1ad  <c01478dd> background_writeout+0x0/0x9f
> May 24 01:53:50 hq kernel: <c012d10b> kthread+0xc2/0xc6  <c012d049> kthread+0x0/0xc6
> May 24 01:53:50 hq kernel: <c0100dd5> kernel_thread_helper+0x5/0xb
> 
> ...and I've spammed such messages. This "internal error" isn't a good reason to shut down
> the file system?

Actually, that error does shut the filesystem down in most cases. When you
see that output, the function is returning -EFSCORRUPTED. You've got a corrupted
freespace btree.

The reason you get spammed is that this is happening during background
writeback, and there is no one to return the -EFSCORRUPTED error to. The
background writeback path doesn't specifically detect shut down filesystems or
trigger shutdowns on errors, because that happens in different layers, so you
just end up with failed data writes. The same error will occur on the next
foreground data or metadata allocation, and that will shut the filesystem down
at that point.
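
As a rough illustration of the two paths described above, here is a toy
user-space model. It is not the XFS code: struct toy_fs, TOY_EFSCORRUPTED
(with an arbitrary value) and all of the toy_* functions are made up for this
sketch, and the -EIO returned after shutdown is likewise an assumption of the
model.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical error code for this sketch; the value is arbitrary. */
#define TOY_EFSCORRUPTED 990

/* Hypothetical, heavily simplified filesystem state. */
struct toy_fs {
    bool freespace_btree_corrupt; /* the btree check would fail */
    bool shut_down;               /* the filesystem has been shut down */
    int  failed_writes;           /* data writes dropped by writeback */
};

/* Allocation fails with -TOY_EFSCORRUPTED if the freespace btree is bad. */
static int toy_alloc_extent(const struct toy_fs *fs)
{
    assert(fs != NULL);
    return fs->freespace_btree_corrupt ? -TOY_EFSCORRUPTED : 0;
}

/*
 * Background writeback: there is no caller to hand the error to, so the
 * write simply fails and a message is logged. No shutdown is triggered.
 */
static void toy_background_writeback(struct toy_fs *fs)
{
    int error = toy_alloc_extent(fs);

    if (error) {
        fs->failed_writes++;
        fprintf(stderr, "toy: writeback allocation failed (%d)\n", error);
    }
}

/*
 * Foreground allocation: the error goes back to the caller, and a
 * corruption error shuts the filesystem down at this point.
 */
static int toy_foreground_alloc(struct toy_fs *fs)
{
    int error;

    if (fs->shut_down)
        return -EIO;

    error = toy_alloc_extent(fs);
    if (error == -TOY_EFSCORRUPTED)
        fs->shut_down = true;
    return error;
}
```

In this model, calling toy_background_writeback() repeatedly on a toy_fs with
freespace_btree_corrupt set only increments failed_writes and logs a line each
time. The first toy_foreground_alloc() call is what sets shut_down.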

I'm not sure that we should be ignoring EFSCORRUPTED errors here; maybe in
this case we should be shutting down the filesystem.  That would certainly cut
down on the spamming and would not appear to change any other behaviour.

> I think if there's a sign of corrupted file system, the first thing we should do
> is to stop writes (or the entire FS) and let the admin to examine the situation.

Yes, that's *exactly* what a shutdown does. In this case, your writes are
being stopped - hence the error messages - but the filesystem has not yet
been shut down.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group
