From: Stan Hoeppner <stan@hardwarefreak.com>
To: Julien FERRERO <jferrero06@gmail.com>
Cc: xfs@oss.sgi.com
Subject: Re: XFS filesystem corruption
Date: Thu, 07 Mar 2013 07:32:15 -0600
Message-ID: <513896DF.7010609@hardwarefreak.com>
In-Reply-To: <CAPcwv6yAHAsmwgROs12gRtCbqTXBvTPrx8F-e4kYab2YApsobg@mail.gmail.com>

On 3/7/2013 7:04 AM, Julien FERRERO wrote:
>> It may be unrelated to your corruption problem, but I'm curious why you
>> are specifying a 32MB log section instead of letting mkfs.xfs make the
>> log size decision.
> 
> I honestly don't know, the rebuild script was written 8 years ago by an
> engineer who has since left the company.
> 
> Is 32MB too small a log for 1.5 TB of data?

The log journals metadata changes, not file data.  So if you're
capturing one frame of video per file, or 24 or 60 frames per file, and
thus are writing lots of files, 32MB may be too small.  I'm not an
expert here.  Dave C. would be better able to answer this.  But this is
a very minor problem compared to...

> Moreover, the common usage is to power off all the equipment (ours
> included) from a general power switch.

this.  Have the crews been hard cutting power to these XFS boxen for the
8 years you mention above?  And the filesystem corruption and/or
corrupted files are only now cropping up?  That's hard to
believe.  There may be a bug in 2.6.35 that exacerbates this that's been
fixed in later versions--2.6.35 is not a long term stable kernel--odd
that a vendor would choose it for long term use.

If you never had this problem before, I can only guess that previously
you were using hardware RAID controllers with BBWC (battery backed write
cache) whose batteries could keep the cache powered until the next power
on, at which point the controller flushed the cached data to the disks.
If you switched from that solution to non-BBWC RAID, or to Linux
software RAID, that might explain why you're seeing corruption now and
did not previously.  And even with BBWC RAID, hard cutting power to the
system is still not a smart thing to do.

For this kind of environment, if field techs are going to hard cut power
no matter what you tell them, then you simply MUST get LSI (or possibly
other) RAID cards with flash backed write cache.  This doesn't rely on
batteries to hold the data, so the cache contents can sit overnight, or
for days or weeks, without being lost.

-- 
Stan
