public inbox for linux-xfs@vger.kernel.org
From: Martin Steigerwald <ms@teamix.de>
To: Eric Sandeen <sandeen@sandeen.net>
Cc: Timothy Shimmin <tes@sgi.com>, xfs@oss.sgi.com
Subject: Re: Is it possible the check an frozen XFS filesytem to avoid downtime
Date: Mon, 27 Oct 2008 17:57:09 +0100
Message-ID: <200810271757.09915.ms@teamix.de>
In-Reply-To: <487CC1EB.6030100@sandeen.net>

On Tuesday, 15 July 2008, Eric Sandeen wrote:
> Martin Steigerwald wrote:
> > Okay... we recommended that the customer do it the safe way, unmounting the
> > filesystem completely. He did, and the filesystem appears to be intact
> > *phew*. XFS appeared to detect the in-memory corruption early enough.
> >
> > It's a bit strange, however, because we now know that the server sports ECC
> > RAM. Well, we will see what memtest86+ has to say about it.
>
> in-memory corruption could mean, but certainly does not absolutely mean,
> problematic memory.  It could be, and usually is, a plain ol' bug (in
> xfs or elsewhere).

OK, just as a follow-up:

We have now seen similar XFS errors on the second backend server, this time
on a local hardware RAID1. On the first backend server they occurred on
logical volumes on a software RAID spread over two external hardware RAID
boxes at separate locations.

So this appears to be an XFS bug to me. Maybe it corrupts its in-memory
structures when running for a long time. Fortunately, we did not see errors
in the on-disk structures.

A colleague updated the kernel on the inactive backend 1 server from 2.6.21
to the 2.6.26 kernel from backports.org; tomorrow backend 2 will follow.
Let's see whether that solves the issue.

Anyway, it seems to be a hard-to-trigger bug, and before bugging you with
something in kernel 2.6.21, we will at least update to the latest
backports.org kernel.

-- 
Martin Steigerwald - team(ix) GmbH - http://www.teamix.de
gpg: 19E3 8D42 896F D004 08AC A0CA 1E10 C593 0399 AE90
