From: David Chinner <dgc@sgi.com>
To: Ryan Bair <ryandbair@gmail.com>
Cc: linux-kernel@vger.kernel.org, xfs@oss.sgi.com
Subject: Re: Repeated XFS corruption -Corruption of in-memory data detected
Date: Tue, 31 Jul 2007 11:53:00 +1000
Message-ID: <20070731015300.GM31489@sgi.com>
In-Reply-To: <1b0b1fc90707300910va417055mb1e64126c1519c9f@mail.gmail.com>
[cc xfs@oss.sgi.com]
On Mon, Jul 30, 2007 at 12:10:52PM -0400, Ryan Bair wrote:
> Kernel: 2.6.18-4-amd64 (Debian 2.6.18.dfsg.1-12etch2) Debian Etch
> System: Dell PowerEdge 1850
> Processor: 3.2 GHz Intel Xeon w/ microcode v1.14a, Hyperthreading disabled.
> RAM: 2x1GB ECC DDR-400
> RAID Controller: Dell PERC5/E using megaraid driver
>
> I got another unexpected error on my XFS partition today. I was able
> to reboot the system normally and the journal recovered on the
> following mount. Shortly thereafter, the error occurred again. After
> this the filesystem was no longer able to be mounted as the error
> would occur immediately.
>
> The volume is on a 9.5TB LVM2 volume on a Dell MD1000 loaded with 15
> 750GB drives in a RAID5 set. Writeback is disabled. Memtest86+ was run
> on this system for 48 hours without fault. The system is otherwise
> stable.
<sigh>
You're the second person today to report a software RAID5+XFS corruption on
the 2.6.18-4 Debian kernel. Almost the same signature as well - that is a
corrupted free space btree.
> XFS was able to repair the damage, but previously the drive returned
> to its corrupted state within a few hours of heavy I/O.
The other report was a shutdown before corruption got to disk,
so maybe they are different problems.
Can you post the repair output so we can see what the damage was?
Also, can you post your md/dm config so I can see if I can recreate
a similar config?
Also, seeing as the previous report was caught before corruption
got to disk, I suspected memory corruption of some kind. Can
you enable slab, vm and filesystem debugging for your kernel and
run with that?
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group