From: Dave Chinner <david@fromorbit.com>
To: Wayne Walker <wwalker@crossroads.com>
Cc: xfs@oss.sgi.com
Subject: Re: File system corruption
Date: Thu, 25 Oct 2012 09:51:48 +1100
Message-ID: <20121024225148.GA29378@dastard>
In-Reply-To: <50885B60.9050904@crossroads.com>
On Wed, Oct 24, 2012 at 04:19:28PM -0500, Wayne Walker wrote:
> On 10/12/2012 07:14 PM, Dave Chinner wrote:
> <snip>
> >And SB/AGF 3 and 4 are ok, too. So, the filesystem headers just
> >beyond the 2TB offset are zero. That tends to point to a block
> >device problem, as an offset of 2TB is where a 32 bit sector count
> >will overflow (i.e. 2^32). Next step is to run blktrace/blkparse
> >on the cp workload that generates the error to see if anything
> >actually writes to the 2TB offset region, and if so, where it
> >comes from. Probably best to compress the resultant blkparse
> >output file - it might be quite large but the text will compress
> >well. Cheers, Dave.
>
> Dave,
>
> Thank you for your help.
>
> 10 MB .gz file at http://rx-7.bybent.com/blktrace.sde1.out.gz
>
> What I can see seems to have most of the writes are around 2^31.

Which is 2^31 sectors, or just above 1TB, i.e. writing the data into
AG #1. The writes stop a short way into AG #1. The filesystem does
not issue any writes to the AG #2 headers, only a single read. IOWs,
the filesystem is not overwriting its own metadata during the
workload, so that implies a problem at a lower storage layer.
I can't see the filesystem doing anything wrong here.
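
The sector arithmetic above can be checked with a small sketch. It
assumes 512-byte sectors and an allocation group size of exactly 1 TiB;
the AG size is only inferred from this thread (AG #2 headers sitting at
the 2TB offset), and the helper names are hypothetical, not part of any
XFS tool.

```python
SECTOR_SIZE = 512      # bytes per sector (assumed)
AG_SIZE = 1 << 40      # bytes per allocation group (assumed: 1 TiB)


def sector_to_bytes(sector):
    """Convert a sector number, as printed by blkparse, to a byte offset."""
    return sector * SECTOR_SIZE


def ag_for_sector(sector):
    """Return the index of the AG that contains the given sector."""
    return sector_to_bytes(sector) // AG_SIZE


def truncate_to_32bit(sector):
    """Show where a sector number lands if a layer keeps only 32 bits."""
    return sector & 0xFFFFFFFF


if __name__ == "__main__":
    for s in (2**31, 2**32, 2**32 + 8):
        print(s, sector_to_bytes(s), ag_for_sector(s), truncate_to_32bit(s))
```

With these assumptions, sector 2^31 is the first sector of AG #1, and
sector 2^32 (the start of AG #2) wraps to sector 0 in a 32 bit counter.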

FWIW, can you pattern the block device around the 2TB offset and
re-run the test, then use xfs_db to see whether sb 2/agf 2 are zeroed
after the failure occurs? i.e. do something like:
# xfs_io -F -f -c "pwrite 2047g 2g" -c fsync /dev/sde1
then mkfs, run xfs_db to dump sb 2 and agf 2, then run the test and
dump sb 2/agf 2 again after the test. Use the same xfs_db scripts as
the previous time (i.e. including the drop caches commands) if you
can.
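
As a rough companion to the xfs_db dumps, the patterned region could
also be scanned directly. This is only a sketch: it assumes the pwrite
above used xfs_io's default fill byte (0xcd, an assumption worth
checking against your xfs_io version), and classify() and scan() are
hypothetical helpers. The xfs_io command writes 2GiB starting at
2047GiB, i.e. 1GiB either side of the 2TB offset.

```python
GIB = 1 << 30
PATTERN_START = 2047 * GIB   # offset given to "pwrite 2047g 2g"
PATTERN_LEN = 2 * GIB        # length given to "pwrite 2047g 2g"


def classify(buf, fill=0xCD):
    """Label a buffer as all zeros, all fill bytes, or something else."""
    if not buf:
        return "empty"
    if buf.count(0) == len(buf):
        return "zeroed"
    if buf.count(fill) == len(buf):
        return "pattern"
    return "other"  # e.g. real metadata or file data


def scan(path, start, length, chunk=1 << 20):
    """Yield (offset, label) for each chunk of the region [start, start+length)."""
    with open(path, "rb") as f:
        f.seek(start)
        offset, remaining = start, length
        while remaining > 0:
            buf = f.read(min(chunk, remaining))
            if not buf:
                break  # hit the end of the device or file
            yield offset, classify(buf)
            offset += len(buf)
            remaining -= len(buf)
```

Something like scan("/dev/sde1", PATTERN_START, PATTERN_LEN), run as
root after dropping caches, would then show which chunks still hold the
pattern and which have been zeroed.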
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com