From: Eric Sandeen <sandeen@sandeen.net>
To: Sage Weil <sage@inktank.com>
Cc: xfs@oss.sgi.com
Subject: Re: garbage block(s) after powercycle/reboot + sparse writes
Date: Tue, 04 Jun 2013 19:22:33 -0500 [thread overview]
Message-ID: <51AE84C9.5030903@sandeen.net> (raw)
In-Reply-To: <alpine.DEB.2.00.1306041210070.15156@cobra.newdream.net>
On 6/4/13 2:24 PM, Sage Weil wrote:
> I'm observing an interesting data corruption pattern:
>
> - write a bunch of files
> - power cycle the box
I guess this part is important? But I'm wondering why...
> - remount
> - immediately (within 1-2 seconds) write create a file and
a new file, right?
> - write to a lower offset, say offset 430423 len 527614
> - write to a higher offset, say offset 1360810 len 269613
> (there is other random io going to other files too)
>
> - about 5 seconds later, read the whole file and verify content
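[The write/read sequence above can be sketched as a small userspace reproducer -- a hedged sketch of the I/O pattern only, with no power cycle or concurrent load; the offsets and lengths are taken from the report, everything else (path, fill bytes) is hypothetical:]

```python
import os

def sparse_write_check(path, lo_off=430423, lo_len=527614,
                       hi_off=1360810, hi_len=269613):
    """Write two regions into a fresh sparse file, fsync, then read the
    whole file back: verify both regions and report any non-zero bytes
    in the gap between them (which should read back as all zeros)."""
    lo, hi = b"A" * lo_len, b"B" * hi_len
    with open(path, "wb") as f:
        f.seek(lo_off)
        f.write(lo)                      # lower region
        f.seek(hi_off)
        f.write(hi)                      # higher region
        f.flush()
        os.fsync(f.fileno())
    with open(path, "rb") as f:
        data = f.read()
    assert data[lo_off:lo_off + lo_len] == lo   # first region intact
    assert data[hi_off:hi_off + hi_len] == hi   # second region intact
    gap = data[lo_off + lo_len:hi_off]
    # offsets (relative to end of region 1) of any garbage bytes
    return [i for i, b in enumerate(gap) if b != 0]
```

[On a healthy filesystem the returned list is empty; the corruption described here would show up as a non-empty list whose offsets cluster on 4k block boundaries.]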
>
> And what I see:
>
> - the first region is correct, and intact
the lower offset you wrote?
> - the bytes that follow, up until the block boundary, are 0
that's good ;)
> - the next few blocks are *not* zero! (i've observed 1 and 6 4k blocks)
that's bad!
> - then lots of zeros, up until the second region, which appears intact.
the lot-of-zeros are probably holes?
What does xfs_bmap -vvp <filename> say about the file in question?
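[For reference, the hole/data layout that xfs_bmap reports can be roughly approximated from userspace with lseek(SEEK_DATA/SEEK_HOLE) -- a portable stand-in, not the real extent map, assuming a Linux filesystem that reports holes (XFS does):]

```python
import os

def map_segments(path):
    """Walk a file with lseek(SEEK_DATA/SEEK_HOLE) and return a list of
    ('data'|'hole', start, end) tuples covering [0, size)."""
    segs, size = [], os.path.getsize(path)
    fd = os.open(path, os.O_RDONLY)
    try:
        pos = 0
        while pos < size:
            try:
                data = os.lseek(fd, pos, os.SEEK_DATA)
            except OSError:                     # ENXIO: only a hole remains
                segs.append(("hole", pos, size))
                break
            if data > pos:
                segs.append(("hole", pos, data))
            hole = os.lseek(fd, data, os.SEEK_HOLE)
            segs.append(("data", data, hole))
            pos = hole
    finally:
        os.close(fd)
    return segs
```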
> I'm pretty reliably hitting this, and have reproduced it twice now and
> found the above consistent pattern (but different filenames, different
> offsets). What I haven't yet confirmed is whether the file was written at
> all prior to the powercycle, since that tends to blow away the last
> bit of the ceph logs, too. I'm adding some additional checks to see
> whether the file is in fact new when the first extent is written.
>
> The other possibly interesting thing is the offsets. The garbage regions
> I saw were
>
> 0xea000 - 0xf0000
234-240 4k blocks
> 0xff000 - 0x100000
255-256 4k blocks *shrug*
Is this what you saw w/ the write offsets & sizes you specified above?
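[The block arithmetic above, spelled out -- these are just the hex byte offsets divided by the assumed 4 KiB block size:]

```python
BLOCK = 4096  # 4 KiB filesystem block size

for lo, hi in [(0xea000, 0xf0000), (0xff000, 0x100000)]:
    print(f"{lo:#x}-{hi:#x} -> 4k blocks {lo // BLOCK}-{hi // BLOCK}")
# 0xea000-0xf0000 -> 4k blocks 234-240
# 0xff000-0x100000 -> 4k blocks 255-256
```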
I'm wondering if this could possibly have to do w/ speculative preallocation
on the file somehow exposing these blocks? But that's just handwaving.
-Eric
>
> Does this failure pattern look familiar to anyone? I'm pretty sure it is
> new in 3.9, which we switched over to right around the time when this
> started happening. I'm confirming that as well, but just wanted to see if
> this is ringing any bells...
>
> Thanks!
> sage
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
Thread overview: 9 messages
2013-06-04 19:24 garbage block(s) after powercycle/reboot + sparse writes Sage Weil
2013-06-04 20:00 ` Ben Myers
2013-06-05 0:22 ` Eric Sandeen [this message]
2013-06-12 17:02 ` Sage Weil
2013-06-19 1:46 ` Dave Chinner
2013-06-19 3:12 ` Sage Weil
2013-06-19 4:05 ` Dave Chinner
2013-06-19 4:15 ` Sage Weil
2013-06-19 5:18 ` Dave Chinner