From: Stan Hoeppner <stan@hardwarefreak.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Julien FERRERO <jferrero06@gmail.com>,
Ric Wheeler <rwheeler@redhat.com>,
xfs@oss.sgi.com
Subject: Re: XFS filesystem corruption
Date: Sun, 10 Mar 2013 18:54:57 -0500
Message-ID: <513D1D51.7010905@hardwarefreak.com>
In-Reply-To: <20130310224536.GK23616@dastard>
On 3/10/2013 5:45 PM, Dave Chinner wrote:
> On Sat, Mar 09, 2013 at 12:51:25PM -0600, Stan Hoeppner wrote:
>> On 3/9/2013 3:11 AM, Dave Chinner wrote:
>>> On Fri, Mar 08, 2013 at 12:59:22PM -0600, Stan Hoeppner wrote:
>>>> On 3/8/2013 6:20 AM, Ric Wheeler wrote:
>>>>>> Something that none of us mentioned WRT write barriers is that while the
>>>>>> filesystem structure may avoid corruption when the power is cut, files
>>>>>> may still be corrupted, in conditions such as any/all of these:
>>>>
>>>> I made it very clear I was discussing file corruption here, not
>>>> filesystem corruption. You already covered that base. I was
>>>> specifically addressing the fact that XFS performs barriers on metadata
>>>> writes but not file data writes.
>>>
>>> Actually, you're not correct there, either, Stan. ;)
>>
>> With "either" you're implying I was incorrect twice, and I wasn't, not
>> in whole anyway, maybe in part. ;)
>
> The "either" was in reference to you correcting someone else...
I wasn't attempting to correct Ric on the technicals, as that's simply
not really possible, me being a user talking to a dev. That would be
really presumptuous on my part, not to mention dumb. I had made a point
about file data corruption, and he replied talking about metadata
corruption. My "correction" was simply to clarify I was talking about
file data not metadata.
>>> XFS only issues cache flushes/FUA writes for log IO. Metadata IO is
>>> done exactly the same way that data IO is done - without barriers.
>>> It's because metadata lost in drive caches at the time of a crash is
>>> rewritten by journal replay that filesystem corruption does not
>>> occur.
>>
>> Technical semantics. Geez, give the non dev a break now and then. ;)
>
> It's the technical semantics that matter when it comes to behaviour
> at power loss. That's why I pick on "technical semantics" - it
> makes your analysis and understanding of problems better, and that
> means there's less for me to do in future ;)
I do my best to grab the low hanging fruit when I can so you guys can
concentrate on more important stuff.
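
A toy model of the point Dave makes above may help readers following along. It is only a sketch under loud assumptions: the class names `ToyDisk` and `ToyJournaledFS` and the key `inode42.size` are invented for illustration, and nothing here is XFS code. It shows just the ordering described in the quote: log writes are flushed to stable storage, in-place metadata writes are not, and replay rewrites whatever metadata was lost in the drive cache.

```python
class ToyDisk:
    """A disk with a volatile write cache in front of stable storage."""

    def __init__(self):
        self.stable = {}  # survives power loss
        self.cache = {}   # lost on power loss

    def write(self, key, value):
        # Ordinary write: lands in the volatile cache only.
        self.cache[key] = value

    def flush(self):
        # Cache flush: everything cached so far reaches stable storage.
        self.stable.update(self.cache)
        self.cache.clear()

    def power_loss(self):
        # Anything still in the cache is gone.
        self.cache.clear()


class ToyJournaledFS:
    """Hypothetical filesystem that flushes log writes only."""

    def __init__(self, disk):
        self.disk = disk
        self.log_seq = 0

    def update_metadata(self, name, value):
        # 1. Write the log record and flush it (the "barrier" step).
        self.disk.write(("log", self.log_seq), (name, value))
        self.disk.flush()
        self.log_seq += 1
        # 2. Write the metadata in place WITHOUT a flush, like data IO.
        self.disk.write(("meta", name), value)

    def replay(self):
        # Re-apply every log record found on stable storage, in order.
        records = sorted(
            (key[1], rec) for key, rec in self.disk.stable.items()
            if key[0] == "log"
        )
        for _, (name, value) in records:
            self.disk.write(("meta", name), value)
        self.disk.flush()


disk = ToyDisk()
fs = ToyJournaledFS(disk)
fs.update_metadata("inode42.size", 4096)
disk.power_loss()
# The in-place metadata write was still in the cache, so it was lost.
lost = ("meta", "inode42.size") not in disk.stable
fs.replay()
recovered = disk.stable[("meta", "inode42.size")]
```

In this model the individual metadata write is never protected on its own. Only the log record is, and replay is what makes the metadata whole again after the crash.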
>> Does everyone remember the transitive property of equality from math
>> class decades ago? It states "If A=B and B=C then A=C". Thus if
>> barrier writes to the journal protect the journal, and the journal
>> protects metadata, then barrier writes to the journal protect metadata.
>
> Yup, but the devil is in the detail - we don't protect individual
> metadata writes at all and that difference is significant enough to
> comment on.... :P
Elaborate on this a bit, if you have time. I was under the impression
that all directory updates were journaled first.
>> I had a detail incorrect, but not the big picture. And I'd bet the OP
>> is more interested in the big picture. So surely I'd get a B or a C
>> here, but certainly not an F.
>
> Certainly a B+ - like I said, I'm being picky because you seem to
> understand the details once explained... :)
Usually. ;) Sometimes it takes a couple of sessions before it fully
sinks in. I must say I've learned a tremendous amount from the devs on
this list, and I'm grateful that you specifically, Dave, have taken the
time to 'tutor' me, and others, over the last couple of years.
>>> As it is, if the application uses direct IO (likely, as it
>>> sounds like video capture/editing/playout here) then log IO
>>> will also ensure that the data written by the app is on disk (i.e.
>>> that's the mechanism by which fsync works).
>>
>> So this would be an interesting upside down case for XFS, as the file
>> data may be intact, but the filesystem gets corrupted, the opposite of
>> the design point.
>
> Well, if barriers are working correctly, then there won't be any
> filesystem corruption, either...
Ok, see, this is the odd part here. The OP didn't seem to have this
metadata corruption issue with the old 2.6.18 kernel, at least I think
that's the one he mentioned. Then he switched to 2.6.35. IIRC there
were a number of commits around that time and some regressions. I also
recall 2.6.35 is not a long term stable kernel. I'd guess there were
reasons for that. So, I'm wondering if there was a bug/regression
relating to XFS metadata in 2.6.35 corrected in .36 or later and simply
not backported. Seems to ring a bell, vaguely. I have no idea
where/how to search for such information.
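
On the fsync point quoted above, here is a minimal sketch of what the application side might look like. It assumes a Python application, and the helper name `durable_write` and the file name `clip.raw` are made up for illustration. It uses ordinary buffered writes followed by `os.fsync()` rather than direct IO, because O_DIRECT adds buffer alignment requirements that would obscure the point. Whether the data also gets past the drive's volatile cache still depends on the filesystem issuing cache flushes, i.e. on barriers working as discussed in this thread.

```python
import os
import tempfile


def durable_write(path, payload):
    """Write payload to path and return only after fsync() has succeeded."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        offset = 0
        while offset < len(payload):
            # os.write may write fewer bytes than requested, so loop.
            offset += os.write(fd, payload[offset:])
        # Ask the kernel to push this file's data and metadata to the device.
        os.fsync(fd)
    finally:
        os.close(fd)


# Demo on a scratch directory (hypothetical file name).
scratch_dir = tempfile.mkdtemp()
clip_path = os.path.join(scratch_dir, "clip.raw")
durable_write(clip_path, b"frame-data" * 1000)
clip_size = os.path.getsize(clip_path)
```

An application that skips the `os.fsync()` call has no guarantee its file data is on disk at power loss, even when the filesystem structure itself comes back clean.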
>>>>> Also, if there are active writers, this is inherently racy. A better
>>>>> script would unmount the file systems :)
>>>>
>>>> Yes, a umount would be even better.
>>>
>>> Change the bios so that the power button does not cause a power down
>>> so the OS can capture the button event and trigger an orderly
>>> shutdown.
>>
>> Dare I say "Dave you're incorrect". ;)
>
> Heh. Not so much incorrect as "unaware of the entire scope". I
> browsed the thread and didn't pick up on this little detail...
I know. That was a bit of a cheap shot, hence the judicious use of
quotes and winkies. ;) I knew you'd missed it or you'd not have
mentioned the ACPI soft power switch option.
--
Stan