From: Stan Hoeppner <stan@hardwarefreak.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Julien FERRERO <jferrero06@gmail.com>,
Ric Wheeler <rwheeler@redhat.com>,
xfs@oss.sgi.com
Subject: Re: XFS filesystem corruption
Date: Sun, 10 Mar 2013 18:54:57 -0500
Message-ID: <513D1D51.7010905@hardwarefreak.com>
In-Reply-To: <20130310224536.GK23616@dastard>

On 3/10/2013 5:45 PM, Dave Chinner wrote:
> On Sat, Mar 09, 2013 at 12:51:25PM -0600, Stan Hoeppner wrote:
>> On 3/9/2013 3:11 AM, Dave Chinner wrote:
>>> On Fri, Mar 08, 2013 at 12:59:22PM -0600, Stan Hoeppner wrote:
>>>> On 3/8/2013 6:20 AM, Ric Wheeler wrote:
>>>>>> Something that none of us mentioned WRT write barriers is that while the
>>>>>> filesystem structure may avoid corruption when the power is cut, files
>>>>>> may still be corrupted, in conditions such as any/all of these:
>>>>
>>>> I made it very clear I was discussing file corruption here, not
>>>> filesystem corruption. You already covered that base. I was
>>>> specifically addressing the fact that XFS performs barriers on metadata
>>>> writes but not file data writes.
>>>
>>> Actually, you're not correct there, either, Stan. ;)
>>
>> With "either" you're implying I was incorrect twice, and I wasn't, not
>> in whole anyway, maybe in part. ;)
>
> The "either" was in reference to you correcting someone else...

I wasn't attempting to correct Ric on the technicals, as that's simply
not really possible, me being a user talking to a dev. That would be
really presumptuous on my part, not to mention dumb. I had made a point
about file data corruption, and he replied talking about metadata
corruption. My "correction" was simply to clarify I was talking about
file data not metadata.

>>> XFS only issues cache flushes/FUA writes for log IO. Metadata IO is
>>> done exactly the same way that data IO is done - without barriers.
>>> It's because metadata lost in drive caches at the time of a crash is
>>> rewritten by journal replay that filesystem corruption does not
>>> occur.
>>
>> Technical semantics. Geeze, give the non dev a break now and then. ;)
>
> It's the technical semantics that matter when it comes to behaviour
> at power loss. That's why I pick on "technical semantics" - it
> makes your analysis and understanding of problems better, and that
> means there's less for me to do in future ;)

I do my best to grab the low hanging fruit when I can so you guys can
concentrate on more important stuff.

>> Does everyone remember the transitive property of equality from math
>> class decades ago? It states "If A=B and B=C then A=C". Thus if
>> barrier writes to the journal protect the journal, and the journal
>> protects metadata, then barrier writes to the journal protect metadata.
>
> Yup, but the devil is in the detail - we don't protect individual
> metadata writes at all and that difference is significant enough to
> comment on.... :P

Elaborate on this a bit, if you have time. I was under the impression
that all directory updates were journaled first.

>> I had a detail incorrect, but not the big picture. And I'd bet the OP
>> is more interested in the big picture. So surely I'd get a B or a C
>> here, but certainly not an F.
>
> Certainly a B+ - like I said, I'm being picky because you seem to
> understand the details once explained... :)

Usually. ;) Sometimes it takes a couple of sessions before it fully
sinks in. I must say I've learned a tremendous amount from the devs on
this list, and I'm grateful that you specifically, Dave, have taken the
time to 'tutor' me, and others, over the last couple of years.

>>> As it is, if the application uses direct IO (likely, as it
>>> sounds like video capture/editing/playout here) then log IO
>>> will also ensure that the data written by the app is on disk (i.e.
>>> that's the mechanism by which fsync works).
>>
>> So this would be an interesting upside down case for XFS, as the file
>> data may be intact, but the filesystem gets corrupted, the opposite of
>> the design point.
>
> Well, if barriers are working correctly, then there won't be any
> filesystem corruption, either...

Ok, see, this is the odd part here. The OP didn't seem to have this
metadata corruption issue with the old 2.6.18 kernel, at least I think
that's the one he mentioned. Then he switched to 2.6.35. IIRC there
were a number of commits around that time and some regressions. I also
recall 2.6.35 is not a long term stable kernel. I'd guess there were
reasons for that. So, I'm wondering if there was a bug/regression
relating to XFS metadata in 2.6.35 corrected in .36 or later and simply
not backported. Seems to ring a bell, vaguely. I have no idea
where/how to search for such information.

>>>>> Also, if there are active writers, this is inherently racy. A better
>>>>> script would unmount the file systems :)
>>>>
>>>> Yes, a umount would be even better.
>>>
>>> Change the bios so that the power button does not cause a power down
>>> so the OS can capture the button event and trigger an orderly
>>> shutdown.
>>
>> Dare I say "Dave you're incorrect". ;)
>
> Heh. Not so much incorrect as "unaware of the entire scope". I
> browsed the thread and didn't pick up on this little detail...

I know. That was a bit of a cheap shot, hence the judicious use of
quotes and winkies. ;) I knew you'd missed it or you'd not have
mentioned the ACPI soft power switch option.
--
Stan
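
[Editorial illustration, not part of the original message.] The quoted
point that fsync is the mechanism by which an application gets its file
data onto disk can be sketched roughly as below. This is a minimal
sketch under stated assumptions: it uses ordinary buffered writes
followed by os.fsync() rather than direct IO, the helper name
durable_write and the file name are hypothetical, and it assumes a
POSIX system such as Linux where a directory can be opened and fsynced.
Whether the data really survives a power cut still depends on cache
flushes working all the way down the storage stack, which is the
subject of this thread.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Write data to path, then ask the kernel to flush it to stable storage.

    Hypothetical helper for illustration only; not taken from the thread.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Request that the file's data and metadata reach the device.
        os.fsync(fd)
    finally:
        os.close(fd)

    # Also fsync the containing directory so the new directory entry
    # is requested to be durable (assumes a POSIX system).
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "capture.bin")  # hypothetical file name
        durable_write(target, b"frame data")
        print(os.path.getsize(target))
```

An application that skips the fsync step has only handed its data to
the page cache, so a power cut can lose file contents even when the
filesystem structure itself replays cleanly from the journal.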