From: Eric Sandeen <sandeen@sandeen.net>
To: Dave Chinner <david@fromorbit.com>
Cc: Eric Sandeen <sandeen@redhat.com>, xfs-oss <xfs@oss.sgi.com>
Subject: Re: [PATCH, RFC] xfs: add heuristic to flush on rename
Date: Sun, 27 Apr 2014 19:20:09 -0500
Message-ID: <535D9EB9.50902@sandeen.net>
In-Reply-To: <20140427231523.GZ18672@dastard>

On 4/27/14, 6:15 PM, Dave Chinner wrote:
> On Sun, Apr 27, 2014 at 04:56:07PM -0500, Eric Sandeen wrote:
>> On 4/27/14, 4:20 PM, Dave Chinner wrote:
>>> On Fri, Apr 25, 2014 at 02:42:21PM -0500, Eric Sandeen wrote:
>>>> Add a heuristic to flush data to a file which looks like it's
>>>> going through a tmpfile/rename dance, but not fsynced.
>>>>
>>>> I had a report of a system with many 0-length files after
>>>> package updates; as it turns out, the user had basically
>>>> done 'yum update' and punched the power button when it was
>>>> done.
>>>
>>> So yum didn't run sync() on completion of the update? That seems
>>> rather dangerous to me - IMO system updates need to be guaranteed to
>>> be stable by the update mechanisms, not to leave the system state to
>>> chance if power fails or the system crashes immediately after an
>>> update...
>>>
>>>
>>>> Granted, the admin should not do this. Granted, the package
>>>> manager should ensure persistence of files it updated.
>>>
>>> Yes, yes it should. Problem solved without needing to touch XFS.
>>
>> Right, I first suggested it 5 years or so ago for RPM. But hey, who
>> knows, someday maybe.
>
> grrrrr.
>
>> So no need to touch XFS, just every godawful userspace app out there...
>>
>> Somebody should bring up the topic to a wider audience, I'm sure they'll
>> all get fixed in short order. Wait, or did we try that already? :)
>
> I'm not talking about any random application. Package managers are
> *CRITICAL SYSTEM INFRASTRUCTURE*. They should be architected to
> handle failures gracefully; following *basic data integrity rules*
> is a non-negotiable requirement for a system upgrade procedure.
> Leaving the system in an indeterminate and potentially inoperable
> state after a successful upgrade completion is reported is a
> completely unacceptable outcome for any system management operation.
>
> Critical infrastructure needs to Do Things Right, not require other
> people to hack around its failings and hope that they might be able
> to save the system when shit goes wrong. There is no excuse for
> critical infrastructure developers failing to acknowledge and
> address the data integrity requirements of their infrastructure.

Yeah, I know - choir, preaching, etc.

>>>> Ext4, however, added a heuristic like this for just this case;
>>>> someone who writes file.tmp, then renames over file, but
>>>> never issues an fsync.
>>>
>>> You mean like rsync does all the time for every file it copies?
>>
>> Yeah, I guess rsync doesn't fsync either. ;)
>
> That's because rsync doesn't need to sync until it completes all of
> the data writes. A failed
> rsync can simply be re-run after the system comes back up and
> nothing is lost. That's a very different situation to a package
> manager replacing binaries that the system may need to boot, yes?

Yeah, my point is that rsync overwrites existing files and _never_ syncs.
Not per-file, not at the end, not with any available option, AFAICT.
Different situation, yes, but arguably just as bad under the
wrong circumstances.

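
To make the "sync once at the end" idea from the exchange above concrete, here is a minimal sketch. It is not rsync's or yum's actual code; the function name copy_files_then_sync and the ".name.tmp" naming scheme are made up for illustration. It copies each file through a temporary name and a rename, skips any per-file fsync, and relies on a single os.sync() after the whole batch. Until that final call returns, a crash can still leave renamed files with missing data, which is the window the thread is arguing about.

```python
import os
import shutil


def copy_files_then_sync(src_dir, dst_dir):
    """Copy regular files from src_dir to dst_dir, flushing only once.

    Hypothetical helper for illustration. Nothing is guaranteed to be
    on stable storage until the os.sync() at the end has returned.
    """
    os.makedirs(dst_dir, exist_ok=True)
    copied = []
    for name in sorted(os.listdir(src_dir)):
        src = os.path.join(src_dir, name)
        if not os.path.isfile(src):
            continue
        # Write under a temporary name, then rename over the target.
        tmp = os.path.join(dst_dir, "." + name + ".tmp")
        shutil.copyfile(src, tmp)
        os.replace(tmp, os.path.join(dst_dir, name))
        copied.append(name)
    # One system-wide flush for the whole batch (Unix only).
    os.sync()
    return copied
```

The trade-off is the one discussed above: the batch runs at full speed, but the caller must not report success before the final sync has completed.
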
>>>> Now, this does smack of O_PONIES, but I would hope that it's
>>>> fairly benign. If someone already synced the tmpfile, it's
>>>> a no-op.
>>>
>>> I'd suggest it will greatly impact rsync speed and have impact on
>>> the resultant filesystem layout as it guarantees interleaving of
>>> metadata and data on disk....
>>
>> Ok, well, based on the responses thus far, sounds like a non-starter.
>>
>> I'm not wedded to it, just thought I'd float the idea.
>>
>> OTOH, it is an interesting juxtaposition to say the open O_TRUNC case
>> is worth catching, but the tempfile overwrite case is not.
>
> We went through this years ago - the O_TRUNC case is dealing with
> direct overwrite of data which we can reliably detect, usually only
> occurs one file at a time, has no major performance impact and data
> loss is almost entirely mitigated by the flush-on-close behaviour.
> It's a pretty reliable mitigation mechanism.

[citation needed] for some of that, but *shrug*

> Rename often involves many files (so much larger writeback delay on
> async flush), it has cases we can't catch (e.g. rename of a
> directory containing unsynced data files) and has much more
> unpredictable behaviour (e.g. rename of files being actively written
> to). There's nothing worse than having unpredictable/non-repeatable
> data loss scenarios - if we can't handle all rename cases with the
> same guarantees, then we shouldn't provide any data integrity
> guarantees at all.

Ok, so it's a NAK.

I'm over it already,
-Eric

> Cheers,
>
> Dave.
>
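
For reference, the userspace side of the tmpfile/rename pattern discussed in this message can be sketched as follows. This is an illustration under stated assumptions, not code from the patch or from any package manager: the helper name replace_file_durably and the ".tmp" suffix are hypothetical, and fsync on a directory descriptor is assumed to behave as it does on Linux. The point is that the application fsyncs the temporary file before the rename, so no filesystem heuristic is needed.

```python
import os


def replace_file_durably(path, data):
    """Replace path with data via write-tmp, fsync, rename, fsync-dir.

    Hypothetical helper for illustration; assumes a POSIX system where
    a directory can be opened read-only and passed to fsync (as on Linux).
    """
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()              # push Python's buffer to the kernel
        os.fsync(f.fileno())   # push the file data to stable storage
    # Atomically swap the new file into place.
    os.replace(tmp, path)
    # Persist the rename itself by syncing the containing directory.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Skipping the os.fsync() on the temporary file is the case the proposed heuristic was meant to catch: the rename can reach disk before the data does, leaving a 0-length file after a power loss.
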
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs