linux-xfs.vger.kernel.org archive mirror
From: Dave Chinner <david@fromorbit.com>
To: Eric Sandeen <sandeen@sandeen.net>
Cc: Eric Sandeen <sandeen@redhat.com>, xfs-oss <xfs@oss.sgi.com>
Subject: Re: [PATCH, RFC] xfs: add heuristic to flush on rename
Date: Mon, 28 Apr 2014 09:15:23 +1000
Message-ID: <20140427231523.GZ18672@dastard>
In-Reply-To: <535D7CF7.2070409@sandeen.net>

On Sun, Apr 27, 2014 at 04:56:07PM -0500, Eric Sandeen wrote:
> On 4/27/14, 4:20 PM, Dave Chinner wrote:
> > On Fri, Apr 25, 2014 at 02:42:21PM -0500, Eric Sandeen wrote:
> >> Add a heuristic to flush data to a file which looks like it's
> >> going through a tmpfile/rename dance, but not fsynced.
> >>
> >> I had a report of a system with many 0-length files after
> >> package updates; as it turns out, the user had basically
> >> done 'yum update' and punched the power button when it was
> >> done.
> > 
> > So yum didn't run sync() on completion of the update? That seems
> > rather dangerous to me - IMO system updates need to be guaranteed to
> > be stable by the update mechanisms, not to leave the system state to
> > chance if power fails or the system crashes immediately after an
> > update...
> > 
> > 
> >> Granted, the admin should not do this.  Granted, the package
> >> manager should ensure persistence of files it updated.
> > 
> > Yes, yes it should. Problem solved without needing to touch XFS.
> 
> Right, I first suggested it 5 years or so ago for RPM.  But hey, who
> knows, someday maybe.

grrrrr.

> So no need to touch XFS, just every godawful userspace app out there...
> 
> Somebody should bring up the topic to wider audience, I'm sure they'll
> all get fixed in short order.  Wait, or did we try that already?  :)

I'm not talking about any random application. Package managers are
*CRITICAL SYSTEM INFRASTRUCTURE*. They should be architected to
handle failures gracefully; following *basic data integrity rules*
is a non-negotiable requirement for a system upgrade procedure.
Leaving the system in an indeterminate and potentially inoperable
state after a successful upgrade completion is reported is a
completely unacceptable outcome for any system management operation.

Critical infrastructure needs to Do Things Right, not require other
people to hack around its failings and hope that they might be able
to save the system when shit goes wrong.  There is no excuse for
critical infrastructure developers failing to acknowledge and
address the data integrity requirements of their infrastructure.
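
For illustration, the "basic data integrity rules" for the
tmpfile/rename pattern are commonly implemented along the lines
below. This is only a sketch under assumptions, not code from yum,
RPM or XFS: replace_file_durably() and write_all() are hypothetical
names, the temporary file is assumed to live in the same directory as
the target, and error handling is kept minimal.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Write the whole buffer, retrying on short writes and EINTR. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Hypothetical helper: replace dir/name with new contents so that,
 * after a crash, the name refers to either the old or the new data.
 * Returns 0 only once the data and the rename have been flushed.
 */
int replace_file_durably(const char *dir, const char *name,
                         const char *buf, size_t len)
{
    char tmp[256];
    int dfd, fd, ret = -1;

    if (snprintf(tmp, sizeof(tmp), "%s.tmp", name) >= (int)sizeof(tmp))
        return -1;

    dfd = open(dir, O_RDONLY | O_DIRECTORY);
    if (dfd < 0)
        return -1;

    fd = openat(dfd, tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        goto out_dir;

    /* Flush the new data before the rename can expose it. */
    if (write_all(fd, buf, len) < 0 || fsync(fd) < 0) {
        close(fd);
        goto out_unlink;
    }
    if (close(fd) < 0)
        goto out_unlink;

    if (renameat(dfd, tmp, dfd, name) < 0)
        goto out_unlink;

    /* Flush the directory so the rename itself is on stable storage. */
    if (fsync(dfd) == 0)
        ret = 0;
    goto out_dir;

out_unlink:
    unlinkat(dfd, tmp, 0);
out_dir:
    close(dfd);
    return ret;
}
```

The fsync() on the temporary file is the step that the applications
discussed in this thread skip. Without it, the rename can reach disk
before the data does, which is how 0-length files appear after a
power loss.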

> >> Ext4, however, added a heuristic like this for just this case;
> >> someone who writes file.tmp, then renames over file, but
> >> never issues an fsync.
> > 
> > You mean like rsync does all the time for every file it copies?
> 
> Yeah, I guess rsync doesn't fsync either.  ;)

That's because rsync doesn't need to sync until it completes all of
the data writes. A failed rsync can simply be re-run after the system
comes back up and nothing is lost. That's a very different situation
to a package manager replacing binaries that the system may need to
boot, yes?
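
The deferred approach described above, where nothing is flushed until
every file has been written, might look roughly like the sketch
below. It is not taken from rsync: stage_file() and
copy_batch_then_sync() are hypothetical names, and a system-wide
sync() is assumed to be an acceptable way to flush at the end.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write one file via tmpfile + rename, no fsync. */
static int stage_file(const char *dir, const char *name, const char *data)
{
    char tmp[512], dst[512];
    size_t len = strlen(data);
    int fd;

    if (snprintf(dst, sizeof(dst), "%s/%s", dir, name) >= (int)sizeof(dst) ||
        snprintf(tmp, sizeof(tmp), "%s/.%s.tmp", dir, name) >= (int)sizeof(tmp))
        return -1;

    fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* A short write is treated as a failure to keep the sketch small. */
    if (write(fd, data, len) != (ssize_t)len) {
        close(fd);
        unlink(tmp);
        return -1;
    }
    if (close(fd) < 0 || rename(tmp, dst) < 0) {
        unlink(tmp);
        return -1;
    }
    return 0;
}

/*
 * Write every file first, then flush once. Success is reported only
 * after the flush, so a crash before that point just means the whole
 * batch has to be re-run.
 */
int copy_batch_then_sync(const char *dir, const char *const *names,
                         const char *const *data, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (stage_file(dir, names[i], data[i]) < 0)
            return -1;

    sync(); /* one flush covers all the data and all the renames */
    return 0;
}
```

Deferring the flush lets the filesystem batch writeback across the
whole set instead of interleaving data and metadata per file. It is
only safe when re-running the batch after a crash is acceptable.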

> >> Now, this does smack of O_PONIES, but I would hope that it's
> >> fairly benign.  If someone already synced the tmpfile, it's
> >> a no-op.
> > 
> > I'd suggest it will greatly impact rsync speed and have impact on
> > the resultant filesystem layout as it guarantees interleaving of
> > metadata and data on disk....
> 
> Ok, well, based on the responses thus far, sounds like a non-starter.
> 
> I'm not wedded to it, just thought I'd float the idea.
> 
> OTOH, it is an interesting juxtaposition to say the open O_TRUNC case
> is worth catching, but the tempfile overwrite case is not.

We went through this years ago - the O_TRUNC case deals with a
direct overwrite of data that we can reliably detect. It usually
occurs only one file at a time, has no major performance impact, and
data loss is almost entirely mitigated by the flush-on-close
behaviour. It's a pretty reliable mitigation mechanism.

Rename often involves many files (so much larger writeback delay on
async flush), it has cases we can't catch (e.g. rename of a
directory containing unsynced data files) and has much more
unpredictable behaviour (e.g. rename of files being actively written
to). There's nothing worse than having unpredictable/non-repeatable
data loss scenarios - if we can't handle all rename cases with the
same guarantees, then we shouldn't provide any data integrity
guarantees at all.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

Thread overview: 9+ messages
2014-04-25 19:42 [PATCH, RFC] xfs: add heuristic to flush on rename Eric Sandeen
2014-04-25 19:55 ` Christoph Hellwig
2014-04-25 19:59   ` Eric Sandeen
2014-04-25 20:00 ` Eric Sandeen
2014-04-27 21:20 ` Dave Chinner
2014-04-27 21:56   ` Eric Sandeen
2014-04-27 23:15     ` Dave Chinner [this message]
2014-04-28  0:20       ` Eric Sandeen
2014-04-28  0:48         ` Dave Chinner
