From: Dave Chinner <david@fromorbit.com>
To: Romain Le Disez <romain.le-disez@corp.ovh.com>
Cc: "linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>
Subject: Re: [QUESTION] multiple fsync() vs single sync()
Date: Wed, 17 Oct 2018 12:16:24 +1100
Message-ID: <20181017011624.GB6311@dastard>
In-Reply-To: <6A65F394-C1BA-4339-AC9B-051885D12F65@corp.ovh.com>
On Tue, Oct 16, 2018 at 10:22:18AM +0000, Romain Le Disez wrote:
> Hi all,
>
> In this pseudo-code (extracted from OpenStack Swift [1]):
> fd=open("/tmp/tempfile", O_CREAT | O_WRONLY);
> write(fd, ...);
> fsetxattr(fd, ...);
> fsync(fd);
> rename("/tmp/tempfile", "/data/foobar");
> dirfd = open("/data", O_DIRECTORY | O_RDONLY);
> fsync(dirfd);
>
> OR (the same without temporary file):
> fd=open("/data", O_TMPFILE | O_WRONLY);
> write(fd, ...);
> fsetxattr(fd, ...);
> fsync(fd);
> linkat(AT_FDCWD, "/proc/self/fd/" + fd, AT_FDCWD, "/data/foobar", AT_SYMLINK_FOLLOW);
> /* or: linkat(fd, "", AT_FDCWD, "/data/foobar", AT_EMPTY_PATH); */
> dirfd = open("/data", O_DIRECTORY | O_RDONLY);
> fsync(dirfd);
> I’m guaranteed that, whatever happens, I’ll have a
> complete file (data+xattr) or no file at all in the directory
> /data.
Yes.
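
For reference, the first sequence above could be written out roughly as follows. This is only a sketch, not Swift's actual code: publish_file() and write_all() are hypothetical helper names, the temporary file is created inside the destination directory (an assumption made so that the rename can never cross filesystems, unlike the /tmp path in the pseudo-code), and the xattr is optional so the helper also works where user xattrs are unsupported.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/xattr.h>
#include <unistd.h>

/* Write the whole buffer, looping over short writes. 0 on success, -1 on error. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * publish_file() is a hypothetical helper for illustration only.
 * It creates dir/name with the given contents and, optionally, one xattr
 * (pass xname == NULL to skip it). The temporary file lives in dir itself.
 * Returns 0 on success, -1 on error with errno set.
 */
static int publish_file(const char *dir, const char *name,
                        const char *data, size_t len,
                        const char *xname, const char *xval)
{
    char tmp[256];
    int dirfd, fd, saved, renamed = 0, rc = -1;

    assert(dir != NULL && name != NULL && data != NULL);
    if (snprintf(tmp, sizeof(tmp), ".%s.%ld.tmp", name, (long)getpid())
            >= (int)sizeof(tmp)) {
        errno = ENAMETOOLONG;
        return -1;
    }
    dirfd = open(dir, O_DIRECTORY | O_RDONLY);
    if (dirfd < 0)
        return -1;
    fd = openat(dirfd, tmp, O_CREAT | O_EXCL | O_WRONLY, 0644);
    if (fd < 0)
        goto out_dir;

    if (write_all(fd, data, len) < 0)
        goto out;
    if (xname != NULL && fsetxattr(fd, xname, xval, strlen(xval), 0) < 0)
        goto out;
    if (fsync(fd) < 0)                  /* file data + xattr made stable */
        goto out;
    if (renameat(dirfd, tmp, dirfd, name) < 0)
        goto out;
    renamed = 1;
    if (fsync(dirfd) < 0)               /* new directory entry made stable */
        goto out;
    rc = 0;
out:
    saved = errno;
    if (!renamed)
        unlinkat(dirfd, tmp, 0);        /* drop the leftover temporary file */
    close(fd);
    errno = saved;
out_dir:
    saved = errno;
    close(dirfd);
    errno = saved;
    return rc;
}
```

A caller would use it as publish_file("/data", "foobar", buf, len, "user.example", "value"), where the xattr name and value are placeholders.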
> Second question, if I replace the two fsync() by one sync(), do I
> get the same guarantee?
> fd=open("/data", O_TMPFILE | O_WRONLY);
> write(fd, ...);
> fsetxattr(fd, ...);
> linkat(AT_FDCWD, "/proc/self/fd/" + fd, AT_FDCWD, "/data/foobar", AT_SYMLINK_FOLLOW);
> sync();
>
> From what I understand of the FAQ [1], write_barrier guarantees
> that the journal (aka log) will be written before the inode (aka
> metadata). Did I miss something?
"write barriers" don't exist anymore. What we have these days are
cache flushes to correctly order data/metadata IO vs journal IO.
The syncfs() operation (and sync(), which is just syncfs() across
all filesystems) writes outstanding data first, then asks the
filesystem to force metadata to stable storage. XFS does that with
a log flush, which issues a cache flush (data now on stable storage)
followed by FUA log writes (metadata now on stable storage in the
journal).
So, effectively, you get the same thing in both cases. The only
difference is that sync() does a lot more work than a couple of
fsync() operations, and does work system wide on filesystems and
files you don't care about. fsync() will always perform better on a
busy system than a sync call.
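
For illustration only, syncfs() can also be called directly on a single filesystem, which at least avoids touching unrelated filesystems. The sketch below assumes Linux with glibc; sync_filesystem_of() is a hypothetical helper name. It still flushes every dirty file on that filesystem, so it does not change the comparison with fsync() above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: flush the one filesystem that contains `path`.
 * Returns 0 on success, -1 on error with errno set.
 */
static int sync_filesystem_of(const char *path)
{
    int fd, rc, saved;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    rc = syncfs(fd);    /* outstanding data, then metadata, this filesystem only */
    saved = errno;
    close(fd);
    errno = saved;
    return rc;
}
```

In the second sequence quoted above, a call such as sync_filesystem_of("/data") would take the place of sync().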
Let the filesystem worry about optimising fsync calls necessary for
consistency and integrity purposes. If there was a faster way than
issuing fsync on only the objects that need it when required, then
everyone would be using it all the time....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com