From: Carlos Maiolino <cmaiolino@redhat.com>
To: Romain Le Disez <romain.le-disez@corp.ovh.com>
Cc: "linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>
Subject: Re: [QUESTION] multiple fsync() vs single sync()
Date: Tue, 16 Oct 2018 14:57:12 +0200
Message-ID: <20181016125712.5k5xt4lzhi76qaj6@odin.usersys.redhat.com>
In-Reply-To: <6A65F394-C1BA-4339-AC9B-051885D12F65@corp.ovh.com>
Hi,
>
> In this pseudo-code (extracted from OpenStack Swift [1]):
> fd=open("/tmp/tempfile", O_CREAT | O_WRONLY);
> write(fd, ...);
> fsetxattr(fd, ...);
> fsync(fd);
> rename("/tmp/tempfile", "/data/foobar");
> dirfd = open("/data", O_DIRECTORY | O_RDONLY);
> fsync(dirfd);
>
> OR (the same without temporary file):
> fd=open("/data", O_TMPFILE | O_WRONLY);
> write(fd, ...);
> fsetxattr(fd, ...);
> fsync(fd);
> linkat(AT_FDCWD, "/proc/self/fd/" + fd, AT_FDCWD, "/data/foobar", AT_SYMLINK_FOLLOW);
> dirfd = open("/data", O_DIRECTORY | O_RDONLY);
> fsync(dirfd);
>
>
> I'm guaranteed that, whatever happens, I'll have a complete file (data+xattr) or no file at all in the directory /data.
>
> First question: is that a correct assumption, or are there any loopholes?
As long as your storage is not broken and you are not using a volatile
write cache, an fsync of both the file and the directory is enough.
>
> Second question, if I replace the two fsync() by one sync(), do I get the same guarantee?
> fd=open("/data", O_TMPFILE | O_WRONLY);
> write(fd, ...);
> fsetxattr(fd, ...);
> linkat(AT_FDCWD, "/proc/self/fd/" + fd, AT_FDCWD, "/data/foobar", AT_SYMLINK_FOLLOW);
> sync();
IIRC, sync() on Linux is supposed to give the same guarantees as syncfs(),
since we wait for IO completion on sync (POSIX doesn't guarantee that sync()
waits until everything is written to backing storage before returning, but
Linux does wait for IO completion).
The short answer is that sync() works the same way as if you ran fsync() on
every file on your filesystem. The question is: do you want to fsync() all the
files in your filesystem? This may take far longer than a pair of fsync() calls
on the file and its directory. But it's your call; as I said, sync() will
behave as if you had run fsync() on every file/directory on your filesystem.
Cheers
>
> From what I understand of the FAQ [2], write_barrier guarantees that the journal (aka log) will be written before the inode (aka metadata). Did I miss something?
>
> Many thanks for your help.
>
> [1] https://github.com/openstack/swift/blob/2.19.0/swift/obj/diskfile.py#L1674-L1694
> [2] http://xfs.org/index.php/XFS_FAQ#Q:_What_is_the_problem_with_the_write_cache_on_journaled_filesystems.3F
>
> --
> Romain
>
--
Carlos