From: Dave Chinner <david@fromorbit.com>
To: Vijay Chidambaram <vijay@cs.utexas.edu>
Cc: "Theodore Y. Ts'o" <tytso@mit.edu>,
Jayashree Mohan <jayashree2912@gmail.com>,
Amir Goldstein <amir73il@gmail.com>,
linux-btrfs <linux-btrfs@vger.kernel.org>,
fstests <fstests@vger.kernel.org>,
linux-f2fs-devel@lists.sourceforge.net
Subject: Re: Symlink not persisted even after fsync
Date: Tue, 17 Apr 2018 10:07:36 +1000
Message-ID: <20180417000736.GI23861@dastard>
In-Reply-To: <CAPaz=ELkYavvWYuMg=t+L_nSZ7fZ8CkaZr-qfGVN3QfydbOh7w@mail.gmail.com>
On Sun, Apr 15, 2018 at 07:10:52PM -0500, Vijay Chidambaram wrote:
> Thanks! As I mentioned before, this is useful. I have a follow-up
> question. Consider the following workload:
>
> creat foo
> link (foo, A/bar)
> fsync(foo)
> crash
>
> In this case, after the file system recovers, do we expect foo's link
> count to be 2 or 1?
So, strictly ordered behaviour:
create foo:
    - creates dirent in inode B and new inode A in an atomic
      transaction sequence #1
link foo -> A/bar
    - creates dirent in inode C and bumps inode A link count in
      an atomic transaction sequence #2.
fsync foo
    - looks at inode A, sees its "last modification" sequence
      counter as #2
    - flushes all transactions up to and including #2 to the
      journal.
See the dependency chain? Both the inodes and dirents in the create
operation and the link operation are chained to the inode foo via
the atomic transactions. Hence when we flush foo, we also flush the
dependent changes because of the change atomicity requirements....
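As a rough illustration of that ordering, here is a minimal sketch of a toy in-memory journal. The Journal class, its method names and the "unrelated change to inode D" are assumptions made up for the example; this is not code from any real file system. Inode names follow the description above: inode A is foo, inode B is foo's parent directory, inode C is directory A.

```python
from dataclasses import dataclass, field


@dataclass
class Journal:
    """Toy model of a strictly ordered metadata journal (illustrative only)."""
    next_seq: int = 1
    pending: list = field(default_factory=list)   # (seq, description), in commit order
    on_disk: list = field(default_factory=list)   # transactions already flushed
    last_mod: dict = field(default_factory=dict)  # inode -> last sequence touching it

    def commit(self, description, inodes):
        """Record one atomic transaction that modifies every inode in `inodes`."""
        seq = self.next_seq
        self.next_seq += 1
        self.pending.append((seq, description))
        for ino in inodes:
            self.last_mod[ino] = seq
        return seq

    def fsync(self, inode):
        """Flush all pending transactions up to the inode's last-modification sequence."""
        target = self.last_mod.get(inode, 0)
        flushed = [t for t in self.pending if t[0] <= target]
        self.pending = [t for t in self.pending if t[0] > target]
        self.on_disk.extend(flushed)
        return flushed


journal = Journal()
journal.commit("create foo: dirent in B + new inode A", ["A", "B"])
journal.commit("link foo -> A/bar: dirent in C + link count bump on A", ["A", "C"])
journal.commit("unrelated change to inode D", ["D"])  # hypothetical later change

flushed = journal.fsync("A")          # fsync foo
print([seq for seq, _ in flushed])    # prints [1, 2]
```

In this model the fsync of foo drags in both the create and the link transaction, while the later unrelated transaction #3 stays pending.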
> I would say 2,
Correct, for strict ordering. But....
> but POSIX is silent on this,
Well, it's not silent: POSIX explicitly allows fsync() to do
nothing and report success. Hence we can't really look to POSIX to
define how fsync() should behave.
> so
> thought I would confirm. The tricky part here is we are not calling
> fsync() on directory A.
Right. But directory A has a dependent change linked to foo. If we
fsync() foo, we are persisting the link count change in that file,
and hence all the other changes related to that link count change
must also be flushed. Similarly, all the changes related to the
creation of foo must be flushed, too.
> In this case, its not a symlink; its a hard link, so I would say the
> link count for foo should be 2.
Right - that's the "reference counted object dependency" I referred
to. i.e. it's a bi-directional atomic dependency - either we show both
the new dirent and the link count change, or we show neither of
them. Hence fsync on one object implies that we are also persisting
the related changes in the other object, too.
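The either-both-or-neither rule can be written down as a small check on the recovered state. This is only a sketch: the function name is hypothetical, and it assumes foo had a link count of 1 before the link() call.

```python
def link_change_is_atomic(dirent_exists: bool, link_count: int) -> bool:
    """Check the all-or-nothing rule for link(foo, A/bar) after recovery.

    Assumes foo's link count was 1 before the link() call.
    """
    if dirent_exists:
        return link_count == 2   # new dirent visible, so the bump must be too
    return link_count == 1       # no new dirent, so no bump either


# Evaluate all four combinations of (A/bar exists, foo's link count).
states = {
    (dirent, nlink): link_change_is_atomic(dirent, nlink)
    for dirent in (True, False)
    for nlink in (1, 2)
}
```

Only (True, 2) and (False, 1) pass. Note that (False, 1) is atomic but still means the link made before the fsync was lost.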
> But btrfs and F2FS show link count of
> 1 after a crash.
That may be valid if the dirent A/bar does not exist after recovery,
but it also means fsync() hasn't actually guaranteed inode changes
made prior to the fsync to be persistent on disk. i.e. that's a
violation of ordered metadata semantics and probably a bug.
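For reference, the syscall sequence of the workload can be sketched as below. Assumptions: directory A is created up front (the workload as quoted takes it as existing), the helper name run_workload is made up, and a temporary directory stands in for the file system under test. The crash and recovery step is not reproduced here; that needs external tooling, for example a dm-flakey or dm-log-writes setup as used in fstests.

```python
import os
import tempfile


def run_workload(root):
    """creat foo; link(foo, A/bar); fsync(foo). Returns foo's link count."""
    foo = os.path.join(root, "foo")
    dir_a = os.path.join(root, "A")
    bar = os.path.join(dir_a, "bar")

    os.mkdir(dir_a)                                   # assumed to exist already
    fd = os.open(foo, os.O_CREAT | os.O_WRONLY, 0o644)  # creat foo
    try:
        os.link(foo, bar)                             # link(foo, A/bar)
        os.fsync(fd)                                  # fsync(foo), not directory A
    finally:
        os.close(fd)
    return os.stat(foo).st_nlink


with tempfile.TemporaryDirectory() as root:
    nlink = run_workload(root)
    print(nlink)
```

Without a crash the link count is 2. The question in this thread is whether it is still 2 once the file system has recovered from a crash right after the fsync.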
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com