Re: Metadata of renamed inode is partially persisted after crash

From: Chao Yu <yuchao0@huawei.com>
To: "Kim, Seulbae" <seulbae@gatech.edu>,
	"jaegeuk@kernel.org" <jaegeuk@kernel.org>,
	"linux-f2fs-devel@lists.sourceforge.net"
	<linux-f2fs-devel@lists.sourceforge.net>
Cc: "Kim, Taesoo" <taesoo@gatech.edu>,
	"Yoon, Jungyeon" <jungyeon@gatech.edu>,
	"Kashyap, Sanidhya" <sanidhya@gatech.edu>,
	"Xu, Meng" <meng.xu@gatech.edu>
Subject: Re: Metadata of renamed inode is partially persisted after crash
Date: Fri, 8 Mar 2019 15:36:54 +0800
Message-ID: <d8aba98a-a93d-af59-10a6-09f01b1c5ecf@huawei.com>
In-Reply-To: <BL0PR07MB390794909F5CCD4C4635FCB7D5730@BL0PR07MB3907.namprd07.prod.outlook.com>

On 2019/3/7 5:45, Kim, Seulbae wrote:
> Hi,
> 
> As we were fuzzing f2fs file system to find crash consistency bugs,
> we came across an interesting test case, which raises some questions
> regarding the consistency semantics of fdatasync().
> 
> Before proceeding to the test case,
> file "foo" already exists under the mount directory,
> and its inum is 4, size is 0, and mode is 0644.
> 
> Here's the test case:
> 0  chdir(MOUNT_DIR);
> 1  int fd_root = open(".", O_DIRECTORY, 0);
> 2  truncate("foo", 100); // foo's inode number is 4
> 3  sync();
> 4  int fd1 = open("bar", O_CREAT | O_RDWR, 0400); // fd1 mapped to inode #5 (new inode)
> 5  int fd2 = open("foo", O_RDWR, 0);              // fd2 mapped to inode #4
> 6  rename("foo", "bar"); // inode #5 is unlinked
> 7  ftruncate(fd2, 500);  // inode #4's metadata (size) is changed
> 8  fdatasync(fd1);       // persist metadata of unlinked inode #5 (no effect on #4)
> 9  chmod("bar", 0777);   // inode #4's metadata (mode) is changed
> --- CRASH ---

I wrote a test program that follows the order above and ran it with some
tracepoints enabled. Line 8 triggers a checkpoint because the fsynced file's
nlink is zero, so all metadata operations issued before line 8 are persisted.
In other words, line 8 persists inode #5 but, through the checkpoint, it also
affects inode #4.
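
For reference, the quoted sequence can be replayed with something like the
sketch below. This is not the test program mentioned above; the helper name
replay_sequence() and its out-parameters are hypothetical, and it only shows
the state visible to userspace just before the crash point (in particular,
that fd1 refers to an inode whose link count is zero by the time line 8 runs).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE     /* exposes O_DIRECTORY and fdatasync() on glibc */
#endif
#include <fcntl.h>
#include <stdio.h>      /* rename() */
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * replay_sequence() is a hypothetical helper, not code from this thread.
 * It replays lines 0-9 of the quoted test case inside `dir`, then fills
 * `unlinked` with fstat(fd1) and `renamed` with stat("bar").
 * Returns 0 on success, -1 if any call fails. Descriptors are not cleaned
 * up on the error path, to keep the sketch short.
 */
int replay_sequence(const char *dir, struct stat *unlinked,
                    struct stat *renamed)
{
    int fd, fd_root, fd1, fd2;

    if (chdir(dir) != 0)                                   /* line 0 */
        return -1;

    /* Precondition from the report: "foo" exists with size 0, mode 0644. */
    fd = open("foo", O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    close(fd);

    fd_root = open(".", O_DIRECTORY, 0);                   /* line 1 */
    if (fd_root < 0)
        return -1;
    if (truncate("foo", 100) != 0)                         /* line 2 */
        return -1;
    sync();                                                /* line 3 */
    fd1 = open("bar", O_CREAT | O_RDWR, 0400);             /* line 4 */
    fd2 = open("foo", O_RDWR, 0);                          /* line 5 */
    if (fd1 < 0 || fd2 < 0)
        return -1;
    if (rename("foo", "bar") != 0)                         /* line 6 */
        return -1;
    if (ftruncate(fd2, 500) != 0)                          /* line 7 */
        return -1;
    if (fdatasync(fd1) != 0)                               /* line 8 */
        return -1;
    if (chmod("bar", 0777) != 0)                           /* line 9 */
        return -1;

    /* Snapshot what userspace sees right before the crash point. */
    if (fstat(fd1, unlinked) != 0 || stat("bar", renamed) != 0)
        return -1;

    close(fd1);
    close(fd2);
    close(fd_root);
    return 0;
}
```

The sketch does not inject the crash or remount the image; that step has to
come from whatever crash-testing setup is in use. On return,
unlinked->st_nlink shows the link count of the inode that line 8 syncs.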

> 
> Through the test case, we created a new inode (inode #5) named "bar" (line 4),
> but practically unlinked it by renaming inode #4's name from "foo" to "bar"
> (line 6).
> As file descriptors are mapped to inodes (not names),
> line 7 changes the size of inode #4 ("bar") to 500,
> line 8 syncs the data of inode #5, which is not existent anymore, and
> line 9 changes permission of inode #4 ("bar") to 0777.
> 
> When system crashes right after executing line 9,
> the re-mounted fs image has file "foo" (inum 4), whose size is 500, but

By the time of the checkpoint, foo has already been renamed to bar, so the
recovered image has file "bar" (inum 4), not "foo".

> mode is 0644.
> In other words, without performing fsync() on this inode,
> one of the metadata changes (size 100->500) is persisted, but the other
> (mode 0644->0777) is not.
> 
> An interesting aspect is that when we swap line 7 (ftruncate) with line 9
> (chmod), then only the mode is persisted (size: 100, mode: 0777) this time,
> in the recovered image.
> And if we remove line 8, then neither size nor mode of inode #4 is persisted.
> So, this led us to suspect that fdatasync'ing inode #5 is actually
> affecting the metadata status of inode #4.

This is what we expected.
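
The outcomes quoted above (which name survives, and with what inum, size and
mode) can be collected after remounting with a small helper. A minimal sketch:
struct entry_state and inspect_entry() are hypothetical names introduced here,
and the code only reports what stat() returns; it does not check the values
against the ones reported in this thread.

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical record type for this sketch. */
struct entry_state {
    int    exists;  /* 1 if stat() succeeded, 0 otherwise */
    ino_t  ino;     /* inode number */
    off_t  size;    /* file size in bytes */
    mode_t mode;    /* permission bits only */
};

/* Look up `path` and report the fields discussed in the thread. */
struct entry_state inspect_entry(const char *path)
{
    struct entry_state s = {0};
    struct stat st;

    if (stat(path, &st) == 0) {
        s.exists = 1;
        s.ino = st.st_ino;
        s.size = st.st_size;
        s.mode = st.st_mode & 07777;
    }
    return s;
}
```

Calling it on both "foo" and "bar" in the remounted directory shows which
name the checkpoint captured, along with the size and mode that came with it.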

> 
> In terms of crash consistency, this is fine since we did not fsync inode #4.
> However, doesn't this imply that there might exist a logic bug in the
> semantics of fdatasync,

This behavior is caused by the checkpoint that fdatasync triggers.

> especially when the file descriptor provided as its argument points to an
> old inode
> that is practically unlinked by being overwritten through renaming?

So the fdatasync semantics are not broken, and this is the behavior we
expect, right?

Thanks,

> 
> Thank you,
> -Seulbae
