linux-f2fs-devel.lists.sourceforge.net archive mirror
* Re: Metadata of renamed inode is partially persisted after crash
From: Chao Yu @ 2019-03-08  7:36 UTC
  To: Kim, Seulbae, jaegeuk@kernel.org,
	linux-f2fs-devel@lists.sourceforge.net
  Cc: Kim, Taesoo, Yoon, Jungyeon, Kashyap, Sanidhya, Xu, Meng

On 2019/3/7 5:45, Kim, Seulbae wrote:
> Hi,
> 
> As we were fuzzing f2fs file system to find crash consistency bugs,
> we came across an interesting test case, which raises some questions
> regarding the consistency semantics of fdatasync().
> 
> Before proceeding to the test case,
> file "foo" already exists under the mount directory,
> and its inum is 4, size is 0, and mode is 0644.
> 
> Here's the test case:
> 0  chdir(MOUNT_DIR);
> 1  int fd_root = open(".", O_DIRECTORY, 0);
> 2  truncate("foo", 100); // foo's inode number is 4
> 3  sync();
> 4  int fd1 = open("bar", O_CREAT | O_RDWR, 0400); // fd1 mapped to inode #5 (new inode)
> 5  int fd2 = open("foo", O_RDWR, 0);              // fd2 mapped to inode #4
> 6  rename("foo", "bar"); // inode #5 is unlinked
> 7  ftruncate(fd2, 500);  // inode #4's metadata (size) is changed
> 8  fdatasync(fd1);       // persist metadata of unlinked inode #5 (no effect on #4)
> 9  chmod("bar", 0777);   // inode #4's metadata (mode) is changed
> --- CRASH ---

I wrote a test program that follows the order above and ran it with some
tracepoints enabled. Line 8 triggers a checkpoint because the fsynced
file's nlink is zero, so all metadata operations before line 8 are
persisted. That means line 8 persists inode #5 but, as a side effect, also
affects inode #4.
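
For reference, a minimal sketch of how the sequence can be replayed from
user space. It is not the exact program used for the tracing above, and
replay_sequence() and struct replay_state are hypothetical names chosen for
illustration. It cannot simulate the crash or show the checkpoint itself; it
only reports the state right before the crash point, including the link
count of the inode behind fd1, which should be zero once rename() has
replaced "bar".

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700 /* for truncate(), sync(), fdatasync(), O_DIRECTORY */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper type: state observed just before the crash point. */
struct replay_state {
    nlink_t fd1_nlink;  /* link count of the inode behind fd1 after rename() */
    off_t   bar_size;   /* size of "bar" after ftruncate(fd2, 500) */
    mode_t  bar_mode;   /* permission bits of "bar" after chmod() */
    int     bar_is_fd2; /* 1 if "bar" now names the inode behind fd2 */
};

/*
 * Hypothetical helper: replays lines 0-9 of the test case inside mount_dir.
 * Returns 0 on success, -1 if any call fails.
 */
int replay_sequence(const char *mount_dir, struct replay_state *res)
{
    struct stat st1, st2, stbar;
    int fd, fd_root = -1, fd1 = -1, fd2 = -1, rc = -1;

    assert(mount_dir != NULL && res != NULL);
    if (chdir(mount_dir) != 0)                       /* line 0 */
        return -1;

    /* Precondition from the report: "foo" exists, size 0, mode 0644. */
    unlink("bar");                                   /* error ignored on purpose */
    fd = open("foo", O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    if (fchmod(fd, 0644) != 0) {                     /* do not depend on umask */
        close(fd);
        return -1;
    }
    close(fd);

    fd_root = open(".", O_RDONLY | O_DIRECTORY);     /* line 1 */
    if (fd_root < 0) goto done;
    if (truncate("foo", 100) != 0) goto done;        /* line 2 */
    sync();                                          /* line 3 */
    fd1 = open("bar", O_CREAT | O_RDWR, 0400);       /* line 4: new inode */
    if (fd1 < 0) goto done;
    fd2 = open("foo", O_RDWR);                       /* line 5 */
    if (fd2 < 0) goto done;
    if (rename("foo", "bar") != 0) goto done;        /* line 6: old "bar" unlinked */

    /* The inode behind fd1 is still open but should have no links left. */
    if (fstat(fd1, &st1) != 0) goto done;

    if (ftruncate(fd2, 500) != 0) goto done;         /* line 7 */
    if (fdatasync(fd1) != 0) goto done;              /* line 8 */
    if (chmod("bar", 0777) != 0) goto done;          /* line 9 */

    if (fstat(fd2, &st2) != 0 || stat("bar", &stbar) != 0) goto done;
    res->fd1_nlink  = st1.st_nlink;
    res->bar_size   = stbar.st_size;
    res->bar_mode   = stbar.st_mode & 07777;
    res->bar_is_fd2 = (stbar.st_ino == st2.st_ino &&
                       stbar.st_dev == st2.st_dev);
    rc = 0;
done:
    if (fd2 >= 0) close(fd2);
    if (fd1 >= 0) close(fd1);
    if (fd_root >= 0) close(fd_root);
    return rc;
}
```

Calling replay_sequence() on a scratch directory of the mounted image and
then cutting power (or shutting the fs down) is one way to reach the crash
point; what ends up on disk still depends on the filesystem.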

> 
> Through the test case, we created a new inode (inode #5) named "bar" (line 4),
> but practically unlinked it by renaming inode #4 from "foo" to "bar"
> (line 6).
> As file descriptors are mapped to inodes (not names),
> line 7 changes the size of inode #4 ("bar") to 500,
> line 8 syncs the data of inode #5, which is not existent anymore, and
> line 9 changes permission of inode #4 ("bar") to 0777.
> 
> When the system crashes right after executing line 9,
> the re-mounted fs image has file "foo" (inum 4), whose size is 500, but

So after the checkpoint, foo has already been renamed to bar; in the end,
the image has file "bar" (inum 4).

> mode is 0644.
> In other words, without performing fsync() on this inode,
> one of the metadata changes (size 100->500) is persisted, but the other
> (mode 0644->0777) is not.
> 
> An interesting aspect is that when we swap line 7 (ftruncate) with line 9
> (chmod), only the mode is persisted (size: 100, mode: 0777) in the
> recovered image.
> And if we remove line 8, then neither the size nor the mode of inode #4 is
> persisted.
> So, this led us to suspect that fdatasync'ing inode #5 is actually
> affecting the metadata status of inode #4.

As we expected.

> 
> In terms of crash consistency, this is fine since we did not fsync inode #4.
> However, doesn't this imply that there might exist a logic bug in the
> semantics of fdatasync,

The behavior is due to the checkpoint triggered by fdatasync.

> especially when the file descriptor provided as its argument points to an
> old inode that is practically unlinked by being overwritten through renaming?

So the fdatasync semantics are not broken; this is what we expect, right?

Thanks,

> 
> Thank you,
> -Seulbae
