public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
From: David Sterba <dsterba@suse.cz>
To: fdmanana@kernel.org
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v3] Btrfs: fix data loss after inode eviction, renaming it, and fsync it
Date: Tue, 11 Jun 2019 19:02:15 +0200	[thread overview]
Message-ID: <20190611170215.GF3563@twin.jikos.cz> (raw)
In-Reply-To: <20190607102524.32655-1-fdmanana@kernel.org>

On Fri, Jun 07, 2019 at 11:25:24AM +0100, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> When we log an inode, regardless of logging it completely or only that it
> exists, we always update it as logged (logged_trans and last_log_commit
> fields of the inode are updated). This is generally fine and avoids future
> attempts to log it from having to do repeated work that brings no value.
> 
> However, if we write data to a file, then evict its inode after all the
> dealloc was flushed (and ordered extents completed), rename the file and
> fsync it, we end up not logging the new extents, since the rename may
> result in logging that the inode exists in case the parent directory was
> logged before. The following reproducer shows and explains how this can
> happen:
> 
>   $ mkfs.btrfs -f /dev/sdb
>   $ mount /dev/sdb /mnt
> 
>   $ mkdir /mnt/dir
>   $ touch /mnt/dir/foo
>   $ touch /mnt/dir/bar
> 
>   # Do a direct IO write instead of a buffered write because with a
>   # buffered write we would need to make sure dealloc gets flushed and
>   # complete before we do the inode eviction later, and we can not do that
>   # from user space with call to things such as sync(2) since that results
>   # in a transaction commit as well.
>   $ xfs_io -d -c "pwrite -S 0xd3 0 4K" /mnt/dir/bar
> 
>   # Keep the directory dir in use while we evict inodes. We want our file
>   # bar's inode to be evicted but we don't want our directory's inode to
>   # be evicted (if it were evicted too, we would not be able to reproduce
>   # the issue since the first fsync below, of file foo, would result in a
>   # transaction commit.
>   $ ( cd /mnt/dir; while true; do :; done ) &
>   $ pid=$!
> 
>   # Wait a bit to give time for the background process to chdir.
>   $ sleep 0.1
> 
>   # Evict all inodes, except the inode for the directory dir because it is
>   # currently in use by our background process.
>   $ echo 2 > /proc/sys/vm/drop_caches
> 
>   # fsync file foo, which ends up persisting information about the parent
>   # directory because it is a new inode.
>   $ xfs_io -c fsync /mnt/dir/foo
> 
>   # Rename bar, this results in logging that this inode exists (inode item,
>   # names, xattrs) because the parent directory is in the log.
>   $ mv /mnt/dir/bar /mnt/dir/baz
> 
>   # Now fsync baz, which ends up doing absolutely nothing because of the
>   # rename operation which logged that the inode exists only.
>   $ xfs_io -c fsync /mnt/dir/baz
> 
>   <power failure>
> 
>   $ mount /dev/sdb /mnt
>   $ od -t x1 -A d /mnt/dir/baz
>   0000000
> 
>     --> Empty file, data we wrote is missing.
> 
> Fix this by not updating last_sub_trans of an inode when we are logging
> only that it exists and the inode was not yet logged since it was loaded
> from disk (full_sync bit set), this is enough to make btrfs_inode_in_log()
> return false for this scenario and make us log the inode. The logged_trans
> of the inode is still always setsince that alone is used to track if names
> need to be deleted as part of unlink operations.
> 
> Fixes: 257c62e1bce03e ("Btrfs: avoid tree log commit when there are no changes")
> Signed-off-by: Filipe Manana <fdmanana@suse.com>

Added to misc-next, thanks.

      reply	other threads:[~2019-06-11 17:01 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-06 11:07 [PATCH] Btrfs: fix data loss after inode eviction, renaming it, and fsync it fdmanana
2019-06-06 13:19 ` [PATCH v2] " fdmanana
2019-06-07 10:25 ` [PATCH v3] " fdmanana
2019-06-11 17:02   ` David Sterba [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190611170215.GF3563@twin.jikos.cz \
    --to=dsterba@suse.cz \
    --cc=fdmanana@kernel.org \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox