From: Filipe Manana <fdmanana@suse.com>
To: linux-btrfs@vger.kernel.org
Cc: Filipe Manana <fdmanana@suse.com>
Subject: [PATCH] Btrfs: fix fsync log replay for inodes with a mix of regular refs and extrefs
Date: Tue, 13 Jan 2015 16:27:48 +0000 [thread overview]
Message-ID: <1421166468-8721-1-git-send-email-fdmanana@suse.com> (raw)
If we have an inode with a large number of hard links, some of which may
be extrefs, turn a regular ref into an extref, fsync the inode and then
replay the fsync log (after a crash/reboot), we can endup with an fsync
log that makes the replay code always fail with -EOVERFLOW when processing
the inode's references.
This is easy to reproduce with the test case I made for xfstests. Its steps
are the following:
_scratch_mkfs "-O extref" >> $seqres.full 2>&1
_init_flakey
_mount_flakey
# Create a test file with 3001 hard links. This number is large enough to
# make btrfs start using extrefs at some point even if the fs has the maximum
# possible leaf/node size (64Kb).
echo "hello world" > $SCRATCH_MNT/foo
for i in `seq 1 3000`; do
ln $SCRATCH_MNT/foo $SCRATCH_MNT/foo_link_`printf "%04d" $i`
done
# Make sure all metadata and data are durably persisted.
sync
# Now remove one link, add a new one with a new name, add another new one with
# the same name as the one we just removed and fsync the inode.
rm -f $SCRATCH_MNT/foo_link_0001
ln $SCRATCH_MNT/foo $SCRATCH_MNT/foo_link_3001
ln $SCRATCH_MNT/foo $SCRATCH_MNT/foo_link_0001
$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foo
# Simulate a crash/power loss. This makes sure the next mount
# will see an fsync log and will replay that log.
_load_flakey_table $FLAKEY_DROP_WRITES
_unmount_flakey
_load_flakey_table $FLAKEY_ALLOW_WRITES
_mount_flakey
So on overflow error when overwriting a reference item (regular or extend
reference item), delete the old and replace it with the one in the fsync
log.
This issue has been present since the introduction of the extrefs feature
(2012).
A test case for xfstests follows soon. This test only passes if the previous
patch titled "Btrfs: fix fsync when extend references are added to an inode"
is applied too.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
fs/btrfs/tree-log.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index ecf462a..a1ce105 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -1245,6 +1245,28 @@ static noinline int add_inode_ref(struct btrfs_trans_handle *trans,
/* finally write the back reference in the inode */
ret = overwrite_item(trans, root, path, eb, slot, key);
+ if (ret == -EOVERFLOW) {
+ /*
+ * This means we have a reference item in the fs/subvol tree
+ * that groups multiple references, some of which were added
+ * by the above loop, some are current and some are obsolete
+ * and are going to be deleted by a future stage of the fsync
+ * log replay code. So just delete the item and copy the
+ * one from the log tree into the fs/subvol tree - this is
+ * safe and later if a link count in the inode is incorrect,
+ * it will be corrected by our log replay code.
+ */
+ ret = btrfs_search_slot(trans, root, key, path, -1, 1);
+ if (WARN_ON(ret == 1))
+ ret = -EIO;
+ if (ret < 0)
+ goto out;
+ ret = btrfs_del_item(trans, root, path);
+ if (ret)
+ goto out;
+ btrfs_release_path(path);
+ ret = overwrite_item(trans, root, path, eb, slot, key);
+ }
out:
btrfs_release_path(path);
kfree(name);
--
2.1.3
next reply other threads:[~2015-01-13 16:28 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-01-13 16:27 Filipe Manana [this message]
2015-01-14 1:28 ` [PATCH v2] Btrfs: fix fsync log replay for inodes with a mix of regular refs and extrefs Filipe Manana
2015-01-14 1:52 ` [PATCH v3] " Filipe Manana
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1421166468-8721-1-git-send-email-fdmanana@suse.com \
--to=fdmanana@suse.com \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).