* [PATCH 1/2] btrfs: assert commit root semaphore is held when accessing backref cache
2023-01-17 11:21 [PATCH 0/2] btrfs: some speedup with fiemap when leaves are shared fdmanana
@ 2023-01-17 11:21 ` fdmanana
2023-01-17 12:34 ` Johannes Thumshirn
2023-01-17 11:21 ` [PATCH 2/2] btrfs: skip backref walking during fiemap if we know the leaf is shared fdmanana
2023-01-19 19:36 ` [PATCH 0/2] btrfs: some speedup with fiemap when leaves are shared David Sterba
2 siblings, 1 reply; 5+ messages in thread
From: fdmanana @ 2023-01-17 11:21 UTC (permalink / raw)
To: linux-btrfs
From: Filipe Manana <fdmanana@suse.com>
During fiemap, when accessing the cache that stores the sharedness of an
extent, we need to either be holding a transaction handle or the commit
root semaphore. I left comments about this in the comment that precedes
store_backref_shared_cache() and lookup_backref_shared_cache(), but have
actually not enforced it through assertions. So assert that the commit
root semaphore is held if we are not holding a transaction handle.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
fs/btrfs/backref.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index 46851511b661..f846fec08c86 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -1252,8 +1252,12 @@ static bool lookup_backref_shared_cache(struct btrfs_backref_share_check_ctx *ct
struct btrfs_root *root,
u64 bytenr, int level, bool *is_shared)
{
+ const struct btrfs_fs_info *fs_info = root->fs_info;
struct btrfs_backref_shared_cache_entry *entry;
+ if (!current->journal_info)
+ lockdep_assert_held(&fs_info->commit_root_sem);
+
if (!ctx->use_path_cache)
return false;
@@ -1288,7 +1292,7 @@ static bool lookup_backref_shared_cache(struct btrfs_backref_share_check_ctx *ct
* could be a snapshot sharing this extent buffer.
*/
if (entry->is_shared &&
- entry->gen != btrfs_get_last_root_drop_gen(root->fs_info))
+ entry->gen != btrfs_get_last_root_drop_gen(fs_info))
return false;
*is_shared = entry->is_shared;
@@ -1318,9 +1322,13 @@ static void store_backref_shared_cache(struct btrfs_backref_share_check_ctx *ctx
struct btrfs_root *root,
u64 bytenr, int level, bool is_shared)
{
+ const struct btrfs_fs_info *fs_info = root->fs_info;
struct btrfs_backref_shared_cache_entry *entry;
u64 gen;
+ if (!current->journal_info)
+ lockdep_assert_held(&fs_info->commit_root_sem);
+
if (!ctx->use_path_cache)
return;
@@ -1336,7 +1344,7 @@ static void store_backref_shared_cache(struct btrfs_backref_share_check_ctx *ctx
ASSERT(level >= 0);
if (is_shared)
- gen = btrfs_get_last_root_drop_gen(root->fs_info);
+ gen = btrfs_get_last_root_drop_gen(fs_info);
else
gen = btrfs_root_last_snapshot(&root->root_item);
--
2.35.1
^ permalink raw reply related [flat|nested] 5+ messages in thread* [PATCH 2/2] btrfs: skip backref walking during fiemap if we know the leaf is shared
2023-01-17 11:21 [PATCH 0/2] btrfs: some speedup with fiemap when leaves are shared fdmanana
2023-01-17 11:21 ` [PATCH 1/2] btrfs: assert commit root semaphore is held when accessing backref cache fdmanana
@ 2023-01-17 11:21 ` fdmanana
2023-01-19 19:36 ` [PATCH 0/2] btrfs: some speedup with fiemap when leaves are shared David Sterba
2 siblings, 0 replies; 5+ messages in thread
From: fdmanana @ 2023-01-17 11:21 UTC (permalink / raw)
To: linux-btrfs
From: Filipe Manana <fdmanana@suse.com>
During fiemap, when checking if a data extent is shared we are doing the
backref walking even if we already know the leaf is shared, which is a
waste of time since if the leaf shared then the data extent is also
shared. So skip the backref walking when we know we are in a shared leaf.
The following test was measures the gains for a case where all leaves
are shared due to a snapshot:
$ cat test.sh
#!/bin/bash
DEV=/dev/sdj
MNT=/mnt/sdj
umount $DEV &> /dev/null
mkfs.btrfs -f $DEV
# Use compression to quickly create files with a lot of extents
# (each with a size of 128K).
mount -o compress=lzo $DEV $MNT
# 40G gives 327680 extents, each with a size of 128K.
xfs_io -f -c "pwrite -S 0xab -b 1M 0 40G" $MNT/foobar
# Add some more files to increase the size of the fs and extent
# trees (in the real world there's a lot of files and extents
# from other files).
xfs_io -f -c "pwrite -S 0xcd -b 1M 0 20G" $MNT/file1
xfs_io -f -c "pwrite -S 0xef -b 1M 0 20G" $MNT/file2
xfs_io -f -c "pwrite -S 0x73 -b 1M 0 20G" $MNT/file3
# Create a snapshot so all the extents become indirectly shared
# through subtrees, with a generation less than or equals to the
# generation used to create the snapshot.
btrfs subvolume snapshot -r $MNT $MNT/snap1
# Unmount and mount again to clear cached metadata.
umount $MNT
mount -o compress=lzo $DEV $MNT
start=$(date +%s%N)
# The filefrag tool uses the fiemap ioctl.
filefrag $MNT/foobar
end=$(date +%s%N)
dur=$(( (end - start) / 1000000 ))
echo "fiemap took $dur milliseconds (metadata not cached)"
echo
start=$(date +%s%N)
filefrag $MNT/foobar
end=$(date +%s%N)
dur=$(( (end - start) / 1000000 ))
echo "fiemap took $dur milliseconds (metadata cached)"
umount $MNT
The results were the following on a non-debug kernel (Debian's default
kernel config).
Before this patch:
(...)
/mnt/sdi/foobar: 327680 extents found
fiemap took 1821 milliseconds (metadata not cached)
/mnt/sdi/foobar: 327680 extents found
fiemap took 399 milliseconds (metadata cached)
After this patch:
(...)
/mnt/sdi/foobar: 327680 extents found
fiemap took 591 milliseconds (metadata not cached)
/mnt/sdi/foobar: 327680 extents found
fiemap took 123 milliseconds (metadata cached)
That's a speedup of 3.1x and 3.2x.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
fs/btrfs/backref.c | 21 ++++++++++++++++++++-
1 file changed, 20 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index f846fec08c86..90e40d5ceccd 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -1872,6 +1872,8 @@ int btrfs_is_data_extent_shared(struct btrfs_inode *inode, u64 bytenr,
.have_delayed_delete_refs = false,
};
int level;
+ bool leaf_cached;
+ bool leaf_is_shared;
for (int i = 0; i < BTRFS_BACKREF_CTX_PREV_EXTENTS_SIZE; i++) {
if (ctx->prev_extents_cache[i].bytenr == bytenr)
@@ -1893,6 +1895,23 @@ int btrfs_is_data_extent_shared(struct btrfs_inode *inode, u64 bytenr,
walk_ctx.time_seq = elem.seq;
}
+ ctx->use_path_cache = true;
+
+ /*
+ * We may have previously determined that the current leaf is shared.
+ * If it is, then we have a data extent that is shared due to a shared
+ * subtree (caused by snapshotting) and we don't need to check for data
+ * backrefs. If the leaf is not shared, then we must do backref walking
+ * to determine if the data extent is shared through reflinks.
+ */
+ leaf_cached = lookup_backref_shared_cache(ctx, root,
+ ctx->curr_leaf_bytenr, 0,
+ &leaf_is_shared);
+ if (leaf_cached && leaf_is_shared) {
+ ret = 1;
+ goto out_trans;
+ }
+
walk_ctx.ignore_extent_item_pos = true;
walk_ctx.trans = trans;
walk_ctx.fs_info = fs_info;
@@ -1901,7 +1920,6 @@ int btrfs_is_data_extent_shared(struct btrfs_inode *inode, u64 bytenr,
/* -1 means we are in the bytenr of the data extent. */
level = -1;
ULIST_ITER_INIT(&uiter);
- ctx->use_path_cache = true;
while (1) {
bool is_shared;
bool cached;
@@ -1972,6 +1990,7 @@ int btrfs_is_data_extent_shared(struct btrfs_inode *inode, u64 bytenr,
ctx->prev_extents_cache_slot = slot;
}
+out_trans:
if (trans) {
btrfs_put_tree_mod_seq(fs_info, &elem);
btrfs_end_transaction(trans);
--
2.35.1
^ permalink raw reply related [flat|nested] 5+ messages in thread