* [PATCH 0/5] btrfs: some minor log tree updates
@ 2023-05-17 11:02 fdmanana
2023-05-17 11:02 ` [PATCH 1/5] btrfs: use inode_logged() at need_log_inode() fdmanana
` (5 more replies)
0 siblings, 6 replies; 8+ messages in thread
From: fdmanana @ 2023-05-17 11:02 UTC (permalink / raw)
To: linux-btrfs
From: Filipe Manana <fdmanana@suse.com>
Add two optimizations to avoid falling back to transaction commits after
an inode is evicted and some cleanups. More details on the changelogs.
Filipe Manana (5):
btrfs: use inode_logged() at need_log_inode()
btrfs: use inode_logged() at btrfs_record_unlink_dir()
btrfs: update comments at btrfs_record_unlink_dir() to be more clear
btrfs: remove pointless label and goto at btrfs_record_unlink_dir()
btrfs: change for_rename argument of btrfs_record_unlink_dir() to bool
fs/btrfs/inode.c | 8 ++++----
fs/btrfs/tree-log.c | 34 +++++++++++++++++-----------------
fs/btrfs/tree-log.h | 2 +-
3 files changed, 22 insertions(+), 22 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 1/5] btrfs: use inode_logged() at need_log_inode()
2023-05-17 11:02 [PATCH 0/5] btrfs: some minor log tree updates fdmanana
@ 2023-05-17 11:02 ` fdmanana
2023-05-17 11:02 ` [PATCH 2/5] btrfs: use inode_logged() at btrfs_record_unlink_dir() fdmanana
` (4 subsequent siblings)
5 siblings, 0 replies; 8+ messages in thread
From: fdmanana @ 2023-05-17 11:02 UTC (permalink / raw)
To: linux-btrfs
From: Filipe Manana <fdmanana@suse.com>
At need_log_inode() we directly check the ->logged_trans field of the
given inode to check if it was previously logged in the transaction, with
the goal of skipping logging the inode again when it's not necessary.
The ->logged_trans field in not persisted in the inode item or elsewhere,
it's only stored in memory (struct btrfs_inode), so it's transient and
lost once the inode is evicted and then loaded again. Once an inode is
loaded, we are conservative and set ->logged_trans to 0, which may mean
that either the inode was never logged in the current transaction or it
was logged but evicted before being loaded again.
Instead of checking the inode's ->logged_trans field directly, we can
use instead the helper inode_logged(), which will really check if the
inode was logged before in the current transaction in case we have a
->logged_trans field with a value of 0. This will prevent unnecessarily
logging an inode when it's not needed, and in some cases preventing a
transaction commit, in case the logging requires a fallback to a
transaction commit. The following test script shows a scenario where
due to eviction we fallback a transaction commit when trying to fsync
a file that was renamed:
$ cat test.sh
#!/bin/bash
DEV=/dev/nullb0
MNT=/mnt/nullb0
num_init_files=10000
num_new_files=10000
mkfs.btrfs -f $DEV
mount -o ssd $DEV $MNT
mkdir $MNT/testdir
for ((i = 1; i <= $num_init_files; i++)); do
echo -n > $MNT/testdir/file_$i
done
echo -n > $MNT/testdir/foo
sync
# Add some files so that there's more work in the transaction other
# than just renaming file foo.
for ((i = 1; i <= $num_new_files; i++)); do
echo -n > $MNT/testdir/new_file_$i
done
# Fsync the directory first.
xfs_io -c "fsync" $MNT/testdir
# Rename file foo.
mv $MNT/testdir/foo $MNT/testdir/bar
# Now triggger eviction of the test directory's inode.
# Once loaded again, it will have logged_trans set to 0 and
# last_unlink_trans set to the current transaction.
echo 2 > /proc/sys/vm/drop_caches
# Fsync file bar (ex-foo).
# Before the patch the fsync would result in a transaction commit
# because the inode for file bar has last_unlink_trans set to the
# current transaction, so it will attempt to log the parent directory
# as well, which will fallback to a full transaction commit because
# it also has its last_unlink_trans set to the current transaction,
# due to the inode eviction.
start=$(date +%s%N)
xfs_io -c "fsync" $MNT/testdir/bar
end=$(date +%s%N)
dur=$(( (end - start) / 1000000 ))
echo "file fsync took: $dur milliseconds"
umount $MNT
Before this patch: fsync took 22 milliseconds
After this patch: fsync took 8 milliseconds
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
fs/btrfs/tree-log.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 9b212e8c70cc..ccdfe1d95572 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -3252,7 +3252,7 @@ int btrfs_free_log_root_tree(struct btrfs_trans_handle *trans,
* Returns 1 if the inode was logged before in the transaction, 0 if it was not,
* and < 0 on error.
*/
-static int inode_logged(struct btrfs_trans_handle *trans,
+static int inode_logged(const struct btrfs_trans_handle *trans,
struct btrfs_inode *inode,
struct btrfs_path *path_in)
{
@@ -5303,7 +5303,7 @@ static int btrfs_check_ref_name_override(struct extent_buffer *eb,
* multiple times when multiple tasks have joined the same log transaction.
*/
static bool need_log_inode(const struct btrfs_trans_handle *trans,
- const struct btrfs_inode *inode)
+ struct btrfs_inode *inode)
{
/*
* If a directory was not modified, no dentries added or removed, we can
@@ -5321,7 +5321,7 @@ static bool need_log_inode(const struct btrfs_trans_handle *trans,
* logged_trans will be 0, in which case we have to fully log it since
* logged_trans is a transient field, not persisted.
*/
- if (inode->logged_trans == trans->transid &&
+ if (inode_logged(trans, inode, NULL) == 1 &&
!test_bit(BTRFS_INODE_COPY_EVERYTHING, &inode->runtime_flags))
return false;
--
2.34.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 2/5] btrfs: use inode_logged() at btrfs_record_unlink_dir()
2023-05-17 11:02 [PATCH 0/5] btrfs: some minor log tree updates fdmanana
2023-05-17 11:02 ` [PATCH 1/5] btrfs: use inode_logged() at need_log_inode() fdmanana
@ 2023-05-17 11:02 ` fdmanana
2023-05-17 11:02 ` [PATCH 3/5] btrfs: update comments at btrfs_record_unlink_dir() to be more clear fdmanana
` (3 subsequent siblings)
5 siblings, 0 replies; 8+ messages in thread
From: fdmanana @ 2023-05-17 11:02 UTC (permalink / raw)
To: linux-btrfs
From: Filipe Manana <fdmanana@suse.com>
At btrfs_record_unlink_dir() we directly check the logged_trans field of
the given inodes to check if they were previously logged in the current
transaction, and if any of them were, then we can avoid setting the field
last_unlink_trans of the directory to the id of the current transaction if
we are in a rename path. Avoiding that can later prevent falling back to
a transaction commit if anyone attempts to log the directory.
However the logged_trans field, store in struct btrfs_inode, is transient,
not persisted in the inode item on its subvolume b+tree, so that means
that if an inode is evicted and then loaded again, its original value is
lost and it's reset to 0. So directly checking the logged_trans field can
lead to some false negative, and that only results in a performance impact
as mentioned before.
Intead of directly checking the logged_trans field of the inodes, use the
inode_logged() helper, which will check in the log tree if an inode was
logged before in case its logged_trans field has a value of 0. This way
we can avoid setting the directory inode's last_unlink_trans and cause
future logging attempts of it to fallback to transaction commits. The
following test script shows one example where this happens without this
patch:
$ cat test.sh
#!/bin/bash
DEV=/dev/nullb0
MNT=/mnt/nullb0
num_init_files=10000
num_new_files=10000
mkfs.btrfs -f $DEV
mount -o ssd $DEV $MNT
mkdir $MNT/testdir
for ((i = 1; i <= $num_init_files; i++)); do
echo -n > $MNT/testdir/file_$i
done
echo -n > $MNT/testdir/foo
sync
# Add some files so that there's more work in the transaction other
# than just renaming file foo.
for ((i = 1; i <= $num_new_files; i++)); do
echo -n > $MNT/testdir/new_file_$i
done
# Change the file, fsync it.
setfattr -n user.x1 -v 123 $MNT/testdir/foo
xfs_io -c "fsync" $MNT/testdir/foo
# Now triggger eviction of file foo but no eviction for our test
# directory, since it is being used by the process below. This will
# set logged_trans of the file's inode to 0 once it is loaded again.
(
cd $MNT/testdir
while true; do
:
done
) &
pid=$!
echo 2 > /proc/sys/vm/drop_caches
kill $pid
wait $pid
# Move foo out of our testdir. This will set last_unlink_trans
# of the directory inode to the current transaction, because
# logged_trans of both the directory and the file are set to 0.
mv $MNT/testdir/foo $MNT/foo
# Change file foo again and fsync it.
# This fsync will result in a transaction commit because the rename
# above has set last_unlink_trans of the parent directory to the id
# of the current transaction and because our inode for file foo has
# last_unlink_trans set to the current transaction, since it was
# evicted and reloaded and it was previously modified in the current
# transaction (the xattr addition).
xfs_io -c "pwrite 0 64K" $MNT/foo
start=$(date +%s%N)
xfs_io -c "fsync" $MNT/foo
end=$(date +%s%N)
dur=$(( (end - start) / 1000000 ))
echo "file fsync took: $dur milliseconds"
umount $MNT
Before this patch: fsync took 19 milliseconds
After this patch: fsync took 5 milliseconds
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
fs/btrfs/tree-log.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index ccdfe1d95572..42fe2d0b29e0 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -7329,14 +7329,14 @@ void btrfs_record_unlink_dir(struct btrfs_trans_handle *trans,
* if this directory was already logged any new
* names for this file/dir will get recorded
*/
- if (dir->logged_trans == trans->transid)
+ if (inode_logged(trans, dir, NULL) == 1)
return;
/*
* if the inode we're about to unlink was logged,
* the log will be properly updated for any new names
*/
- if (inode->logged_trans == trans->transid)
+ if (inode_logged(trans, inode, NULL) == 1)
return;
/*
--
2.34.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 3/5] btrfs: update comments at btrfs_record_unlink_dir() to be more clear
2023-05-17 11:02 [PATCH 0/5] btrfs: some minor log tree updates fdmanana
2023-05-17 11:02 ` [PATCH 1/5] btrfs: use inode_logged() at need_log_inode() fdmanana
2023-05-17 11:02 ` [PATCH 2/5] btrfs: use inode_logged() at btrfs_record_unlink_dir() fdmanana
@ 2023-05-17 11:02 ` fdmanana
2023-05-17 11:02 ` [PATCH 4/5] btrfs: remove pointless label and goto at btrfs_record_unlink_dir() fdmanana
` (2 subsequent siblings)
5 siblings, 0 replies; 8+ messages in thread
From: fdmanana @ 2023-05-17 11:02 UTC (permalink / raw)
To: linux-btrfs
From: Filipe Manana <fdmanana@suse.com>
Update the comments at btrfs_record_unlink_dir() so that they mention
where new names are logged and where old names are removed. Also, while
at it make the width of the comments closer to 80 columns and capitalize
the sentences and finish them with punctuation.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
fs/btrfs/tree-log.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 42fe2d0b29e0..225497b07eb7 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -7326,15 +7326,19 @@ void btrfs_record_unlink_dir(struct btrfs_trans_handle *trans,
mutex_unlock(&inode->log_mutex);
/*
- * if this directory was already logged any new
- * names for this file/dir will get recorded
+ * If this directory was already logged, any new names will be logged
+ * with btrfs_log_new_name() and old names will be deleted from the log
+ * tree with btrfs_del_dir_entries_in_log() or with
+ * btrfs_del_inode_ref_in_log().
*/
if (inode_logged(trans, dir, NULL) == 1)
return;
/*
- * if the inode we're about to unlink was logged,
- * the log will be properly updated for any new names
+ * If the inode we're about to unlink was logged before, the log will be
+ * properly updated with the new name with btrfs_log_new_name() and the
+ * old name removed with btrfs_del_dir_entries_in_log() or with
+ * btrfs_del_inode_ref_in_log().
*/
if (inode_logged(trans, inode, NULL) == 1)
return;
--
2.34.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 4/5] btrfs: remove pointless label and goto at btrfs_record_unlink_dir()
2023-05-17 11:02 [PATCH 0/5] btrfs: some minor log tree updates fdmanana
` (2 preceding siblings ...)
2023-05-17 11:02 ` [PATCH 3/5] btrfs: update comments at btrfs_record_unlink_dir() to be more clear fdmanana
@ 2023-05-17 11:02 ` fdmanana
2023-05-17 11:02 ` [PATCH 5/5] btrfs: change for_rename argument of btrfs_record_unlink_dir() to bool fdmanana
2023-05-22 11:20 ` [PATCH 0/5] btrfs: some minor log tree updates David Sterba
5 siblings, 0 replies; 8+ messages in thread
From: fdmanana @ 2023-05-17 11:02 UTC (permalink / raw)
To: linux-btrfs
From: Filipe Manana <fdmanana@suse.com>
There's no point of having a label and goto at btrfs_record_unlink_dir()
because the function is trivial and can just return early if we are not
in a rename context. So remove the label and goto and instead return
early if we are not in a rename.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
fs/btrfs/tree-log.c | 10 +++-------
1 file changed, 3 insertions(+), 7 deletions(-)
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 225497b07eb7..061b132d3f96 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -7325,6 +7325,9 @@ void btrfs_record_unlink_dir(struct btrfs_trans_handle *trans,
inode->last_unlink_trans = trans->transid;
mutex_unlock(&inode->log_mutex);
+ if (!for_rename)
+ return;
+
/*
* If this directory was already logged, any new names will be logged
* with btrfs_log_new_name() and old names will be deleted from the log
@@ -7350,13 +7353,6 @@ void btrfs_record_unlink_dir(struct btrfs_trans_handle *trans,
* properly. So, we have to be conservative and force commits
* so the new name gets discovered.
*/
- if (for_rename)
- goto record;
-
- /* we can safely do the unlink without any special recording */
- return;
-
-record:
mutex_lock(&dir->log_mutex);
dir->last_unlink_trans = trans->transid;
mutex_unlock(&dir->log_mutex);
--
2.34.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 5/5] btrfs: change for_rename argument of btrfs_record_unlink_dir() to bool
2023-05-17 11:02 [PATCH 0/5] btrfs: some minor log tree updates fdmanana
` (3 preceding siblings ...)
2023-05-17 11:02 ` [PATCH 4/5] btrfs: remove pointless label and goto at btrfs_record_unlink_dir() fdmanana
@ 2023-05-17 11:02 ` fdmanana
2023-05-19 4:15 ` Anand Jain
2023-05-22 11:20 ` [PATCH 0/5] btrfs: some minor log tree updates David Sterba
5 siblings, 1 reply; 8+ messages in thread
From: fdmanana @ 2023-05-17 11:02 UTC (permalink / raw)
To: linux-btrfs
From: Filipe Manana <fdmanana@suse.com>
The for_rename argument of btrfs_record_unlink_dir() is defined as an
integer, but the argument is in fact used as a boolean. So change it to
a boolean to make its use more clear.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
fs/btrfs/inode.c | 8 ++++----
fs/btrfs/tree-log.c | 2 +-
fs/btrfs/tree-log.h | 2 +-
3 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 4f9ef5ca3856..456010f48570 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4414,7 +4414,7 @@ static int btrfs_unlink(struct inode *dir, struct dentry *dentry)
}
btrfs_record_unlink_dir(trans, BTRFS_I(dir), BTRFS_I(d_inode(dentry)),
- 0);
+ false);
ret = btrfs_unlink_inode(trans, BTRFS_I(dir), BTRFS_I(d_inode(dentry)),
&fname.disk_name);
@@ -8976,9 +8976,9 @@ static int btrfs_rename_exchange(struct inode *old_dir,
if (old_dentry->d_parent != new_dentry->d_parent) {
btrfs_record_unlink_dir(trans, BTRFS_I(old_dir),
- BTRFS_I(old_inode), 1);
+ BTRFS_I(old_inode), true);
btrfs_record_unlink_dir(trans, BTRFS_I(new_dir),
- BTRFS_I(new_inode), 1);
+ BTRFS_I(new_inode), true);
}
/* src is a subvolume */
@@ -9244,7 +9244,7 @@ static int btrfs_rename(struct mnt_idmap *idmap,
if (old_dentry->d_parent != new_dentry->d_parent)
btrfs_record_unlink_dir(trans, BTRFS_I(old_dir),
- BTRFS_I(old_inode), 1);
+ BTRFS_I(old_inode), true);
if (unlikely(old_ino == BTRFS_FIRST_FREE_OBJECTID)) {
ret = btrfs_unlink_subvol(trans, BTRFS_I(old_dir), old_dentry);
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 061b132d3f96..98488f6e4e88 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -7309,7 +7309,7 @@ int btrfs_recover_log_trees(struct btrfs_root *log_root_tree)
*/
void btrfs_record_unlink_dir(struct btrfs_trans_handle *trans,
struct btrfs_inode *dir, struct btrfs_inode *inode,
- int for_rename)
+ bool for_rename)
{
/*
* when we're logging a file, if it hasn't been renamed
diff --git a/fs/btrfs/tree-log.h b/fs/btrfs/tree-log.h
index bdeb5216718f..a550a8a375cd 100644
--- a/fs/btrfs/tree-log.h
+++ b/fs/btrfs/tree-log.h
@@ -100,7 +100,7 @@ void btrfs_end_log_trans(struct btrfs_root *root);
void btrfs_pin_log_trans(struct btrfs_root *root);
void btrfs_record_unlink_dir(struct btrfs_trans_handle *trans,
struct btrfs_inode *dir, struct btrfs_inode *inode,
- int for_rename);
+ bool for_rename);
void btrfs_record_snapshot_destroy(struct btrfs_trans_handle *trans,
struct btrfs_inode *dir);
void btrfs_log_new_name(struct btrfs_trans_handle *trans,
--
2.34.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 5/5] btrfs: change for_rename argument of btrfs_record_unlink_dir() to bool
2023-05-17 11:02 ` [PATCH 5/5] btrfs: change for_rename argument of btrfs_record_unlink_dir() to bool fdmanana
@ 2023-05-19 4:15 ` Anand Jain
0 siblings, 0 replies; 8+ messages in thread
From: Anand Jain @ 2023-05-19 4:15 UTC (permalink / raw)
To: fdmanana, linux-btrfs
On 17/5/23 19:02, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
>
> The for_rename argument of btrfs_record_unlink_dir() is defined as an
> integer, but the argument is in fact used as a boolean. So change it to
> a boolean to make its use more clear.
>
> Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 0/5] btrfs: some minor log tree updates
2023-05-17 11:02 [PATCH 0/5] btrfs: some minor log tree updates fdmanana
` (4 preceding siblings ...)
2023-05-17 11:02 ` [PATCH 5/5] btrfs: change for_rename argument of btrfs_record_unlink_dir() to bool fdmanana
@ 2023-05-22 11:20 ` David Sterba
5 siblings, 0 replies; 8+ messages in thread
From: David Sterba @ 2023-05-22 11:20 UTC (permalink / raw)
To: fdmanana; +Cc: linux-btrfs
On Wed, May 17, 2023 at 12:02:11PM +0100, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
>
> Add two optimizations to avoid falling back to transaction commits after
> an inode is evicted and some cleanups. More details on the changelogs.
>
> Filipe Manana (5):
> btrfs: use inode_logged() at need_log_inode()
> btrfs: use inode_logged() at btrfs_record_unlink_dir()
> btrfs: update comments at btrfs_record_unlink_dir() to be more clear
> btrfs: remove pointless label and goto at btrfs_record_unlink_dir()
> btrfs: change for_rename argument of btrfs_record_unlink_dir() to bool
Added to misc-next, thanks.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2023-05-22 11:27 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-05-17 11:02 [PATCH 0/5] btrfs: some minor log tree updates fdmanana
2023-05-17 11:02 ` [PATCH 1/5] btrfs: use inode_logged() at need_log_inode() fdmanana
2023-05-17 11:02 ` [PATCH 2/5] btrfs: use inode_logged() at btrfs_record_unlink_dir() fdmanana
2023-05-17 11:02 ` [PATCH 3/5] btrfs: update comments at btrfs_record_unlink_dir() to be more clear fdmanana
2023-05-17 11:02 ` [PATCH 4/5] btrfs: remove pointless label and goto at btrfs_record_unlink_dir() fdmanana
2023-05-17 11:02 ` [PATCH 5/5] btrfs: change for_rename argument of btrfs_record_unlink_dir() to bool fdmanana
2023-05-19 4:15 ` Anand Jain
2023-05-22 11:20 ` [PATCH 0/5] btrfs: some minor log tree updates David Sterba
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.