* [PATCH v8 0/9] Ext4 Fast Commit Performance Patchset
@ 2025-04-14 16:54 Harshad Shirwadkar
2025-04-14 16:54 ` [PATCH v8 1/9] ext4: convert i_fc_lock to spinlock Harshad Shirwadkar
` (9 more replies)
0 siblings, 10 replies; 14+ messages in thread
From: Harshad Shirwadkar @ 2025-04-14 16:54 UTC (permalink / raw)
To: linux-ext4; +Cc: tytso, jack, harshads, Harshad Shirwadkar
V8 Cover Letter
---------------
This is the V8 of the patch series. This patch series contains fixes to
review comments by Jan Kara (<jack@suse.cz>). The main changes are as
follows:
- sbi->s_fc_lock is now a mutex type, and it is not released during
commit path. This gets rid of a number of soft lockups due to race
with ext4_fc_del().
- I have dropped "ext4: make fast commit ineligible on
ext4_reserve_inode_write failure" patch.
- I have split "ext4: hold s_fc_lock while during fast commit" into
two patches: "ext4: convert s_fc_lock to mutex type" and "ext4: hold
s_fc_lock while during fast commit" to make reviewing easy.
V7 Cover Letter
---------------
This is the V7 of the patch series. This patch series contains fixes to
review comments by Jan Kara (<jack@suse.cz>). The main changes are as
follows:
- As discussed in V6 review, I have dropped "ext4: add nolock mode to
ext4_map_blocks()" patch given that we now ensure that i_data_sem is
always grabbed before ext4_mark_inode_dirty() is called.
- I have also dropped "ext4: introduce selective flushing in fast commit"
given that correctly implementing that would require more changes, and I
think they would be best done outside of this series.
- I added "ext4: introduce selective flushing in fast commit" as the last
patch in the series. While testing log group tests I found a few failures
which get fixed due to this patch.
V6 Cover Letter
---------------
This is the V6 of the patch series. This patch series contains fixes to a
bunch of kernel build warnings reported by Kernel Test Robot
(lkp@intel.com) and Dan Carpenter (dan.carpenter@linaro.org). Thank you!
V5 Cover Letter
---------------
This patch series supersedes the patch series "ext4: remove journal barrier
during fast commit" sent in July 2022[1]. Except following three patches
all the other patches are newly introduced in this series.
* ext4: convert i_fc_lock to spinlock
* ext4: for committing inode, make ext4_fc_track_inode wait
* ext4: drop i_fc_updates from inode fc info
This patchset does not introduce any on-disk format and / or user visible
API change. This patchset reworks fast-commit's commit path improve overall
performance of the commit path. Following optimizations have been added in
this series:
* Avoid having to lock the journal throughout the fast commit.
* Remove tracking of open handles per inode.
* Avoid issuing cache flushes when fast commits are contained within a
single FUA writes and there is no data that needs flushing.
* Temporarily lift committing thread's priority to match that of the
journal thread to reduce scheduling delays.
With the changes introduced in this patch series, now the commit path for
fast commits is as follows:
1. Lock the journal by calling jbd2_journal_lock_updates_no_rsv(). This
ensures that all the exsiting handles finish and no new handles can
start.
2. Mark all the fast commit eligible inodes as undergoing fast commit by
setting "EXT4_STATE_FC_COMMITTING" state.
3. Unlock the journal by calling jbd2_journal_unlock_updates. This allows
starting of new handles. If new handles try to start an update on any of
the inodes that are being committed, ext4_fc_track_inode() will block
until those inodes have finished the fast commit.
4. Submit data buffers of all the committing inodes.
5. Wait for [4] to complete.
6. Commit all the directory entry updates in the fast commit space.
7. Commit all the changed inodes in the fast commit space and clear
"EXT4_STATE_FC_COMMITTING" for all the inodes.
8. Write tail tag to ensure atomicity of commits.
(The above flow has been documented in the code as well)
I verified that the patch series introduces no regressions in "log" groups
when "fast_commit" feature is enabled.
Also, we have a paper on fast commits in USENIX ATC 2024 this year which
should become available on the website[2] in a few months.
[1] https://lwn.net/Articles/902022/
[2] https://www.usenix.org/conference/atc24
Harshad Shirwadkar (9):
ext4: convert i_fc_lock to spinlock
ext4: for committing inode, make ext4_fc_track_inode wait
ext4: mark inode dirty before grabbing i_data_sem in ext4_setattr
ext4: rework fast commit commit path
ext4: drop i_fc_updates from inode fc info
ext4: update code documentation
ext4: temporarily elevate commit thread priority
ext4: convert s_fc_lock to mutex type
ext4: hold s_fc_lock while during fast commit
fs/ext4/ext4.h | 18 +--
fs/ext4/fast_commit.c | 365 ++++++++++++++++++++----------------------
fs/ext4/inline.c | 1 +
fs/ext4/inode.c | 12 +-
fs/ext4/super.c | 9 +-
fs/jbd2/journal.c | 2 -
6 files changed, 201 insertions(+), 206 deletions(-)
--
2.49.0.604.gff1f9ca942-goog
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v8 1/9] ext4: convert i_fc_lock to spinlock
2025-04-14 16:54 [PATCH v8 0/9] Ext4 Fast Commit Performance Patchset Harshad Shirwadkar
@ 2025-04-14 16:54 ` Harshad Shirwadkar
2025-04-14 16:54 ` [PATCH v8 2/9] ext4: for committing inode, make ext4_fc_track_inode wait Harshad Shirwadkar
` (8 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Harshad Shirwadkar @ 2025-04-14 16:54 UTC (permalink / raw)
To: linux-ext4; +Cc: tytso, jack, harshads, Harshad Shirwadkar
Convert ext4_inode_info->i_fc_lock to spinlock to avoid sleeping
in invalid contexts.
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
---
fs/ext4/ext4.h | 7 +++++--
fs/ext4/fast_commit.c | 19 +++++++++----------
fs/ext4/super.c | 2 +-
3 files changed, 15 insertions(+), 13 deletions(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 5a20e9cd7184..79dfb57a7046 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1069,8 +1069,11 @@ struct ext4_inode_info {
/* Fast commit wait queue for this inode */
wait_queue_head_t i_fc_wait;
- /* Protect concurrent accesses on i_fc_lblk_start, i_fc_lblk_len */
- struct mutex i_fc_lock;
+ /*
+ * Protect concurrent accesses on i_fc_lblk_start, i_fc_lblk_len
+ * and inode's EXT4_FC_STATE_COMMITTING state bit.
+ */
+ spinlock_t i_fc_lock;
/*
* i_disksize keeps track of what the inode size is ON DISK, not
diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index da4263a14a20..63859ec6d91d 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -385,7 +385,7 @@ static int ext4_fc_track_template(
int ret;
tid = handle->h_transaction->t_tid;
- mutex_lock(&ei->i_fc_lock);
+ spin_lock(&ei->i_fc_lock);
if (tid == ei->i_sync_tid) {
update = true;
} else {
@@ -393,8 +393,7 @@ static int ext4_fc_track_template(
ei->i_sync_tid = tid;
}
ret = __fc_track_fn(handle, inode, args, update);
- mutex_unlock(&ei->i_fc_lock);
-
+ spin_unlock(&ei->i_fc_lock);
if (!enqueue)
return ret;
@@ -428,19 +427,19 @@ static int __track_dentry_update(handle_t *handle, struct inode *inode,
struct super_block *sb = inode->i_sb;
struct ext4_sb_info *sbi = EXT4_SB(sb);
- mutex_unlock(&ei->i_fc_lock);
+ spin_unlock(&ei->i_fc_lock);
if (IS_ENCRYPTED(dir)) {
ext4_fc_mark_ineligible(sb, EXT4_FC_REASON_ENCRYPTED_FILENAME,
handle);
- mutex_lock(&ei->i_fc_lock);
+ spin_lock(&ei->i_fc_lock);
return -EOPNOTSUPP;
}
node = kmem_cache_alloc(ext4_fc_dentry_cachep, GFP_NOFS);
if (!node) {
ext4_fc_mark_ineligible(sb, EXT4_FC_REASON_NOMEM, handle);
- mutex_lock(&ei->i_fc_lock);
+ spin_lock(&ei->i_fc_lock);
return -ENOMEM;
}
@@ -471,7 +470,7 @@ static int __track_dentry_update(handle_t *handle, struct inode *inode,
list_add_tail(&node->fcd_dilist, &ei->i_fc_dilist);
}
spin_unlock(&sbi->s_fc_lock);
- mutex_lock(&ei->i_fc_lock);
+ spin_lock(&ei->i_fc_lock);
return 0;
}
@@ -893,15 +892,15 @@ static int ext4_fc_write_inode_data(struct inode *inode, u32 *crc)
struct ext4_extent *ex;
int ret;
- mutex_lock(&ei->i_fc_lock);
+ spin_lock(&ei->i_fc_lock);
if (ei->i_fc_lblk_len == 0) {
- mutex_unlock(&ei->i_fc_lock);
+ spin_unlock(&ei->i_fc_lock);
return 0;
}
old_blk_size = ei->i_fc_lblk_start;
new_blk_size = ei->i_fc_lblk_start + ei->i_fc_lblk_len - 1;
ei->i_fc_lblk_len = 0;
- mutex_unlock(&ei->i_fc_lock);
+ spin_unlock(&ei->i_fc_lock);
cur_lblk_off = old_blk_size;
ext4_debug("will try writing %d to %d for inode %ld\n",
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 8122d4ffb3b5..2cf92657fdcd 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1415,7 +1415,7 @@ static struct inode *ext4_alloc_inode(struct super_block *sb)
ei->i_datasync_tid = 0;
INIT_WORK(&ei->i_rsv_conversion_work, ext4_end_io_rsv_work);
ext4_fc_init_inode(&ei->vfs_inode);
- mutex_init(&ei->i_fc_lock);
+ spin_lock_init(&ei->i_fc_lock);
return &ei->vfs_inode;
}
--
2.49.0.604.gff1f9ca942-goog
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v8 2/9] ext4: for committing inode, make ext4_fc_track_inode wait
2025-04-14 16:54 [PATCH v8 0/9] Ext4 Fast Commit Performance Patchset Harshad Shirwadkar
2025-04-14 16:54 ` [PATCH v8 1/9] ext4: convert i_fc_lock to spinlock Harshad Shirwadkar
@ 2025-04-14 16:54 ` Harshad Shirwadkar
2025-04-14 16:54 ` [PATCH v8 3/9] ext4: mark inode dirty before grabbing i_data_sem in ext4_setattr Harshad Shirwadkar
` (7 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Harshad Shirwadkar @ 2025-04-14 16:54 UTC (permalink / raw)
To: linux-ext4; +Cc: tytso, jack, harshads, Harshad Shirwadkar
If the inode that's being requested to track using ext4_fc_track_inode
is being committed, then wait until the inode finishes the
commit. Also, add calls to ext4_fc_track_inode at the right places.
With this patch, now calling ext4_reserve_inode_write() results in
inode being tracked for next fast commit. This ensures that by the
time ext4_reserve_inode_write() returns, it is ready to be modified
and won't be committed until the corresponding handle is open.
A subtle lock ordering requirement with i_data_sem (which is
documented in the code) requires that ext4_fc_track_inode() be called
before grabbing i_data_sem. So, this patch also adds explicit
ext4_fc_track_inode() calls in places where i_data_sem grabbed.
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
---
fs/ext4/fast_commit.c | 35 +++++++++++++++++++++++++++++++++++
fs/ext4/inline.c | 1 +
fs/ext4/inode.c | 5 +++++
3 files changed, 41 insertions(+)
diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index 63859ec6d91d..c4d3c71d5e6c 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -12,6 +12,7 @@
#include "ext4_extents.h"
#include "mballoc.h"
+#include <linux/lockdep.h>
/*
* Ext4 Fast Commits
* -----------------
@@ -570,6 +571,8 @@ static int __track_inode(handle_t *handle, struct inode *inode, void *arg,
void ext4_fc_track_inode(handle_t *handle, struct inode *inode)
{
+ struct ext4_inode_info *ei = EXT4_I(inode);
+ wait_queue_head_t *wq;
int ret;
if (S_ISDIR(inode->i_mode))
@@ -587,6 +590,38 @@ void ext4_fc_track_inode(handle_t *handle, struct inode *inode)
if (ext4_test_mount_flag(inode->i_sb, EXT4_MF_FC_INELIGIBLE))
return;
+ if (!list_empty(&ei->i_fc_list))
+ return;
+
+ /*
+ * If we come here, we may sleep while waiting for the inode to
+ * commit. We shouldn't be holding i_data_sem when we go to sleep since
+ * the commit path needs to grab the lock while committing the inode.
+ */
+ lockdep_assert_not_held(&ei->i_data_sem);
+
+ while (ext4_test_inode_state(inode, EXT4_STATE_FC_COMMITTING)) {
+#if (BITS_PER_LONG < 64)
+ DEFINE_WAIT_BIT(wait, &ei->i_state_flags,
+ EXT4_STATE_FC_COMMITTING);
+ wq = bit_waitqueue(&ei->i_state_flags,
+ EXT4_STATE_FC_COMMITTING);
+#else
+ DEFINE_WAIT_BIT(wait, &ei->i_flags,
+ EXT4_STATE_FC_COMMITTING);
+ wq = bit_waitqueue(&ei->i_flags,
+ EXT4_STATE_FC_COMMITTING);
+#endif
+ prepare_to_wait(wq, &wait.wq_entry, TASK_UNINTERRUPTIBLE);
+ if (ext4_test_inode_state(inode, EXT4_STATE_FC_COMMITTING))
+ schedule();
+ finish_wait(wq, &wait.wq_entry);
+ }
+
+ /*
+ * From this point on, this inode will not be committed either
+ * by fast or full commit as long as the handle is open.
+ */
ret = ext4_fc_track_template(handle, inode, __track_inode, NULL, 1);
trace_ext4_fc_track_inode(handle, inode, ret);
}
diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
index f608f6554b95..271e7c93e477 100644
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -601,6 +601,7 @@ static int ext4_convert_inline_data_to_extent(struct address_space *mapping,
goto out;
}
+ ext4_fc_track_inode(handle, inode);
ret = ext4_destroy_inline_data_nolock(handle, inode);
if (ret)
goto out;
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index aede80fa1781..cd0879f5f178 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -693,6 +693,8 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
if (!(flags & EXT4_GET_BLOCKS_CONVERT_UNWRITTEN))
return retval;
+
+ ext4_fc_track_inode(handle, inode);
/*
* New blocks allocate and/or writing to unwritten extent
* will possibly result in updating i_data, so we take
@@ -4079,6 +4081,7 @@ int ext4_punch_hole(struct file *file, loff_t offset, loff_t length)
if (end_lblk > start_lblk) {
ext4_lblk_t hole_len = end_lblk - start_lblk;
+ ext4_fc_track_inode(handle, inode);
down_write(&EXT4_I(inode)->i_data_sem);
ext4_discard_preallocations(inode);
@@ -4231,6 +4234,7 @@ int ext4_truncate(struct inode *inode)
if (err)
goto out_stop;
+ ext4_fc_track_inode(handle, inode);
down_write(&EXT4_I(inode)->i_data_sem);
ext4_discard_preallocations(inode);
@@ -5874,6 +5878,7 @@ ext4_reserve_inode_write(handle_t *handle, struct inode *inode,
brelse(iloc->bh);
iloc->bh = NULL;
}
+ ext4_fc_track_inode(handle, inode);
}
ext4_std_error(inode->i_sb, err);
return err;
--
2.49.0.604.gff1f9ca942-goog
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v8 3/9] ext4: mark inode dirty before grabbing i_data_sem in ext4_setattr
2025-04-14 16:54 [PATCH v8 0/9] Ext4 Fast Commit Performance Patchset Harshad Shirwadkar
2025-04-14 16:54 ` [PATCH v8 1/9] ext4: convert i_fc_lock to spinlock Harshad Shirwadkar
2025-04-14 16:54 ` [PATCH v8 2/9] ext4: for committing inode, make ext4_fc_track_inode wait Harshad Shirwadkar
@ 2025-04-14 16:54 ` Harshad Shirwadkar
2025-04-14 16:54 ` [PATCH v8 4/9] ext4: rework fast commit commit path Harshad Shirwadkar
` (6 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Harshad Shirwadkar @ 2025-04-14 16:54 UTC (permalink / raw)
To: linux-ext4; +Cc: tytso, jack, harshads, Harshad Shirwadkar
Mark inode dirty first and then grab i_data_sem in ext4_setattr().
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
---
fs/ext4/inode.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index cd0879f5f178..cb50786121a6 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5547,9 +5547,7 @@ int ext4_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
down_write(&EXT4_I(inode)->i_data_sem);
old_disksize = EXT4_I(inode)->i_disksize;
EXT4_I(inode)->i_disksize = attr->ia_size;
- rc = ext4_mark_inode_dirty(handle, inode);
- if (!error)
- error = rc;
+
/*
* We have to update i_size under i_data_sem together
* with i_disksize to avoid races with writeback code
@@ -5560,6 +5558,9 @@ int ext4_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
else
EXT4_I(inode)->i_disksize = old_disksize;
up_write(&EXT4_I(inode)->i_data_sem);
+ rc = ext4_mark_inode_dirty(handle, inode);
+ if (!error)
+ error = rc;
ext4_journal_stop(handle);
if (error)
goto out_mmap_sem;
--
2.49.0.604.gff1f9ca942-goog
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v8 4/9] ext4: rework fast commit commit path
2025-04-14 16:54 [PATCH v8 0/9] Ext4 Fast Commit Performance Patchset Harshad Shirwadkar
` (2 preceding siblings ...)
2025-04-14 16:54 ` [PATCH v8 3/9] ext4: mark inode dirty before grabbing i_data_sem in ext4_setattr Harshad Shirwadkar
@ 2025-04-14 16:54 ` Harshad Shirwadkar
2025-04-14 16:54 ` [PATCH v8 5/9] ext4: drop i_fc_updates from inode fc info Harshad Shirwadkar
` (5 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Harshad Shirwadkar @ 2025-04-14 16:54 UTC (permalink / raw)
To: linux-ext4; +Cc: tytso, jack, harshads, Harshad Shirwadkar
This patch reworks fast commit's commit path to remove locking the
journal for the entire duration of a fast commit. Instead, we only lock
the journal while marking all the eligible inodes as "committing". This
allows handles to make progress in parallel with the fast commit.
Change since last review:
I have updated the ext4_fc_cleanup() to make it more readable. Could
you please take a look at it again?
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
merge with rework
---
fs/ext4/fast_commit.c | 94 +++++++++++++++++++++++++++----------------
fs/jbd2/journal.c | 2 -
2 files changed, 60 insertions(+), 36 deletions(-)
diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index c4d3c71d5e6c..126df97944c0 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -291,20 +291,30 @@ void ext4_fc_del(struct inode *inode)
if (ext4_fc_disabled(inode->i_sb))
return;
-restart:
spin_lock(&sbi->s_fc_lock);
if (list_empty(&ei->i_fc_list) && list_empty(&ei->i_fc_dilist)) {
spin_unlock(&sbi->s_fc_lock);
return;
}
- if (ext4_test_inode_state(inode, EXT4_STATE_FC_COMMITTING)) {
- ext4_fc_wait_committing_inode(inode);
- goto restart;
- }
-
- if (!list_empty(&ei->i_fc_list))
- list_del_init(&ei->i_fc_list);
+ /*
+ * Since ext4_fc_del is called from ext4_evict_inode while having a
+ * handle open, there is no need for us to wait here even if a fast
+ * commit is going on. That is because, if this inode is being
+ * committed, ext4_mark_inode_dirty would have waited for inode commit
+ * operation to finish before we come here. So, by the time we come
+ * here, inode's EXT4_STATE_FC_COMMITTING would have been cleared. So,
+ * we shouldn't see EXT4_STATE_FC_COMMITTING to be set on this inode
+ * here.
+ *
+ * We may come here without any handles open in the "no_delete" case of
+ * ext4_evict_inode as well. However, if that happens, we first mark the
+ * file system as fast commit ineligible anyway. So, even in that case,
+ * it is okay to remove the inode from the fc list.
+ */
+ WARN_ON(ext4_test_inode_state(inode, EXT4_STATE_FC_COMMITTING)
+ && !ext4_test_mount_flag(inode->i_sb, EXT4_MF_FC_INELIGIBLE));
+ list_del_init(&ei->i_fc_list);
/*
* Since this inode is getting removed, let's also remove all FC
@@ -325,8 +335,6 @@ void ext4_fc_del(struct inode *inode)
release_dentry_name_snapshot(&fc_dentry->fcd_name);
kmem_cache_free(ext4_fc_dentry_cachep, fc_dentry);
-
- return;
}
/*
@@ -998,19 +1006,6 @@ static int ext4_fc_submit_inode_data_all(journal_t *journal)
spin_lock(&sbi->s_fc_lock);
list_for_each_entry(ei, &sbi->s_fc_q[FC_Q_MAIN], i_fc_list) {
- ext4_set_inode_state(&ei->vfs_inode, EXT4_STATE_FC_COMMITTING);
- while (atomic_read(&ei->i_fc_updates)) {
- DEFINE_WAIT(wait);
-
- prepare_to_wait(&ei->i_fc_wait, &wait,
- TASK_UNINTERRUPTIBLE);
- if (atomic_read(&ei->i_fc_updates)) {
- spin_unlock(&sbi->s_fc_lock);
- schedule();
- spin_lock(&sbi->s_fc_lock);
- }
- finish_wait(&ei->i_fc_wait, &wait);
- }
spin_unlock(&sbi->s_fc_lock);
ret = jbd2_submit_inode_data(journal, ei->jinode);
if (ret)
@@ -1123,6 +1118,19 @@ static int ext4_fc_perform_commit(journal_t *journal)
int ret = 0;
u32 crc = 0;
+ /*
+ * Wait for all the handles of the current transaction to complete
+ * and then lock the journal.
+ */
+ jbd2_journal_lock_updates(journal);
+ spin_lock(&sbi->s_fc_lock);
+ list_for_each_entry(iter, &sbi->s_fc_q[FC_Q_MAIN], i_fc_list) {
+ ext4_set_inode_state(&iter->vfs_inode,
+ EXT4_STATE_FC_COMMITTING);
+ }
+ spin_unlock(&sbi->s_fc_lock);
+ jbd2_journal_unlock_updates(journal);
+
ret = ext4_fc_submit_inode_data_all(journal);
if (ret)
return ret;
@@ -1173,6 +1181,18 @@ static int ext4_fc_perform_commit(journal_t *journal)
ret = ext4_fc_write_inode(inode, &crc);
if (ret)
goto out;
+ ext4_clear_inode_state(inode, EXT4_STATE_FC_COMMITTING);
+ /*
+ * Make sure clearing of EXT4_STATE_FC_COMMITTING is
+ * visible before we send the wakeup. Pairs with implicit
+ * barrier in prepare_to_wait() in ext4_fc_track_inode().
+ */
+ smp_mb();
+#if (BITS_PER_LONG < 64)
+ wake_up_bit(&iter->i_state_flags, EXT4_STATE_FC_COMMITTING);
+#else
+ wake_up_bit(&iter->i_flags, EXT4_STATE_FC_COMMITTING);
+#endif
spin_lock(&sbi->s_fc_lock);
}
spin_unlock(&sbi->s_fc_lock);
@@ -1298,7 +1318,7 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
{
struct super_block *sb = journal->j_private;
struct ext4_sb_info *sbi = EXT4_SB(sb);
- struct ext4_inode_info *iter, *iter_n;
+ struct ext4_inode_info *ei;
struct ext4_fc_dentry_update *fc_dentry;
if (full && sbi->s_fc_bh)
@@ -1308,13 +1328,15 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
jbd2_fc_release_bufs(journal);
spin_lock(&sbi->s_fc_lock);
- list_for_each_entry_safe(iter, iter_n, &sbi->s_fc_q[FC_Q_MAIN],
- i_fc_list) {
- list_del_init(&iter->i_fc_list);
- ext4_clear_inode_state(&iter->vfs_inode,
+ while (!list_empty(&sbi->s_fc_q[FC_Q_MAIN])) {
+ ei = list_first_entry(&sbi->s_fc_q[FC_Q_MAIN],
+ struct ext4_inode_info,
+ i_fc_list);
+ list_del_init(&ei->i_fc_list);
+ ext4_clear_inode_state(&ei->vfs_inode,
EXT4_STATE_FC_COMMITTING);
- if (tid_geq(tid, iter->i_sync_tid)) {
- ext4_fc_reset_inode(&iter->vfs_inode);
+ if (tid_geq(tid, ei->i_sync_tid)) {
+ ext4_fc_reset_inode(&ei->vfs_inode);
} else if (full) {
/*
* We are called after a full commit, inode has been
@@ -1325,15 +1347,19 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
* time in that case (and tid doesn't increase so
* tid check above isn't reliable).
*/
- list_add_tail(&EXT4_I(&iter->vfs_inode)->i_fc_list,
+ list_add_tail(&ei->i_fc_list,
&sbi->s_fc_q[FC_Q_STAGING]);
}
- /* Make sure EXT4_STATE_FC_COMMITTING bit is clear */
+ /*
+ * Make sure clearing of EXT4_STATE_FC_COMMITTING is
+ * visible before we send the wakeup. Pairs with implicit
+ * barrier in prepare_to_wait() in ext4_fc_track_inode().
+ */
smp_mb();
#if (BITS_PER_LONG < 64)
- wake_up_bit(&iter->i_state_flags, EXT4_STATE_FC_COMMITTING);
+ wake_up_bit(&ei->i_state_flags, EXT4_STATE_FC_COMMITTING);
#else
- wake_up_bit(&iter->i_flags, EXT4_STATE_FC_COMMITTING);
+ wake_up_bit(&ei->i_flags, EXT4_STATE_FC_COMMITTING);
#endif
}
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index a5ccba25ff47..82004a3439b4 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -728,7 +728,6 @@ int jbd2_fc_begin_commit(journal_t *journal, tid_t tid)
}
journal->j_flags |= JBD2_FAST_COMMIT_ONGOING;
write_unlock(&journal->j_state_lock);
- jbd2_journal_lock_updates(journal);
return 0;
}
@@ -742,7 +741,6 @@ static int __jbd2_fc_end_commit(journal_t *journal, tid_t tid, bool fallback)
{
if (journal->j_fc_cleanup_callback)
journal->j_fc_cleanup_callback(journal, 0, tid);
- jbd2_journal_unlock_updates(journal);
write_lock(&journal->j_state_lock);
journal->j_flags &= ~JBD2_FAST_COMMIT_ONGOING;
if (fallback)
--
2.49.0.604.gff1f9ca942-goog
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v8 5/9] ext4: drop i_fc_updates from inode fc info
2025-04-14 16:54 [PATCH v8 0/9] Ext4 Fast Commit Performance Patchset Harshad Shirwadkar
` (3 preceding siblings ...)
2025-04-14 16:54 ` [PATCH v8 4/9] ext4: rework fast commit commit path Harshad Shirwadkar
@ 2025-04-14 16:54 ` Harshad Shirwadkar
2025-04-14 16:54 ` [PATCH v8 6/9] ext4: update code documentation Harshad Shirwadkar
` (4 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Harshad Shirwadkar @ 2025-04-14 16:54 UTC (permalink / raw)
To: linux-ext4; +Cc: tytso, jack, harshads, Harshad Shirwadkar
The new logic introduced in this series does not require tracking number
of active handles open on an inode. So, drop it.
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
---
fs/ext4/ext4.h | 5 ----
fs/ext4/fast_commit.c | 68 -------------------------------------------
2 files changed, 73 deletions(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 79dfb57a7046..68f40fa1b0eb 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1061,9 +1061,6 @@ struct ext4_inode_info {
/* End of lblk range that needs to be committed in this fast commit */
ext4_lblk_t i_fc_lblk_len;
- /* Number of ongoing updates on this inode */
- atomic_t i_fc_updates;
-
spinlock_t i_raw_lock; /* protects updates to the raw inode */
/* Fast commit wait queue for this inode */
@@ -2925,8 +2922,6 @@ void __ext4_fc_track_create(handle_t *handle, struct inode *inode,
void ext4_fc_track_create(handle_t *handle, struct dentry *dentry);
void ext4_fc_track_inode(handle_t *handle, struct inode *inode);
void ext4_fc_mark_ineligible(struct super_block *sb, int reason, handle_t *handle);
-void ext4_fc_start_update(struct inode *inode);
-void ext4_fc_stop_update(struct inode *inode);
void ext4_fc_del(struct inode *inode);
bool ext4_fc_replay_check_excluded(struct super_block *sb, ext4_fsblk_t block);
void ext4_fc_replay_cleanup(struct super_block *sb);
diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index 126df97944c0..2b12f5031633 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -202,32 +202,6 @@ void ext4_fc_init_inode(struct inode *inode)
INIT_LIST_HEAD(&ei->i_fc_list);
INIT_LIST_HEAD(&ei->i_fc_dilist);
init_waitqueue_head(&ei->i_fc_wait);
- atomic_set(&ei->i_fc_updates, 0);
-}
-
-/* This function must be called with sbi->s_fc_lock held. */
-static void ext4_fc_wait_committing_inode(struct inode *inode)
-__releases(&EXT4_SB(inode->i_sb)->s_fc_lock)
-{
- wait_queue_head_t *wq;
- struct ext4_inode_info *ei = EXT4_I(inode);
-
-#if (BITS_PER_LONG < 64)
- DEFINE_WAIT_BIT(wait, &ei->i_state_flags,
- EXT4_STATE_FC_COMMITTING);
- wq = bit_waitqueue(&ei->i_state_flags,
- EXT4_STATE_FC_COMMITTING);
-#else
- DEFINE_WAIT_BIT(wait, &ei->i_flags,
- EXT4_STATE_FC_COMMITTING);
- wq = bit_waitqueue(&ei->i_flags,
- EXT4_STATE_FC_COMMITTING);
-#endif
- lockdep_assert_held(&EXT4_SB(inode->i_sb)->s_fc_lock);
- prepare_to_wait(wq, &wait.wq_entry, TASK_UNINTERRUPTIBLE);
- spin_unlock(&EXT4_SB(inode->i_sb)->s_fc_lock);
- schedule();
- finish_wait(wq, &wait.wq_entry);
}
static bool ext4_fc_disabled(struct super_block *sb)
@@ -236,48 +210,6 @@ static bool ext4_fc_disabled(struct super_block *sb)
(EXT4_SB(sb)->s_mount_state & EXT4_FC_REPLAY));
}
-/*
- * Inform Ext4's fast about start of an inode update
- *
- * This function is called by the high level call VFS callbacks before
- * performing any inode update. This function blocks if there's an ongoing
- * fast commit on the inode in question.
- */
-void ext4_fc_start_update(struct inode *inode)
-{
- struct ext4_inode_info *ei = EXT4_I(inode);
-
- if (ext4_fc_disabled(inode->i_sb))
- return;
-
-restart:
- spin_lock(&EXT4_SB(inode->i_sb)->s_fc_lock);
- if (list_empty(&ei->i_fc_list))
- goto out;
-
- if (ext4_test_inode_state(inode, EXT4_STATE_FC_COMMITTING)) {
- ext4_fc_wait_committing_inode(inode);
- goto restart;
- }
-out:
- atomic_inc(&ei->i_fc_updates);
- spin_unlock(&EXT4_SB(inode->i_sb)->s_fc_lock);
-}
-
-/*
- * Stop inode update and wake up waiting fast commits if any.
- */
-void ext4_fc_stop_update(struct inode *inode)
-{
- struct ext4_inode_info *ei = EXT4_I(inode);
-
- if (ext4_fc_disabled(inode->i_sb))
- return;
-
- if (atomic_dec_and_test(&ei->i_fc_updates))
- wake_up_all(&ei->i_fc_wait);
-}
-
/*
* Remove inode from fast commit list. If the inode is being committed
* we wait until inode commit is done.
--
2.49.0.604.gff1f9ca942-goog
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v8 6/9] ext4: update code documentation
2025-04-14 16:54 [PATCH v8 0/9] Ext4 Fast Commit Performance Patchset Harshad Shirwadkar
` (4 preceding siblings ...)
2025-04-14 16:54 ` [PATCH v8 5/9] ext4: drop i_fc_updates from inode fc info Harshad Shirwadkar
@ 2025-04-14 16:54 ` Harshad Shirwadkar
2025-04-14 16:54 ` [PATCH v8 7/9] ext4: temporarily elevate commit thread priority Harshad Shirwadkar
` (3 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Harshad Shirwadkar @ 2025-04-14 16:54 UTC (permalink / raw)
To: linux-ext4; +Cc: tytso, jack, harshads, Harshad Shirwadkar
This patch updates code documentation to reflect the commit path changes
made in this series.
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
---
fs/ext4/fast_commit.c | 45 +++++++++++++++++++++++++++----------------
1 file changed, 28 insertions(+), 17 deletions(-)
diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index 2b12f5031633..43fcdeb2dac0 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -50,19 +50,24 @@
* that need to be committed during a fast commit in another in memory queue of
* inodes. During the commit operation, we commit in the following order:
*
- * [1] Lock inodes for any further data updates by setting COMMITTING state
- * [2] Submit data buffers of all the inodes
- * [3] Wait for [2] to complete
- * [4] Commit all the directory entry updates in the fast commit space
- * [5] Commit all the changed inode structures
- * [6] Write tail tag (this tag ensures the atomicity, please read the following
+ * [1] Lock the journal by calling jbd2_journal_lock_updates. This ensures that
+ * all the exsiting handles finish and no new handles can start.
+ * [2] Mark all the fast commit eligible inodes as undergoing fast commit
+ * by setting "EXT4_STATE_FC_COMMITTING" state.
+ * [3] Unlock the journal by calling jbd2_journal_unlock_updates. This allows
+ * starting of new handles. If new handles try to start an update on
+ * any of the inodes that are being committed, ext4_fc_track_inode()
+ * will block until those inodes have finished the fast commit.
+ * [4] Submit data buffers of all the committing inodes.
+ * [5] Wait for [4] to complete.
+ * [6] Commit all the directory entry updates in the fast commit space.
+ * [7] Commit all the changed inodes in the fast commit space and clear
+ * "EXT4_STATE_FC_COMMITTING" for these inodes.
+ * [8] Write tail tag (this tag ensures the atomicity, please read the following
* section for more details).
- * [7] Wait for [4], [5] and [6] to complete.
*
- * All the inode updates must call ext4_fc_start_update() before starting an
- * update. If such an ongoing update is present, fast commit waits for it to
- * complete. The completion of such an update is marked by
- * ext4_fc_stop_update().
+ * All the inode updates must be enclosed within jbd2_jounrnal_start()
+ * and jbd2_journal_stop() similar to JBD2 journaling.
*
* Fast Commit Ineligibility
* -------------------------
@@ -143,6 +148,13 @@
* similarly. Thus, by converting a non-idempotent procedure into a series of
* idempotent outcomes, fast commits ensured idempotence during the replay.
*
+ * Locking
+ * -------
+ * sbi->s_fc_lock protects the fast commit inodes queue and the fast commit
+ * dentry queue. ei->i_fc_lock protects the fast commit related info in a given
+ * inode. Most of the code avoids acquiring both the locks, but if one must do
+ * that then sbi->s_fc_lock must be acquired before ei->i_fc_lock.
+ *
* TODOs
* -----
*
@@ -157,13 +169,12 @@
* fast commit recovery even if that area is invalidated by later full
* commits.
*
- * 1) Fast commit's commit path locks the entire file system during fast
- * commit. This has significant performance penalty. Instead of that, we
- * should use ext4_fc_start/stop_update functions to start inode level
- * updates from ext4_journal_start/stop. Once we do that we can drop file
- * system locking during commit path.
+ * 1) Handle more ineligible cases.
*
- * 2) Handle more ineligible cases.
+ * 2) Change ext4_fc_commit() to lookup logical to physical mapping using extent
+ * status tree. This would get rid of the need to call ext4_fc_track_inode()
+ * before acquiring i_data_sem. To do that we would need to ensure that
+ * modified extents from the extent status tree are not evicted from memory.
*/
#include <trace/events/ext4.h>
--
2.49.0.604.gff1f9ca942-goog
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v8 7/9] ext4: temporarily elevate commit thread priority
2025-04-14 16:54 [PATCH v8 0/9] Ext4 Fast Commit Performance Patchset Harshad Shirwadkar
` (5 preceding siblings ...)
2025-04-14 16:54 ` [PATCH v8 6/9] ext4: update code documentation Harshad Shirwadkar
@ 2025-04-14 16:54 ` Harshad Shirwadkar
2025-04-14 16:54 ` [PATCH v8 8/9] ext4: convert s_fc_lock to mutex type Harshad Shirwadkar
` (2 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Harshad Shirwadkar @ 2025-04-14 16:54 UTC (permalink / raw)
To: linux-ext4; +Cc: tytso, jack, harshads, Harshad Shirwadkar
Unlike JBD2 based full commits, there is no dedicated journal thread
for fast commits. Thus to reduce scheduling delays between IO
submission and completion, temporarily elevate the committer thread's
priority to match the configured priority of the JBD2 journal
thread.
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
---
fs/ext4/ext4.h | 4 +++-
fs/ext4/fast_commit.c | 13 +++++++++++++
fs/ext4/super.c | 5 ++---
3 files changed, 18 insertions(+), 4 deletions(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 68f40fa1b0eb..faba91321aab 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2295,10 +2295,12 @@ static inline int ext4_emergency_state(struct super_block *sb)
#define EXT4_DEFM_NODELALLOC 0x0800
/*
- * Default journal batch times
+ * Default journal batch times and ioprio.
*/
#define EXT4_DEF_MIN_BATCH_TIME 0
#define EXT4_DEF_MAX_BATCH_TIME 15000 /* 15ms */
+#define EXT4_DEF_JOURNAL_IOPRIO (IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, 3))
+
/*
* Default values for superblock update
diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index 43fcdeb2dac0..3b441452f3cf 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -1188,6 +1188,7 @@ int ext4_fc_commit(journal_t *journal, tid_t commit_tid)
int subtid = atomic_read(&sbi->s_fc_subtid);
int status = EXT4_FC_STATUS_OK, fc_bufs_before = 0;
ktime_t start_time, commit_time;
+ int old_ioprio, journal_ioprio;
if (!test_opt2(sb, JOURNAL_FAST_COMMIT))
return jbd2_complete_transaction(journal, commit_tid);
@@ -1195,6 +1196,7 @@ int ext4_fc_commit(journal_t *journal, tid_t commit_tid)
trace_ext4_fc_commit_start(sb, commit_tid);
start_time = ktime_get();
+ old_ioprio = get_current_ioprio();
restart_fc:
ret = jbd2_fc_begin_commit(journal, commit_tid);
@@ -1225,6 +1227,15 @@ int ext4_fc_commit(journal_t *journal, tid_t commit_tid)
goto fallback;
}
+ /*
+ * Now that we know that this thread is going to do a fast commit,
+ * elevate the priority to match that of the journal thread.
+ */
+ if (journal->j_task->io_context)
+ journal_ioprio = sbi->s_journal->j_task->io_context->ioprio;
+ else
+ journal_ioprio = EXT4_DEF_JOURNAL_IOPRIO;
+ set_task_ioprio(current, journal_ioprio);
fc_bufs_before = (sbi->s_fc_bytes + bsize - 1) / bsize;
ret = ext4_fc_perform_commit(journal);
if (ret < 0) {
@@ -1239,6 +1250,7 @@ int ext4_fc_commit(journal_t *journal, tid_t commit_tid)
}
atomic_inc(&sbi->s_fc_subtid);
ret = jbd2_fc_end_commit(journal);
+ set_task_ioprio(current, old_ioprio);
/*
* weight the commit time higher than the average time so we
* don't react too strongly to vast changes in the commit time
@@ -1248,6 +1260,7 @@ int ext4_fc_commit(journal_t *journal, tid_t commit_tid)
return ret;
fallback:
+ set_task_ioprio(current, old_ioprio);
ret = jbd2_fc_end_commit_fallback(journal);
ext4_fc_update_stats(sb, status, 0, 0, commit_tid);
return ret;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 2cf92657fdcd..0a4ca1c8e5ce 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1809,7 +1809,6 @@ static const struct fs_parameter_spec ext4_param_specs[] = {
{}
};
-#define DEFAULT_JOURNAL_IOPRIO (IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, 3))
#define MOPT_SET 0x0001
#define MOPT_CLEAR 0x0002
@@ -5255,7 +5254,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
/* Set defaults for the variables that will be set during parsing */
if (!(ctx->spec & EXT4_SPEC_JOURNAL_IOPRIO))
- ctx->journal_ioprio = DEFAULT_JOURNAL_IOPRIO;
+ ctx->journal_ioprio = EXT4_DEF_JOURNAL_IOPRIO;
sbi->s_inode_readahead_blks = EXT4_DEF_INODE_READAHEAD_BLKS;
sbi->s_sectors_written_start =
@@ -6495,7 +6494,7 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb)
ctx->journal_ioprio =
sbi->s_journal->j_task->io_context->ioprio;
else
- ctx->journal_ioprio = DEFAULT_JOURNAL_IOPRIO;
+ ctx->journal_ioprio = EXT4_DEF_JOURNAL_IOPRIO;
}
--
2.49.0.604.gff1f9ca942-goog
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v8 8/9] ext4: convert s_fc_lock to mutex type
2025-04-14 16:54 [PATCH v8 0/9] Ext4 Fast Commit Performance Patchset Harshad Shirwadkar
` (6 preceding siblings ...)
2025-04-14 16:54 ` [PATCH v8 7/9] ext4: temporarily elevate commit thread priority Harshad Shirwadkar
@ 2025-04-14 16:54 ` Harshad Shirwadkar
2025-04-14 16:54 ` [PATCH v8 9/9] ext4: hold s_fc_lock while during fast commit Harshad Shirwadkar
2025-04-24 14:59 ` [PATCH v8 0/9] Ext4 Fast Commit Performance Patchset Theodore Ts'o
9 siblings, 0 replies; 14+ messages in thread
From: Harshad Shirwadkar @ 2025-04-14 16:54 UTC (permalink / raw)
To: linux-ext4; +Cc: tytso, jack, harshads, Harshad Shirwadkar
This allows us to hold s_fc_lock during kmem_cache_* functions, which
is needed in the following patch.
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
---
fs/ext4/ext4.h | 2 +-
fs/ext4/fast_commit.c | 68 +++++++++++++++++++++----------------------
fs/ext4/super.c | 2 +-
3 files changed, 36 insertions(+), 36 deletions(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index faba91321aab..0fd198c740e2 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1754,7 +1754,7 @@ struct ext4_sb_info {
* following fields:
* ei->i_fc_list, s_fc_dentry_q, s_fc_q, s_fc_bytes, s_fc_bh.
*/
- spinlock_t s_fc_lock;
+ struct mutex s_fc_lock;
struct buffer_head *s_fc_bh;
struct ext4_fc_stats s_fc_stats;
tid_t s_fc_ineligible_tid;
diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index 3b441452f3cf..89895ba2e011 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -234,9 +234,9 @@ void ext4_fc_del(struct inode *inode)
if (ext4_fc_disabled(inode->i_sb))
return;
- spin_lock(&sbi->s_fc_lock);
+ mutex_lock(&sbi->s_fc_lock);
if (list_empty(&ei->i_fc_list) && list_empty(&ei->i_fc_dilist)) {
- spin_unlock(&sbi->s_fc_lock);
+ mutex_unlock(&sbi->s_fc_lock);
return;
}
@@ -264,7 +264,7 @@ void ext4_fc_del(struct inode *inode)
* dentry create references, since it is not needed to log it anyways.
*/
if (list_empty(&ei->i_fc_dilist)) {
- spin_unlock(&sbi->s_fc_lock);
+ mutex_unlock(&sbi->s_fc_lock);
return;
}
@@ -274,7 +274,7 @@ void ext4_fc_del(struct inode *inode)
list_del_init(&fc_dentry->fcd_dilist);
WARN_ON(!list_empty(&ei->i_fc_dilist));
- spin_unlock(&sbi->s_fc_lock);
+ mutex_unlock(&sbi->s_fc_lock);
release_dentry_name_snapshot(&fc_dentry->fcd_name);
kmem_cache_free(ext4_fc_dentry_cachep, fc_dentry);
@@ -305,12 +305,12 @@ void ext4_fc_mark_ineligible(struct super_block *sb, int reason, handle_t *handl
has_transaction = false;
read_unlock(&sbi->s_journal->j_state_lock);
}
- spin_lock(&sbi->s_fc_lock);
+ mutex_lock(&sbi->s_fc_lock);
is_ineligible = ext4_test_mount_flag(sb, EXT4_MF_FC_INELIGIBLE);
if (has_transaction && (!is_ineligible || tid_gt(tid, sbi->s_fc_ineligible_tid)))
sbi->s_fc_ineligible_tid = tid;
ext4_set_mount_flag(sb, EXT4_MF_FC_INELIGIBLE);
- spin_unlock(&sbi->s_fc_lock);
+ mutex_unlock(&sbi->s_fc_lock);
WARN_ON(reason >= EXT4_FC_REASON_MAX);
sbi->s_fc_stats.fc_ineligible_reason_count[reason]++;
}
@@ -349,14 +349,14 @@ static int ext4_fc_track_template(
if (!enqueue)
return ret;
- spin_lock(&sbi->s_fc_lock);
+ mutex_lock(&sbi->s_fc_lock);
if (list_empty(&EXT4_I(inode)->i_fc_list))
list_add_tail(&EXT4_I(inode)->i_fc_list,
(sbi->s_journal->j_flags & JBD2_FULL_COMMIT_ONGOING ||
sbi->s_journal->j_flags & JBD2_FAST_COMMIT_ONGOING) ?
&sbi->s_fc_q[FC_Q_STAGING] :
&sbi->s_fc_q[FC_Q_MAIN]);
- spin_unlock(&sbi->s_fc_lock);
+ mutex_unlock(&sbi->s_fc_lock);
return ret;
}
@@ -400,7 +400,7 @@ static int __track_dentry_update(handle_t *handle, struct inode *inode,
node->fcd_ino = inode->i_ino;
take_dentry_name_snapshot(&node->fcd_name, dentry);
INIT_LIST_HEAD(&node->fcd_dilist);
- spin_lock(&sbi->s_fc_lock);
+ mutex_lock(&sbi->s_fc_lock);
if (sbi->s_journal->j_flags & JBD2_FULL_COMMIT_ONGOING ||
sbi->s_journal->j_flags & JBD2_FAST_COMMIT_ONGOING)
list_add_tail(&node->fcd_list,
@@ -421,7 +421,7 @@ static int __track_dentry_update(handle_t *handle, struct inode *inode,
WARN_ON(!list_empty(&ei->i_fc_dilist));
list_add_tail(&node->fcd_dilist, &ei->i_fc_dilist);
}
- spin_unlock(&sbi->s_fc_lock);
+ mutex_unlock(&sbi->s_fc_lock);
spin_lock(&ei->i_fc_lock);
return 0;
@@ -947,15 +947,15 @@ static int ext4_fc_submit_inode_data_all(journal_t *journal)
struct ext4_inode_info *ei;
int ret = 0;
- spin_lock(&sbi->s_fc_lock);
+ mutex_lock(&sbi->s_fc_lock);
list_for_each_entry(ei, &sbi->s_fc_q[FC_Q_MAIN], i_fc_list) {
- spin_unlock(&sbi->s_fc_lock);
+ mutex_unlock(&sbi->s_fc_lock);
ret = jbd2_submit_inode_data(journal, ei->jinode);
if (ret)
return ret;
- spin_lock(&sbi->s_fc_lock);
+ mutex_lock(&sbi->s_fc_lock);
}
- spin_unlock(&sbi->s_fc_lock);
+ mutex_unlock(&sbi->s_fc_lock);
return ret;
}
@@ -968,19 +968,19 @@ static int ext4_fc_wait_inode_data_all(journal_t *journal)
struct ext4_inode_info *pos, *n;
int ret = 0;
- spin_lock(&sbi->s_fc_lock);
+ mutex_lock(&sbi->s_fc_lock);
list_for_each_entry_safe(pos, n, &sbi->s_fc_q[FC_Q_MAIN], i_fc_list) {
if (!ext4_test_inode_state(&pos->vfs_inode,
EXT4_STATE_FC_COMMITTING))
continue;
- spin_unlock(&sbi->s_fc_lock);
+ mutex_unlock(&sbi->s_fc_lock);
ret = jbd2_wait_inode_data(journal, pos->jinode);
if (ret)
return ret;
- spin_lock(&sbi->s_fc_lock);
+ mutex_lock(&sbi->s_fc_lock);
}
- spin_unlock(&sbi->s_fc_lock);
+ mutex_unlock(&sbi->s_fc_lock);
return 0;
}
@@ -1002,12 +1002,12 @@ __releases(&sbi->s_fc_lock)
list_for_each_entry_safe(fc_dentry, fc_dentry_n,
&sbi->s_fc_dentry_q[FC_Q_MAIN], fcd_list) {
if (fc_dentry->fcd_op != EXT4_FC_TAG_CREAT) {
- spin_unlock(&sbi->s_fc_lock);
+ mutex_unlock(&sbi->s_fc_lock);
if (!ext4_fc_add_dentry_tlv(sb, crc, fc_dentry)) {
ret = -ENOSPC;
goto lock_and_exit;
}
- spin_lock(&sbi->s_fc_lock);
+ mutex_lock(&sbi->s_fc_lock);
continue;
}
/*
@@ -1020,7 +1020,7 @@ __releases(&sbi->s_fc_lock)
inode = &ei->vfs_inode;
WARN_ON(inode->i_ino != fc_dentry->fcd_ino);
- spin_unlock(&sbi->s_fc_lock);
+ mutex_unlock(&sbi->s_fc_lock);
/*
* We first write the inode and then the create dirent. This
@@ -1042,11 +1042,11 @@ __releases(&sbi->s_fc_lock)
goto lock_and_exit;
}
- spin_lock(&sbi->s_fc_lock);
+ mutex_lock(&sbi->s_fc_lock);
}
return 0;
lock_and_exit:
- spin_lock(&sbi->s_fc_lock);
+ mutex_lock(&sbi->s_fc_lock);
return ret;
}
@@ -1066,12 +1066,12 @@ static int ext4_fc_perform_commit(journal_t *journal)
* and then lock the journal.
*/
jbd2_journal_lock_updates(journal);
- spin_lock(&sbi->s_fc_lock);
+ mutex_lock(&sbi->s_fc_lock);
list_for_each_entry(iter, &sbi->s_fc_q[FC_Q_MAIN], i_fc_list) {
ext4_set_inode_state(&iter->vfs_inode,
EXT4_STATE_FC_COMMITTING);
}
- spin_unlock(&sbi->s_fc_lock);
+ mutex_unlock(&sbi->s_fc_lock);
jbd2_journal_unlock_updates(journal);
ret = ext4_fc_submit_inode_data_all(journal);
@@ -1105,10 +1105,10 @@ static int ext4_fc_perform_commit(journal_t *journal)
}
}
- spin_lock(&sbi->s_fc_lock);
+ mutex_lock(&sbi->s_fc_lock);
ret = ext4_fc_commit_dentry_updates(journal, &crc);
if (ret) {
- spin_unlock(&sbi->s_fc_lock);
+ mutex_unlock(&sbi->s_fc_lock);
goto out;
}
@@ -1117,7 +1117,7 @@ static int ext4_fc_perform_commit(journal_t *journal)
if (!ext4_test_inode_state(inode, EXT4_STATE_FC_COMMITTING))
continue;
- spin_unlock(&sbi->s_fc_lock);
+ mutex_unlock(&sbi->s_fc_lock);
ret = ext4_fc_write_inode_data(inode, &crc);
if (ret)
goto out;
@@ -1136,9 +1136,9 @@ static int ext4_fc_perform_commit(journal_t *journal)
#else
wake_up_bit(&iter->i_flags, EXT4_STATE_FC_COMMITTING);
#endif
- spin_lock(&sbi->s_fc_lock);
+ mutex_lock(&sbi->s_fc_lock);
}
- spin_unlock(&sbi->s_fc_lock);
+ mutex_unlock(&sbi->s_fc_lock);
ret = ext4_fc_write_tail(sb, crc);
@@ -1283,7 +1283,7 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
trace_ext4_fc_cleanup(journal, full, tid);
jbd2_fc_release_bufs(journal);
- spin_lock(&sbi->s_fc_lock);
+ mutex_lock(&sbi->s_fc_lock);
while (!list_empty(&sbi->s_fc_q[FC_Q_MAIN])) {
ei = list_first_entry(&sbi->s_fc_q[FC_Q_MAIN],
struct ext4_inode_info,
@@ -1325,11 +1325,11 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
fcd_list);
list_del_init(&fc_dentry->fcd_list);
list_del_init(&fc_dentry->fcd_dilist);
- spin_unlock(&sbi->s_fc_lock);
+ mutex_unlock(&sbi->s_fc_lock);
release_dentry_name_snapshot(&fc_dentry->fcd_name);
kmem_cache_free(ext4_fc_dentry_cachep, fc_dentry);
- spin_lock(&sbi->s_fc_lock);
+ mutex_lock(&sbi->s_fc_lock);
}
list_splice_init(&sbi->s_fc_dentry_q[FC_Q_STAGING],
@@ -1344,7 +1344,7 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
if (full)
sbi->s_fc_bytes = 0;
- spin_unlock(&sbi->s_fc_lock);
+ mutex_unlock(&sbi->s_fc_lock);
trace_ext4_fc_stats(sb);
}
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 0a4ca1c8e5ce..e1da15499e6f 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -4481,7 +4481,7 @@ static void ext4_fast_commit_init(struct super_block *sb)
sbi->s_fc_bytes = 0;
ext4_clear_mount_flag(sb, EXT4_MF_FC_INELIGIBLE);
sbi->s_fc_ineligible_tid = 0;
- spin_lock_init(&sbi->s_fc_lock);
+ mutex_init(&sbi->s_fc_lock);
memset(&sbi->s_fc_stats, 0, sizeof(sbi->s_fc_stats));
sbi->s_fc_replay_state.fc_regions = NULL;
sbi->s_fc_replay_state.fc_regions_size = 0;
--
2.49.0.604.gff1f9ca942-goog
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v8 9/9] ext4: hold s_fc_lock while during fast commit
2025-04-14 16:54 [PATCH v8 0/9] Ext4 Fast Commit Performance Patchset Harshad Shirwadkar
` (7 preceding siblings ...)
2025-04-14 16:54 ` [PATCH v8 8/9] ext4: convert s_fc_lock to mutex type Harshad Shirwadkar
@ 2025-04-14 16:54 ` Harshad Shirwadkar
2025-04-24 14:59 ` [PATCH v8 0/9] Ext4 Fast Commit Performance Patchset Theodore Ts'o
9 siblings, 0 replies; 14+ messages in thread
From: Harshad Shirwadkar @ 2025-04-14 16:54 UTC (permalink / raw)
To: linux-ext4; +Cc: tytso, jack, harshads, Harshad Shirwadkar
Leaving s_fc_lock in between during commit in ext4_fc_perform_commit()
function leaves room for subtle concurrency bugs where ext4_fc_del() may
delete an inode from the fast commit list, leaving list in an inconsistent
state.
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
---
fs/ext4/fast_commit.c | 57 ++++++++++++-------------------------------
1 file changed, 16 insertions(+), 41 deletions(-)
diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index 89895ba2e011..1c28ef79654c 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -400,6 +400,7 @@ static int __track_dentry_update(handle_t *handle, struct inode *inode,
node->fcd_ino = inode->i_ino;
take_dentry_name_snapshot(&node->fcd_name, dentry);
INIT_LIST_HEAD(&node->fcd_dilist);
+ INIT_LIST_HEAD(&node->fcd_list);
mutex_lock(&sbi->s_fc_lock);
if (sbi->s_journal->j_flags & JBD2_FULL_COMMIT_ONGOING ||
sbi->s_journal->j_flags & JBD2_FAST_COMMIT_ONGOING)
@@ -949,11 +950,9 @@ static int ext4_fc_submit_inode_data_all(journal_t *journal)
mutex_lock(&sbi->s_fc_lock);
list_for_each_entry(ei, &sbi->s_fc_q[FC_Q_MAIN], i_fc_list) {
- mutex_unlock(&sbi->s_fc_lock);
ret = jbd2_submit_inode_data(journal, ei->jinode);
if (ret)
- return ret;
- mutex_lock(&sbi->s_fc_lock);
+ break;
}
mutex_unlock(&sbi->s_fc_lock);
@@ -973,22 +972,18 @@ static int ext4_fc_wait_inode_data_all(journal_t *journal)
if (!ext4_test_inode_state(&pos->vfs_inode,
EXT4_STATE_FC_COMMITTING))
continue;
- mutex_unlock(&sbi->s_fc_lock);
ret = jbd2_wait_inode_data(journal, pos->jinode);
if (ret)
- return ret;
- mutex_lock(&sbi->s_fc_lock);
+ break;
}
mutex_unlock(&sbi->s_fc_lock);
- return 0;
+ return ret;
}
/* Commit all the directory entry updates */
static int ext4_fc_commit_dentry_updates(journal_t *journal, u32 *crc)
-__acquires(&sbi->s_fc_lock)
-__releases(&sbi->s_fc_lock)
{
struct super_block *sb = journal->j_private;
struct ext4_sb_info *sbi = EXT4_SB(sb);
@@ -1002,26 +997,22 @@ __releases(&sbi->s_fc_lock)
list_for_each_entry_safe(fc_dentry, fc_dentry_n,
&sbi->s_fc_dentry_q[FC_Q_MAIN], fcd_list) {
if (fc_dentry->fcd_op != EXT4_FC_TAG_CREAT) {
- mutex_unlock(&sbi->s_fc_lock);
- if (!ext4_fc_add_dentry_tlv(sb, crc, fc_dentry)) {
- ret = -ENOSPC;
- goto lock_and_exit;
- }
- mutex_lock(&sbi->s_fc_lock);
+ if (!ext4_fc_add_dentry_tlv(sb, crc, fc_dentry))
+ return -ENOSPC;
continue;
}
/*
* With fcd_dilist we need not loop in sbi->s_fc_q to get the
- * corresponding inode pointer
+ * corresponding inode. Also, the corresponding inode could have been
+ * deleted, in which case, we don't need to do anything.
*/
- WARN_ON(list_empty(&fc_dentry->fcd_dilist));
+ if (list_empty(&fc_dentry->fcd_dilist))
+ continue;
ei = list_first_entry(&fc_dentry->fcd_dilist,
struct ext4_inode_info, i_fc_dilist);
inode = &ei->vfs_inode;
WARN_ON(inode->i_ino != fc_dentry->fcd_ino);
- mutex_unlock(&sbi->s_fc_lock);
-
/*
* We first write the inode and then the create dirent. This
* allows the recovery code to create an unnamed inode first
@@ -1031,23 +1022,14 @@ __releases(&sbi->s_fc_lock)
*/
ret = ext4_fc_write_inode(inode, crc);
if (ret)
- goto lock_and_exit;
-
+ return ret;
ret = ext4_fc_write_inode_data(inode, crc);
if (ret)
- goto lock_and_exit;
-
- if (!ext4_fc_add_dentry_tlv(sb, crc, fc_dentry)) {
- ret = -ENOSPC;
- goto lock_and_exit;
- }
-
- mutex_lock(&sbi->s_fc_lock);
+ return ret;
+ if (!ext4_fc_add_dentry_tlv(sb, crc, fc_dentry))
+ return -ENOSPC;
}
return 0;
-lock_and_exit:
- mutex_lock(&sbi->s_fc_lock);
- return ret;
}
static int ext4_fc_perform_commit(journal_t *journal)
@@ -1107,17 +1089,14 @@ static int ext4_fc_perform_commit(journal_t *journal)
mutex_lock(&sbi->s_fc_lock);
ret = ext4_fc_commit_dentry_updates(journal, &crc);
- if (ret) {
- mutex_unlock(&sbi->s_fc_lock);
+ if (ret)
goto out;
- }
list_for_each_entry(iter, &sbi->s_fc_q[FC_Q_MAIN], i_fc_list) {
inode = &iter->vfs_inode;
if (!ext4_test_inode_state(inode, EXT4_STATE_FC_COMMITTING))
continue;
- mutex_unlock(&sbi->s_fc_lock);
ret = ext4_fc_write_inode_data(inode, &crc);
if (ret)
goto out;
@@ -1136,13 +1115,11 @@ static int ext4_fc_perform_commit(journal_t *journal)
#else
wake_up_bit(&iter->i_flags, EXT4_STATE_FC_COMMITTING);
#endif
- mutex_lock(&sbi->s_fc_lock);
}
- mutex_unlock(&sbi->s_fc_lock);
-
ret = ext4_fc_write_tail(sb, crc);
out:
+ mutex_unlock(&sbi->s_fc_lock);
blk_finish_plug(&plug);
return ret;
}
@@ -1325,11 +1302,9 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
fcd_list);
list_del_init(&fc_dentry->fcd_list);
list_del_init(&fc_dentry->fcd_dilist);
- mutex_unlock(&sbi->s_fc_lock);
release_dentry_name_snapshot(&fc_dentry->fcd_name);
kmem_cache_free(ext4_fc_dentry_cachep, fc_dentry);
- mutex_lock(&sbi->s_fc_lock);
}
list_splice_init(&sbi->s_fc_dentry_q[FC_Q_STAGING],
--
2.49.0.604.gff1f9ca942-goog
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v8 0/9] Ext4 Fast Commit Performance Patchset
2025-04-14 16:54 [PATCH v8 0/9] Ext4 Fast Commit Performance Patchset Harshad Shirwadkar
` (8 preceding siblings ...)
2025-04-14 16:54 ` [PATCH v8 9/9] ext4: hold s_fc_lock while during fast commit Harshad Shirwadkar
@ 2025-04-24 14:59 ` Theodore Ts'o
2025-04-30 18:55 ` harshad shirwadkar
9 siblings, 1 reply; 14+ messages in thread
From: Theodore Ts'o @ 2025-04-24 14:59 UTC (permalink / raw)
To: Harshad Shirwadkar; +Cc: linux-ext4, jack, harshads
On Mon, Apr 14, 2025 at 04:54:07PM +0000, Harshad Shirwadkar wrote:
> V8 Cover Letter
> ---------------
>
> This is the V8 of the patch series. This patch series contains fixes to
> review comments by Jan Kara (<jack@suse.cz>). The main changes are as
> follows:
Hi Harshad,
I tried applying your patch set on top of 6.15-rc3, and a number of
tests: generic/127, generic/231, generic/241, generic/418, and
generic/589 are causing the kernel to OOPS, wedge, or reboot. Most of
the test flakes and test failures were there without your patch set,
and we need to figure them out.... but Errors are new, and are
regressions.
I can send you the test artifacts under separate cover, or you can
just try running those tests using kvm-xfstests or gce-xfstests.
Thanks,
- Ted
TESTRUNID: ltm-20250423233632
KERNEL: kernel 6.15.0-rc3-xfstests-00009-gac4ab1811bb3 #22 SMP PREEMPT_DYNAMIC Wed Apr 23 14:28:04 EDT 2025 x86_64
CMDLINE: --kernel gs://gce-xfstests/kernel.deb -c ext4/4k,ext4/fast_commit,ext4/fast_commit_1k -g auto
CPUS: 2
MEM: 7680
ext4/4k: 587 tests, 55 skipped, 5243 seconds
ext4/fast_commit: 602 tests, 20 failures, 3 errors, 55 skipped, 4564 seconds
Failures: generic/506 generic/737 generic/757 generic/764
Errors: generic/127 generic/241 generic/418
ext4/fast_commit_1k: 616 tests, 39 failures, 5 errors, 59 skipped, 6169 seconds
Failures: generic/047 generic/051 generic/455 generic/475 generic/506
generic/757 generic/764
Flaky: generic/627: 80% (4/5)
Errors: generic/127 generic/231 generic/241 generic/418 generic/589
Totals: 1805 tests, 169 skipped, 59 failures, 8 errors, 15637s
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v8 0/9] Ext4 Fast Commit Performance Patchset
2025-04-24 14:59 ` [PATCH v8 0/9] Ext4 Fast Commit Performance Patchset Theodore Ts'o
@ 2025-04-30 18:55 ` harshad shirwadkar
2025-05-01 4:14 ` Theodore Ts'o
0 siblings, 1 reply; 14+ messages in thread
From: harshad shirwadkar @ 2025-04-30 18:55 UTC (permalink / raw)
To: Theodore Ts'o; +Cc: linux-ext4, jack, harshads
Hi Ted,
I tried to run this on my end, and I didn't see the errors that you
pointed out. Here's my run:
Failures: ext4/049 generic/459 generic/506 generic/645
Failed 4 of 553 tests
Xunit report: /results/ext4/results-fast_commit/result.xml
END TEST: Ext4 4k block w/fast_commit Wed Apr 30 18:15:26 UTC 2025
-------------------- Summary report
KERNEL: kernel 6.14.0-rc2-xfstests-perf-gff9ebf4dde0d #31 SMP
PREEMPT_DYNAMIC Sat Apr 12 19:35:39 UTC 2025 x86_64
CMDLINE: -c fast_commit -g auto
I explicitly also ran the tests were causing error for you and I still
don't see any failures:
Ran: generic/127 generic/241 generic/418
Passed all 3 tests
Xunit report: /results/ext4/results-fast_commit/result.xml
-------------------- Summary report
KERNEL: kernel 6.14.0-rc2-xfstests-perf-gff9ebf4dde0d #31 SMP
PREEMPT_DYNAMIC Sat Apr 12 19:35:39 UTC 2025 x86_64
CMDLINE: -c fast_commit generic/127 generic/241 generic/418
Maybe my patches are conflicting with some other patches that are
being merged? Let's talk about this in tomorrow's call.
Thank you,
Harshad
On Fri, Apr 25, 2025 at 5:41 PM Theodore Ts'o <tytso@mit.edu> wrote:
>
> On Mon, Apr 14, 2025 at 04:54:07PM +0000, Harshad Shirwadkar wrote:
> > V8 Cover Letter
> > ---------------
> >
> > This is the V8 of the patch series. This patch series contains fixes to
> > review comments by Jan Kara (<jack@suse.cz>). The main changes are as
> > follows:
>
> Hi Harshad,
>
> I tried applying your patch set on top of 6.15-rc3, and a number of
> tests: generic/127, generic/231, generic/241, generic/418, and
> generic/589 are causing the kernel to OOPS, wedge, or reboot. Most of
> the test flakes and test failures were there without your patch set,
> and we need to figure them out.... but Errors are new, and are
> regressions.
>
> I can send you the test artifacts under separate cover, or you can
> just try running those tests using kvm-xfstests or gce-xfstests.
>
> Thanks,
>
> - Ted
>
> TESTRUNID: ltm-20250423233632
> KERNEL: kernel 6.15.0-rc3-xfstests-00009-gac4ab1811bb3 #22 SMP PREEMPT_DYNAMIC Wed Apr 23 14:28:04 EDT 2025 x86_64
> CMDLINE: --kernel gs://gce-xfstests/kernel.deb -c ext4/4k,ext4/fast_commit,ext4/fast_commit_1k -g auto
> CPUS: 2
> MEM: 7680
>
> ext4/4k: 587 tests, 55 skipped, 5243 seconds
> ext4/fast_commit: 602 tests, 20 failures, 3 errors, 55 skipped, 4564 seconds
> Failures: generic/506 generic/737 generic/757 generic/764
> Errors: generic/127 generic/241 generic/418
> ext4/fast_commit_1k: 616 tests, 39 failures, 5 errors, 59 skipped, 6169 seconds
> Failures: generic/047 generic/051 generic/455 generic/475 generic/506
> generic/757 generic/764
> Flaky: generic/627: 80% (4/5)
> Errors: generic/127 generic/231 generic/241 generic/418 generic/589
> Totals: 1805 tests, 169 skipped, 59 failures, 8 errors, 15637s
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v8 0/9] Ext4 Fast Commit Performance Patchset
2025-04-30 18:55 ` harshad shirwadkar
@ 2025-05-01 4:14 ` Theodore Ts'o
2025-05-01 13:37 ` Theodore Ts'o
0 siblings, 1 reply; 14+ messages in thread
From: Theodore Ts'o @ 2025-05-01 4:14 UTC (permalink / raw)
To: harshad shirwadkar; +Cc: linux-ext4, jack, harshads
On Wed, Apr 30, 2025 at 11:55:48AM -0700, harshad shirwadkar wrote:
>
> I tried to run this on my end, and I didn't see the errors that you
> pointed out. Here's my run:
>
> -------------------- Summary report
> KERNEL: kernel 6.14.0-rc2-xfstests-perf-gff9ebf4dde0d #31 SMP
> PREEMPT_DYNAMIC Sat Apr 12 19:35:39 UTC 2025 x86_64
> CMDLINE: -c fast_commit -g auto
I was applying your patches on 6.15-rc3 and this evening I tried
applying them against 6.15-rc2, which has been failing as well.
There's nothing else on the ext4 dev branch yet, but there were a
large number of ext4 patches which landed between 6.14-rc2 and
6.15-rc2.
- Ted
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v8 0/9] Ext4 Fast Commit Performance Patchset
2025-05-01 4:14 ` Theodore Ts'o
@ 2025-05-01 13:37 ` Theodore Ts'o
0 siblings, 0 replies; 14+ messages in thread
From: Theodore Ts'o @ 2025-05-01 13:37 UTC (permalink / raw)
To: harshad shirwadkar; +Cc: linux-ext4, jack, harshads
On Thu, May 01, 2025 at 12:14:04AM -0400, Theodore Ts'o wrote:
>
> I was applying your patches on 6.15-rc3 and this evening I tried
> applying them against 6.15-rc2, which has been failing as well.
>
> There's nothing else on the ext4 dev branch yet, but there were a
> large number of ext4 patches which landed between 6.14-rc2 and
> 6.15-rc2.
I tried bisecting your patch series using:
kvm-xfstests -c ext4/fast_commit generic/127 generic/241 generic/418
and the first bad commit was "ext4: rework fast commit commit path".
- Ted
git bisect start
# status: waiting for both good and bad commits
# good: [9c32cda43eb78f78c73aee4aa344b777714e259b] Linux 6.15-rc3
git bisect good 9c32cda43eb78f78c73aee4aa344b777714e259b
# status: waiting for bad commit, 1 good commit known
# bad: [a147d000c6914a51becf0d32bbb8c9124e45f6ed] ext4: hold s_fc_lock while during fast commit
git bisect bad a147d000c6914a51becf0d32bbb8c9124e45f6ed
# good: [8ffd015db85fea3e15a77027fda6c02ced4d2444] Linux 6.15-rc2
git bisect good 8ffd015db85fea3e15a77027fda6c02ced4d2444
# bad: [bfd1ce278bb3cfea0ad7017de608691a83c372bf] ext4: rework fast commit commit path
git bisect bad bfd1ce278bb3cfea0ad7017de608691a83c372bf
# good: [36eb0b696b73895b7e40dbf3043275fdc9bcb1fe] ext4: for committing inode, make ext4_fc_track_inode wait
git bisect good 36eb0b696b73895b7e40dbf3043275fdc9bcb1fe
# good: [94cee94db3acfe189463f4343ca7bd9c83570cb7] ext4: mark inode dirty before grabbing i_data_sem in ext4_setattr
git bisect good 94cee94db3acfe189463f4343ca7bd9c83570cb7
# first bad commit: [bfd1ce278bb3cfea0ad7017de608691a83c372bf] ext4: rework fast commit commit path
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2025-05-01 13:37 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-14 16:54 [PATCH v8 0/9] Ext4 Fast Commit Performance Patchset Harshad Shirwadkar
2025-04-14 16:54 ` [PATCH v8 1/9] ext4: convert i_fc_lock to spinlock Harshad Shirwadkar
2025-04-14 16:54 ` [PATCH v8 2/9] ext4: for committing inode, make ext4_fc_track_inode wait Harshad Shirwadkar
2025-04-14 16:54 ` [PATCH v8 3/9] ext4: mark inode dirty before grabbing i_data_sem in ext4_setattr Harshad Shirwadkar
2025-04-14 16:54 ` [PATCH v8 4/9] ext4: rework fast commit commit path Harshad Shirwadkar
2025-04-14 16:54 ` [PATCH v8 5/9] ext4: drop i_fc_updates from inode fc info Harshad Shirwadkar
2025-04-14 16:54 ` [PATCH v8 6/9] ext4: update code documentation Harshad Shirwadkar
2025-04-14 16:54 ` [PATCH v8 7/9] ext4: temporarily elevate commit thread priority Harshad Shirwadkar
2025-04-14 16:54 ` [PATCH v8 8/9] ext4: convert s_fc_lock to mutex type Harshad Shirwadkar
2025-04-14 16:54 ` [PATCH v8 9/9] ext4: hold s_fc_lock while during fast commit Harshad Shirwadkar
2025-04-24 14:59 ` [PATCH v8 0/9] Ext4 Fast Commit Performance Patchset Theodore Ts'o
2025-04-30 18:55 ` harshad shirwadkar
2025-05-01 4:14 ` Theodore Ts'o
2025-05-01 13:37 ` Theodore Ts'o
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox