From: Li Chen <me@linux.beauty>
To: Zhang Yi <yi.zhang@huaweicloud.com>,
Theodore Ts'o <tytso@mit.edu>,
Andreas Dilger <adilger.kernel@dilger.ca>,
Baokun Li <libaokun@linux.alibaba.com>, Jan Kara <jack@suse.cz>,
Ojaswin Mujoo <ojaswin@linux.ibm.com>,
"Ritesh Harjani (IBM)" <ritesh.list@gmail.com>,
Zhang Yi <yi.zhang@huawei.com>,
linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Steven Rostedt <rostedt@goodmis.org>,
Masami Hiramatsu <mhiramat@kernel.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
linux-trace-kernel@vger.kernel.org
Subject: [RFC v8 7/7] ext4: fast commit: export snapshot stats in fc_info
Date: Fri, 15 May 2026 17:18:27 +0800
Message-ID: <20260515091829.194810-8-me@linux.beauty>
In-Reply-To: <20260515091829.194810-1-me@linux.beauty>
Snapshot-based fast commit can fall back when the commit-time snapshot
cannot be built (e.g. on an extent status cache miss). To evaluate this
path it is useful to quantify how long journal updates stay locked and
to see why snapshotting failed.

Add best-effort snapshot counters to the in-memory ext4 superblock info
(struct ext4_sb_info) and extend /proc/fs/ext4/<sb_id>/fc_info to report
the number of snapshotted inodes and ranges, the snapshot failure
reasons, and the average and maximum time spent with journal updates
locked.

Signed-off-by: Li Chen <chenl311@chinatelecom.cn>
---
Changes in v8:
- Treat stale snapshot inode sizing as a capacity fallback instead of
  letting log writing later report a missing snapshot.
- Use atomic64_t for the snapshot counters so fc_info cannot observe
  torn 64-bit values on 32-bit systems.

Changes in v7:
- Address Sashiko review by using READ_ONCE() + div64_u64() for the fc_info
  lock_updates average.

Changes in v6:
- Start consuming locked_ns in fc_info, so this patch intentionally moves
  lock_updates_ns_{total,max,samples} accounting here.
- Guard the tracepoint call with trace_ext4_fc_lock_updates_enabled() and
  use trace_call__ext4_fc_lock_updates() to avoid the double static_branch
  at the guarded call site.
- Keep the stats unconditionally while avoiding extra tracepoint
  overhead when ext4_fc_lock_updates is disabled.

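
For readers who want to experiment with the accounting outside the kernel,
the lock_updates bookkeeping described above can be sketched in userspace
with C11 atomics. This is only an illustration under stated assumptions:
struct lock_stats, stat_update_max(), stat_record() and stat_avg_ns() are
hypothetical names that do not exist in the patch, and <stdatomic.h> stands
in for the kernel's atomic64_t helpers. It shows a compare-and-swap loop
that only ever raises the stored maximum, and an average that is guarded
against a zero sample count.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the lock_updates_* counters. */
struct lock_stats {
	_Atomic uint64_t ns_total;
	_Atomic uint64_t ns_max;
	_Atomic uint64_t samples;
};

/* Raise *stat to value unless a larger value is already stored. */
static void stat_update_max(_Atomic uint64_t *stat, uint64_t value)
{
	uint64_t old = atomic_load(stat);

	/*
	 * A failed compare-exchange reloads 'old' with the current value,
	 * so the bound is re-checked before every retry.
	 */
	while (value > old &&
	       !atomic_compare_exchange_weak(stat, &old, value))
		;
}

/* Account one locked window of locked_ns nanoseconds. */
static void stat_record(struct lock_stats *s, uint64_t locked_ns)
{
	atomic_fetch_add(&s->ns_total, locked_ns);
	atomic_fetch_add(&s->samples, 1);
	stat_update_max(&s->ns_max, locked_ns);
}

/* Average window length in ns; 0 when nothing has been sampled yet. */
static uint64_t stat_avg_ns(struct lock_stats *s)
{
	uint64_t samples = atomic_load(&s->samples);
	uint64_t total = atomic_load(&s->ns_total);

	return samples ? total / samples : 0;
}
```

As in the patch, the three counters are read independently, so a reader
racing with stat_record() may compute an average from a total and a sample
count taken at slightly different moments. That is acceptable for
best-effort statistics.
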
fs/ext4/ext4.h | 31 ++++++++++++++
fs/ext4/fast_commit.c | 96 ++++++++++++++++++++++++++++++++++++++-----
fs/ext4/super.c | 1 +
3 files changed, 118 insertions(+), 10 deletions(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index dd09d00a73af..ddc903738c6b 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1550,6 +1550,36 @@ struct ext4_orphan_info {
* file blocks */
};
+/*
+ * Ext4 fast commit snapshot statistics.
+ *
+ * These are best-effort counters intended for debugging / performance
+ * introspection; a reader may see values that are not mutually consistent.
+ */
+struct ext4_fc_snap_stats {
+ atomic64_t lock_updates_ns_total;
+ atomic64_t lock_updates_ns_max;
+ atomic64_t lock_updates_samples;
+
+ atomic64_t snap_inodes;
+ atomic64_t snap_ranges;
+
+ atomic64_t snap_fail_es_miss;
+ atomic64_t snap_fail_es_delayed;
+ atomic64_t snap_fail_es_other;
+
+ atomic64_t snap_fail_inodes_cap;
+ atomic64_t snap_fail_ranges_cap;
+ atomic64_t snap_fail_nomem;
+ atomic64_t snap_fail_inode_loc;
+
+ /*
+ * Missing inode snapshots during log writing should never happen.
+ * Keep this counter to help catch unexpected regressions.
+ */
+ atomic64_t snap_fail_no_snap;
+};
+
/*
* fourth extended-fs super-block data in memory
*/
@@ -1824,6 +1854,7 @@ struct ext4_sb_info {
struct mutex s_fc_lock;
struct buffer_head *s_fc_bh;
struct ext4_fc_stats s_fc_stats;
+ struct ext4_fc_snap_stats s_fc_snap_stats;
tid_t s_fc_ineligible_tid;
#ifdef CONFIG_EXT4_DEBUG
int s_fc_debug_max_replay;
diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index dc08f8ff43d9..4ef796b9b6cb 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -281,6 +281,19 @@ static inline void ext4_fc_wake_inode_state(struct inode *inode, int bit)
ext4_inode_state_wait_bit(bit));
}
+static void ext4_fc_snap_stats_update_max(atomic64_t *stat, u64 value)
+{
+ u64 old = atomic64_read(stat);
+
+ while (value > old) {
+ u64 prev = atomic64_cmpxchg(stat, old, value);
+
+ if (prev == old)
+ break;
+ old = prev;
+ }
+}
+
/*
* Remove inode from fast commit list. If the inode is being committed
* we wait until inode commit is done.
@@ -868,6 +881,8 @@ static int ext4_fc_write_inode(struct inode *inode, u32 *crc)
{
struct ext4_inode_info *ei = EXT4_I(inode);
struct ext4_fc_inode_snap *snap = ei->i_fc_snap;
+ struct ext4_fc_snap_stats *stats =
+ &EXT4_SB(inode->i_sb)->s_fc_snap_stats;
struct ext4_fc_inode fc_inode;
struct ext4_fc_tl tl;
u8 *dst;
@@ -875,13 +890,17 @@ static int ext4_fc_write_inode(struct inode *inode, u32 *crc)
int inode_len;
int ret;
- if (!snap)
+ if (!snap) {
+ atomic64_inc(&stats->snap_fail_no_snap);
return -ECANCELED;
+ }
src = snap->inode_buf;
inode_len = snap->inode_len;
- if (!src || inode_len == 0)
+ if (!src || inode_len == 0) {
+ atomic64_inc(&stats->snap_fail_no_snap);
return -ECANCELED;
+ }
fc_inode.fc_ino = cpu_to_le32(inode->i_ino);
tl.fc_tag = cpu_to_le16(EXT4_FC_TAG_INODE);
@@ -911,13 +930,17 @@ static int ext4_fc_write_inode_data(struct inode *inode, u32 *crc)
{
struct ext4_inode_info *ei = EXT4_I(inode);
struct ext4_fc_inode_snap *snap = ei->i_fc_snap;
+ struct ext4_fc_snap_stats *stats =
+ &EXT4_SB(inode->i_sb)->s_fc_snap_stats;
struct ext4_fc_add_range fc_ext;
struct ext4_fc_del_range lrange;
struct ext4_extent *ex;
struct ext4_fc_range *range;
- if (!snap)
+ if (!snap) {
+ atomic64_inc(&stats->snap_fail_no_snap);
return -ECANCELED;
+ }
list_for_each_entry(range, &snap->data_list, list) {
if (range->tag == EXT4_FC_TAG_DEL_RANGE) {
@@ -978,6 +1001,8 @@ static int ext4_fc_snapshot_inode_data(struct inode *inode,
int *snap_err)
{
struct ext4_inode_info *ei = EXT4_I(inode);
+ struct ext4_fc_snap_stats *stats =
+ &EXT4_SB(inode->i_sb)->s_fc_snap_stats;
ext4_lblk_t start_lblk, end_lblk, cur_lblk;
unsigned int nr_ranges = 0;
@@ -1005,11 +1030,13 @@ static int ext4_fc_snapshot_inode_data(struct inode *inode,
u64 remaining = (u64)end_lblk - cur_lblk + 1;
if (!ext4_es_lookup_extent(inode, cur_lblk, NULL, &es, NULL)) {
+ atomic64_inc(&stats->snap_fail_es_miss);
ext4_fc_set_snap_err(snap_err, EXT4_FC_SNAP_ERR_ES_MISS);
return -EAGAIN;
}
if (ext4_es_is_delayed(&es)) {
+ atomic64_inc(&stats->snap_fail_es_delayed);
ext4_fc_set_snap_err(snap_err,
EXT4_FC_SNAP_ERR_ES_DELAYED);
return -EAGAIN;
@@ -1024,6 +1051,7 @@ static int ext4_fc_snapshot_inode_data(struct inode *inode,
}
if (nr_ranges_total + nr_ranges >= EXT4_FC_SNAPSHOT_MAX_RANGES) {
+ atomic64_inc(&stats->snap_fail_ranges_cap);
ext4_fc_set_snap_err(snap_err,
EXT4_FC_SNAP_ERR_RANGES_CAP);
return -E2BIG;
@@ -1031,6 +1059,7 @@ static int ext4_fc_snapshot_inode_data(struct inode *inode,
range = kmem_cache_alloc(ext4_fc_range_cachep, GFP_NOFS);
if (!range) {
+ atomic64_inc(&stats->snap_fail_nomem);
ext4_fc_set_snap_err(snap_err, EXT4_FC_SNAP_ERR_NOMEM);
return -ENOMEM;
}
@@ -1058,6 +1087,7 @@ static int ext4_fc_snapshot_inode_data(struct inode *inode,
range->len = max;
} else {
kmem_cache_free(ext4_fc_range_cachep, range);
+ atomic64_inc(&stats->snap_fail_es_other);
ext4_fc_set_snap_err(snap_err, EXT4_FC_SNAP_ERR_ES_OTHER);
return -EAGAIN;
}
@@ -1081,6 +1111,8 @@ static int ext4_fc_snapshot_inode(struct inode *inode,
unsigned int *nr_rangesp, int *snap_err)
{
struct ext4_inode_info *ei = EXT4_I(inode);
+ struct ext4_fc_snap_stats *stats =
+ &EXT4_SB(inode->i_sb)->s_fc_snap_stats;
struct ext4_fc_inode_snap *snap;
int inode_len = EXT4_GOOD_OLD_INODE_SIZE;
struct ext4_iloc iloc;
@@ -1091,6 +1123,7 @@ static int ext4_fc_snapshot_inode(struct inode *inode,
ret = ext4_get_inode_loc_noio(inode, &iloc);
if (ret) {
+ atomic64_inc(&stats->snap_fail_inode_loc);
ext4_fc_set_snap_err(snap_err, EXT4_FC_SNAP_ERR_INODE_LOC);
return ret;
}
@@ -1102,6 +1135,7 @@ static int ext4_fc_snapshot_inode(struct inode *inode,
snap = kmalloc(struct_size(snap, inode_buf, inode_len), GFP_NOFS);
if (!snap) {
+ atomic64_inc(&stats->snap_fail_nomem);
ext4_fc_set_snap_err(snap_err, EXT4_FC_SNAP_ERR_NOMEM);
brelse(iloc.bh);
return -ENOMEM;
@@ -1126,6 +1160,8 @@ static int ext4_fc_snapshot_inode(struct inode *inode,
list_splice_tail_init(&ranges, &snap->data_list);
ext4_fc_unlock(inode->i_sb, alloc_ctx);
+ atomic64_inc(&stats->snap_inodes);
+ atomic64_add(nr_ranges, &stats->snap_ranges);
if (nr_rangesp)
*nr_rangesp = nr_ranges;
return 0;
@@ -1229,12 +1265,10 @@ static int ext4_fc_snapshot_inodes(journal_t *journal, struct inode **inodes,
int ret = 0;
int alloc_ctx;
- if (!inodes_size)
- return 0;
-
alloc_ctx = ext4_fc_lock(sb);
list_for_each_entry(iter, &sbi->s_fc_q[FC_Q_MAIN], i_fc_list) {
if (i >= inodes_size) {
+ atomic64_inc(&sbi->s_fc_snap_stats.snap_fail_inodes_cap);
ext4_fc_set_snap_err(snap_err,
EXT4_FC_SNAP_ERR_INODES_CAP);
ret = -E2BIG;
@@ -1260,6 +1294,7 @@ static int ext4_fc_snapshot_inodes(journal_t *journal, struct inode **inodes,
continue;
if (i >= inodes_size) {
+ atomic64_inc(&sbi->s_fc_snap_stats.snap_fail_inodes_cap);
ext4_fc_set_snap_err(snap_err,
EXT4_FC_SNAP_ERR_INODES_CAP);
ret = -E2BIG;
@@ -1303,6 +1338,7 @@ static int ext4_fc_perform_commit(journal_t *journal, tid_t commit_tid)
{
struct super_block *sb = journal->j_private;
struct ext4_sb_info *sbi = EXT4_SB(sb);
+ struct ext4_fc_snap_stats *snap_stats = &sbi->s_fc_snap_stats;
struct ext4_inode_info *iter;
struct ext4_fc_head head;
struct inode *inode;
@@ -1362,8 +1398,13 @@ static int ext4_fc_perform_commit(journal_t *journal, tid_t commit_tid)
return ret;
ret = ext4_fc_alloc_snapshot_inodes(sb, &inodes, &inodes_size);
- if (ret)
+ if (ret) {
+ if (ret == -E2BIG)
+ atomic64_inc(&snap_stats->snap_fail_inodes_cap);
+ else if (ret == -ENOMEM)
+ atomic64_inc(&snap_stats->snap_fail_nomem);
return ret;
+ }
/* Step 4: Mark all inodes as being committed. */
jbd2_journal_lock_updates(journal);
@@ -1384,12 +1425,15 @@ static int ext4_fc_perform_commit(journal_t *journal, tid_t commit_tid)
ret = ext4_fc_snapshot_inodes(journal, inodes, inodes_size,
&snap_inodes, &snap_ranges, &snap_err);
jbd2_journal_unlock_updates(journal);
- if (trace_ext4_fc_lock_updates_enabled()) {
- locked_ns = ktime_to_ns(ktime_sub(ktime_get(), lock_start));
- trace_call__ext4_fc_lock_updates(sb, commit_tid, locked_ns,
- snap_inodes, snap_ranges,
- ret, snap_err);
- }
+ locked_ns = ktime_to_ns(ktime_sub(ktime_get(), lock_start));
+ atomic64_add(locked_ns, &snap_stats->lock_updates_ns_total);
+ atomic64_inc(&snap_stats->lock_updates_samples);
+ ext4_fc_snap_stats_update_max(&snap_stats->lock_updates_ns_max,
+ locked_ns);
+ if (trace_ext4_fc_lock_updates_enabled())
+ trace_call__ext4_fc_lock_updates(sb, commit_tid, locked_ns,
+ snap_inodes, snap_ranges,
+ ret, snap_err);
kvfree(inodes);
if (ret)
return ret;
@@ -2657,11 +2701,26 @@ int ext4_fc_info_show(struct seq_file *seq, void *v)
{
struct ext4_sb_info *sbi = EXT4_SB((struct super_block *)seq->private);
struct ext4_fc_stats *stats = &sbi->s_fc_stats;
+ struct ext4_fc_snap_stats *snap_stats = &sbi->s_fc_snap_stats;
+ u64 lock_avg_ns = 0;
+ u64 lock_updates_samples;
+ u64 lock_updates_ns_total;
+ u64 lock_updates_ns_max;
int i;
if (v != SEQ_START_TOKEN)
return 0;
+ lock_updates_samples =
+ atomic64_read(&snap_stats->lock_updates_samples);
+ lock_updates_ns_total =
+ atomic64_read(&snap_stats->lock_updates_ns_total);
+ lock_updates_ns_max =
+ atomic64_read(&snap_stats->lock_updates_ns_max);
+ if (lock_updates_samples)
+ lock_avg_ns = div64_u64(lock_updates_ns_total,
+ lock_updates_samples);
+
seq_printf(seq,
"fc stats:\n%ld commits\n%ld ineligible\n%ld numblks\n%lluus avg_commit_time\n",
stats->fc_num_commits, stats->fc_ineligible_commits,
@@ -2672,6 +2731,23 @@ int ext4_fc_info_show(struct seq_file *seq, void *v)
seq_printf(seq, "\"%s\":\t%d\n", fc_ineligible_reasons[i],
stats->fc_ineligible_reason_count[i]);
+ seq_printf(seq,
+ "Snapshot stats:\n%llu inodes\n%llu ranges\n%lluus lock_updates_avg\n%lluus lock_updates_max\n",
+ atomic64_read(&snap_stats->snap_inodes),
+ atomic64_read(&snap_stats->snap_ranges),
+ div_u64(lock_avg_ns, 1000),
+ div_u64(lock_updates_ns_max, 1000));
+ seq_printf(seq,
+ "Snapshot failures:\n%llu es_miss\n%llu es_delayed\n%llu es_other\n%llu inodes_cap\n%llu ranges_cap\n%llu nomem\n%llu inode_loc\n%llu no_snap\n",
+ atomic64_read(&snap_stats->snap_fail_es_miss),
+ atomic64_read(&snap_stats->snap_fail_es_delayed),
+ atomic64_read(&snap_stats->snap_fail_es_other),
+ atomic64_read(&snap_stats->snap_fail_inodes_cap),
+ atomic64_read(&snap_stats->snap_fail_ranges_cap),
+ atomic64_read(&snap_stats->snap_fail_nomem),
+ atomic64_read(&snap_stats->snap_fail_inode_loc),
+ atomic64_read(&snap_stats->snap_fail_no_snap));
+
return 0;
}
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 3c869f0001c5..f1f8819a2a23 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -4544,6 +4544,7 @@ static void ext4_fast_commit_init(struct super_block *sb)
sbi->s_fc_ineligible_tid = 0;
mutex_init(&sbi->s_fc_lock);
memset(&sbi->s_fc_stats, 0, sizeof(sbi->s_fc_stats));
+ memset(&sbi->s_fc_snap_stats, 0, sizeof(sbi->s_fc_snap_stats));
sbi->s_fc_replay_state.fc_regions = NULL;
sbi->s_fc_replay_state.fc_regions_size = 0;
sbi->s_fc_replay_state.fc_regions_used = 0;
--
2.53.0