* [f2fs-dev] [PATCH 1/2] f2fs: dump more information when checkpoint was blocked for long time
@ 2025-07-31 5:34 Chao Yu via Linux-f2fs-devel
2025-07-31 5:34 ` [f2fs-dev] [PATCH 2/2] f2fs: add time stats of checkpoint for debug Chao Yu via Linux-f2fs-devel
2025-08-15 16:40 ` [f2fs-dev] [PATCH 1/2] f2fs: dump more information when checkpoint was blocked for long time patchwork-bot+f2fs--- via Linux-f2fs-devel
0 siblings, 2 replies; 3+ messages in thread
From: Chao Yu via Linux-f2fs-devel @ 2025-07-31 5:34 UTC (permalink / raw)
To: jaegeuk; +Cc: linux-kernel, linux-f2fs-devel
generic/299 w/ mode=lfs will cause long time latency of checkpoint,
let's dump more information once we hit case.
CP merge:
- Queued : 0
- Issued : 1
- Total : 1
- Cur time : 9765(ms)
- Peak time : 9765(ms)
F2FS-fs (vdc): blocked on checkpoint for 9765 ms
CPU: 11 UID: 0 PID: 237 Comm: kworker/u128:29 Tainted: G O 6.16.0-rc3+ #409 PREEMPT(voluntary)
Tainted: [O]=OOT_MODULE
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
Workqueue: writeback wb_workfn (flush-253:32)
Call Trace:
<TASK>
dump_stack_lvl+0x6e/0xa0
f2fs_issue_checkpoint+0x268/0x280
f2fs_write_node_pages+0x6a/0x2c0
do_writepages+0xd0/0x170
__writeback_single_inode+0x56/0x4c0
writeback_sb_inodes+0x22a/0x550
__writeback_inodes_wb+0x4c/0xf0
wb_writeback+0x300/0x400
wb_workfn+0x3de/0x500
process_one_work+0x230/0x5c0
worker_thread+0x1da/0x3d0
kthread+0x10d/0x250
ret_from_fork+0x164/0x190
ret_from_fork_asm+0x1a/0x30
Cc: Jan Prusakowski <jprusakowski@google.com>
Signed-off-by: Chao Yu <chao@kernel.org>
---
fs/f2fs/checkpoint.c | 7 +++++++
fs/f2fs/f2fs.h | 8 +++++++-
2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index db3831f7f2f5..02806e2edce4 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -1778,6 +1778,7 @@ static void __checkpoint_and_complete_reqs(struct f2fs_sb_info *sbi)
llist_for_each_entry_safe(req, next, dispatch_list, llnode) {
diff = (u64)ktime_ms_delta(ktime_get(), req->queue_time);
req->ret = ret;
+ req->delta_time = diff;
complete(&req->wait);
sum_diff += diff;
@@ -1873,6 +1874,12 @@ int f2fs_issue_checkpoint(struct f2fs_sb_info *sbi)
else
flush_remained_ckpt_reqs(sbi, &req);
+ if (unlikely(req.delta_time >= CP_LONG_LATENCY_THRESHOLD)) {
+ f2fs_warn_ratelimited(sbi,
+ "blocked on checkpoint for %u ms", cprc->peak_time);
+ dump_stack();
+ }
+
return req.ret;
}
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index b9058518b54e..c036af1a885a 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -334,7 +334,10 @@ struct ckpt_req {
struct completion wait; /* completion for checkpoint done */
struct llist_node llnode; /* llist_node to be linked in wait queue */
int ret; /* return code of checkpoint */
- ktime_t queue_time; /* request queued time */
+ union {
+ ktime_t queue_time; /* request queued time */
+ ktime_t delta_time; /* time in queue */
+ };
};
struct ckpt_req_control {
@@ -350,6 +353,9 @@ struct ckpt_req_control {
unsigned int peak_time; /* peak wait time in msec until now */
};
+/* a time threshold that checkpoint was blocked for, unit: ms */
+#define CP_LONG_LATENCY_THRESHOLD 5000
+
/* for the bitmap indicate blocks to be discarded */
struct discard_entry {
struct list_head list; /* list head */
--
2.49.0
_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
^ permalink raw reply related [flat|nested] 3+ messages in thread
* [f2fs-dev] [PATCH 2/2] f2fs: add time stats of checkpoint for debug
2025-07-31 5:34 [f2fs-dev] [PATCH 1/2] f2fs: dump more information when checkpoint was blocked for long time Chao Yu via Linux-f2fs-devel
@ 2025-07-31 5:34 ` Chao Yu via Linux-f2fs-devel
2025-08-15 16:40 ` [f2fs-dev] [PATCH 1/2] f2fs: dump more information when checkpoint was blocked for long time patchwork-bot+f2fs--- via Linux-f2fs-devel
1 sibling, 0 replies; 3+ messages in thread
From: Chao Yu via Linux-f2fs-devel @ 2025-07-31 5:34 UTC (permalink / raw)
To: jaegeuk; +Cc: linux-kernel, linux-f2fs-devel
checkpoint was blocked for 18643 ms
Step 0: 0 ms
Step 1: 38 ms
Step 2: 63 ms
Step 3: 4 ms
Step 4: 0 ms
Step 5: 0 ms
Step 6: 9 ms
Step 7: 0 ms
Step 8: 18277 ms
Step 9: 249 ms
Cc: Jan Prusakowski <jprusakowski@google.com>
Signed-off-by: Chao Yu <chao@kernel.org>
---
fs/f2fs/checkpoint.c | 46 ++++++++++++++++++++++++++++++++++++++++++++
fs/f2fs/f2fs.h | 22 +++++++++++++++++++++
2 files changed, 68 insertions(+)
diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 02806e2edce4..bbe07e3a6c75 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -1442,6 +1442,34 @@ u64 f2fs_get_sectors_written(struct f2fs_sb_info *sbi)
return get_sectors_written(sbi->sb->s_bdev);
}
+static inline void stat_cp_time(struct cp_control *cpc, enum cp_time type)
+{
+ cpc->stats.times[type] = ktime_get();
+}
+
+static inline void check_cp_time(struct f2fs_sb_info *sbi, struct cp_control *cpc)
+{
+ unsigned long long sb_diff, cur_diff;
+ enum cp_time ct;
+
+ sb_diff = (u64)ktime_ms_delta(sbi->cp_stats.times[CP_TIME_END],
+ sbi->cp_stats.times[CP_TIME_START]);
+ cur_diff = (u64)ktime_ms_delta(cpc->stats.times[CP_TIME_END],
+ cpc->stats.times[CP_TIME_START]);
+
+ if (cur_diff > sb_diff) {
+ sbi->cp_stats = cpc->stats;
+ if (cur_diff < CP_LONG_LATENCY_THRESHOLD)
+ return;
+
+ f2fs_warn(sbi, "checkpoint was blocked for %llu ms", cur_diff);
+ for (ct = CP_TIME_START; ct < CP_TIME_MAX - 1; ct++)
+ f2fs_warn(sbi, "Step#%d: %llu ms", ct,
+ (u64)ktime_ms_delta(cpc->stats.times[ct + 1],
+ cpc->stats.times[ct]));
+ }
+}
+
static int do_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
{
struct f2fs_checkpoint *ckpt = F2FS_CKPT(sbi);
@@ -1459,6 +1487,8 @@ static int do_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
/* Flush all the NAT/SIT pages */
f2fs_sync_meta_pages(sbi, META, LONG_MAX, FS_CP_META_IO);
+ stat_cp_time(cpc, CP_TIME_SYNC_META);
+
/* start to update checkpoint, cp ver is already updated previously */
ckpt->elapsed_time = cpu_to_le64(get_mtime(sbi, true));
ckpt->free_segment_count = cpu_to_le32(free_segments(sbi));
@@ -1555,20 +1585,26 @@ static int do_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
/* Here, we have one bio having CP pack except cp pack 2 page */
f2fs_sync_meta_pages(sbi, META, LONG_MAX, FS_CP_META_IO);
+ stat_cp_time(cpc, CP_TIME_SYNC_CP_META);
+
/* Wait for all dirty meta pages to be submitted for IO */
f2fs_wait_on_all_pages(sbi, F2FS_DIRTY_META);
+ stat_cp_time(cpc, CP_TIME_WAIT_DIRTY_META);
/* wait for previous submitted meta pages writeback */
f2fs_wait_on_all_pages(sbi, F2FS_WB_CP_DATA);
+ stat_cp_time(cpc, CP_TIME_WAIT_CP_DATA);
/* flush all device cache */
err = f2fs_flush_device_cache(sbi);
if (err)
return err;
+ stat_cp_time(cpc, CP_TIME_FLUSH_DEVICE);
/* barrier and flush checkpoint cp pack 2 page if it can */
commit_checkpoint(sbi, ckpt, start_blk);
f2fs_wait_on_all_pages(sbi, F2FS_WB_CP_DATA);
+ stat_cp_time(cpc, CP_TIME_WAIT_LAST_CP);
/*
* invalidate intermediate page cache borrowed from meta inode which are
@@ -1613,6 +1649,8 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
unsigned long long ckpt_ver;
int err = 0;
+ stat_cp_time(cpc, CP_TIME_START);
+
if (f2fs_readonly(sbi->sb) || f2fs_hw_is_readonly(sbi))
return -EROFS;
@@ -1624,6 +1662,8 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
if (cpc->reason != CP_RESIZE)
f2fs_down_write(&sbi->cp_global_sem);
+ stat_cp_time(cpc, CP_TIME_LOCK);
+
if (!is_sbi_flag_set(sbi, SBI_IS_DIRTY) &&
((cpc->reason & CP_FASTBOOT) || (cpc->reason & CP_SYNC) ||
((cpc->reason & CP_DISCARD) && !sbi->discard_blks)))
@@ -1639,6 +1679,8 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
if (err)
goto out;
+ stat_cp_time(cpc, CP_TIME_OP_LOCK);
+
trace_f2fs_write_checkpoint(sbi->sb, cpc->reason, "finish block_ops");
f2fs_flush_merged_writes(sbi);
@@ -1678,6 +1720,8 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
f2fs_flush_sit_entries(sbi, cpc);
+ stat_cp_time(cpc, CP_TIME_FLUSH_META);
+
/* save inmem log status */
f2fs_save_inmem_curseg(sbi);
@@ -1695,6 +1739,8 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
stat_inc_cp_count(sbi);
stop:
unblock_operations(sbi);
+ stat_cp_time(cpc, CP_TIME_END);
+ check_cp_time(sbi, cpc);
if (cpc->reason & CP_RECOVERY)
f2fs_notice(sbi, "checkpoint: version = %llx", ckpt_ver);
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index c036af1a885a..691af2a6c159 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -269,11 +269,32 @@ enum {
#define DEF_DISABLE_QUICK_INTERVAL 1 /* 1 secs */
#define DEF_UMOUNT_DISCARD_TIMEOUT 5 /* 5 secs */
+enum cp_time {
+ CP_TIME_START, /* begin */
+ CP_TIME_LOCK, /* after cp_global_sem */
+ CP_TIME_OP_LOCK, /* after block_operation */
+ CP_TIME_FLUSH_META, /* after flush sit/nat */
+ CP_TIME_SYNC_META, /* after sync_meta_pages */
+ CP_TIME_SYNC_CP_META, /* after sync cp meta pages */
+ CP_TIME_WAIT_DIRTY_META,/* after wait on dirty meta */
+ CP_TIME_WAIT_CP_DATA, /* after wait on cp data */
+ CP_TIME_FLUSH_DEVICE, /* after flush device cache */
+ CP_TIME_WAIT_LAST_CP, /* after wait on last cp pack */
+ CP_TIME_END, /* after unblock_operation */
+ CP_TIME_MAX,
+};
+
+/* time cost stats of checkpoint */
+struct cp_stats {
+ ktime_t times[CP_TIME_MAX];
+};
+
struct cp_control {
int reason;
__u64 trim_start;
__u64 trim_end;
__u64 trim_minlen;
+ struct cp_stats stats;
};
/*
@@ -1650,6 +1671,7 @@ struct f2fs_sb_info {
unsigned long last_time[MAX_TIME]; /* to store time in jiffies */
long interval_time[MAX_TIME]; /* to store thresholds */
struct ckpt_req_control cprc_info; /* for checkpoint request control */
+ struct cp_stats cp_stats; /* for time stat of checkpoint */
struct inode_management im[MAX_INO_ENTRY]; /* manage inode cache */
--
2.49.0
_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [f2fs-dev] [PATCH 1/2] f2fs: dump more information when checkpoint was blocked for long time
2025-07-31 5:34 [f2fs-dev] [PATCH 1/2] f2fs: dump more information when checkpoint was blocked for long time Chao Yu via Linux-f2fs-devel
2025-07-31 5:34 ` [f2fs-dev] [PATCH 2/2] f2fs: add time stats of checkpoint for debug Chao Yu via Linux-f2fs-devel
@ 2025-08-15 16:40 ` patchwork-bot+f2fs--- via Linux-f2fs-devel
1 sibling, 0 replies; 3+ messages in thread
From: patchwork-bot+f2fs--- via Linux-f2fs-devel @ 2025-08-15 16:40 UTC (permalink / raw)
To: Chao Yu; +Cc: jaegeuk, linux-kernel, linux-f2fs-devel
Hello:
This series was applied to jaegeuk/f2fs.git (dev)
by Jaegeuk Kim <jaegeuk@kernel.org>:
On Thu, 31 Jul 2025 13:34:04 +0800 you wrote:
> generic/299 w/ mode=lfs will cause long time latency of checkpoint,
> let's dump more information once we hit case.
>
> CP merge:
> - Queued : 0
> - Issued : 1
> - Total : 1
> - Cur time : 9765(ms)
> - Peak time : 9765(ms)
>
> [...]
Here is the summary with links:
- [f2fs-dev,1/2] f2fs: dump more information when checkpoint was blocked for long time
https://git.kernel.org/jaegeuk/f2fs/c/3fcf228b6494
- [f2fs-dev,2/2] f2fs: add time stats of checkpoint for debug
https://git.kernel.org/jaegeuk/f2fs/c/57e74035ad5e
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2025-08-15 16:40 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-31 5:34 [f2fs-dev] [PATCH 1/2] f2fs: dump more information when checkpoint was blocked for long time Chao Yu via Linux-f2fs-devel
2025-07-31 5:34 ` [f2fs-dev] [PATCH 2/2] f2fs: add time stats of checkpoint for debug Chao Yu via Linux-f2fs-devel
2025-08-15 16:40 ` [f2fs-dev] [PATCH 1/2] f2fs: dump more information when checkpoint was blocked for long time patchwork-bot+f2fs--- via Linux-f2fs-devel
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).