* [PATCH] f2fs: don't drop the top folio order in the f2fs_iostat tracepoint
From: Zhan Xusheng @ 2026-06-22 7:15 UTC
To: Jaegeuk Kim
Cc: Chao Yu, Steven Rostedt, Masami Hiramatsu, linux-kernel,
linux-trace-kernel, Zhan Xusheng

The f2fs_iostat tracepoint stores the per-order read folio counts in a
fixed-size array and prints a fixed number of buckets, both hardcoded to
11. The sysfs iostat accounting array is instead sized by NR_PAGE_ORDERS
(= MAX_PAGE_ORDER + 1), which is not always 11:

  arm64 16K pages -> MAX_PAGE_ORDER 11 -> NR_PAGE_ORDERS 12
  arm64 64K pages -> MAX_PAGE_ORDER 13 -> NR_PAGE_ORDERS 14

f2fs enables large folios for immutable, non-compressed files, and the
read folio order is bounded by MAX_PAGECACHE_ORDER, i.e.
min(MAX_XAS_ORDER, PREFERRED_MAX_PAGECACHE_ORDER). With THP enabled this
reaches order 11 on 16K/64K base-page kernels (MAX_XAS_ORDER caps it at
11). So an order-11 read folio is possible there and is accounted into
index 11 of the array.

On those configurations the sysfs file reports the order-11 count
correctly, but the tracepoint silently drops it: the memcpy is capped at
min(NR_PAGE_ORDERS, 11), so index 11 is never copied and the trace
disagrees with sysfs. There is no memory-safety issue, only the order-11
bucket missing from the trace; 4K-page kernels (NR_PAGE_ORDERS == 11,
max order <= 9) are unaffected.

Size the array and the printed buckets by a ceiling that covers the
largest possible NR_PAGE_ORDERS (14) with headroom, and add a
BUILD_BUG_ON() so any future growth of NR_PAGE_ORDERS fails the build
loudly instead of silently truncating again. The human-readable
"order=count" output is preserved.

Fixes: cb8ff3ead9a3 ("f2fs: add page-order information for large folio reads in iostat")
Signed-off-by: Zhan Xusheng <zhanxusheng@xiaomi.com>
---
fs/f2fs/iostat.c | 6 ++++++
include/trace/events/f2fs.h | 20 ++++++++++++++++----
2 files changed, 22 insertions(+), 4 deletions(-)
diff --git a/fs/f2fs/iostat.c b/fs/f2fs/iostat.c
index ae265e3e9b2c..cd801bd0b910 100644
--- a/fs/f2fs/iostat.c
+++ b/fs/f2fs/iostat.c
@@ -188,6 +188,12 @@ void f2fs_update_read_folio_count(struct f2fs_sb_info *sbi, struct folio *folio)
unsigned int order = folio_order(folio);
unsigned long flags;
+ /*
+ * The f2fs_iostat tracepoint emits a fixed number of read folio order
+ * buckets. Make sure every order fits so none is silently dropped.
+ */
+ BUILD_BUG_ON(NR_PAGE_ORDERS > F2FS_IOSTAT_RD_FOLIO_ORDERS);
+
if (!sbi->iostat_enable)
return;
diff --git a/include/trace/events/f2fs.h b/include/trace/events/f2fs.h
index b5188d2671d7..3e810690d9de 100644
--- a/include/trace/events/f2fs.h
+++ b/include/trace/events/f2fs.h
@@ -2114,6 +2114,14 @@ DEFINE_EVENT(f2fs_zip_end, f2fs_decompress_pages_end,
);
#ifdef CONFIG_F2FS_IOSTAT
+/*
+ * Number of read folio order buckets emitted by the f2fs_iostat tracepoint.
+ * TP_printk() cannot loop, so the field count is fixed here and must be >=
+ * the largest possible NR_PAGE_ORDERS (14 on arm64 with 64K pages). The
+ * BUILD_BUG_ON() in f2fs_update_read_folio_count() enforces this.
+ */
+#define F2FS_IOSTAT_RD_FOLIO_ORDERS 16
+
TRACE_EVENT(f2fs_iostat,
TP_PROTO(struct f2fs_sb_info *sbi, unsigned long long *iostat,
@@ -2151,7 +2159,7 @@ TRACE_EVENT(f2fs_iostat,
__field(unsigned long long, fs_mrio)
__field(unsigned long long, fs_discard)
__field(unsigned long long, fs_reset_zone)
- __array(unsigned long long, read_folio_count, 11)
+ __array(unsigned long long, read_folio_count, F2FS_IOSTAT_RD_FOLIO_ORDERS)
),
TP_fast_assign(
@@ -2186,7 +2194,8 @@ TRACE_EVENT(f2fs_iostat,
__entry->fs_reset_zone = iostat[FS_ZONE_RESET_IO];
memset(__entry->read_folio_count, 0, sizeof(__entry->read_folio_count));
memcpy(__entry->read_folio_count, read_folio_count,
- sizeof(unsigned long long) * min_t(int, NR_PAGE_ORDERS, 11));
+ sizeof(unsigned long long) *
+ min_t(int, NR_PAGE_ORDERS, F2FS_IOSTAT_RD_FOLIO_ORDERS));
),
TP_printk("dev = (%d,%d), "
@@ -2201,7 +2210,8 @@ TRACE_EVENT(f2fs_iostat,
"fs [data=%llu, (gc_data=%llu, cdata=%llu), "
"node=%llu, meta=%llu], "
"read_folio_count [0=%llu, 1=%llu, 2=%llu, 3=%llu, 4=%llu, "
- "5=%llu, 6=%llu, 7=%llu, 8=%llu, 9=%llu, 10=%llu]",
+ "5=%llu, 6=%llu, 7=%llu, 8=%llu, 9=%llu, 10=%llu, 11=%llu, "
+ "12=%llu, 13=%llu, 14=%llu, 15=%llu]",
show_dev(__entry->dev), __entry->app_wio, __entry->app_dio,
__entry->app_bio, __entry->app_mio, __entry->app_bcdio,
__entry->app_mcdio, __entry->fs_dio, __entry->fs_cdio,
@@ -2218,7 +2228,9 @@ TRACE_EVENT(f2fs_iostat,
__entry->read_folio_count[4], __entry->read_folio_count[5],
__entry->read_folio_count[6], __entry->read_folio_count[7],
__entry->read_folio_count[8], __entry->read_folio_count[9],
- __entry->read_folio_count[10])
+ __entry->read_folio_count[10], __entry->read_folio_count[11],
+ __entry->read_folio_count[12], __entry->read_folio_count[13],
+ __entry->read_folio_count[14], __entry->read_folio_count[15])
);
#ifndef __F2FS_IOSTAT_LATENCY_TYPE
--
2.43.0
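
The truncation described in the commit message can be reproduced outside
the kernel. What follows is only a userspace sketch, assuming an arm64
16K-page build (NR_PAGE_ORDERS == 12). Every DEMO_/demo_ name is
hypothetical and is not part of f2fs; demo_snapshot() merely imitates the
memset()/memcpy() pair in TP_fast_assign().

```c
#include <assert.h>
#include <string.h>

/* Illustrative constants (assumptions), modelled on an arm64 16K-page build. */
#define DEMO_NR_PAGE_ORDERS 12	/* MAX_PAGE_ORDER 11 + 1 */
#define DEMO_OLD_BUCKETS    11	/* bucket count hardcoded before the patch */
#define DEMO_NEW_BUCKETS    16	/* F2FS_IOSTAT_RD_FOLIO_ORDERS in the patch */

static size_t demo_min(size_t a, size_t b)
{
	return a < b ? a : b;
}

/*
 * Imitate TP_fast_assign(): zero the first `buckets` trace slots, then copy
 * at most `buckets` counters from the accounting array.
 */
static void demo_snapshot(unsigned long long *dst, size_t buckets,
			  const unsigned long long *src)
{
	memset(dst, 0, buckets * sizeof(*dst));
	memcpy(dst, src,
	       sizeof(*dst) * demo_min(DEMO_NR_PAGE_ORDERS, buckets));
}

/* Return the order-11 count as seen through a trace array of `buckets` slots. */
static unsigned long long demo_traced_order11(size_t buckets)
{
	unsigned long long acct[DEMO_NR_PAGE_ORDERS] = { 0 };
	unsigned long long trace[DEMO_NEW_BUCKETS] = { 0 };

	/* The backing array has DEMO_NEW_BUCKETS slots; never exceed it. */
	assert(buckets <= DEMO_NEW_BUCKETS);

	acct[11] = 7;	/* pretend seven order-11 folios were read */
	demo_snapshot(trace, buckets, acct);
	return trace[11];
}
```

With DEMO_OLD_BUCKETS the copy stops after index 10, so the order-11
counter never reaches the trace array; with DEMO_NEW_BUCKETS all twelve
counters are copied and the remaining slots stay zero.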
* Re: [PATCH] f2fs: don't drop the top folio order in the f2fs_iostat tracepoint
From: Chao Yu @ 2026-06-23 6:53 UTC
To: Zhan Xusheng, Jaegeuk Kim
Cc: chao, Steven Rostedt, Masami Hiramatsu, linux-kernel,
linux-trace-kernel, Zhan Xusheng, Daniel Lee
+Cc Daniel,
On 6/22/26 15:15, Zhan Xusheng wrote:
> The f2fs_iostat tracepoint stores the per-order read folio counts in a
> fixed-size array and prints a fixed number of buckets, both hardcoded to
> 11. The sysfs iostat accounting array is instead sized by NR_PAGE_ORDERS
> (= MAX_PAGE_ORDER + 1), which is not always 11:
>
> arm64 16K pages -> MAX_PAGE_ORDER 11 -> NR_PAGE_ORDERS 12
> arm64 64K pages -> MAX_PAGE_ORDER 13 -> NR_PAGE_ORDERS 14
>
> f2fs enables large folios for immutable, non-compressed files, and the
> read folio order is bounded by MAX_PAGECACHE_ORDER, i.e.
> min(MAX_XAS_ORDER, PREFERRED_MAX_PAGECACHE_ORDER). With THP enabled this
> reaches order 11 on 16K/64K base-page kernels (MAX_XAS_ORDER caps it at
> 11). So an order-11 read folio is possible there and is accounted into
> index 11 of the array.
>
> On those configurations the sysfs file reports the order-11 count
> correctly, but the tracepoint silently drops it: the memcpy is capped at
> min(NR_PAGE_ORDERS, 11), so index 11 is never copied and the trace
> disagrees with sysfs. There is no memory-safety issue, only the order-11
> bucket missing from the trace; 4K-page kernels (NR_PAGE_ORDERS == 11,
> max order <= 9) are unaffected.
>
> Size the array and the printed buckets by a ceiling that covers the
> largest possible NR_PAGE_ORDERS (14) with headroom, and add a
> BUILD_BUG_ON() so any future growth of NR_PAGE_ORDERS fails the build
> loudly instead of silently truncating again. The human-readable
> "order=count" output is preserved.
>

Cc: stable@kernel.org

> Fixes: cb8ff3ead9a3 ("f2fs: add page-order information for large folio reads in iostat")
> Signed-off-by: Zhan Xusheng <zhanxusheng@xiaomi.com>
> ---
> fs/f2fs/iostat.c | 6 ++++++
> include/trace/events/f2fs.h | 20 ++++++++++++++++----
> 2 files changed, 22 insertions(+), 4 deletions(-)
>
> diff --git a/fs/f2fs/iostat.c b/fs/f2fs/iostat.c
> index ae265e3e9b2c..cd801bd0b910 100644
> --- a/fs/f2fs/iostat.c
> +++ b/fs/f2fs/iostat.c
> @@ -188,6 +188,12 @@ void f2fs_update_read_folio_count(struct f2fs_sb_info *sbi, struct folio *folio)
> unsigned int order = folio_order(folio);
> unsigned long flags;
>
> + /*
> + * The f2fs_iostat tracepoint emits a fixed number of read folio order
> + * buckets. Make sure every order fits so none is silently dropped.
> + */
> + BUILD_BUG_ON(NR_PAGE_ORDERS > F2FS_IOSTAT_RD_FOLIO_ORDERS);

What do you think of relocating this into f2fs_init_iostat()?

Thanks,

> +
> if (!sbi->iostat_enable)
> return;
>
> diff --git a/include/trace/events/f2fs.h b/include/trace/events/f2fs.h
> index b5188d2671d7..3e810690d9de 100644
> --- a/include/trace/events/f2fs.h
> +++ b/include/trace/events/f2fs.h
> @@ -2114,6 +2114,14 @@ DEFINE_EVENT(f2fs_zip_end, f2fs_decompress_pages_end,
> );
>
> #ifdef CONFIG_F2FS_IOSTAT
> +/*
> + * Number of read folio order buckets emitted by the f2fs_iostat tracepoint.
> + * TP_printk() cannot loop, so the field count is fixed here and must be >=
> + * the largest possible NR_PAGE_ORDERS (14 on arm64 with 64K pages). The
> + * BUILD_BUG_ON() in f2fs_update_read_folio_count() enforces this.
> + */
> +#define F2FS_IOSTAT_RD_FOLIO_ORDERS 16
> +
> TRACE_EVENT(f2fs_iostat,
>
> TP_PROTO(struct f2fs_sb_info *sbi, unsigned long long *iostat,
> @@ -2151,7 +2159,7 @@ TRACE_EVENT(f2fs_iostat,
> __field(unsigned long long, fs_mrio)
> __field(unsigned long long, fs_discard)
> __field(unsigned long long, fs_reset_zone)
> - __array(unsigned long long, read_folio_count, 11)
> + __array(unsigned long long, read_folio_count, F2FS_IOSTAT_RD_FOLIO_ORDERS)
> ),
>
> TP_fast_assign(
> @@ -2186,7 +2194,8 @@ TRACE_EVENT(f2fs_iostat,
> __entry->fs_reset_zone = iostat[FS_ZONE_RESET_IO];
> memset(__entry->read_folio_count, 0, sizeof(__entry->read_folio_count));
> memcpy(__entry->read_folio_count, read_folio_count,
> - sizeof(unsigned long long) * min_t(int, NR_PAGE_ORDERS, 11));
> + sizeof(unsigned long long) *
> + min_t(int, NR_PAGE_ORDERS, F2FS_IOSTAT_RD_FOLIO_ORDERS));
> ),
>
> TP_printk("dev = (%d,%d), "
> @@ -2201,7 +2210,8 @@ TRACE_EVENT(f2fs_iostat,
> "fs [data=%llu, (gc_data=%llu, cdata=%llu), "
> "node=%llu, meta=%llu], "
> "read_folio_count [0=%llu, 1=%llu, 2=%llu, 3=%llu, 4=%llu, "
> - "5=%llu, 6=%llu, 7=%llu, 8=%llu, 9=%llu, 10=%llu]",
> + "5=%llu, 6=%llu, 7=%llu, 8=%llu, 9=%llu, 10=%llu, 11=%llu, "
> + "12=%llu, 13=%llu, 14=%llu, 15=%llu]",
> show_dev(__entry->dev), __entry->app_wio, __entry->app_dio,
> __entry->app_bio, __entry->app_mio, __entry->app_bcdio,
> __entry->app_mcdio, __entry->fs_dio, __entry->fs_cdio,
> @@ -2218,7 +2228,9 @@ TRACE_EVENT(f2fs_iostat,
> __entry->read_folio_count[4], __entry->read_folio_count[5],
> __entry->read_folio_count[6], __entry->read_folio_count[7],
> __entry->read_folio_count[8], __entry->read_folio_count[9],
> - __entry->read_folio_count[10])
> + __entry->read_folio_count[10], __entry->read_folio_count[11],
> + __entry->read_folio_count[12], __entry->read_folio_count[13],
> + __entry->read_folio_count[14], __entry->read_folio_count[15])
> );
>
> #ifndef __F2FS_IOSTAT_LATENCY_TYPE
* [PATCH v2] f2fs: don't drop the top folio order in the f2fs_iostat tracepoint
From: Zhan Xusheng @ 2026-06-23 7:26 UTC
To: Jaegeuk Kim, Chao Yu
Cc: Daniel Lee, Steven Rostedt, Masami Hiramatsu, linux-kernel,
linux-trace-kernel, stable, Zhan Xusheng

The f2fs_iostat tracepoint stores the per-order read folio counts in a
fixed-size array and prints a fixed number of buckets, both hardcoded to
11. The sysfs iostat accounting array is instead sized by NR_PAGE_ORDERS
(= MAX_PAGE_ORDER + 1), which is not always 11:

  arm64 16K pages -> MAX_PAGE_ORDER 11 -> NR_PAGE_ORDERS 12
  arm64 64K pages -> MAX_PAGE_ORDER 13 -> NR_PAGE_ORDERS 14

f2fs enables large folios for immutable, non-compressed files, and the
read folio order is bounded by MAX_PAGECACHE_ORDER, i.e.
min(MAX_XAS_ORDER, PREFERRED_MAX_PAGECACHE_ORDER). With THP enabled this
reaches order 11 on 16K/64K base-page kernels (MAX_XAS_ORDER caps it at
11). So an order-11 read folio is possible there and is accounted into
index 11 of the array.

On those configurations the sysfs file reports the order-11 count
correctly, but the tracepoint silently drops it: the memcpy is capped at
min(NR_PAGE_ORDERS, 11), so index 11 is never copied and the trace
disagrees with sysfs. There is no memory-safety issue, only the order-11
bucket missing from the trace; 4K-page kernels (NR_PAGE_ORDERS == 11,
max order <= 9) are unaffected.

Size the array and the printed buckets by a ceiling that covers the
largest possible NR_PAGE_ORDERS (14) with headroom, and add a
BUILD_BUG_ON() so any future growth of NR_PAGE_ORDERS fails the build
loudly instead of silently truncating again. The human-readable
"order=count" output is preserved.

Fixes: cb8ff3ead9a3 ("f2fs: add page-order information for large folio reads in iostat")
Cc: stable@vger.kernel.org
Signed-off-by: Zhan Xusheng <zhanxusheng@xiaomi.com>
---
v2:
- Move the BUILD_BUG_ON() from f2fs_update_read_folio_count() into
f2fs_init_iostat() (Chao Yu)
- Add Cc: stable (Chao Yu)
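
Moving the BUILD_BUG_ON() as listed in the changelog above does not change
what is checked, because the condition is evaluated at compile time
wherever it appears. The thread does not state why the init function was
preferred, so the sketch below only illustrates the mechanism. It is a
userspace stand-in that uses C11 static_assert instead of the kernel macro,
and all DEMO_/demo_ names and values are assumptions for illustration.

```c
#include <assert.h>	/* static_assert (C11) and assert() */

#define DEMO_NR_PAGE_ORDERS  14	/* assumed: arm64 with 64K pages */
#define DEMO_RD_FOLIO_ORDERS 16	/* mirrors F2FS_IOSTAT_RD_FOLIO_ORDERS */

/* Userspace stand-in for BUILD_BUG_ON(): the build fails if cond is true. */
#define DEMO_BUILD_BUG_ON(cond) \
	static_assert(!(cond), "BUILD_BUG_ON failed: " #cond)

static unsigned long long demo_counts[DEMO_NR_PAGE_ORDERS];

/* One-time setup path: a natural home for a check that never runs. */
static int demo_init_iostat(void)
{
	/* Generates no code; rejected by the compiler if the ceiling is too small. */
	DEMO_BUILD_BUG_ON(DEMO_NR_PAGE_ORDERS > DEMO_RD_FOLIO_ORDERS);
	return 0;
}

/* Per-folio accounting path: stays free of unrelated compile-time checks. */
static void demo_update_read_folio_count(unsigned int order)
{
	if (order < DEMO_NR_PAGE_ORDERS)
		demo_counts[order]++;
}
```

Raising DEMO_NR_PAGE_ORDERS above DEMO_RD_FOLIO_ORDERS makes the build
fail regardless of which of the two functions carries the check.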
fs/f2fs/iostat.c | 6 ++++++
include/trace/events/f2fs.h | 20 ++++++++++++++++----
2 files changed, 22 insertions(+), 4 deletions(-)
diff --git a/fs/f2fs/iostat.c b/fs/f2fs/iostat.c
index ae265e3e9b2c..12d4e18a6a50 100644
--- a/fs/f2fs/iostat.c
+++ b/fs/f2fs/iostat.c
@@ -332,6 +332,12 @@ void f2fs_destroy_iostat_processing(void)
int f2fs_init_iostat(struct f2fs_sb_info *sbi)
{
+ /*
+ * The f2fs_iostat tracepoint emits a fixed number of read folio order
+ * buckets; make sure every order fits so none is silently dropped.
+ */
+ BUILD_BUG_ON(NR_PAGE_ORDERS > F2FS_IOSTAT_RD_FOLIO_ORDERS);
+
/* init iostat info */
spin_lock_init(&sbi->iostat_lock);
spin_lock_init(&sbi->iostat_lat_lock);
diff --git a/include/trace/events/f2fs.h b/include/trace/events/f2fs.h
index b5188d2671d7..3e810690d9de 100644
--- a/include/trace/events/f2fs.h
+++ b/include/trace/events/f2fs.h
@@ -2114,6 +2114,14 @@ DEFINE_EVENT(f2fs_zip_end, f2fs_decompress_pages_end,
);
#ifdef CONFIG_F2FS_IOSTAT
+/*
+ * Number of read folio order buckets emitted by the f2fs_iostat tracepoint.
+ * TP_printk() cannot loop, so the field count is fixed here and must be >=
+ * the largest possible NR_PAGE_ORDERS (14 on arm64 with 64K pages). The
+ * BUILD_BUG_ON() in f2fs_init_iostat() enforces this.
+ */
+#define F2FS_IOSTAT_RD_FOLIO_ORDERS 16
+
TRACE_EVENT(f2fs_iostat,
TP_PROTO(struct f2fs_sb_info *sbi, unsigned long long *iostat,
@@ -2151,7 +2159,7 @@ TRACE_EVENT(f2fs_iostat,
__field(unsigned long long, fs_mrio)
__field(unsigned long long, fs_discard)
__field(unsigned long long, fs_reset_zone)
- __array(unsigned long long, read_folio_count, 11)
+ __array(unsigned long long, read_folio_count, F2FS_IOSTAT_RD_FOLIO_ORDERS)
),
TP_fast_assign(
@@ -2186,7 +2194,8 @@ TRACE_EVENT(f2fs_iostat,
__entry->fs_reset_zone = iostat[FS_ZONE_RESET_IO];
memset(__entry->read_folio_count, 0, sizeof(__entry->read_folio_count));
memcpy(__entry->read_folio_count, read_folio_count,
- sizeof(unsigned long long) * min_t(int, NR_PAGE_ORDERS, 11));
+ sizeof(unsigned long long) *
+ min_t(int, NR_PAGE_ORDERS, F2FS_IOSTAT_RD_FOLIO_ORDERS));
),
TP_printk("dev = (%d,%d), "
@@ -2201,7 +2210,8 @@ TRACE_EVENT(f2fs_iostat,
"fs [data=%llu, (gc_data=%llu, cdata=%llu), "
"node=%llu, meta=%llu], "
"read_folio_count [0=%llu, 1=%llu, 2=%llu, 3=%llu, 4=%llu, "
- "5=%llu, 6=%llu, 7=%llu, 8=%llu, 9=%llu, 10=%llu]",
+ "5=%llu, 6=%llu, 7=%llu, 8=%llu, 9=%llu, 10=%llu, 11=%llu, "
+ "12=%llu, 13=%llu, 14=%llu, 15=%llu]",
show_dev(__entry->dev), __entry->app_wio, __entry->app_dio,
__entry->app_bio, __entry->app_mio, __entry->app_bcdio,
__entry->app_mcdio, __entry->fs_dio, __entry->fs_cdio,
@@ -2218,7 +2228,9 @@ TRACE_EVENT(f2fs_iostat,
__entry->read_folio_count[4], __entry->read_folio_count[5],
__entry->read_folio_count[6], __entry->read_folio_count[7],
__entry->read_folio_count[8], __entry->read_folio_count[9],
- __entry->read_folio_count[10])
+ __entry->read_folio_count[10], __entry->read_folio_count[11],
+ __entry->read_folio_count[12], __entry->read_folio_count[13],
+ __entry->read_folio_count[14], __entry->read_folio_count[15])
);
#ifndef __F2FS_IOSTAT_LATENCY_TYPE
--
2.43.0
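
The folio-order bound quoted in the commit message can be checked with a
few lines of arithmetic. This is a hedged userspace sketch, not kernel
code. It assumes THP is enabled (so PREFERRED_MAX_PAGECACHE_ORDER equals
the PMD order), 8-byte page-table entries as on arm64 (so the PMD order
is PAGE_SHIFT - 3), and XA_CHUNK_SHIFT == 6 (so MAX_XAS_ORDER is 11). All
demo_ names are hypothetical.

```c
#include <assert.h>

/* Assumed: XA_CHUNK_SHIFT is 6, giving MAX_XAS_ORDER = 6 * 2 - 1 = 11. */
#define DEMO_XA_CHUNK_SHIFT 6
#define DEMO_MAX_XAS_ORDER  (DEMO_XA_CHUNK_SHIFT * 2 - 1)

/* Bucket count the tracepoint used before the patch. */
#define DEMO_OLD_BUCKETS 11

/*
 * PMD order with 8-byte page-table entries: one table page holds
 * 2^(page_shift - 3) entries, so a PMD maps that many base pages.
 */
static unsigned int demo_pmd_order(unsigned int page_shift)
{
	assert(page_shift > 3);
	return page_shift - 3;
}

/* min(MAX_XAS_ORDER, PREFERRED_MAX_PAGECACHE_ORDER) under the assumptions. */
static unsigned int demo_max_pagecache_order(unsigned int page_shift)
{
	unsigned int pmd = demo_pmd_order(page_shift);

	return pmd < DEMO_MAX_XAS_ORDER ? pmd : DEMO_MAX_XAS_ORDER;
}

/* Nonzero if an 11-slot trace array (indices 0..10) holds the top order. */
static int demo_old_trace_covers(unsigned int page_shift)
{
	return demo_max_pagecache_order(page_shift) < DEMO_OLD_BUCKETS;
}
```

For page shifts 12, 14 and 16 (4K, 16K and 64K pages) this gives maximum
orders 9, 11 and 11, matching the commit message: only the 4K case fits
in the old 11-slot array.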
* Re: [PATCH v2] f2fs: don't drop the top folio order in the f2fs_iostat tracepoint
From: Chao Yu @ 2026-06-23 8:50 UTC
To: Zhan Xusheng, Jaegeuk Kim
Cc: chao, Daniel Lee, Steven Rostedt, Masami Hiramatsu, linux-kernel,
linux-trace-kernel, stable, Zhan Xusheng
On 6/23/26 15:26, Zhan Xusheng wrote:
> The f2fs_iostat tracepoint stores the per-order read folio counts in a
> fixed-size array and prints a fixed number of buckets, both hardcoded to
> 11. The sysfs iostat accounting array is instead sized by NR_PAGE_ORDERS
> (= MAX_PAGE_ORDER + 1), which is not always 11:
>
> arm64 16K pages -> MAX_PAGE_ORDER 11 -> NR_PAGE_ORDERS 12
> arm64 64K pages -> MAX_PAGE_ORDER 13 -> NR_PAGE_ORDERS 14
>
> f2fs enables large folios for immutable, non-compressed files, and the
> read folio order is bounded by MAX_PAGECACHE_ORDER, i.e.
> min(MAX_XAS_ORDER, PREFERRED_MAX_PAGECACHE_ORDER). With THP enabled this
> reaches order 11 on 16K/64K base-page kernels (MAX_XAS_ORDER caps it at
> 11). So an order-11 read folio is possible there and is accounted into
> index 11 of the array.
>
> On those configurations the sysfs file reports the order-11 count
> correctly, but the tracepoint silently drops it: the memcpy is capped at
> min(NR_PAGE_ORDERS, 11), so index 11 is never copied and the trace
> disagrees with sysfs. There is no memory-safety issue, only the order-11
> bucket missing from the trace; 4K-page kernels (NR_PAGE_ORDERS == 11,
> max order <= 9) are unaffected.
>
> Size the array and the printed buckets by a ceiling that covers the
> largest possible NR_PAGE_ORDERS (14) with headroom, and add a
> BUILD_BUG_ON() so any future growth of NR_PAGE_ORDERS fails the build
> loudly instead of silently truncating again. The human-readable
> "order=count" output is preserved.
>
> Fixes: cb8ff3ead9a3 ("f2fs: add page-order information for large folio reads in iostat")
> Cc: stable@vger.kernel.org
> Signed-off-by: Zhan Xusheng <zhanxusheng@xiaomi.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Thanks,