* [PATCH v1] fuse: enable large folios
@ 2026-06-24  1:21 Joanne Koong
  2026-06-24  4:34 ` Jingbo Xu
  2026-06-24  6:16 ` Horst Birthelmer
  0 siblings, 2 replies; 5+ messages in thread
From: Joanne Koong @ 2026-06-24  1:21 UTC (permalink / raw)
  To: miklos; +Cc: jefflexu, horst, fuse-devel

Enable large folios, capping the max order at the largest request fuse
can issue, so a folio always fits within a single request. The order
range minimum is 0, so under memory pressure the allocator falls back to
smaller folios.

Benchmarks (libfuse passthrough_hp, buffered fio, single job, 4 GiB
file, medians, NUMA-pinned, performance governor, strictlimiting on by
default):

tmpfs backing (page-cache bound):
  workload               bs    large folios off  large folios on  delta
  seq read, cold         128k  3110 MiB/s        4514 MiB/s        +45%
  seq read, cold         1M    3079 MiB/s        5181 MiB/s        +68%
  seq read, warm         128k  2438 MiB/s        4486 MiB/s        +84%
  seq read, warm         1M    2403 MiB/s        5123 MiB/s       +113%
  writeback write, seq   128k  1211 MiB/s        1699 MiB/s        +40%
  writeback write, seq   1M    1462 MiB/s        2208 MiB/s        +51%
  writeback write, rand  128k  1101 MiB/s        1757 MiB/s        +60% +
  writeback write, rand  1M    1284 MiB/s        2228 MiB/s        +74% +

xfs on NVMe backing (device bound for cold I/O):
  workload               bs    large folios off  large folios on  delta
  seq read, cold         128k  2030 MiB/s        2172 MiB/s         +7% *
  seq read, cold         1M    1999 MiB/s        2181 MiB/s         +9% *
  seq read, warm         128k  2451 MiB/s        4939 MiB/s       +101%
  seq read, warm         1M    2340 MiB/s        5639 MiB/s       +141%
  writeback write, seq   128k   637 MiB/s         747 MiB/s        +17% *
  writeback write, seq   1M     694 MiB/s         833 MiB/s        +20% *
  writeback write, rand  128k  1004 MiB/s        1648 MiB/s        +64% +
  writeback write, rand  1M    1171 MiB/s        2055 MiB/s        +75% +

(*) Device-bandwidth bound: throughput gained little, but system CPU
utilization was roughly halved.
(+) Random write was tested as an overwrite of a hot region. Under
writeback this is page-cache bound, so the gain comes from lower
per-folio CPU overhead rather than higher backing-device throughput.

Random reads (4k and 128k) and writethrough writes were neutral, with
no regression (no read-modify-write or read-amplification penalty from
large folios).

More information about the benchmark setup and results is at
https://github.com/joannekoong/linux/commits/fuse_large_folios_benchmarks/

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
This has a dependency on the iomap uptodate helpers that were submitted to
Christian's vfs tree [1]. If it's easier to route this patch through
Christian's tree, I can resubmit this.

[1] https://lore.kernel.org/linux-fsdevel/20260623202843.2064992-1-joannelkoong@gmail.com/
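
For readers following the order cap described in the commit message, here
is a minimal userspace sketch of the same arithmetic. It is not the kernel
code: sketch_ilog2(), sketch_max_folio_order() and SKETCH_PAGE_SHIFT are
hypothetical names, and a 4 KiB page size is assumed. The -1 return value
is only this sketch's way of marking the case where the patch leaves the
mapping's folio order range untouched.

```c
/* Assumption for this sketch: 4 KiB pages. The kernel uses PAGE_SHIFT. */
#define SKETCH_PAGE_SHIFT 12

/* Floor of log2(v) for v > 0, standing in for the kernel's ilog2(). */
static unsigned int sketch_ilog2(unsigned int v)
{
	unsigned int order = 0;

	while (v >>= 1)
		order++;
	return order;
}

/*
 * Hypothetical helper mirroring the patch's computation: take the smaller
 * of the read and write byte limits, convert it to pages, cap it at the
 * per-request page limit, and round down to a power-of-two order.
 * Returns -1 when the limits allow less than one page.
 */
static int sketch_max_folio_order(unsigned int max_write,
				  unsigned int max_read,
				  unsigned int max_pages)
{
	unsigned int bytes = max_write < max_read ? max_write : max_read;
	unsigned int pages = bytes >> SKETCH_PAGE_SHIFT;

	if (max_pages < pages)
		pages = max_pages;
	if (!pages)
		return -1;
	return (int)sketch_ilog2(pages);
}
```

Under these assumptions, max_write = max_read = 1 MiB with max_pages = 256
gives order 8 (256 pages, 1 MiB folios). Because the logarithm rounds
down, a limit that is not a power of two, such as 100 pages, gives order 6
(64 pages), so a folio of the maximum order still fits in one request.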

 fs/fuse/file.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index cb8da4c06d17..3c9be6d8ede1 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -3136,4 +3136,14 @@ void fuse_init_file_inode(struct inode *inode, unsigned int flags)
 
 	if (IS_ENABLED(CONFIG_FUSE_DAX))
 		fuse_dax_inode_init(inode, flags);
+
+	if (!FUSE_IS_DAX(inode)) {
+		unsigned int max_pages = min(min(fc->max_write,
+						 fc->max_read) >> PAGE_SHIFT,
+					     fc->max_pages);
+
+		if (max_pages)
+			mapping_set_folio_order_range(inode->i_mapping, 0,
+						      ilog2(max_pages));
+	}
 }
-- 
2.52.0

