[PATCH v1] fuse: enable large folios

* [PATCH v1] fuse: enable large folios
@ 2026-06-24  1:21 Joanne Koong
  2026-06-24  4:34 ` Jingbo Xu
  2026-06-24  6:16 ` Horst Birthelmer
  0 siblings, 2 replies; 5+ messages in thread
From: Joanne Koong @ 2026-06-24  1:21 UTC (permalink / raw)
  To: miklos; +Cc: jefflexu, horst, fuse-devel

Enable large folios, capping the max order at the largest request fuse
can issue, so a folio always fits within a single request. The order
range minimum is 0, so under memory pressure the allocator falls back to
smaller folios.

Benchmarks (libfuse passthrough_hp, buffered fio, single job, 4 GiB
file, medians, NUMA-pinned, performance governor, strictlimiting on by
default):

tmpfs backing (page-cache bound):
  workload               bs    large folios off  large folios on  delta
  seq read, cold         128k  3110 MiB/s        4514 MiB/s        +45%
  seq read, cold         1M    3079 MiB/s        5181 MiB/s        +68%
  seq read, warm         128k  2438 MiB/s        4486 MiB/s        +84%
  seq read, warm         1M    2403 MiB/s        5123 MiB/s       +113%
  writeback write, seq   128k  1211 MiB/s        1699 MiB/s        +40%
  writeback write, seq   1M    1462 MiB/s        2208 MiB/s        +51%
  writeback write, rand  128k  1101 MiB/s        1757 MiB/s        +60% +
  writeback write, rand  1M    1284 MiB/s        2228 MiB/s        +74% +

xfs on NVMe backing (device bound for cold I/O):
  workload               bs    large folios off  large folios on  delta
  seq read, cold         128k  2030 MiB/s        2172 MiB/s         +7% *
  seq read, cold         1M    1999 MiB/s        2181 MiB/s         +9% *
  seq read, warm         128k  2451 MiB/s        4939 MiB/s       +101%
  seq read, warm         1M    2340 MiB/s        5639 MiB/s       +141%
  writeback write, seq   128k   637 MiB/s         747 MiB/s        +17% *
  writeback write, seq   1M     694 MiB/s         833 MiB/s        +20% *
  writeback write, rand  128k  1004 MiB/s        1648 MiB/s        +64% +
  writeback write, rand  1M    1171 MiB/s        2055 MiB/s        +75% +

(*) device-bandwidth bound. Not much throughput gain but system cpu
utilization was roughly halved
(+) random write was tested as an overwrite of a hot region (under
writeback, this is page-cache bound, so the gain comes from lower
per-folio cpu overhead rather than higher backing-device throughput)

Random reads (4k and 128k) and writethrough writes were neutral with
no regression (no read-modify-write or read-amplification penalty from
large folios)

More information about the benchmark setup and results is in
https://github.com/joannekoong/linux/commits/fuse_large_folios_benchmarks/

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
This has a dependency on the iomap uptodate helpers that were submitted to
Christian's vfs tree [1]. If it's easier to route this patch through
Christian's tree, I can resubmit this.

[1] https://lore.kernel.org/linux-fsdevel/20260623202843.2064992-1-joannelkoong@gmail.com/

 fs/fuse/file.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index cb8da4c06d17..3c9be6d8ede1 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -3136,4 +3136,14 @@ void fuse_init_file_inode(struct inode *inode, unsigned int flags)
 
 	if (IS_ENABLED(CONFIG_FUSE_DAX))
 		fuse_dax_inode_init(inode, flags);
+
+	if (!FUSE_IS_DAX(inode)) {
+		unsigned int max_pages = min(min(fc->max_write,
+						 fc->max_read) >> PAGE_SHIFT,
+					     fc->max_pages);
+
+		if (max_pages)
+			mapping_set_folio_order_range(inode->i_mapping, 0,
+						      ilog2(max_pages));
+	}
 }
-- 
2.52.0
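
As a rough illustration of the cap computed in the patch above (not part
of the patch itself), the order calculation can be sketched in userspace
C. The helper names, the 4 KiB PAGE_SHIFT, and the -1 return value for
"no range set" are assumptions made for this sketch; only the
min(min(max_write, max_read) >> PAGE_SHIFT, max_pages) and ilog2() steps
come from the diff.

```c
#include <assert.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Userspace stand-in for the kernel's ilog2(); only meaningful for n > 0. */
static unsigned int ilog2_u32(unsigned int n)
{
	unsigned int order = 0;

	assert(n > 0);
	while (n >>= 1)
		order++;
	return order;
}

static unsigned int min_u32(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical helper mirroring the patch's computation. Returns the max
 * folio order, or -1 when the limits leave no whole page, in which case
 * the patch does not call mapping_set_folio_order_range() at all.
 */
static int max_folio_order(unsigned int max_write, unsigned int max_read,
			   unsigned int max_pages)
{
	unsigned int pages = min_u32(min_u32(max_write, max_read) >> PAGE_SHIFT,
				     max_pages);

	return pages ? (int)ilog2_u32(pages) : -1;
}
```

With 1 MiB for both max_write and max_read and a max_pages of 256, this
yields order 8. A page count that is not a power of two rounds down (48
pages gives order 5, i.e. 32 pages), so a folio of the maximum order
still fits within a single request.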



* Re: [PATCH v1] fuse: enable large folios
  2026-06-24  1:21 [PATCH v1] fuse: enable large folios Joanne Koong
@ 2026-06-24  4:34 ` Jingbo Xu
  2026-06-24  6:10   ` Horst Birthelmer
  2026-06-24  6:16 ` Horst Birthelmer
  1 sibling, 1 reply; 5+ messages in thread
From: Jingbo Xu @ 2026-06-24  4:34 UTC (permalink / raw)
  To: Joanne Koong, miklos; +Cc: horst, fuse-devel



On 6/24/26 9:21 AM, Joanne Koong wrote:
> Enable large folios, capping the max order at the largest request fuse
> can issue, so a folio always fits within a single request. The order
> range minimum is 0, so under memory pressure the allocator falls back to
> smaller folios.
> 
> Benchmarks (libfuse passthrough_hp, buffered fio, single job, 4 GiB
> file, medians, NUMA-pinned, performance governor, strictlimiting on by
> default):
> 
> tmpfs backing (page-cache bound):
>   workload          bs      large folios off   on        delta
>   seq read,  cold,  128k    3110 MiB/s    4514 MiB/s     +45%
>   seq read,  cold,  1M      3079 MiB/s    5181 MiB/s     +68%
>   seq read,  warm,  128k    2438 MiB/s    4486 MiB/s     +84%
>   seq read,  warm,  1M      2403 MiB/s    5123 MiB/s    +113%
>   writeback write, seq,128k 1211 MiB/s    1699 MiB/s     +40%
>   writeback write, seq, 1M  1462 MiB/s    2208 MiB/s     +51%
>   writeback write, rand,128k 1101 MiB/s   1757 MiB/s     +60% +
>   writeback write, rand, 1M 1284 MiB/s    2228 MiB/s     +74% +
> 
> xfs on NVMe backing (device bound for cold I/O):
>   workload          bs      large folios off   on        delta
>   seq read,  cold,  128k    2030 MiB/s    2172 MiB/s      +7% *
>   seq read,  cold,  1M      1999 MiB/s    2181 MiB/s      +9% *
>   seq read,  warm,  128k    2451 MiB/s    4939 MiB/s    +101%
>   seq read,  warm,  1M      2340 MiB/s    5639 MiB/s    +141%
>   writeback write, seq,128k  637 MiB/s     747 MiB/s     +17% *
>   writeback write, seq, 1M   694 MiB/s     833 MiB/s     +20% *
>   writeback write, rand,128k 1004 MiB/s   1648 MiB/s     +64% +
>   writeback write, rand, 1M 1171 MiB/s    2055 MiB/s     +75% +
> 
> (*) device-bandwidth bound. Not much throughput gain but system cpu
> utilization was roughly halved
> (+) random write was tested as an overwrite of a hot region (under
> writeback, this is page-cache bound, so the gain comes from lower
> per-folio cpu overhead rather than higher backing-device throughput)
> 
> Random reads (4k and 128k) and writethrough writes were neutral with
> no regression (no read-modify-write or read-amplification penalty from
> large folios)
> 
> More information about the benchmark setup and results are in
> https://github.com/joannekoong/linux/commits/fuse_large_folios_benchmarks/
> 
> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> ---
> This has a dependency on the iomap uptodate helpers that were submitted to
> Christian's vfs tree [1]. If it's easier to route this patch through
> Christian's tree, I can resubmit this.
> 
> [1] https://lore.kernel.org/linux-fsdevel/20260623202843.2064992-1-joannelkoong@gmail.com/
> 
>  fs/fuse/file.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> index cb8da4c06d17..3c9be6d8ede1 100644
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -3136,4 +3136,14 @@ void fuse_init_file_inode(struct inode *inode, unsigned int flags)
>  
>  	if (IS_ENABLED(CONFIG_FUSE_DAX))
>  		fuse_dax_inode_init(inode, flags);
> +
> +	if (!FUSE_IS_DAX(inode)) {
> +		unsigned int max_pages = min(min(fc->max_write,
> +						 fc->max_read) >> PAGE_SHIFT,
> +					     fc->max_pages);
> +
> +		if (max_pages)
> +			mapping_set_folio_order_range(inode->i_mapping, 0,
> +						      ilog2(max_pages));
> +	}
>  }

mapping_set_folio_order_range(..., 0, 0) seems harmless even when
max_pages is 0.

Anyway

Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>



-- 
Thanks,
Jingbo



* Re: Re: [PATCH v1] fuse: enable large folios
  2026-06-24  4:34 ` Jingbo Xu
@ 2026-06-24  6:10   ` Horst Birthelmer
  2026-06-24  7:28     ` Jingbo Xu
  0 siblings, 1 reply; 5+ messages in thread
From: Horst Birthelmer @ 2026-06-24  6:10 UTC (permalink / raw)
  To: Jingbo Xu; +Cc: Joanne Koong, miklos, fuse-devel

On Wed, Jun 24, 2026 at 12:34:16PM +0800, Jingbo Xu wrote:
> 
> 
> On 6/24/26 9:21 AM, Joanne Koong wrote:
> > Enable large folios, capping the max order at the largest request fuse
> > can issue, so a folio always fits within a single request. The order
> > range minimum is 0, so under memory pressure the allocator falls back to
> > smaller folios.
> > 
> > Benchmarks (libfuse passthrough_hp, buffered fio, single job, 4 GiB
> > file, medians, NUMA-pinned, performance governor, strictlimiting on by
> > default):
> > 
> > tmpfs backing (page-cache bound):
> >   workload          bs      large folios off   on        delta
> >   seq read,  cold,  128k    3110 MiB/s    4514 MiB/s     +45%
> >   seq read,  cold,  1M      3079 MiB/s    5181 MiB/s     +68%
> >   seq read,  warm,  128k    2438 MiB/s    4486 MiB/s     +84%
> >   seq read,  warm,  1M      2403 MiB/s    5123 MiB/s    +113%
> >   writeback write, seq,128k 1211 MiB/s    1699 MiB/s     +40%
> >   writeback write, seq, 1M  1462 MiB/s    2208 MiB/s     +51%
> >   writeback write, rand,128k 1101 MiB/s   1757 MiB/s     +60% +
> >   writeback write, rand, 1M 1284 MiB/s    2228 MiB/s     +74% +
> > 
> > xfs on NVMe backing (device bound for cold I/O):
> >   workload          bs      large folios off   on        delta
> >   seq read,  cold,  128k    2030 MiB/s    2172 MiB/s      +7% *
> >   seq read,  cold,  1M      1999 MiB/s    2181 MiB/s      +9% *
> >   seq read,  warm,  128k    2451 MiB/s    4939 MiB/s    +101%
> >   seq read,  warm,  1M      2340 MiB/s    5639 MiB/s    +141%
> >   writeback write, seq,128k  637 MiB/s     747 MiB/s     +17% *
> >   writeback write, seq, 1M   694 MiB/s     833 MiB/s     +20% *
> >   writeback write, rand,128k 1004 MiB/s   1648 MiB/s     +64% +
> >   writeback write, rand, 1M 1171 MiB/s    2055 MiB/s     +75% +
> > 
> > (*) device-bandwidth bound. Not much throughput gain but system cpu
> > utilization was roughly halved
> > (+) random write was tested as an overwrite of a hot region (under
> > writeback, this is page-cache bound, so the gain comes from lower
> > per-folio cpu overhead rather than higher backing-device throughput)
> > 
> > Random reads (4k and 128k) and writethrough writes were neutral with
> > no regression (no read-modify-write or read-amplification penalty from
> > large folios)
> > 
> > More information about the benchmark setup and results are in
> > https://github.com/joannekoong/linux/commits/fuse_large_folios_benchmarks/
> > 
> > Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> > ---
> > This has a dependency on the iomap uptodate helpers that were submitted to
> > Christian's vfs tree [1]. If it's easier to route this patch through
> > Christian's tree, I can resubmit this.
> > 
> > [1] https://lore.kernel.org/linux-fsdevel/20260623202843.2064992-1-joannelkoong@gmail.com/
> > 
> >  fs/fuse/file.c | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> > 
> > diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> > index cb8da4c06d17..3c9be6d8ede1 100644
> > --- a/fs/fuse/file.c
> > +++ b/fs/fuse/file.c
> > @@ -3136,4 +3136,14 @@ void fuse_init_file_inode(struct inode *inode, unsigned int flags)
> >  
> >  	if (IS_ENABLED(CONFIG_FUSE_DAX))
> >  		fuse_dax_inode_init(inode, flags);
> > +
> > +	if (!FUSE_IS_DAX(inode)) {
> > +		unsigned int max_pages = min(min(fc->max_write,
> > +						 fc->max_read) >> PAGE_SHIFT,
> > +					     fc->max_pages);
> > +
> > +		if (max_pages)
> > +			mapping_set_folio_order_range(inode->i_mapping, 0,
> > +						      ilog2(max_pages));
> > +	}
> >  }
> 
> mapping_set_folio_order_range(..., 0, 0) seems harmless even when
> max_pages is 0.
> 
> Anyway
> 
> Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
> 

I was just about to suggest something like

mapping_set_folio_order_range(inode->i_mapping, 0,
			      ilog2(max_pages ?: 1));

but you are right, it's not a problem.
> 
> -- 
> Thanks,
> Jingbo
> 

This was actually one of the problems I ran into when I enabled large
folios in Linux 6.17, since it would create problems for readahead.

Reviewed-by: Horst Birthelmer <hbirthelmer@ddn.com>

---
Thanks,
Horst


* Re: [PATCH v1] fuse: enable large folios
  2026-06-24  1:21 [PATCH v1] fuse: enable large folios Joanne Koong
  2026-06-24  4:34 ` Jingbo Xu
@ 2026-06-24  6:16 ` Horst Birthelmer
  1 sibling, 0 replies; 5+ messages in thread
From: Horst Birthelmer @ 2026-06-24  6:16 UTC (permalink / raw)
  To: Joanne Koong; +Cc: miklos, jefflexu, fuse-devel

On Tue, Jun 23, 2026 at 06:21:32PM -0700, Joanne Koong wrote:
> Enable large folios, capping the max order at the largest request fuse
> can issue, so a folio always fits within a single request. The order
> range minimum is 0, so under memory pressure the allocator falls back to
> smaller folios.
> 
> Benchmarks (libfuse passthrough_hp, buffered fio, single job, 4 GiB
> file, medians, NUMA-pinned, performance governor, strictlimiting on by
> default):
> 
> tmpfs backing (page-cache bound):
>   workload          bs      large folios off   on        delta
>   seq read,  cold,  128k    3110 MiB/s    4514 MiB/s     +45%
>   seq read,  cold,  1M      3079 MiB/s    5181 MiB/s     +68%
>   seq read,  warm,  128k    2438 MiB/s    4486 MiB/s     +84%
>   seq read,  warm,  1M      2403 MiB/s    5123 MiB/s    +113%
>   writeback write, seq,128k 1211 MiB/s    1699 MiB/s     +40%
>   writeback write, seq, 1M  1462 MiB/s    2208 MiB/s     +51%
>   writeback write, rand,128k 1101 MiB/s   1757 MiB/s     +60% +
>   writeback write, rand, 1M 1284 MiB/s    2228 MiB/s     +74% +
> 
> xfs on NVMe backing (device bound for cold I/O):
>   workload          bs      large folios off   on        delta
>   seq read,  cold,  128k    2030 MiB/s    2172 MiB/s      +7% *
>   seq read,  cold,  1M      1999 MiB/s    2181 MiB/s      +9% *
>   seq read,  warm,  128k    2451 MiB/s    4939 MiB/s    +101%
>   seq read,  warm,  1M      2340 MiB/s    5639 MiB/s    +141%
>   writeback write, seq,128k  637 MiB/s     747 MiB/s     +17% *
>   writeback write, seq, 1M   694 MiB/s     833 MiB/s     +20% *
>   writeback write, rand,128k 1004 MiB/s   1648 MiB/s     +64% +
>   writeback write, rand, 1M 1171 MiB/s    2055 MiB/s     +75% +
> 

Hi Joanne,

just out of curiosity, did you disable bdi strict limiting for this?
In my tests, especially the large writes run into throttling pretty
fast, so that it effectively writes pagewise, which was not the target
of the test.

Thanks,
Horst


* Re: [PATCH v1] fuse: enable large folios
  2026-06-24  6:10   ` Horst Birthelmer
@ 2026-06-24  7:28     ` Jingbo Xu
  0 siblings, 0 replies; 5+ messages in thread
From: Jingbo Xu @ 2026-06-24  7:28 UTC (permalink / raw)
  To: Horst Birthelmer; +Cc: Joanne Koong, miklos, fuse-devel



On 6/24/26 2:10 PM, Horst Birthelmer wrote:
> On Wed, Jun 24, 2026 at 12:34:16PM +0800, Jingbo Xu wrote:
>>
>>
>> On 6/24/26 9:21 AM, Joanne Koong wrote:
>>> Enable large folios, capping the max order at the largest request fuse
>>> can issue, so a folio always fits within a single request. The order
>>> range minimum is 0, so under memory pressure the allocator falls back to
>>> smaller folios.
>>>
>>> Benchmarks (libfuse passthrough_hp, buffered fio, single job, 4 GiB
>>> file, medians, NUMA-pinned, performance governor, strictlimiting on by
>>> default):
>>>
>>> tmpfs backing (page-cache bound):
>>>   workload          bs      large folios off   on        delta
>>>   seq read,  cold,  128k    3110 MiB/s    4514 MiB/s     +45%
>>>   seq read,  cold,  1M      3079 MiB/s    5181 MiB/s     +68%
>>>   seq read,  warm,  128k    2438 MiB/s    4486 MiB/s     +84%
>>>   seq read,  warm,  1M      2403 MiB/s    5123 MiB/s    +113%
>>>   writeback write, seq,128k 1211 MiB/s    1699 MiB/s     +40%
>>>   writeback write, seq, 1M  1462 MiB/s    2208 MiB/s     +51%
>>>   writeback write, rand,128k 1101 MiB/s   1757 MiB/s     +60% +
>>>   writeback write, rand, 1M 1284 MiB/s    2228 MiB/s     +74% +
>>>
>>> xfs on NVMe backing (device bound for cold I/O):
>>>   workload          bs      large folios off   on        delta
>>>   seq read,  cold,  128k    2030 MiB/s    2172 MiB/s      +7% *
>>>   seq read,  cold,  1M      1999 MiB/s    2181 MiB/s      +9% *
>>>   seq read,  warm,  128k    2451 MiB/s    4939 MiB/s    +101%
>>>   seq read,  warm,  1M      2340 MiB/s    5639 MiB/s    +141%
>>>   writeback write, seq,128k  637 MiB/s     747 MiB/s     +17% *
>>>   writeback write, seq, 1M   694 MiB/s     833 MiB/s     +20% *
>>>   writeback write, rand,128k 1004 MiB/s   1648 MiB/s     +64% +
>>>   writeback write, rand, 1M 1171 MiB/s    2055 MiB/s     +75% +
>>>
>>> (*) device-bandwidth bound. Not much throughput gain but system cpu
>>> utilization was roughly halved
>>> (+) random write was tested as an overwrite of a hot region (under
>>> writeback, this is page-cache bound, so the gain comes from lower
>>> per-folio cpu overhead rather than higher backing-device throughput)
>>>
>>> Random reads (4k and 128k) and writethrough writes were neutral with
>>> no regression (no read-modify-write or read-amplification penalty from
>>> large folios)
>>>
>>> More information about the benchmark setup and results are in
>>> https://github.com/joannekoong/linux/commits/fuse_large_folios_benchmarks/
>>>
>>> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
>>> ---
>>> This has a dependency on the iomap uptodate helpers that were submitted to
>>> Christian's vfs tree [1]. If it's easier to route this patch through
>>> Christian's tree, I can resubmit this.
>>>
>>> [1] https://lore.kernel.org/linux-fsdevel/20260623202843.2064992-1-joannelkoong@gmail.com/
>>>
>>>  fs/fuse/file.c | 10 ++++++++++
>>>  1 file changed, 10 insertions(+)
>>>
>>> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
>>> index cb8da4c06d17..3c9be6d8ede1 100644
>>> --- a/fs/fuse/file.c
>>> +++ b/fs/fuse/file.c
>>> @@ -3136,4 +3136,14 @@ void fuse_init_file_inode(struct inode *inode, unsigned int flags)
>>>  
>>>  	if (IS_ENABLED(CONFIG_FUSE_DAX))
>>>  		fuse_dax_inode_init(inode, flags);
>>> +
>>> +	if (!FUSE_IS_DAX(inode)) {
>>> +		unsigned int max_pages = min(min(fc->max_write,
>>> +						 fc->max_read) >> PAGE_SHIFT,
>>> +					     fc->max_pages);
>>> +
>>> +		if (max_pages)
>>> +			mapping_set_folio_order_range(inode->i_mapping, 0,
>>> +						      ilog2(max_pages));
>>> +	}
>>>  }
>>
>> mapping_set_folio_order_range(..., 0, 0) seems harmless even when
>> max_pages is 0.
>>
>> Anyway
>>
>> Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
>>
> 
> I was just about to suggest something like
> 
> mapping_set_folio_order_range(inode->i_mapping, 0,
> 			      ilog2(max_pages ?: 1));
> 
> but you are right, it's not a problem.
>>
>> -- 
>> Thanks,
>> Jingbo
>>
> 
> This was actually one of the problems I ran into when I enabled large
> folios in Linux 6.17, since it would create problems for readahead.
> 
> Reviewed-by: Horst Birthelmer <hbirthelmer@ddn.com>
> 

Okay, I got it. ilog2(0) doesn't equal 0.
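
To spell out why the zero case matters, here is a minimal userspace
sketch. It assumes the runtime (non-constant) ilog2() path behaves like
fls(n) - 1 with fls(0) == 0; that is my understanding of the kernel
helper rather than something stated in this thread. All function names
below are hypothetical stand-ins, and the GNU "?: 1" shorthand from the
suggestion above is written out in ISO C.

```c
#include <assert.h>

/* Stand-in for fls(): 1-based index of the highest set bit, 0 for x == 0. */
static int fls_u32(unsigned int x)
{
	int bit = 0;

	while (x) {
		bit++;
		x >>= 1;
	}
	return bit;
}

/* Assumed model of the runtime ilog2() path: fls(n) - 1. */
static int ilog2_model(unsigned int n)
{
	return fls_u32(n) - 1;
}

/* Max order with the zero guard applied before taking the logarithm. */
static int guarded_order(unsigned int max_pages)
{
	int order = ilog2_model(max_pages ? max_pages : 1);

	assert(order >= 0);	/* the guard rules out the -1 case */
	return order;
}
```

Under that assumption, an unguarded call with max_pages == 0 would pass
-1, converted to a very large unsigned order, as the maximum, which is
what both the "if (max_pages)" check in the patch and the "?: 1" variant
avoid.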


-- 
Thanks,
Jingbo

