* [PATCH] fuse: enable large folios (if writeback cache is unused)
@ 2025-08-11 20:40 Joanne Koong
2025-08-11 21:13 ` Joanne Koong
2025-08-15 11:01 ` Jingbo Xu
0 siblings, 2 replies; 15+ messages in thread
From: Joanne Koong @ 2025-08-11 20:40 UTC (permalink / raw)
To: miklos; +Cc: linux-fsdevel, jefflexu, bernd.schubert, willy, kernel-team
Large folios are only enabled if the writeback cache isn't on.
(Strictlimiting needs to be turned off if the writeback cache is used in
conjunction with large folios, else this tanks performance.)
Benchmarks showed noticeable improvements for writes (both sequential
and random). There were no performance differences seen for random reads
or direct IO. For sequential reads, there was no performance difference
seen for the first read (which populates the page cache) but subsequent
sequential reads showed a huge speedup.
Benchmarks were run using fio on the passthrough_hp fuse server:
~/libfuse/build/example/passthrough_hp ~/libfuse ~/fuse_mnt --nopassthrough --nocache
run fio in ~/fuse_mnt:
fio --name=test --ioengine=sync --rw=write --bs=1M --size=5G --numjobs=2 --ramp_time=30 --group_reporting=1
Results (tested on bs=256K, 1M, 5M) showed roughly a 15-20% increase in
write throughput and for sequential reads after the page cache has
already been populated, there was a ~800% speedup seen.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
fs/fuse/file.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index adc4aa6810f5..2e7aae294c9e 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -1167,9 +1167,10 @@ static ssize_t fuse_fill_write_pages(struct fuse_io_args *ia,
pgoff_t index = pos >> PAGE_SHIFT;
unsigned int bytes;
unsigned int folio_offset;
+ fgf_t fgp = FGP_WRITEBEGIN | fgf_set_order(num);
again:
- folio = __filemap_get_folio(mapping, index, FGP_WRITEBEGIN,
+ folio = __filemap_get_folio(mapping, index, fgp,
mapping_gfp_mask(mapping));
if (IS_ERR(folio)) {
err = PTR_ERR(folio);
@@ -3155,11 +3156,24 @@ void fuse_init_file_inode(struct inode *inode, unsigned int flags)
{
struct fuse_inode *fi = get_fuse_inode(inode);
struct fuse_conn *fc = get_fuse_conn(inode);
+ unsigned int max_pages, max_order;
inode->i_fop = &fuse_file_operations;
inode->i_data.a_ops = &fuse_file_aops;
- if (fc->writeback_cache)
+ if (fc->writeback_cache) {
mapping_set_writeback_may_deadlock_on_reclaim(&inode->i_data);
+ } else {
+ /*
+ * Large folios are only enabled if the writeback cache isn't on.
+ * If the writeback cache is on, large folios should only be
+ * enabled in conjunction with strictlimiting turned off, else
+ * performance tanks.
+ */
+ max_pages = min(min(fc->max_write, fc->max_read) >> PAGE_SHIFT,
+ fc->max_pages);
+ max_order = ilog2(max_pages);
+ mapping_set_folio_order_range(inode->i_mapping, 0, max_order);
+ }
INIT_LIST_HEAD(&fi->write_files);
INIT_LIST_HEAD(&fi->queued_writes);
--
2.47.3
* Re: [PATCH] fuse: enable large folios (if writeback cache is unused)
2025-08-11 20:40 [PATCH] fuse: enable large folios (if writeback cache is unused) Joanne Koong
@ 2025-08-11 21:13 ` Joanne Koong
2025-08-12 3:25 ` Chunsheng Luo
2025-08-12 11:13 ` Miklos Szeredi
2025-08-15 11:01 ` Jingbo Xu
1 sibling, 2 replies; 15+ messages in thread
From: Joanne Koong @ 2025-08-11 21:13 UTC (permalink / raw)
To: miklos; +Cc: linux-fsdevel, jefflexu, bernd.schubert, willy, kernel-team
On Mon, Aug 11, 2025 at 1:43 PM Joanne Koong <joannelkoong@gmail.com> wrote:
>
> Large folios are only enabled if the writeback cache isn't on.
> (Strictlimiting needs to be turned off if the writeback cache is used in
> conjunction with large folios, else this tanks performance.)
Some ideas for having this work with the writeback cache are
a) add a fuse sysctl that sysadmins can set to turn off strictlimiting
for all fuse servers mounted afterwards; in the kernel, turn on large
folios for writeback if that sysctl is on
b) if the fuse server is privileged, automatically turn off
strictlimiting and enable large folios for writeback
Any thoughts?
Thanks,
Joanne
* Re: [PATCH] fuse: enable large folios (if writeback cache is unused)
2025-08-11 21:13 ` Joanne Koong
@ 2025-08-12 3:25 ` Chunsheng Luo
2025-08-12 22:14 ` Joanne Koong
2025-08-12 11:13 ` Miklos Szeredi
1 sibling, 1 reply; 15+ messages in thread
From: Chunsheng Luo @ 2025-08-12 3:25 UTC (permalink / raw)
To: joannelkoong
Cc: bernd.schubert, jefflexu, kernel-team, linux-fsdevel, miklos,
willy
On Mon, Aug 11, 2025 Joanne Koong <joannelkoong@gmail.com> wrote:
>>
>> Large folios are only enabled if the writeback cache isn't on.
>> (Strictlimiting needs to be turned off if the writeback cache is used in
>> conjunction with large folios, else this tanks performance.)
>
> Some ideas for having this work with the writeback cache are
> a) add a fuse sysctl sysadmins can set to turn off strictlimiting for
> all fuse servers mounted after, in the kernel turn on large folios for
> writeback if that sysctl is on
> b) if the fuse server is privileged automatically turn off
> strictlimiting and enable large folios for writeback
>
> Any thoughts?
Should large folios be enabled based on mount options? Consider adding an
option in fuse_init_out to explicitly turn on large folios.
Thanks
Chunsheng Luo
* Re: [PATCH] fuse: enable large folios (if writeback cache is unused)
2025-08-11 21:13 ` Joanne Koong
2025-08-12 3:25 ` Chunsheng Luo
@ 2025-08-12 11:13 ` Miklos Szeredi
2025-08-12 19:38 ` Darrick J. Wong
2025-08-12 22:44 ` Joanne Koong
1 sibling, 2 replies; 15+ messages in thread
From: Miklos Szeredi @ 2025-08-12 11:13 UTC (permalink / raw)
To: Joanne Koong; +Cc: linux-fsdevel, jefflexu, bernd.schubert, willy, kernel-team
On Mon, 11 Aug 2025 at 23:13, Joanne Koong <joannelkoong@gmail.com> wrote:
>
> On Mon, Aug 11, 2025 at 1:43 PM Joanne Koong <joannelkoong@gmail.com> wrote:
> >
> > Large folios are only enabled if the writeback cache isn't on.
> > (Strictlimiting needs to be turned off if the writeback cache is used in
> > conjunction with large folios, else this tanks performance.)
Is there an explanation somewhere about the writeback cache vs.
strictlimit issue?
Thanks,
Miklos
* Re: [PATCH] fuse: enable large folios (if writeback cache is unused)
2025-08-12 11:13 ` Miklos Szeredi
@ 2025-08-12 19:38 ` Darrick J. Wong
2025-08-12 23:02 ` Joanne Koong
2025-08-12 22:44 ` Joanne Koong
1 sibling, 1 reply; 15+ messages in thread
From: Darrick J. Wong @ 2025-08-12 19:38 UTC (permalink / raw)
To: Miklos Szeredi
Cc: Joanne Koong, linux-fsdevel, jefflexu, bernd.schubert, willy,
kernel-team
On Tue, Aug 12, 2025 at 01:13:57PM +0200, Miklos Szeredi wrote:
> On Mon, 11 Aug 2025 at 23:13, Joanne Koong <joannelkoong@gmail.com> wrote:
> >
> > On Mon, Aug 11, 2025 at 1:43 PM Joanne Koong <joannelkoong@gmail.com> wrote:
> > >
> > > Large folios are only enabled if the writeback cache isn't on.
> > > (Strictlimiting needs to be turned off if the writeback cache is used in
> > > conjunction with large folios, else this tanks performance.)
>
> Is there an explanation somewhere about the writeback cache vs.
> strictlimit issue?
and, for n00bs such as myself: what is "strictlimit"? :)
--D
> Thanks,
> Miklos
>
* Re: [PATCH] fuse: enable large folios (if writeback cache is unused)
2025-08-12 3:25 ` Chunsheng Luo
@ 2025-08-12 22:14 ` Joanne Koong
0 siblings, 0 replies; 15+ messages in thread
From: Joanne Koong @ 2025-08-12 22:14 UTC (permalink / raw)
To: Chunsheng Luo
Cc: bernd.schubert, jefflexu, kernel-team, linux-fsdevel, miklos,
willy
On Mon, Aug 11, 2025 at 8:26 PM Chunsheng Luo <luochunsheng@ustc.edu> wrote:
>
> On Mon, Aug 11, 2025 Joanne Koong <joannelkoong@gmail.com> wrote:
> >>
> >> Large folios are only enabled if the writeback cache isn't on.
> >> (Strictlimiting needs to be turned off if the writeback cache is used in
> >> conjunction with large folios, else this tanks performance.)
> >
> > Some ideas for having this work with the writeback cache are
> > a) add a fuse sysctl sysadmins can set to turn off strictlimiting for
> > all fuse servers mounted after, in the kernel turn on large folios for
> > writeback if that sysctl is on
> > b) if the fuse server is privileged automatically turn off
> > strictlimiting and enable large folios for writeback
> >
> > Any thoughts?
>
> Should large folios be enabled based on mount options? Consider adding an
> option in fuse_init_out to explicitly turn on large folios.
>
Hi Chunsheng,
Personally I'm not a fan of doing it through the init request because
it is tied hand-in-hand with disabling strictlimiting (which requires
admin privileges) and imo
a) it feels clunky that the user needs to opt into it for writeback
(for non-writeback cases, ideally large folios are the status quo) and
then also find the bdi that corresponds to that fuse mount, then go
into /sys/class/bdi/* for that bdi to disable strictlimiting, all
while making sure this happens before write workloads start
b) I think users (most of whom are not familiar with kernel internals)
will likely be confused by what large folios are and whether/when they
should opt into it or not
imo if the fuse server is mounted as a privileged server, I think it's
reasonable that strictlimiting could be turned off by default.
Thanks,
Joanne
> Thanks
> Chunsheng Luo
* Re: [PATCH] fuse: enable large folios (if writeback cache is unused)
2025-08-12 11:13 ` Miklos Szeredi
2025-08-12 19:38 ` Darrick J. Wong
@ 2025-08-12 22:44 ` Joanne Koong
1 sibling, 0 replies; 15+ messages in thread
From: Joanne Koong @ 2025-08-12 22:44 UTC (permalink / raw)
To: Miklos Szeredi
Cc: linux-fsdevel, jefflexu, bernd.schubert, willy, kernel-team
On Tue, Aug 12, 2025 at 4:14 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Mon, 11 Aug 2025 at 23:13, Joanne Koong <joannelkoong@gmail.com> wrote:
> >
> > On Mon, Aug 11, 2025 at 1:43 PM Joanne Koong <joannelkoong@gmail.com> wrote:
> > >
> > > Large folios are only enabled if the writeback cache isn't on.
> > > (Strictlimiting needs to be turned off if the writeback cache is used in
> > > conjunction with large folios, else this tanks performance.)
>
> Is there an explanation somewhere about the writeback cache vs.
> strictlimit issue?
There's not much documentation about how strictlimit affects writeback
but from the balance dirty pages code, my understanding is that with
strictlimit on, the dirty throttle control uses the wb counters/limits
instead of the global ones and calculates stuff like the setpoint and
position ratio more conservatively, which leads to more eager io
throttling. This rfc patchset [1] is meant to help but it won't help
workloads that do lots of large sequential writes. Experimentally,
with strictlimiting on and the writeback cache used with large folios,
I saw around a 25 to 50% hit in throughput but with strictlimiting
disabled, there was around a 12% to 15% improvement.
(It'd also be great if others have time to confirm these benchmarks on
their systems to make sure they're also seeing the same percentage
improvements on their machines)
[1] https://lore.kernel.org/linux-fsdevel/20250801002131.255068-1-joannelkoong@gmail.com/T/#t
Thanks,
Joanne
>
> Thanks,
> Miklos
* Re: [PATCH] fuse: enable large folios (if writeback cache is unused)
2025-08-12 19:38 ` Darrick J. Wong
@ 2025-08-12 23:02 ` Joanne Koong
2025-08-13 1:20 ` Darrick J. Wong
2025-08-13 8:20 ` Miklos Szeredi
0 siblings, 2 replies; 15+ messages in thread
From: Joanne Koong @ 2025-08-12 23:02 UTC (permalink / raw)
To: Darrick J. Wong
Cc: Miklos Szeredi, linux-fsdevel, jefflexu, bernd.schubert, willy,
kernel-team
On Tue, Aug 12, 2025 at 12:38 PM Darrick J. Wong <djwong@kernel.org> wrote:
>
> On Tue, Aug 12, 2025 at 01:13:57PM +0200, Miklos Szeredi wrote:
> > On Mon, 11 Aug 2025 at 23:13, Joanne Koong <joannelkoong@gmail.com> wrote:
> > >
> > > On Mon, Aug 11, 2025 at 1:43 PM Joanne Koong <joannelkoong@gmail.com> wrote:
> > > >
> > > > Large folios are only enabled if the writeback cache isn't on.
> > > > (Strictlimiting needs to be turned off if the writeback cache is used in
> > > > conjunction with large folios, else this tanks performance.)
> >
> > Is there an explanation somewhere about the writeback cache vs.
> > strictlimit issue?
>
> and, for n00bs such as myself: what is "strictlimit"? :)
>
My understanding of strictlimit is that it's a way of preventing
non-trusted filesystems from dirtying too many pages too quickly and
thus taking up too much bandwidth. It imposes stricter / more
conservative limits on how many pages a filesystem can dirty before it
gets forcibly throttled (the bulk of the logic happens in
balance_dirty_pages()). This is needed for fuse because fuse servers
may be unprivileged and malicious or buggy. The feature was introduced
in commit 5a53748568f7 ("mm/page-writeback.c: add strictlimit
feature"). The reason we now run into this is because with large
folios, the dirtying happens so much faster now (eg a 1MB folio is
dirtied and copied at once instead of page by page), and as a result
the fuse server gets throttled more while doing large writes, which
ends up making the write overall slower.
Thanks,
Joanne
> --D
>
> > Thanks,
> > Miklos
> >
* Re: [PATCH] fuse: enable large folios (if writeback cache is unused)
2025-08-12 23:02 ` Joanne Koong
@ 2025-08-13 1:20 ` Darrick J. Wong
2025-08-13 17:40 ` Joanne Koong
2025-08-13 8:20 ` Miklos Szeredi
1 sibling, 1 reply; 15+ messages in thread
From: Darrick J. Wong @ 2025-08-13 1:20 UTC (permalink / raw)
To: Joanne Koong
Cc: Miklos Szeredi, linux-fsdevel, jefflexu, bernd.schubert, willy,
kernel-team
On Tue, Aug 12, 2025 at 04:02:12PM -0700, Joanne Koong wrote:
> On Tue, Aug 12, 2025 at 12:38 PM Darrick J. Wong <djwong@kernel.org> wrote:
> >
> > On Tue, Aug 12, 2025 at 01:13:57PM +0200, Miklos Szeredi wrote:
> > > On Mon, 11 Aug 2025 at 23:13, Joanne Koong <joannelkoong@gmail.com> wrote:
> > > >
> > > > On Mon, Aug 11, 2025 at 1:43 PM Joanne Koong <joannelkoong@gmail.com> wrote:
> > > > >
> > > > > Large folios are only enabled if the writeback cache isn't on.
> > > > > (Strictlimiting needs to be turned off if the writeback cache is used in
> > > > > conjunction with large folios, else this tanks performance.)
> > >
> > > Is there an explanation somewhere about the writeback cache vs.
> > > strictlimit issue?
> >
> > and, for n00bs such as myself: what is "strictlimit"? :)
> >
>
> My understanding of strictlimit is that it's a way of preventing
> non-trusted filesystems from dirtying too many pages too quickly and
> thus taking up too much bandwidth. It imposes stricter / more
Oh, BDI_CAP_STRICTLIMIT.
/me digs
"Then wb_thresh is 1% of 20% of 16GB. This amounts to ~8K pages."
Oh wow.
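The arithmetic in the quoted sentence can be reproduced with a small userspace sketch. This only mirrors the numbers in the quote (1% of 20% of 16GB); it is not the kernel's wb_thresh computation, and the function name, the 4 KiB page size, and reading "16GB" as 16 GiB are assumptions made for illustration.

```c
#include <stdint.h>

/*
 * Illustrative only: the per-bdi dirty threshold implied by the quoted
 * example, expressed in pages. All parameter names are hypothetical.
 *
 *   ram_bytes         total memory considered dirtyable
 *   dirty_ratio_pct   global dirty ratio, in percent (20 in the quote)
 *   bdi_ratio_pct     share granted to this bdi, in percent (1 in the quote)
 *   page_size         bytes per page (assumed 4096 here)
 */
uint64_t strictlimit_thresh_pages(uint64_t ram_bytes,
				  unsigned int dirty_ratio_pct,
				  unsigned int bdi_ratio_pct,
				  unsigned int page_size)
{
	/* Divide before multiplying to keep intermediate values small. */
	uint64_t dirty_bytes = ram_bytes / 100 * dirty_ratio_pct;
	uint64_t wb_bytes = dirty_bytes / 100 * bdi_ratio_pct;

	return wb_bytes / page_size;
}
```

Calling strictlimit_thresh_pages(16ULL << 30, 20, 1, 4096) gives 8388 pages, which matches the "~8K pages" in the quote. That is roughly 32 MiB, so only about 32 folios of 1 MiB each fit under the threshold.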
> conservative limits on how many pages a filesystem can dirty before it
> gets forcibly throttled (the bulk of the logic happens in
> balance_dirty_pages()). This is needed for fuse because fuse servers
> may be unprivileged and malicious or buggy. The feature was introduced
> in commit 5a53748568f7 ("mm/page-writeback.c: add strictlimit
> feature). The reason we now run into this is because with large
> folios, the dirtying happens so much faster now (eg a 1MB folio is
> dirtied and copied at once instead of page by page), and as a result
> the fuse server gets throttled more while doing large writes, which
> ends up making the write overall slower.
<nod> and hence your patchset gives the number of dirty blocks (pages?)
within the large folio to the writeback throttling code so that you
don't get charged for 2M of dirty data if you've really only touched a
single byte of a 2M folio, right?
Will go have a look at that tomorrow.
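To make the difference in that question concrete, here is a minimal userspace sketch comparing what gets charged when a whole folio is counted as dirty with what a write actually touches. It is not code from the RFC patchset; the helper names are hypothetical, and page granularity with 4 KiB pages is an assumption.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size */

/* Pages charged if the entire folio is accounted as dirty. */
size_t pages_charged_whole_folio(size_t folio_bytes)
{
	return folio_bytes / SKETCH_PAGE_SIZE;
}

/*
 * Pages actually touched by a write of len bytes starting at byte
 * offset off within the folio. A zero-length write touches nothing.
 */
size_t pages_touched(size_t off, size_t len)
{
	size_t first, last;

	if (len == 0)
		return 0;

	first = off / SKETCH_PAGE_SIZE;
	last = (off + len - 1) / SKETCH_PAGE_SIZE;
	return last - first + 1;
}
```

For a 2M folio, pages_charged_whole_folio(2UL << 20) is 512, while a single-byte write gives pages_touched(0, 1) == 1, so whole-folio accounting would charge 512 times as much dirty data as was written.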
--D
>
> Thanks,
> Joanne
>
> > --D
> >
> > > Thanks,
> > > Miklos
> > >
>
* Re: [PATCH] fuse: enable large folios (if writeback cache is unused)
2025-08-12 23:02 ` Joanne Koong
2025-08-13 1:20 ` Darrick J. Wong
@ 2025-08-13 8:20 ` Miklos Szeredi
2025-08-13 18:05 ` Joanne Koong
1 sibling, 1 reply; 15+ messages in thread
From: Miklos Szeredi @ 2025-08-13 8:20 UTC (permalink / raw)
To: Joanne Koong
Cc: Darrick J. Wong, linux-fsdevel, jefflexu, bernd.schubert, willy,
kernel-team
On Wed, 13 Aug 2025 at 01:02, Joanne Koong <joannelkoong@gmail.com> wrote:
> My understanding of strictlimit is that it's a way of preventing
> non-trusted filesystems from dirtying too many pages too quickly and
> thus taking up too much bandwidth. It imposes stricter / more
> conservative limits on how many pages a filesystem can dirty before it
> gets forcibly throttled (the bulk of the logic happens in
> balance_dirty_pages()). This is needed for fuse because fuse servers
> may be unprivileged and malicious or buggy. The feature was introduced
> in commit 5a53748568f7 ("mm/page-writeback.c: add strictlimit
Hmm, the commit message says that temp pages were causing the issues
that strictlimit is solving. So maybe now that temp pages are gone,
strictlimit can also be removed?
Thanks,
Miklos
* Re: [PATCH] fuse: enable large folios (if writeback cache is unused)
2025-08-13 1:20 ` Darrick J. Wong
@ 2025-08-13 17:40 ` Joanne Koong
0 siblings, 0 replies; 15+ messages in thread
From: Joanne Koong @ 2025-08-13 17:40 UTC (permalink / raw)
To: Darrick J. Wong
Cc: Miklos Szeredi, linux-fsdevel, jefflexu, bernd.schubert, willy,
kernel-team
On Tue, Aug 12, 2025 at 6:20 PM Darrick J. Wong <djwong@kernel.org> wrote:
>
> On Tue, Aug 12, 2025 at 04:02:12PM -0700, Joanne Koong wrote:
> > On Tue, Aug 12, 2025 at 12:38 PM Darrick J. Wong <djwong@kernel.org> wrote:
> > >
> > My understanding of strictlimit is that it's a way of preventing
> > non-trusted filesystems from dirtying too many pages too quickly and
> > thus taking up too much bandwidth. It imposes stricter / more
>
> Oh, BDI_CAP_STRICTLIMIT.
>
> /me digs
>
> "Then wb_thresh is 1% of 20% of 16GB. This amounts to ~8K pages."
>
> Oh wow.
>
> > conservative limits on how many pages a filesystem can dirty before it
> > gets forcibly throttled (the bulk of the logic happens in
> > balance_dirty_pages()). This is needed for fuse because fuse servers
> > may be unprivileged and malicious or buggy. The feature was introduced
> > in commit 5a53748568f7 ("mm/page-writeback.c: add strictlimit
> > feature). The reason we now run into this is because with large
> > folios, the dirtying happens so much faster now (eg a 1MB folio is
> > dirtied and copied at once instead of page by page), and as a result
> > the fuse server gets throttled more while doing large writes, which
> > ends up making the write overall slower.
>
> <nod> and hence your patchset gives the number of dirty blocks (pages?)
> within the large folio to the writeback throttling code so that you
> don't get charged for 2M of dirty data if you've really only touched a
> single byte of a 2M folio, right?
>
Yeah, exactly!
> Will go have a look at that tomorrow.
Thanks!
>
> --D
>
* Re: [PATCH] fuse: enable large folios (if writeback cache is unused)
2025-08-13 8:20 ` Miklos Szeredi
@ 2025-08-13 18:05 ` Joanne Koong
0 siblings, 0 replies; 15+ messages in thread
From: Joanne Koong @ 2025-08-13 18:05 UTC (permalink / raw)
To: Miklos Szeredi
Cc: Darrick J. Wong, linux-fsdevel, jefflexu, bernd.schubert, willy,
kernel-team
On Wed, Aug 13, 2025 at 1:20 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Wed, 13 Aug 2025 at 01:02, Joanne Koong <joannelkoong@gmail.com> wrote:
>
> > My understanding of strictlimit is that it's a way of preventing
> > non-trusted filesystems from dirtying too many pages too quickly and
> > thus taking up too much bandwidth. It imposes stricter / more
> > conservative limits on how many pages a filesystem can dirty before it
> > gets forcibly throttled (the bulk of the logic happens in
> > balance_dirty_pages()). This is needed for fuse because fuse servers
> > may be unprivileged and malicious or buggy. The feature was introduced
> > in commit 5a53748568f7 ("mm/page-writeback.c: add strictlimit
>
> Hmm, the commit message says that temp pages were causing the issues
> that strictlimit is solving. So maybe now that temp pages are gone,
> strictlimit can also be removed?
That sounds good to me but I think it's a bit unclear / ambiguous what
the limit for unprivileged servers should be (eg whether it should be
more conservative than that of privileged servers).
I think there's an argument to be made that strictlimiting wouldn't
deter a motivated malicious user; they could start up hundreds of
servers and pollute RAM that way.
Maybe one option is to disable strictlimiting by default but provide a
sysctl that admins can set to enforce default strictlimiting on all
unprivileged fuse servers?
Thanks,
Joanne
>
> Thanks,
> Miklos
* Re: [PATCH] fuse: enable large folios (if writeback cache is unused)
2025-08-11 20:40 [PATCH] fuse: enable large folios (if writeback cache is unused) Joanne Koong
2025-08-11 21:13 ` Joanne Koong
@ 2025-08-15 11:01 ` Jingbo Xu
2025-08-15 16:16 ` Joanne Koong
1 sibling, 1 reply; 15+ messages in thread
From: Jingbo Xu @ 2025-08-15 11:01 UTC (permalink / raw)
To: Joanne Koong, miklos; +Cc: linux-fsdevel, bernd.schubert, willy, kernel-team
On 8/12/25 4:40 AM, Joanne Koong wrote:
> Large folios are only enabled if the writeback cache isn't on.
> (Strictlimiting needs to be turned off if the writeback cache is used in
> conjunction with large folios, else this tanks performance.)
>
> Benchmarks showed noticeable improvements for writes (both sequential
> and random). There were no performance differences seen for random reads
> or direct IO. For sequential reads, there was no performance difference
> seen for the first read (which populates the page cache) but subsequent
> sequential reads showed a huge speedup.
>
> Benchmarks were run using fio on the passthrough_hp fuse server:
> ~/libfuse/build/example/passthrough_hp ~/libfuse ~/fuse_mnt --nopassthrough --nocache
>
> run fio in ~/fuse_mnt:
> fio --name=test --ioengine=sync --rw=write --bs=1M --size=5G --numjobs=2 --ramp_time=30 --group_reporting=1
>
> Results (tested on bs=256K, 1M, 5M) showed roughly a 15-20% increase in
> write throughput and for sequential reads after the page cache has
> already been populated, there was a ~800% speedup seen.
>
> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> ---
> fs/fuse/file.c | 18 ++++++++++++++++--
> 1 file changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> index adc4aa6810f5..2e7aae294c9e 100644
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -1167,9 +1167,10 @@ static ssize_t fuse_fill_write_pages(struct fuse_io_args *ia,
> pgoff_t index = pos >> PAGE_SHIFT;
> unsigned int bytes;
> unsigned int folio_offset;
> + fgf_t fgp = FGP_WRITEBEGIN | fgf_set_order(num);
>
> again:
> - folio = __filemap_get_folio(mapping, index, FGP_WRITEBEGIN,
> + folio = __filemap_get_folio(mapping, index, fgp,
> mapping_gfp_mask(mapping));
> if (IS_ERR(folio)) {
> err = PTR_ERR(folio);
> @@ -3155,11 +3156,24 @@ void fuse_init_file_inode(struct inode *inode, unsigned int flags)
> {
> struct fuse_inode *fi = get_fuse_inode(inode);
> struct fuse_conn *fc = get_fuse_conn(inode);
> + unsigned int max_pages, max_order;
>
> inode->i_fop = &fuse_file_operations;
> inode->i_data.a_ops = &fuse_file_aops;
> - if (fc->writeback_cache)
> + if (fc->writeback_cache) {
> mapping_set_writeback_may_deadlock_on_reclaim(&inode->i_data);
> + } else {
> + /*
> + * Large folios are only enabled if the writeback cache isn't on.
> + * If the writeback cache is on, large folios should only be
> + * enabled in conjunction with strictlimiting turned off, else
> + * performance tanks.
> + */
> + max_pages = min(min(fc->max_write, fc->max_read) >> PAGE_SHIFT,
> + fc->max_pages);
> + max_order = ilog2(max_pages);
> + mapping_set_folio_order_range(inode->i_mapping, 0, max_order);
> + }
JFYI fc->max_read shall also be honored when calculating max_order,
otherwise the following warning in fuse_readahead() may be triggered.
/*
* Large folios belonging to fuse will never
* have more pages than max_pages.
*/
WARN_ON(!pages);
--
Thanks,
Jingbo
* Re: [PATCH] fuse: enable large folios (if writeback cache is unused)
2025-08-15 11:01 ` Jingbo Xu
@ 2025-08-15 16:16 ` Joanne Koong
2025-08-16 1:45 ` Jingbo Xu
0 siblings, 1 reply; 15+ messages in thread
From: Joanne Koong @ 2025-08-15 16:16 UTC (permalink / raw)
To: Jingbo Xu; +Cc: miklos, linux-fsdevel, bernd.schubert, willy, kernel-team
On Fri, Aug 15, 2025 at 4:01 AM Jingbo Xu <jefflexu@linux.alibaba.com> wrote:
>
>
>
> On 8/12/25 4:40 AM, Joanne Koong wrote:
> > Large folios are only enabled if the writeback cache isn't on.
> > (Strictlimiting needs to be turned off if the writeback cache is used in
> > conjunction with large folios, else this tanks performance.)
> >
> > Benchmarks showed noticeable improvements for writes (both sequential
> > and random). There were no performance differences seen for random reads
> > or direct IO. For sequential reads, there was no performance difference
> > seen for the first read (which populates the page cache) but subsequent
> > sequential reads showed a huge speedup.
> >
> > Benchmarks were run using fio on the passthrough_hp fuse server:
> > ~/libfuse/build/example/passthrough_hp ~/libfuse ~/fuse_mnt --nopassthrough --nocache
> >
> > run fio in ~/fuse_mnt:
> > fio --name=test --ioengine=sync --rw=write --bs=1M --size=5G --numjobs=2 --ramp_time=30 --group_reporting=1
> >
> > Results (tested on bs=256K, 1M, 5M) showed roughly a 15-20% increase in
> > write throughput and for sequential reads after the page cache has
> > already been populated, there was a ~800% speedup seen.
> >
> > Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> > ---
> > fs/fuse/file.c | 18 ++++++++++++++++--
> > 1 file changed, 16 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> > index adc4aa6810f5..2e7aae294c9e 100644
> > --- a/fs/fuse/file.c
> > +++ b/fs/fuse/file.c
> > @@ -1167,9 +1167,10 @@ static ssize_t fuse_fill_write_pages(struct fuse_io_args *ia,
> > pgoff_t index = pos >> PAGE_SHIFT;
> > unsigned int bytes;
> > unsigned int folio_offset;
> > + fgf_t fgp = FGP_WRITEBEGIN | fgf_set_order(num);
> >
> > again:
> > - folio = __filemap_get_folio(mapping, index, FGP_WRITEBEGIN,
> > + folio = __filemap_get_folio(mapping, index, fgp,
> > mapping_gfp_mask(mapping));
> > if (IS_ERR(folio)) {
> > err = PTR_ERR(folio);
> > @@ -3155,11 +3156,24 @@ void fuse_init_file_inode(struct inode *inode, unsigned int flags)
> > {
> > struct fuse_inode *fi = get_fuse_inode(inode);
> > struct fuse_conn *fc = get_fuse_conn(inode);
> > + unsigned int max_pages, max_order;
> >
> > inode->i_fop = &fuse_file_operations;
> > inode->i_data.a_ops = &fuse_file_aops;
> > - if (fc->writeback_cache)
> > + if (fc->writeback_cache) {
> > mapping_set_writeback_may_deadlock_on_reclaim(&inode->i_data);
> > + } else {
> > + /*
> > + * Large folios are only enabled if the writeback cache isn't on.
> > + * If the writeback cache is on, large folios should only be
> > + * enabled in conjunction with strictlimiting turned off, else
> > + * performance tanks.
> > + */
> > + max_pages = min(min(fc->max_write, fc->max_read) >> PAGE_SHIFT,
> > + fc->max_pages);
> > + max_order = ilog2(max_pages);
> > + mapping_set_folio_order_range(inode->i_mapping, 0, max_order);
> > + }
>
> JFYI fc->max_read shall also be honored when calculating max_order,
> otherwise the following warning in fuse_readahead() may be triggered.
>
Hi Jingbo,
I think fc->max_read gets honored in the "min(fc->max_write,
fc->max_read)" part of the max_pages calculation above.
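A worked userspace sketch of that calculation is below. It only mirrors the arithmetic in the hunk; the sketch_* names are hypothetical stand-ins for the kernel's min() and ilog2(), a 4 KiB page size is assumed, and the input values are illustrative rather than taken from any particular mount.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumed 4 KiB pages */

/* floor(log2(v)) for v > 0; returns 0 for v == 0 in this sketch. */
unsigned int sketch_ilog2(uint32_t v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

uint32_t sketch_min(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/*
 * Same shape as the hunk: the smaller of max_write and max_read is
 * converted to pages, clamped by max_pages, then rounded down to a
 * power-of-two order.
 */
unsigned int sketch_max_folio_order(uint32_t max_write, uint32_t max_read,
				    uint32_t max_pages)
{
	uint32_t pages = sketch_min(sketch_min(max_write, max_read)
				    >> SKETCH_PAGE_SHIFT, max_pages);

	return sketch_ilog2(pages);
}
```

With max_write of 1 MiB, max_read of 128 KiB and max_pages of 256, the result is order 5 (32 pages), so the smaller max_read bounds the folio size. Because ilog2 rounds down, a limit of 100 pages yields order 6 (64 pages), which stays within the limit.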
Thanks,
Joanne
> /*
> * Large folios belonging to fuse will never
> * have more pages than max_pages.
> */
> WARN_ON(!pages);
>
>
> --
> Thanks,
> Jingbo
>
* Re: [PATCH] fuse: enable large folios (if writeback cache is unused)
2025-08-15 16:16 ` Joanne Koong
@ 2025-08-16 1:45 ` Jingbo Xu
0 siblings, 0 replies; 15+ messages in thread
From: Jingbo Xu @ 2025-08-16 1:45 UTC (permalink / raw)
To: Joanne Koong; +Cc: miklos, linux-fsdevel, bernd.schubert, willy, kernel-team
On 8/16/25 12:16 AM, Joanne Koong wrote:
> On Fri, Aug 15, 2025 at 4:01 AM Jingbo Xu <jefflexu@linux.alibaba.com> wrote:
>>
>>
>>
>> On 8/12/25 4:40 AM, Joanne Koong wrote:
>>> Large folios are only enabled if the writeback cache isn't on.
>>> (Strictlimiting needs to be turned off if the writeback cache is used in
>>> conjunction with large folios, else this tanks performance.)
>>>
>>> Benchmarks showed noticeable improvements for writes (both sequential
>>> and random). There were no performance differences seen for random reads
>>> or direct IO. For sequential reads, there was no performance difference
>>> seen for the first read (which populates the page cache) but subsequent
>>> sequential reads showed a huge speedup.
>>>
>>> Benchmarks were run using fio on the passthrough_hp fuse server:
>>> ~/libfuse/build/example/passthrough_hp ~/libfuse ~/fuse_mnt --nopassthrough --nocache
>>>
>>> run fio in ~/fuse_mnt:
>>> fio --name=test --ioengine=sync --rw=write --bs=1M --size=5G --numjobs=2 --ramp_time=30 --group_reporting=1
>>>
>>> Results (tested on bs=256K, 1M, 5M) showed roughly a 15-20% increase in
>>> write throughput and for sequential reads after the page cache has
>>> already been populated, there was a ~800% speedup seen.
>>>
>>> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
>>> ---
>>> fs/fuse/file.c | 18 ++++++++++++++++--
>>> 1 file changed, 16 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
>>> index adc4aa6810f5..2e7aae294c9e 100644
>>> --- a/fs/fuse/file.c
>>> +++ b/fs/fuse/file.c
>>> @@ -1167,9 +1167,10 @@ static ssize_t fuse_fill_write_pages(struct fuse_io_args *ia,
>>> pgoff_t index = pos >> PAGE_SHIFT;
>>> unsigned int bytes;
>>> unsigned int folio_offset;
>>> + fgf_t fgp = FGP_WRITEBEGIN | fgf_set_order(num);
>>>
>>> again:
>>> - folio = __filemap_get_folio(mapping, index, FGP_WRITEBEGIN,
>>> + folio = __filemap_get_folio(mapping, index, fgp,
>>> mapping_gfp_mask(mapping));
>>> if (IS_ERR(folio)) {
>>> err = PTR_ERR(folio);
>>> @@ -3155,11 +3156,24 @@ void fuse_init_file_inode(struct inode *inode, unsigned int flags)
>>> {
>>> struct fuse_inode *fi = get_fuse_inode(inode);
>>> struct fuse_conn *fc = get_fuse_conn(inode);
>>> + unsigned int max_pages, max_order;
>>>
>>> inode->i_fop = &fuse_file_operations;
>>> inode->i_data.a_ops = &fuse_file_aops;
>>> - if (fc->writeback_cache)
>>> + if (fc->writeback_cache) {
>>> mapping_set_writeback_may_deadlock_on_reclaim(&inode->i_data);
>>> + } else {
>>> + /*
>>> + * Large folios are only enabled if the writeback cache isn't on.
>>> + * If the writeback cache is on, large folios should only be
>>> + * enabled in conjunction with strictlimiting turned off, else
>>> + * performance tanks.
>>> + */
>>> + max_pages = min(min(fc->max_write, fc->max_read) >> PAGE_SHIFT,
>>> + fc->max_pages);
>>> + max_order = ilog2(max_pages);
>>> + mapping_set_folio_order_range(inode->i_mapping, 0, max_order);
>>> + }
>>
>> JFYI fc->max_read shall also be honored when calculating max_order,
>> otherwise the following warning in fuse_readahead() may be triggered.
>>
> Hi Jingbo,
>
> I think fc->max_read gets honored in the "min(fc->max_write,
> fc->max_read)" part of the max_pages calculation above.
>
Oops, yeah. My fault, sorry for the noise.
--
Thanks,
Jingbo