linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [HELP] FUSE writeback performance bottleneck
@ 2024-06-03  6:17 Jingbo Xu
  2024-06-03 14:43 ` Bernd Schubert
  0 siblings, 1 reply; 29+ messages in thread
From: Jingbo Xu @ 2024-06-03  6:17 UTC (permalink / raw)
  To: Miklos Szeredi, linux-fsdevel@vger.kernel.org
  Cc: linux-kernel@vger.kernel.org, lege.wang

Hi, Miklos,

We spotted a performance bottleneck for FUSE writeback in which the
writeback kworker has consumed nearly 100% CPU, among which 40% CPU is
used for copy_page().

fuse_writepages_fill
  alloc tmp_page
  copy_highpage

This is because of FUSE writeback design (see commit 3be5a52b30aa
("fuse: support writable mmap")), which newly allocates a temp page for
each dirty page to be written back, copy content of dirty page to temp
page, and then write back the temp page instead.  This special design is
intentional to avoid potential deadlocked due to buggy or even malicious
fuse user daemon.

There was a proposal of removing this constraint for virtiofs [1], which
is reasonable as users of virtiofs and virtiofs daemon don't run on the
same OS, and virtiofs daemon is usually offered by cloud vendors that
shall not be malicious.  While for the normal /dev/fuse interface, I
don't think removing the constraint is acceptable.


Come back to the writeback performance bottleneck.  Another important
factor is that, (IIUC) only one kworker at the same time is allowed for
writeback for each filesystem instance (if cgroup writeback is not
enabled).  The kworker is scheduled upon sb->s_bdi->wb.dwork, and the
workqueue infrastructure guarantees that at most one *running* worker is
allowed for one specific work (sb->s_bdi->wb.dwork) at any time.  Thus
the writeback is constraint to one CPU for each filesystem instance.

I'm not sure if offloading the page copying and then FUSE requests
sending to another worker (if a bunch of dirty pages have been
collected) is a good idea or not, e.g.

```
fuse_writepages_fill
  if fuse_writepage_need_send:
    # schedule a work

# the worker
for each dirty page in ap->pages[]:
    copy_page
fuse_writepages_send
```

Any suggestion?



This issue can be reproduced by:

1 ./libfuse/build/example/passthrough_ll -o cache=always -o writeback -o
source=/run/ /mnt
("/run/" is a tmpfs mount)

2 fio --name=write_test --ioengine=psync --iodepth=1 --rw=write --bs=1M
--direct=0 --size=1G --numjobs=2 --group_reporting --directory=/mnt
(at least two threads are needed; fio shows ~1800MiB/s buffer write
bandwidth)


[1]
https://lore.kernel.org/all/20231228123528.705-1-lege.wang@jaguarmicro.com/


-- 
Thanks,
Jingbo

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2024-10-14  1:57 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-03  6:17 [HELP] FUSE writeback performance bottleneck Jingbo Xu
2024-06-03 14:43 ` Bernd Schubert
2024-06-03 15:19   ` Miklos Szeredi
2024-06-03 15:32     ` Bernd Schubert
2024-06-03 22:10     ` Dave Chinner
2024-06-04  7:20       ` Miklos Szeredi
2024-06-04  1:57     ` Jingbo Xu
2024-06-04  7:27       ` Miklos Szeredi
2024-06-04  7:36         ` Jingbo Xu
2024-06-04  9:32           ` Bernd Schubert
2024-06-04 10:02             ` Miklos Szeredi
2024-06-04 14:13               ` Bernd Schubert
2024-06-04 16:53                 ` Josef Bacik
2024-06-04 21:39                   ` Bernd Schubert
2024-06-04 22:16                     ` Josef Bacik
2024-06-05  5:49                       ` Amir Goldstein
2024-06-05 15:35                         ` Josef Bacik
2024-08-22 17:00               ` Joanne Koong
2024-08-22 21:01                 ` Joanne Koong
2024-08-23  3:34               ` Jingbo Xu
2024-09-13  0:00                 ` Joanne Koong
2024-09-13  1:25                   ` Jingbo Xu
2024-06-04 12:24             ` Jingbo Xu
2024-09-11  9:32         ` Jingbo Xu
2024-09-12 23:18           ` Joanne Koong
2024-09-13  3:35             ` Jingbo Xu
2024-09-13 20:55               ` Joanne Koong
2024-10-11 23:08                 ` Joanne Koong
2024-10-14  1:57                   ` Jingbo Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).