From: Abhishek Gupta <abhishekmgupta@google.com>
To: Bernd Schubert <bschubert@ddn.com>
Cc: "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
"miklos@szeredi.hu" <miklos@szeredi.hu>,
Swetha Vadlakonda <swethv@google.com>
Subject: Re: FUSE: [Regression] Fuse legacy path performance scaling lost in v6.14 vs v6.8/6.11 (iodepth scaling with io_uring)
Date: Thu, 27 Nov 2025 19:07:57 +0530 [thread overview]
Message-ID: <CAPr64AJXg9nr_xG_wpy3sDtWmy2cR+HhqphCGgWSoYs2+OjQUQ@mail.gmail.com> (raw)
In-Reply-To: <e6a41630-c2e6-4bd9-aea9-df38238f6359@ddn.com>
Hi Bernd,
Thanks for looking into this.
Please find below the fio output for kernel versions 6.11 and 6.14.
On kernel 6.11
~/gcsfuse$ uname -a
Linux abhishek-c4-192-west4a 6.11.0-1016-gcp #16~24.04.1-Ubuntu SMP
Wed May 28 02:40:52 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
iodepth=1
:~/fio-fio-3.38$ ./fio --name=randread --rw=randread
--ioengine=io_uring --thread
--filename_format='/home/abhishekmgupta_google_com/bucket/$jobnum'
--filesize=1G --time_based=1 --runtime=15s --bs=4K --numjobs=1
--iodepth=1 --group_reporting=1 --direct=1
randread: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
4096B-4096B, ioengine=io_uring, iodepth=1
fio-3.38
Starting 1 thread
...
Run status group 0 (all jobs):
READ: bw=3311KiB/s (3391kB/s), 3311KiB/s-3311KiB/s
(3391kB/s-3391kB/s), io=48.5MiB (50.9MB), run=15001-15001msec
iodepth=4
:~/fio-fio-3.38$ ./fio --name=randread --rw=randread
--ioengine=io_uring --thread
--filename_format='/home/abhishekmgupta_google_com/bucket/$jobnum'
--filesize=1G --time_based=1 --runtime=15s --bs=4K --numjobs=1
--iodepth=4 --group_reporting=1 --direct=1
randread: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
4096B-4096B, ioengine=io_uring, iodepth=4
fio-3.38
Starting 1 thread
...
Run status group 0 (all jobs):
READ: bw=11.0MiB/s (11.6MB/s), 11.0MiB/s-11.0MiB/s
(11.6MB/s-11.6MB/s), io=166MiB (174MB), run=15002-15002msec
On kernel 6.14
:~$ uname -a
Linux abhishek-west4a-2504 6.14.0-1019-gcp #20-Ubuntu SMP Wed Oct 15
00:41:12 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
iodepth=1
:~$ fio --name=randread --rw=randread --ioengine=io_uring --thread
--filename_format='/home/abhishekmgupta_google_com/bucket/$jobnum'
--filesize=1G --time_based=1 --runtime=15s --bs=4K --numjobs=1
--iodepth=1 --group_reporting=1 --direct=1
randread: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
4096B-4096B, ioengine=io_uring, iodepth=1
fio-3.38
Starting 1 thread
...
Run status group 0 (all jobs):
READ: bw=3576KiB/s (3662kB/s), 3576KiB/s-3576KiB/s
(3662kB/s-3662kB/s), io=52.4MiB (54.9MB), run=15001-15001msec
iodepth=4
:~$ fio --name=randread --rw=randread --ioengine=io_uring --thread
--filename_format='/home/abhishekmgupta_google_com/bucket/$jobnum'
--filesize=1G --time_based=1 --runtime=15s --bs=4K --numjobs=1
--iodepth=4 --group_reporting=1 --direct=1
randread: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
4096B-4096B, ioengine=io_uring, iodepth=4
fio-3.38
...
Run status group 0 (all jobs):
READ: bw=3863KiB/s (3956kB/s), 3863KiB/s-3863KiB/s
(3956kB/s-3956kB/s), io=56.6MiB (59.3MB), run=15001-15001msec
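For additional context on the daemon side, its read loop follows roughly
the pattern sketched below. This is only a simplified illustration of the
model described in my first mail (a single serialized reader on /dev/fuse
that spawns one goroutine per request), not the actual GCSFuse source; the
request type and the handle function are placeholders.

// Sketch of the daemon's concurrency model: one serialized reader loop
// on /dev/fuse, one goroutine per request.
package fuseloop

import (
        "io"
        "log"
)

// request carries the raw bytes of one FUSE request (header + body).
type request struct {
        buf []byte
}

// Serve reads requests one at a time and hands each to its own goroutine,
// so requests are served in parallel even though device reads are serial.
func Serve(dev io.ReadWriter) {
        for {
                buf := make([]byte, 1<<17) // large enough for one request in this sketch
                n, err := dev.Read(buf)    // only one read on /dev/fuse in flight at a time
                if err != nil {
                        log.Printf("read /dev/fuse: %v", err)
                        return
                }
                go handle(dev, request{buf: buf[:n]}) // per-request goroutine
        }
}

// handle would decode the request, perform the backend read (e.g. from
// GCS) and write the reply back to /dev/fuse; stubbed out here.
func handle(dev io.ReadWriter, req request) {
        _ = dev
        _ = req
}

So once a request has been read from the device, it is served concurrently
with the others; userspace should not be the point of serialization.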
Thanks,
Abhishek
On Thu, Nov 27, 2025 at 12:41 AM Bernd Schubert <bschubert@ddn.com> wrote:
>
> Hi Abhishek,
>
> On 11/26/25 16:07, Abhishek Gupta wrote:
> >
> > Hello Team,
> >
> > I am observing a performance regression in the FUSE subsystem on
> > Kernel 6.14 compared to 6.8/6.11 when using the legacy/standard FUSE
> > interface (userspace daemon using standard read on /dev/fuse).
> >
> > Summary of Issue: On Kernel 6.8 & 6.11, increasing iodepth in fio
> > (using ioengine=io_uring) results in near-linear performance scaling.
> > On Kernel 6.14, using the exact same userspace binary, increasing
> > iodepth yields no performance improvement (behavior resembles
> > iodepth=1).
> >
> > Environment:
> > - Workload: GCSFuse (userspace daemon) + Fio
> > - Fio Config: Random Read, ioengine=io_uring, direct=1, iodepth=4.
> > - CPU: Intel.
> > - Daemon: Go-based. It uses a serialized reader loop on /dev/fuse that
> > immediately spawns a goroutine per request, so it can serve requests
> > in parallel.
> > - Kernel Config: CONFIG_FUSE_IO_URING=y is enabled, but the daemon is
> > not registering for the ring (legacy mode).
> >
> > Benchmark Observations:
> > - Kernel 6.8/6.11: With iodepth=4, we observe ~3.5-4x throughput
> > compared to iodepth=1.
> > - Kernel 6.14: With iodepth=4, throughput is identical to iodepth=1.
> > Parallelism is effectively lost.
> >
> > Is this a known issue? I would appreciate any insights or pointers.
>
> Could you give your exact fio line? I'm not aware of such a regression.
>
> bschubert2@imesrv3 ~>fio --directory=/tmp/dest --name=iops.\$jobnum --rw=randread --bs=4k --size=1G --numjobs=1 --iodepth=1 --time_based --runtime=30s --group_reporting --ioengine=io_uring --direct=1
> iops.$jobnum: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=1
> fio-3.36
> Starting 1 process
> iops.$jobnum: Laying out IO file (1 file / 1024MiB)
> ...
> Run status group 0 (all jobs):
> READ: bw=178MiB/s (186MB/s), 178MiB/s-178MiB/s (186MB/s-186MB/s), io=5331MiB (5590MB), run=30001-30001msec
>
> bschubert2@imesrv3 ~>fio --directory=/tmp/dest --name=iops.\$jobnum --rw=randread --bs=4k --size=1G --numjobs=1 --iodepth=4 --time_based --runtime=30s --group_reporting --ioengine=io_uring --direct=1
> iops.$jobnum: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=4
> fio-3.36
> Starting 1 process
> Jobs: 1 (f=1): [r(1)][100.0%][r=673MiB/s][r=172k IOPS][eta 00m:00s]
> iops.$jobnum: (groupid=0, jobs=1): err= 0: pid=52012: Wed Nov 26 20:08:17 2025
> ...
> Run status group 0 (all jobs):
> READ: bw=673MiB/s (706MB/s), 673MiB/s-673MiB/s (706MB/s-706MB/s), io=19.7GiB (21.2GB), run=30001-30001msec
>
>
> This is with libfuse `example/passthrough_hp -o allow_other --nopassthrough --foreground /tmp/source /tmp/dest`
>
>
> Thanks,
> Bernd