From: Joanne Koong <joannelkoong@gmail.com>
To: Horst Birthelmer <horst@birthelmer.de>
Cc: Bernd Schubert <bernd@bsbernd.com>,
Horst Birthelmer <horst@birthelmer.com>,
Miklos Szeredi <miklos@szeredi.hu>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
Horst Birthelmer <hbirthelmer@ddn.com>
Subject: Re: Re: [PATCH] fuse: when copying a folio delay the mark dirty until the end
Date: Fri, 20 Mar 2026 10:18:35 -0700
Message-ID: <CAJnrk1YJZnk4MOL+EgzcyAYhY_++OCs94USppA4iHDOAw0TAzg@mail.gmail.com>
In-Reply-To: <abuw1Z40Dg9Rp4Vj@fedora.fritz.box>

On Thu, Mar 19, 2026 at 1:32 AM Horst Birthelmer <horst@birthelmer.de> wrote:
>
> On Wed, Mar 18, 2026 at 06:32:25PM -0700, Joanne Koong wrote:
> > On Wed, Mar 18, 2026 at 2:52 PM Bernd Schubert <bernd@bsbernd.com> wrote:
> > >
> > > Hi Joanne,
> > >
> > > On 3/18/26 22:19, Joanne Koong wrote:
> > > > On Wed, Mar 18, 2026 at 7:03 AM Horst Birthelmer <horst@birthelmer.de> wrote:
> > > >>
> > > >> Hi Joanne,
> > > >>
> > > >> I wonder, would something like this help for large folios?
> > > >
> > > > Hi Horst,
> > > >
> > > > I don't think it's likely that the pages backing the userspace buffer
> > > > are large folios, so I think this may actually add extra overhead with
> > > > the extra folio_test_dirty() check.
> > > >
> > > > From what I've seen, the main cost that dwarfs everything else for
> > > > writes/reads is the actual IO, the context switches, and the memcpys.
> > > > I think compared to these things, the set_page_dirty_lock() cost is
> > > > negligible and pretty much undetectable.
> > >
> > >
> > > A little bit of background here. We see in CPU flame graphs that the
> > > spin lock taken in lock_request() and unlock_request() takes about the
> > > same amount of CPU time as the memcpy. Interestingly, only on Intel, but
> > > not AMD CPUs. Note that we are running with our custom page pinning,
> > > which just takes the pages from an array, so iov_iter_get_pages2() is
> > > not used.
> > >
> > > The reason for that unlock/lock is documented at the end of
> > > Documentation/filesystems/fuse/fuse.rst as Kamikaze file system. Well, we
> > > don't have that, so for now these checks are modified in our branches to
> > > avoid the lock, although that is not upstreamable. The right solution
> > > here is to extract an array of pages and do that unlock/lock per pagevec.
> > >
> > > Next in the flame graph is set_page_dirty_lock(), which also takes as
> > > much CPU time as the memcpy. Again, Intel CPUs only.
> > > In combination with the above pagevec method, I think the right solution
> > > is to iterate over the pages, store the last folio, and then set it
> > > dirty once per folio.
> >
> > Thanks for the background context. The Intel vs AMD difference is
> > interesting. The approaches you mention sound reasonable. Are you able
> > to share the flame graph, or is this easily reproducible using fio on the
> > passthrough_hp server?
> >
> >
> Hi Joanne,
>
> I have tried to reproduce this with passthrough_hp and I never saw it.
> So my answer would be something like: I don't think so.
>
> This happens even with large folios disabled. I was just trying to
> solve it, since I figured it will be worse with large folios.

Thanks for the context. I haven't encountered this bottleneck myself
(yet) but if you are encountering it pretty regularly, I agree with
you that it definitely seems worth addressing.

Thanks,
Joanne

>
> Thanks,
> Horst
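
For readers following the thread, the "set dirty once per folio" idea quoted
above (iterate over the pages, remember the last folio, dirty each folio once)
can be modelled in userspace roughly as below. This is only a sketch under
stated assumptions, not the patch under discussion and not kernel code: the
struct folio and struct page here are hypothetical stand-ins, and
mark_folio_dirty() and dirty_pages_batched() are made-up names that merely
model the cost of a set_page_dirty_lock()-style call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins, NOT the kernel's struct folio/page. */
struct folio {
	int dirty;                /* models the folio dirty flag */
	unsigned int dirty_calls; /* counts how often we marked it dirty */
};

struct page {
	struct folio *folio;      /* folio this page belongs to */
};

/* Models one (comparatively expensive) mark-dirty call. */
static void mark_folio_dirty(struct folio *folio)
{
	folio->dirty = 1;
	folio->dirty_calls++;
}

/*
 * Walk an array of pages that were written to and mark each folio dirty
 * once per run of consecutive pages sharing that folio, instead of once
 * per page. Returns the number of mark-dirty calls that were issued.
 */
size_t dirty_pages_batched(struct page *pages, size_t nr)
{
	struct folio *last = NULL;
	size_t calls = 0;

	for (size_t i = 0; i < nr; i++) {
		struct folio *folio = pages[i].folio;

		assert(folio != NULL);
		if (folio == last)
			continue; /* same folio as the previous page, skip */

		mark_folio_dirty(folio);
		last = folio;
		calls++;
	}
	return calls;
}
```

With five pages where the first three share one folio and the last two share
another, this model issues two mark-dirty calls rather than five. It only
coalesces consecutive pages of the same folio; a folio that reappears later in
the array would be marked again, which is harmless in this model.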

Thread overview (14+ messages):
2026-03-16 15:16 [PATCH] fuse: when copying a folio delay the mark dirty until the end Horst Birthelmer
2026-03-16 17:29 ` Joanne Koong
2026-03-16 20:02 ` Horst Birthelmer
2026-03-16 22:06 ` Joanne Koong
2026-03-18 14:03 ` Horst Birthelmer
2026-03-18 21:19 ` Joanne Koong
2026-03-18 21:52 ` Bernd Schubert
2026-03-19 1:32 ` Joanne Koong
2026-03-19 4:27 ` Darrick J. Wong
2026-03-20 17:24 ` Joanne Koong
2026-03-19 8:32 ` Horst Birthelmer
2026-03-20 17:18 ` Joanne Koong [this message]
2026-03-26 6:35 ` kernel test robot
2026-03-26 15:05 ` [LTP] " Cyril Hrubis