public inbox for linux-kernel@vger.kernel.org
From: Bernd Schubert <bernd@bsbernd.com>
To: Joanne Koong <joannelkoong@gmail.com>
Cc: Horst Birthelmer <horst@birthelmer.com>,
	Miklos Szeredi <miklos@szeredi.hu>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Horst Birthelmer <hbirthelmer@ddn.com>
Subject: Re: [PATCH] fuse: when copying a folio delay the mark dirty until the end
Date: Wed, 18 Mar 2026 22:52:16 +0100	[thread overview]
Message-ID: <60103445-0d45-427c-aa00-2fa79207b129@bsbernd.com> (raw)
In-Reply-To: <CAJnrk1bF0JFAWOF=4hjMhiciSLrvob268fcQgv3P28Po8J=qwQ@mail.gmail.com>

Hi Joanne,

On 3/18/26 22:19, Joanne Koong wrote:
> On Wed, Mar 18, 2026 at 7:03 AM Horst Birthelmer <horst@birthelmer.de> wrote:
>>
>> Hi Joanne,
>>
>> I wonder, would something like this help for large folios?
> 
> Hi Horst,
> 
> I don't think it's likely that the pages backing the userspace buffer
> are large folios, so I think this may actually add extra overhead with
> the extra folio_test_dirty() check.
> 
> From what I've seen, the main cost that dwarfs everything else for
> writes/reads is the actual IO, the context switches, and the memcpys.
> I think compared to these things, the set_page_dirty_lock() cost is
> negligible and pretty much undetectable.


A little bit of background here. We see in CPU flame graphs that the spin
lock taken in lock_request() and unlock_request() consumes about the same
amount of CPU time as the memcpy. Interestingly, only on Intel CPUs, not
on AMD. Note that we are running with our custom page pinning, which
just takes the pages from an array, so iov_iter_get_pages2() is not used.

The reason for that unlock/lock is documented at the end of
Documentation/filesystems/fuse/fuse.rst as the "Kamikaze file system"
scenario. Well, we don't have that, so for now these checks are modified
in our branches to avoid the lock, although that is not upstreamable. The
right solution here is to extract an array of pages and do that
unlock/lock once per pagevec.

Next in the flame graph is set_page_dirty_lock(), which also takes as
much CPU time as the memcpy. Again, on Intel CPUs only.
In combination with the above pagevec method, I think the right solution
is to iterate over the pages, remember the last folio, and set it dirty
only once per folio.
Also, I disagree that the userspace buffers are not likely to be large
folios, see commit 59ba47b6be9cd0146ef9a55c6e32e337e11e7625 ("fuse:
Check for large folio with SPLICE_F_MOVE"). Especially Horst
persistently runs into it when doing xfstests with recent kernels. I
think the issue came up for the first time with 3.18ish.

One can further enforce that by setting
/sys/kernel/mm/transparent_hugepage/enabled to 'always', which is what I
did when I tested the above commit. And actually that points out that
libfuse allocations should do the madvise. I'm going to do that during
the next days, maybe tomorrow.


Thanks,
Bernd

Thread overview: 15+ messages
2026-03-16 15:16 [PATCH] fuse: when copying a folio delay the mark dirty until the end Horst Birthelmer
2026-03-16 17:29 ` Joanne Koong
2026-03-16 20:02   ` Horst Birthelmer
2026-03-16 22:06     ` Joanne Koong
2026-03-18 14:03       ` Horst Birthelmer
2026-03-18 21:19         ` Joanne Koong
2026-03-18 21:52           ` Bernd Schubert [this message]
2026-03-19  1:32             ` Joanne Koong
2026-03-19  4:27               ` Darrick J. Wong
2026-03-20 17:24                 ` Joanne Koong
2026-03-19  8:32               ` Horst Birthelmer
2026-03-20 17:18                 ` Joanne Koong
2026-03-26  6:35 ` kernel test robot
2026-03-26 15:05   ` [LTP] " Cyril Hrubis
2026-03-26 15:44     ` Horst Birthelmer
