linux-mm.kvack.org archive mirror
* Re: [PATCH v1 5/8] iomap: add iomap_writeback_dirty_folio()
       [not found]     ` <20250609171444.GL6156@frogsfrogsfrogs>
@ 2025-06-10  3:59       ` Christoph Hellwig
  2025-06-11  4:34         ` Matthew Wilcox
  0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2025-06-10  3:59 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Christoph Hellwig, Joanne Koong, miklos, brauner, linux-fsdevel,
	linux-xfs, bernd.schubert, kernel-team, willy, linux-mm,
	linux-nfs

On Mon, Jun 09, 2025 at 10:14:44AM -0700, Darrick J. Wong wrote:
> > Where "folio laundering" means calling ->launder_folio, right?
> 
> What does fuse use folio laundering for, anyway?  It looks to me like
> the primary users are invalidate_inode_pages*.  Either the caller cares
> about flushing dirty data and has called filemap_write_and_wait_range;
> or it doesn't and wants to tear down the pagecache ahead of some other
> operation that's going to change the file contents and doesn't care.
> 
> I suppose it could be useful as a last-chance operation on a dirty folio
> that was dirtied after a filemap_write_and_wait_range but before
> invalidate_inode_pages*?  Though for xfs we just return EBUSY and let
> the caller try again (or not).  Is there a subtlety to fuse here that I
> don't know about?

My memory might be betraying me, but I think willy once launched an
attempt to see if we can kill launder_folio.  Adding him, and the
mm and nfs lists to check if I have a point :)



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v1 5/8] iomap: add iomap_writeback_dirty_folio()
  2025-06-10  3:59       ` [PATCH v1 5/8] iomap: add iomap_writeback_dirty_folio() Christoph Hellwig
@ 2025-06-11  4:34         ` Matthew Wilcox
  2025-06-18  4:47           ` does fuse need ->launder_folios, was: " Christoph Hellwig
  2025-06-18 12:17           ` Jeff Layton
  0 siblings, 2 replies; 12+ messages in thread
From: Matthew Wilcox @ 2025-06-11  4:34 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Darrick J. Wong, Joanne Koong, miklos, brauner, linux-fsdevel,
	linux-xfs, bernd.schubert, kernel-team, linux-mm, linux-nfs

On Mon, Jun 09, 2025 at 08:59:53PM -0700, Christoph Hellwig wrote:
> On Mon, Jun 09, 2025 at 10:14:44AM -0700, Darrick J. Wong wrote:
> > > Where "folio laundering" means calling ->launder_folio, right?
> > 
> > What does fuse use folio laundering for, anyway?  It looks to me like
> > the primary users are invalidate_inode_pages*.  Either the caller cares
> > about flushing dirty data and has called filemap_write_and_wait_range;
> > or it doesn't and wants to tear down the pagecache ahead of some other
> > operation that's going to change the file contents and doesn't care.
> > 
> > I suppose it could be useful as a last-chance operation on a dirty folio
> > that was dirtied after a filemap_write_and_wait_range but before
> > invalidate_inode_pages*?  Though for xfs we just return EBUSY and let
> > the caller try again (or not).  Is there a subtlety to fuse here that I
> > don't know about?
> 
> My memory might be betraying me, but I think willy once launched an
> attempt to see if we can kill launder_folio.  Adding him, and the
> mm and nfs lists to check if I have a point :)

I ... got distracted with everything else.

Looking at the original addition of ->launder_page (e3db7691e9f3), I
don't understand why we need it.  invalidate_inode_pages2() isn't
supposed to invalidate dirty pages, so I don't understand why nfs
found it necessary to do writeback from ->releasepage() instead
of just returning false like iomap does.

There's now a new question of what the hell btrfs is up to with
->launder_folio, which they just added recently.



* does fuse need ->launder_folios, was: Re: [PATCH v1 5/8] iomap: add iomap_writeback_dirty_folio()
  2025-06-11  4:34         ` Matthew Wilcox
@ 2025-06-18  4:47           ` Christoph Hellwig
  2025-06-18 12:17           ` Jeff Layton
  1 sibling, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2025-06-18  4:47 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Christoph Hellwig, Darrick J. Wong, Joanne Koong, miklos, brauner,
	linux-fsdevel, linux-xfs, bernd.schubert, kernel-team, linux-mm,
	linux-nfs

On Wed, Jun 11, 2025 at 05:34:50AM +0100, Matthew Wilcox wrote:
> > My memory might be betraying me, but I think willy once launched an
> > attempt to see if we can kill launder_folio.  Adding him, and the
> > mm and nfs lists to check if I have a point :)
> 
> I ... got distracted with everything else.
> 
> Looking at the original addition of ->launder_page (e3db7691e9f3), I
> don't understand why we need it.  invalidate_inode_pages2() isn't
> supposed to invalidate dirty pages, so I don't understand why nfs
> found it necessary to do writeback from ->releasepage() instead
> of just returning false like iomap does.

Yeah.  Miklos (and other fuse folks), can you help figure out
if fuse really wants ->launder_folio?  Because it would be really good
to settle this question before we have to add iomap infrastructure for
it.




* Re: [PATCH v1 5/8] iomap: add iomap_writeback_dirty_folio()
  2025-06-11  4:34         ` Matthew Wilcox
  2025-06-18  4:47           ` does fuse need ->launder_folios, was: " Christoph Hellwig
@ 2025-06-18 12:17           ` Jeff Layton
  2025-06-20 18:15             ` Matthew Wilcox
  1 sibling, 1 reply; 12+ messages in thread
From: Jeff Layton @ 2025-06-18 12:17 UTC (permalink / raw)
  To: Matthew Wilcox, Christoph Hellwig
  Cc: Darrick J. Wong, Joanne Koong, miklos, brauner, linux-fsdevel,
	linux-xfs, bernd.schubert, kernel-team, linux-mm, linux-nfs

On Wed, 2025-06-11 at 05:34 +0100, Matthew Wilcox wrote:
> On Mon, Jun 09, 2025 at 08:59:53PM -0700, Christoph Hellwig wrote:
> > On Mon, Jun 09, 2025 at 10:14:44AM -0700, Darrick J. Wong wrote:
> > > > Where "folio laundering" means calling ->launder_folio, right?
> > > 
> > > What does fuse use folio laundering for, anyway?  It looks to me like
> > > the primary users are invalidate_inode_pages*.  Either the caller cares
> > > about flushing dirty data and has called filemap_write_and_wait_range;
> > > or it doesn't and wants to tear down the pagecache ahead of some other
> > > operation that's going to change the file contents and doesn't care.
> > > 
> > > I suppose it could be useful as a last-chance operation on a dirty folio
> > > that was dirtied after a filemap_write_and_wait_range but before
> > > invalidate_inode_pages*?  Though for xfs we just return EBUSY and let
> > > the caller try again (or not).  Is there a subtlety to fuse here that I
> > > don't know about?
> > 
> > My memory might be betraying me, but I think willy once launched an
> > attempt to see if we can kill launder_folio.  Adding him, and the
> > mm and nfs lists to check if I have a point :)
> 
> I ... got distracted with everything else.
> 
> Looking at the original addition of ->launder_page (e3db7691e9f3), I
> don't understand why we need it.  invalidate_inode_pages2() isn't
> supposed to invalidate dirty pages, so I don't understand why nfs
> found it necessary to do writeback from ->releasepage() instead
> of just returning false like iomap does.
> 
> There's now a new question of what the hell btrfs is up to with
> ->launder_folio, which they just added recently.

IIRC...

The problem was a race where a task could dirty a page in a
mmap'ed file after it had been written back but before it was unmapped
from the pagecache.

Bear in mind that the NFS client may need to write back and then
invalidate the pagecache for a file that is still in use if it
discovers that the inode's attributes have changed on the server.

Trond's solution was to write the page out while holding the page lock
in this situation. I think we'd all welcome a way to avoid this race
that didn't require launder_folio().
-- 
Jeff Layton <jlayton@kernel.org>



* Re: [PATCH v1 5/8] iomap: add iomap_writeback_dirty_folio()
  2025-06-18 12:17           ` Jeff Layton
@ 2025-06-20 18:15             ` Matthew Wilcox
  2025-06-25  5:26               ` Joanne Koong
  0 siblings, 1 reply; 12+ messages in thread
From: Matthew Wilcox @ 2025-06-20 18:15 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Christoph Hellwig, Darrick J. Wong, Joanne Koong, miklos, brauner,
	linux-fsdevel, linux-xfs, bernd.schubert, kernel-team, linux-mm,
	linux-nfs

On Wed, Jun 18, 2025 at 08:17:03AM -0400, Jeff Layton wrote:
> On Wed, 2025-06-11 at 05:34 +0100, Matthew Wilcox wrote:
> > On Mon, Jun 09, 2025 at 08:59:53PM -0700, Christoph Hellwig wrote:
> > > On Mon, Jun 09, 2025 at 10:14:44AM -0700, Darrick J. Wong wrote:
> > > > > Where "folio laundering" means calling ->launder_folio, right?
> > > > 
> > > > What does fuse use folio laundering for, anyway?  It looks to me like
> > > > the primary users are invalidate_inode_pages*.  Either the caller cares
> > > > about flushing dirty data and has called filemap_write_and_wait_range;
> > > > or it doesn't and wants to tear down the pagecache ahead of some other
> > > > operation that's going to change the file contents and doesn't care.
> > > > 
> > > > I suppose it could be useful as a last-chance operation on a dirty folio
> > > > that was dirtied after a filemap_write_and_wait_range but before
> > > > invalidate_inode_pages*?  Though for xfs we just return EBUSY and let
> > > > the caller try again (or not).  Is there a subtlety to fuse here that I
> > > > don't know about?
> > > 
> > > My memory might be betraying me, but I think willy once launched an
> > > attempt to see if we can kill launder_folio.  Adding him, and the
> > > mm and nfs lists to check if I have a point :)
> > 
> > I ... got distracted with everything else.
> > 
> > Looking at the original addition of ->launder_page (e3db7691e9f3), I
> > don't understand why we need it.  invalidate_inode_pages2() isn't
> > supposed to invalidate dirty pages, so I don't understand why nfs
> > found it necessary to do writeback from ->releasepage() instead
> > of just returning false like iomap does.
> > 
> > There's now a new question of what the hell btrfs is up to with
> > ->launder_folio, which they just added recently.
> 
> IIRC...
> 
> The problem was a race where a task could dirty a page in a
> mmap'ed file after it had been written back but before it was unmapped
> from the pagecache.
> 
> Bear in mind that the NFS client may need to write back and then
> invalidate the pagecache for a file that is still in use if it
> discovers that the inode's attributes have changed on the server.
> 
> Trond's solution was to write the page out while holding the page lock
> in this situation. I think we'd all welcome a way to avoid this race
> that didn't require launder_folio().

I think the problem is that we've left it up to the filesystem to handle
"what do we do if we've dirtied a page^W folio between, say, calling
filemap_write_and_wait_range() and calling filemap_release_folio()".
Just to make sure we're all on the same page here, this is the sample
path I'm looking at:

__iomap_dio_rw
  kiocb_invalidate_pages
    filemap_invalidate_pages
      filemap_write_and_wait_range
      invalidate_inode_pages2_range
        unmap_mapping_pages
        folio_lock
        folio_wait_writeback
        folio_unmap_invalidate
          unmap_mapping_folio
        folio_launder
        filemap_release_folio
        if (folio_test_dirty(folio))
          return -EBUSY;

So some filesystems opt to write back the folio which has been dirtied
(by implementing ->launder_folio) and others opt to fail (and fall back to
buffered I/O when the user has requested direct I/O).  iomap filesystems
all just "return false" for dirty folios, so it's clearly an acceptable
outcome as far as xfstests go.

The question is whether this is acceptable for all the filesystems
which implement ->launder_folio today.  Because we could just move the
folio_test_dirty() to after the folio_lock() and remove all the testing
of folio dirtiness from individual filesystems.

Or have I missed something and picked the wrong sample path for
analysing why we do/don't need to writeback folios in
invalidate_inode_pages2_range()?



* Re: [PATCH v1 5/8] iomap: add iomap_writeback_dirty_folio()
  2025-06-20 18:15             ` Matthew Wilcox
@ 2025-06-25  5:26               ` Joanne Koong
  2025-06-25  6:26                 ` Christoph Hellwig
  0 siblings, 1 reply; 12+ messages in thread
From: Joanne Koong @ 2025-06-25  5:26 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Jeff Layton, Christoph Hellwig, Darrick J. Wong, miklos, brauner,
	linux-fsdevel, linux-xfs, bernd.schubert, kernel-team, linux-mm,
	linux-nfs

On Fri, Jun 20, 2025 at 11:15 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Wed, Jun 18, 2025 at 08:17:03AM -0400, Jeff Layton wrote:
> > On Wed, 2025-06-11 at 05:34 +0100, Matthew Wilcox wrote:
> > > On Mon, Jun 09, 2025 at 08:59:53PM -0700, Christoph Hellwig wrote:
> > > > On Mon, Jun 09, 2025 at 10:14:44AM -0700, Darrick J. Wong wrote:
> > > > > > Where "folio laundering" means calling ->launder_folio, right?
> > > > >
> > > > > What does fuse use folio laundering for, anyway?  It looks to me like
> > > > > the primary users are invalidate_inode_pages*.  Either the caller cares
> > > > > about flushing dirty data and has called filemap_write_and_wait_range;
> > > > > or it doesn't and wants to tear down the pagecache ahead of some other
> > > > > operation that's going to change the file contents and doesn't care.
> > > > >
> > > > > I suppose it could be useful as a last-chance operation on a dirty folio
> > > > > that was dirtied after a filemap_write_and_wait_range but before
> > > > > invalidate_inode_pages*?  Though for xfs we just return EBUSY and let
> > > > > the caller try again (or not).  Is there a subtlety to fuse here that I
> > > > > don't know about?
> > > >
> > > > My memory might be betraying me, but I think willy once launched an
> > > > attempt to see if we can kill launder_folio.  Adding him, and the
> > > > mm and nfs lists to check if I have a point :)
> > >
> > > I ... got distracted with everything else.
> > >
> > > Looking at the original addition of ->launder_page (e3db7691e9f3), I
> > > don't understand why we need it.  invalidate_inode_pages2() isn't
> > > supposed to invalidate dirty pages, so I don't understand why nfs
> > > found it necessary to do writeback from ->releasepage() instead
> > > of just returning false like iomap does.
> > >
> > > There's now a new question of what the hell btrfs is up to with
> > > ->launder_folio, which they just added recently.
> >
> > IIRC...
> >
> > The problem was a race where a task could dirty a page in a
> > mmap'ed file after it had been written back but before it was unmapped
> > from the pagecache.
> >
> > Bear in mind that the NFS client may need to write back and then
> > invalidate the pagecache for a file that is still in use if it
> > discovers that the inode's attributes have changed on the server.
> >
> > Trond's solution was to write the page out while holding the page lock
> > in this situation. I think we'd all welcome a way to avoid this race
> > that didn't require launder_folio().
>
> I think the problem is that we've left it up to the filesystem to handle
> "what do we do if we've dirtied a page^W folio between, say, calling
> filemap_write_and_wait_range() and calling filemap_release_folio()".
> Just to make sure we're all on the same page here, this is the sample
> path I'm looking at:
>
> __iomap_dio_rw
>   kiocb_invalidate_pages
>     filemap_invalidate_pages
>       filemap_write_and_wait_range
>       invalidate_inode_pages2_range
>         unmap_mapping_pages
>         folio_lock
>         folio_wait_writeback
>         folio_unmap_invalidate
>           unmap_mapping_folio
>         folio_launder
>         filemap_release_folio
>         if (folio_test_dirty(folio))
>           return -EBUSY;
>
> So some filesystems opt to write back the folio which has been dirtied
> (by implementing ->launder_folio) and others opt to fail (and fall back to
> buffered I/O when the user has requested direct I/O).  iomap filesystems
> all just "return false" for dirty folios, so it's clearly an acceptable
> outcome as far as xfstests go.
>
> The question is whether this is acceptable for all the filesystems
> which implement ->launder_folio today.  Because we could just move the
> folio_test_dirty() to after the folio_lock() and remove all the testing
> of folio dirtiness from individual filesystems.

Or could the filesystems that implement ->launder_folio (from what I
see, there's only 4: fuse, nfs, btrfs, and orangefs) just move that
logic into their .release_folio implementation? I don't see why not.
In folio_unmap_invalidate(), we call:

        ret = folio_launder(mapping, folio);
        if (ret)
                return ret;
        if (folio->mapping != mapping)
                return -EBUSY;
        if (!filemap_release_folio(folio, gfp))
                return -EBUSY;

If fuse, nfs, btrfs, and orangefs absolutely need to do whatever logic
they're doing in .launder_folio, could they not just move it into
.release_folio?

>
> Or have I missed something and picked the wrong sample path for
> analysing why we do/don't need to writeback folios in
> invalidate_inode_pages2_range()?



* Re: [PATCH v1 5/8] iomap: add iomap_writeback_dirty_folio()
  2025-06-25  5:26               ` Joanne Koong
@ 2025-06-25  6:26                 ` Christoph Hellwig
  2025-06-25 16:44                   ` Joanne Koong
  0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2025-06-25  6:26 UTC (permalink / raw)
  To: Joanne Koong
  Cc: Matthew Wilcox, Jeff Layton, Christoph Hellwig, Darrick J. Wong,
	miklos, brauner, linux-fsdevel, linux-xfs, bernd.schubert,
	kernel-team, linux-mm, linux-nfs

On Tue, Jun 24, 2025 at 10:26:01PM -0700, Joanne Koong wrote:
> > The question is whether this is acceptable for all the filesystems
> > which implement ->launder_folio today.  Because we could just move the
> > folio_test_dirty() to after the folio_lock() and remove all the testing
> > of folio dirtiness from individual filesystems.
> 
> Or could the filesystems that implement ->launder_folio (from what I
> see, there's only 4: fuse, nfs, btrfs, and orangefs) just move that
> logic into their .release_folio implementation? I don't see why not.
> In folio_unmap_invalidate(), we call:

Without even looking into the details, from the iomap POV that basically
doesn't matter.  You'd still need the write-back-a-single-locked-folio
interface, which adds API surface, and because it only writes a single
folio at a time it is rather inefficient.  Not a deal breaker because
the current version looks ok, but it would still be preferable to not
have an extra magic interface for it.




* Re: [PATCH v1 5/8] iomap: add iomap_writeback_dirty_folio()
  2025-06-25  6:26                 ` Christoph Hellwig
@ 2025-06-25 16:44                   ` Joanne Koong
  2025-07-01  5:41                     ` Darrick J. Wong
  2025-07-01  6:23                     ` Miklos Szeredi
  0 siblings, 2 replies; 12+ messages in thread
From: Joanne Koong @ 2025-06-25 16:44 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Matthew Wilcox, Jeff Layton, Darrick J. Wong, miklos, brauner,
	linux-fsdevel, linux-xfs, bernd.schubert, kernel-team, linux-mm,
	linux-nfs, linux-btrfs

On Tue, Jun 24, 2025 at 11:26 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Tue, Jun 24, 2025 at 10:26:01PM -0700, Joanne Koong wrote:
> > > The question is whether this is acceptable for all the filesystems
> > > which implement ->launder_folio today.  Because we could just move the
> > > folio_test_dirty() to after the folio_lock() and remove all the testing
> > > of folio dirtiness from individual filesystems.
> >
> > Or could the filesystems that implement ->launder_folio (from what I
> > see, there's only 4: fuse, nfs, btrfs, and orangefs) just move that
> > logic into their .release_folio implementation? I don't see why not.
> > In folio_unmap_invalidate(), we call:
>
> Without even looking into the details, from the iomap POV that basically
> doesn't matter.  You'd still need the write-back-a-single-locked-folio
> interface, which adds API surface, and because it only writes a single
> folio at a time it is rather inefficient.  Not a deal breaker because
> the current version looks ok, but it would still be preferable to not
> have an extra magic interface for it.
>

Yes, but as I understand it, the focus right now is on getting rid of
->launder_folio as an API. The iomap POV imo is a separate issue:
determining whether fuse in particular needs to write back the dirty
page before releasing, or should just fail.

btrfs uses ->launder_folio() to free some previously allocated
reservation (added in commit 872617a "btrfs: implement launder_folio
for clearing dirty page reserve") so at the very least, that logic
would need to be moved to .release_folio() (if that suffices? Adding
the btrfs group to cc). It's still vague to me whether
fuse/nfs/orangefs need to write back the dirty page, but it seems fine
to me not to - as I understand it, the worst that can happen (and
please correct me if I'm wrong here, Matthew) from just failing it
with -EBUSY is that the folio lingers longer in the page cache until
it eventually gets written back and cleared out, and that only happens
if the file is mapped and written to in that window between
filemap_write_and_wait_range() and unmap_mapping_folio(). afaics, if
fuse/nfs/orangefs do need to write back the dirty folio instead of
failing w/ -EBUSY, they could just do that logic in .release_folio.



* Re: [PATCH v1 5/8] iomap: add iomap_writeback_dirty_folio()
  2025-06-25 16:44                   ` Joanne Koong
@ 2025-07-01  5:41                     ` Darrick J. Wong
  2025-07-02 21:36                       ` Joanne Koong
  2025-07-01  6:23                     ` Miklos Szeredi
  1 sibling, 1 reply; 12+ messages in thread
From: Darrick J. Wong @ 2025-07-01  5:41 UTC (permalink / raw)
  To: Joanne Koong
  Cc: Christoph Hellwig, Matthew Wilcox, Jeff Layton, miklos, brauner,
	linux-fsdevel, linux-xfs, bernd.schubert, kernel-team, linux-mm,
	linux-nfs, linux-btrfs

On Wed, Jun 25, 2025 at 09:44:31AM -0700, Joanne Koong wrote:
> On Tue, Jun 24, 2025 at 11:26 PM Christoph Hellwig <hch@infradead.org> wrote:
> >
> > On Tue, Jun 24, 2025 at 10:26:01PM -0700, Joanne Koong wrote:
> > > > The question is whether this is acceptable for all the filesystems
> > > > which implement ->launder_folio today.  Because we could just move the
> > > > folio_test_dirty() to after the folio_lock() and remove all the testing
> > > > of folio dirtiness from individual filesystems.
> > >
> > > Or could the filesystems that implement ->launder_folio (from what I
> > > see, there's only 4: fuse, nfs, btrfs, and orangefs) just move that
> > > logic into their .release_folio implementation? I don't see why not.
> > > In folio_unmap_invalidate(), we call:
> >
> > Without even looking into the details, from the iomap POV that basically
> > doesn't matter.  You'd still need the write-back-a-single-locked-folio
> > interface, which adds API surface, and because it only writes a single
> > folio at a time it is rather inefficient.  Not a deal breaker because
> > the current version looks ok, but it would still be preferable to not
> > have an extra magic interface for it.
> >
> 
> Yes but as I understand it, the focus right now is on getting rid of
> ->launder_folio as an API. The iomap pov imo is a separate issue with
> determining whether fuse in particular needs to write back the dirty
> page before releasing or should just fail.

This might not help for Joanne's case, but so far the lack of a
launder_folio in my fuse+iomap prototype hasn't hindered it at all.
From what I can tell it's ok to bounce EBUSY back to dio callers...

> btrfs uses ->launder_folio() to free some previously allocated
> reservation (added in commit 872617a "btrfs: implement launder_folio
> for clearing dirty page reserve") so at the very least, that logic
> would need to be moved to .release_folio() (if that suffices? Adding
> the btrfs group to cc). It's still vague to me whether
> fuse/nfs/orangefs need to write back the dirty page, but it seems fine

...but only because a retry will initiate another writeback so
eventually we can make some forward progress.  But it helps a lot that
fuse+iomap is handing the entire IO stack over to iomap.

> to me not to - as I understand it, the worst that can happen (and
> please correct me if I'm wrong here, Matthew) from just failing it
> with -EBUSY is that the folio lingers longer in the page cache until
> it eventually gets written back and cleared out, and that only happens
> if the file is mapped and written to in that window between
> filemap_write_and_wait_range() and unmap_mapping_folio(). afaics, if
> fuse/nfs/orangefs do need to write back the dirty folio instead of
> failing w/ -EBUSY, they could just do that logic in .release_folio.

What do you do in ->release_folio if writeback fails?  Redirty it and
return false?

--D



* Re: [PATCH v1 5/8] iomap: add iomap_writeback_dirty_folio()
  2025-06-25 16:44                   ` Joanne Koong
  2025-07-01  5:41                     ` Darrick J. Wong
@ 2025-07-01  6:23                     ` Miklos Szeredi
  1 sibling, 0 replies; 12+ messages in thread
From: Miklos Szeredi @ 2025-07-01  6:23 UTC (permalink / raw)
  To: Joanne Koong
  Cc: Christoph Hellwig, Matthew Wilcox, Jeff Layton, Darrick J. Wong,
	brauner, linux-fsdevel, linux-xfs, bernd.schubert, kernel-team,
	linux-mm, linux-nfs, linux-btrfs

On Wed, 25 Jun 2025 at 18:44, Joanne Koong <joannelkoong@gmail.com> wrote:

> Yes but as I understand it, the focus right now is on getting rid of
> ->launder_folio as an API. The iomap pov imo is a separate issue with
> determining whether fuse in particular needs to write back the dirty
> page before releasing or should just fail.

Fuse calls invalidate_inode_pages2() not just for direct I/O:

 - open without FOPEN_KEEP_CACHE
 - FUSE_NOTIFY_INVAL_INODE
 - mtime/size change with FUSE_AUTO_INVAL_DATA turned on or
   FUSE_EXPLICIT_INVAL_DATA turned off
 - truncate

In most of these cases dirty pages do need to be written back.

Thanks,
Miklos



* Re: [PATCH v1 5/8] iomap: add iomap_writeback_dirty_folio()
  2025-07-01  5:41                     ` Darrick J. Wong
@ 2025-07-02 21:36                       ` Joanne Koong
  2025-07-02 21:47                         ` Joanne Koong
  0 siblings, 1 reply; 12+ messages in thread
From: Joanne Koong @ 2025-07-02 21:36 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Christoph Hellwig, Matthew Wilcox, Jeff Layton, miklos, brauner,
	linux-fsdevel, linux-xfs, bernd.schubert, kernel-team, linux-mm,
	linux-nfs, linux-btrfs

On Mon, Jun 30, 2025 at 10:41 PM Darrick J. Wong <djwong@kernel.org> wrote:
>
> On Wed, Jun 25, 2025 at 09:44:31AM -0700, Joanne Koong wrote:
> > On Tue, Jun 24, 2025 at 11:26 PM Christoph Hellwig <hch@infradead.org> wrote:
> > >
> > > On Tue, Jun 24, 2025 at 10:26:01PM -0700, Joanne Koong wrote:
> > > > > The question is whether this is acceptable for all the filesystems
> > > > > which implement ->launder_folio today.  Because we could just move the
> > > > > folio_test_dirty() to after the folio_lock() and remove all the testing
> > > > > of folio dirtiness from individual filesystems.
> > > >
> > > > Or could the filesystems that implement ->launder_folio (from what I
> > > > see, there's only 4: fuse, nfs, btrfs, and orangefs) just move that
> > > > logic into their .release_folio implementation? I don't see why not.
> > > > In folio_unmap_invalidate(), we call:
> > >
> > > Without even looking into the details, from the iomap POV that basically
> > > doesn't matter.  You'd still need the write-back-a-single-locked-folio
> > > interface, which adds API surface, and because it only writes a single
> > > folio at a time it is rather inefficient.  Not a deal breaker because
> > > the current version looks ok, but it would still be preferable to not
> > > have an extra magic interface for it.
> > >
> >
> > Yes but as I understand it, the focus right now is on getting rid of
> > ->launder_folio as an API. The iomap pov imo is a separate issue with
> > determining whether fuse in particular needs to write back the dirty
> > page before releasing or should just fail.
>
> This might not help for Joanne's case, but so far the lack of a
> launder_folio in my fuse+iomap prototype hasn't hindered it at all.
> From what I can tell it's ok to bounce EBUSY back to dio callers...
>
> > btrfs uses ->launder_folio() to free some previously allocated
> > reservation (added in commit 872617a "btrfs: implement launder_folio
> > for clearing dirty page reserve") so at the very least, that logic
> > would need to be moved to .release_folio() (if that suffices? Adding
> > the btrfs group to cc). It's still vague to me whether
> > fuse/nfs/orangefs need to write back the dirty page, but it seems fine
>
> ...but only because a retry will initiate another writeback so
> eventually we can make some forward progress.  But it helps a lot that
> fuse+iomap is handing the entire IO stack over to iomap.
>
> > to me not to - as I understand it, the worst that can happen (and
> > please correct me if I'm wrong here, Matthew) from just failing it
> > with -EBUSY is that the folio lingers longer in the page cache until
> > it eventually gets written back and cleared out, and that only happens
> > if the file is mapped and written to in that window between
> > filemap_write_and_wait_range() and unmap_mapping_folio(). afaics, if
> > fuse/nfs/orangefs do need to write back the dirty folio instead of
> > failing w/ -EBUSY, they could just do that logic in .release_folio.
>
> What do you do in ->release_folio if writeback fails?  Redirty it and
> return false?

Yeah, I was thinking we just redirty it and return false. I don't
think that leads to any deviation from existing behavior (eg in
folio_unmap_invalidate(), a failed writeback will return -EBUSY
regardless of whether the writeback attempt happens from
->launder_folio() or ->release_folio()).
>
> --D



* Re: [PATCH v1 5/8] iomap: add iomap_writeback_dirty_folio()
  2025-07-02 21:36                       ` Joanne Koong
@ 2025-07-02 21:47                         ` Joanne Koong
  0 siblings, 0 replies; 12+ messages in thread
From: Joanne Koong @ 2025-07-02 21:47 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Christoph Hellwig, Matthew Wilcox, Jeff Layton, miklos, brauner,
	linux-fsdevel, linux-xfs, bernd.schubert, kernel-team, linux-mm,
	linux-nfs, linux-btrfs

On Wed, Jul 2, 2025 at 2:36 PM Joanne Koong <joannelkoong@gmail.com> wrote:
>
> On Mon, Jun 30, 2025 at 10:41 PM Darrick J. Wong <djwong@kernel.org> wrote:
> >
> > On Wed, Jun 25, 2025 at 09:44:31AM -0700, Joanne Koong wrote:
> > > On Tue, Jun 24, 2025 at 11:26 PM Christoph Hellwig <hch@infradead.org> wrote:
> > > >
> > > > On Tue, Jun 24, 2025 at 10:26:01PM -0700, Joanne Koong wrote:
> > > > > > The question is whether this is acceptable for all the filesystems
> > > > > > which implement ->launder_folio today.  Because we could just move the
> > > > > > folio_test_dirty() to after the folio_lock() and remove all the testing
> > > > > > of folio dirtiness from individual filesystems.
> > > > >
> > > > > Or could the filesystems that implement ->launder_folio (from what I
> > > > > see, there's only 4: fuse, nfs, btrfs, and orangefs) just move that
> > > > > logic into their .release_folio implementation? I don't see why not.
> > > > > In folio_unmap_invalidate(), we call:
> > > >
> > > > Without even looking into the details, from the iomap POV that
> > > > basically doesn't matter.  You'd still need the write-back-a-single-
> > > > locked-folio interface, which adds API surface and, because it only
> > > > writes a single folio at a time, is rather inefficient.  Not a deal
> > > > breaker because the current version looks ok, but it would still be
> > > > preferable not to have an extra magic interface for it.
> > > >
> > >
> > > Yes, but as I understand it, the focus right now is on getting rid of
> > > ->launder_folio as an API. The iomap POV is imo a separate issue of
> > > determining whether fuse in particular needs to write back the dirty
> > > page before releasing it or should just fail.
> >
> > This might not help for Joanne's case, but so far the lack of a
> > launder_folio in my fuse+iomap prototype hasn't hindered it at all.
> > From what I can tell it's ok to bounce EBUSY back to dio callers...
> >
> > > btrfs uses ->launder_folio() to free some previously allocated
> > > reservation (added in commit 872617a "btrfs: implement launder_folio
> > > for clearing dirty page reserve") so at the very least, that logic
> > > would need to be moved to .release_folio() (if that suffices? Adding
> > > the btrfs group to cc). It's still vague to me whether
> > > fuse/nfs/orangefs need to write back the dirty page, but it seems fine
> >
> > ...but only because a retry will initiate another writeback so
> > eventually we can make some forward progress.  But it helps a lot that
> > fuse+iomap is handing the entire IO stack over to iomap.
> >
> > > to me not to - as I understand it, the worst that can happen (and
> > > please correct me if I'm wrong here, Matthew) from just failing it
> > > with -EBUSY is that the folio lingers longer in the page cache until
> > > it eventually gets written back and cleared out, and that only happens
> > > if the file is mapped and written to in that window between
> > > filemap_write_and_wait_range() and unmap_mapping_folio(). afaics, if
> > > fuse/nfs/orangefs do need to write back the dirty folio instead of
> > > failing w/ -EBUSY, they could just do that logic in .release_folio.
> >
> > What do you do in ->release_folio if writeback fails?  Redirty it and
> > return false?
>
> Yeah, I was thinking we just redirty it and return false. I don't
> think that leads to any deviation from existing behavior (eg in
> folio_unmap_invalidate(), a failed writeback will return -EBUSY
> regardless of whether the writeback attempt happens from
> ->launder_folio() or ->release_folio()).

Or actually, I guess the one deviation is that ->launder_folio() can
return a custom error code (eg -ENOMEM) to folio_unmap_invalidate(),
whereas release_folio() failures will default to -EBUSY; the error
code propagated to folio->mapping will be the same either way.
> >
> > --D


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2025-07-02 21:47 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20250606233803.1421259-1-joannelkoong@gmail.com>
     [not found] ` <20250606233803.1421259-6-joannelkoong@gmail.com>
     [not found]   ` <aEZoau3AuwoeqQgu@infradead.org>
     [not found]     ` <20250609171444.GL6156@frogsfrogsfrogs>
2025-06-10  3:59       ` [PATCH v1 5/8] iomap: add iomap_writeback_dirty_folio() Christoph Hellwig
2025-06-11  4:34         ` Matthew Wilcox
2025-06-18  4:47           ` does fuse need ->launder_folios, was: " Christoph Hellwig
2025-06-18 12:17           ` Jeff Layton
2025-06-20 18:15             ` Matthew Wilcox
2025-06-25  5:26               ` Joanne Koong
2025-06-25  6:26                 ` Christoph Hellwig
2025-06-25 16:44                   ` Joanne Koong
2025-07-01  5:41                     ` Darrick J. Wong
2025-07-02 21:36                       ` Joanne Koong
2025-07-02 21:47                         ` Joanne Koong
2025-07-01  6:23                     ` Miklos Szeredi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).