All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC] gfs2: Do not call iomap_zero_range beyond eof
@ 2025-05-08 13:34 Andreas Gruenbacher
  2025-05-08 14:17 ` Christoph Hellwig
  2025-05-08 14:30 ` Brian Foster
  0 siblings, 2 replies; 8+ messages in thread
From: Andreas Gruenbacher @ 2025-05-08 13:34 UTC (permalink / raw)
  To: Brian Foster, Christian Brauner, Christoph Hellwig,
	Matthew Wilcox, linux-xfs, linux-fsdevel, gfs2
  Cc: Andreas Gruenbacher

Since commit eb65540aa9fc ("iomap: warn on zero range of a post-eof
folio"), iomap_zero_range() warns when asked to zero a folio beyond eof.
The warning triggers on the following code path:

  gfs2_fallocate(FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE)
    __gfs2_punch_hole()
      gfs2_block_zero_range()
        iomap_zero_range()

So far, gfs2 is just zeroing out partial pages at the beginning and end
of the range, whether beyond eof or not.  The data beyond eof is already
expected to be all zeroes, though.  Truncate the range passed to
iomap_zero_range().

As an alternative approach, we could also implicitly truncate the range
inside iomap_zero_range() instead of issuing a warning.  Any thoughts?

Thanks,
Andreas

--

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>

diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index b81984def58e..d9a4309cd414 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -1301,6 +1301,10 @@ static int gfs2_block_zero_range(struct inode *inode, loff_t from,
 				 unsigned int length)
 {
 	BUG_ON(current->journal_info);
+	if (from > inode->i_size)
+		return 0;
+	if (from + length > inode->i_size)
+		length = inode->i_size - from;
 	return iomap_zero_range(inode, from, length, NULL, &gfs2_iomap_ops,
 			NULL);
 }


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [RFC] gfs2: Do not call iomap_zero_range beyond eof
  2025-05-08 13:34 [RFC] gfs2: Do not call iomap_zero_range beyond eof Andreas Gruenbacher
@ 2025-05-08 14:17 ` Christoph Hellwig
  2025-05-08 14:30 ` Brian Foster
  1 sibling, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2025-05-08 14:17 UTC (permalink / raw)
  To: Andreas Gruenbacher
  Cc: Brian Foster, Christian Brauner, Christoph Hellwig,
	Matthew Wilcox, linux-xfs, linux-fsdevel, gfs2

On Thu, May 08, 2025 at 03:34:27PM +0200, Andreas Gruenbacher wrote:
> As an alternative approach, we could also implicitly truncate the range
> inside iomap_zero_range() instead of issuing a warning.  Any thoughts?

I'd rather not magically change the range.

>  	BUG_ON(current->journal_info);
> +	if (from + length > inode->i_size)
> +		length = inode->i_size - from;

	length = min(length, inode->i_size - from);


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] gfs2: Do not call iomap_zero_range beyond eof
  2025-05-08 13:34 [RFC] gfs2: Do not call iomap_zero_range beyond eof Andreas Gruenbacher
  2025-05-08 14:17 ` Christoph Hellwig
@ 2025-05-08 14:30 ` Brian Foster
  2025-05-08 15:04   ` Darrick J. Wong
  1 sibling, 1 reply; 8+ messages in thread
From: Brian Foster @ 2025-05-08 14:30 UTC (permalink / raw)
  To: Andreas Gruenbacher
  Cc: Christian Brauner, Christoph Hellwig, Matthew Wilcox, linux-xfs,
	linux-fsdevel, gfs2

On Thu, May 08, 2025 at 03:34:27PM +0200, Andreas Gruenbacher wrote:
> Since commit eb65540aa9fc ("iomap: warn on zero range of a post-eof
> folio"), iomap_zero_range() warns when asked to zero a folio beyond eof.
> The warning triggers on the following code path:
> 
>   gfs2_fallocate(FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE)
>     __gfs2_punch_hole()
>       gfs2_block_zero_range()
>         iomap_zero_range()
> 
> So far, gfs2 is just zeroing out partial pages at the beginning and end
> of the range, whether beyond eof or not.  The data beyond eof is already
> expected to be all zeroes, though.  Truncate the range passed to
> iomap_zero_range().
> 
> As an alternative approach, we could also implicitly truncate the range
> inside iomap_zero_range() instead of issuing a warning.  Any thoughts?
> 

Thanks Andreas. The more I think about this the more it seems like
lifting this logic into iomap is a reasonable compromise between just
dropping the warning and forcing individual filesystems to work around
it. The original intent of the warning was to have something to catch
subtle bad behavior since zero range did update i_size for so long.

OTOH I think it's reasonable to argue that we shouldn't need to warn in
situations where we could just enforce correct behavior. Also, I believe
we introduced something similar to avoid post-eof weirdness wrt unshare
range [1], so precedent exists.

I'm interested if others have opinions on the iomap side.. (though as I
write this it looks like hch sits on the side of not tweaking iomap).

Brian

[1] a311a08a4237 ("iomap: constrain the file range passed to iomap_file_unshare")

> Thanks,
> Andreas
> 
> --
> 
> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
> 
> diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
> index b81984def58e..d9a4309cd414 100644
> --- a/fs/gfs2/bmap.c
> +++ b/fs/gfs2/bmap.c
> @@ -1301,6 +1301,10 @@ static int gfs2_block_zero_range(struct inode *inode, loff_t from,
>  				 unsigned int length)
>  {
>  	BUG_ON(current->journal_info);
> +	if (from > inode->i_size)
> +		return 0;
> +	if (from + length > inode->i_size)
> +		length = inode->i_size - from;
>  	return iomap_zero_range(inode, from, length, NULL, &gfs2_iomap_ops,
>  			NULL);
>  }
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] gfs2: Do not call iomap_zero_range beyond eof
  2025-05-08 14:30 ` Brian Foster
@ 2025-05-08 15:04   ` Darrick J. Wong
  2025-05-08 15:19     ` Brian Foster
  0 siblings, 1 reply; 8+ messages in thread
From: Darrick J. Wong @ 2025-05-08 15:04 UTC (permalink / raw)
  To: Brian Foster
  Cc: Andreas Gruenbacher, Christian Brauner, Christoph Hellwig,
	Matthew Wilcox, linux-xfs, linux-fsdevel, gfs2

On Thu, May 08, 2025 at 10:30:30AM -0400, Brian Foster wrote:
> On Thu, May 08, 2025 at 03:34:27PM +0200, Andreas Gruenbacher wrote:
> > Since commit eb65540aa9fc ("iomap: warn on zero range of a post-eof
> > folio"), iomap_zero_range() warns when asked to zero a folio beyond eof.
> > The warning triggers on the following code path:

Which warning?  This one?

	/* warn about zeroing folios beyond eof that won't write back */
	WARN_ON_ONCE(folio_pos(folio) > iter->inode->i_size);

If so, then why are there folios that start entirely beyond EOF?

> > 
> >   gfs2_fallocate(FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE)
> >     __gfs2_punch_hole()
> >       gfs2_block_zero_range()
> >         iomap_zero_range()
> > 
> > So far, gfs2 is just zeroing out partial pages at the beginning and end
> > of the range, whether beyond eof or not.  The data beyond eof is already
> > expected to be all zeroes, though.  Truncate the range passed to
> > iomap_zero_range().
> > 
> > As an alternative approach, we could also implicitly truncate the range
> > inside iomap_zero_range() instead of issuing a warning.  Any thoughts?
> > 
> 
> Thanks Andreas. The more I think about this the more it seems like
> lifting this logic into iomap is a reasonable compromise between just
> dropping the warning and forcing individual filesystems to work around
> it. The original intent of the warning was to have something to catch
> subtle bad behavior since zero range did update i_size for so long.
> 
> OTOH I think it's reasonable to argue that we shouldn't need to warn in
> situations where we could just enforce correct behavior. Also, I believe
> we introduced something similar to avoid post-eof weirdness wrt unshare
> range [1], so precedent exists.
> 
> I'm interested if others have opinions on the iomap side.. (though as I
> write this it looks like hch sits on the side of not tweaking iomap).

IIRC XFS calls iomap_zero_range during file extending operations to zero
the tail of a folio that spans EOF, so you'd have to allow for that too.

--D

> Brian
> 
> [1] a311a08a4237 ("iomap: constrain the file range passed to iomap_file_unshare")
> 
> > Thanks,
> > Andreas
> > 
> > --
> > 
> > Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
> > 
> > diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
> > index b81984def58e..d9a4309cd414 100644
> > --- a/fs/gfs2/bmap.c
> > +++ b/fs/gfs2/bmap.c
> > @@ -1301,6 +1301,10 @@ static int gfs2_block_zero_range(struct inode *inode, loff_t from,
> >  				 unsigned int length)
> >  {
> >  	BUG_ON(current->journal_info);
> > +	if (from > inode->i_size)
> > +		return 0;
> > +	if (from + length > inode->i_size)
> > +		length = inode->i_size - from;
> >  	return iomap_zero_range(inode, from, length, NULL, &gfs2_iomap_ops,
> >  			NULL);
> >  }
> > 
> 
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] gfs2: Do not call iomap_zero_range beyond eof
  2025-05-08 15:04   ` Darrick J. Wong
@ 2025-05-08 15:19     ` Brian Foster
  2025-05-08 16:04       ` Darrick J. Wong
  0 siblings, 1 reply; 8+ messages in thread
From: Brian Foster @ 2025-05-08 15:19 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Andreas Gruenbacher, Christian Brauner, Christoph Hellwig,
	Matthew Wilcox, linux-xfs, linux-fsdevel, gfs2

On Thu, May 08, 2025 at 08:04:46AM -0700, Darrick J. Wong wrote:
> On Thu, May 08, 2025 at 10:30:30AM -0400, Brian Foster wrote:
> > On Thu, May 08, 2025 at 03:34:27PM +0200, Andreas Gruenbacher wrote:
> > > Since commit eb65540aa9fc ("iomap: warn on zero range of a post-eof
> > > folio"), iomap_zero_range() warns when asked to zero a folio beyond eof.
> > > The warning triggers on the following code path:
> 
> Which warning?  This one?
> 
> 	/* warn about zeroing folios beyond eof that won't write back */
> 	WARN_ON_ONCE(folio_pos(folio) > iter->inode->i_size);
> 
> If so, then why are there folios that start entirely beyond EOF?
> 

Yeah.. this gfs2 instance is simply a case of their punch hole mechanism
does unconditional partial folio zeroing via iomap zero range, so if a
punch hole occurs on some unaligned range of post-eof blocks, it will
basically create and perform zeroing of post-eof folios. IIUC the caveat
here is that these blocks are all zeroed on alloc (unwritten extents are
apparently not a thing in gfs2), so the punch time zeroing and warning
are spurious. Andreas can correct me if I have any of that wrong.

> > > 
> > >   gfs2_fallocate(FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE)
> > >     __gfs2_punch_hole()
> > >       gfs2_block_zero_range()
> > >         iomap_zero_range()
> > > 
> > > So far, gfs2 is just zeroing out partial pages at the beginning and end
> > > of the range, whether beyond eof or not.  The data beyond eof is already
> > > expected to be all zeroes, though.  Truncate the range passed to
> > > iomap_zero_range().
> > > 
> > > As an alternative approach, we could also implicitly truncate the range
> > > inside iomap_zero_range() instead of issuing a warning.  Any thoughts?
> > > 
> > 
> > Thanks Andreas. The more I think about this the more it seems like
> > lifting this logic into iomap is a reasonable compromise between just
> > dropping the warning and forcing individual filesystems to work around
> > it. The original intent of the warning was to have something to catch
> > subtle bad behavior since zero range did update i_size for so long.
> > 
> > OTOH I think it's reasonable to argue that we shouldn't need to warn in
> > situations where we could just enforce correct behavior. Also, I believe
> > we introduced something similar to avoid post-eof weirdness wrt unshare
> > range [1], so precedent exists.
> > 
> > I'm interested if others have opinions on the iomap side.. (though as I
> > write this it looks like hch sits on the side of not tweaking iomap).
> 
> IIRC XFS calls iomap_zero_range during file extending operations to zero
> the tail of a folio that spans EOF, so you'd have to allow for that too.
> 

Yeah, good point. Perhaps we'd want to bail on a folio that starts
beyond EOF with this approach, similar to the warning logic.

Brian

> --D
> 
> > Brian
> > 
> > [1] a311a08a4237 ("iomap: constrain the file range passed to iomap_file_unshare")
> > 
> > > Thanks,
> > > Andreas
> > > 
> > > --
> > > 
> > > Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
> > > 
> > > diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
> > > index b81984def58e..d9a4309cd414 100644
> > > --- a/fs/gfs2/bmap.c
> > > +++ b/fs/gfs2/bmap.c
> > > @@ -1301,6 +1301,10 @@ static int gfs2_block_zero_range(struct inode *inode, loff_t from,
> > >  				 unsigned int length)
> > >  {
> > >  	BUG_ON(current->journal_info);
> > > +	if (from > inode->i_size)
> > > +		return 0;
> > > +	if (from + length > inode->i_size)
> > > +		length = inode->i_size - from;
> > >  	return iomap_zero_range(inode, from, length, NULL, &gfs2_iomap_ops,
> > >  			NULL);
> > >  }
> > > 
> > 
> > 
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] gfs2: Do not call iomap_zero_range beyond eof
  2025-05-08 15:19     ` Brian Foster
@ 2025-05-08 16:04       ` Darrick J. Wong
  2025-05-08 16:34         ` Brian Foster
  2025-05-08 16:36         ` Andreas Gruenbacher
  0 siblings, 2 replies; 8+ messages in thread
From: Darrick J. Wong @ 2025-05-08 16:04 UTC (permalink / raw)
  To: Brian Foster
  Cc: Andreas Gruenbacher, Christian Brauner, Christoph Hellwig,
	Matthew Wilcox, linux-xfs, linux-fsdevel, gfs2

On Thu, May 08, 2025 at 11:19:37AM -0400, Brian Foster wrote:
> On Thu, May 08, 2025 at 08:04:46AM -0700, Darrick J. Wong wrote:
> > On Thu, May 08, 2025 at 10:30:30AM -0400, Brian Foster wrote:
> > > On Thu, May 08, 2025 at 03:34:27PM +0200, Andreas Gruenbacher wrote:
> > > > Since commit eb65540aa9fc ("iomap: warn on zero range of a post-eof
> > > > folio"), iomap_zero_range() warns when asked to zero a folio beyond eof.
> > > > The warning triggers on the following code path:
> > 
> > Which warning?  This one?
> > 
> > 	/* warn about zeroing folios beyond eof that won't write back */
> > 	WARN_ON_ONCE(folio_pos(folio) > iter->inode->i_size);
> > 
> > If so, then why are there folios that start entirely beyond EOF?
> > 
> 
> Yeah.. this gfs2 instance is simply a case of their punch hole mechanism
> does unconditional partial folio zeroing via iomap zero range, so if a
> punch hole occurs on some unaligned range of post-eof blocks, it will
> basically create and perform zeroing of post-eof folios. IIUC the caveat
> here is that these blocks are all zeroed on alloc (unwritten extents are
> apparently not a thing in gfs2), so the punch time zeroing and warning
> are spurious. Andreas can correct me if I have any of that wrong.

Oh, right, because iomap_zero_iter calls iomap_write_begin, which
allocates a new folio completely beyond EOF, and then we see that new
folio and WARN about it before scribbling on the folio and dirtying it.
Correct?

If so then yeah, it doesn't seem useful to do that... unless the file
size immediately gets extended such that at least one byte of the dirty
folio is within EOF.  Even then, that seems like a stretch...

> > > > 
> > > >   gfs2_fallocate(FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE)
> > > >     __gfs2_punch_hole()
> > > >       gfs2_block_zero_range()
> > > >         iomap_zero_range()
> > > > 
> > > > So far, gfs2 is just zeroing out partial pages at the beginning and end
> > > > of the range, whether beyond eof or not.  The data beyond eof is already
> > > > expected to be all zeroes, though.  Truncate the range passed to
> > > > iomap_zero_range().
> > > > 
> > > > As an alternative approach, we could also implicitly truncate the range
> > > > inside iomap_zero_range() instead of issuing a warning.  Any thoughts?
> > > > 
> > > 
> > > Thanks Andreas. The more I think about this the more it seems like
> > > lifting this logic into iomap is a reasonable compromise between just
> > > dropping the warning and forcing individual filesystems to work around
> > > it. The original intent of the warning was to have something to catch
> > > subtle bad behavior since zero range did update i_size for so long.
> > > 
> > > OTOH I think it's reasonable to argue that we shouldn't need to warn in
> > > situations where we could just enforce correct behavior. Also, I believe
> > > we introduced something similar to avoid post-eof weirdness wrt unshare
> > > range [1], so precedent exists.
> > > 
> > > I'm interested if others have opinions on the iomap side.. (though as I
> > > write this it looks like hch sits on the side of not tweaking iomap).
> > 
> > IIRC XFS calls iomap_zero_range during file extending operations to zero
> > the tail of a folio that spans EOF, so you'd have to allow for that too.
> > 
> 
> Yeah, good point. Perhaps we'd want to bail on a folio that starts
> beyond EOF with this approach, similar to the warning logic.

...because I don't see much use in zeroing and dirtying a folio that
starts well beyond EOF since iomap_writepage_handle_eof will ignore it
and there are several gigantic comments in buffered-io.c about clamping
to EOF.

<shrug> But maybe I'm missing a usecase?

--D

> Brian
> 
> > --D
> > 
> > > Brian
> > > 
> > > [1] a311a08a4237 ("iomap: constrain the file range passed to iomap_file_unshare")
> > > 
> > > > Thanks,
> > > > Andreas
> > > > 
> > > > --
> > > > 
> > > > Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
> > > > 
> > > > diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
> > > > index b81984def58e..d9a4309cd414 100644
> > > > --- a/fs/gfs2/bmap.c
> > > > +++ b/fs/gfs2/bmap.c
> > > > @@ -1301,6 +1301,10 @@ static int gfs2_block_zero_range(struct inode *inode, loff_t from,
> > > >  				 unsigned int length)
> > > >  {
> > > >  	BUG_ON(current->journal_info);
> > > > +	if (from > inode->i_size)
> > > > +		return 0;
> > > > +	if (from + length > inode->i_size)
> > > > +		length = inode->i_size - from;
> > > >  	return iomap_zero_range(inode, from, length, NULL, &gfs2_iomap_ops,
> > > >  			NULL);
> > > >  }
> > > > 
> > > 
> > > 
> > 
> 
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] gfs2: Do not call iomap_zero_range beyond eof
  2025-05-08 16:04       ` Darrick J. Wong
@ 2025-05-08 16:34         ` Brian Foster
  2025-05-08 16:36         ` Andreas Gruenbacher
  1 sibling, 0 replies; 8+ messages in thread
From: Brian Foster @ 2025-05-08 16:34 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Andreas Gruenbacher, Christian Brauner, Christoph Hellwig,
	Matthew Wilcox, linux-xfs, linux-fsdevel, gfs2

On Thu, May 08, 2025 at 09:04:22AM -0700, Darrick J. Wong wrote:
> On Thu, May 08, 2025 at 11:19:37AM -0400, Brian Foster wrote:
> > On Thu, May 08, 2025 at 08:04:46AM -0700, Darrick J. Wong wrote:
> > > On Thu, May 08, 2025 at 10:30:30AM -0400, Brian Foster wrote:
> > > > On Thu, May 08, 2025 at 03:34:27PM +0200, Andreas Gruenbacher wrote:
> > > > > Since commit eb65540aa9fc ("iomap: warn on zero range of a post-eof
> > > > > folio"), iomap_zero_range() warns when asked to zero a folio beyond eof.
> > > > > The warning triggers on the following code path:
> > > 
> > > Which warning?  This one?
> > > 
> > > 	/* warn about zeroing folios beyond eof that won't write back */
> > > 	WARN_ON_ONCE(folio_pos(folio) > iter->inode->i_size);
> > > 
> > > If so, then why are there folios that start entirely beyond EOF?
> > > 
> > 
> > Yeah.. this gfs2 instance is simply a case of their punch hole mechanism
> > does unconditional partial folio zeroing via iomap zero range, so if a
> > punch hole occurs on some unaligned range of post-eof blocks, it will
> > basically create and perform zeroing of post-eof folios. IIUC the caveat
> > here is that these blocks are all zeroed on alloc (unwritten extents are
> > apparently not a thing in gfs2), so the punch time zeroing and warning
> > are spurious. Andreas can correct me if I have any of that wrong.
> 
> Oh, right, because iomap_zero_iter calls iomap_write_begin, which
> allocates a new folio completely beyond EOF, and then we see that new
> folio and WARN about it before scribbling on the folio and dirtying it.
> Correct?
> 
> If so then yeah, it doesn't seem useful to do that... unless the file
> size immediately gets extended such that at least one byte of the dirty
> folio is within EOF.  Even then, that seems like a stretch...
> 

Yep, agreed. An i_size update after a zero op would come after locks are
dropped and whatnot as well, so seems racy and wrong.

> > > > > 
> > > > >   gfs2_fallocate(FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE)
> > > > >     __gfs2_punch_hole()
> > > > >       gfs2_block_zero_range()
> > > > >         iomap_zero_range()
> > > > > 
> > > > > So far, gfs2 is just zeroing out partial pages at the beginning and end
> > > > > of the range, whether beyond eof or not.  The data beyond eof is already
> > > > > expected to be all zeroes, though.  Truncate the range passed to
> > > > > iomap_zero_range().
> > > > > 
> > > > > As an alternative approach, we could also implicitly truncate the range
> > > > > inside iomap_zero_range() instead of issuing a warning.  Any thoughts?
> > > > > 
> > > > 
> > > > Thanks Andreas. The more I think about this the more it seems like
> > > > lifting this logic into iomap is a reasonable compromise between just
> > > > dropping the warning and forcing individual filesystems to work around
> > > > it. The original intent of the warning was to have something to catch
> > > > subtle bad behavior since zero range did update i_size for so long.
> > > > 
> > > > OTOH I think it's reasonable to argue that we shouldn't need to warn in
> > > > situations where we could just enforce correct behavior. Also, I believe
> > > > we introduced something similar to avoid post-eof weirdness wrt unshare
> > > > range [1], so precedent exists.
> > > > 
> > > > I'm interested if others have opinions on the iomap side.. (though as I
> > > > write this it looks like hch sits on the side of not tweaking iomap).
> > > 
> > > IIRC XFS calls iomap_zero_range during file extending operations to zero
> > > the tail of a folio that spans EOF, so you'd have to allow for that too.
> > > 
> > 
> > Yeah, good point. Perhaps we'd want to bail on a folio that starts
> > beyond EOF with this approach, similar to the warning logic.
> 
> ...because I don't see much use in zeroing and dirtying a folio that
> starts well beyond EOF since iomap_writepage_handle_eof will ignore it
> and there are several gigantic comments in buffered-io.c about clamping
> to EOF.
> 
> <shrug> But maybe I'm missing a usecase?
> 

Not that I'm aware of.

Brian

> --D
> 
> > Brian
> > 
> > > --D
> > > 
> > > > Brian
> > > > 
> > > > [1] a311a08a4237 ("iomap: constrain the file range passed to iomap_file_unshare")
> > > > 
> > > > > Thanks,
> > > > > Andreas
> > > > > 
> > > > > --
> > > > > 
> > > > > Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
> > > > > 
> > > > > diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
> > > > > index b81984def58e..d9a4309cd414 100644
> > > > > --- a/fs/gfs2/bmap.c
> > > > > +++ b/fs/gfs2/bmap.c
> > > > > @@ -1301,6 +1301,10 @@ static int gfs2_block_zero_range(struct inode *inode, loff_t from,
> > > > >  				 unsigned int length)
> > > > >  {
> > > > >  	BUG_ON(current->journal_info);
> > > > > +	if (from > inode->i_size)
> > > > > +		return 0;
> > > > > +	if (from + length > inode->i_size)
> > > > > +		length = inode->i_size - from;
> > > > >  	return iomap_zero_range(inode, from, length, NULL, &gfs2_iomap_ops,
> > > > >  			NULL);
> > > > >  }
> > > > > 
> > > > 
> > > > 
> > > 
> > 
> > 
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] gfs2: Do not call iomap_zero_range beyond eof
  2025-05-08 16:04       ` Darrick J. Wong
  2025-05-08 16:34         ` Brian Foster
@ 2025-05-08 16:36         ` Andreas Gruenbacher
  1 sibling, 0 replies; 8+ messages in thread
From: Andreas Gruenbacher @ 2025-05-08 16:36 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Brian Foster, Christian Brauner, Christoph Hellwig,
	Matthew Wilcox, linux-xfs, linux-fsdevel, gfs2

On Thu, May 8, 2025 at 6:04 PM Darrick J. Wong <djwong@kernel.org> wrote:
> On Thu, May 08, 2025 at 11:19:37AM -0400, Brian Foster wrote:
> > On Thu, May 08, 2025 at 08:04:46AM -0700, Darrick J. Wong wrote:
> > > On Thu, May 08, 2025 at 10:30:30AM -0400, Brian Foster wrote:
> > > > On Thu, May 08, 2025 at 03:34:27PM +0200, Andreas Gruenbacher wrote:
> > > > > Since commit eb65540aa9fc ("iomap: warn on zero range of a post-eof
> > > > > folio"), iomap_zero_range() warns when asked to zero a folio beyond eof.
> > > > > The warning triggers on the following code path:
> > >
> > > Which warning?  This one?
> > >
> > >     /* warn about zeroing folios beyond eof that won't write back */
> > >     WARN_ON_ONCE(folio_pos(folio) > iter->inode->i_size);
> > >
> > > If so, then why are there folios that start entirely beyond EOF?
> > >
> >
> > Yeah.. this gfs2 instance is simply a case of their punch hole mechanism
> > does unconditional partial folio zeroing via iomap zero range, so if a
> > punch hole occurs on some unaligned range of post-eof blocks, it will
> > basically create and perform zeroing of post-eof folios. IIUC the caveat
> > here is that these blocks are all zeroed on alloc (unwritten extents are
> > apparently not a thing in gfs2), so the punch time zeroing and warning
> > are spurious. Andreas can correct me if I have any of that wrong.

gfs2 uses ext2-style indirect blocks. It doesn't have extents, delayed
allocation, or any kind of unwritten data tracking.

> Oh, right, because iomap_zero_iter calls iomap_write_begin, which
> allocates a new folio completely beyond EOF, and then we see that new
> folio and WARN about it before scribbling on the folio and dirtying it.
> Correct?

Yes, or at least that's also my understanding.

> If so then yeah, it doesn't seem useful to do that... unless the file
> size immediately gets extended such that at least one byte of the dirty
> folio is within EOF.  Even then, that seems like a stretch...

i_size isn't going to change in a punch hole operation:

int vfs_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
{
        [...]
        switch (mode & FALLOC_FL_MODE_MASK) {
        [...]
        case FALLOC_FL_PUNCH_HOLE:
                if (!(mode & FALLOC_FL_KEEP_SIZE))
                        return -EOPNOTSUPP;

> > > > >   gfs2_fallocate(FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE)
> > > > >     __gfs2_punch_hole()
> > > > >       gfs2_block_zero_range()
> > > > >         iomap_zero_range()
> > > > >
> > > > > So far, gfs2 is just zeroing out partial pages at the beginning and end
> > > > > of the range, whether beyond eof or not.  The data beyond eof is already
> > > > > expected to be all zeroes, though.  Truncate the range passed to
> > > > > iomap_zero_range().
> > > > >
> > > > > As an alternative approach, we could also implicitly truncate the range
> > > > > inside iomap_zero_range() instead of issuing a warning.  Any thoughts?
> > > > >
> > > >
> > > > Thanks Andreas. The more I think about this the more it seems like
> > > > lifting this logic into iomap is a reasonable compromise between just
> > > > dropping the warning and forcing individual filesystems to work around
> > > > it. The original intent of the warning was to have something to catch
> > > > subtle bad behavior since zero range did update i_size for so long.
> > > >
> > > > OTOH I think it's reasonable to argue that we shouldn't need to warn in
> > > > situations where we could just enforce correct behavior. Also, I believe
> > > > we introduced something similar to avoid post-eof weirdness wrt unshare
> > > > range [1], so precedent exists.
> > > >
> > > > I'm interested if others have opinions on the iomap side.. (though as I
> > > > write this it looks like hch sits on the side of not tweaking iomap).
> > >
> > > IIRC XFS calls iomap_zero_range during file extending operations to zero
> > > the tail of a folio that spans EOF, so you'd have to allow for that too.
> > >
> >
> > Yeah, good point. Perhaps we'd want to bail on a folio that starts
> > beyond EOF with this approach, similar to the warning logic.
>
> ...because I don't see much use in zeroing and dirtying a folio that
> starts well beyond EOF since iomap_writepage_handle_eof will ignore it
> and there are several gigantic comments in buffered-io.c about clamping
> to EOF.
>
> <shrug> But maybe I'm missing a usecase?

Unrelated to iomap_zero_range(), but gfs2 has this fairly scary
special case in which it does insist on writing folios beyond eof. See
gfs2_write_jdata_folio(); this was introduced and is described in
commit fd4c5748b8d3 ("gfs2: writeout truncated pages").

> --D
>
> > Brian
> >
> > > --D
> > >
> > > > Brian
> > > >
> > > > [1] a311a08a4237 ("iomap: constrain the file range passed to iomap_file_unshare")
> > > >
> > > > > Thanks,
> > > > > Andreas
> > > > >
> > > > > --
> > > > >
> > > > > Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
> > > > >
> > > > > diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
> > > > > index b81984def58e..d9a4309cd414 100644
> > > > > --- a/fs/gfs2/bmap.c
> > > > > +++ b/fs/gfs2/bmap.c
> > > > > @@ -1301,6 +1301,10 @@ static int gfs2_block_zero_range(struct inode *inode, loff_t from,
> > > > >                                  unsigned int length)
> > > > >  {
> > > > >         BUG_ON(current->journal_info);
> > > > > +       if (from > inode->i_size)
> > > > > +               return 0;
> > > > > +       if (from + length > inode->i_size)
> > > > > +               length = inode->i_size - from;
> > > > >         return iomap_zero_range(inode, from, length, NULL, &gfs2_iomap_ops,
> > > > >                         NULL);
> > > > >  }
> > > > >
> > > >
> > > >
> > >
> >
> >
>

Thanks,
Andreas


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2025-05-08 16:36 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-08 13:34 [RFC] gfs2: Do not call iomap_zero_range beyond eof Andreas Gruenbacher
2025-05-08 14:17 ` Christoph Hellwig
2025-05-08 14:30 ` Brian Foster
2025-05-08 15:04   ` Darrick J. Wong
2025-05-08 15:19     ` Brian Foster
2025-05-08 16:04       ` Darrick J. Wong
2025-05-08 16:34         ` Brian Foster
2025-05-08 16:36         ` Andreas Gruenbacher

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.