All of lore.kernel.org
 help / color / mirror / Atom feed
From: Christoph Hellwig <hch@infradead.org>
To: Zhang Yi <yi.zhang@huaweicloud.com>
Cc: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, djwong@kernel.org,
	hch@infradead.org, brauner@kernel.org, david@fromorbit.com,
	chandanbabu@kernel.org, jack@suse.cz, willy@infradead.org,
	yi.zhang@huawei.com, chengzhihao1@huawei.com, yukuai3@huawei.com
Subject: Re: [RFC PATCH v4 1/8] iomap: zeroing needs to be pagecache aware
Date: Fri, 31 May 2024 06:11:25 -0700	[thread overview]
Message-ID: <ZlnMfSJcm5k6Dg_e@infradead.org> (raw)
In-Reply-To: <20240529095206.2568162-2-yi.zhang@huaweicloud.com>

On Wed, May 29, 2024 at 05:51:59PM +0800, Zhang Yi wrote:
> XXX: how do we detect a iomap containing a cow mapping over a hole
> in iomap_zero_iter()? The XFS code implies this case also needs to
> zero the page cache if there is data present, so trigger for page
> cache lookup only in iomap_zero_iter() needs to handle this case as
> well.

If there is no data in the page cache and either a whole or unwritten
extent it really should not matter what is in the COW fork, a there
obviously isn't any data we could zero.

If there is data in the page cache for something that is marked as
a hole in the srcmap, but we have data in the COW fork due to
COW extsize preallocation we'd need to zero it, but as the
xfs iomap ops don't return a separate srcmap for that case we
should be fine.  Or am I missing something?

> + * Note: when zeroing unwritten extents, we might have data in the page cache
> + * over an unwritten extent. In this case, we want to do a pure lookup on the
> + * page cache and not create a new folio as we don't need to perform zeroing on
> + * unwritten extents if there is no cached data over the given range.
>   */
>  struct folio *iomap_get_folio(struct iomap_iter *iter, loff_t pos, size_t len)
>  {
>  	fgf_t fgp = FGP_WRITEBEGIN | FGP_NOFS;
>  
> +	if (iter->flags & IOMAP_ZERO) {
> +		const struct iomap *srcmap = iomap_iter_srcmap(iter);
> +
> +		if (srcmap->type == IOMAP_UNWRITTEN)
> +			fgp &= ~FGP_CREAT;
> +	}

Nit:  The comment would probably stand out a little better if it was
right next to the IOMAP_ZERO conditional instead of above the
function.

> +		if (status) {
> +			if (status == -ENOENT) {
> +				/*
> +				 * Unwritten extents need to have page cache
> +				 * lookups done to determine if they have data
> +				 * over them that needs zeroing. If there is no
> +				 * data, we'll get -ENOENT returned here, so we
> +				 * can just skip over this index.
> +				 */
> +				WARN_ON_ONCE(srcmap->type != IOMAP_UNWRITTEN);

I'd return -EIO if the WARN_ON triggers.

> +loop_continue:

While I'm no strange to gotos for loop control something trips me
up about jumping to the end of the loop.  Here is what I could come
up with instead.  Not arguing it's objectively better, but I somehow
like it a little better:

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 700b22d6807783..81378f7cd8d7ff 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1412,49 +1412,56 @@ static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
 		bool ret;
 
 		status = iomap_write_begin(iter, pos, bytes, &folio);
-		if (status) {
-			if (status == -ENOENT) {
-				/*
-				 * Unwritten extents need to have page cache
-				 * lookups done to determine if they have data
-				 * over them that needs zeroing. If there is no
-				 * data, we'll get -ENOENT returned here, so we
-				 * can just skip over this index.
-				 */
-				WARN_ON_ONCE(srcmap->type != IOMAP_UNWRITTEN);
-				if (bytes > PAGE_SIZE - offset_in_page(pos))
-					bytes = PAGE_SIZE - offset_in_page(pos);
-				goto loop_continue;
-			}
+		if (status && status != -ENOENT)
 			return status;
-		}
-		if (iter->iomap.flags & IOMAP_F_STALE)
-			break;
 
-		offset = offset_in_folio(folio, pos);
-		if (bytes > folio_size(folio) - offset)
-			bytes = folio_size(folio) - offset;
+		if (status == -ENOENT) {
+			/*
+			 * If we end up here, we did not find a folio in the
+			 * page cache for an unwritten extent and thus can
+			 * skip over the range.
+			 */
+			if (WARN_ON_ONCE(srcmap->type != IOMAP_UNWRITTEN))
+				return -EIO;
 
-		/*
-		 * If the folio over an unwritten extent is clean (i.e. because
-		 * it has been read from), then it already contains zeros. Hence
-		 * we can just skip it.
-		 */
-		if (srcmap->type == IOMAP_UNWRITTEN &&
-		    !folio_test_dirty(folio)) {
-			folio_unlock(folio);
-			goto loop_continue;
+			/*
+			 * XXX: It would be nice if we could get the offset of
+			 * the next entry in the pagecache so that we don't have
+			 * to iterate one page at a time here.
+			 */
+			offset = offset_in_page(pos);
+			if (bytes > PAGE_SIZE - offset)
+				bytes = PAGE_SIZE - offset;
+		} else {
+			if (iter->iomap.flags & IOMAP_F_STALE)
+				break;
+
+			offset = offset_in_folio(folio, pos);
+			if (bytes > folio_size(folio) - offset)
+				bytes = folio_size(folio) - offset;
+		
+			/*
+			 * If the folio over an unwritten extent is clean (i.e.
+			 * because it has only been read from), then it already
+			 * contains zeros.  Hence we can just skip it.
+			 */
+			if (srcmap->type == IOMAP_UNWRITTEN &&
+			    !folio_test_dirty(folio)) {
+				folio_unlock(folio);
+				status = -ENOENT;
+			}
 		}
 
-		folio_zero_range(folio, offset, bytes);
-		folio_mark_accessed(folio);
+		if (status != -ENOENT) {
+			folio_zero_range(folio, offset, bytes);
+			folio_mark_accessed(folio);
 
-		ret = iomap_write_end(iter, pos, bytes, bytes, folio);
-		__iomap_put_folio(iter, pos, bytes, folio);
-		if (WARN_ON_ONCE(!ret))
-			return -EIO;
+			ret = iomap_write_end(iter, pos, bytes, bytes, folio);
+			__iomap_put_folio(iter, pos, bytes, folio);
+			if (WARN_ON_ONCE(!ret))
+				return -EIO;
+		}
 
-loop_continue:
 		pos += bytes;
 		length -= bytes;
 		written += bytes;

  reply	other threads:[~2024-05-31 13:11 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-29  9:51 [RFC PATCH v4 0/8] iomap/xfs: fix stale data exposure when truncating realtime inodes Zhang Yi
2024-05-29  9:51 ` [RFC PATCH v4 1/8] iomap: zeroing needs to be pagecache aware Zhang Yi
2024-05-31 13:11   ` Christoph Hellwig [this message]
2024-05-31 14:03     ` Darrick J. Wong
2024-05-31 14:05       ` Christoph Hellwig
2024-05-31 15:44         ` Brian Foster
2024-05-31 15:43       ` Brian Foster
2024-06-02 22:22     ` Dave Chinner
2024-06-02 11:04   ` Brian Foster
2024-06-03  9:07     ` Zhang Yi
2024-06-03 14:37       ` Brian Foster
2024-06-04 23:38         ` Dave Chinner
2024-05-29  9:52 ` [RFC PATCH v4 2/8] math64: add rem_u64() to just return the remainder Zhang Yi
2024-05-31 12:35   ` Christoph Hellwig
2024-05-31 14:04   ` Darrick J. Wong
2024-05-29  9:52 ` [RFC PATCH v4 3/8] iomap: pass blocksize to iomap_truncate_page() Zhang Yi
2024-05-31 12:39   ` Christoph Hellwig
2024-06-02 11:16     ` Brian Foster
2024-06-03 13:23     ` Zhang Yi
2024-05-29  9:52 ` [RFC PATCH v4 4/8] fsdax: pass blocksize to dax_truncate_page() Zhang Yi
2024-05-29  9:52 ` [RFC PATCH v4 5/8] xfs: refactor the truncating order Zhang Yi
2024-05-31 13:31   ` Christoph Hellwig
2024-05-31 15:27     ` Darrick J. Wong
2024-05-31 16:17       ` Christoph Hellwig
2024-06-03 13:51       ` Zhang Yi
2024-05-31 15:44   ` Darrick J. Wong
2024-06-03 14:15     ` Zhang Yi
2024-06-02 22:46   ` Dave Chinner
2024-06-03 14:18     ` Zhang Yi
2024-05-29  9:52 ` [RFC PATCH v4 6/8] xfs: correct the truncate blocksize of realtime inode Zhang Yi
2024-05-31 13:36   ` Christoph Hellwig
2024-06-03 14:35     ` Zhang Yi
2024-05-29  9:52 ` [RFC PATCH v4 7/8] xfs: reserve blocks for truncating " Zhang Yi
2024-05-31 12:42   ` Christoph Hellwig
2024-05-31 14:10     ` Darrick J. Wong
2024-05-31 14:13       ` Christoph Hellwig
2024-05-31 15:29         ` Darrick J. Wong
2024-05-31 16:17           ` Christoph Hellwig
2024-05-29  9:52 ` [RFC PATCH v4 8/8] xfs: improve truncate on a realtime inode with huge extsize Zhang Yi
2024-05-31 13:46   ` Christoph Hellwig
2024-05-31 14:12     ` Darrick J. Wong
2024-05-31 14:15       ` Christoph Hellwig
2024-05-31 15:00         ` Darrick J. Wong
2024-06-04  7:09           ` Zhang Yi
2024-05-31 12:26 ` [RFC PATCH v4 0/8] iomap/xfs: fix stale data exposure when truncating realtime inodes Christoph Hellwig
2024-06-01  7:38   ` Zhang Yi
2024-06-01  7:40     ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZlnMfSJcm5k6Dg_e@infradead.org \
    --to=hch@infradead.org \
    --cc=brauner@kernel.org \
    --cc=chandanbabu@kernel.org \
    --cc=chengzhihao1@huawei.com \
    --cc=david@fromorbit.com \
    --cc=djwong@kernel.org \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=willy@infradead.org \
    --cc=yi.zhang@huawei.com \
    --cc=yi.zhang@huaweicloud.com \
    --cc=yukuai3@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.