From: Christoph Hellwig <hch@infradead.org>
To: Zhang Yi <yi.zhang@huaweicloud.com>
Cc: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org, djwong@kernel.org,
hch@infradead.org, brauner@kernel.org, david@fromorbit.com,
chandanbabu@kernel.org, jack@suse.cz, willy@infradead.org,
yi.zhang@huawei.com, chengzhihao1@huawei.com, yukuai3@huawei.com
Subject: Re: [RFC PATCH v4 1/8] iomap: zeroing needs to be pagecache aware
Date: Fri, 31 May 2024 06:11:25 -0700 [thread overview]
Message-ID: <ZlnMfSJcm5k6Dg_e@infradead.org> (raw)
In-Reply-To: <20240529095206.2568162-2-yi.zhang@huaweicloud.com>
On Wed, May 29, 2024 at 05:51:59PM +0800, Zhang Yi wrote:
> XXX: how do we detect a iomap containing a cow mapping over a hole
> in iomap_zero_iter()? The XFS code implies this case also needs to
> zero the page cache if there is data present, so trigger for page
> cache lookup only in iomap_zero_iter() needs to handle this case as
> well.
If there is no data in the page cache and either a whole or unwritten
extent it really should not matter what is in the COW fork, a there
obviously isn't any data we could zero.
If there is data in the page cache for something that is marked as
a hole in the srcmap, but we have data in the COW fork due to
COW extsize preallocation we'd need to zero it, but as the
xfs iomap ops don't return a separate srcmap for that case we
should be fine. Or am I missing something?
> + * Note: when zeroing unwritten extents, we might have data in the page cache
> + * over an unwritten extent. In this case, we want to do a pure lookup on the
> + * page cache and not create a new folio as we don't need to perform zeroing on
> + * unwritten extents if there is no cached data over the given range.
> */
> struct folio *iomap_get_folio(struct iomap_iter *iter, loff_t pos, size_t len)
> {
> fgf_t fgp = FGP_WRITEBEGIN | FGP_NOFS;
>
> + if (iter->flags & IOMAP_ZERO) {
> + const struct iomap *srcmap = iomap_iter_srcmap(iter);
> +
> + if (srcmap->type == IOMAP_UNWRITTEN)
> + fgp &= ~FGP_CREAT;
> + }
Nit: The comment would probably stand out a little better if it was
right next to the IOMAP_ZERO conditional instead of above the
function.
> + if (status) {
> + if (status == -ENOENT) {
> + /*
> + * Unwritten extents need to have page cache
> + * lookups done to determine if they have data
> + * over them that needs zeroing. If there is no
> + * data, we'll get -ENOENT returned here, so we
> + * can just skip over this index.
> + */
> + WARN_ON_ONCE(srcmap->type != IOMAP_UNWRITTEN);
I'd return -EIO if the WARN_ON triggers.
> +loop_continue:
While I'm no strange to gotos for loop control something trips me
up about jumping to the end of the loop. Here is what I could come
up with instead. Not arguing it's objectively better, but I somehow
like it a little better:
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 700b22d6807783..81378f7cd8d7ff 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1412,49 +1412,56 @@ static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
bool ret;
status = iomap_write_begin(iter, pos, bytes, &folio);
- if (status) {
- if (status == -ENOENT) {
- /*
- * Unwritten extents need to have page cache
- * lookups done to determine if they have data
- * over them that needs zeroing. If there is no
- * data, we'll get -ENOENT returned here, so we
- * can just skip over this index.
- */
- WARN_ON_ONCE(srcmap->type != IOMAP_UNWRITTEN);
- if (bytes > PAGE_SIZE - offset_in_page(pos))
- bytes = PAGE_SIZE - offset_in_page(pos);
- goto loop_continue;
- }
+ if (status && status != -ENOENT)
return status;
- }
- if (iter->iomap.flags & IOMAP_F_STALE)
- break;
- offset = offset_in_folio(folio, pos);
- if (bytes > folio_size(folio) - offset)
- bytes = folio_size(folio) - offset;
+ if (status == -ENOENT) {
+ /*
+ * If we end up here, we did not find a folio in the
+ * page cache for an unwritten extent and thus can
+ * skip over the range.
+ */
+ if (WARN_ON_ONCE(srcmap->type != IOMAP_UNWRITTEN))
+ return -EIO;
- /*
- * If the folio over an unwritten extent is clean (i.e. because
- * it has been read from), then it already contains zeros. Hence
- * we can just skip it.
- */
- if (srcmap->type == IOMAP_UNWRITTEN &&
- !folio_test_dirty(folio)) {
- folio_unlock(folio);
- goto loop_continue;
+ /*
+ * XXX: It would be nice if we could get the offset of
+ * the next entry in the pagecache so that we don't have
+ * to iterate one page at a time here.
+ */
+ offset = offset_in_page(pos);
+ if (bytes > PAGE_SIZE - offset)
+ bytes = PAGE_SIZE - offset;
+ } else {
+ if (iter->iomap.flags & IOMAP_F_STALE)
+ break;
+
+ offset = offset_in_folio(folio, pos);
+ if (bytes > folio_size(folio) - offset)
+ bytes = folio_size(folio) - offset;
+
+ /*
+ * If the folio over an unwritten extent is clean (i.e.
+ * because it has only been read from), then it already
+ * contains zeros. Hence we can just skip it.
+ */
+ if (srcmap->type == IOMAP_UNWRITTEN &&
+ !folio_test_dirty(folio)) {
+ folio_unlock(folio);
+ status = -ENOENT;
+ }
}
- folio_zero_range(folio, offset, bytes);
- folio_mark_accessed(folio);
+ if (status != -ENOENT) {
+ folio_zero_range(folio, offset, bytes);
+ folio_mark_accessed(folio);
- ret = iomap_write_end(iter, pos, bytes, bytes, folio);
- __iomap_put_folio(iter, pos, bytes, folio);
- if (WARN_ON_ONCE(!ret))
- return -EIO;
+ ret = iomap_write_end(iter, pos, bytes, bytes, folio);
+ __iomap_put_folio(iter, pos, bytes, folio);
+ if (WARN_ON_ONCE(!ret))
+ return -EIO;
+ }
-loop_continue:
pos += bytes;
length -= bytes;
written += bytes;
next prev parent reply other threads:[~2024-05-31 13:11 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-29 9:51 [RFC PATCH v4 0/8] iomap/xfs: fix stale data exposure when truncating realtime inodes Zhang Yi
2024-05-29 9:51 ` [RFC PATCH v4 1/8] iomap: zeroing needs to be pagecache aware Zhang Yi
2024-05-31 13:11 ` Christoph Hellwig [this message]
2024-05-31 14:03 ` Darrick J. Wong
2024-05-31 14:05 ` Christoph Hellwig
2024-05-31 15:44 ` Brian Foster
2024-05-31 15:43 ` Brian Foster
2024-06-02 22:22 ` Dave Chinner
2024-06-02 11:04 ` Brian Foster
2024-06-03 9:07 ` Zhang Yi
2024-06-03 14:37 ` Brian Foster
2024-06-04 23:38 ` Dave Chinner
2024-05-29 9:52 ` [RFC PATCH v4 2/8] math64: add rem_u64() to just return the remainder Zhang Yi
2024-05-31 12:35 ` Christoph Hellwig
2024-05-31 14:04 ` Darrick J. Wong
2024-05-29 9:52 ` [RFC PATCH v4 3/8] iomap: pass blocksize to iomap_truncate_page() Zhang Yi
2024-05-31 12:39 ` Christoph Hellwig
2024-06-02 11:16 ` Brian Foster
2024-06-03 13:23 ` Zhang Yi
2024-05-29 9:52 ` [RFC PATCH v4 4/8] fsdax: pass blocksize to dax_truncate_page() Zhang Yi
2024-05-29 9:52 ` [RFC PATCH v4 5/8] xfs: refactor the truncating order Zhang Yi
2024-05-31 13:31 ` Christoph Hellwig
2024-05-31 15:27 ` Darrick J. Wong
2024-05-31 16:17 ` Christoph Hellwig
2024-06-03 13:51 ` Zhang Yi
2024-05-31 15:44 ` Darrick J. Wong
2024-06-03 14:15 ` Zhang Yi
2024-06-02 22:46 ` Dave Chinner
2024-06-03 14:18 ` Zhang Yi
2024-05-29 9:52 ` [RFC PATCH v4 6/8] xfs: correct the truncate blocksize of realtime inode Zhang Yi
2024-05-31 13:36 ` Christoph Hellwig
2024-06-03 14:35 ` Zhang Yi
2024-05-29 9:52 ` [RFC PATCH v4 7/8] xfs: reserve blocks for truncating " Zhang Yi
2024-05-31 12:42 ` Christoph Hellwig
2024-05-31 14:10 ` Darrick J. Wong
2024-05-31 14:13 ` Christoph Hellwig
2024-05-31 15:29 ` Darrick J. Wong
2024-05-31 16:17 ` Christoph Hellwig
2024-05-29 9:52 ` [RFC PATCH v4 8/8] xfs: improve truncate on a realtime inode with huge extsize Zhang Yi
2024-05-31 13:46 ` Christoph Hellwig
2024-05-31 14:12 ` Darrick J. Wong
2024-05-31 14:15 ` Christoph Hellwig
2024-05-31 15:00 ` Darrick J. Wong
2024-06-04 7:09 ` Zhang Yi
2024-05-31 12:26 ` [RFC PATCH v4 0/8] iomap/xfs: fix stale data exposure when truncating realtime inodes Christoph Hellwig
2024-06-01 7:38 ` Zhang Yi
2024-06-01 7:40 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZlnMfSJcm5k6Dg_e@infradead.org \
--to=hch@infradead.org \
--cc=brauner@kernel.org \
--cc=chandanbabu@kernel.org \
--cc=chengzhihao1@huawei.com \
--cc=david@fromorbit.com \
--cc=djwong@kernel.org \
--cc=jack@suse.cz \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=willy@infradead.org \
--cc=yi.zhang@huawei.com \
--cc=yi.zhang@huaweicloud.com \
--cc=yukuai3@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).