linux-fsdevel.vger.kernel.org archive mirror
* [PATCH] iomap: keep on increasing i_size in iomap_write_end()
@ 2024-06-03 11:22 Zhang Yi
  2024-06-04  4:08 ` Christoph Hellwig
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Zhang Yi @ 2024-06-03 11:22 UTC (permalink / raw)
  To: linux-xfs, linux-fsdevel
  Cc: linux-kernel, djwong, hch, brauner, david, chandanbabu, jack,
	yi.zhang, yi.zhang, chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

Commit 943bc0882ceb ("iomap: don't increase i_size if it's not a write
operation") breaks xfs with a realtime device on generic/561. The problem
is that when an unaligned truncate down is done on an xfs realtime inode
with rtextsize > 1 fs block, xfs zeroes only the EOF block and does not
zero the tail blocks up to the rtextsize-aligned boundary. So if we don't
increase i_size in iomap_write_end(), stale data could be exposed after
an append write beyond the aligned EOF block.

xfs should zero out the tail blocks on truncate down, but until that is
done, fix the issue by simply reverting the changes in
iomap_write_end().

Fixes: 943bc0882ceb ("iomap: don't increase i_size if it's not a write operation")
Reported-by: Chandan Babu R <chandanbabu@kernel.org>
Link: https://lore.kernel.org/linux-xfs/0b92a215-9d9b-3788-4504-a520778953c2@huaweicloud.com
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
 fs/iomap/buffered-io.c | 53 +++++++++++++++++++-----------------------
 1 file changed, 24 insertions(+), 29 deletions(-)
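
The byte range described in the commit message can be worked out with
simple rounding. The following is a minimal user-space sketch, not kernel
code: the helper names, the struct, and the example sizes (4096-byte
blocks, rtextsize of 4 blocks, truncate to 5000 bytes) are all assumptions
made for illustration. It computes the range that lies past the zeroed EOF
block but still inside the realtime extent that contains the new EOF.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to the next multiple of align (align must be non-zero). */
static uint64_t round_up_to(uint64_t x, uint64_t align)
{
	assert(align != 0);
	return (x + align - 1) / align * align;
}

/* Hypothetical type: byte range [start, end) that truncate did not zero. */
struct stale_range {
	uint64_t start;	/* first byte past the zeroed EOF block */
	uint64_t end;	/* end of the rt extent containing the new EOF */
};

/*
 * Illustrative only: given the new file size after an unaligned truncate
 * down, return the tail blocks that follow the EOF block within the same
 * rtextsize-aligned extent. The range is empty when rtext_blocks == 1.
 */
static struct stale_range stale_tail_after_truncate(uint64_t new_size,
		uint64_t block_size, uint64_t rtext_blocks)
{
	struct stale_range r;

	r.start = round_up_to(new_size, block_size);
	r.end = round_up_to(new_size, block_size * rtext_blocks);
	return r;
}
```

With the assumed sizes above, truncating to 5000 bytes gives the range
[8192, 16384): three blocks that may still hold old data.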

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index c5802a459334..bd70fcbc168e 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -877,22 +877,37 @@ static bool iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len,
 		size_t copied, struct folio *folio)
 {
 	const struct iomap *srcmap = iomap_iter_srcmap(iter);
+	loff_t old_size = iter->inode->i_size;
+	size_t written;
 
 	if (srcmap->type == IOMAP_INLINE) {
 		iomap_write_end_inline(iter, folio, pos, copied);
-		return true;
+		written = copied;
+	} else if (srcmap->flags & IOMAP_F_BUFFER_HEAD) {
+		written = block_write_end(NULL, iter->inode->i_mapping, pos,
+					len, copied, &folio->page, NULL);
+		WARN_ON_ONCE(written != copied && written != 0);
+	} else {
+		written = __iomap_write_end(iter->inode, pos, len, copied,
+					    folio) ? copied : 0;
 	}
 
-	if (srcmap->flags & IOMAP_F_BUFFER_HEAD) {
-		size_t bh_written;
-
-		bh_written = block_write_end(NULL, iter->inode->i_mapping, pos,
-					len, copied, &folio->page, NULL);
-		WARN_ON_ONCE(bh_written != copied && bh_written != 0);
-		return bh_written == copied;
+	/*
+	 * Update the in-memory inode size after copying the data into the page
+	 * cache.  It's up to the file system to write the updated size to disk,
+	 * preferably after I/O completion so that no stale data is exposed.
+	 * Only once that's done can we unlock and release the folio.
+	 */
+	if (pos + written > old_size) {
+		i_size_write(iter->inode, pos + written);
+		iter->iomap.flags |= IOMAP_F_SIZE_CHANGED;
 	}
+	__iomap_put_folio(iter, pos, written, folio);
 
-	return __iomap_write_end(iter->inode, pos, len, copied, folio);
+	if (old_size < pos)
+		pagecache_isize_extended(iter->inode, old_size, pos);
+
+	return written == copied;
 }
 
 static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
@@ -907,7 +922,6 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
 
 	do {
 		struct folio *folio;
-		loff_t old_size;
 		size_t offset;		/* Offset into folio */
 		size_t bytes;		/* Bytes to write to folio */
 		size_t copied;		/* Bytes copied from user */
@@ -959,23 +973,6 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
 		written = iomap_write_end(iter, pos, bytes, copied, folio) ?
 			  copied : 0;
 
-		/*
-		 * Update the in-memory inode size after copying the data into
-		 * the page cache.  It's up to the file system to write the
-		 * updated size to disk, preferably after I/O completion so that
-		 * no stale data is exposed.  Only once that's done can we
-		 * unlock and release the folio.
-		 */
-		old_size = iter->inode->i_size;
-		if (pos + written > old_size) {
-			i_size_write(iter->inode, pos + written);
-			iter->iomap.flags |= IOMAP_F_SIZE_CHANGED;
-		}
-		__iomap_put_folio(iter, pos, written, folio);
-
-		if (old_size < pos)
-			pagecache_isize_extended(iter->inode, old_size, pos);
-
 		cond_resched();
 		if (unlikely(written == 0)) {
 			/*
@@ -1346,7 +1343,6 @@ static loff_t iomap_unshare_iter(struct iomap_iter *iter)
 			bytes = folio_size(folio) - offset;
 
 		ret = iomap_write_end(iter, pos, bytes, bytes, folio);
-		__iomap_put_folio(iter, pos, bytes, folio);
 		if (WARN_ON_ONCE(!ret))
 			return -EIO;
 
@@ -1412,7 +1408,6 @@ static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
 		folio_mark_accessed(folio);
 
 		ret = iomap_write_end(iter, pos, bytes, bytes, folio);
-		__iomap_put_folio(iter, pos, bytes, folio);
 		if (WARN_ON_ONCE(!ret))
 			return -EIO;
 
-- 
2.39.2
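
For readers who want to follow the restored size-update logic outside the
kernel, here is a small user-space model of the tail of iomap_write_end()
as it looks after this patch. It is only a sketch: struct model_inode and
model_write_end() are hypothetical names, the size_changed field stands in
for IOMAP_F_SIZE_CHANGED, and folio handling and
pagecache_isize_extended() are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the inode and the iomap flag state. */
struct model_inode {
	int64_t i_size;
	bool size_changed;	/* models IOMAP_F_SIZE_CHANGED */
};

/*
 * Models the size update done for every caller of iomap_write_end() after
 * this patch: grow the in-memory size when the written range ends beyond
 * it, then report whether the whole copy was accepted.
 */
static bool model_write_end(struct model_inode *inode, int64_t pos,
		size_t copied, size_t written)
{
	/* As in the patch, a write is either fully accepted or dropped. */
	assert(written == copied || written == 0);

	if (pos + (int64_t)written > inode->i_size) {
		inode->i_size = pos + (int64_t)written;
		inode->size_changed = true;
	}
	return written == copied;
}
```

Because the update sits in the common helper in this model, a zeroing or
unshare pass that ends past the current size grows it in the same way as a
regular buffered write, which is the behaviour the patch restores.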



* Re: [PATCH] iomap: keep on increasing i_size in iomap_write_end()
  2024-06-03 11:22 [PATCH] iomap: keep on increasing i_size in iomap_write_end() Zhang Yi
@ 2024-06-04  4:08 ` Christoph Hellwig
  2024-06-04  7:10   ` Zhang Yi
  2024-06-05 15:24 ` Christian Brauner
  2024-06-06  5:45 ` Chandan Babu R
  2 siblings, 1 reply; 5+ messages in thread
From: Christoph Hellwig @ 2024-06-04  4:08 UTC (permalink / raw)
  To: Zhang Yi
  Cc: linux-xfs, linux-fsdevel, linux-kernel, djwong, hch, brauner,
	david, chandanbabu, jack, yi.zhang, chengzhihao1, yukuai3

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

hopefully we can bring it back soon.



* Re: [PATCH] iomap: keep on increasing i_size in iomap_write_end()
  2024-06-04  4:08 ` Christoph Hellwig
@ 2024-06-04  7:10   ` Zhang Yi
  0 siblings, 0 replies; 5+ messages in thread
From: Zhang Yi @ 2024-06-04  7:10 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-xfs, linux-fsdevel, linux-kernel, djwong, brauner, david,
	chandanbabu, jack, yi.zhang, chengzhihao1, yukuai3

On 2024/6/4 12:08, Christoph Hellwig wrote:
> Looks good:
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> 
> hopefully we can bring it back soon.
> 
Yeah, it will :)

Thanks,
Yi.



* Re: [PATCH] iomap: keep on increasing i_size in iomap_write_end()
  2024-06-03 11:22 [PATCH] iomap: keep on increasing i_size in iomap_write_end() Zhang Yi
  2024-06-04  4:08 ` Christoph Hellwig
@ 2024-06-05 15:24 ` Christian Brauner
  2024-06-06  5:45 ` Chandan Babu R
  2 siblings, 0 replies; 5+ messages in thread
From: Christian Brauner @ 2024-06-05 15:24 UTC (permalink / raw)
  To: Zhang Yi
  Cc: Christian Brauner, linux-kernel, djwong, hch, david, chandanbabu,
	jack, yi.zhang, chengzhihao1, yukuai3, linux-xfs, linux-fsdevel

On Mon, 03 Jun 2024 19:22:22 +0800, Zhang Yi wrote:
> Commit 943bc0882ceb ("iomap: don't increase i_size if it's not a write
> operation") breaks xfs with a realtime device on generic/561. The problem
> is that when an unaligned truncate down is done on an xfs realtime inode
> with rtextsize > 1 fs block, xfs zeroes only the EOF block and does not
> zero the tail blocks up to the rtextsize-aligned boundary. So if we don't
> increase i_size in iomap_write_end(), stale data could be exposed after
> an append write beyond the aligned EOF block.
> 
> [...]

Applied to the vfs.fixes branch of the vfs/vfs.git tree.
Patches in the vfs.fixes branch should appear in linux-next soon.

Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.

It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.

Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs.fixes

[1/1] iomap: keep on increasing i_size in iomap_write_end()
      https://git.kernel.org/vfs/vfs/c/86e71b5f0366


* Re: [PATCH] iomap: keep on increasing i_size in iomap_write_end()
  2024-06-03 11:22 [PATCH] iomap: keep on increasing i_size in iomap_write_end() Zhang Yi
  2024-06-04  4:08 ` Christoph Hellwig
  2024-06-05 15:24 ` Christian Brauner
@ 2024-06-06  5:45 ` Chandan Babu R
  2 siblings, 0 replies; 5+ messages in thread
From: Chandan Babu R @ 2024-06-06  5:45 UTC (permalink / raw)
  To: Zhang Yi
  Cc: linux-xfs, linux-fsdevel, linux-kernel, djwong, hch, brauner,
	david, jack, yi.zhang, chengzhihao1, yukuai3

On Mon, Jun 03, 2024 at 07:22:22 PM +0800, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@huawei.com>
>
> Commit 943bc0882ceb ("iomap: don't increase i_size if it's not a write
> operation") breaks xfs with a realtime device on generic/561. The problem
> is that when an unaligned truncate down is done on an xfs realtime inode
> with rtextsize > 1 fs block, xfs zeroes only the EOF block and does not
> zero the tail blocks up to the rtextsize-aligned boundary. So if we don't
> increase i_size in iomap_write_end(), stale data could be exposed after
> an append write beyond the aligned EOF block.
>
> xfs should zero out the tail blocks on truncate down, but until that is
> done, fix the issue by simply reverting the changes in
> iomap_write_end().

I didn't notice any regressions with this patch applied. Hence,

Tested-by: Chandan Babu R <chandanbabu@kernel.org>

-- 
Chandan
