stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* FAILED: patch "[PATCH] xfs: fix various problems in xfs_atomic_write_cow_iomap_begin" failed to apply to 6.17-stable tree
@ 2025-11-09  3:11 gregkh
  2025-11-09 23:20 ` [PATCH 6.17.y 1/2] xfs: fix delalloc write failures in software-provided atomic writes Sasha Levin
  0 siblings, 1 reply; 6+ messages in thread
From: gregkh @ 2025-11-09  3:11 UTC (permalink / raw)
  To: djwong, cem, john.g.garry; +Cc: stable


The patch below does not apply to the 6.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@vger.kernel.org>.

To reproduce the conflict and resubmit, you may use the following commands:

git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.17.y
git checkout FETCH_HEAD
git cherry-pick -x 8d7bba1e8314013ecc817a91624104ceb9352ddc
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2025110942-language-suspect-13bc@gregkh' --subject-prefix 'PATCH 6.17.y' HEAD^..

Possible dependencies:



thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

From 8d7bba1e8314013ecc817a91624104ceb9352ddc Mon Sep 17 00:00:00 2001
From: "Darrick J. Wong" <djwong@kernel.org>
Date: Tue, 4 Nov 2025 16:15:38 -0800
Subject: [PATCH] xfs: fix various problems in xfs_atomic_write_cow_iomap_begin

I think there are several things wrong with this function:

A) xfs_bmapi_write can return a much larger unwritten mapping than what
   the caller asked for.  We convert part of that range to written, but
   return the entire written mapping to iomap even though that's
   inaccurate.

B) The arguments to xfs_reflink_convert_cow_locked are wrong -- an
   unwritten mapping could be *smaller* than the write range (or even
   the hole range).  In this case, we convert too much file range to
   written state because we then return a smaller mapping to iomap.

C) It doesn't handle delalloc mappings.  This I covered in the patch
   that I already sent to the list.

D) Reassigning count_fsb to handle the hole means that if the second
   cmap lookup attempt succeeds (due to racing with someone else) we
   trim the mapping more than is strictly necessary.  The changing
   meaning of count_fsb makes this harder to notice.

E) The tracepoint is kinda wrong because @length is mutated.  That makes
   it harder to chase the data flows through this function because you
   can't just grep on the pos/bytecount strings.

F) We don't actually check that the br_state = XFS_EXT_NORM assignment
   is accurate, i.e that the cow fork actually contains a written
   mapping for the range we're interested in

G) Somewhat inadequate documentation of why we need to xfs_trim_extent
   so aggressively in this function.

H) Not sure why xfs_iomap_end_fsb is used here, the vfs already clamped
   the write range to s_maxbytes.

Fix these issues, and then the atomic writes regressions in generic/760,
generic/617, generic/091, generic/263, and generic/521 all go away for
me.

Cc: stable@vger.kernel.org # v6.16
Fixes: bd1d2c21d5d249 ("xfs: add xfs_atomic_write_cow_iomap_begin()")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 788bfdce608a..490e12cb99be 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1091,6 +1091,29 @@ const struct iomap_ops xfs_zoned_direct_write_iomap_ops = {
 };
 #endif /* CONFIG_XFS_RT */
 
+#ifdef DEBUG
+static void
+xfs_check_atomic_cow_conversion(
+	struct xfs_inode		*ip,
+	xfs_fileoff_t			offset_fsb,
+	xfs_filblks_t			count_fsb,
+	const struct xfs_bmbt_irec	*cmap)
+{
+	struct xfs_iext_cursor		icur;
+	struct xfs_bmbt_irec		cmap2 = { };
+
+	if (xfs_iext_lookup_extent(ip, ip->i_cowfp, offset_fsb, &icur, &cmap2))
+		xfs_trim_extent(&cmap2, offset_fsb, count_fsb);
+
+	ASSERT(cmap2.br_startoff == cmap->br_startoff);
+	ASSERT(cmap2.br_blockcount == cmap->br_blockcount);
+	ASSERT(cmap2.br_startblock == cmap->br_startblock);
+	ASSERT(cmap2.br_state == cmap->br_state);
+}
+#else
+# define xfs_check_atomic_cow_conversion(...)	((void)0)
+#endif
+
 static int
 xfs_atomic_write_cow_iomap_begin(
 	struct inode		*inode,
@@ -1102,9 +1125,10 @@ xfs_atomic_write_cow_iomap_begin(
 {
 	struct xfs_inode	*ip = XFS_I(inode);
 	struct xfs_mount	*mp = ip->i_mount;
-	const xfs_fileoff_t		offset_fsb = XFS_B_TO_FSBT(mp, offset);
-	xfs_fileoff_t		end_fsb = xfs_iomap_end_fsb(mp, offset, length);
-	xfs_filblks_t		count_fsb = end_fsb - offset_fsb;
+	const xfs_fileoff_t	offset_fsb = XFS_B_TO_FSBT(mp, offset);
+	const xfs_fileoff_t	end_fsb = XFS_B_TO_FSB(mp, offset + length);
+	const xfs_filblks_t	count_fsb = end_fsb - offset_fsb;
+	xfs_filblks_t		hole_count_fsb;
 	int			nmaps = 1;
 	xfs_filblks_t		resaligned;
 	struct xfs_bmbt_irec	cmap;
@@ -1143,14 +1167,20 @@ xfs_atomic_write_cow_iomap_begin(
 	if (cmap.br_startoff <= offset_fsb) {
 		if (isnullstartblock(cmap.br_startblock))
 			goto convert_delay;
+
+		/*
+		 * cmap could extend outside the write range due to previous
+		 * speculative preallocations.  We must trim cmap to the write
+		 * range because the cow fork treats written mappings to mean
+		 * "write in progress".
+		 */
 		xfs_trim_extent(&cmap, offset_fsb, count_fsb);
 		goto found;
 	}
 
-	end_fsb = cmap.br_startoff;
-	count_fsb = end_fsb - offset_fsb;
+	hole_count_fsb = cmap.br_startoff - offset_fsb;
 
-	resaligned = xfs_aligned_fsb_count(offset_fsb, count_fsb,
+	resaligned = xfs_aligned_fsb_count(offset_fsb, hole_count_fsb,
 			xfs_get_cowextsz_hint(ip));
 	xfs_iunlock(ip, XFS_ILOCK_EXCL);
 
@@ -1186,7 +1216,7 @@ xfs_atomic_write_cow_iomap_begin(
 	 * atomic writes to that same range will be aligned (and don't require
 	 * this COW-based method).
 	 */
-	error = xfs_bmapi_write(tp, ip, offset_fsb, count_fsb,
+	error = xfs_bmapi_write(tp, ip, offset_fsb, hole_count_fsb,
 			XFS_BMAPI_COWFORK | XFS_BMAPI_PREALLOC |
 			XFS_BMAPI_EXTSZALIGN, 0, &cmap, &nmaps);
 	if (error) {
@@ -1199,17 +1229,26 @@ xfs_atomic_write_cow_iomap_begin(
 	if (error)
 		goto out_unlock;
 
+	/*
+	 * cmap could map more blocks than the range we passed into bmapi_write
+	 * because of EXTSZALIGN or adjacent pre-existing unwritten mappings
+	 * that were merged.  Trim cmap to the original write range so that we
+	 * don't convert more than we were asked to do for this write.
+	 */
+	xfs_trim_extent(&cmap, offset_fsb, count_fsb);
+
 found:
 	if (cmap.br_state != XFS_EXT_NORM) {
-		error = xfs_reflink_convert_cow_locked(ip, offset_fsb,
-				count_fsb);
+		error = xfs_reflink_convert_cow_locked(ip, cmap.br_startoff,
+				cmap.br_blockcount);
 		if (error)
 			goto out_unlock;
 		cmap.br_state = XFS_EXT_NORM;
+		xfs_check_atomic_cow_conversion(ip, offset_fsb, count_fsb,
+				&cmap);
 	}
 
-	length = XFS_FSB_TO_B(mp, cmap.br_startoff + cmap.br_blockcount);
-	trace_xfs_iomap_found(ip, offset, length - offset, XFS_COW_FORK, &cmap);
+	trace_xfs_iomap_found(ip, offset, length, XFS_COW_FORK, &cmap);
 	seq = xfs_iomap_inode_sequence(ip, IOMAP_F_SHARED);
 	xfs_iunlock(ip, XFS_ILOCK_EXCL);
 	return xfs_bmbt_to_iomap(ip, iomap, &cmap, flags, IOMAP_F_SHARED, seq);


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 6.17.y 1/2] xfs: fix delalloc write failures in software-provided atomic writes
  2025-11-09  3:11 FAILED: patch "[PATCH] xfs: fix various problems in xfs_atomic_write_cow_iomap_begin" failed to apply to 6.17-stable tree gregkh
@ 2025-11-09 23:20 ` Sasha Levin
  2025-11-09 23:20   ` [PATCH 6.17.y 2/2] xfs: fix various problems in xfs_atomic_write_cow_iomap_begin Sasha Levin
  2025-11-21  9:47   ` [PATCH 6.17.y 1/2] xfs: fix delalloc write failures in software-provided atomic writes Greg KH
  0 siblings, 2 replies; 6+ messages in thread
From: Sasha Levin @ 2025-11-09 23:20 UTC (permalink / raw)
  To: stable; +Cc: Darrick J. Wong, John Garry, Carlos Maiolino, Sasha Levin

From: "Darrick J. Wong" <djwong@kernel.org>

[ Upstream commit 8d54eacd82a0623a963e0c150ad3b02970638b0d ]

With the 20 Oct 2025 release of fstests, generic/521 fails for me on
regular (aka non-block-atomic-writes) storage:

QA output created by 521
dowrite: write: Input/output error
LOG DUMP (8553 total operations):
1(  1 mod 256): SKIPPED (no operation)
2(  2 mod 256): WRITE    0x7e000 thru 0x8dfff	(0x10000 bytes) HOLE
3(  3 mod 256): READ     0x69000 thru 0x79fff	(0x11000 bytes)
4(  4 mod 256): FALLOC   0x53c38 thru 0x5e853	(0xac1b bytes) INTERIOR
5(  5 mod 256): COPY 0x55000 thru 0x59fff	(0x5000 bytes) to 0x25000 thru 0x29fff
6(  6 mod 256): WRITE    0x74000 thru 0x88fff	(0x15000 bytes)
7(  7 mod 256): ZERO     0xedb1 thru 0x11693	(0x28e3 bytes)

with a warning in dmesg from iomap about XFS trying to give it a
delalloc mapping for a directio write.  Fix the software atomic write
iomap_begin code to convert the reservation into a written mapping.
This doesn't fix the data corruption problems reported by generic/760,
but it's a start.

Cc: stable@vger.kernel.org # v6.16
Fixes: bd1d2c21d5d249 ("xfs: add xfs_atomic_write_cow_iomap_begin()")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Stable-dep-of: 8d7bba1e8314 ("xfs: fix various problems in xfs_atomic_write_cow_iomap_begin")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/xfs/xfs_iomap.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 2a74f29573410..a4a22975c7cc9 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1121,7 +1121,7 @@ xfs_atomic_write_cow_iomap_begin(
 		return -EAGAIN;
 
 	trace_xfs_iomap_atomic_write_cow(ip, offset, length);
-
+retry:
 	xfs_ilock(ip, XFS_ILOCK_EXCL);
 
 	if (!ip->i_cowfp) {
@@ -1132,6 +1132,8 @@ xfs_atomic_write_cow_iomap_begin(
 	if (!xfs_iext_lookup_extent(ip, ip->i_cowfp, offset_fsb, &icur, &cmap))
 		cmap.br_startoff = end_fsb;
 	if (cmap.br_startoff <= offset_fsb) {
+		if (isnullstartblock(cmap.br_startblock))
+			goto convert_delay;
 		xfs_trim_extent(&cmap, offset_fsb, count_fsb);
 		goto found;
 	}
@@ -1160,8 +1162,10 @@ xfs_atomic_write_cow_iomap_begin(
 	if (!xfs_iext_lookup_extent(ip, ip->i_cowfp, offset_fsb, &icur, &cmap))
 		cmap.br_startoff = end_fsb;
 	if (cmap.br_startoff <= offset_fsb) {
-		xfs_trim_extent(&cmap, offset_fsb, count_fsb);
 		xfs_trans_cancel(tp);
+		if (isnullstartblock(cmap.br_startblock))
+			goto convert_delay;
+		xfs_trim_extent(&cmap, offset_fsb, count_fsb);
 		goto found;
 	}
 
@@ -1201,6 +1205,19 @@ xfs_atomic_write_cow_iomap_begin(
 	xfs_iunlock(ip, XFS_ILOCK_EXCL);
 	return xfs_bmbt_to_iomap(ip, iomap, &cmap, flags, IOMAP_F_SHARED, seq);
 
+convert_delay:
+	xfs_iunlock(ip, XFS_ILOCK_EXCL);
+	error = xfs_bmapi_convert_delalloc(ip, XFS_COW_FORK, offset, iomap,
+			NULL);
+	if (error)
+		return error;
+
+	/*
+	 * Try the lookup again, because the delalloc conversion might have
+	 * turned the COW mapping into unwritten, but we need it to be in
+	 * written state.
+	 */
+	goto retry;
 out_unlock:
 	xfs_iunlock(ip, XFS_ILOCK_EXCL);
 	return error;
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 6.17.y 2/2] xfs: fix various problems in xfs_atomic_write_cow_iomap_begin
  2025-11-09 23:20 ` [PATCH 6.17.y 1/2] xfs: fix delalloc write failures in software-provided atomic writes Sasha Levin
@ 2025-11-09 23:20   ` Sasha Levin
  2025-11-21  9:47     ` Greg KH
  2025-11-21  9:47   ` [PATCH 6.17.y 1/2] xfs: fix delalloc write failures in software-provided atomic writes Greg KH
  1 sibling, 1 reply; 6+ messages in thread
From: Sasha Levin @ 2025-11-09 23:20 UTC (permalink / raw)
  To: stable; +Cc: Darrick J. Wong, John Garry, Carlos Maiolino, Sasha Levin

From: "Darrick J. Wong" <djwong@kernel.org>

[ Upstream commit 8d7bba1e8314013ecc817a91624104ceb9352ddc ]

I think there are several things wrong with this function:

A) xfs_bmapi_write can return a much larger unwritten mapping than what
   the caller asked for.  We convert part of that range to written, but
   return the entire written mapping to iomap even though that's
   inaccurate.

B) The arguments to xfs_reflink_convert_cow_locked are wrong -- an
   unwritten mapping could be *smaller* than the write range (or even
   the hole range).  In this case, we convert too much file range to
   written state because we then return a smaller mapping to iomap.

C) It doesn't handle delalloc mappings.  This I covered in the patch
   that I already sent to the list.

D) Reassigning count_fsb to handle the hole means that if the second
   cmap lookup attempt succeeds (due to racing with someone else) we
   trim the mapping more than is strictly necessary.  The changing
   meaning of count_fsb makes this harder to notice.

E) The tracepoint is kinda wrong because @length is mutated.  That makes
   it harder to chase the data flows through this function because you
   can't just grep on the pos/bytecount strings.

F) We don't actually check that the br_state = XFS_EXT_NORM assignment
   is accurate, i.e that the cow fork actually contains a written
   mapping for the range we're interested in

G) Somewhat inadequate documentation of why we need to xfs_trim_extent
   so aggressively in this function.

H) Not sure why xfs_iomap_end_fsb is used here, the vfs already clamped
   the write range to s_maxbytes.

Fix these issues, and then the atomic writes regressions in generic/760,
generic/617, generic/091, generic/263, and generic/521 all go away for
me.

Cc: stable@vger.kernel.org # v6.16
Fixes: bd1d2c21d5d249 ("xfs: add xfs_atomic_write_cow_iomap_begin()")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/xfs/xfs_iomap.c | 61 +++++++++++++++++++++++++++++++++++++---------
 1 file changed, 50 insertions(+), 11 deletions(-)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index a4a22975c7cc9..0cf84b25d6567 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1082,6 +1082,29 @@ const struct iomap_ops xfs_zoned_direct_write_iomap_ops = {
 };
 #endif /* CONFIG_XFS_RT */
 
+#ifdef DEBUG
+static void
+xfs_check_atomic_cow_conversion(
+	struct xfs_inode		*ip,
+	xfs_fileoff_t			offset_fsb,
+	xfs_filblks_t			count_fsb,
+	const struct xfs_bmbt_irec	*cmap)
+{
+	struct xfs_iext_cursor		icur;
+	struct xfs_bmbt_irec		cmap2 = { };
+
+	if (xfs_iext_lookup_extent(ip, ip->i_cowfp, offset_fsb, &icur, &cmap2))
+		xfs_trim_extent(&cmap2, offset_fsb, count_fsb);
+
+	ASSERT(cmap2.br_startoff == cmap->br_startoff);
+	ASSERT(cmap2.br_blockcount == cmap->br_blockcount);
+	ASSERT(cmap2.br_startblock == cmap->br_startblock);
+	ASSERT(cmap2.br_state == cmap->br_state);
+}
+#else
+# define xfs_check_atomic_cow_conversion(...)	((void)0)
+#endif
+
 static int
 xfs_atomic_write_cow_iomap_begin(
 	struct inode		*inode,
@@ -1093,9 +1116,10 @@ xfs_atomic_write_cow_iomap_begin(
 {
 	struct xfs_inode	*ip = XFS_I(inode);
 	struct xfs_mount	*mp = ip->i_mount;
-	const xfs_fileoff_t		offset_fsb = XFS_B_TO_FSBT(mp, offset);
-	xfs_fileoff_t		end_fsb = xfs_iomap_end_fsb(mp, offset, length);
-	xfs_filblks_t		count_fsb = end_fsb - offset_fsb;
+	const xfs_fileoff_t	offset_fsb = XFS_B_TO_FSBT(mp, offset);
+	const xfs_fileoff_t	end_fsb = XFS_B_TO_FSB(mp, offset + length);
+	const xfs_filblks_t	count_fsb = end_fsb - offset_fsb;
+	xfs_filblks_t		hole_count_fsb;
 	int			nmaps = 1;
 	xfs_filblks_t		resaligned;
 	struct xfs_bmbt_irec	cmap;
@@ -1134,14 +1158,20 @@ xfs_atomic_write_cow_iomap_begin(
 	if (cmap.br_startoff <= offset_fsb) {
 		if (isnullstartblock(cmap.br_startblock))
 			goto convert_delay;
+
+		/*
+		 * cmap could extend outside the write range due to previous
+		 * speculative preallocations.  We must trim cmap to the write
+		 * range because the cow fork treats written mappings to mean
+		 * "write in progress".
+		 */
 		xfs_trim_extent(&cmap, offset_fsb, count_fsb);
 		goto found;
 	}
 
-	end_fsb = cmap.br_startoff;
-	count_fsb = end_fsb - offset_fsb;
+	hole_count_fsb = cmap.br_startoff - offset_fsb;
 
-	resaligned = xfs_aligned_fsb_count(offset_fsb, count_fsb,
+	resaligned = xfs_aligned_fsb_count(offset_fsb, hole_count_fsb,
 			xfs_get_cowextsz_hint(ip));
 	xfs_iunlock(ip, XFS_ILOCK_EXCL);
 
@@ -1177,7 +1207,7 @@ xfs_atomic_write_cow_iomap_begin(
 	 * atomic writes to that same range will be aligned (and don't require
 	 * this COW-based method).
 	 */
-	error = xfs_bmapi_write(tp, ip, offset_fsb, count_fsb,
+	error = xfs_bmapi_write(tp, ip, offset_fsb, hole_count_fsb,
 			XFS_BMAPI_COWFORK | XFS_BMAPI_PREALLOC |
 			XFS_BMAPI_EXTSZALIGN, 0, &cmap, &nmaps);
 	if (error) {
@@ -1190,17 +1220,26 @@ xfs_atomic_write_cow_iomap_begin(
 	if (error)
 		goto out_unlock;
 
+	/*
+	 * cmap could map more blocks than the range we passed into bmapi_write
+	 * because of EXTSZALIGN or adjacent pre-existing unwritten mappings
+	 * that were merged.  Trim cmap to the original write range so that we
+	 * don't convert more than we were asked to do for this write.
+	 */
+	xfs_trim_extent(&cmap, offset_fsb, count_fsb);
+
 found:
 	if (cmap.br_state != XFS_EXT_NORM) {
-		error = xfs_reflink_convert_cow_locked(ip, offset_fsb,
-				count_fsb);
+		error = xfs_reflink_convert_cow_locked(ip, cmap.br_startoff,
+				cmap.br_blockcount);
 		if (error)
 			goto out_unlock;
 		cmap.br_state = XFS_EXT_NORM;
+		xfs_check_atomic_cow_conversion(ip, offset_fsb, count_fsb,
+				&cmap);
 	}
 
-	length = XFS_FSB_TO_B(mp, cmap.br_startoff + cmap.br_blockcount);
-	trace_xfs_iomap_found(ip, offset, length - offset, XFS_COW_FORK, &cmap);
+	trace_xfs_iomap_found(ip, offset, length, XFS_COW_FORK, &cmap);
 	seq = xfs_iomap_inode_sequence(ip, IOMAP_F_SHARED);
 	xfs_iunlock(ip, XFS_ILOCK_EXCL);
 	return xfs_bmbt_to_iomap(ip, iomap, &cmap, flags, IOMAP_F_SHARED, seq);
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 6.17.y 1/2] xfs: fix delalloc write failures in software-provided atomic writes
  2025-11-09 23:20 ` [PATCH 6.17.y 1/2] xfs: fix delalloc write failures in software-provided atomic writes Sasha Levin
  2025-11-09 23:20   ` [PATCH 6.17.y 2/2] xfs: fix various problems in xfs_atomic_write_cow_iomap_begin Sasha Levin
@ 2025-11-21  9:47   ` Greg KH
  1 sibling, 0 replies; 6+ messages in thread
From: Greg KH @ 2025-11-21  9:47 UTC (permalink / raw)
  To: Sasha Levin; +Cc: stable, Darrick J. Wong, John Garry, Carlos Maiolino

On Sun, Nov 09, 2025 at 06:20:56PM -0500, Sasha Levin wrote:
> From: "Darrick J. Wong" <djwong@kernel.org>
> 
> [ Upstream commit 8d54eacd82a0623a963e0c150ad3b02970638b0d ]
> 
> With the 20 Oct 2025 release of fstests, generic/521 fails for me on
> regular (aka non-block-atomic-writes) storage:
> 
> QA output created by 521
> dowrite: write: Input/output error
> LOG DUMP (8553 total operations):
> 1(  1 mod 256): SKIPPED (no operation)
> 2(  2 mod 256): WRITE    0x7e000 thru 0x8dfff	(0x10000 bytes) HOLE
> 3(  3 mod 256): READ     0x69000 thru 0x79fff	(0x11000 bytes)
> 4(  4 mod 256): FALLOC   0x53c38 thru 0x5e853	(0xac1b bytes) INTERIOR
> 5(  5 mod 256): COPY 0x55000 thru 0x59fff	(0x5000 bytes) to 0x25000 thru 0x29fff
> 6(  6 mod 256): WRITE    0x74000 thru 0x88fff	(0x15000 bytes)
> 7(  7 mod 256): ZERO     0xedb1 thru 0x11693	(0x28e3 bytes)
> 
> with a warning in dmesg from iomap about XFS trying to give it a
> delalloc mapping for a directio write.  Fix the software atomic write
> iomap_begin code to convert the reservation into a written mapping.
> This doesn't fix the data corruption problems reported by generic/760,
> but it's a start.
> 
> Cc: stable@vger.kernel.org # v6.16
> Fixes: bd1d2c21d5d249 ("xfs: add xfs_atomic_write_cow_iomap_begin()")
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> Reviewed-by: John Garry <john.g.garry@oracle.com>
> Signed-off-by: Carlos Maiolino <cem@kernel.org>
> Stable-dep-of: 8d7bba1e8314 ("xfs: fix various problems in xfs_atomic_write_cow_iomap_begin")
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> ---
>  fs/xfs/xfs_iomap.c | 21 +++++++++++++++++++--
>  1 file changed, 19 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> index 2a74f29573410..a4a22975c7cc9 100644
> --- a/fs/xfs/xfs_iomap.c
> +++ b/fs/xfs/xfs_iomap.c
> @@ -1121,7 +1121,7 @@ xfs_atomic_write_cow_iomap_begin(
>  		return -EAGAIN;
>  
>  	trace_xfs_iomap_atomic_write_cow(ip, offset, length);
> -
> +retry:
>  	xfs_ilock(ip, XFS_ILOCK_EXCL);
>  
>  	if (!ip->i_cowfp) {
> @@ -1132,6 +1132,8 @@ xfs_atomic_write_cow_iomap_begin(
>  	if (!xfs_iext_lookup_extent(ip, ip->i_cowfp, offset_fsb, &icur, &cmap))
>  		cmap.br_startoff = end_fsb;
>  	if (cmap.br_startoff <= offset_fsb) {
> +		if (isnullstartblock(cmap.br_startblock))
> +			goto convert_delay;
>  		xfs_trim_extent(&cmap, offset_fsb, count_fsb);
>  		goto found;
>  	}
> @@ -1160,8 +1162,10 @@ xfs_atomic_write_cow_iomap_begin(
>  	if (!xfs_iext_lookup_extent(ip, ip->i_cowfp, offset_fsb, &icur, &cmap))
>  		cmap.br_startoff = end_fsb;
>  	if (cmap.br_startoff <= offset_fsb) {
> -		xfs_trim_extent(&cmap, offset_fsb, count_fsb);
>  		xfs_trans_cancel(tp);
> +		if (isnullstartblock(cmap.br_startblock))
> +			goto convert_delay;
> +		xfs_trim_extent(&cmap, offset_fsb, count_fsb);
>  		goto found;
>  	}
>  
> @@ -1201,6 +1205,19 @@ xfs_atomic_write_cow_iomap_begin(
>  	xfs_iunlock(ip, XFS_ILOCK_EXCL);
>  	return xfs_bmbt_to_iomap(ip, iomap, &cmap, flags, IOMAP_F_SHARED, seq);
>  
> +convert_delay:
> +	xfs_iunlock(ip, XFS_ILOCK_EXCL);
> +	error = xfs_bmapi_convert_delalloc(ip, XFS_COW_FORK, offset, iomap,
> +			NULL);
> +	if (error)
> +		return error;
> +
> +	/*
> +	 * Try the lookup again, because the delalloc conversion might have
> +	 * turned the COW mapping into unwritten, but we need it to be in
> +	 * written state.
> +	 */
> +	goto retry;
>  out_unlock:
>  	xfs_iunlock(ip, XFS_ILOCK_EXCL);
>  	return error;
> -- 
> 2.51.0
> 
> 

Did not apply to 6.17.y at all :(

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 6.17.y 2/2] xfs: fix various problems in xfs_atomic_write_cow_iomap_begin
  2025-11-09 23:20   ` [PATCH 6.17.y 2/2] xfs: fix various problems in xfs_atomic_write_cow_iomap_begin Sasha Levin
@ 2025-11-21  9:47     ` Greg KH
  2025-11-24 15:25       ` Sasha Levin
  0 siblings, 1 reply; 6+ messages in thread
From: Greg KH @ 2025-11-21  9:47 UTC (permalink / raw)
  To: Sasha Levin; +Cc: stable, Darrick J. Wong, John Garry, Carlos Maiolino

On Sun, Nov 09, 2025 at 06:20:57PM -0500, Sasha Levin wrote:
> From: "Darrick J. Wong" <djwong@kernel.org>
> 
> [ Upstream commit 8d7bba1e8314013ecc817a91624104ceb9352ddc ]
> 
> I think there are several things wrong with this function:
> 
> A) xfs_bmapi_write can return a much larger unwritten mapping than what
>    the caller asked for.  We convert part of that range to written, but
>    return the entire written mapping to iomap even though that's
>    inaccurate.
> 
> B) The arguments to xfs_reflink_convert_cow_locked are wrong -- an
>    unwritten mapping could be *smaller* than the write range (or even
>    the hole range).  In this case, we convert too much file range to
>    written state because we then return a smaller mapping to iomap.
> 
> C) It doesn't handle delalloc mappings.  This I covered in the patch
>    that I already sent to the list.
> 
> D) Reassigning count_fsb to handle the hole means that if the second
>    cmap lookup attempt succeeds (due to racing with someone else) we
>    trim the mapping more than is strictly necessary.  The changing
>    meaning of count_fsb makes this harder to notice.
> 
> E) The tracepoint is kinda wrong because @length is mutated.  That makes
>    it harder to chase the data flows through this function because you
>    can't just grep on the pos/bytecount strings.
> 
> F) We don't actually check that the br_state = XFS_EXT_NORM assignment
>    is accurate, i.e that the cow fork actually contains a written
>    mapping for the range we're interested in
> 
> G) Somewhat inadequate documentation of why we need to xfs_trim_extent
>    so aggressively in this function.
> 
> H) Not sure why xfs_iomap_end_fsb is used here, the vfs already clamped
>    the write range to s_maxbytes.
> 
> Fix these issues, and then the atomic writes regressions in generic/760,
> generic/617, generic/091, generic/263, and generic/521 all go away for
> me.
> 
> Cc: stable@vger.kernel.org # v6.16
> Fixes: bd1d2c21d5d249 ("xfs: add xfs_atomic_write_cow_iomap_begin()")
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> Reviewed-by: John Garry <john.g.garry@oracle.com>
> Signed-off-by: Carlos Maiolino <cem@kernel.org>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> ---
>  fs/xfs/xfs_iomap.c | 61 +++++++++++++++++++++++++++++++++++++---------
>  1 file changed, 50 insertions(+), 11 deletions(-)
> 
> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> index a4a22975c7cc9..0cf84b25d6567 100644
> --- a/fs/xfs/xfs_iomap.c
> +++ b/fs/xfs/xfs_iomap.c
> @@ -1082,6 +1082,29 @@ const struct iomap_ops xfs_zoned_direct_write_iomap_ops = {
>  };
>  #endif /* CONFIG_XFS_RT */
>  
> +#ifdef DEBUG
> +static void
> +xfs_check_atomic_cow_conversion(
> +	struct xfs_inode		*ip,
> +	xfs_fileoff_t			offset_fsb,
> +	xfs_filblks_t			count_fsb,
> +	const struct xfs_bmbt_irec	*cmap)
> +{
> +	struct xfs_iext_cursor		icur;
> +	struct xfs_bmbt_irec		cmap2 = { };
> +
> +	if (xfs_iext_lookup_extent(ip, ip->i_cowfp, offset_fsb, &icur, &cmap2))
> +		xfs_trim_extent(&cmap2, offset_fsb, count_fsb);
> +
> +	ASSERT(cmap2.br_startoff == cmap->br_startoff);
> +	ASSERT(cmap2.br_blockcount == cmap->br_blockcount);
> +	ASSERT(cmap2.br_startblock == cmap->br_startblock);
> +	ASSERT(cmap2.br_state == cmap->br_state);
> +}
> +#else
> +# define xfs_check_atomic_cow_conversion(...)	((void)0)
> +#endif
> +
>  static int
>  xfs_atomic_write_cow_iomap_begin(
>  	struct inode		*inode,
> @@ -1093,9 +1116,10 @@ xfs_atomic_write_cow_iomap_begin(
>  {
>  	struct xfs_inode	*ip = XFS_I(inode);
>  	struct xfs_mount	*mp = ip->i_mount;
> -	const xfs_fileoff_t		offset_fsb = XFS_B_TO_FSBT(mp, offset);
> -	xfs_fileoff_t		end_fsb = xfs_iomap_end_fsb(mp, offset, length);
> -	xfs_filblks_t		count_fsb = end_fsb - offset_fsb;
> +	const xfs_fileoff_t	offset_fsb = XFS_B_TO_FSBT(mp, offset);
> +	const xfs_fileoff_t	end_fsb = XFS_B_TO_FSB(mp, offset + length);
> +	const xfs_filblks_t	count_fsb = end_fsb - offset_fsb;
> +	xfs_filblks_t		hole_count_fsb;
>  	int			nmaps = 1;
>  	xfs_filblks_t		resaligned;
>  	struct xfs_bmbt_irec	cmap;
> @@ -1134,14 +1158,20 @@ xfs_atomic_write_cow_iomap_begin(
>  	if (cmap.br_startoff <= offset_fsb) {
>  		if (isnullstartblock(cmap.br_startblock))
>  			goto convert_delay;
> +
> +		/*
> +		 * cmap could extend outside the write range due to previous
> +		 * speculative preallocations.  We must trim cmap to the write
> +		 * range because the cow fork treats written mappings to mean
> +		 * "write in progress".
> +		 */
>  		xfs_trim_extent(&cmap, offset_fsb, count_fsb);
>  		goto found;
>  	}
>  
> -	end_fsb = cmap.br_startoff;
> -	count_fsb = end_fsb - offset_fsb;
> +	hole_count_fsb = cmap.br_startoff - offset_fsb;
>  
> -	resaligned = xfs_aligned_fsb_count(offset_fsb, count_fsb,
> +	resaligned = xfs_aligned_fsb_count(offset_fsb, hole_count_fsb,
>  			xfs_get_cowextsz_hint(ip));
>  	xfs_iunlock(ip, XFS_ILOCK_EXCL);
>  
> @@ -1177,7 +1207,7 @@ xfs_atomic_write_cow_iomap_begin(
>  	 * atomic writes to that same range will be aligned (and don't require
>  	 * this COW-based method).
>  	 */
> -	error = xfs_bmapi_write(tp, ip, offset_fsb, count_fsb,
> +	error = xfs_bmapi_write(tp, ip, offset_fsb, hole_count_fsb,
>  			XFS_BMAPI_COWFORK | XFS_BMAPI_PREALLOC |
>  			XFS_BMAPI_EXTSZALIGN, 0, &cmap, &nmaps);
>  	if (error) {
> @@ -1190,17 +1220,26 @@ xfs_atomic_write_cow_iomap_begin(
>  	if (error)
>  		goto out_unlock;
>  
> +	/*
> +	 * cmap could map more blocks than the range we passed into bmapi_write
> +	 * because of EXTSZALIGN or adjacent pre-existing unwritten mappings
> +	 * that were merged.  Trim cmap to the original write range so that we
> +	 * don't convert more than we were asked to do for this write.
> +	 */
> +	xfs_trim_extent(&cmap, offset_fsb, count_fsb);
> +
>  found:
>  	if (cmap.br_state != XFS_EXT_NORM) {
> -		error = xfs_reflink_convert_cow_locked(ip, offset_fsb,
> -				count_fsb);
> +		error = xfs_reflink_convert_cow_locked(ip, cmap.br_startoff,
> +				cmap.br_blockcount);
>  		if (error)
>  			goto out_unlock;
>  		cmap.br_state = XFS_EXT_NORM;
> +		xfs_check_atomic_cow_conversion(ip, offset_fsb, count_fsb,
> +				&cmap);
>  	}
>  
> -	length = XFS_FSB_TO_B(mp, cmap.br_startoff + cmap.br_blockcount);
> -	trace_xfs_iomap_found(ip, offset, length - offset, XFS_COW_FORK, &cmap);
> +	trace_xfs_iomap_found(ip, offset, length, XFS_COW_FORK, &cmap);
>  	seq = xfs_iomap_inode_sequence(ip, IOMAP_F_SHARED);
>  	xfs_iunlock(ip, XFS_ILOCK_EXCL);
>  	return xfs_bmbt_to_iomap(ip, iomap, &cmap, flags, IOMAP_F_SHARED, seq);
> -- 
> 2.51.0
> 
> 

Also did not apply to 6.17.y :(

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 6.17.y 2/2] xfs: fix various problems in xfs_atomic_write_cow_iomap_begin
  2025-11-21  9:47     ` Greg KH
@ 2025-11-24 15:25       ` Sasha Levin
  0 siblings, 0 replies; 6+ messages in thread
From: Sasha Levin @ 2025-11-24 15:25 UTC (permalink / raw)
  To: Greg KH; +Cc: stable, Darrick J. Wong, John Garry, Carlos Maiolino

On Fri, Nov 21, 2025 at 10:47:50AM +0100, Greg KH wrote:
>On Sun, Nov 09, 2025 at 06:20:57PM -0500, Sasha Levin wrote:
>> From: "Darrick J. Wong" <djwong@kernel.org>
>>
>> [ Upstream commit 8d7bba1e8314013ecc817a91624104ceb9352ddc ]
>>
>> I think there are several things wrong with this function:
>>
>> A) xfs_bmapi_write can return a much larger unwritten mapping than what
>>    the caller asked for.  We convert part of that range to written, but
>>    return the entire written mapping to iomap even though that's
>>    inaccurate.
>>
>> B) The arguments to xfs_reflink_convert_cow_locked are wrong -- an
>>    unwritten mapping could be *smaller* than the write range (or even
>>    the hole range).  In this case, we convert too much file range to
>>    written state because we then return a smaller mapping to iomap.
>>
>> C) It doesn't handle delalloc mappings.  This I covered in the patch
>>    that I already sent to the list.
>>
>> D) Reassigning count_fsb to handle the hole means that if the second
>>    cmap lookup attempt succeeds (due to racing with someone else) we
>>    trim the mapping more than is strictly necessary.  The changing
>>    meaning of count_fsb makes this harder to notice.
>>
>> E) The tracepoint is kinda wrong because @length is mutated.  That makes
>>    it harder to chase the data flows through this function because you
>>    can't just grep on the pos/bytecount strings.
>>
>> F) We don't actually check that the br_state = XFS_EXT_NORM assignment
>>    is accurate, i.e that the cow fork actually contains a written
>>    mapping for the range we're interested in
>>
>> G) Somewhat inadequate documentation of why we need to xfs_trim_extent
>>    so aggressively in this function.
>>
>> H) Not sure why xfs_iomap_end_fsb is used here, the vfs already clamped
>>    the write range to s_maxbytes.
>>
>> Fix these issues, and then the atomic writes regressions in generic/760,
>> generic/617, generic/091, generic/263, and generic/521 all go away for
>> me.
>>
>> Cc: stable@vger.kernel.org # v6.16
>> Fixes: bd1d2c21d5d249 ("xfs: add xfs_atomic_write_cow_iomap_begin()")
>> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
>> Reviewed-by: John Garry <john.g.garry@oracle.com>
>> Signed-off-by: Carlos Maiolino <cem@kernel.org>
>> Signed-off-by: Sasha Levin <sashal@kernel.org>
>> ---
>>  fs/xfs/xfs_iomap.c | 61 +++++++++++++++++++++++++++++++++++++---------
>>  1 file changed, 50 insertions(+), 11 deletions(-)
>>
>> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
>> index a4a22975c7cc9..0cf84b25d6567 100644
>> --- a/fs/xfs/xfs_iomap.c
>> +++ b/fs/xfs/xfs_iomap.c
>> @@ -1082,6 +1082,29 @@ const struct iomap_ops xfs_zoned_direct_write_iomap_ops = {
>>  };
>>  #endif /* CONFIG_XFS_RT */
>>
>> +#ifdef DEBUG
>> +static void
>> +xfs_check_atomic_cow_conversion(
>> +	struct xfs_inode		*ip,
>> +	xfs_fileoff_t			offset_fsb,
>> +	xfs_filblks_t			count_fsb,
>> +	const struct xfs_bmbt_irec	*cmap)
>> +{
>> +	struct xfs_iext_cursor		icur;
>> +	struct xfs_bmbt_irec		cmap2 = { };
>> +
>> +	if (xfs_iext_lookup_extent(ip, ip->i_cowfp, offset_fsb, &icur, &cmap2))
>> +		xfs_trim_extent(&cmap2, offset_fsb, count_fsb);
>> +
>> +	ASSERT(cmap2.br_startoff == cmap->br_startoff);
>> +	ASSERT(cmap2.br_blockcount == cmap->br_blockcount);
>> +	ASSERT(cmap2.br_startblock == cmap->br_startblock);
>> +	ASSERT(cmap2.br_state == cmap->br_state);
>> +}
>> +#else
>> +# define xfs_check_atomic_cow_conversion(...)	((void)0)
>> +#endif
>> +
>>  static int
>>  xfs_atomic_write_cow_iomap_begin(
>>  	struct inode		*inode,
>> @@ -1093,9 +1116,10 @@ xfs_atomic_write_cow_iomap_begin(
>>  {
>>  	struct xfs_inode	*ip = XFS_I(inode);
>>  	struct xfs_mount	*mp = ip->i_mount;
>> -	const xfs_fileoff_t		offset_fsb = XFS_B_TO_FSBT(mp, offset);
>> -	xfs_fileoff_t		end_fsb = xfs_iomap_end_fsb(mp, offset, length);
>> -	xfs_filblks_t		count_fsb = end_fsb - offset_fsb;
>> +	const xfs_fileoff_t	offset_fsb = XFS_B_TO_FSBT(mp, offset);
>> +	const xfs_fileoff_t	end_fsb = XFS_B_TO_FSB(mp, offset + length);
>> +	const xfs_filblks_t	count_fsb = end_fsb - offset_fsb;
>> +	xfs_filblks_t		hole_count_fsb;
>>  	int			nmaps = 1;
>>  	xfs_filblks_t		resaligned;
>>  	struct xfs_bmbt_irec	cmap;
>> @@ -1134,14 +1158,20 @@ xfs_atomic_write_cow_iomap_begin(
>>  	if (cmap.br_startoff <= offset_fsb) {
>>  		if (isnullstartblock(cmap.br_startblock))
>>  			goto convert_delay;
>> +
>> +		/*
>> +		 * cmap could extend outside the write range due to previous
>> +		 * speculative preallocations.  We must trim cmap to the write
>> +		 * range because the cow fork treats written mappings to mean
>> +		 * "write in progress".
>> +		 */
>>  		xfs_trim_extent(&cmap, offset_fsb, count_fsb);
>>  		goto found;
>>  	}
>>
>> -	end_fsb = cmap.br_startoff;
>> -	count_fsb = end_fsb - offset_fsb;
>> +	hole_count_fsb = cmap.br_startoff - offset_fsb;
>>
>> -	resaligned = xfs_aligned_fsb_count(offset_fsb, count_fsb,
>> +	resaligned = xfs_aligned_fsb_count(offset_fsb, hole_count_fsb,
>>  			xfs_get_cowextsz_hint(ip));
>>  	xfs_iunlock(ip, XFS_ILOCK_EXCL);
>>
>> @@ -1177,7 +1207,7 @@ xfs_atomic_write_cow_iomap_begin(
>>  	 * atomic writes to that same range will be aligned (and don't require
>>  	 * this COW-based method).
>>  	 */
>> -	error = xfs_bmapi_write(tp, ip, offset_fsb, count_fsb,
>> +	error = xfs_bmapi_write(tp, ip, offset_fsb, hole_count_fsb,
>>  			XFS_BMAPI_COWFORK | XFS_BMAPI_PREALLOC |
>>  			XFS_BMAPI_EXTSZALIGN, 0, &cmap, &nmaps);
>>  	if (error) {
>> @@ -1190,17 +1220,26 @@ xfs_atomic_write_cow_iomap_begin(
>>  	if (error)
>>  		goto out_unlock;
>>
>> +	/*
>> +	 * cmap could map more blocks than the range we passed into bmapi_write
>> +	 * because of EXTSZALIGN or adjacent pre-existing unwritten mappings
>> +	 * that were merged.  Trim cmap to the original write range so that we
>> +	 * don't convert more than we were asked to do for this write.
>> +	 */
>> +	xfs_trim_extent(&cmap, offset_fsb, count_fsb);
>> +
>>  found:
>>  	if (cmap.br_state != XFS_EXT_NORM) {
>> -		error = xfs_reflink_convert_cow_locked(ip, offset_fsb,
>> -				count_fsb);
>> +		error = xfs_reflink_convert_cow_locked(ip, cmap.br_startoff,
>> +				cmap.br_blockcount);
>>  		if (error)
>>  			goto out_unlock;
>>  		cmap.br_state = XFS_EXT_NORM;
>> +		xfs_check_atomic_cow_conversion(ip, offset_fsb, count_fsb,
>> +				&cmap);
>>  	}
>>
>> -	length = XFS_FSB_TO_B(mp, cmap.br_startoff + cmap.br_blockcount);
>> -	trace_xfs_iomap_found(ip, offset, length - offset, XFS_COW_FORK, &cmap);
>> +	trace_xfs_iomap_found(ip, offset, length, XFS_COW_FORK, &cmap);
>>  	seq = xfs_iomap_inode_sequence(ip, IOMAP_F_SHARED);
>>  	xfs_iunlock(ip, XFS_ILOCK_EXCL);
>>  	return xfs_bmbt_to_iomap(ip, iomap, &cmap, flags, IOMAP_F_SHARED, seq);
>> --
>> 2.51.0
>>
>>
>
>Also did not apply to 6.17.y :(

You pulled these two for the previous release, so they're already in tree :)

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2025-11-24 15:25 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-09  3:11 FAILED: patch "[PATCH] xfs: fix various problems in xfs_atomic_write_cow_iomap_begin" failed to apply to 6.17-stable tree gregkh
2025-11-09 23:20 ` [PATCH 6.17.y 1/2] xfs: fix delalloc write failures in software-provided atomic writes Sasha Levin
2025-11-09 23:20   ` [PATCH 6.17.y 2/2] xfs: fix various problems in xfs_atomic_write_cow_iomap_begin Sasha Levin
2025-11-21  9:47     ` Greg KH
2025-11-24 15:25       ` Sasha Levin
2025-11-21  9:47   ` [PATCH 6.17.y 1/2] xfs: fix delalloc write failures in software-provided atomic writes Greg KH

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).