* [PATCH] iomap: Fix broken data integrity guarantees for O_SYNC writes
@ 2025-07-30 10:28 Jan Kara
2025-07-30 14:04 ` Christoph Hellwig
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: Jan Kara @ 2025-07-30 10:28 UTC (permalink / raw)
To: Christian Brauner
Cc: linux-fsdevel, Darrick J . Wong, Jan Kara, John Garry,
Ritesh Harjani (IBM), stable
Commit d279c80e0bac ("iomap: inline iomap_dio_bio_opflags()") has broken
the logic in iomap_dio_bio_iter() in a way that when the device does
support FUA (or has no writeback cache) and the direct IO happens to
freshly allocated or unwritten extents, we will *not* issue fsync after
completing direct IO O_SYNC / O_DSYNC write because the
IOMAP_DIO_WRITE_THROUGH flag stays mistakenly set. Fix the problem by
clearing IOMAP_DIO_WRITE_THROUGH whenever we do not perform FUA write as
it was originally intended.
CC: John Garry <john.g.garry@oracle.com>
CC: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Fixes: d279c80e0bac ("iomap: inline iomap_dio_bio_opflags()")
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
---
fs/iomap/direct-io.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
BTW, I've spotted this because some performance tests got suspiciously fast
on recent kernels :) Sadly no easy improvement to cherry-pick for me to fix
customer issue I'm chasing...
diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 6f25d4cfea9f..b84f6af2eb4c 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -363,14 +363,14 @@ static int iomap_dio_bio_iter(struct iomap_iter *iter, struct iomap_dio *dio)
if (iomap->flags & IOMAP_F_SHARED)
dio->flags |= IOMAP_DIO_COW;
- if (iomap->flags & IOMAP_F_NEW) {
+ if (iomap->flags & IOMAP_F_NEW)
need_zeroout = true;
- } else if (iomap->type == IOMAP_MAPPED) {
- if (iomap_dio_can_use_fua(iomap, dio))
- bio_opf |= REQ_FUA;
- else
- dio->flags &= ~IOMAP_DIO_WRITE_THROUGH;
- }
+ else if (iomap->type == IOMAP_MAPPED &&
+ iomap_dio_can_use_fua(iomap, dio))
+ bio_opf |= REQ_FUA;
+
+ if (!(bio_opf & REQ_FUA))
+ dio->flags &= ~IOMAP_DIO_WRITE_THROUGH;
/*
* We can only do deferred completion for pure overwrites that
--
2.43.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] iomap: Fix broken data integrity guarantees for O_SYNC writes
2025-07-30 10:28 [PATCH] iomap: Fix broken data integrity guarantees for O_SYNC writes Jan Kara
@ 2025-07-30 14:04 ` Christoph Hellwig
2025-07-30 15:08 ` Darrick J. Wong
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Christoph Hellwig @ 2025-07-30 14:04 UTC (permalink / raw)
To: Jan Kara
Cc: Christian Brauner, linux-fsdevel, Darrick J . Wong, John Garry,
Ritesh Harjani (IBM), stable
On Wed, Jul 30, 2025 at 12:28:41PM +0200, Jan Kara wrote:
> Commit d279c80e0bac ("iomap: inline iomap_dio_bio_opflags()") has broken
> the logic in iomap_dio_bio_iter() in a way that when the device does
> support FUA (or has no writeback cache) and the direct IO happens to
> freshly allocated or unwritten extents, we will *not* issue fsync after
> completing direct IO O_SYNC / O_DSYNC write because the
> IOMAP_DIO_WRITE_THROUGH flag stays mistakenly set.
Uh-oh.
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] iomap: Fix broken data integrity guarantees for O_SYNC writes
2025-07-30 10:28 [PATCH] iomap: Fix broken data integrity guarantees for O_SYNC writes Jan Kara
2025-07-30 14:04 ` Christoph Hellwig
@ 2025-07-30 15:08 ` Darrick J. Wong
2025-07-30 16:30 ` John Garry
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Darrick J. Wong @ 2025-07-30 15:08 UTC (permalink / raw)
To: Jan Kara
Cc: Christian Brauner, linux-fsdevel, John Garry,
Ritesh Harjani (IBM), stable
On Wed, Jul 30, 2025 at 12:28:41PM +0200, Jan Kara wrote:
> Commit d279c80e0bac ("iomap: inline iomap_dio_bio_opflags()") has broken
> the logic in iomap_dio_bio_iter() in a way that when the device does
> support FUA (or has no writeback cache) and the direct IO happens to
> freshly allocated or unwritten extents, we will *not* issue fsync after
> completing direct IO O_SYNC / O_DSYNC write because the
> IOMAP_DIO_WRITE_THROUGH flag stays mistakenly set. Fix the problem by
> clearing IOMAP_DIO_WRITE_THROUGH whenever we do not perform FUA write as
> it was originally intended.
>
> CC: John Garry <john.g.garry@oracle.com>
> CC: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> Fixes: d279c80e0bac ("iomap: inline iomap_dio_bio_opflags()")
> CC: stable@vger.kernel.org
> Signed-off-by: Jan Kara <jack@suse.cz>
Aha, I wonder if that's why fuse2fs+iomap occasionally fails the fstest
that tries to replay all the FUA writes, finds none, and complains.
(and I bet nobody noticed this on xfs because the log sends FUA, whereas
fuse2fs merely sends fsync to the block device)
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
--D
> ---
> fs/iomap/direct-io.c | 14 +++++++-------
> 1 file changed, 7 insertions(+), 7 deletions(-)
>
> BTW, I've spotted this because some performance tests got suspiciously fast
> on recent kernels :) Sadly no easy improvement to cherry-pick for me to fix
> customer issue I'm chasing...
>
> diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> index 6f25d4cfea9f..b84f6af2eb4c 100644
> --- a/fs/iomap/direct-io.c
> +++ b/fs/iomap/direct-io.c
> @@ -363,14 +363,14 @@ static int iomap_dio_bio_iter(struct iomap_iter *iter, struct iomap_dio *dio)
> if (iomap->flags & IOMAP_F_SHARED)
> dio->flags |= IOMAP_DIO_COW;
>
> - if (iomap->flags & IOMAP_F_NEW) {
> + if (iomap->flags & IOMAP_F_NEW)
> need_zeroout = true;
> - } else if (iomap->type == IOMAP_MAPPED) {
> - if (iomap_dio_can_use_fua(iomap, dio))
> - bio_opf |= REQ_FUA;
> - else
> - dio->flags &= ~IOMAP_DIO_WRITE_THROUGH;
> - }
> + else if (iomap->type == IOMAP_MAPPED &&
> + iomap_dio_can_use_fua(iomap, dio))
> + bio_opf |= REQ_FUA;
> +
> + if (!(bio_opf & REQ_FUA))
> + dio->flags &= ~IOMAP_DIO_WRITE_THROUGH;
>
> /*
> * We can only do deferred completion for pure overwrites that
> --
> 2.43.0
>
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] iomap: Fix broken data integrity guarantees for O_SYNC writes
2025-07-30 10:28 [PATCH] iomap: Fix broken data integrity guarantees for O_SYNC writes Jan Kara
2025-07-30 14:04 ` Christoph Hellwig
2025-07-30 15:08 ` Darrick J. Wong
@ 2025-07-30 16:30 ` John Garry
2025-07-31 9:05 ` Ritesh Harjani
2025-07-31 11:24 ` Christian Brauner
4 siblings, 0 replies; 6+ messages in thread
From: John Garry @ 2025-07-30 16:30 UTC (permalink / raw)
To: Jan Kara, Christian Brauner
Cc: linux-fsdevel, Darrick J . Wong, Ritesh Harjani (IBM), stable
On 30/07/2025 11:28, Jan Kara wrote:
> Commit d279c80e0bac ("iomap: inline iomap_dio_bio_opflags()") has broken
> the logic in iomap_dio_bio_iter() in a way that when the device does
> support FUA (or has no writeback cache) and the direct IO happens to
> freshly allocated or unwritten extents, we will*not* issue fsync after
> completing direct IO O_SYNC / O_DSYNC write because the
> IOMAP_DIO_WRITE_THROUGH flag stays mistakenly set. Fix the problem by
> clearing IOMAP_DIO_WRITE_THROUGH whenever we do not perform FUA write as
> it was originally intended.
>
> CC: John Garry<john.g.garry@oracle.com>
> CC: "Ritesh Harjani (IBM)"<ritesh.list@gmail.com>
> Fixes: d279c80e0bac ("iomap: inline iomap_dio_bio_opflags()")
> CC:stable@vger.kernel.org
> Signed-off-by: Jan Kara<jack@suse.cz>
thanks
Reviewed-by: John Garry <john.g.garry@oracle.com>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] iomap: Fix broken data integrity guarantees for O_SYNC writes
2025-07-30 10:28 [PATCH] iomap: Fix broken data integrity guarantees for O_SYNC writes Jan Kara
` (2 preceding siblings ...)
2025-07-30 16:30 ` John Garry
@ 2025-07-31 9:05 ` Ritesh Harjani
2025-07-31 11:24 ` Christian Brauner
4 siblings, 0 replies; 6+ messages in thread
From: Ritesh Harjani @ 2025-07-31 9:05 UTC (permalink / raw)
To: Jan Kara, Christian Brauner
Cc: linux-fsdevel, Darrick J . Wong, Jan Kara, John Garry, stable
Jan Kara <jack@suse.cz> writes:
> Commit d279c80e0bac ("iomap: inline iomap_dio_bio_opflags()") has broken
> the logic in iomap_dio_bio_iter() in a way that when the device does
> support FUA (or has no writeback cache) and the direct IO happens to
> freshly allocated or unwritten extents, we will *not* issue fsync after
> completing direct IO O_SYNC / O_DSYNC write because the
> IOMAP_DIO_WRITE_THROUGH flag stays mistakenly set. Fix the problem by
> clearing IOMAP_DIO_WRITE_THROUGH whenever we do not perform FUA write as
> it was originally intended.
>
> CC: John Garry <john.g.garry@oracle.com>
> CC: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> Fixes: d279c80e0bac ("iomap: inline iomap_dio_bio_opflags()")
> CC: stable@vger.kernel.org
> Signed-off-by: Jan Kara <jack@suse.cz>
Nice catch. It's been there since v6.15 I guess. Looks like it didn't
get caught in xfstests either then.
We have generic/737, but looks like it only covers O_SYNC dio writes. Let
me enhance that to cover O_DYSNC dio writes too.
Looks good to me. Please feel free to add:
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
-ritesh
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] iomap: Fix broken data integrity guarantees for O_SYNC writes
2025-07-30 10:28 [PATCH] iomap: Fix broken data integrity guarantees for O_SYNC writes Jan Kara
` (3 preceding siblings ...)
2025-07-31 9:05 ` Ritesh Harjani
@ 2025-07-31 11:24 ` Christian Brauner
4 siblings, 0 replies; 6+ messages in thread
From: Christian Brauner @ 2025-07-31 11:24 UTC (permalink / raw)
To: Jan Kara
Cc: Christian Brauner, linux-fsdevel, Darrick J . Wong, John Garry,
Ritesh Harjani (IBM), stable
On Wed, 30 Jul 2025 12:28:41 +0200, Jan Kara wrote:
> Commit d279c80e0bac ("iomap: inline iomap_dio_bio_opflags()") has broken
> the logic in iomap_dio_bio_iter() in a way that when the device does
> support FUA (or has no writeback cache) and the direct IO happens to
> freshly allocated or unwritten extents, we will *not* issue fsync after
> completing direct IO O_SYNC / O_DSYNC write because the
> IOMAP_DIO_WRITE_THROUGH flag stays mistakenly set. Fix the problem by
> clearing IOMAP_DIO_WRITE_THROUGH whenever we do not perform FUA write as
> it was originally intended.
>
> [...]
Applied to the vfs.fixes branch of the vfs/vfs.git tree.
Patches in the vfs.fixes branch should appear in linux-next soon.
Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.
It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.
Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.
tree: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs.fixes
[1/1] iomap: Fix broken data integrity guarantees for O_SYNC writes
https://git.kernel.org/vfs/vfs/c/16f206eebbf8
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2025-07-31 11:24 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-30 10:28 [PATCH] iomap: Fix broken data integrity guarantees for O_SYNC writes Jan Kara
2025-07-30 14:04 ` Christoph Hellwig
2025-07-30 15:08 ` Darrick J. Wong
2025-07-30 16:30 ` John Garry
2025-07-31 9:05 ` Ritesh Harjani
2025-07-31 11:24 ` Christian Brauner
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).