From: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-ext4@vger.kernel.org, Theodore Ts'o <tytso@mit.edu>,
Jan Kara <jack@suse.cz>, "Darrick J . Wong" <djwong@kernel.org>,
Christoph Hellwig <hch@infradead.org>,
John Garry <john.g.garry@oracle.com>,
Ojaswin Mujoo <ojaswin@linux.ibm.com>,
linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org,
linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 4/6] ext4: Warn if we ever fallback to buffered-io for DIO atomic writes
Date: Mon, 28 Oct 2024 14:13:54 +0530
Message-ID: <87bjz4mxbp.fsf@gmail.com>
In-Reply-To: <Zx8ga59h0JgU/YIC@dread.disaster.area>
Dave Chinner <david@fromorbit.com> writes:
> On Mon, Oct 28, 2024 at 06:39:36AM +0530, Ritesh Harjani wrote:
>>
>> Hi Dave,
>>
>> Dave Chinner <david@fromorbit.com> writes:
>>
>> > On Fri, Oct 25, 2024 at 09:15:53AM +0530, Ritesh Harjani (IBM) wrote:
>> >> iomap will not return -ENOTBLK in case of dio atomic writes. But let's
>> >> also add a WARN_ON_ONCE and return -EIO as a safety net.
>> >>
>> >> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
>> >> ---
>> >> fs/ext4/file.c | 10 +++++++++-
>> >> 1 file changed, 9 insertions(+), 1 deletion(-)
>> >>
>> >> diff --git a/fs/ext4/file.c b/fs/ext4/file.c
>> >> index f9516121a036..af6ebd0ac0d6 100644
>> >> --- a/fs/ext4/file.c
>> >> +++ b/fs/ext4/file.c
>> >> @@ -576,8 +576,16 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from)
>> >> iomap_ops = &ext4_iomap_overwrite_ops;
>> >> ret = iomap_dio_rw(iocb, from, iomap_ops, &ext4_dio_write_ops,
>> >> dio_flags, NULL, 0);
>> >> - if (ret == -ENOTBLK)
>> >> + if (ret == -ENOTBLK) {
>> >> ret = 0;
>> >> + /*
>> >> + * iomap will never return -ENOTBLK if write fails for atomic
>> >> + * write. But let's just add a safety net.
>> >> + */
>> >> + if (WARN_ON_ONCE(iocb->ki_flags & IOCB_ATOMIC))
>> >> + ret = -EIO;
>> >> + }
>> >
>> > Why can't the iomap code return EIO in this case for IOCB_ATOMIC?
>> > That way we don't have to put this logic into every filesystem.
>>
>> This was originally intended as a safety net hence the WARN_ON_ONCE.
>> Later Darrick pointed out that we still might have an unconverted
>> condition in iomap which can return ENOTBLK for DIO atomic writes (page
>> cache invalidation).
>
> Yes. That's my point - iomap knows that it's an atomic write, it
> knows that invalidation failed, and it knows that there is no such
> thing as buffered atomic writes. So there is no possible fallback
> here, and it should be returning EIO in the page cache invalidation
> failure case and not ENOTBLK.
>
Sorry, my bad. I think I was looking at a different version of the
code earlier. The current patch from John already handles the case
where page cache invalidation fails: for atomic writes we don't return
-ENOTBLK [1].
[1]: https://lore.kernel.org/linux-xfs/Zxnp8bma2KrMDg5m@li-bb2b2a4c-3307-11b2-a85c-8fa5c3a69313.ibm.com/T/#m3664bbe00287d98caa690bb04f51d0ef164f52b3
>> You pointed it right that it should be fixed in iomap. However do you
>> think filesystems can still keep this as safety net (maybe no need of
>> WARN_ON_ONCE).
>
> I don't see any point in adding "impossible to hit" checks into
> filesystems just in case some core infrastructure has a bug
> introduced....
>
Even though the page cache invalidation case is taken care of, the
fallback can still happen if iomap_iter() ever returns -ENOTBLK,
e.g.:
    blk_start_plug(&plug);
    while ((ret = iomap_iter(&iomi, ops)) > 0) {
        iomi.processed = iomap_dio_iter(&iomi, dio);

        /*
         * We can only poll for single bio I/Os.
         */
        iocb->ki_flags &= ~IOCB_HIPRI;
    }

    blk_finish_plug(&plug);

    /*
     * We only report that we've read data up to i_size.
     * Revert iter to a state corresponding to that as some callers (such
     * as the splice code) rely on it.
     */
    if (iov_iter_rw(iter) == READ && iomi.pos >= dio->i_size)
        iov_iter_revert(iter, iomi.pos - dio->i_size);

    if (ret == -EFAULT && dio->size && (dio_flags & IOMAP_DIO_PARTIAL)) {
        if (!(iocb->ki_flags & IOCB_NOWAIT))
            wait_for_completion = true;
        ret = 0;
    }

    /* magic error code to fall back to buffered I/O */
    if (ret == -ENOTBLK) {
        wait_for_completion = true;
        ret = 0;
    }

Looking at the code paths, there is a lot of ping-pong between core
iomap and the filesystem, so it's not just core iomap that we are
talking about here. I am therefore still inclined to keep that check in
place as a safety net.
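
To make the intent of the safety net concrete, here is a minimal
user-space sketch of the decision the ext4 hunk makes after
iomap_dio_rw() returns. It is only a model, not kernel code:
MODEL_IOCB_ATOMIC, model_warn_on_once() and model_dio_fallback() are
hypothetical names invented for illustration, and the warning helper
only roughly imitates WARN_ON_ONCE() (one shared flag instead of one
per call site).

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's IOCB_ATOMIC flag bit. */
#define MODEL_IOCB_ATOMIC (1u << 0)

/*
 * Rough imitation of WARN_ON_ONCE(): report the first time the
 * condition is true, and always return the condition itself.
 */
static bool model_warn_on_once(bool cond, const char *what)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/*
 * Model of the check in the patch: -ENOTBLK normally means "fall back
 * to buffered I/O" and is turned into 0, but an atomic write has no
 * buffered fallback, so it is turned into -EIO instead.
 * Any other return value is passed through unchanged.
 */
static long model_dio_fallback(long ret, unsigned int ki_flags)
{
	if (ret == -ENOTBLK) {
		ret = 0;
		if (model_warn_on_once(ki_flags & MODEL_IOCB_ATOMIC,
				       "buffered fallback for atomic write"))
			ret = -EIO;
	}
	return ret;
}
```

With this model, a non-atomic write that sees -ENOTBLK gets 0 back and
proceeds to the buffered path, while an atomic write gets -EIO and a
one-time warning. If iomap never returns -ENOTBLK for atomic writes,
the warning branch is simply never taken.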

However, let me take some time to review these code paths. I mainly
wanted to send this email to point out that the page cache invalidation
case is already handled in iomap for atomic writes, so there is no bug
there.
I will get back on the rest of the cases after I have looked at them
more closely.
> -Dave.
>
> --
> Dave Chinner
> david@fromorbit.com
Thanks for the review!
-ritesh