public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: John Garry <john.g.garry@oracle.com>
Cc: brauner@kernel.org, djwong@kernel.org, cem@kernel.org,
	dchinner@redhat.com, hch@lst.de, ritesh.list@gmail.com,
	linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, martin.petersen@oracle.com
Subject: Re: [PATCH 1/4] iomap: Lift blocksize restriction on atomic writes
Date: Thu, 5 Dec 2024 07:35:45 +1100	[thread overview]
Message-ID: <Z1C9IfLgB_jDCF18@dread.disaster.area> (raw)
In-Reply-To: <20241204154344.3034362-2-john.g.garry@oracle.com>

On Wed, Dec 04, 2024 at 03:43:41PM +0000, John Garry wrote:
> From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
> 
> Filesystems like ext4 can submit writes in multiples of blocksizes.
> But we still can't allow the writes to be split into multiple BIOs. Hence
> let's check if the iomap_length() is same as iter->len or not.
> 
> It is the responsibility of userspace to ensure that a write does not span
> mixed unwritten and mapped extents (which would lead to multiple BIOs).

How is "userspace" supposed to do this?

No existing utility in userspace is aware of atomic write limits or
rtextsize configs, so how does "userspace" ensure everything is
laid out in a manner compatible with atomic writes?

e.g. restoring a backup (or other disaster recovery procedures) is
going to have to lay the files out correctly for atomic writes.
backup tools often sparsify the data set and so what gets restored
will not have the same layout as the original data set...

Where's the documentation that outlines all the restrictions on
userspace behaviour to prevent this sort of problem being triggered?
Common operations such as truncate, hole punch, buffered writes,
reflinks, etc will trip over this, so application developers, users
and admins really need to know what they should be doing to avoid
stepping on this landmine...

Further to that, what is the correction process for users to get rid
of this error? They'll need some help from an atomic write
constraint aware utility that can resilver the file such that these
failures do not occur again. Can xfs_fsr do this? Or maybe the new
exchangr-range code? Or does none of this infrastructure yet exist?

> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> jpg: tweak commit message
> Signed-off-by: John Garry <john.g.garry@oracle.com>
> ---
>  fs/iomap/direct-io.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> index b521eb15759e..3dd883dd77d2 100644
> --- a/fs/iomap/direct-io.c
> +++ b/fs/iomap/direct-io.c
> @@ -306,7 +306,7 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
>  	size_t copied = 0;
>  	size_t orig_count;
>  
> -	if (atomic && length != fs_block_size)
> +	if (atomic && length != iter->len)
>  		return -EINVAL;

Given this is now rejecting IOs that are otherwise well formed from
userspace, this situation needs to have a different errno now. The
userspace application has not provided an invalid set of
IO parameters for this IO - the IO has been rejected because
the previously set up persistent file layout was screwed up by
something in the past.

i.e. This is not an application IO submission error anymore, hence
EINVAL is the wrong error to be returning to userspace here.

-Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2024-12-04 20:35 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-04 15:43 [PATCH 0/4] large atomic writes for xfs John Garry
2024-12-04 15:43 ` [PATCH 1/4] iomap: Lift blocksize restriction on atomic writes John Garry
2024-12-04 20:35   ` Dave Chinner [this message]
2024-12-05  6:30     ` Darrick J. Wong
2024-12-05 11:51       ` John Garry
2024-12-05 10:52     ` John Garry
2024-12-05 21:15       ` Dave Chinner
2024-12-06  9:43         ` John Garry
2024-12-12  1:34         ` Darrick J. Wong
2025-01-14  4:41           ` Dave Chinner
2025-01-14 23:57             ` Darrick J. Wong
2025-01-15  9:30               ` John Garry
2025-01-16  6:52               ` Christoph Hellwig
2025-01-17 18:49                 ` Darrick J. Wong
2025-01-22  6:42                   ` Christoph Hellwig
2025-01-22 10:45                     ` John Garry
2025-01-22 23:51                       ` Dave Chinner
2025-01-23  9:28                         ` John Garry
2025-01-17 10:26               ` John Garry
2025-01-17 18:29                 ` Darrick J. Wong
2025-01-20  8:29                   ` John Garry
2025-01-22 21:05               ` Dave Chinner
2025-01-13 21:35         ` John Garry
2025-01-14  4:43           ` Dave Chinner
2024-12-04 15:43 ` [PATCH 2/4] xfs: Switch atomic write size check in xfs_file_write_iter() John Garry
2024-12-04 15:43 ` [PATCH 3/4] xfs: Add RT atomic write unit max to xfs_mount John Garry
2024-12-04 15:43 ` [PATCH 4/4] xfs: Update xfs_get_atomic_write_attr() for large atomic writes John Garry

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z1C9IfLgB_jDCF18@dread.disaster.area \
    --to=david@fromorbit.com \
    --cc=brauner@kernel.org \
    --cc=cem@kernel.org \
    --cc=dchinner@redhat.com \
    --cc=djwong@kernel.org \
    --cc=hch@lst.de \
    --cc=john.g.garry@oracle.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=ritesh.list@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox