From: Dave Chinner <david@fromorbit.com>
To: John Garry <john.g.garry@oracle.com>
Cc: djwong@kernel.org, hch@lst.de, viro@zeniv.linux.org.uk,
brauner@kernel.org, jack@suse.cz, chandan.babu@oracle.com,
willy@infradead.org, axboe@kernel.dk, martin.petersen@oracle.com,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
tytso@mit.edu, jbongio@google.com, ojaswin@linux.ibm.com,
ritesh.list@gmail.com, mcgrof@kernel.org, p.raghav@samsung.com,
linux-xfs@vger.kernel.org, catherine.hoang@oracle.com
Subject: Re: [PATCH v3 14/21] iomap: Sub-extent zeroing
Date: Wed, 1 May 2024 11:07:04 +1000
Message-ID: <ZjGVuBi6XeJYo4Ca@dread.disaster.area>
In-Reply-To: <20240429174746.2132161-15-john.g.garry@oracle.com>

On Mon, Apr 29, 2024 at 05:47:39PM +0000, John Garry wrote:
> For FS_XFLAG_FORCEALIGN support, we want to treat any sub-extent IO like
> sub-fsblock DIO, in that we will zero the sub-extent when the mapping is
> unwritten.
>
> This will be important for atomic writes support, in that atomically
> writing over a partially written extent would mean that we would need to
> do the unwritten extent conversion write separately, and the write could
> no longer be atomic.
>
> It is the task of the FS to set iomap.extent_size per iter to indicate
> sub-extent zeroing required.
>
> Signed-off-by: John Garry <john.g.garry@oracle.com>

Shouldn't this be done before the XFS feature is enabled in the
series?

> ---
> fs/iomap/direct-io.c | 17 +++++++++++------
> include/linux/iomap.h | 1 +
> 2 files changed, 12 insertions(+), 6 deletions(-)
>
> diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> index f3b43d223a46..a3ed7cfa95bc 100644
> --- a/fs/iomap/direct-io.c
> +++ b/fs/iomap/direct-io.c
> @@ -277,7 +277,7 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
> {
> const struct iomap *iomap = &iter->iomap;
> struct inode *inode = iter->inode;
> - unsigned int fs_block_size = i_blocksize(inode), pad;
> + unsigned int zeroing_size, pad;
> loff_t length = iomap_length(iter);
> loff_t pos = iter->pos;
> blk_opf_t bio_opf;
> @@ -288,6 +288,11 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
> size_t copied = 0;
> size_t orig_count;
>
> + if (iomap->extent_size)
> + zeroing_size = iomap->extent_size;
> + else
> + zeroing_size = i_blocksize(inode);

Oh, the dissonance!

iomap->extent_size isn't an extent size at all.

The size of the extent the iomap returns is iomap->length. This new
variable is the IO specific "block size" that should be assumed by
the dio code to determine if padding should be done.

IOWs, I think we should add an "io_block_size" field to the iomap,
and every filesystem that supports iomap should set it to the
filesystem block size (i_blocksize(inode)). Then the changes to the
iomap code end up just being:

-	unsigned int fs_block_size = i_blocksize(inode), pad;
+	unsigned int fs_block_size = iomap->io_block_size, pad;

And the patch that introduces that infrastructure change will also
change all the filesystem implementations to unconditionally set
iomap->io_block_size to i_blocksize().

Then, in a separate patch, you can add XFS support for large IO
block sizes when we have either a large rtextsize or extent size
hints set.

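
To make the padding arithmetic concrete, here is a rough userspace
sketch of the head and tail pad calculation with the block size passed
in, as an io_block_size field would allow. The helper names head_pad()
and tail_pad() are made up for illustration and are not iomap
functions. Like the masking in the patch, it assumes the IO block size
is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-ins for the pad computation done in
 * iomap_dio_bio_iter(). Not kernel code.
 */

/* Bytes to zero from the start of the IO block up to the write offset. */
static uint64_t head_pad(uint64_t pos, uint64_t io_block_size)
{
	/* The mask below is only valid for a non-zero power of two. */
	assert(io_block_size && (io_block_size & (io_block_size - 1)) == 0);
	return pos & (io_block_size - 1);
}

/* Bytes to zero from the end of the write to the end of the IO block. */
static uint64_t tail_pad(uint64_t end, uint64_t io_block_size)
{
	uint64_t pad = head_pad(end, io_block_size);

	/* An aligned end needs no tail zeroing. */
	return pad ? io_block_size - pad : 0;
}
```

With a 4096 byte IO block size, a write covering bytes 6144 to 10240
needs 2048 bytes zeroed at each end. With a 16384 byte IO block size
the same write needs 6144 bytes zeroed at each end, which is the
larger zeroing region the patch is after.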
> +
> if ((pos | length) & (bdev_logical_block_size(iomap->bdev) - 1) ||
> !bdev_iter_is_aligned(iomap->bdev, dio->submit.iter))
> return -EINVAL;
> @@ -354,8 +359,8 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
> dio->iocb->ki_flags &= ~IOCB_HIPRI;
>
> if (need_zeroout) {
> - /* zero out from the start of the block to the write offset */
> - pad = pos & (fs_block_size - 1);
> + /* zero out from the start of the region to the write offset */
> + pad = pos & (zeroing_size - 1);
> if (pad)
> iomap_dio_zero(iter, dio, pos - pad, pad);
> }
> @@ -428,10 +433,10 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
> zero_tail:
> if (need_zeroout ||
> ((dio->flags & IOMAP_DIO_WRITE) && pos >= i_size_read(inode))) {
> - /* zero out from the end of the write to the end of the block */
> - pad = pos & (fs_block_size - 1);
> + /* zero out from the end of the write to the end of the region */
> + pad = pos & (zeroing_size - 1);
> if (pad)
> - iomap_dio_zero(iter, dio, pos, fs_block_size - pad);
> + iomap_dio_zero(iter, dio, pos, zeroing_size - pad);
> }
> out:
> /* Undo iter limitation to current extent */
> diff --git a/include/linux/iomap.h b/include/linux/iomap.h
> index 6fc1c858013d..42623b1cdc04 100644
> --- a/include/linux/iomap.h
> +++ b/include/linux/iomap.h
> @@ -97,6 +97,7 @@ struct iomap {
> u64 length; /* length of mapping, bytes */
> u16 type; /* type of mapping */
> u16 flags; /* flags for mapping */
> + unsigned int extent_size;

This needs a descriptive comment. At minimum, it should tell the
reader what units are used for the variable. If it is bytes, then
it needs to be a u64, because XFS can have extent size hints well
beyond 2^32 bytes in length.

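
As a small illustration of the width problem, the sketch below narrows
a 64-bit byte count to 32 bits. uint32_t stands in for "unsigned int",
which is 32 bits on the platforms Linux supports, and the helper names
are hypothetical. The sizes used are arbitrary examples, not actual
XFS limits.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustration only: what happens to a byte count when it is stored
 * in a 32-bit field.
 */
static uint32_t narrow_to_u32(uint64_t bytes)
{
	return (uint32_t)bytes;	/* silently drops the upper 32 bits */
}

/* A byte count survives only if narrowing then widening round-trips. */
static int survives_u32(uint64_t bytes)
{
	return (uint64_t)narrow_to_u32(bytes) == bytes;
}
```

A value of 2^32 + 4096 bytes comes back as 4096, and 2^33 bytes comes
back as 0. In the patch as posted, a zero extent_size would take the
i_blocksize() fallback, so the truncation would go unnoticed.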
-Dave.
--
Dave Chinner
david@fromorbit.com