From: Dave Chinner <david@fromorbit.com>
To: Christoph Hellwig <hch@lst.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
Chandan Babu R <chandan.babu@oracle.com>,
"Darrick J. Wong" <djwong@kernel.org>,
Hongbo Li <lihongbo22@huawei.com>,
Ryusuke Konishi <konishi.ryusuke@gmail.com>,
linux-nilfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-xfs@vger.kernel.org
Subject: Re: [PATCH 3/3] xfs: report the correct read/write dio alignment for reflinked inodes
Date: Thu, 29 Aug 2024 11:13:37 +1000
Message-ID: <Zs/LQftjQ7EC/lGu@dread.disaster.area>
In-Reply-To: <20240828051149.1897291-4-hch@lst.de>
On Wed, Aug 28, 2024 at 08:11:03AM +0300, Christoph Hellwig wrote:
> For I/O to reflinked blocks we always need to write an entire new
> file system block, and the code enforces the file system block alignment
> for the entire file if it has any reflinked blocks.
>
> Use the new STATX_DIO_READ_ALIGN flag to report the asymmetric read
> vs write alignments for reflinked files.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> fs/xfs/xfs_iops.c | 37 +++++++++++++++++++++++++++++--------
> 1 file changed, 29 insertions(+), 8 deletions(-)
>
> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> index 1cdc8034f54d93..de2fc12688dc23 100644
> --- a/fs/xfs/xfs_iops.c
> +++ b/fs/xfs/xfs_iops.c
> @@ -570,6 +570,33 @@ xfs_stat_blksize(
>  	return PAGE_SIZE;
>  }
>  
> +static void
> +xfs_report_dioalign(
> +	struct xfs_inode	*ip,
> +	struct kstat		*stat)
> +{
> +	struct xfs_buftarg	*target = xfs_inode_buftarg(ip);
> +	struct block_device	*bdev = target->bt_bdev;
> +
> +	stat->result_mask |= STATX_DIOALIGN | STATX_DIO_READ_ALIGN;
> +	stat->dio_mem_align = bdev_dma_alignment(bdev) + 1;
> +	stat->dio_read_offset_align = bdev_logical_block_size(bdev);
> +
> +	/*
> +	 * On COW inodes we are forced to always rewrite an entire file system
> +	 * block.
> +	 *
> +	 * Because applications assume they can do sector sized direct writes
> +	 * on XFS we provide an emulation by doing a read-modify-write cycle
> +	 * through the cache, but that is highly inefficient. Thus report the
> +	 * natively supported size here.
> +	 */
> +	if (xfs_is_cow_inode(ip))
> +		stat->dio_offset_align = ip->i_mount->m_sb.sb_blocksize;
> +	else
> +		stat->dio_offset_align = stat->dio_read_offset_align;
It might be worth making it explicitly clear that logical block size
aligned IO for COW operations will still work. I think that's what
you are trying to say, but it took me a while to work out. Perhaps
something like:

	/*
	 * COW operations are inefficient on sub-fsblock aligned
	 * ranges. They need to copy the entire block, so the
	 * minimum IO size we will ever do in this case is a single
	 * filesystem block.
	 *
	 * Even though we support sector sized IO on COW inodes, we
	 * want to help applications avoid the costly RMW cycle it
	 * requires for COW inodes. Hence report the native
	 * filesystem allocation unit size here to indicate the
	 * smallest alignment that will avoid RMW cycles in the DIO
	 * write path.
	 */
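
To illustrate the distinction from the application side, here is a
rough userspace sketch. It is not part of the patch, and the helper
names dio_write_is_aligned() and dio_round_out() are made up for this
example. It assumes the caller has already obtained the reported write
alignment (dio_offset_align) from statx and that the value is a
non-zero power of two. A write that fails the check would still work
on a COW inode, it would just take the slow RMW path:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if the direct write [offset, offset + len)
 * starts and ends on a multiple of align. align is assumed to be the
 * reported dio_offset_align, i.e. a non-zero power of two.
 */
static inline bool
dio_write_is_aligned(uint64_t offset, uint64_t len, uint32_t align)
{
	uint64_t mask = (uint64_t)align - 1;

	assert(align != 0 && (align & (align - 1)) == 0);
	return ((offset | len) & mask) == 0;
}

/*
 * Hypothetical helper: grow [*offset, *offset + *len) outwards to the
 * nearest align boundaries, e.g. so the application can do its own
 * buffering of whole filesystem blocks instead of issuing sub-block
 * writes.
 */
static inline void
dio_round_out(uint64_t *offset, uint64_t *len, uint32_t align)
{
	uint64_t mask = (uint64_t)align - 1;
	uint64_t start, end;

	assert(align != 0 && (align & (align - 1)) == 0);
	start = *offset & ~mask;
	end = (*offset + *len + mask) & ~mask;
	*offset = start;
	*len = end - start;
}
```

With a 4096 byte filesystem block and 512 byte sectors, a 512 byte
write at offset 512 passes the check against the read alignment (512)
but not against the write alignment reported for a COW inode (4096),
and rounding it out gives the single block range [0, 4096).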
-Dave.
--
Dave Chinner
david@fromorbit.com