From: Dave Chinner <david@fromorbit.com>
To: Christoph Hellwig <hch@lst.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
Chandan Babu R <chandan.babu@oracle.com>,
"Darrick J. Wong" <djwong@kernel.org>,
Hongbo Li <lihongbo22@huawei.com>,
Ryusuke Konishi <konishi.ryusuke@gmail.com>,
linux-nilfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-xfs@vger.kernel.org
Subject: Re: [PATCH 3/3] xfs: report the correct read/write dio alignment for reflinked inodes
Date: Thu, 29 Aug 2024 11:13:37 +1000
Message-ID: <Zs/LQftjQ7EC/lGu@dread.disaster.area>
In-Reply-To: <20240828051149.1897291-4-hch@lst.de>

On Wed, Aug 28, 2024 at 08:11:03AM +0300, Christoph Hellwig wrote:
> For I/O to reflinked blocks we always need to write an entire new
> file system block, and the code enforces the file system block alignment
> for the entire file if it has any reflinked blocks.
>
> Use the new STATX_DIO_READ_ALIGN flag to report the asymmetric read
> vs write alignments for reflinked files.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> fs/xfs/xfs_iops.c | 37 +++++++++++++++++++++++++++++--------
> 1 file changed, 29 insertions(+), 8 deletions(-)
>
> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> index 1cdc8034f54d93..de2fc12688dc23 100644
> --- a/fs/xfs/xfs_iops.c
> +++ b/fs/xfs/xfs_iops.c
> @@ -570,6 +570,33 @@ xfs_stat_blksize(
>  	return PAGE_SIZE;
>  }
>
> +static void
> +xfs_report_dioalign(
> +	struct xfs_inode *ip,
> +	struct kstat *stat)
> +{
> +	struct xfs_buftarg *target = xfs_inode_buftarg(ip);
> +	struct block_device *bdev = target->bt_bdev;
> +
> +	stat->result_mask |= STATX_DIOALIGN | STATX_DIO_READ_ALIGN;
> +	stat->dio_mem_align = bdev_dma_alignment(bdev) + 1;
> +	stat->dio_read_offset_align = bdev_logical_block_size(bdev);
> +
> +	/*
> +	 * On COW inodes we are forced to always rewrite an entire file system
> +	 * block.
> +	 *
> +	 * Because applications assume they can do sector sized direct writes
> +	 * on XFS we provide an emulation by doing a read-modify-write cycle
> +	 * through the cache, but that is highly inefficient. Thus report the
> +	 * natively supported size here.
> +	 */
> +	if (xfs_is_cow_inode(ip))
> +		stat->dio_offset_align = ip->i_mount->m_sb.sb_blocksize;
> +	else
> +		stat->dio_offset_align = stat->dio_read_offset_align;

It might be worth making it explicitly clear that logical block size
aligned IO for COW operations will still work. I think that's what
you are trying to say, but it took me a while to work out. Perhaps
something like:

	/*
	 * COW operations are inefficient on sub-fsblock aligned
	 * ranges. They need to copy the entire block, so the
	 * minimum IO size we will ever do in this case is a single
	 * filesystem block.
	 *
	 * Even though we support sector sized IO on COW inodes, we
	 * want to help applications avoid the costly RMW cycle it
	 * requires for COW inodes. Hence report the native
	 * filesystem allocation unit size here to indicate the
	 * smallest alignment that will avoid RMW cycles in the DIO
	 * write path.
	 */
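
To make the distinction concrete, here is a minimal userspace sketch (not
part of the patch) of how an application might use the reported write
alignment. The helper names are hypothetical, and `align` simply stands in
for whatever value statx returns in dio_offset_align. A write that fails the
check is still expected to work on a COW inode; it just takes the slower RMW
path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if x is a non-zero power of two. */
static bool is_pow2(uint64_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/*
 * Hypothetical helper: does a direct write of len bytes at file offset
 * off start and end on align-sized boundaries?  align is assumed to be
 * the dio_offset_align value reported by statx.
 */
static bool dio_write_is_aligned(uint64_t off, uint64_t len, uint64_t align)
{
	assert(is_pow2(align));
	return len != 0 && ((off | len) & (align - 1)) == 0;
}

/*
 * Hypothetical helper: round the byte range [off, off + len) outwards to
 * align boundaries, giving the range an application would have to write
 * itself to stay on the aligned path.
 */
static void dio_round_out(uint64_t off, uint64_t len, uint64_t align,
			  uint64_t *start, uint64_t *end)
{
	assert(is_pow2(align));
	*start = off & ~(align - 1);
	*end = (off + len + align - 1) & ~(align - 1);
}
```

For example, assuming a 512 byte logical block size and a 4096 byte
filesystem block size, a 512 byte write at offset 512 passes the check
against 512 but not against 4096, and rounding it out to 4096 gives the
range [0, 4096).
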
-Dave.
--
Dave Chinner
david@fromorbit.com