From: Eric Biggers <ebiggers@kernel.org>
To: Christoph Hellwig <hch@lst.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
Chandan Babu R <chandan.babu@oracle.com>,
"Darrick J. Wong" <djwong@kernel.org>,
Hongbo Li <lihongbo22@huawei.com>,
Ryusuke Konishi <konishi.ryusuke@gmail.com>,
linux-nilfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-xfs@vger.kernel.org
Subject: Re: [PATCH 2/3] fs: add STATX_DIO_READ_ALIGN
Date: Wed, 28 Aug 2024 23:52:27 +0000 [thread overview]
Message-ID: <20240828235227.GB558903@google.com> (raw)
In-Reply-To: <20240828051149.1897291-3-hch@lst.de>
On Wed, Aug 28, 2024 at 08:11:02AM +0300, Christoph Hellwig wrote:
> Add a separate dio read align field, as many out of place write
> file systems can easily do reads aligned to the device sector size,
> but require bigger alignment for writes.
>
> This is usually papered over by falling back to buffered I/O for smaller
> writes and doing read-modify-write cycles, but performance for this
> sucks, so applications benefit from knowing the actual write alignment.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> fs/stat.c | 1 +
> include/linux/stat.h | 1 +
> include/uapi/linux/stat.h | 4 +++-
> 3 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/fs/stat.c b/fs/stat.c
> index 89ce1be563108c..044e7ad9f3abc2 100644
> --- a/fs/stat.c
> +++ b/fs/stat.c
> @@ -700,6 +700,7 @@ cp_statx(const struct kstat *stat, struct statx __user *buffer)
> tmp.stx_mnt_id = stat->mnt_id;
> tmp.stx_dio_mem_align = stat->dio_mem_align;
> tmp.stx_dio_offset_align = stat->dio_offset_align;
> + tmp.stx_dio_read_offset_align = stat->dio_read_offset_align;
> tmp.stx_subvol = stat->subvol;
> tmp.stx_atomic_write_unit_min = stat->atomic_write_unit_min;
> tmp.stx_atomic_write_unit_max = stat->atomic_write_unit_max;
> diff --git a/include/linux/stat.h b/include/linux/stat.h
> index 3d900c86981c5b..9d8382e23a9c69 100644
> --- a/include/linux/stat.h
> +++ b/include/linux/stat.h
> @@ -52,6 +52,7 @@ struct kstat {
> u64 mnt_id;
> u32 dio_mem_align;
> u32 dio_offset_align;
> + u32 dio_read_offset_align;
> u64 change_cookie;
> u64 subvol;
> u32 atomic_write_unit_min;
> diff --git a/include/uapi/linux/stat.h b/include/uapi/linux/stat.h
> index 8b35d7d511a287..f78ee3670dd5d7 100644
> --- a/include/uapi/linux/stat.h
> +++ b/include/uapi/linux/stat.h
> @@ -179,7 +179,8 @@ struct statx {
> /* Max atomic write segment count */
> __u32 stx_atomic_write_segments_max;
>
> - __u32 __spare1[1];
> + /* File offset alignment for direct I/O reads */
> + __u32 stx_dio_read_offset_align;
>
> /* 0xb8 */
> __u64 __spare3[9]; /* Spare space for future expansion */
> @@ -213,6 +214,7 @@ struct statx {
> #define STATX_MNT_ID_UNIQUE 0x00004000U /* Want/got extended stx_mount_id */
> #define STATX_SUBVOL 0x00008000U /* Want/got stx_subvol */
> #define STATX_WRITE_ATOMIC 0x00010000U /* Want/got atomic_write_* fields */
> +#define STATX_DIO_READ_ALIGN 0x00020000U /* Want/got dio read alignment info */
>
> #define STATX__RESERVED 0x80000000U /* Reserved for future struct statx expansion */
Thanks. We maybe should have included read/write separation in STATX_DIOALIGN,
but at the time the only case that was brought up was "DIO reads are supported
but DIO writes are not" which people had argued was not useful.
Is this patch meant to support that case, or just the case where DIO in both
directions is supported but with different alignments? Is that different file
offset alignments, different memory alignments, or both? This patch doesn't add
a stx_dio_read_mem_align field, so it's still assumed that both directions share
the existing stx_dio_mem_align property, including whether DIO is supported at
all (0 vs. nonzero). So as proposed, the only case it helps with is where DIO
in both directions is supported with the same memory alignment but different
file offset alignments. Maybe that is intended, but it's not clear to me.
Are there specific userspace applications that would like to take advantage of a
smaller value of stx_dio_read_offset_align compared to the existing
stx_dio_offset_align? Even in the case of a discrepancy between reads and
writes, applications are supposed to be able to just use stx_dio_offset_align
since it has to be the greater of the two alignments.
And as always, new UAPIs need to be accompanied by documentation. In this case
that's an update to the statx man page.
- Eric
next prev parent reply other threads:[~2024-08-28 23:52 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-28 5:11 RFC: add STATX_DIO_READ_ALIGN Christoph Hellwig
2024-08-28 5:11 ` [PATCH 1/3] fs: reformat the statx definition Christoph Hellwig
2024-08-28 16:20 ` Darrick J. Wong
2024-08-28 5:11 ` [PATCH 2/3] fs: add STATX_DIO_READ_ALIGN Christoph Hellwig
2024-08-28 16:24 ` Darrick J. Wong
2024-08-28 23:52 ` Eric Biggers [this message]
2024-08-29 3:44 ` Christoph Hellwig
2024-08-28 5:11 ` [PATCH 3/3] xfs: report the correct read/write dio alignment for reflinked inodes Christoph Hellwig
2024-08-28 16:23 ` Darrick J. Wong
2024-08-29 1:13 ` Dave Chinner
2024-08-28 13:43 ` RFC: add STATX_DIO_READ_ALIGN Christian Brauner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240828235227.GB558903@google.com \
--to=ebiggers@kernel.org \
--cc=brauner@kernel.org \
--cc=chandan.babu@oracle.com \
--cc=djwong@kernel.org \
--cc=hch@lst.de \
--cc=jack@suse.cz \
--cc=konishi.ryusuke@gmail.com \
--cc=lihongbo22@huawei.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-nilfs@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).