linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] xfs: implement XFS_IOC_DIOINFO in terms of vfs_getattr
@ 2025-08-25 11:15 ` Christoph Hellwig
  2025-08-25 11:32   ` Carlos Maiolino
                     ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Christoph Hellwig @ 2025-08-25 11:15 UTC (permalink / raw)
  To: cem; +Cc: linux-xfs

Use the direct I/O alignment reporting from ->getattr instead of
reimplementing it.  This exposes the relaxation of the memory
alignment in the XFS_IOC_DIOINFO info and ensure the information will
stay in sync.  Note that randholes.c in xfstests has a bug where it
incorrectly fails when the required memory alignment is smaller than the
pointer size.  Round up the reported value as there is a fair chance that
this code got copied into various applications.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---

Changes since v1:
 - update the comment

 fs/xfs/xfs_ioctl.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index e1051a530a50..ff0a8dc74948 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1209,21 +1209,21 @@ xfs_file_ioctl(
 				current->comm);
 		return -ENOTTY;
 	case XFS_IOC_DIOINFO: {
-		struct xfs_buftarg	*target = xfs_inode_buftarg(ip);
+		struct kstat		st;
 		struct dioattr		da;
 
-		da.d_mem = target->bt_logical_sectorsize;
+		error = vfs_getattr(&filp->f_path, &st, STATX_DIOALIGN, 0);
+		if (error)
+			return error;
 
 		/*
-		 * See xfs_report_dioalign() for an explanation about why this
-		 * reports a value larger than the sector size for COW inodes.
+		 * Some userspace directly feeds the return value to
+		 * posix_memalign, which fails for values that are smaller than
+		 * the pointer size.  Round up the value to not break userspace.
 		 */
-		if (xfs_is_cow_inode(ip))
-			da.d_miniosz = xfs_inode_alloc_unitsize(ip);
-		else
-			da.d_miniosz = target->bt_logical_sectorsize;
+		da.d_mem = roundup(st.dio_mem_align, sizeof(void *));
+		da.d_miniosz = st.dio_offset_align;
 		da.d_maxiosz = INT_MAX & ~(da.d_miniosz - 1);
-
 		if (copy_to_user(arg, &da, sizeof(da)))
 			return -EFAULT;
 		return 0;
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] xfs: implement XFS_IOC_DIOINFO in terms of vfs_getattr
  2025-08-25 11:15 ` [PATCH v2] xfs: implement XFS_IOC_DIOINFO in terms of vfs_getattr Christoph Hellwig
@ 2025-08-25 11:32   ` Carlos Maiolino
  2025-08-25 11:33     ` Christoph Hellwig
  2025-08-25 15:29   ` Darrick J. Wong
  2025-08-28 13:19   ` Carlos Maiolino
  2 siblings, 1 reply; 8+ messages in thread
From: Carlos Maiolino @ 2025-08-25 11:32 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Mon, Aug 25, 2025 at 01:15:00PM +0200, Christoph Hellwig wrote:
> Use the direct I/O alignment reporting from ->getattr instead of
> reimplementing it.  This exposes the relaxation of the memory
> alignment in the XFS_IOC_DIOINFO info and ensure the information will
> stay in sync.  Note that randholes.c in xfstests has a bug where it
> incorrectly fails when the required memory alignment is smaller than the
> pointer size.  Round up the reported value as there is a fair chance that
> this code got copied into various applications.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> 
> Changes since v1:
>  - update the comment
> 
>  fs/xfs/xfs_ioctl.c | 18 +++++++++---------
>  1 file changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index e1051a530a50..ff0a8dc74948 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -1209,21 +1209,21 @@ xfs_file_ioctl(
>  				current->comm);
>  		return -ENOTTY;
>  	case XFS_IOC_DIOINFO: {
> -		struct xfs_buftarg	*target = xfs_inode_buftarg(ip);
> +		struct kstat		st;
>  		struct dioattr		da;
> 
> -		da.d_mem = target->bt_logical_sectorsize;
> +		error = vfs_getattr(&filp->f_path, &st, STATX_DIOALIGN, 0);
> +		if (error)
> +			return error;
> 
>  		/*
> -		 * See xfs_report_dioalign() for an explanation about why this
> -		 * reports a value larger than the sector size for COW inodes.
> +		 * Some userspace directly feeds the return value to

		Some userspace /tools/ directly... ?

		I could fix this at commit time if this is the only
		change

> +		 * posix_memalign, which fails for values that are smaller than
> +		 * the pointer size.  Round up the value to not break userspace.
>  		 */
> -		if (xfs_is_cow_inode(ip))
> -			da.d_miniosz = xfs_inode_alloc_unitsize(ip);
> -		else
> -			da.d_miniosz = target->bt_logical_sectorsize;
> +		da.d_mem = roundup(st.dio_mem_align, sizeof(void *));
> +		da.d_miniosz = st.dio_offset_align;
>  		da.d_maxiosz = INT_MAX & ~(da.d_miniosz - 1);
> -
>  		if (copy_to_user(arg, &da, sizeof(da)))
>  			return -EFAULT;
>  		return 0;

Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>

> --
> 2.47.2
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] xfs: implement XFS_IOC_DIOINFO in terms of vfs_getattr
  2025-08-25 11:32   ` Carlos Maiolino
@ 2025-08-25 11:33     ` Christoph Hellwig
  0 siblings, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2025-08-25 11:33 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: Christoph Hellwig, linux-xfs

On Mon, Aug 25, 2025 at 01:32:23PM +0200, Carlos Maiolino wrote:
> >  		/*
> > -		 * See xfs_report_dioalign() for an explanation about why this
> > -		 * reports a value larger than the sector size for COW inodes.
> > +		 * Some userspace directly feeds the return value to
> 
> 		Some userspace /tools/ directly... ?

Tools, programs.  Or just userspace :)

I don't really care which way.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] xfs: implement XFS_IOC_DIOINFO in terms of vfs_getattr
  2025-08-25 11:15 ` [PATCH v2] xfs: implement XFS_IOC_DIOINFO in terms of vfs_getattr Christoph Hellwig
  2025-08-25 11:32   ` Carlos Maiolino
@ 2025-08-25 15:29   ` Darrick J. Wong
  2025-08-26 13:14     ` Christoph Hellwig
  2025-08-28 13:19   ` Carlos Maiolino
  2 siblings, 1 reply; 8+ messages in thread
From: Darrick J. Wong @ 2025-08-25 15:29 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: cem, linux-xfs

On Mon, Aug 25, 2025 at 01:15:00PM +0200, Christoph Hellwig wrote:
> Use the direct I/O alignment reporting from ->getattr instead of
> reimplementing it.  This exposes the relaxation of the memory
> alignment in the XFS_IOC_DIOINFO info and ensure the information will
> stay in sync.  Note that randholes.c in xfstests has a bug where it
> incorrectly fails when the required memory alignment is smaller than the
> pointer size.  Round up the reported value as there is a fair chance that
> this code got copied into various applications.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> 
> Changes since v1:
>  - update the comment
> 
>  fs/xfs/xfs_ioctl.c | 18 +++++++++---------
>  1 file changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index e1051a530a50..ff0a8dc74948 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -1209,21 +1209,21 @@ xfs_file_ioctl(
>  				current->comm);
>  		return -ENOTTY;
>  	case XFS_IOC_DIOINFO: {
> -		struct xfs_buftarg	*target = xfs_inode_buftarg(ip);
> +		struct kstat		st;
>  		struct dioattr		da;
>  
> -		da.d_mem = target->bt_logical_sectorsize;
> +		error = vfs_getattr(&filp->f_path, &st, STATX_DIOALIGN, 0);
> +		if (error)
> +			return error;
>  
>  		/*
> -		 * See xfs_report_dioalign() for an explanation about why this
> -		 * reports a value larger than the sector size for COW inodes.
> +		 * Some userspace directly feeds the return value to
> +		 * posix_memalign, which fails for values that are smaller than
> +		 * the pointer size.  Round up the value to not break userspace.

Looks ok, don't care much if you say "userspace" or "userspace
programs".
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

>  		 */
> -		if (xfs_is_cow_inode(ip))
> -			da.d_miniosz = xfs_inode_alloc_unitsize(ip);
> -		else
> -			da.d_miniosz = target->bt_logical_sectorsize;
> +		da.d_mem = roundup(st.dio_mem_align, sizeof(void *));

...though one thing I /do/ wonder is whether this roundup() should be in
the vfs statx code?  Do people need to be able to initiate directio with
buffers that are not aligned even to pointer size?

--D

> +		da.d_miniosz = st.dio_offset_align;
>  		da.d_maxiosz = INT_MAX & ~(da.d_miniosz - 1);
> -
>  		if (copy_to_user(arg, &da, sizeof(da)))
>  			return -EFAULT;
>  		return 0;
> -- 
> 2.47.2
> 
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] xfs: implement XFS_IOC_DIOINFO in terms of vfs_getattr
  2025-08-25 15:29   ` Darrick J. Wong
@ 2025-08-26 13:14     ` Christoph Hellwig
  2025-08-26 16:57       ` Keith Busch
  0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2025-08-26 13:14 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: cem, linux-xfs, Keith Busch

On Mon, Aug 25, 2025 at 08:29:36AM -0700, Darrick J. Wong wrote:
> > -		if (xfs_is_cow_inode(ip))
> > -			da.d_miniosz = xfs_inode_alloc_unitsize(ip);
> > -		else
> > -			da.d_miniosz = target->bt_logical_sectorsize;
> > +		da.d_mem = roundup(st.dio_mem_align, sizeof(void *));
> 
> ...though one thing I /do/ wonder is whether this roundup() should be in
> the vfs statx code?  Do people need to be able to initiate directio with
> buffers that are not aligned even to pointer size?

I've added Keith to Cc who is on a quest to reduce alignment requirement
as much as possible to add some input.  But as the new statx interface
never had it, adding it now seems off.  Also dword (4 byte) alignment
is pretty common in all kinds of storage specifications, so being able
to support this for things running on top of file systems seems like
a good idea in general.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] xfs: implement XFS_IOC_DIOINFO in terms of vfs_getattr
  2025-08-26 13:14     ` Christoph Hellwig
@ 2025-08-26 16:57       ` Keith Busch
  2025-08-27  7:36         ` Christoph Hellwig
  0 siblings, 1 reply; 8+ messages in thread
From: Keith Busch @ 2025-08-26 16:57 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Darrick J. Wong, cem, linux-xfs

On Tue, Aug 26, 2025 at 03:14:47PM +0200, Christoph Hellwig wrote:
> On Mon, Aug 25, 2025 at 08:29:36AM -0700, Darrick J. Wong wrote:
> > > -		if (xfs_is_cow_inode(ip))
> > > -			da.d_miniosz = xfs_inode_alloc_unitsize(ip);
> > > -		else
> > > -			da.d_miniosz = target->bt_logical_sectorsize;
> > > +		da.d_mem = roundup(st.dio_mem_align, sizeof(void *));
> > 
> > ...though one thing I /do/ wonder is whether this roundup() should be in
> > the vfs statx code?  Do people need to be able to initiate directio with
> > buffers that are not aligned even to pointer size?
> 
> I've added Keith to Cc who is on a quest to reduce alignment requirement
> as much as possible to add some input.  But as the new statx interface
> never had it, adding it now seems off.  Also dword (4 byte) alignment
> is pretty common in all kinds of storage specifications, so being able
> to support this for things running on top of file systems seems like
> a good idea in general.

Yeah, dword dma support is so common in part because that's the
granularity of PCIe TLP lengths.

Not sure what to say about this patch right now, but it triggered
a thought: if I can successfully get filesystem and block layers to
tolerate the hardware's minimum alignments, how is user space to know
it's allowed to send IO aligned to it? The existing statx dio fields
just refer to address alignments, but lengths are still assumed to be
block sized.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] xfs: implement XFS_IOC_DIOINFO in terms of vfs_getattr
  2025-08-26 16:57       ` Keith Busch
@ 2025-08-27  7:36         ` Christoph Hellwig
  0 siblings, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2025-08-27  7:36 UTC (permalink / raw)
  To: Keith Busch; +Cc: Christoph Hellwig, Darrick J. Wong, cem, linux-xfs

On Tue, Aug 26, 2025 at 10:57:28AM -0600, Keith Busch wrote:
> Not sure what to say about this patch right now, but it triggered
> a thought: if I can successfully get filesystem and block layers to
> tolerate the hardware's minimum alignments, how is user space to know
> it's allowed to send IO aligned to it? The existing statx dio fields
> just refer to address alignments, but lengths are still assumed to be
> block sized.

Length as in total transfer length?  Nothing in the block layer or
storage interface will allow you to do sub-sector transfers, so this
just won't work.

Length as in segment length?  If we can support that we'll need a new
limit for that.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] xfs: implement XFS_IOC_DIOINFO in terms of vfs_getattr
  2025-08-25 11:15 ` [PATCH v2] xfs: implement XFS_IOC_DIOINFO in terms of vfs_getattr Christoph Hellwig
  2025-08-25 11:32   ` Carlos Maiolino
  2025-08-25 15:29   ` Darrick J. Wong
@ 2025-08-28 13:19   ` Carlos Maiolino
  2 siblings, 0 replies; 8+ messages in thread
From: Carlos Maiolino @ 2025-08-28 13:19 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Mon, 25 Aug 2025 13:15:00 +0200, Christoph Hellwig wrote:
> Use the direct I/O alignment reporting from ->getattr instead of
> reimplementing it.  This exposes the relaxation of the memory
> alignment in the XFS_IOC_DIOINFO info and ensure the information will
> stay in sync.  Note that randholes.c in xfstests has a bug where it
> incorrectly fails when the required memory alignment is smaller than the
> pointer size.  Round up the reported value as there is a fair chance that
> this code got copied into various applications.
> 
> [...]

Applied to for-next, thanks!

[1/1] xfs: implement XFS_IOC_DIOINFO in terms of vfs_getattr
      commit: 851c4c96db001f51bdad1432aa54549c7fe2c63e

Best regards,
-- 
Carlos Maiolino <cem@kernel.org>


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2025-08-28 13:19 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <cG84V92R_rvXt_xDUKDRAZU_E6E69atqXw04uiv_deBLGkFtMFj_XYvumw4sZh6EOFZpn33yItQ55aPJs5hNNw==@protonmail.internalid>
2025-08-25 11:15 ` [PATCH v2] xfs: implement XFS_IOC_DIOINFO in terms of vfs_getattr Christoph Hellwig
2025-08-25 11:32   ` Carlos Maiolino
2025-08-25 11:33     ` Christoph Hellwig
2025-08-25 15:29   ` Darrick J. Wong
2025-08-26 13:14     ` Christoph Hellwig
2025-08-26 16:57       ` Keith Busch
2025-08-27  7:36         ` Christoph Hellwig
2025-08-28 13:19   ` Carlos Maiolino

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).