All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: Eric Biggers <ebiggers@kernel.org>,
	Christoph Hellwig <hch@infradead.org>,
	linux-fscrypt@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-ext4@vger.kernel.org,
	linux-f2fs-devel@lists.sourceforge.net,
	linux-xfs@vger.kernel.org, Theodore Ts'o <tytso@mit.edu>,
	Jaegeuk Kim <jaegeuk@kernel.org>, Chao Yu <chao@kernel.org>
Subject: Re: [PATCH v10 0/5] add support for direct I/O with fscrypt using blk-crypto
Date: Thu, 20 Jan 2022 18:36:03 -0800	[thread overview]
Message-ID: <20220121023603.GH13563@magnolia> (raw)
In-Reply-To: <20220120235755.GI59729@dread.disaster.area>

On Fri, Jan 21, 2022 at 10:57:55AM +1100, Dave Chinner wrote:
> On Thu, Jan 20, 2022 at 02:48:52PM -0800, Eric Biggers wrote:
> > On Fri, Jan 21, 2022 at 09:04:14AM +1100, Dave Chinner wrote:
> > > On Thu, Jan 20, 2022 at 01:00:27PM -0800, Darrick J. Wong wrote:
> > > > On Thu, Jan 20, 2022 at 12:39:14PM -0800, Eric Biggers wrote:
> > > > > On Thu, Jan 20, 2022 at 09:10:27AM -0800, Darrick J. Wong wrote:
> > > > > > On Thu, Jan 20, 2022 at 12:30:23AM -0800, Christoph Hellwig wrote:
> > > > > > > On Wed, Jan 19, 2022 at 11:12:10PM -0800, Eric Biggers wrote:
> > > > > > > > 
> > > > > > > > Given the above, as far as I know the only remaining objection to this
> > > > > > > > patchset would be that DIO constraints aren't sufficiently discoverable
> > > > > > > > by userspace.  Now, to put this in context, this is a longstanding issue
> > > > > > > > with all Linux filesystems, except XFS which has XFS_IOC_DIOINFO.  It's
> > > > > > > > not specific to this feature, and it doesn't actually seem to be too
> > > > > > > > important in practice; many other filesystem features place constraints
> > > > > > > > on DIO, and f2fs even *only* allows fully FS block size aligned DIO.
> > > > > > > > (And for better or worse, many systems using fscrypt already have
> > > > > > > > out-of-tree patches that enable DIO support, and people don't seem to
> > > > > > > > have trouble with the FS block size alignment requirement.)
> > > > > > > 
> > > > > > > It might make sense to use this as an opportunity to implement
> > > > > > > XFS_IOC_DIOINFO for ext4 and f2fs.
> > > > > > 
> > > > > > Hmm.  A potential problem with DIOINFO is that it doesn't explicitly
> > > > > > list the /file/ position alignment requirement:
> > > > > > 
> > > > > > struct dioattr {
> > > > > > 	__u32		d_mem;		/* data buffer memory alignment */
> > > > > > 	__u32		d_miniosz;	/* min xfer size		*/
> > > > > > 	__u32		d_maxiosz;	/* max xfer size		*/
> > > > > > };
> > > > > 
> > > > > Well, the comment above struct dioattr says:
> > > > > 
> > > > > 	/*
> > > > > 	 * Direct I/O attribute record used with XFS_IOC_DIOINFO
> > > > > 	 * d_miniosz is the min xfer size, xfer size multiple and file seek offset
> > > > > 	 * alignment.
> > > > > 	 */
> > > > > 
> > > > > So d_miniosz serves that purpose already.
> > > > > 
> > > > > > 
> > > > > > Since I /think/ fscrypt requires that directio writes be aligned to file
> > > > > > block size, right?
> > > > > 
> > > > > The file position must be a multiple of the filesystem block size, yes.
> > > > > Likewise for the "minimum xfer size" and "xfer size multiple", and the "data
> > > > > buffer memory alignment" for that matter.  So I think XFS_IOC_DIOINFO would be
> > > > > good enough for the fscrypt direct I/O case.
> > > > 
> > > > Oh, ok then.  In that case, just hoist XFS_IOC_DIOINFO to the VFS and
> > > > add a couple of implementations for ext4 and f2fs, and I think that'll
> > > > be enough to get the fscrypt patchset moving again.
> > > 
> > > On the contrary, I'd much prefer to see this information added to
> > > statx(). The file offset alignment info is a property of the current
> > > file (e.g. XFS can have different per-file requirements depending on
> > > whether the file data is hosted on the data or RT device, etc) and
> > > so it's not a fixed property of the filesystem.
> > > 
> > > statx() was designed to be extended with per-file property
> > > information, and we already have stuff like filesystem block size in
> > > that syscall. Hence I would much prefer that we extend it with the
> > > DIO properties we need to support rather than "create" a new VFS
> > > ioctl to extract this information. We already have statx(), so let's
> > > use it for what it was intended for.

Eh, ok.  Let's do that instead.

> > > 
> > 
> > I assumed that XFS_IOC_DIOINFO *was* per-file.  XFS's *implementation* of it
> > looks at the filesystem only,
> 
> You've got that wrong.
> 
>         case XFS_IOC_DIOINFO: {
> >>>>>>          struct xfs_buftarg      *target = xfs_inode_buftarg(ip);
>                 struct dioattr          da;
> 
>                 da.d_mem =  da.d_miniosz = target->bt_logical_sectorsize;
> 
> xfs_inode_buftarg() is determining which block device the inode is
> storing it's data on, so the returned dioattr values can be
> different for different inodes in the filesystem...
> 
> It's always been that way since the early Irix days - XFS RT devices
> could have very different IO constraints than the data device and
> DIO had to conform to the hardware limits underlying the filesystem.
> Hence the dioattr information has -always- been per-inode
> information.
> 
> > (Per-file state is required for encrypted
> > files.  It's also required for other filesystem features; e.g., files that use
> > compression or fs-verity don't support direct I/O at all.)
> 
> Which is exactly why is should be a property of statx(), rather than
> try to re-use a ~30 year old filesystem specific API from a
> different OS that was never intended to indicate things like "DIO
> not supported on this file at all"....

Heh.  You mean like ALLOCSP?  Ok ok point taken.

> We've been bitten many times by this "lift a rarely used filesystem
> specific ioctl to the VFS because it exists" method of API
> promotion. It almost always ends up in us discovering further down
> the track that there's something wrong with the API, it doesn't
> quite do what we need, we have to extend it anyway, or it's just
> plain borken, etc. And then we have to create a new, fit for purpose
> API anyway, and there's two VFS APIs we have to maintain forever
> instead of just one...
> 
> Can we learn from past mistakes this time instead of repeating them
> yet again?

Sure.  How's this?  I couldn't think of a real case of directio
requiring different alignments for pos and bytecount, so the only real
addition here is the alignment requirements for best performance.

struct statx {
...
	/* 0x90 */
	__u64	stx_mnt_id;

	/* Memory buffer alignment required for directio, in bytes. */
	__u32	stx_dio_mem_align;

	/* File range alignment required for directio, in bytes. */
	__u32	stx_dio_fpos_align_min;

	/* 0xa0 */

	/* File range alignment needed for best performance, in bytes. */
	__u32	stx_dio_fpos_align_opt;

	/* Maximum size of a directio request, in bytes. */
	__u32	stx_dio_max_iosize;

	__u64	__spare3[11];	/* Spare space for future expansion */
	/* 0x100 */
};

Along with:

#define STATX_DIRECTIO	0x00001000U	/* Want/got directio geometry */

How about that?

--D

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

WARNING: multiple messages have this Message-ID (diff)
From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@infradead.org>,
	Theodore Ts'o <tytso@mit.edu>,
	linux-f2fs-devel@lists.sourceforge.net,
	Eric Biggers <ebiggers@kernel.org>,
	linux-fscrypt@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Jaegeuk Kim <jaegeuk@kernel.org>,
	linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org
Subject: Re: [f2fs-dev] [PATCH v10 0/5] add support for direct I/O with fscrypt using blk-crypto
Date: Thu, 20 Jan 2022 18:36:03 -0800	[thread overview]
Message-ID: <20220121023603.GH13563@magnolia> (raw)
In-Reply-To: <20220120235755.GI59729@dread.disaster.area>

On Fri, Jan 21, 2022 at 10:57:55AM +1100, Dave Chinner wrote:
> On Thu, Jan 20, 2022 at 02:48:52PM -0800, Eric Biggers wrote:
> > On Fri, Jan 21, 2022 at 09:04:14AM +1100, Dave Chinner wrote:
> > > On Thu, Jan 20, 2022 at 01:00:27PM -0800, Darrick J. Wong wrote:
> > > > On Thu, Jan 20, 2022 at 12:39:14PM -0800, Eric Biggers wrote:
> > > > > On Thu, Jan 20, 2022 at 09:10:27AM -0800, Darrick J. Wong wrote:
> > > > > > On Thu, Jan 20, 2022 at 12:30:23AM -0800, Christoph Hellwig wrote:
> > > > > > > On Wed, Jan 19, 2022 at 11:12:10PM -0800, Eric Biggers wrote:
> > > > > > > > 
> > > > > > > > Given the above, as far as I know the only remaining objection to this
> > > > > > > > patchset would be that DIO constraints aren't sufficiently discoverable
> > > > > > > > by userspace.  Now, to put this in context, this is a longstanding issue
> > > > > > > > with all Linux filesystems, except XFS which has XFS_IOC_DIOINFO.  It's
> > > > > > > > not specific to this feature, and it doesn't actually seem to be too
> > > > > > > > important in practice; many other filesystem features place constraints
> > > > > > > > on DIO, and f2fs even *only* allows fully FS block size aligned DIO.
> > > > > > > > (And for better or worse, many systems using fscrypt already have
> > > > > > > > out-of-tree patches that enable DIO support, and people don't seem to
> > > > > > > > have trouble with the FS block size alignment requirement.)
> > > > > > > 
> > > > > > > It might make sense to use this as an opportunity to implement
> > > > > > > XFS_IOC_DIOINFO for ext4 and f2fs.
> > > > > > 
> > > > > > Hmm.  A potential problem with DIOINFO is that it doesn't explicitly
> > > > > > list the /file/ position alignment requirement:
> > > > > > 
> > > > > > struct dioattr {
> > > > > > 	__u32		d_mem;		/* data buffer memory alignment */
> > > > > > 	__u32		d_miniosz;	/* min xfer size		*/
> > > > > > 	__u32		d_maxiosz;	/* max xfer size		*/
> > > > > > };
> > > > > 
> > > > > Well, the comment above struct dioattr says:
> > > > > 
> > > > > 	/*
> > > > > 	 * Direct I/O attribute record used with XFS_IOC_DIOINFO
> > > > > 	 * d_miniosz is the min xfer size, xfer size multiple and file seek offset
> > > > > 	 * alignment.
> > > > > 	 */
> > > > > 
> > > > > So d_miniosz serves that purpose already.
> > > > > 
> > > > > > 
> > > > > > Since I /think/ fscrypt requires that directio writes be aligned to file
> > > > > > block size, right?
> > > > > 
> > > > > The file position must be a multiple of the filesystem block size, yes.
> > > > > Likewise for the "minimum xfer size" and "xfer size multiple", and the "data
> > > > > buffer memory alignment" for that matter.  So I think XFS_IOC_DIOINFO would be
> > > > > good enough for the fscrypt direct I/O case.
> > > > 
> > > > Oh, ok then.  In that case, just hoist XFS_IOC_DIOINFO to the VFS and
> > > > add a couple of implementations for ext4 and f2fs, and I think that'll
> > > > be enough to get the fscrypt patchset moving again.
> > > 
> > > On the contrary, I'd much prefer to see this information added to
> > > statx(). The file offset alignment info is a property of the current
> > > file (e.g. XFS can have different per-file requirements depending on
> > > whether the file data is hosted on the data or RT device, etc) and
> > > so it's not a fixed property of the filesystem.
> > > 
> > > statx() was designed to be extended with per-file property
> > > information, and we already have stuff like filesystem block size in
> > > that syscall. Hence I would much prefer that we extend it with the
> > > DIO properties we need to support rather than "create" a new VFS
> > > ioctl to extract this information. We already have statx(), so let's
> > > use it for what it was intended for.

Eh, ok.  Let's do that instead.

> > > 
> > 
> > I assumed that XFS_IOC_DIOINFO *was* per-file.  XFS's *implementation* of it
> > looks at the filesystem only,
> 
> You've got that wrong.
> 
>         case XFS_IOC_DIOINFO: {
> >>>>>>          struct xfs_buftarg      *target = xfs_inode_buftarg(ip);
>                 struct dioattr          da;
> 
>                 da.d_mem =  da.d_miniosz = target->bt_logical_sectorsize;
> 
> xfs_inode_buftarg() is determining which block device the inode is
> storing it's data on, so the returned dioattr values can be
> different for different inodes in the filesystem...
> 
> It's always been that way since the early Irix days - XFS RT devices
> could have very different IO constraints than the data device and
> DIO had to conform to the hardware limits underlying the filesystem.
> Hence the dioattr information has -always- been per-inode
> information.
> 
> > (Per-file state is required for encrypted
> > files.  It's also required for other filesystem features; e.g., files that use
> > compression or fs-verity don't support direct I/O at all.)
> 
> Which is exactly why is should be a property of statx(), rather than
> try to re-use a ~30 year old filesystem specific API from a
> different OS that was never intended to indicate things like "DIO
> not supported on this file at all"....

Heh.  You mean like ALLOCSP?  Ok ok point taken.

> We've been bitten many times by this "lift a rarely used filesystem
> specific ioctl to the VFS because it exists" method of API
> promotion. It almost always ends up in us discovering further down
> the track that there's something wrong with the API, it doesn't
> quite do what we need, we have to extend it anyway, or it's just
> plain borken, etc. And then we have to create a new, fit for purpose
> API anyway, and there's two VFS APIs we have to maintain forever
> instead of just one...
> 
> Can we learn from past mistakes this time instead of repeating them
> yet again?

Sure.  How's this?  I couldn't think of a real case of directio
requiring different alignments for pos and bytecount, so the only real
addition here is the alignment requirements for best performance.

struct statx {
...
	/* 0x90 */
	__u64	stx_mnt_id;

	/* Memory buffer alignment required for directio, in bytes. */
	__u32	stx_dio_mem_align;

	/* File range alignment required for directio, in bytes. */
	__u32	stx_dio_fpos_align_min;

	/* 0xa0 */

	/* File range alignment needed for best performance, in bytes. */
	__u32	stx_dio_fpos_align_opt;

	/* Maximum size of a directio request, in bytes. */
	__u32	stx_dio_max_iosize;

	__u64	__spare3[11];	/* Spare space for future expansion */
	/* 0x100 */
};

Along with:

#define STATX_DIRECTIO	0x00001000U	/* Want/got directio geometry */

How about that?

--D

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

  reply	other threads:[~2022-01-21  2:36 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-20  7:12 [PATCH v10 0/5] add support for direct I/O with fscrypt using blk-crypto Eric Biggers
2022-01-20  7:12 ` [f2fs-dev] " Eric Biggers
2022-01-20  7:12 ` [PATCH v10 1/5] fscrypt: add functions for direct I/O support Eric Biggers
2022-01-20  7:12   ` [f2fs-dev] " Eric Biggers
2022-01-20  8:27   ` Christoph Hellwig
2022-01-20  8:27     ` [f2fs-dev] " Christoph Hellwig
2022-01-20  9:04     ` Eric Biggers
2022-01-20  9:04       ` [f2fs-dev] " Eric Biggers
2022-01-21  7:10       ` Christoph Hellwig
2022-01-21  7:10         ` [f2fs-dev] " Christoph Hellwig
2022-01-20  7:12 ` [PATCH v10 2/5] iomap: support direct I/O with fscrypt using blk-crypto Eric Biggers
2022-01-20  7:12   ` [f2fs-dev] " Eric Biggers
2022-01-20  8:28   ` Christoph Hellwig
2022-01-20  8:28     ` [f2fs-dev] " Christoph Hellwig
2022-01-20  7:12 ` [PATCH v10 3/5] ext4: " Eric Biggers
2022-01-20  7:12   ` [f2fs-dev] " Eric Biggers
2022-01-20  7:12 ` [PATCH v10 4/5] f2fs: " Eric Biggers
2022-01-20  7:12   ` [f2fs-dev] " Eric Biggers
2022-01-20  7:12 ` [PATCH v10 5/5] fscrypt: update documentation for direct I/O support Eric Biggers
2022-01-20  7:12   ` [f2fs-dev] " Eric Biggers
2022-01-20  8:30 ` [PATCH v10 0/5] add support for direct I/O with fscrypt using blk-crypto Christoph Hellwig
2022-01-20  8:30   ` [f2fs-dev] " Christoph Hellwig
2022-01-20 17:10   ` Darrick J. Wong
2022-01-20 17:10     ` [f2fs-dev] " Darrick J. Wong
2022-01-20 20:39     ` Eric Biggers
2022-01-20 20:39       ` [f2fs-dev] " Eric Biggers
2022-01-20 21:00       ` Darrick J. Wong
2022-01-20 21:00         ` [f2fs-dev] " Darrick J. Wong
2022-01-20 22:04         ` Dave Chinner
2022-01-20 22:04           ` [f2fs-dev] " Dave Chinner
2022-01-20 22:48           ` Eric Biggers
2022-01-20 22:48             ` [f2fs-dev] " Eric Biggers
2022-01-20 23:57             ` Dave Chinner
2022-01-20 23:57               ` [f2fs-dev] " Dave Chinner
2022-01-21  2:36               ` Darrick J. Wong [this message]
2022-01-21  2:36                 ` Darrick J. Wong
2022-01-21  7:12                 ` Christoph Hellwig
2022-01-21  7:12                   ` [f2fs-dev] " Christoph Hellwig
2022-01-23 23:03                 ` Dave Chinner
2022-01-23 23:03                   ` [f2fs-dev] " Dave Chinner
2022-02-09  1:10                   ` Eric Biggers
2022-02-09  1:10                     ` [f2fs-dev] " Eric Biggers
2022-02-10  4:03                     ` Dave Chinner
2022-02-10  4:03                       ` [f2fs-dev] " Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220121023603.GH13563@magnolia \
    --to=djwong@kernel.org \
    --cc=chao@kernel.org \
    --cc=david@fromorbit.com \
    --cc=ebiggers@kernel.org \
    --cc=hch@infradead.org \
    --cc=jaegeuk@kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-f2fs-devel@lists.sourceforge.net \
    --cc=linux-fscrypt@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.