public inbox for linux-fsdevel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/4] ext4: Add atomic writes support for DIO
@ 2024-10-30 15:57 Ritesh Harjani (IBM)
  2024-10-30 15:57 ` [PATCH v3 1/4] ext4: Add statx support for atomic writes Ritesh Harjani (IBM)
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Ritesh Harjani (IBM) @ 2024-10-30 15:57 UTC (permalink / raw)
  To: linux-ext4
  Cc: Theodore Ts'o, Jan Kara, Darrick J . Wong, Christoph Hellwig,
	John Garry, Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
	linux-fsdevel, Ritesh Harjani (IBM)

v2 -> v3:
==========
1. Patch-1 adds an "experimental" string in dmesg log during mount when EXT4
   detects that it is capable of doing DIO atomic writes on a given device
   with min and max unit details.
2. Patch-4 has been updated to avoid returning -ENOTBLK (in ext4_iomap_end)
   if the request belongs to atomic write. This patch also adds a WARN_ON_ONCE()
   if atomic write ever fallback to buffered-io (to catch any unwanted bugs in the future).
   More details in the commit log of patch-4.
3. Collected RBs tag from John for Patch 2 & 3.

[v2]: https://lore.kernel.org/linux-ext4/cover.1729944406.git.ritesh.list@gmail.com/

Previous cover letter log:

In v2, we had split the series and this one only takes care of
atomic writes for single fsblock.
That means for now this gets only enabled on bs < ps systems on ext4.
Enablement of atomic writes for bigalloc (multi-fsblock support) is still
under discussion and may require general consensus within the filesystem
community [1].

This series adds the base feature support to enable atomic writes in
direct-io path for ext4. We advertise the minimum and the maximum atomic
write unit sizes via statx on a regular file.

This series allows users to utilize atomic write support using -
1. on bs < ps systems via - mkfs.ext4 -F -b 16384 /dev/sda

This can then be utilized using -
	xfs_io -fdc "pwrite -V 1 -A -b16k 0 16k" /mnt/f1

This is built on top of John's DIO atomic write series for XFS [2].
The VFS and block layer enablement for atomic writes were merged already.


[1]: https://lore.kernel.org/linux-ext4/87jzdvmqfz.fsf@gmail.com
[2]: https://lore.kernel.org/linux-xfs/20241019125113.369994-1-john.g.garry@oracle.com/


Changelogs:
===========
PATCH -> PATCH v2:
- addressed review comments from John and Darrick.
- renamed ext4_sb_info variables names: fs_awu* -> s_awu*
- [PATCH]: https://lore.kernel.org/linux-ext4/cover.1729825985.git.ritesh.list@gmail.com/

RFC -> PATCH:
- Dropped RFC tag
- Last RFC was posted a while ago but back then a lot of VFS and block layer
  interfaces were still not merged. Those are now merged, thanks to John and
  everyone else.
- [RFC] - https://lore.kernel.org/linux-ext4/cover.1709356594.git.ritesh.list@gmail.com/



Ritesh Harjani (IBM) (4):
  ext4: Add statx support for atomic writes
  ext4: Check for atomic writes support in write iter
  ext4: Support setting FMODE_CAN_ATOMIC_WRITE
  ext4: Do not fallback to buffered-io for DIO atomic write

 fs/ext4/ext4.h  |  9 +++++++++
 fs/ext4/file.c  | 24 ++++++++++++++++++++++++
 fs/ext4/inode.c | 28 +++++++++++++++++++++++-----
 fs/ext4/super.c | 31 +++++++++++++++++++++++++++++++
 4 files changed, 87 insertions(+), 5 deletions(-)

--
2.46.0


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v3 1/4] ext4: Add statx support for atomic writes
  2024-10-30 15:57 [PATCH v3 0/4] ext4: Add atomic writes support for DIO Ritesh Harjani (IBM)
@ 2024-10-30 15:57 ` Ritesh Harjani (IBM)
  2024-10-31 21:42   ` Darrick J. Wong
  2024-10-30 15:57 ` [PATCH v3 2/4] ext4: Check for atomic writes support in write iter Ritesh Harjani (IBM)
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Ritesh Harjani (IBM) @ 2024-10-30 15:57 UTC (permalink / raw)
  To: linux-ext4
  Cc: Theodore Ts'o, Jan Kara, Darrick J . Wong, Christoph Hellwig,
	John Garry, Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
	linux-fsdevel, Ritesh Harjani (IBM)

This patch adds base support for atomic writes via statx getattr.
On bs < ps systems, we can create FS with say bs of 16k. That means
both atomic write min and max unit can be set to 16k for supporting
atomic writes.

Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
 fs/ext4/ext4.h  |  9 +++++++++
 fs/ext4/inode.c | 14 ++++++++++++++
 fs/ext4/super.c | 31 +++++++++++++++++++++++++++++++
 3 files changed, 54 insertions(+)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 44b0d418143c..6ee49aaacd2b 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1729,6 +1729,10 @@ struct ext4_sb_info {
 	 */
 	struct work_struct s_sb_upd_work;
 
+	/* Atomic write unit values in bytes */
+	unsigned int s_awu_min;
+	unsigned int s_awu_max;
+
 	/* Ext4 fast commit sub transaction ID */
 	atomic_t s_fc_subtid;
 
@@ -3855,6 +3859,11 @@ static inline int ext4_buffer_uptodate(struct buffer_head *bh)
 	return buffer_uptodate(bh);
 }
 
+static inline bool ext4_can_atomic_write(struct super_block *sb)
+{
+	return EXT4_SB(sb)->s_awu_min > 0;
+}
+
 extern int ext4_block_write_begin(handle_t *handle, struct folio *folio,
 				  loff_t pos, unsigned len,
 				  get_block_t *get_block);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 54bdd4884fe6..fcdee27b9aa2 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5578,6 +5578,20 @@ int ext4_getattr(struct mnt_idmap *idmap, const struct path *path,
 		}
 	}
 
+	if (S_ISREG(inode->i_mode) && (request_mask & STATX_WRITE_ATOMIC)) {
+		struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
+		unsigned int awu_min, awu_max;
+
+		if (ext4_can_atomic_write(inode->i_sb)) {
+			awu_min = sbi->s_awu_min;
+			awu_max = sbi->s_awu_max;
+		} else {
+			awu_min = awu_max = 0;
+		}
+
+		generic_fill_statx_atomic_writes(stat, awu_min, awu_max);
+	}
+
 	flags = ei->i_flags & EXT4_FL_USER_VISIBLE;
 	if (flags & EXT4_APPEND_FL)
 		stat->attributes |= STATX_ATTR_APPEND;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 16a4ce704460..ebe1660bd840 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -4425,6 +4425,36 @@ static int ext4_handle_clustersize(struct super_block *sb)
 	return 0;
 }
 
+/*
+ * ext4_atomic_write_init: Initializes filesystem min & max atomic write units.
+ * @sb: super block
+ * TODO: Later add support for bigalloc
+ */
+static void ext4_atomic_write_init(struct super_block *sb)
+{
+	struct ext4_sb_info *sbi = EXT4_SB(sb);
+	struct block_device *bdev = sb->s_bdev;
+
+	if (!bdev_can_atomic_write(bdev))
+		return;
+
+	if (!ext4_has_feature_extents(sb))
+		return;
+
+	sbi->s_awu_min = max(sb->s_blocksize,
+			      bdev_atomic_write_unit_min_bytes(bdev));
+	sbi->s_awu_max = min(sb->s_blocksize,
+			      bdev_atomic_write_unit_max_bytes(bdev));
+	if (sbi->s_awu_min && sbi->s_awu_max &&
+	    sbi->s_awu_min <= sbi->s_awu_max) {
+		ext4_msg(sb, KERN_NOTICE, "Supports (experimental) DIO atomic writes awu_min: %u, awu_max: %u",
+			 sbi->s_awu_min, sbi->s_awu_max);
+	} else {
+		sbi->s_awu_min = 0;
+		sbi->s_awu_max = 0;
+	}
+}
+
 static void ext4_fast_commit_init(struct super_block *sb)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(sb);
@@ -5336,6 +5366,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
 
 	spin_lock_init(&sbi->s_bdev_wb_lock);
 
+	ext4_atomic_write_init(sb);
 	ext4_fast_commit_init(sb);
 
 	sb->s_root = NULL;
-- 
2.46.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 2/4] ext4: Check for atomic writes support in write iter
  2024-10-30 15:57 [PATCH v3 0/4] ext4: Add atomic writes support for DIO Ritesh Harjani (IBM)
  2024-10-30 15:57 ` [PATCH v3 1/4] ext4: Add statx support for atomic writes Ritesh Harjani (IBM)
@ 2024-10-30 15:57 ` Ritesh Harjani (IBM)
  2024-10-31 21:42   ` Darrick J. Wong
  2024-10-30 15:57 ` [PATCH v3 3/4] ext4: Support setting FMODE_CAN_ATOMIC_WRITE Ritesh Harjani (IBM)
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Ritesh Harjani (IBM) @ 2024-10-30 15:57 UTC (permalink / raw)
  To: linux-ext4
  Cc: Theodore Ts'o, Jan Kara, Darrick J . Wong, Christoph Hellwig,
	John Garry, Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
	linux-fsdevel, Ritesh Harjani (IBM)

Let's validate the given constraints for atomic write request.
Otherwise it will fail with -EINVAL. Currently atomic write is only
supported on DIO, so for buffered-io it will return -EOPNOTSUPP.

Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
 fs/ext4/file.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index f14aed14b9cf..a7b9b9751a3f 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -692,6 +692,20 @@ ext4_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
 	if (IS_DAX(inode))
 		return ext4_dax_write_iter(iocb, from);
 #endif
+
+	if (iocb->ki_flags & IOCB_ATOMIC) {
+		size_t len = iov_iter_count(from);
+		int ret;
+
+		if (len < EXT4_SB(inode->i_sb)->s_awu_min ||
+		    len > EXT4_SB(inode->i_sb)->s_awu_max)
+			return -EINVAL;
+
+		ret = generic_atomic_write_valid(iocb, from);
+		if (ret)
+			return ret;
+	}
+
 	if (iocb->ki_flags & IOCB_DIRECT)
 		return ext4_dio_write_iter(iocb, from);
 	else
-- 
2.46.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 3/4] ext4: Support setting FMODE_CAN_ATOMIC_WRITE
  2024-10-30 15:57 [PATCH v3 0/4] ext4: Add atomic writes support for DIO Ritesh Harjani (IBM)
  2024-10-30 15:57 ` [PATCH v3 1/4] ext4: Add statx support for atomic writes Ritesh Harjani (IBM)
  2024-10-30 15:57 ` [PATCH v3 2/4] ext4: Check for atomic writes support in write iter Ritesh Harjani (IBM)
@ 2024-10-30 15:57 ` Ritesh Harjani (IBM)
  2024-10-31 21:42   ` Darrick J. Wong
  2024-10-30 15:57 ` [PATCH v3 4/4] ext4: Do not fallback to buffered-io for DIO atomic write Ritesh Harjani (IBM)
  2024-10-31 22:01 ` [PATCH v3 0/4] ext4: Add atomic writes support for DIO Darrick J. Wong
  4 siblings, 1 reply; 13+ messages in thread
From: Ritesh Harjani (IBM) @ 2024-10-30 15:57 UTC (permalink / raw)
  To: linux-ext4
  Cc: Theodore Ts'o, Jan Kara, Darrick J . Wong, Christoph Hellwig,
	John Garry, Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
	linux-fsdevel, Ritesh Harjani (IBM)

FS needs to add the fmode capability in order to support atomic writes
during file open (refer kiocb_set_rw_flags()). Set this capability on
a regular file if ext4 can do atomic write.

Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
 fs/ext4/file.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index a7b9b9751a3f..8116bd78910b 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -898,6 +898,9 @@ static int ext4_file_open(struct inode *inode, struct file *filp)
 			return ret;
 	}
 
+	if (S_ISREG(inode->i_mode) && ext4_can_atomic_write(inode->i_sb))
+		filp->f_mode |= FMODE_CAN_ATOMIC_WRITE;
+
 	filp->f_mode |= FMODE_NOWAIT | FMODE_CAN_ODIRECT;
 	return dquot_file_open(inode, filp);
 }
-- 
2.46.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 4/4] ext4: Do not fallback to buffered-io for DIO atomic write
  2024-10-30 15:57 [PATCH v3 0/4] ext4: Add atomic writes support for DIO Ritesh Harjani (IBM)
                   ` (2 preceding siblings ...)
  2024-10-30 15:57 ` [PATCH v3 3/4] ext4: Support setting FMODE_CAN_ATOMIC_WRITE Ritesh Harjani (IBM)
@ 2024-10-30 15:57 ` Ritesh Harjani (IBM)
  2024-10-31 21:51   ` Darrick J. Wong
  2024-10-31 22:01 ` [PATCH v3 0/4] ext4: Add atomic writes support for DIO Darrick J. Wong
  4 siblings, 1 reply; 13+ messages in thread
From: Ritesh Harjani (IBM) @ 2024-10-30 15:57 UTC (permalink / raw)
  To: linux-ext4
  Cc: Theodore Ts'o, Jan Kara, Darrick J . Wong, Christoph Hellwig,
	John Garry, Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
	linux-fsdevel, Ritesh Harjani (IBM)

atomic writes is currently only supported for single fsblock and only
for direct-io. We should not return -ENOTBLK for atomic writes since we
want the atomic write request to either complete fully or fail
otherwise. We should not fallback to buffered-io in case of DIO atomic
write requests.
Let's also catch if this ever happens by adding some WARN_ON_ONCE before
buffered-io handling for direct-io atomic writes.

More details of the discussion [1].

[1]: https://lore.kernel.org/linux-xfs/cover.1729825985.git.ritesh.list@gmail.com/T/#m9dbecc11bed713ed0d7a486432c56b105b555f04

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
 fs/ext4/file.c  |  7 +++++++
 fs/ext4/inode.c | 14 +++++++++-----
 2 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 8116bd78910b..61787a37e9d4 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -599,6 +599,13 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from)
 		ssize_t err;
 		loff_t endbyte;
 
+		/*
+		 * There is no support for atomic writes on buffered-io yet,
+		 * we should never fallback to buffered-io for DIO atomic
+		 * writes.
+		 */
+		WARN_ON_ONCE(iocb->ki_flags & IOCB_ATOMIC);
+
 		offset = iocb->ki_pos;
 		err = ext4_buffered_write_iter(iocb, from);
 		if (err < 0)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index fcdee27b9aa2..26b3c84d7f64 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3449,12 +3449,16 @@ static int ext4_iomap_end(struct inode *inode, loff_t offset, loff_t length,
 {
 	/*
 	 * Check to see whether an error occurred while writing out the data to
-	 * the allocated blocks. If so, return the magic error code so that we
-	 * fallback to buffered I/O and attempt to complete the remainder of
-	 * the I/O. Any blocks that may have been allocated in preparation for
-	 * the direct I/O will be reused during buffered I/O.
+	 * the allocated blocks. If so, return the magic error code for
+	 * non-atomic write so that we fallback to buffered I/O and attempt to
+	 * complete the remainder of the I/O.
+	 * For atomic writes we will simply fail the I/O request if we coudn't
+	 * write anything. For non-atomic writes, any blocks that may have been
+	 * allocated in preparation for the direct I/O will be reused during
+	 * buffered I/O.
 	 */
-	if (flags & (IOMAP_WRITE | IOMAP_DIRECT) && written == 0)
+	if (!(flags & IOMAP_ATOMIC) && (flags & (IOMAP_WRITE | IOMAP_DIRECT))
+			&& written == 0)
 		return -ENOTBLK;
 
 	return 0;
-- 
2.46.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 1/4] ext4: Add statx support for atomic writes
  2024-10-30 15:57 ` [PATCH v3 1/4] ext4: Add statx support for atomic writes Ritesh Harjani (IBM)
@ 2024-10-31 21:42   ` Darrick J. Wong
  2024-11-01  2:30     ` Ritesh Harjani
  0 siblings, 1 reply; 13+ messages in thread
From: Darrick J. Wong @ 2024-10-31 21:42 UTC (permalink / raw)
  To: Ritesh Harjani (IBM)
  Cc: linux-ext4, Theodore Ts'o, Jan Kara, Christoph Hellwig,
	John Garry, Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
	linux-fsdevel

On Wed, Oct 30, 2024 at 09:27:38PM +0530, Ritesh Harjani (IBM) wrote:
> This patch adds base support for atomic writes via statx getattr.
> On bs < ps systems, we can create FS with say bs of 16k. That means
> both atomic write min and max unit can be set to 16k for supporting
> atomic writes.
> 
> Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> ---
>  fs/ext4/ext4.h  |  9 +++++++++
>  fs/ext4/inode.c | 14 ++++++++++++++
>  fs/ext4/super.c | 31 +++++++++++++++++++++++++++++++
>  3 files changed, 54 insertions(+)
> 
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 44b0d418143c..6ee49aaacd2b 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -1729,6 +1729,10 @@ struct ext4_sb_info {
>  	 */
>  	struct work_struct s_sb_upd_work;
>  
> +	/* Atomic write unit values in bytes */
> +	unsigned int s_awu_min;
> +	unsigned int s_awu_max;
> +
>  	/* Ext4 fast commit sub transaction ID */
>  	atomic_t s_fc_subtid;
>  
> @@ -3855,6 +3859,11 @@ static inline int ext4_buffer_uptodate(struct buffer_head *bh)
>  	return buffer_uptodate(bh);
>  }
>  
> +static inline bool ext4_can_atomic_write(struct super_block *sb)
> +{
> +	return EXT4_SB(sb)->s_awu_min > 0;

Huh, I was expecting you to stick to passing in the struct inode,
and then you end up with:

static inline bool ext4_can_atomic_write(struct inode *inode)
{
	return S_ISREG(inode->i_mode) &&
	       EXT4_SB(inode->i_sb)->s_awu_min > 0);
}

> +}
> +
>  extern int ext4_block_write_begin(handle_t *handle, struct folio *folio,
>  				  loff_t pos, unsigned len,
>  				  get_block_t *get_block);
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 54bdd4884fe6..fcdee27b9aa2 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -5578,6 +5578,20 @@ int ext4_getattr(struct mnt_idmap *idmap, const struct path *path,
>  		}
>  	}
>  
> +	if (S_ISREG(inode->i_mode) && (request_mask & STATX_WRITE_ATOMIC)) {

...and then the callsites become:

	if (request_mask & STATX_WRITE_ATOMIC) {
		unsigned int awu_min = 0, awu_max = 0;

		if (ext4_can_atomic_write(inode)) {
			awu_min = sbi->s_awu_min;
			awu_max = sbi->s_awu_max;
		}

		generic_fill_statx_atomic_writes(stat, awu_min, awu_max);
	}

(I forget, is it bad if statx to a directory returns STATX_WRITE_ATOMIC
even with awu_{min,max} set to zero?)

Other than that nit, this looks good to me.

--D

> +		struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
> +		unsigned int awu_min, awu_max;
> +
> +		if (ext4_can_atomic_write(inode->i_sb)) {
> +			awu_min = sbi->s_awu_min;
> +			awu_max = sbi->s_awu_max;
> +		} else {
> +			awu_min = awu_max = 0;
> +		}
> +
> +		generic_fill_statx_atomic_writes(stat, awu_min, awu_max);
> +	}
> +
>  	flags = ei->i_flags & EXT4_FL_USER_VISIBLE;
>  	if (flags & EXT4_APPEND_FL)
>  		stat->attributes |= STATX_ATTR_APPEND;
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 16a4ce704460..ebe1660bd840 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -4425,6 +4425,36 @@ static int ext4_handle_clustersize(struct super_block *sb)
>  	return 0;
>  }
>  
> +/*
> + * ext4_atomic_write_init: Initializes filesystem min & max atomic write units.
> + * @sb: super block
> + * TODO: Later add support for bigalloc
> + */
> +static void ext4_atomic_write_init(struct super_block *sb)
> +{
> +	struct ext4_sb_info *sbi = EXT4_SB(sb);
> +	struct block_device *bdev = sb->s_bdev;
> +
> +	if (!bdev_can_atomic_write(bdev))
> +		return;
> +
> +	if (!ext4_has_feature_extents(sb))
> +		return;
> +
> +	sbi->s_awu_min = max(sb->s_blocksize,
> +			      bdev_atomic_write_unit_min_bytes(bdev));
> +	sbi->s_awu_max = min(sb->s_blocksize,
> +			      bdev_atomic_write_unit_max_bytes(bdev));
> +	if (sbi->s_awu_min && sbi->s_awu_max &&
> +	    sbi->s_awu_min <= sbi->s_awu_max) {
> +		ext4_msg(sb, KERN_NOTICE, "Supports (experimental) DIO atomic writes awu_min: %u, awu_max: %u",
> +			 sbi->s_awu_min, sbi->s_awu_max);
> +	} else {
> +		sbi->s_awu_min = 0;
> +		sbi->s_awu_max = 0;
> +	}
> +}
> +
>  static void ext4_fast_commit_init(struct super_block *sb)
>  {
>  	struct ext4_sb_info *sbi = EXT4_SB(sb);
> @@ -5336,6 +5366,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
>  
>  	spin_lock_init(&sbi->s_bdev_wb_lock);
>  
> +	ext4_atomic_write_init(sb);
>  	ext4_fast_commit_init(sb);
>  
>  	sb->s_root = NULL;
> -- 
> 2.46.0
> 
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 2/4] ext4: Check for atomic writes support in write iter
  2024-10-30 15:57 ` [PATCH v3 2/4] ext4: Check for atomic writes support in write iter Ritesh Harjani (IBM)
@ 2024-10-31 21:42   ` Darrick J. Wong
  0 siblings, 0 replies; 13+ messages in thread
From: Darrick J. Wong @ 2024-10-31 21:42 UTC (permalink / raw)
  To: Ritesh Harjani (IBM)
  Cc: linux-ext4, Theodore Ts'o, Jan Kara, Christoph Hellwig,
	John Garry, Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
	linux-fsdevel

On Wed, Oct 30, 2024 at 09:27:39PM +0530, Ritesh Harjani (IBM) wrote:
> Let's validate the given constraints for atomic write request.
> Otherwise it will fail with -EINVAL. Currently atomic write is only
> supported on DIO, so for buffered-io it will return -EOPNOTSUPP.
> 
> Reviewed-by: John Garry <john.g.garry@oracle.com>
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>

Looks decent,
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> ---
>  fs/ext4/file.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/fs/ext4/file.c b/fs/ext4/file.c
> index f14aed14b9cf..a7b9b9751a3f 100644
> --- a/fs/ext4/file.c
> +++ b/fs/ext4/file.c
> @@ -692,6 +692,20 @@ ext4_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
>  	if (IS_DAX(inode))
>  		return ext4_dax_write_iter(iocb, from);
>  #endif
> +
> +	if (iocb->ki_flags & IOCB_ATOMIC) {
> +		size_t len = iov_iter_count(from);
> +		int ret;
> +
> +		if (len < EXT4_SB(inode->i_sb)->s_awu_min ||
> +		    len > EXT4_SB(inode->i_sb)->s_awu_max)
> +			return -EINVAL;
> +
> +		ret = generic_atomic_write_valid(iocb, from);
> +		if (ret)
> +			return ret;
> +	}
> +
>  	if (iocb->ki_flags & IOCB_DIRECT)
>  		return ext4_dio_write_iter(iocb, from);
>  	else
> -- 
> 2.46.0
> 
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 3/4] ext4: Support setting FMODE_CAN_ATOMIC_WRITE
  2024-10-30 15:57 ` [PATCH v3 3/4] ext4: Support setting FMODE_CAN_ATOMIC_WRITE Ritesh Harjani (IBM)
@ 2024-10-31 21:42   ` Darrick J. Wong
  0 siblings, 0 replies; 13+ messages in thread
From: Darrick J. Wong @ 2024-10-31 21:42 UTC (permalink / raw)
  To: Ritesh Harjani (IBM)
  Cc: linux-ext4, Theodore Ts'o, Jan Kara, Christoph Hellwig,
	John Garry, Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
	linux-fsdevel

On Wed, Oct 30, 2024 at 09:27:40PM +0530, Ritesh Harjani (IBM) wrote:
> FS needs to add the fmode capability in order to support atomic writes
> during file open (refer kiocb_set_rw_flags()). Set this capability on
> a regular file if ext4 can do atomic write.
> 
> Reviewed-by: John Garry <john.g.garry@oracle.com>
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> ---
>  fs/ext4/file.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/fs/ext4/file.c b/fs/ext4/file.c
> index a7b9b9751a3f..8116bd78910b 100644
> --- a/fs/ext4/file.c
> +++ b/fs/ext4/file.c
> @@ -898,6 +898,9 @@ static int ext4_file_open(struct inode *inode, struct file *filp)
>  			return ret;
>  	}
>  
> +	if (S_ISREG(inode->i_mode) && ext4_can_atomic_write(inode->i_sb))

Modulo my comment earlier about ext4_can_atomic_write, this looks ok to
me.  With either variant, I say:
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> +		filp->f_mode |= FMODE_CAN_ATOMIC_WRITE;
> +
>  	filp->f_mode |= FMODE_NOWAIT | FMODE_CAN_ODIRECT;
>  	return dquot_file_open(inode, filp);
>  }
> -- 
> 2.46.0
> 
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 4/4] ext4: Do not fallback to buffered-io for DIO atomic write
  2024-10-30 15:57 ` [PATCH v3 4/4] ext4: Do not fallback to buffered-io for DIO atomic write Ritesh Harjani (IBM)
@ 2024-10-31 21:51   ` Darrick J. Wong
  2024-11-01  3:11     ` Ritesh Harjani
  0 siblings, 1 reply; 13+ messages in thread
From: Darrick J. Wong @ 2024-10-31 21:51 UTC (permalink / raw)
  To: Ritesh Harjani (IBM)
  Cc: linux-ext4, Theodore Ts'o, Jan Kara, Christoph Hellwig,
	John Garry, Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
	linux-fsdevel

On Wed, Oct 30, 2024 at 09:27:41PM +0530, Ritesh Harjani (IBM) wrote:
> atomic writes is currently only supported for single fsblock and only
> for direct-io. We should not return -ENOTBLK for atomic writes since we
> want the atomic write request to either complete fully or fail
> otherwise. We should not fallback to buffered-io in case of DIO atomic
> write requests.
> Let's also catch if this ever happens by adding some WARN_ON_ONCE before
> buffered-io handling for direct-io atomic writes.
> 
> More details of the discussion [1].
> 
> [1]: https://lore.kernel.org/linux-xfs/cover.1729825985.git.ritesh.list@gmail.com/T/#m9dbecc11bed713ed0d7a486432c56b105b555f04
> 
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> ---
>  fs/ext4/file.c  |  7 +++++++
>  fs/ext4/inode.c | 14 +++++++++-----
>  2 files changed, 16 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/ext4/file.c b/fs/ext4/file.c
> index 8116bd78910b..61787a37e9d4 100644
> --- a/fs/ext4/file.c
> +++ b/fs/ext4/file.c
> @@ -599,6 +599,13 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from)
>  		ssize_t err;
>  		loff_t endbyte;
>  
> +		/*
> +		 * There is no support for atomic writes on buffered-io yet,
> +		 * we should never fallback to buffered-io for DIO atomic
> +		 * writes.
> +		 */
> +		WARN_ON_ONCE(iocb->ki_flags & IOCB_ATOMIC);
> +
>  		offset = iocb->ki_pos;
>  		err = ext4_buffered_write_iter(iocb, from);
>  		if (err < 0)
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index fcdee27b9aa2..26b3c84d7f64 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3449,12 +3449,16 @@ static int ext4_iomap_end(struct inode *inode, loff_t offset, loff_t length,
>  {
>  	/*
>  	 * Check to see whether an error occurred while writing out the data to
> -	 * the allocated blocks. If so, return the magic error code so that we
> -	 * fallback to buffered I/O and attempt to complete the remainder of
> -	 * the I/O. Any blocks that may have been allocated in preparation for
> -	 * the direct I/O will be reused during buffered I/O.
> +	 * the allocated blocks. If so, return the magic error code for
> +	 * non-atomic write so that we fallback to buffered I/O and attempt to
> +	 * complete the remainder of the I/O.
> +	 * For atomic writes we will simply fail the I/O request if we coudn't
> +	 * write anything. For non-atomic writes, any blocks that may have been
> +	 * allocated in preparation for the direct I/O will be reused during
> +	 * buffered I/O.
>  	 */
> -	if (flags & (IOMAP_WRITE | IOMAP_DIRECT) && written == 0)
> +	if (!(flags & IOMAP_ATOMIC) && (flags & (IOMAP_WRITE | IOMAP_DIRECT))

Huh.  The WRITE|DIRECT check doesn't look right to me, because the
expression returns true for any write or any directio.  I think that's
currently "ok" because ext4_iomap_end is only called for directio
writes, but this bugs me anyway.  For a directio write fallback, that
comparison really should be:

	(flags & (WRITE|DIRECT)) == (WRITE|DIRECT)

static inline bool
ext4_want_directio_fallback(unsigned flags, ssize_t written)
{
	/* must be a directio to fall back to buffered */
	if (flags & (IOMAP_WRITE | IOMAP_DIRECT)) !=
		    (IOMAP_WRITE | IOMAP_DIRECT)
		return false;

	/* atomic writes are all-or-nothing */
	if (flags & IOMAP_ATOMIC)
		return false;

	/* can only try again if we wrote nothing */
	return written == 0;
}

	if (ext4_want_directio_fallback(flags, written))
		return -ENOTBLK;

> +			&& written == 0)

Nit: put the '&&' operator on the previous line when there's a multiline
expression.

--D

>  		return -ENOTBLK;
>  
>  	return 0;
> -- 
> 2.46.0
> 
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 0/4] ext4: Add atomic writes support for DIO
  2024-10-30 15:57 [PATCH v3 0/4] ext4: Add atomic writes support for DIO Ritesh Harjani (IBM)
                   ` (3 preceding siblings ...)
  2024-10-30 15:57 ` [PATCH v3 4/4] ext4: Do not fallback to buffered-io for DIO atomic write Ritesh Harjani (IBM)
@ 2024-10-31 22:01 ` Darrick J. Wong
  4 siblings, 0 replies; 13+ messages in thread
From: Darrick J. Wong @ 2024-10-31 22:01 UTC (permalink / raw)
  To: Ritesh Harjani (IBM)
  Cc: linux-ext4, Theodore Ts'o, Jan Kara, Christoph Hellwig,
	John Garry, Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
	linux-fsdevel

On Wed, Oct 30, 2024 at 09:27:37PM +0530, Ritesh Harjani (IBM) wrote:

Assuming Ted acks this series, I have a fun question: Can we merge this
for 6.13 alongside the single-fsblock xfs implementation?

And how do we want to merge this?  It looks like Jens took only the
first three patches from John's series, leaving this:

https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/log/?h=for-6.13/block-atomic

[PATCH v10 4/8] fs: Export generic_atomic_write_valid() John Garry
[PATCH v10 5/8] fs: iomap: Atomic write support John Garry
[PATCH v10 6/8] xfs: Support atomic write for statx John Garry
[PATCH v10 7/8] xfs: Validate atomic writes John Garry
[PATCH v10 8/8] xfs: Support setting FMODE_CAN_ATOMIC_WRITE John Garry

Note the fs and iomap stuff is not in that branch.

So should xfs create a 6.13 merge branch from block-atomic containing
all of its new stuff including the xfs atomic writes changes?  And then
I guess merge the ext4 changes too??  ext4 code coming in via xfs, yuck.

Or should cem just create a 6.13 merge branch with everything *except*
the atomic writes stuff?  Call that branch "xfs-6.13-merge".  Then one
of us with commit privileges creates a separate branch off of
block-atomic, add both the xfs series and then the ext4 series?  Call
that branch "fs-atomic-writes".

Then I guess cem could create a third branch from xfs-6.13-merge, merge
the fs-atomic-writes branch into that third branch, and push that third
branch to for-next on git.kernel.org so it can get picked up by
rothwell's for-next and fs-next?

(Note that Ted could do likewise with ext4; cem doesn't have to be part
of this.)

Does that work for people?  The "sending multiple branches to linus" way
is the best method I can think of, though it's more release manager
work.

--D

> v2 -> v3:
> ==========
> 1. Patch-1 adds an "experimental" string in dmesg log during mount when EXT4
>    detects that it is capable of doing DIO atomic writes on a given device
>    with min and max unit details.
> 2. Patch-4 has been updated to avoid returning -ENOTBLK (in ext4_iomap_end)
>    if the request belongs to atomic write. This patch also adds a WARN_ON_ONCE()
>    if atomic write ever fallback to buffered-io (to catch any unwanted bugs in the future).
>    More details in the commit log of patch-4.
> 3. Collected RBs tag from John for Patch 2 & 3.
> 
> [v2]: https://lore.kernel.org/linux-ext4/cover.1729944406.git.ritesh.list@gmail.com/
> 
> Previous cover letter log:
> 
> In v2, we had split the series and this one only takes care of
> atomic writes for single fsblock.
> That means for now this gets only enabled on bs < ps systems on ext4.
> Enablement of atomic writes for bigalloc (multi-fsblock support) is still
> under discussion and may require general consensus within the filesystem
> community [1].
> 
> This series adds the base feature support to enable atomic writes in
> direct-io path for ext4. We advertise the minimum and the maximum atomic
> write unit sizes via statx on a regular file.
> 
> This series allows users to utilize atomic write support using -
> 1. on bs < ps systems via - mkfs.ext4 -F -b 16384 /dev/sda
> 
> This can then be utilized using -
> 	xfs_io -fdc "pwrite -V 1 -A -b16k 0 16k" /mnt/f1
> 
> This is built on top of John's DIO atomic write series for XFS [2].
> The VFS and block layer enablement for atomic writes were merged already.
> 
> 
> [1]: https://lore.kernel.org/linux-ext4/87jzdvmqfz.fsf@gmail.com
> [2]: https://lore.kernel.org/linux-xfs/20241019125113.369994-1-john.g.garry@oracle.com/
> 
> 
> Changelogs:
> ===========
> PATCH -> PATCH v2:
> - addressed review comments from John and Darrick.
> - renamed ext4_sb_info variables names: fs_awu* -> s_awu*
> - [PATCH]: https://lore.kernel.org/linux-ext4/cover.1729825985.git.ritesh.list@gmail.com/
> 
> RFC -> PATCH:
> - Dropped RFC tag
> - Last RFC was posted a while ago but back then a lot of VFS and block layer
>   interfaces were still not merged. Those are now merged, thanks to John and
>   everyone else.
> - [RFC] - https://lore.kernel.org/linux-ext4/cover.1709356594.git.ritesh.list@gmail.com/
> 
> 
> 
> Ritesh Harjani (IBM) (4):
>   ext4: Add statx support for atomic writes
>   ext4: Check for atomic writes support in write iter
>   ext4: Support setting FMODE_CAN_ATOMIC_WRITE
>   ext4: Do not fallback to buffered-io for DIO atomic write
> 
>  fs/ext4/ext4.h  |  9 +++++++++
>  fs/ext4/file.c  | 24 ++++++++++++++++++++++++
>  fs/ext4/inode.c | 28 +++++++++++++++++++++++-----
>  fs/ext4/super.c | 31 +++++++++++++++++++++++++++++++
>  4 files changed, 87 insertions(+), 5 deletions(-)
> 
> --
> 2.46.0
> 
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 1/4] ext4: Add statx support for atomic writes
  2024-10-31 21:42   ` Darrick J. Wong
@ 2024-11-01  2:30     ` Ritesh Harjani
  2024-11-01  3:33       ` Darrick J. Wong
  0 siblings, 1 reply; 13+ messages in thread
From: Ritesh Harjani @ 2024-11-01  2:30 UTC (permalink / raw)
  To: Darrick J. Wong, John Garry
  Cc: linux-ext4, Theodore Ts'o, Jan Kara, Christoph Hellwig,
	John Garry, Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
	linux-fsdevel


Hi John & Darrick,

"Darrick J. Wong" <djwong@kernel.org> writes:

> On Wed, Oct 30, 2024 at 09:27:38PM +0530, Ritesh Harjani (IBM) wrote:
>> This patch adds base support for atomic writes via statx getattr.
>> On bs < ps systems, we can create FS with say bs of 16k. That means
>> both atomic write min and max unit can be set to 16k for supporting
>> atomic writes.
>> 
>> Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
>> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
>> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
>> ---
>>  fs/ext4/ext4.h  |  9 +++++++++
>>  fs/ext4/inode.c | 14 ++++++++++++++
>>  fs/ext4/super.c | 31 +++++++++++++++++++++++++++++++
>>  3 files changed, 54 insertions(+)
>> 
>> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
>> index 44b0d418143c..6ee49aaacd2b 100644
>> --- a/fs/ext4/ext4.h
>> +++ b/fs/ext4/ext4.h
>> @@ -1729,6 +1729,10 @@ struct ext4_sb_info {
>>  	 */
>>  	struct work_struct s_sb_upd_work;
>>  
>> +	/* Atomic write unit values in bytes */
>> +	unsigned int s_awu_min;
>> +	unsigned int s_awu_max;
>> +
>>  	/* Ext4 fast commit sub transaction ID */
>>  	atomic_t s_fc_subtid;
>>  
>> @@ -3855,6 +3859,11 @@ static inline int ext4_buffer_uptodate(struct buffer_head *bh)
>>  	return buffer_uptodate(bh);
>>  }
>>  
>> +static inline bool ext4_can_atomic_write(struct super_block *sb)
>> +{
>> +	return EXT4_SB(sb)->s_awu_min > 0;
>
> Huh, I was expecting you to stick to passing in the struct inode,
> and then you end up with:
>
> static inline bool ext4_can_atomic_write(struct inode *inode)
> {
> 	return S_ISREG(inode->i_mode) &&
> 	       EXT4_SB(inode->i_sb)->s_awu_min > 0);
> }
>

Ok. John also had commented on the same thing before. 
We may only need this, when ext4 get extsize hint support. But for now
we mainly only need to check that EXT4 SB supports atomic write or not.
i.e. s_awu_min should be greater than 0. 

But sure I can make above suggested change to keep it consistent with XFS, along
with below discussed change (Please have a look)...

>> +}
>> +
>>  extern int ext4_block_write_begin(handle_t *handle, struct folio *folio,
>>  				  loff_t pos, unsigned len,
>>  				  get_block_t *get_block);
>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>> index 54bdd4884fe6..fcdee27b9aa2 100644
>> --- a/fs/ext4/inode.c
>> +++ b/fs/ext4/inode.c
>> @@ -5578,6 +5578,20 @@ int ext4_getattr(struct mnt_idmap *idmap, const struct path *path,
>>  		}
>>  	}
>>  
>> +	if (S_ISREG(inode->i_mode) && (request_mask & STATX_WRITE_ATOMIC)) {
>
> ...and then the callsites become:
>
> 	if (request_mask & STATX_WRITE_ATOMIC) {
> 		unsigned int awu_min = 0, awu_max = 0;
>
> 		if (ext4_can_atomic_write(inode)) {
> 			awu_min = sbi->s_awu_min;
> 			awu_max = sbi->s_awu_max;
> 		}
>
> 		generic_fill_statx_atomic_writes(stat, awu_min, awu_max);
> 	}
>
> (I forget, is it bad if statx to a directory returns STATX_WRITE_ATOMIC
> even with awu_{min,max} set to zero?)

I mainly kept it consistent with XFS. But it's not a bad idea to do that. 
That will help applications check for atomic write support on the root
directory mount point rather than creating a regular file just for
verification. Because of below result_mask, which we only set within generic_fill_statx_atomic_writes() 

	stat->result_mask |= STATX_WRITE_ATOMIC;

If we make this change to ext4, XFS will have to fix it too, to keep
the behavior consistent for both.
Shall I go ahead and make the change in v4 for EXT4?

-ritesh

>
> Other than that nit, this looks good to me.
>
> --D
>
>> +		struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
>> +		unsigned int awu_min, awu_max;
>> +
>> +		if (ext4_can_atomic_write(inode->i_sb)) {
>> +			awu_min = sbi->s_awu_min;
>> +			awu_max = sbi->s_awu_max;
>> +		} else {
>> +			awu_min = awu_max = 0;
>> +		}
>> +
>> +		generic_fill_statx_atomic_writes(stat, awu_min, awu_max);
>> +	}
>> +
>>  	flags = ei->i_flags & EXT4_FL_USER_VISIBLE;
>>  	if (flags & EXT4_APPEND_FL)
>>  		stat->attributes |= STATX_ATTR_APPEND;
>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
>> index 16a4ce704460..ebe1660bd840 100644
>> --- a/fs/ext4/super.c
>> +++ b/fs/ext4/super.c
>> @@ -4425,6 +4425,36 @@ static int ext4_handle_clustersize(struct super_block *sb)
>>  	return 0;
>>  }
>>  
>> +/*
>> + * ext4_atomic_write_init: Initializes filesystem min & max atomic write units.
>> + * @sb: super block
>> + * TODO: Later add support for bigalloc
>> + */
>> +static void ext4_atomic_write_init(struct super_block *sb)
>> +{
>> +	struct ext4_sb_info *sbi = EXT4_SB(sb);
>> +	struct block_device *bdev = sb->s_bdev;
>> +
>> +	if (!bdev_can_atomic_write(bdev))
>> +		return;
>> +
>> +	if (!ext4_has_feature_extents(sb))
>> +		return;
>> +
>> +	sbi->s_awu_min = max(sb->s_blocksize,
>> +			      bdev_atomic_write_unit_min_bytes(bdev));
>> +	sbi->s_awu_max = min(sb->s_blocksize,
>> +			      bdev_atomic_write_unit_max_bytes(bdev));
>> +	if (sbi->s_awu_min && sbi->s_awu_max &&
>> +	    sbi->s_awu_min <= sbi->s_awu_max) {
>> +		ext4_msg(sb, KERN_NOTICE, "Supports (experimental) DIO atomic writes awu_min: %u, awu_max: %u",
>> +			 sbi->s_awu_min, sbi->s_awu_max);
>> +	} else {
>> +		sbi->s_awu_min = 0;
>> +		sbi->s_awu_max = 0;
>> +	}
>> +}
>> +
>>  static void ext4_fast_commit_init(struct super_block *sb)
>>  {
>>  	struct ext4_sb_info *sbi = EXT4_SB(sb);
>> @@ -5336,6 +5366,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
>>  
>>  	spin_lock_init(&sbi->s_bdev_wb_lock);
>>  
>> +	ext4_atomic_write_init(sb);
>>  	ext4_fast_commit_init(sb);
>>  
>>  	sb->s_root = NULL;
>> -- 
>> 2.46.0
>> 
>> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 4/4] ext4: Do not fallback to buffered-io for DIO atomic write
  2024-10-31 21:51   ` Darrick J. Wong
@ 2024-11-01  3:11     ` Ritesh Harjani
  0 siblings, 0 replies; 13+ messages in thread
From: Ritesh Harjani @ 2024-11-01  3:11 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: linux-ext4, Theodore Ts'o, Jan Kara, Christoph Hellwig,
	John Garry, Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
	linux-fsdevel

"Darrick J. Wong" <djwong@kernel.org> writes:

> On Wed, Oct 30, 2024 at 09:27:41PM +0530, Ritesh Harjani (IBM) wrote:
>> atomic writes is currently only supported for single fsblock and only
>> for direct-io. We should not return -ENOTBLK for atomic writes since we
>> want the atomic write request to either complete fully or fail
>> otherwise. We should not fallback to buffered-io in case of DIO atomic
>> write requests.
>> Let's also catch if this ever happens by adding some WARN_ON_ONCE before
>> buffered-io handling for direct-io atomic writes.
>> 
>> More details of the discussion [1].
>> 
>> [1]: https://lore.kernel.org/linux-xfs/cover.1729825985.git.ritesh.list@gmail.com/T/#m9dbecc11bed713ed0d7a486432c56b105b555f04
>> 
>> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
>> ---
>>  fs/ext4/file.c  |  7 +++++++
>>  fs/ext4/inode.c | 14 +++++++++-----
>>  2 files changed, 16 insertions(+), 5 deletions(-)
>> 
>> diff --git a/fs/ext4/file.c b/fs/ext4/file.c
>> index 8116bd78910b..61787a37e9d4 100644
>> --- a/fs/ext4/file.c
>> +++ b/fs/ext4/file.c
>> @@ -599,6 +599,13 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from)
>>  		ssize_t err;
>>  		loff_t endbyte;
>>  
>> +		/*
>> +		 * There is no support for atomic writes on buffered-io yet,
>> +		 * we should never fallback to buffered-io for DIO atomic
>> +		 * writes.
>> +		 */
>> +		WARN_ON_ONCE(iocb->ki_flags & IOCB_ATOMIC);
>> +
>>  		offset = iocb->ki_pos;
>>  		err = ext4_buffered_write_iter(iocb, from);
>>  		if (err < 0)
>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>> index fcdee27b9aa2..26b3c84d7f64 100644
>> --- a/fs/ext4/inode.c
>> +++ b/fs/ext4/inode.c
>> @@ -3449,12 +3449,16 @@ static int ext4_iomap_end(struct inode *inode, loff_t offset, loff_t length,
>>  {
>>  	/*
>>  	 * Check to see whether an error occurred while writing out the data to
>> -	 * the allocated blocks. If so, return the magic error code so that we
>> -	 * fallback to buffered I/O and attempt to complete the remainder of
>> -	 * the I/O. Any blocks that may have been allocated in preparation for
>> -	 * the direct I/O will be reused during buffered I/O.
>> +	 * the allocated blocks. If so, return the magic error code for
>> +	 * non-atomic write so that we fallback to buffered I/O and attempt to
>> +	 * complete the remainder of the I/O.
>> +	 * For atomic writes we will simply fail the I/O request if we coudn't
>> +	 * write anything. For non-atomic writes, any blocks that may have been
>> +	 * allocated in preparation for the direct I/O will be reused during
>> +	 * buffered I/O.
>>  	 */
>> -	if (flags & (IOMAP_WRITE | IOMAP_DIRECT) && written == 0)
>> +	if (!(flags & IOMAP_ATOMIC) && (flags & (IOMAP_WRITE | IOMAP_DIRECT))
>
> Huh.  The WRITE|DIRECT check doesn't look right to me, because the
> expression returns true for any write or any directio.  I think that's
> currently "ok" because ext4_iomap_end is only called for directio
> writes, but this bugs me anyway.  For a directio write fallback, that
> comparison really should be:
>
> 	(flags & (WRITE|DIRECT)) == (WRITE|DIRECT)
>

yes. You are right. It is working since ext4 only supports iomap
for DIRECTIO. But I agree it's better be fixed to avoid problem in future.

> static inline bool
> ext4_want_directio_fallback(unsigned flags, ssize_t written)
> {
> 	/* must be a directio to fall back to buffered */
> 	if (flags & (IOMAP_WRITE | IOMAP_DIRECT)) !=
> 		    (IOMAP_WRITE | IOMAP_DIRECT)
> 		return false;
>
> 	/* atomic writes are all-or-nothing */
> 	if (flags & IOMAP_ATOMIC)
> 		return false;
>
> 	/* can only try again if we wrote nothing */
> 	return written == 0;
> }
>
> 	if (ext4_want_directio_fallback(flags, written))
> 		return -ENOTBLK;
>

I like the above helper. Thanks for doing that. 
I will incorporate this in v4.


>> +			&& written == 0)
>
> Nit: put the '&&' operator on the previous line when there's a multiline
> expression.
>

I guess we don't need this if we do it with your above inline helper.
But sure, next time will keep in mind for any such changes.

> --D
>

Thanks for the review!
-ritesh

>>  		return -ENOTBLK;
>>  
>>  	return 0;
>> -- 
>> 2.46.0
>> 
>> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 1/4] ext4: Add statx support for atomic writes
  2024-11-01  2:30     ` Ritesh Harjani
@ 2024-11-01  3:33       ` Darrick J. Wong
  0 siblings, 0 replies; 13+ messages in thread
From: Darrick J. Wong @ 2024-11-01  3:33 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: John Garry, linux-ext4, Theodore Ts'o, Jan Kara,
	Christoph Hellwig, Ojaswin Mujoo, Dave Chinner, linux-kernel,
	linux-xfs, linux-fsdevel

On Fri, Nov 01, 2024 at 08:00:35AM +0530, Ritesh Harjani wrote:
> 
> Hi John & Darrick,
> 
> "Darrick J. Wong" <djwong@kernel.org> writes:
> 
> > On Wed, Oct 30, 2024 at 09:27:38PM +0530, Ritesh Harjani (IBM) wrote:
> >> This patch adds base support for atomic writes via statx getattr.
> >> On bs < ps systems, we can create FS with say bs of 16k. That means
> >> both atomic write min and max unit can be set to 16k for supporting
> >> atomic writes.
> >> 
> >> Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> >> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> >> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> >> ---
> >>  fs/ext4/ext4.h  |  9 +++++++++
> >>  fs/ext4/inode.c | 14 ++++++++++++++
> >>  fs/ext4/super.c | 31 +++++++++++++++++++++++++++++++
> >>  3 files changed, 54 insertions(+)
> >> 
> >> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> >> index 44b0d418143c..6ee49aaacd2b 100644
> >> --- a/fs/ext4/ext4.h
> >> +++ b/fs/ext4/ext4.h
> >> @@ -1729,6 +1729,10 @@ struct ext4_sb_info {
> >>  	 */
> >>  	struct work_struct s_sb_upd_work;
> >>  
> >> +	/* Atomic write unit values in bytes */
> >> +	unsigned int s_awu_min;
> >> +	unsigned int s_awu_max;
> >> +
> >>  	/* Ext4 fast commit sub transaction ID */
> >>  	atomic_t s_fc_subtid;
> >>  
> >> @@ -3855,6 +3859,11 @@ static inline int ext4_buffer_uptodate(struct buffer_head *bh)
> >>  	return buffer_uptodate(bh);
> >>  }
> >>  
> >> +static inline bool ext4_can_atomic_write(struct super_block *sb)
> >> +{
> >> +	return EXT4_SB(sb)->s_awu_min > 0;
> >
> > Huh, I was expecting you to stick to passing in the struct inode,
> > and then you end up with:
> >
> > static inline bool ext4_can_atomic_write(struct inode *inode)
> > {
> > 	return S_ISREG(inode->i_mode) &&
> > 	       EXT4_SB(inode->i_sb)->s_awu_min > 0);
> > }
> >
> 
> Ok. John also had commented on the same thing before. 
> We may only need this, when ext4 get extsize hint support. But for now
> we mainly only need to check that EXT4 SB supports atomic write or not.
> i.e. s_awu_min should be greater than 0. 
> 
> But sure I can make above suggested change to keep it consistent with XFS, along
> with below discussed change (Please have a look)...
> 
> >> +}
> >> +
> >>  extern int ext4_block_write_begin(handle_t *handle, struct folio *folio,
> >>  				  loff_t pos, unsigned len,
> >>  				  get_block_t *get_block);
> >> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> >> index 54bdd4884fe6..fcdee27b9aa2 100644
> >> --- a/fs/ext4/inode.c
> >> +++ b/fs/ext4/inode.c
> >> @@ -5578,6 +5578,20 @@ int ext4_getattr(struct mnt_idmap *idmap, const struct path *path,
> >>  		}
> >>  	}
> >>  
> >> +	if (S_ISREG(inode->i_mode) && (request_mask & STATX_WRITE_ATOMIC)) {
> >
> > ...and then the callsites become:
> >
> > 	if (request_mask & STATX_WRITE_ATOMIC) {
> > 		unsigned int awu_min = 0, awu_max = 0;
> >
> > 		if (ext4_can_atomic_write(inode)) {
> > 			awu_min = sbi->s_awu_min;
> > 			awu_max = sbi->s_awu_max;
> > 		}
> >
> > 		generic_fill_statx_atomic_writes(stat, awu_min, awu_max);
> > 	}
> >
> > (I forget, is it bad if statx to a directory returns STATX_WRITE_ATOMIC
> > even with awu_{min,max} set to zero?)
> 
> I mainly kept it consistent with XFS. But it's not a bad idea to do that. 
> That will help applications check for atomic write support on the root
> directory mount point rather than creating a regular file just for
> verification. Because of below result_mask, which we only set within generic_fill_statx_atomic_writes() 
> 
> 	stat->result_mask |= STATX_WRITE_ATOMIC;
> 
> If we make this change to ext4, XFS will have to fix it too, to keep
> the behavior consistent for both.
> Shall I go ahead and make the change in v4 for EXT4?

Hmmm, that's a good question -- if a program asks for STATX_WRITE_ATOMIC
on a directory, should we set the ATOMIC flag in statx.stx_mask but
leave the values as zeroes if the underlying block device/fs supports
atomic writes at all?  For XFS I guess the "underlying bdev" is
determined by the directory's RTINHERIT bit && xfs_has_realtime().

Thoughts?

But maybe that doesn't make sense since (a) fundamentally you can't do a
directio write to a directory and (b) it's not that hard to create a
file, set the REALTIME bit on it as desired (on xfs) and then query the
untorn write geometry.  So maybe that check should be:

if (request_mask & STATX_WRITE_ATOMIC && S_ISREG(inode->i_mode))

--D

> -ritesh
> 
> >
> > Other than that nit, this looks good to me.
> >
> > --D
> >
> >> +		struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
> >> +		unsigned int awu_min, awu_max;
> >> +
> >> +		if (ext4_can_atomic_write(inode->i_sb)) {
> >> +			awu_min = sbi->s_awu_min;
> >> +			awu_max = sbi->s_awu_max;
> >> +		} else {
> >> +			awu_min = awu_max = 0;
> >> +		}
> >> +
> >> +		generic_fill_statx_atomic_writes(stat, awu_min, awu_max);
> >> +	}
> >> +
> >>  	flags = ei->i_flags & EXT4_FL_USER_VISIBLE;
> >>  	if (flags & EXT4_APPEND_FL)
> >>  		stat->attributes |= STATX_ATTR_APPEND;
> >> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> >> index 16a4ce704460..ebe1660bd840 100644
> >> --- a/fs/ext4/super.c
> >> +++ b/fs/ext4/super.c
> >> @@ -4425,6 +4425,36 @@ static int ext4_handle_clustersize(struct super_block *sb)
> >>  	return 0;
> >>  }
> >>  
> >> +/*
> >> + * ext4_atomic_write_init: Initializes filesystem min & max atomic write units.
> >> + * @sb: super block
> >> + * TODO: Later add support for bigalloc
> >> + */
> >> +static void ext4_atomic_write_init(struct super_block *sb)
> >> +{
> >> +	struct ext4_sb_info *sbi = EXT4_SB(sb);
> >> +	struct block_device *bdev = sb->s_bdev;
> >> +
> >> +	if (!bdev_can_atomic_write(bdev))
> >> +		return;
> >> +
> >> +	if (!ext4_has_feature_extents(sb))
> >> +		return;
> >> +
> >> +	sbi->s_awu_min = max(sb->s_blocksize,
> >> +			      bdev_atomic_write_unit_min_bytes(bdev));
> >> +	sbi->s_awu_max = min(sb->s_blocksize,
> >> +			      bdev_atomic_write_unit_max_bytes(bdev));
> >> +	if (sbi->s_awu_min && sbi->s_awu_max &&
> >> +	    sbi->s_awu_min <= sbi->s_awu_max) {
> >> +		ext4_msg(sb, KERN_NOTICE, "Supports (experimental) DIO atomic writes awu_min: %u, awu_max: %u",
> >> +			 sbi->s_awu_min, sbi->s_awu_max);
> >> +	} else {
> >> +		sbi->s_awu_min = 0;
> >> +		sbi->s_awu_max = 0;
> >> +	}
> >> +}
> >> +
> >>  static void ext4_fast_commit_init(struct super_block *sb)
> >>  {
> >>  	struct ext4_sb_info *sbi = EXT4_SB(sb);
> >> @@ -5336,6 +5366,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
> >>  
> >>  	spin_lock_init(&sbi->s_bdev_wb_lock);
> >>  
> >> +	ext4_atomic_write_init(sb);
> >>  	ext4_fast_commit_init(sb);
> >>  
> >>  	sb->s_root = NULL;
> >> -- 
> >> 2.46.0
> >> 
> >> 
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2024-11-01  3:33 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-30 15:57 [PATCH v3 0/4] ext4: Add atomic writes support for DIO Ritesh Harjani (IBM)
2024-10-30 15:57 ` [PATCH v3 1/4] ext4: Add statx support for atomic writes Ritesh Harjani (IBM)
2024-10-31 21:42   ` Darrick J. Wong
2024-11-01  2:30     ` Ritesh Harjani
2024-11-01  3:33       ` Darrick J. Wong
2024-10-30 15:57 ` [PATCH v3 2/4] ext4: Check for atomic writes support in write iter Ritesh Harjani (IBM)
2024-10-31 21:42   ` Darrick J. Wong
2024-10-30 15:57 ` [PATCH v3 3/4] ext4: Support setting FMODE_CAN_ATOMIC_WRITE Ritesh Harjani (IBM)
2024-10-31 21:42   ` Darrick J. Wong
2024-10-30 15:57 ` [PATCH v3 4/4] ext4: Do not fallback to buffered-io for DIO atomic write Ritesh Harjani (IBM)
2024-10-31 21:51   ` Darrick J. Wong
2024-11-01  3:11     ` Ritesh Harjani
2024-10-31 22:01 ` [PATCH v3 0/4] ext4: Add atomic writes support for DIO Darrick J. Wong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox