linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/4] ext4: Add atomic writes support for DIO
@ 2024-11-01  6:50 Ritesh Harjani (IBM)
  2024-11-01  6:50 ` [PATCH v4 1/4] ext4: Add statx support for atomic writes Ritesh Harjani (IBM)
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Ritesh Harjani (IBM) @ 2024-11-01  6:50 UTC (permalink / raw)
  To: linux-ext4
  Cc: Theodore Ts'o, Jan Kara, Darrick J . Wong, Christoph Hellwig,
	John Garry, Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
	linux-fsdevel, Ritesh Harjani (IBM)

v3 -> v4:
=========
1. Patch-1 changes the helper function from ext4_can_atomic_write() to
   ext4_inode_can_atomic_write() based on suggestions from Darrick and John.
   Patch-3 then goes and uses it.
2. Patch-4 adds an inline helper ext4_want_directio_fallback() which simplifies
   the logic checks and inherently fixes condition on when to return -ENOTBLK
   which otherwise was always returning true for any write or directio in
   ext4_iomap_end(). It was ok since ext4 only supports direct-io via iomap.
[v3]: https://lore.kernel.org/linux-xfs/cover.1730286164.git.ritesh.list@gmail.com/

(Note as mentioned below as well this is built on top of John XFS atomic series [2])
[2]: https://lore.kernel.org/linux-xfs/20241019125113.369994-1-john.g.garry@oracle.com/


v2 -> v3:
==========
1. Patch-1 adds an "experimental" string in dmesg log during mount when EXT4
   detects that it is capable of doing DIO atomic writes on a given device
   with min and max unit details.
2. Patch-4 has been updated to avoid returning -ENOTBLK (in ext4_iomap_end)
   if the request belongs to atomic write. This patch also adds a WARN_ON_ONCE()
   if atomic write ever fallback to buffered-io (to catch any unwanted bugs in the future).
   More details in the commit log of patch-4.
3. Collected RBs tag from John for Patch 2 & 3.

[v2]: https://lore.kernel.org/linux-ext4/cover.1729944406.git.ritesh.list@gmail.com/

Previous cover letter log:

In v2, we had split the series and this one only takes care of
atomic writes for single fsblock.
That means for now this gets only enabled on bs < ps systems on ext4.
Enablement of atomic writes for bigalloc (multi-fsblock support) is still
under discussion and may require general consensus within the filesystem
community [1].

This series adds the base feature support to enable atomic writes in
direct-io path for ext4. We advertise the minimum and the maximum atomic
write unit sizes via statx on a regular file.

This series allows users to utilize atomic write support using -
1. on bs < ps systems via - mkfs.ext4 -F -b 16384 /dev/sda

This can then be utilized using -
	xfs_io -fdc "pwrite -V 1 -A -b16k 0 16k" /mnt/f1

This is built on top of John's DIO atomic write series for XFS [2].
The VFS and block layer enablement for atomic writes were merged already.


[1]: https://lore.kernel.org/linux-ext4/87jzdvmqfz.fsf@gmail.com
[2]: https://lore.kernel.org/linux-xfs/20241019125113.369994-1-john.g.garry@oracle.com/


Changelogs:
===========
PATCH -> PATCH v2:
- addressed review comments from John and Darrick.
- renamed ext4_sb_info variables names: fs_awu* -> s_awu*
- [PATCH]: https://lore.kernel.org/linux-ext4/cover.1729825985.git.ritesh.list@gmail.com/

RFC -> PATCH:
- Dropped RFC tag
- Last RFC was posted a while ago but back then a lot of VFS and block layer
  interfaces were still not merged. Those are now merged, thanks to John and
  everyone else.
- [RFC] - https://lore.kernel.org/linux-ext4/cover.1709356594.git.ritesh.list@gmail.com/


Ritesh Harjani (IBM) (4):
  ext4: Add statx support for atomic writes
  ext4: Check for atomic writes support in write iter
  ext4: Support setting FMODE_CAN_ATOMIC_WRITE
  ext4: Do not fallback to buffered-io for DIO atomic write

 fs/ext4/ext4.h  | 10 ++++++++++
 fs/ext4/file.c  | 24 ++++++++++++++++++++++++
 fs/ext4/inode.c | 39 ++++++++++++++++++++++++++++++++++-----
 fs/ext4/super.c | 31 +++++++++++++++++++++++++++++++
 4 files changed, 99 insertions(+), 5 deletions(-)

--
2.46.0


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v4 1/4] ext4: Add statx support for atomic writes
  2024-11-01  6:50 [PATCH v4 0/4] ext4: Add atomic writes support for DIO Ritesh Harjani (IBM)
@ 2024-11-01  6:50 ` Ritesh Harjani (IBM)
  2024-11-01 11:17   ` Jan Kara
                     ` (2 more replies)
  2024-11-01  6:50 ` [PATCH v4 2/4] ext4: Check for atomic writes support in write iter Ritesh Harjani (IBM)
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 13+ messages in thread
From: Ritesh Harjani (IBM) @ 2024-11-01  6:50 UTC (permalink / raw)
  To: linux-ext4
  Cc: Theodore Ts'o, Jan Kara, Darrick J . Wong, Christoph Hellwig,
	John Garry, Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
	linux-fsdevel, Ritesh Harjani (IBM)

This patch adds base support for atomic writes via statx getattr.
On bs < ps systems, we can create FS with say bs of 16k. That means
both atomic write min and max unit can be set to 16k for supporting
atomic writes.

Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
 fs/ext4/ext4.h  | 10 ++++++++++
 fs/ext4/inode.c | 12 ++++++++++++
 fs/ext4/super.c | 31 +++++++++++++++++++++++++++++++
 3 files changed, 53 insertions(+)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 44b0d418143c..494d443e9fc9 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1729,6 +1729,10 @@ struct ext4_sb_info {
 	 */
 	struct work_struct s_sb_upd_work;
 
+	/* Atomic write unit values in bytes */
+	unsigned int s_awu_min;
+	unsigned int s_awu_max;
+
 	/* Ext4 fast commit sub transaction ID */
 	atomic_t s_fc_subtid;
 
@@ -3855,6 +3859,12 @@ static inline int ext4_buffer_uptodate(struct buffer_head *bh)
 	return buffer_uptodate(bh);
 }
 
+static inline bool ext4_inode_can_atomic_write(struct inode *inode)
+{
+
+	return S_ISREG(inode->i_mode) && EXT4_SB(inode->i_sb)->s_awu_min > 0;
+}
+
 extern int ext4_block_write_begin(handle_t *handle, struct folio *folio,
 				  loff_t pos, unsigned len,
 				  get_block_t *get_block);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 54bdd4884fe6..3e827cfa762e 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5578,6 +5578,18 @@ int ext4_getattr(struct mnt_idmap *idmap, const struct path *path,
 		}
 	}
 
+	if ((request_mask & STATX_WRITE_ATOMIC) && S_ISREG(inode->i_mode)) {
+		struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
+		unsigned int awu_min = 0, awu_max = 0;
+
+		if (ext4_inode_can_atomic_write(inode)) {
+			awu_min = sbi->s_awu_min;
+			awu_max = sbi->s_awu_max;
+		}
+
+		generic_fill_statx_atomic_writes(stat, awu_min, awu_max);
+	}
+
 	flags = ei->i_flags & EXT4_FL_USER_VISIBLE;
 	if (flags & EXT4_APPEND_FL)
 		stat->attributes |= STATX_ATTR_APPEND;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 16a4ce704460..ebe1660bd840 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -4425,6 +4425,36 @@ static int ext4_handle_clustersize(struct super_block *sb)
 	return 0;
 }
 
+/*
+ * ext4_atomic_write_init: Initializes filesystem min & max atomic write units.
+ * @sb: super block
+ * TODO: Later add support for bigalloc
+ */
+static void ext4_atomic_write_init(struct super_block *sb)
+{
+	struct ext4_sb_info *sbi = EXT4_SB(sb);
+	struct block_device *bdev = sb->s_bdev;
+
+	if (!bdev_can_atomic_write(bdev))
+		return;
+
+	if (!ext4_has_feature_extents(sb))
+		return;
+
+	sbi->s_awu_min = max(sb->s_blocksize,
+			      bdev_atomic_write_unit_min_bytes(bdev));
+	sbi->s_awu_max = min(sb->s_blocksize,
+			      bdev_atomic_write_unit_max_bytes(bdev));
+	if (sbi->s_awu_min && sbi->s_awu_max &&
+	    sbi->s_awu_min <= sbi->s_awu_max) {
+		ext4_msg(sb, KERN_NOTICE, "Supports (experimental) DIO atomic writes awu_min: %u, awu_max: %u",
+			 sbi->s_awu_min, sbi->s_awu_max);
+	} else {
+		sbi->s_awu_min = 0;
+		sbi->s_awu_max = 0;
+	}
+}
+
 static void ext4_fast_commit_init(struct super_block *sb)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(sb);
@@ -5336,6 +5366,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
 
 	spin_lock_init(&sbi->s_bdev_wb_lock);
 
+	ext4_atomic_write_init(sb);
 	ext4_fast_commit_init(sb);
 
 	sb->s_root = NULL;
-- 
2.46.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v4 2/4] ext4: Check for atomic writes support in write iter
  2024-11-01  6:50 [PATCH v4 0/4] ext4: Add atomic writes support for DIO Ritesh Harjani (IBM)
  2024-11-01  6:50 ` [PATCH v4 1/4] ext4: Add statx support for atomic writes Ritesh Harjani (IBM)
@ 2024-11-01  6:50 ` Ritesh Harjani (IBM)
  2024-11-01 11:15   ` Jan Kara
  2024-11-01  6:50 ` [PATCH v4 3/4] ext4: Support setting FMODE_CAN_ATOMIC_WRITE Ritesh Harjani (IBM)
  2024-11-01  6:50 ` [PATCH v4 4/4] ext4: Do not fallback to buffered-io for DIO atomic write Ritesh Harjani (IBM)
  3 siblings, 1 reply; 13+ messages in thread
From: Ritesh Harjani (IBM) @ 2024-11-01  6:50 UTC (permalink / raw)
  To: linux-ext4
  Cc: Theodore Ts'o, Jan Kara, Darrick J . Wong, Christoph Hellwig,
	John Garry, Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
	linux-fsdevel, Ritesh Harjani (IBM)

Let's validate the given constraints for atomic write request.
Otherwise it will fail with -EINVAL. Currently atomic write is only
supported on DIO, so for buffered-io it will return -EOPNOTSUPP.

Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
 fs/ext4/file.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index f14aed14b9cf..a7b9b9751a3f 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -692,6 +692,20 @@ ext4_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
 	if (IS_DAX(inode))
 		return ext4_dax_write_iter(iocb, from);
 #endif
+
+	if (iocb->ki_flags & IOCB_ATOMIC) {
+		size_t len = iov_iter_count(from);
+		int ret;
+
+		if (len < EXT4_SB(inode->i_sb)->s_awu_min ||
+		    len > EXT4_SB(inode->i_sb)->s_awu_max)
+			return -EINVAL;
+
+		ret = generic_atomic_write_valid(iocb, from);
+		if (ret)
+			return ret;
+	}
+
 	if (iocb->ki_flags & IOCB_DIRECT)
 		return ext4_dio_write_iter(iocb, from);
 	else
--
2.46.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v4 3/4] ext4: Support setting FMODE_CAN_ATOMIC_WRITE
  2024-11-01  6:50 [PATCH v4 0/4] ext4: Add atomic writes support for DIO Ritesh Harjani (IBM)
  2024-11-01  6:50 ` [PATCH v4 1/4] ext4: Add statx support for atomic writes Ritesh Harjani (IBM)
  2024-11-01  6:50 ` [PATCH v4 2/4] ext4: Check for atomic writes support in write iter Ritesh Harjani (IBM)
@ 2024-11-01  6:50 ` Ritesh Harjani (IBM)
  2024-11-01 11:14   ` Jan Kara
  2024-11-01  6:50 ` [PATCH v4 4/4] ext4: Do not fallback to buffered-io for DIO atomic write Ritesh Harjani (IBM)
  3 siblings, 1 reply; 13+ messages in thread
From: Ritesh Harjani (IBM) @ 2024-11-01  6:50 UTC (permalink / raw)
  To: linux-ext4
  Cc: Theodore Ts'o, Jan Kara, Darrick J . Wong, Christoph Hellwig,
	John Garry, Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
	linux-fsdevel, Ritesh Harjani (IBM)

FS needs to add the fmode capability in order to support atomic writes
during file open (refer kiocb_set_rw_flags()). Set this capability on
a regular file if ext4 can do atomic write.

Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
 fs/ext4/file.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index a7b9b9751a3f..96d936f5584b 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -898,6 +898,9 @@ static int ext4_file_open(struct inode *inode, struct file *filp)
 			return ret;
 	}

+	if (ext4_inode_can_atomic_write(inode))
+		filp->f_mode |= FMODE_CAN_ATOMIC_WRITE;
+
 	filp->f_mode |= FMODE_NOWAIT | FMODE_CAN_ODIRECT;
 	return dquot_file_open(inode, filp);
 }
--
2.46.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v4 4/4] ext4: Do not fallback to buffered-io for DIO atomic write
  2024-11-01  6:50 [PATCH v4 0/4] ext4: Add atomic writes support for DIO Ritesh Harjani (IBM)
                   ` (2 preceding siblings ...)
  2024-11-01  6:50 ` [PATCH v4 3/4] ext4: Support setting FMODE_CAN_ATOMIC_WRITE Ritesh Harjani (IBM)
@ 2024-11-01  6:50 ` Ritesh Harjani (IBM)
  2024-11-01 11:13   ` Jan Kara
  2024-11-01 14:46   ` Darrick J. Wong
  3 siblings, 2 replies; 13+ messages in thread
From: Ritesh Harjani (IBM) @ 2024-11-01  6:50 UTC (permalink / raw)
  To: linux-ext4
  Cc: Theodore Ts'o, Jan Kara, Darrick J . Wong, Christoph Hellwig,
	John Garry, Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
	linux-fsdevel, Ritesh Harjani (IBM)

atomic writes is currently only supported for single fsblock and only
for direct-io. We should not return -ENOTBLK for atomic writes since we
want the atomic write request to either complete fully or fail
otherwise. Hence, we should never fallback to buffered-io in case of
DIO atomic write requests.
Let's also catch if this ever happens by adding some WARN_ON_ONCE before
buffered-io handling for direct-io atomic writes. More details of the
discussion [1].

While at it let's add an inline helper ext4_want_directio_fallback() which
simplifies the logic checks and inherently fixes condition on when to return
-ENOTBLK which otherwise was always returning true for any write or directio in
ext4_iomap_end(). It was ok since ext4 only supports direct-io via iomap.

[1]: https://lore.kernel.org/linux-xfs/cover.1729825985.git.ritesh.list@gmail.com/T/#m9dbecc11bed713ed0d7a486432c56b105b555f04
Suggested-by: Darrick J. Wong <djwong@kernel.org> # inline helper
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
 fs/ext4/file.c  |  7 +++++++
 fs/ext4/inode.c | 27 ++++++++++++++++++++++-----
 2 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 96d936f5584b..a7de03e47db0 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -599,6 +599,13 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from)
 		ssize_t err;
 		loff_t endbyte;

+		/*
+		 * There is no support for atomic writes on buffered-io yet,
+		 * we should never fallback to buffered-io for DIO atomic
+		 * writes.
+		 */
+		WARN_ON_ONCE(iocb->ki_flags & IOCB_ATOMIC);
+
 		offset = iocb->ki_pos;
 		err = ext4_buffered_write_iter(iocb, from);
 		if (err < 0)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 3e827cfa762e..5b9eeb74ce47 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3444,17 +3444,34 @@ static int ext4_iomap_overwrite_begin(struct inode *inode, loff_t offset,
 	return ret;
 }

+static inline bool ext4_want_directio_fallback(unsigned flags, ssize_t written)
+{
+	/* must be a directio to fall back to buffered */
+	if ((flags & (IOMAP_WRITE | IOMAP_DIRECT)) !=
+		    (IOMAP_WRITE | IOMAP_DIRECT))
+		return false;
+
+	/* atomic writes are all-or-nothing */
+	if (flags & IOMAP_ATOMIC)
+		return false;
+
+	/* can only try again if we wrote nothing */
+	return written == 0;
+}
+
 static int ext4_iomap_end(struct inode *inode, loff_t offset, loff_t length,
 			  ssize_t written, unsigned flags, struct iomap *iomap)
 {
 	/*
 	 * Check to see whether an error occurred while writing out the data to
-	 * the allocated blocks. If so, return the magic error code so that we
-	 * fallback to buffered I/O and attempt to complete the remainder of
-	 * the I/O. Any blocks that may have been allocated in preparation for
-	 * the direct I/O will be reused during buffered I/O.
+	 * the allocated blocks. If so, return the magic error code for
+	 * non-atomic write so that we fallback to buffered I/O and attempt to
+	 * complete the remainder of the I/O.
+	 * For non-atomic writes, any blocks that may have been
+	 * allocated in preparation for the direct I/O will be reused during
+	 * buffered I/O. For atomic write, we never fallback to buffered-io.
 	 */
-	if (flags & (IOMAP_WRITE | IOMAP_DIRECT) && written == 0)
+	if (ext4_want_directio_fallback(flags, written))
 		return -ENOTBLK;

 	return 0;
--
2.46.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 4/4] ext4: Do not fallback to buffered-io for DIO atomic write
  2024-11-01  6:50 ` [PATCH v4 4/4] ext4: Do not fallback to buffered-io for DIO atomic write Ritesh Harjani (IBM)
@ 2024-11-01 11:13   ` Jan Kara
  2024-11-01 14:46   ` Darrick J. Wong
  1 sibling, 0 replies; 13+ messages in thread
From: Jan Kara @ 2024-11-01 11:13 UTC (permalink / raw)
  To: Ritesh Harjani (IBM)
  Cc: linux-ext4, Theodore Ts'o, Jan Kara, Darrick J . Wong,
	Christoph Hellwig, John Garry, Ojaswin Mujoo, Dave Chinner,
	linux-kernel, linux-xfs, linux-fsdevel

On Fri 01-11-24 12:20:54, Ritesh Harjani (IBM) wrote:
> atomic writes is currently only supported for single fsblock and only
> for direct-io. We should not return -ENOTBLK for atomic writes since we
> want the atomic write request to either complete fully or fail
> otherwise. Hence, we should never fallback to buffered-io in case of
> DIO atomic write requests.
> Let's also catch if this ever happens by adding some WARN_ON_ONCE before
> buffered-io handling for direct-io atomic writes. More details of the
> discussion [1].
> 
> While at it let's add an inline helper ext4_want_directio_fallback() which
> simplifies the logic checks and inherently fixes condition on when to return
> -ENOTBLK which otherwise was always returning true for any write or directio in
> ext4_iomap_end(). It was ok since ext4 only supports direct-io via iomap.
> 
> [1]: https://lore.kernel.org/linux-xfs/cover.1729825985.git.ritesh.list@gmail.com/T/#m9dbecc11bed713ed0d7a486432c56b105b555f04
> Suggested-by: Darrick J. Wong <djwong@kernel.org> # inline helper
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ext4/file.c  |  7 +++++++
>  fs/ext4/inode.c | 27 ++++++++++++++++++++++-----
>  2 files changed, 29 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/ext4/file.c b/fs/ext4/file.c
> index 96d936f5584b..a7de03e47db0 100644
> --- a/fs/ext4/file.c
> +++ b/fs/ext4/file.c
> @@ -599,6 +599,13 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from)
>  		ssize_t err;
>  		loff_t endbyte;
> 
> +		/*
> +		 * There is no support for atomic writes on buffered-io yet,
> +		 * we should never fallback to buffered-io for DIO atomic
> +		 * writes.
> +		 */
> +		WARN_ON_ONCE(iocb->ki_flags & IOCB_ATOMIC);
> +
>  		offset = iocb->ki_pos;
>  		err = ext4_buffered_write_iter(iocb, from);
>  		if (err < 0)
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 3e827cfa762e..5b9eeb74ce47 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3444,17 +3444,34 @@ static int ext4_iomap_overwrite_begin(struct inode *inode, loff_t offset,
>  	return ret;
>  }
> 
> +static inline bool ext4_want_directio_fallback(unsigned flags, ssize_t written)
> +{
> +	/* must be a directio to fall back to buffered */
> +	if ((flags & (IOMAP_WRITE | IOMAP_DIRECT)) !=
> +		    (IOMAP_WRITE | IOMAP_DIRECT))
> +		return false;
> +
> +	/* atomic writes are all-or-nothing */
> +	if (flags & IOMAP_ATOMIC)
> +		return false;
> +
> +	/* can only try again if we wrote nothing */
> +	return written == 0;
> +}
> +
>  static int ext4_iomap_end(struct inode *inode, loff_t offset, loff_t length,
>  			  ssize_t written, unsigned flags, struct iomap *iomap)
>  {
>  	/*
>  	 * Check to see whether an error occurred while writing out the data to
> -	 * the allocated blocks. If so, return the magic error code so that we
> -	 * fallback to buffered I/O and attempt to complete the remainder of
> -	 * the I/O. Any blocks that may have been allocated in preparation for
> -	 * the direct I/O will be reused during buffered I/O.
> +	 * the allocated blocks. If so, return the magic error code for
> +	 * non-atomic write so that we fallback to buffered I/O and attempt to
> +	 * complete the remainder of the I/O.
> +	 * For non-atomic writes, any blocks that may have been
> +	 * allocated in preparation for the direct I/O will be reused during
> +	 * buffered I/O. For atomic write, we never fallback to buffered-io.
>  	 */
> -	if (flags & (IOMAP_WRITE | IOMAP_DIRECT) && written == 0)
> +	if (ext4_want_directio_fallback(flags, written))
>  		return -ENOTBLK;
> 
>  	return 0;
> --
> 2.46.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 3/4] ext4: Support setting FMODE_CAN_ATOMIC_WRITE
  2024-11-01  6:50 ` [PATCH v4 3/4] ext4: Support setting FMODE_CAN_ATOMIC_WRITE Ritesh Harjani (IBM)
@ 2024-11-01 11:14   ` Jan Kara
  0 siblings, 0 replies; 13+ messages in thread
From: Jan Kara @ 2024-11-01 11:14 UTC (permalink / raw)
  To: Ritesh Harjani (IBM)
  Cc: linux-ext4, Theodore Ts'o, Jan Kara, Darrick J . Wong,
	Christoph Hellwig, John Garry, Ojaswin Mujoo, Dave Chinner,
	linux-kernel, linux-xfs, linux-fsdevel

On Fri 01-11-24 12:20:53, Ritesh Harjani (IBM) wrote:
> FS needs to add the fmode capability in order to support atomic writes
> during file open (refer kiocb_set_rw_flags()). Set this capability on
> a regular file if ext4 can do atomic write.
> 
> Reviewed-by: John Garry <john.g.garry@oracle.com>
> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

									Honza

> ---
>  fs/ext4/file.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/fs/ext4/file.c b/fs/ext4/file.c
> index a7b9b9751a3f..96d936f5584b 100644
> --- a/fs/ext4/file.c
> +++ b/fs/ext4/file.c
> @@ -898,6 +898,9 @@ static int ext4_file_open(struct inode *inode, struct file *filp)
>  			return ret;
>  	}
> 
> +	if (ext4_inode_can_atomic_write(inode))
> +		filp->f_mode |= FMODE_CAN_ATOMIC_WRITE;
> +
>  	filp->f_mode |= FMODE_NOWAIT | FMODE_CAN_ODIRECT;
>  	return dquot_file_open(inode, filp);
>  }
> --
> 2.46.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 2/4] ext4: Check for atomic writes support in write iter
  2024-11-01  6:50 ` [PATCH v4 2/4] ext4: Check for atomic writes support in write iter Ritesh Harjani (IBM)
@ 2024-11-01 11:15   ` Jan Kara
  0 siblings, 0 replies; 13+ messages in thread
From: Jan Kara @ 2024-11-01 11:15 UTC (permalink / raw)
  To: Ritesh Harjani (IBM)
  Cc: linux-ext4, Theodore Ts'o, Jan Kara, Darrick J . Wong,
	Christoph Hellwig, John Garry, Ojaswin Mujoo, Dave Chinner,
	linux-kernel, linux-xfs, linux-fsdevel

On Fri 01-11-24 12:20:52, Ritesh Harjani (IBM) wrote:
> Let's validate the given constraints for atomic write request.
> Otherwise it will fail with -EINVAL. Currently atomic write is only
> supported on DIO, so for buffered-io it will return -EOPNOTSUPP.
> 
> Reviewed-by: John Garry <john.g.garry@oracle.com>
> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ext4/file.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/fs/ext4/file.c b/fs/ext4/file.c
> index f14aed14b9cf..a7b9b9751a3f 100644
> --- a/fs/ext4/file.c
> +++ b/fs/ext4/file.c
> @@ -692,6 +692,20 @@ ext4_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
>  	if (IS_DAX(inode))
>  		return ext4_dax_write_iter(iocb, from);
>  #endif
> +
> +	if (iocb->ki_flags & IOCB_ATOMIC) {
> +		size_t len = iov_iter_count(from);
> +		int ret;
> +
> +		if (len < EXT4_SB(inode->i_sb)->s_awu_min ||
> +		    len > EXT4_SB(inode->i_sb)->s_awu_max)
> +			return -EINVAL;
> +
> +		ret = generic_atomic_write_valid(iocb, from);
> +		if (ret)
> +			return ret;
> +	}
> +
>  	if (iocb->ki_flags & IOCB_DIRECT)
>  		return ext4_dio_write_iter(iocb, from);
>  	else
> --
> 2.46.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 1/4] ext4: Add statx support for atomic writes
  2024-11-01  6:50 ` [PATCH v4 1/4] ext4: Add statx support for atomic writes Ritesh Harjani (IBM)
@ 2024-11-01 11:17   ` Jan Kara
  2024-11-01 15:12   ` Darrick J. Wong
  2024-11-01 16:00   ` John Garry
  2 siblings, 0 replies; 13+ messages in thread
From: Jan Kara @ 2024-11-01 11:17 UTC (permalink / raw)
  To: Ritesh Harjani (IBM)
  Cc: linux-ext4, Theodore Ts'o, Jan Kara, Darrick J . Wong,
	Christoph Hellwig, John Garry, Ojaswin Mujoo, Dave Chinner,
	linux-kernel, linux-xfs, linux-fsdevel

On Fri 01-11-24 12:20:51, Ritesh Harjani (IBM) wrote:
> This patch adds base support for atomic writes via statx getattr.
> On bs < ps systems, we can create FS with say bs of 16k. That means
> both atomic write min and max unit can be set to 16k for supporting
> atomic writes.
> 
> Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>

I guess this is a good start. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ext4/ext4.h  | 10 ++++++++++
>  fs/ext4/inode.c | 12 ++++++++++++
>  fs/ext4/super.c | 31 +++++++++++++++++++++++++++++++
>  3 files changed, 53 insertions(+)
> 
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 44b0d418143c..494d443e9fc9 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -1729,6 +1729,10 @@ struct ext4_sb_info {
>  	 */
>  	struct work_struct s_sb_upd_work;
>  
> +	/* Atomic write unit values in bytes */
> +	unsigned int s_awu_min;
> +	unsigned int s_awu_max;
> +
>  	/* Ext4 fast commit sub transaction ID */
>  	atomic_t s_fc_subtid;
>  
> @@ -3855,6 +3859,12 @@ static inline int ext4_buffer_uptodate(struct buffer_head *bh)
>  	return buffer_uptodate(bh);
>  }
>  
> +static inline bool ext4_inode_can_atomic_write(struct inode *inode)
> +{
> +
> +	return S_ISREG(inode->i_mode) && EXT4_SB(inode->i_sb)->s_awu_min > 0;
> +}
> +
>  extern int ext4_block_write_begin(handle_t *handle, struct folio *folio,
>  				  loff_t pos, unsigned len,
>  				  get_block_t *get_block);
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 54bdd4884fe6..3e827cfa762e 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -5578,6 +5578,18 @@ int ext4_getattr(struct mnt_idmap *idmap, const struct path *path,
>  		}
>  	}
>  
> +	if ((request_mask & STATX_WRITE_ATOMIC) && S_ISREG(inode->i_mode)) {
> +		struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
> +		unsigned int awu_min = 0, awu_max = 0;
> +
> +		if (ext4_inode_can_atomic_write(inode)) {
> +			awu_min = sbi->s_awu_min;
> +			awu_max = sbi->s_awu_max;
> +		}
> +
> +		generic_fill_statx_atomic_writes(stat, awu_min, awu_max);
> +	}
> +
>  	flags = ei->i_flags & EXT4_FL_USER_VISIBLE;
>  	if (flags & EXT4_APPEND_FL)
>  		stat->attributes |= STATX_ATTR_APPEND;
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 16a4ce704460..ebe1660bd840 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -4425,6 +4425,36 @@ static int ext4_handle_clustersize(struct super_block *sb)
>  	return 0;
>  }
>  
> +/*
> + * ext4_atomic_write_init: Initializes filesystem min & max atomic write units.
> + * @sb: super block
> + * TODO: Later add support for bigalloc
> + */
> +static void ext4_atomic_write_init(struct super_block *sb)
> +{
> +	struct ext4_sb_info *sbi = EXT4_SB(sb);
> +	struct block_device *bdev = sb->s_bdev;
> +
> +	if (!bdev_can_atomic_write(bdev))
> +		return;
> +
> +	if (!ext4_has_feature_extents(sb))
> +		return;
> +
> +	sbi->s_awu_min = max(sb->s_blocksize,
> +			      bdev_atomic_write_unit_min_bytes(bdev));
> +	sbi->s_awu_max = min(sb->s_blocksize,
> +			      bdev_atomic_write_unit_max_bytes(bdev));
> +	if (sbi->s_awu_min && sbi->s_awu_max &&
> +	    sbi->s_awu_min <= sbi->s_awu_max) {
> +		ext4_msg(sb, KERN_NOTICE, "Supports (experimental) DIO atomic writes awu_min: %u, awu_max: %u",
> +			 sbi->s_awu_min, sbi->s_awu_max);
> +	} else {
> +		sbi->s_awu_min = 0;
> +		sbi->s_awu_max = 0;
> +	}
> +}
> +
>  static void ext4_fast_commit_init(struct super_block *sb)
>  {
>  	struct ext4_sb_info *sbi = EXT4_SB(sb);
> @@ -5336,6 +5366,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
>  
>  	spin_lock_init(&sbi->s_bdev_wb_lock);
>  
> +	ext4_atomic_write_init(sb);
>  	ext4_fast_commit_init(sb);
>  
>  	sb->s_root = NULL;
> -- 
> 2.46.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 4/4] ext4: Do not fallback to buffered-io for DIO atomic write
  2024-11-01  6:50 ` [PATCH v4 4/4] ext4: Do not fallback to buffered-io for DIO atomic write Ritesh Harjani (IBM)
  2024-11-01 11:13   ` Jan Kara
@ 2024-11-01 14:46   ` Darrick J. Wong
  1 sibling, 0 replies; 13+ messages in thread
From: Darrick J. Wong @ 2024-11-01 14:46 UTC (permalink / raw)
  To: Ritesh Harjani (IBM)
  Cc: linux-ext4, Theodore Ts'o, Jan Kara, Christoph Hellwig,
	John Garry, Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
	linux-fsdevel

On Fri, Nov 01, 2024 at 12:20:54PM +0530, Ritesh Harjani (IBM) wrote:
> atomic writes is currently only supported for single fsblock and only
> for direct-io. We should not return -ENOTBLK for atomic writes since we
> want the atomic write request to either complete fully or fail
> otherwise. Hence, we should never fallback to buffered-io in case of
> DIO atomic write requests.
> Let's also catch if this ever happens by adding some WARN_ON_ONCE before
> buffered-io handling for direct-io atomic writes. More details of the
> discussion [1].
> 
> While at it let's add an inline helper ext4_want_directio_fallback() which
> simplifies the logic checks and inherently fixes condition on when to return
> -ENOTBLK which otherwise was always returning true for any write or directio in
> ext4_iomap_end(). It was ok since ext4 only supports direct-io via iomap.
> 
> [1]: https://lore.kernel.org/linux-xfs/cover.1729825985.git.ritesh.list@gmail.com/T/#m9dbecc11bed713ed0d7a486432c56b105b555f04
> Suggested-by: Darrick J. Wong <djwong@kernel.org> # inline helper

Looks good to me now,
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> ---
>  fs/ext4/file.c  |  7 +++++++
>  fs/ext4/inode.c | 27 ++++++++++++++++++++++-----
>  2 files changed, 29 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/ext4/file.c b/fs/ext4/file.c
> index 96d936f5584b..a7de03e47db0 100644
> --- a/fs/ext4/file.c
> +++ b/fs/ext4/file.c
> @@ -599,6 +599,13 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from)
>  		ssize_t err;
>  		loff_t endbyte;
> 
> +		/*
> +		 * There is no support for atomic writes on buffered-io yet,
> +		 * we should never fallback to buffered-io for DIO atomic
> +		 * writes.
> +		 */
> +		WARN_ON_ONCE(iocb->ki_flags & IOCB_ATOMIC);
> +
>  		offset = iocb->ki_pos;
>  		err = ext4_buffered_write_iter(iocb, from);
>  		if (err < 0)
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 3e827cfa762e..5b9eeb74ce47 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3444,17 +3444,34 @@ static int ext4_iomap_overwrite_begin(struct inode *inode, loff_t offset,
>  	return ret;
>  }
> 
> +static inline bool ext4_want_directio_fallback(unsigned flags, ssize_t written)
> +{
> +	/* must be a directio to fall back to buffered */
> +	if ((flags & (IOMAP_WRITE | IOMAP_DIRECT)) !=
> +		    (IOMAP_WRITE | IOMAP_DIRECT))
> +		return false;
> +
> +	/* atomic writes are all-or-nothing */
> +	if (flags & IOMAP_ATOMIC)
> +		return false;
> +
> +	/* can only try again if we wrote nothing */
> +	return written == 0;
> +}
> +
>  static int ext4_iomap_end(struct inode *inode, loff_t offset, loff_t length,
>  			  ssize_t written, unsigned flags, struct iomap *iomap)
>  {
>  	/*
>  	 * Check to see whether an error occurred while writing out the data to
> -	 * the allocated blocks. If so, return the magic error code so that we
> -	 * fallback to buffered I/O and attempt to complete the remainder of
> -	 * the I/O. Any blocks that may have been allocated in preparation for
> -	 * the direct I/O will be reused during buffered I/O.
> +	 * the allocated blocks. If so, return the magic error code for
> +	 * non-atomic write so that we fallback to buffered I/O and attempt to
> +	 * complete the remainder of the I/O.
> +	 * For non-atomic writes, any blocks that may have been
> +	 * allocated in preparation for the direct I/O will be reused during
> +	 * buffered I/O. For atomic write, we never fallback to buffered-io.
>  	 */
> -	if (flags & (IOMAP_WRITE | IOMAP_DIRECT) && written == 0)
> +	if (ext4_want_directio_fallback(flags, written))
>  		return -ENOTBLK;
> 
>  	return 0;
> --
> 2.46.0
> 
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 1/4] ext4: Add statx support for atomic writes
  2024-11-01  6:50 ` [PATCH v4 1/4] ext4: Add statx support for atomic writes Ritesh Harjani (IBM)
  2024-11-01 11:17   ` Jan Kara
@ 2024-11-01 15:12   ` Darrick J. Wong
  2024-11-01 16:00   ` John Garry
  2 siblings, 0 replies; 13+ messages in thread
From: Darrick J. Wong @ 2024-11-01 15:12 UTC (permalink / raw)
  To: Ritesh Harjani (IBM)
  Cc: linux-ext4, Theodore Ts'o, Jan Kara, Christoph Hellwig,
	John Garry, Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
	linux-fsdevel

On Fri, Nov 01, 2024 at 12:20:51PM +0530, Ritesh Harjani (IBM) wrote:
> This patch adds base support for atomic writes via statx getattr.
> On bs < ps systems, we can create FS with say bs of 16k. That means
> both atomic write min and max unit can be set to 16k for supporting
> atomic writes.
> 
> Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>

Looks good to me,
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> ---
>  fs/ext4/ext4.h  | 10 ++++++++++
>  fs/ext4/inode.c | 12 ++++++++++++
>  fs/ext4/super.c | 31 +++++++++++++++++++++++++++++++
>  3 files changed, 53 insertions(+)
> 
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 44b0d418143c..494d443e9fc9 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -1729,6 +1729,10 @@ struct ext4_sb_info {
>  	 */
>  	struct work_struct s_sb_upd_work;
>  
> +	/* Atomic write unit values in bytes */
> +	unsigned int s_awu_min;
> +	unsigned int s_awu_max;
> +
>  	/* Ext4 fast commit sub transaction ID */
>  	atomic_t s_fc_subtid;
>  
> @@ -3855,6 +3859,12 @@ static inline int ext4_buffer_uptodate(struct buffer_head *bh)
>  	return buffer_uptodate(bh);
>  }
>  
> +static inline bool ext4_inode_can_atomic_write(struct inode *inode)
> +{
> +
> +	return S_ISREG(inode->i_mode) && EXT4_SB(inode->i_sb)->s_awu_min > 0;
> +}
> +
>  extern int ext4_block_write_begin(handle_t *handle, struct folio *folio,
>  				  loff_t pos, unsigned len,
>  				  get_block_t *get_block);
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 54bdd4884fe6..3e827cfa762e 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -5578,6 +5578,18 @@ int ext4_getattr(struct mnt_idmap *idmap, const struct path *path,
>  		}
>  	}
>  
> +	if ((request_mask & STATX_WRITE_ATOMIC) && S_ISREG(inode->i_mode)) {
> +		struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
> +		unsigned int awu_min = 0, awu_max = 0;
> +
> +		if (ext4_inode_can_atomic_write(inode)) {
> +			awu_min = sbi->s_awu_min;
> +			awu_max = sbi->s_awu_max;
> +		}
> +
> +		generic_fill_statx_atomic_writes(stat, awu_min, awu_max);
> +	}
> +
>  	flags = ei->i_flags & EXT4_FL_USER_VISIBLE;
>  	if (flags & EXT4_APPEND_FL)
>  		stat->attributes |= STATX_ATTR_APPEND;
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 16a4ce704460..ebe1660bd840 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -4425,6 +4425,36 @@ static int ext4_handle_clustersize(struct super_block *sb)
>  	return 0;
>  }
>  
> +/*
> + * ext4_atomic_write_init: Initializes filesystem min & max atomic write units.
> + * @sb: super block
> + * TODO: Later add support for bigalloc
> + */
> +static void ext4_atomic_write_init(struct super_block *sb)
> +{
> +	struct ext4_sb_info *sbi = EXT4_SB(sb);
> +	struct block_device *bdev = sb->s_bdev;
> +
> +	if (!bdev_can_atomic_write(bdev))
> +		return;
> +
> +	if (!ext4_has_feature_extents(sb))
> +		return;
> +
> +	sbi->s_awu_min = max(sb->s_blocksize,
> +			      bdev_atomic_write_unit_min_bytes(bdev));
> +	sbi->s_awu_max = min(sb->s_blocksize,
> +			      bdev_atomic_write_unit_max_bytes(bdev));
> +	if (sbi->s_awu_min && sbi->s_awu_max &&
> +	    sbi->s_awu_min <= sbi->s_awu_max) {
> +		ext4_msg(sb, KERN_NOTICE, "Supports (experimental) DIO atomic writes awu_min: %u, awu_max: %u",
> +			 sbi->s_awu_min, sbi->s_awu_max);
> +	} else {
> +		sbi->s_awu_min = 0;
> +		sbi->s_awu_max = 0;
> +	}
> +}
> +
>  static void ext4_fast_commit_init(struct super_block *sb)
>  {
>  	struct ext4_sb_info *sbi = EXT4_SB(sb);
> @@ -5336,6 +5366,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
>  
>  	spin_lock_init(&sbi->s_bdev_wb_lock);
>  
> +	ext4_atomic_write_init(sb);
>  	ext4_fast_commit_init(sb);
>  
>  	sb->s_root = NULL;
> -- 
> 2.46.0
> 
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 1/4] ext4: Add statx support for atomic writes
  2024-11-01  6:50 ` [PATCH v4 1/4] ext4: Add statx support for atomic writes Ritesh Harjani (IBM)
  2024-11-01 11:17   ` Jan Kara
  2024-11-01 15:12   ` Darrick J. Wong
@ 2024-11-01 16:00   ` John Garry
  2024-11-01 17:23     ` Ritesh Harjani
  2 siblings, 1 reply; 13+ messages in thread
From: John Garry @ 2024-11-01 16:00 UTC (permalink / raw)
  To: Ritesh Harjani (IBM), linux-ext4
  Cc: Theodore Ts'o, Jan Kara, Darrick J . Wong, Christoph Hellwig,
	Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
	linux-fsdevel

On 01/11/2024 06:50, Ritesh Harjani (IBM) wrote:
> This patch adds base support for atomic writes via statx getattr.
> On bs < ps systems, we can create FS with say bs of 16k. That means
> both atomic write min and max unit can be set to 16k for supporting
> atomic writes.
> 
> Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>

Regardless of nitpicks:

Reviewed-by: John Garry <john.g.garry@oracle.com>

> ---
>   fs/ext4/ext4.h  | 10 ++++++++++
>   fs/ext4/inode.c | 12 ++++++++++++
>   fs/ext4/super.c | 31 +++++++++++++++++++++++++++++++
>   3 files changed, 53 insertions(+)
> 
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 44b0d418143c..494d443e9fc9 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -1729,6 +1729,10 @@ struct ext4_sb_info {
>   	 */
>   	struct work_struct s_sb_upd_work;
>   
> +	/* Atomic write unit values in bytes */
> +	unsigned int s_awu_min;
> +	unsigned int s_awu_max;
> +
>   	/* Ext4 fast commit sub transaction ID */
>   	atomic_t s_fc_subtid;
>   
> @@ -3855,6 +3859,12 @@ static inline int ext4_buffer_uptodate(struct buffer_head *bh)
>   	return buffer_uptodate(bh);
>   }
>   
> +static inline bool ext4_inode_can_atomic_write(struct inode *inode)
> +{
> +

nit: superfluous blank line

> +	return S_ISREG(inode->i_mode) && EXT4_SB(inode->i_sb)->s_awu_min > 0;

I am not sure if the S_ISREG() check is required. Other callers also do 
the check (like ext4_getattr() for when calling 
ext4_inode_can_atomic_write()) or don't need it (ext4_file_open()). I 
say ext4_file_open() doesn't need it as ext4_file_open() is only ever 
called for regular files, right?

> +}
> +
>   extern int ext4_block_write_begin(handle_t *handle, struct folio *folio,
>   				  loff_t pos, unsigned len,
>   				  get_block_t *get_block);
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 54bdd4884fe6..3e827cfa762e 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -5578,6 +5578,18 @@ int ext4_getattr(struct mnt_idmap *idmap, const struct path *path,
>   		}
>   	}
>   
> +	if ((request_mask & STATX_WRITE_ATOMIC) && S_ISREG(inode->i_mode)) {

nit: maybe you could have factored out the S_ISREG() check with 
STATX_DIOALIGN

> +		struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
> +		unsigned int awu_min = 0, awu_max = 0;
> +
> +		if (ext4_inode_can_atomic_write(inode)) {
> +			awu_min = sbi->s_awu_min;
> +			awu_max = sbi->s_awu_max;
> +		}
> +
> +		generic_fill_statx_atomic_writes(stat, awu_min, awu_max);
> +	}
> +
>   	flags = ei->i_flags & EXT4_FL_USER_VISIBLE;
>   	if (flags & EXT4_APPEND_FL)
>   		stat->attributes |= STATX_ATTR_APPEND;
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 16a4ce704460..ebe1660bd840 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -4425,6 +4425,36 @@ static int ext4_handle_clustersize(struct super_block *sb)
>   	return 0;
>   }
>   
> +/*
> + * ext4_atomic_write_init: Initializes filesystem min & max atomic write units.
> + * @sb: super block
> + * TODO: Later add support for bigalloc
> + */
> +static void ext4_atomic_write_init(struct super_block *sb)
> +{
> +	struct ext4_sb_info *sbi = EXT4_SB(sb);
> +	struct block_device *bdev = sb->s_bdev;
> +
> +	if (!bdev_can_atomic_write(bdev))
> +		return;
> +
> +	if (!ext4_has_feature_extents(sb))
> +		return;
> +
> +	sbi->s_awu_min = max(sb->s_blocksize,
> +			      bdev_atomic_write_unit_min_bytes(bdev));
> +	sbi->s_awu_max = min(sb->s_blocksize,
> +			      bdev_atomic_write_unit_max_bytes(bdev));
> +	if (sbi->s_awu_min && sbi->s_awu_max &&
> +	    sbi->s_awu_min <= sbi->s_awu_max) {
> +		ext4_msg(sb, KERN_NOTICE, "Supports (experimental) DIO atomic writes awu_min: %u, awu_max: %u",
> +			 sbi->s_awu_min, sbi->s_awu_max);
> +	} else {
> +		sbi->s_awu_min = 0;
> +		sbi->s_awu_max = 0;
> +	}
> +}
> +
>   static void ext4_fast_commit_init(struct super_block *sb)
>   {
>   	struct ext4_sb_info *sbi = EXT4_SB(sb);
> @@ -5336,6 +5366,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
>   
>   	spin_lock_init(&sbi->s_bdev_wb_lock);
>   
> +	ext4_atomic_write_init(sb);
>   	ext4_fast_commit_init(sb);
>   
>   	sb->s_root = NULL;


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 1/4] ext4: Add statx support for atomic writes
  2024-11-01 16:00   ` John Garry
@ 2024-11-01 17:23     ` Ritesh Harjani
  0 siblings, 0 replies; 13+ messages in thread
From: Ritesh Harjani @ 2024-11-01 17:23 UTC (permalink / raw)
  To: John Garry, linux-ext4
  Cc: Theodore Ts'o, Jan Kara, Darrick J . Wong, Christoph Hellwig,
	Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
	linux-fsdevel

John Garry <john.g.garry@oracle.com> writes:

> On 01/11/2024 06:50, Ritesh Harjani (IBM) wrote:
>> This patch adds base support for atomic writes via statx getattr.
>> On bs < ps systems, we can create FS with say bs of 16k. That means
>> both atomic write min and max unit can be set to 16k for supporting
>> atomic writes.
>> 
>> Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
>> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
>> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
>
> Regardless of nitpicks:
>
> Reviewed-by: John Garry <john.g.garry@oracle.com>
>

Thanks John for the review!

Since as you too mentioned the remaining points are minor and not
critical review comments, I will address them next time in the
multi-fsblock variant. With all other aspects now finalized in this v4
version, this looks ready to be picked up for the merge window. 

-ritesh

>> ---
>>   fs/ext4/ext4.h  | 10 ++++++++++
>>   fs/ext4/inode.c | 12 ++++++++++++
>>   fs/ext4/super.c | 31 +++++++++++++++++++++++++++++++
>>   3 files changed, 53 insertions(+)
>> 
>> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
>> index 44b0d418143c..494d443e9fc9 100644
>> --- a/fs/ext4/ext4.h
>> +++ b/fs/ext4/ext4.h
>> @@ -1729,6 +1729,10 @@ struct ext4_sb_info {
>>   	 */
>>   	struct work_struct s_sb_upd_work;
>>   
>> +	/* Atomic write unit values in bytes */
>> +	unsigned int s_awu_min;
>> +	unsigned int s_awu_max;
>> +
>>   	/* Ext4 fast commit sub transaction ID */
>>   	atomic_t s_fc_subtid;
>>   
>> @@ -3855,6 +3859,12 @@ static inline int ext4_buffer_uptodate(struct buffer_head *bh)
>>   	return buffer_uptodate(bh);
>>   }
>>   
>> +static inline bool ext4_inode_can_atomic_write(struct inode *inode)
>> +{
>> +
>
> nit: superfluous blank line
>

Sure.

>> +	return S_ISREG(inode->i_mode) && EXT4_SB(inode->i_sb)->s_awu_min > 0;
>
> I am not sure if the S_ISREG() check is required. Other callers also do 
> the check (like ext4_getattr() for when calling 
> ext4_inode_can_atomic_write()) or don't need it (ext4_file_open()). I 
> say ext4_file_open() doesn't need it as ext4_file_open() is only ever 
> called for regular files, right?
>

Yes. However I believe we might end up using this from other places when
we add support of extsize. So we might need S_ISREG check.
But sure let me re-think on that during the multi-fsblock variant time.

>> +}
>> +
>>   extern int ext4_block_write_begin(handle_t *handle, struct folio *folio,
>>   				  loff_t pos, unsigned len,
>>   				  get_block_t *get_block);
>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>> index 54bdd4884fe6..3e827cfa762e 100644
>> --- a/fs/ext4/inode.c
>> +++ b/fs/ext4/inode.c
>> @@ -5578,6 +5578,18 @@ int ext4_getattr(struct mnt_idmap *idmap, const struct path *path,
>>   		}
>>   	}
>>   
>> +	if ((request_mask & STATX_WRITE_ATOMIC) && S_ISREG(inode->i_mode)) {
>
> nit: maybe you could have factored out the S_ISREG() check with 
> STATX_DIOALIGN
>

Sure.

>> +		struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
>> +		unsigned int awu_min = 0, awu_max = 0;
>> +
>> +		if (ext4_inode_can_atomic_write(inode)) {
>> +			awu_min = sbi->s_awu_min;
>> +			awu_max = sbi->s_awu_max;
>> +		}
>> +
>> +		generic_fill_statx_atomic_writes(stat, awu_min, awu_max);
>> +	}
>> +
>>   	flags = ei->i_flags & EXT4_FL_USER_VISIBLE;
>>   	if (flags & EXT4_APPEND_FL)
>>   		stat->attributes |= STATX_ATTR_APPEND;
>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
>> index 16a4ce704460..ebe1660bd840 100644
>> --- a/fs/ext4/super.c
>> +++ b/fs/ext4/super.c
>> @@ -4425,6 +4425,36 @@ static int ext4_handle_clustersize(struct super_block *sb)
>>   	return 0;
>>   }
>>   
>> +/*
>> + * ext4_atomic_write_init: Initializes filesystem min & max atomic write units.
>> + * @sb: super block
>> + * TODO: Later add support for bigalloc
>> + */
>> +static void ext4_atomic_write_init(struct super_block *sb)
>> +{
>> +	struct ext4_sb_info *sbi = EXT4_SB(sb);
>> +	struct block_device *bdev = sb->s_bdev;
>> +
>> +	if (!bdev_can_atomic_write(bdev))
>> +		return;
>> +
>> +	if (!ext4_has_feature_extents(sb))
>> +		return;
>> +
>> +	sbi->s_awu_min = max(sb->s_blocksize,
>> +			      bdev_atomic_write_unit_min_bytes(bdev));
>> +	sbi->s_awu_max = min(sb->s_blocksize,
>> +			      bdev_atomic_write_unit_max_bytes(bdev));
>> +	if (sbi->s_awu_min && sbi->s_awu_max &&
>> +	    sbi->s_awu_min <= sbi->s_awu_max) {
>> +		ext4_msg(sb, KERN_NOTICE, "Supports (experimental) DIO atomic writes awu_min: %u, awu_max: %u",
>> +			 sbi->s_awu_min, sbi->s_awu_max);
>> +	} else {
>> +		sbi->s_awu_min = 0;
>> +		sbi->s_awu_max = 0;
>> +	}
>> +}
>> +
>>   static void ext4_fast_commit_init(struct super_block *sb)
>>   {
>>   	struct ext4_sb_info *sbi = EXT4_SB(sb);
>> @@ -5336,6 +5366,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
>>   
>>   	spin_lock_init(&sbi->s_bdev_wb_lock);
>>   
>> +	ext4_atomic_write_init(sb);
>>   	ext4_fast_commit_init(sb);
>>   
>>   	sb->s_root = NULL;

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2024-11-01 18:02 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-01  6:50 [PATCH v4 0/4] ext4: Add atomic writes support for DIO Ritesh Harjani (IBM)
2024-11-01  6:50 ` [PATCH v4 1/4] ext4: Add statx support for atomic writes Ritesh Harjani (IBM)
2024-11-01 11:17   ` Jan Kara
2024-11-01 15:12   ` Darrick J. Wong
2024-11-01 16:00   ` John Garry
2024-11-01 17:23     ` Ritesh Harjani
2024-11-01  6:50 ` [PATCH v4 2/4] ext4: Check for atomic writes support in write iter Ritesh Harjani (IBM)
2024-11-01 11:15   ` Jan Kara
2024-11-01  6:50 ` [PATCH v4 3/4] ext4: Support setting FMODE_CAN_ATOMIC_WRITE Ritesh Harjani (IBM)
2024-11-01 11:14   ` Jan Kara
2024-11-01  6:50 ` [PATCH v4 4/4] ext4: Do not fallback to buffered-io for DIO atomic write Ritesh Harjani (IBM)
2024-11-01 11:13   ` Jan Kara
2024-11-01 14:46   ` Darrick J. Wong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).