* [PATCH v2 0/4] ext4: Add atomic writes support for DIO
@ 2024-10-27 18:17 Ritesh Harjani (IBM)
2024-10-27 18:17 ` [PATCH v2 1/4] ext4: Add statx support for atomic writes Ritesh Harjani (IBM)
` (3 more replies)
0 siblings, 4 replies; 12+ messages in thread
From: Ritesh Harjani (IBM) @ 2024-10-27 18:17 UTC (permalink / raw)
To: linux-ext4
Cc: Theodore Ts'o, Jan Kara, Darrick J . Wong, Christoph Hellwig,
John Garry, Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
linux-fsdevel, Ritesh Harjani (IBM)
This is v2 of atomic write enablement on ext4 for DIO. I have split this series
as discussed in v1 [1], so this only enables atomic writes to a single fsblock.
That means for now this gets only enabled on bs < ps systems on ext4.
Enablement of atomic writes for bigalloc (multi-fsblock support) is still
under discussion and may require general consensus within the filesystem
community [1].
This series adds the base feature support to enable atomic writes in
direct-io path for ext4. We advertise the minimum and the maximum atomic
write unit sizes via statx on a regular file.
This series allows users to utilize atomic write support using -
1. on bs < ps systems via - mkfs.ext4 -F -b 16384 /dev/sda
This can then be utilized using -
xfs_io -fdc "pwrite -V 1 -A -b16k 0 16k" /mnt/f1
This is built on top of John's DIO atomic write series for XFS [2].
The VFS and block layer enablement for atomic writes were merged already.
[1]: https://lore.kernel.org/linux-ext4/87jzdvmqfz.fsf@gmail.com
[2]: https://lore.kernel.org/linux-xfs/20241019125113.369994-1-john.g.garry@oracle.com/
Changelogs:
===========
PATCH -> PATCH v2:
- addressed review comments from John and Darrick.
- renamed ext4_sb_info variables names: fs_awu* -> s_awu*
- [PATCH]: https://lore.kernel.org/linux-ext4/cover.1729825985.git.ritesh.list@gmail.com/
RFC -> PATCH:
- Dropped RFC tag
- Last RFC was posted a while ago but back then a lot of VFS and block layer
interfaces were still not merged. Those are now merged, thanks to John and
everyone else.
- [RFC] - https://lore.kernel.org/linux-ext4/cover.1709356594.git.ritesh.list@gmail.com/
Ritesh Harjani (IBM) (4):
ext4: Add statx support for atomic writes
ext4: Check for atomic writes support in write iter
ext4: Support setting FMODE_CAN_ATOMIC_WRITE
ext4: Do not fallback to buffered-io for DIO atomic write
fs/ext4/ext4.h | 9 +++++++++
fs/ext4/file.c | 29 ++++++++++++++++++++++++++++-
fs/ext4/inode.c | 14 ++++++++++++++
fs/ext4/super.c | 31 +++++++++++++++++++++++++++++++
4 files changed, 82 insertions(+), 1 deletion(-)
--
2.46.0
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v2 1/4] ext4: Add statx support for atomic writes
2024-10-27 18:17 [PATCH v2 0/4] ext4: Add atomic writes support for DIO Ritesh Harjani (IBM)
@ 2024-10-27 18:17 ` Ritesh Harjani (IBM)
2024-10-29 8:54 ` John Garry
2024-10-27 18:17 ` [PATCH v2 2/4] ext4: Check for atomic writes support in write iter Ritesh Harjani (IBM)
` (2 subsequent siblings)
3 siblings, 1 reply; 12+ messages in thread
From: Ritesh Harjani (IBM) @ 2024-10-27 18:17 UTC (permalink / raw)
To: linux-ext4
Cc: Theodore Ts'o, Jan Kara, Darrick J . Wong, Christoph Hellwig,
John Garry, Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
linux-fsdevel, Ritesh Harjani (IBM)
This patch adds base support for atomic writes via statx getattr.
On bs < ps systems, we can create FS with say bs of 16k. That means
both atomic write min and max unit can be set to 16k for supporting
atomic writes.
Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
fs/ext4/ext4.h | 9 +++++++++
fs/ext4/inode.c | 14 ++++++++++++++
fs/ext4/super.c | 31 +++++++++++++++++++++++++++++++
3 files changed, 54 insertions(+)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 44b0d418143c..6ee49aaacd2b 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1729,6 +1729,10 @@ struct ext4_sb_info {
*/
struct work_struct s_sb_upd_work;
+ /* Atomic write unit values in bytes */
+ unsigned int s_awu_min;
+ unsigned int s_awu_max;
+
/* Ext4 fast commit sub transaction ID */
atomic_t s_fc_subtid;
@@ -3855,6 +3859,11 @@ static inline int ext4_buffer_uptodate(struct buffer_head *bh)
return buffer_uptodate(bh);
}
+static inline bool ext4_can_atomic_write(struct super_block *sb)
+{
+ return EXT4_SB(sb)->s_awu_min > 0;
+}
+
extern int ext4_block_write_begin(handle_t *handle, struct folio *folio,
loff_t pos, unsigned len,
get_block_t *get_block);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 54bdd4884fe6..fcdee27b9aa2 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5578,6 +5578,20 @@ int ext4_getattr(struct mnt_idmap *idmap, const struct path *path,
}
}
+ if (S_ISREG(inode->i_mode) && (request_mask & STATX_WRITE_ATOMIC)) {
+ struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
+ unsigned int awu_min, awu_max;
+
+ if (ext4_can_atomic_write(inode->i_sb)) {
+ awu_min = sbi->s_awu_min;
+ awu_max = sbi->s_awu_max;
+ } else {
+ awu_min = awu_max = 0;
+ }
+
+ generic_fill_statx_atomic_writes(stat, awu_min, awu_max);
+ }
+
flags = ei->i_flags & EXT4_FL_USER_VISIBLE;
if (flags & EXT4_APPEND_FL)
stat->attributes |= STATX_ATTR_APPEND;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 16a4ce704460..d6e3201a48be 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -4425,6 +4425,36 @@ static int ext4_handle_clustersize(struct super_block *sb)
return 0;
}
+/*
+ * ext4_atomic_write_init: Initializes filesystem min & max atomic write units.
+ * @sb: super block
+ * TODO: Later add support for bigalloc
+ */
+static void ext4_atomic_write_init(struct super_block *sb)
+{
+ struct ext4_sb_info *sbi = EXT4_SB(sb);
+ struct block_device *bdev = sb->s_bdev;
+
+ if (!bdev_can_atomic_write(bdev))
+ return;
+
+ if (!ext4_has_feature_extents(sb))
+ return;
+
+ sbi->s_awu_min = max(sb->s_blocksize,
+ bdev_atomic_write_unit_min_bytes(bdev));
+ sbi->s_awu_max = min(sb->s_blocksize,
+ bdev_atomic_write_unit_max_bytes(bdev));
+ if (sbi->s_awu_min && sbi->s_awu_max &&
+ sbi->s_awu_min <= sbi->s_awu_max) {
+ ext4_msg(sb, KERN_NOTICE, "Supports atomic writes awu_min: %u, awu_max: %u",
+ sbi->s_awu_min, sbi->s_awu_max);
+ } else {
+ sbi->s_awu_min = 0;
+ sbi->s_awu_max = 0;
+ }
+}
+
static void ext4_fast_commit_init(struct super_block *sb)
{
struct ext4_sb_info *sbi = EXT4_SB(sb);
@@ -5336,6 +5366,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
spin_lock_init(&sbi->s_bdev_wb_lock);
+ ext4_atomic_write_init(sb);
ext4_fast_commit_init(sb);
sb->s_root = NULL;
--
2.46.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v2 2/4] ext4: Check for atomic writes support in write iter
2024-10-27 18:17 [PATCH v2 0/4] ext4: Add atomic writes support for DIO Ritesh Harjani (IBM)
2024-10-27 18:17 ` [PATCH v2 1/4] ext4: Add statx support for atomic writes Ritesh Harjani (IBM)
@ 2024-10-27 18:17 ` Ritesh Harjani (IBM)
2024-10-29 8:55 ` John Garry
2024-10-27 18:17 ` [PATCH v2 3/4] ext4: Support setting FMODE_CAN_ATOMIC_WRITE Ritesh Harjani (IBM)
2024-10-27 18:17 ` [PATCH v2 4/4] ext4: Do not fallback to buffered-io for DIO atomic write Ritesh Harjani (IBM)
3 siblings, 1 reply; 12+ messages in thread
From: Ritesh Harjani (IBM) @ 2024-10-27 18:17 UTC (permalink / raw)
To: linux-ext4
Cc: Theodore Ts'o, Jan Kara, Darrick J . Wong, Christoph Hellwig,
John Garry, Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
linux-fsdevel, Ritesh Harjani (IBM)
Let's validate the given constraints for atomic write request.
Otherwise it will fail with -EINVAL. Currently atomic write is only
supported on DIO, so for buffered-io it will return -EOPNOTSUPP.
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
fs/ext4/file.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index f14aed14b9cf..a7b9b9751a3f 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -692,6 +692,20 @@ ext4_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
if (IS_DAX(inode))
return ext4_dax_write_iter(iocb, from);
#endif
+
+ if (iocb->ki_flags & IOCB_ATOMIC) {
+ size_t len = iov_iter_count(from);
+ int ret;
+
+ if (len < EXT4_SB(inode->i_sb)->s_awu_min ||
+ len > EXT4_SB(inode->i_sb)->s_awu_max)
+ return -EINVAL;
+
+ ret = generic_atomic_write_valid(iocb, from);
+ if (ret)
+ return ret;
+ }
+
if (iocb->ki_flags & IOCB_DIRECT)
return ext4_dio_write_iter(iocb, from);
else
--
2.46.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v2 3/4] ext4: Support setting FMODE_CAN_ATOMIC_WRITE
2024-10-27 18:17 [PATCH v2 0/4] ext4: Add atomic writes support for DIO Ritesh Harjani (IBM)
2024-10-27 18:17 ` [PATCH v2 1/4] ext4: Add statx support for atomic writes Ritesh Harjani (IBM)
2024-10-27 18:17 ` [PATCH v2 2/4] ext4: Check for atomic writes support in write iter Ritesh Harjani (IBM)
@ 2024-10-27 18:17 ` Ritesh Harjani (IBM)
2024-10-29 8:56 ` John Garry
2024-10-27 18:17 ` [PATCH v2 4/4] ext4: Do not fallback to buffered-io for DIO atomic write Ritesh Harjani (IBM)
3 siblings, 1 reply; 12+ messages in thread
From: Ritesh Harjani (IBM) @ 2024-10-27 18:17 UTC (permalink / raw)
To: linux-ext4
Cc: Theodore Ts'o, Jan Kara, Darrick J . Wong, Christoph Hellwig,
John Garry, Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
linux-fsdevel, Ritesh Harjani (IBM)
FS needs to add the fmode capability in order to support atomic writes
during file open (refer kiocb_set_rw_flags()). Set this capability on
a regular file if ext4 can do atomic write.
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
fs/ext4/file.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index a7b9b9751a3f..8116bd78910b 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -898,6 +898,9 @@ static int ext4_file_open(struct inode *inode, struct file *filp)
return ret;
}
+ if (S_ISREG(inode->i_mode) && ext4_can_atomic_write(inode->i_sb))
+ filp->f_mode |= FMODE_CAN_ATOMIC_WRITE;
+
filp->f_mode |= FMODE_NOWAIT | FMODE_CAN_ODIRECT;
return dquot_file_open(inode, filp);
}
--
2.46.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v2 4/4] ext4: Do not fallback to buffered-io for DIO atomic write
2024-10-27 18:17 [PATCH v2 0/4] ext4: Add atomic writes support for DIO Ritesh Harjani (IBM)
` (2 preceding siblings ...)
2024-10-27 18:17 ` [PATCH v2 3/4] ext4: Support setting FMODE_CAN_ATOMIC_WRITE Ritesh Harjani (IBM)
@ 2024-10-27 18:17 ` Ritesh Harjani (IBM)
2024-10-29 8:57 ` John Garry
3 siblings, 1 reply; 12+ messages in thread
From: Ritesh Harjani (IBM) @ 2024-10-27 18:17 UTC (permalink / raw)
To: linux-ext4
Cc: Theodore Ts'o, Jan Kara, Darrick J . Wong, Christoph Hellwig,
John Garry, Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
linux-fsdevel, Ritesh Harjani (IBM)
iomap can return -ENOTBLK if pagecache invalidation fails.
Let's make sure if -ENOTBLK is ever returned for atomic
writes than we fail the write request (-EIO) instead of
fallback to buffered-io.
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
fs/ext4/file.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 8116bd78910b..22d31b4fdff3 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -576,8 +576,18 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from)
iomap_ops = &ext4_iomap_overwrite_ops;
ret = iomap_dio_rw(iocb, from, iomap_ops, &ext4_dio_write_ops,
dio_flags, NULL, 0);
- if (ret == -ENOTBLK)
+ if (ret == -ENOTBLK) {
ret = 0;
+ /*
+ * iomap can return -ENOTBLK if pagecache invalidation fails.
+ * Let's make sure if -ENOTBLK is ever returned for atomic
+ * writes than we fail the write request instead of fallback
+ * to buffered-io.
+ */
+ if (iocb->ki_flags & IOCB_ATOMIC)
+ ret = -EIO;
+ }
+
if (extend) {
/*
* We always perform extending DIO write synchronously so by
--
2.46.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/4] ext4: Add statx support for atomic writes
2024-10-27 18:17 ` [PATCH v2 1/4] ext4: Add statx support for atomic writes Ritesh Harjani (IBM)
@ 2024-10-29 8:54 ` John Garry
2024-10-29 9:29 ` Ritesh Harjani
0 siblings, 1 reply; 12+ messages in thread
From: John Garry @ 2024-10-29 8:54 UTC (permalink / raw)
To: Ritesh Harjani (IBM), linux-ext4
Cc: Theodore Ts'o, Jan Kara, Darrick J . Wong, Christoph Hellwig,
Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
linux-fsdevel
On 27/10/2024 18:17, Ritesh Harjani (IBM) wrote:
> This patch adds base support for atomic writes via statx getattr.
> On bs < ps systems, we can create FS with say bs of 16k. That means
> both atomic write min and max unit can be set to 16k for supporting
> atomic writes.
>
> Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> ---
> fs/ext4/ext4.h | 9 +++++++++
> fs/ext4/inode.c | 14 ++++++++++++++
> fs/ext4/super.c | 31 +++++++++++++++++++++++++++++++
> 3 files changed, 54 insertions(+)
>
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 44b0d418143c..6ee49aaacd2b 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -1729,6 +1729,10 @@ struct ext4_sb_info {
> */
> struct work_struct s_sb_upd_work;
>
> + /* Atomic write unit values in bytes */
> + unsigned int s_awu_min;
> + unsigned int s_awu_max;
> +
> /* Ext4 fast commit sub transaction ID */
> atomic_t s_fc_subtid;
>
> @@ -3855,6 +3859,11 @@ static inline int ext4_buffer_uptodate(struct buffer_head *bh)
> return buffer_uptodate(bh);
> }
>
> +static inline bool ext4_can_atomic_write(struct super_block *sb)
> +{
> + return EXT4_SB(sb)->s_awu_min > 0;
> +}
> +
> extern int ext4_block_write_begin(handle_t *handle, struct folio *folio,
> loff_t pos, unsigned len,
> get_block_t *get_block);
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 54bdd4884fe6..fcdee27b9aa2 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -5578,6 +5578,20 @@ int ext4_getattr(struct mnt_idmap *idmap, const struct path *path,
> }
> }
>
> + if (S_ISREG(inode->i_mode) && (request_mask & STATX_WRITE_ATOMIC)) {
> + struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
> + unsigned int awu_min, awu_max;
> +
> + if (ext4_can_atomic_write(inode->i_sb)) {
> + awu_min = sbi->s_awu_min;
> + awu_max = sbi->s_awu_max;
> + } else {
> + awu_min = awu_max = 0;
> + }
> +
> + generic_fill_statx_atomic_writes(stat, awu_min, awu_max);
> + }
> +
> flags = ei->i_flags & EXT4_FL_USER_VISIBLE;
> if (flags & EXT4_APPEND_FL)
> stat->attributes |= STATX_ATTR_APPEND;
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 16a4ce704460..d6e3201a48be 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -4425,6 +4425,36 @@ static int ext4_handle_clustersize(struct super_block *sb)
> return 0;
> }
>
> +/*
> + * ext4_atomic_write_init: Initializes filesystem min & max atomic write units.
> + * @sb: super block
> + * TODO: Later add support for bigalloc
> + */
> +static void ext4_atomic_write_init(struct super_block *sb)
> +{
> + struct ext4_sb_info *sbi = EXT4_SB(sb);
> + struct block_device *bdev = sb->s_bdev;
> +
> + if (!bdev_can_atomic_write(bdev))
this check is duplicated, since bdev_atomic_write_unit_{min,
max}_bytes() has this check
> + return;
> +
> + if (!ext4_has_feature_extents(sb))
> + return;
> +
> + sbi->s_awu_min = max(sb->s_blocksize,
> + bdev_atomic_write_unit_min_bytes(bdev));
> + sbi->s_awu_max = min(sb->s_blocksize,
> + bdev_atomic_write_unit_max_bytes(bdev));
> + if (sbi->s_awu_min && sbi->s_awu_max &&
> + sbi->s_awu_min <= sbi->s_awu_max) {
This looks a bit complicated. I would just follow the XFS example and
ensure bdev_atomic_write_unit_min_bytes() <= sb->s_blocksize <=
bdev_atomic_write_unit_max_bytes() [which you are doing, but in a
complicated way]
> + ext4_msg(sb, KERN_NOTICE, "Supports atomic writes awu_min: %u, awu_max: %u",
> + sbi->s_awu_min, sbi->s_awu_max);
> + } else {
> + sbi->s_awu_min = 0;
> + sbi->s_awu_max = 0;
> + }
> +}
> +
> static void ext4_fast_commit_init(struct super_block *sb)
> {
> struct ext4_sb_info *sbi = EXT4_SB(sb);
> @@ -5336,6 +5366,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
>
> spin_lock_init(&sbi->s_bdev_wb_lock);
>
> + ext4_atomic_write_init(sb);
> ext4_fast_commit_init(sb);
>
> sb->s_root = NULL;
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 2/4] ext4: Check for atomic writes support in write iter
2024-10-27 18:17 ` [PATCH v2 2/4] ext4: Check for atomic writes support in write iter Ritesh Harjani (IBM)
@ 2024-10-29 8:55 ` John Garry
0 siblings, 0 replies; 12+ messages in thread
From: John Garry @ 2024-10-29 8:55 UTC (permalink / raw)
To: Ritesh Harjani (IBM), linux-ext4
Cc: Theodore Ts'o, Jan Kara, Darrick J . Wong, Christoph Hellwig,
Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
linux-fsdevel
On 27/10/2024 18:17, Ritesh Harjani (IBM) wrote:
> Let's validate the given constraints for atomic write request.
> Otherwise it will fail with -EINVAL. Currently atomic write is only
> supported on DIO, so for buffered-io it will return -EOPNOTSUPP.
>
> Signed-off-by: Ritesh Harjani (IBM)<ritesh.list@gmail.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 3/4] ext4: Support setting FMODE_CAN_ATOMIC_WRITE
2024-10-27 18:17 ` [PATCH v2 3/4] ext4: Support setting FMODE_CAN_ATOMIC_WRITE Ritesh Harjani (IBM)
@ 2024-10-29 8:56 ` John Garry
0 siblings, 0 replies; 12+ messages in thread
From: John Garry @ 2024-10-29 8:56 UTC (permalink / raw)
To: Ritesh Harjani (IBM), linux-ext4
Cc: Theodore Ts'o, Jan Kara, Darrick J . Wong, Christoph Hellwig,
Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
linux-fsdevel
On 27/10/2024 18:17, Ritesh Harjani (IBM) wrote:
> FS needs to add the fmode capability in order to support atomic writes
> during file open (refer kiocb_set_rw_flags()). Set this capability on
> a regular file if ext4 can do atomic write.
>
> Signed-off-by: Ritesh Harjani (IBM)<ritesh.list@gmail.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 4/4] ext4: Do not fallback to buffered-io for DIO atomic write
2024-10-27 18:17 ` [PATCH v2 4/4] ext4: Do not fallback to buffered-io for DIO atomic write Ritesh Harjani (IBM)
@ 2024-10-29 8:57 ` John Garry
2024-10-29 9:22 ` Ritesh Harjani
0 siblings, 1 reply; 12+ messages in thread
From: John Garry @ 2024-10-29 8:57 UTC (permalink / raw)
To: Ritesh Harjani (IBM), linux-ext4
Cc: Theodore Ts'o, Jan Kara, Darrick J . Wong, Christoph Hellwig,
Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
linux-fsdevel
On 27/10/2024 18:17, Ritesh Harjani (IBM) wrote:
> iomap can return -ENOTBLK if pagecache invalidation fails.
> Let's make sure if -ENOTBLK is ever returned for atomic
> writes than we fail the write request (-EIO) instead of
> fallback to buffered-io.
>
> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
I am not sure if you plan on dropping this patch...
> ---
> fs/ext4/file.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ext4/file.c b/fs/ext4/file.c
> index 8116bd78910b..22d31b4fdff3 100644
> --- a/fs/ext4/file.c
> +++ b/fs/ext4/file.c
> @@ -576,8 +576,18 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from)
> iomap_ops = &ext4_iomap_overwrite_ops;
> ret = iomap_dio_rw(iocb, from, iomap_ops, &ext4_dio_write_ops,
> dio_flags, NULL, 0);
> - if (ret == -ENOTBLK)
> + if (ret == -ENOTBLK) {
> ret = 0;
> + /*
> + * iomap can return -ENOTBLK if pagecache invalidation fails.
> + * Let's make sure if -ENOTBLK is ever returned for atomic
> + * writes than we fail the write request instead of fallback
> + * to buffered-io.
> + */
> + if (iocb->ki_flags & IOCB_ATOMIC)
> + ret = -EIO;
> + }
> +
> if (extend) {
> /*
> * We always perform extending DIO write synchronously so by
> --
> 2.46.0
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 4/4] ext4: Do not fallback to buffered-io for DIO atomic write
2024-10-29 8:57 ` John Garry
@ 2024-10-29 9:22 ` Ritesh Harjani
0 siblings, 0 replies; 12+ messages in thread
From: Ritesh Harjani @ 2024-10-29 9:22 UTC (permalink / raw)
To: John Garry, linux-ext4
Cc: Theodore Ts'o, Jan Kara, Darrick J . Wong, Christoph Hellwig,
Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
linux-fsdevel
John Garry <john.g.garry@oracle.com> writes:
> On 27/10/2024 18:17, Ritesh Harjani (IBM) wrote:
>> iomap can return -ENOTBLK if pagecache invalidation fails.
>> Let's make sure if -ENOTBLK is ever returned for atomic
>> writes than we fail the write request (-EIO) instead of
>> fallback to buffered-io.
>>
>> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
>> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
>
> I am not sure if you plan on dropping this patch...
>
As discussed in the other thread, EXT4 has got a fallback to buffered-io
logic at certain places including during short DIO writes unlike XFS
which never fallsback to buffered-io in case of short DIO writes.
Let me send another patch for this one.
Thanks for the review.
-ritesh
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/4] ext4: Add statx support for atomic writes
2024-10-29 8:54 ` John Garry
@ 2024-10-29 9:29 ` Ritesh Harjani
2024-10-29 15:27 ` Ritesh Harjani
0 siblings, 1 reply; 12+ messages in thread
From: Ritesh Harjani @ 2024-10-29 9:29 UTC (permalink / raw)
To: John Garry, linux-ext4
Cc: Theodore Ts'o, Jan Kara, Darrick J . Wong, Christoph Hellwig,
Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
linux-fsdevel
Hi John,
John Garry <john.g.garry@oracle.com> writes:
> On 27/10/2024 18:17, Ritesh Harjani (IBM) wrote:
>> This patch adds base support for atomic writes via statx getattr.
>> On bs < ps systems, we can create FS with say bs of 16k. That means
>> both atomic write min and max unit can be set to 16k for supporting
>> atomic writes.
>>
>> Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
>> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
>> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
>> ---
>> fs/ext4/ext4.h | 9 +++++++++
>> fs/ext4/inode.c | 14 ++++++++++++++
>> fs/ext4/super.c | 31 +++++++++++++++++++++++++++++++
>> 3 files changed, 54 insertions(+)
>>
>> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
>> index 44b0d418143c..6ee49aaacd2b 100644
>> --- a/fs/ext4/ext4.h
>> +++ b/fs/ext4/ext4.h
>> @@ -1729,6 +1729,10 @@ struct ext4_sb_info {
>> */
>> struct work_struct s_sb_upd_work;
>>
>> + /* Atomic write unit values in bytes */
>> + unsigned int s_awu_min;
>> + unsigned int s_awu_max;
>> +
>> /* Ext4 fast commit sub transaction ID */
>> atomic_t s_fc_subtid;
>>
>> @@ -3855,6 +3859,11 @@ static inline int ext4_buffer_uptodate(struct buffer_head *bh)
>> return buffer_uptodate(bh);
>> }
>>
>> +static inline bool ext4_can_atomic_write(struct super_block *sb)
>> +{
>> + return EXT4_SB(sb)->s_awu_min > 0;
>> +}
>> +
>> extern int ext4_block_write_begin(handle_t *handle, struct folio *folio,
>> loff_t pos, unsigned len,
>> get_block_t *get_block);
>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>> index 54bdd4884fe6..fcdee27b9aa2 100644
>> --- a/fs/ext4/inode.c
>> +++ b/fs/ext4/inode.c
>> @@ -5578,6 +5578,20 @@ int ext4_getattr(struct mnt_idmap *idmap, const struct path *path,
>> }
>> }
>>
>> + if (S_ISREG(inode->i_mode) && (request_mask & STATX_WRITE_ATOMIC)) {
>> + struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
>> + unsigned int awu_min, awu_max;
>> +
>> + if (ext4_can_atomic_write(inode->i_sb)) {
>> + awu_min = sbi->s_awu_min;
>> + awu_max = sbi->s_awu_max;
>> + } else {
>> + awu_min = awu_max = 0;
>> + }
>> +
>> + generic_fill_statx_atomic_writes(stat, awu_min, awu_max);
>> + }
>> +
>> flags = ei->i_flags & EXT4_FL_USER_VISIBLE;
>> if (flags & EXT4_APPEND_FL)
>> stat->attributes |= STATX_ATTR_APPEND;
>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
>> index 16a4ce704460..d6e3201a48be 100644
>> --- a/fs/ext4/super.c
>> +++ b/fs/ext4/super.c
>> @@ -4425,6 +4425,36 @@ static int ext4_handle_clustersize(struct super_block *sb)
>> return 0;
>> }
>>
>> +/*
>> + * ext4_atomic_write_init: Initializes filesystem min & max atomic write units.
>> + * @sb: super block
>> + * TODO: Later add support for bigalloc
>> + */
>> +static void ext4_atomic_write_init(struct super_block *sb)
>> +{
>> + struct ext4_sb_info *sbi = EXT4_SB(sb);
>> + struct block_device *bdev = sb->s_bdev;
>> +
>> + if (!bdev_can_atomic_write(bdev))
>
> this check is duplicated, since bdev_atomic_write_unit_{min,
> max}_bytes() has this check
>
Right, yes. I can mention a comment and remove this check perhaps.
Looks like XFS also got it duplicated then.
>> + return;
>> +
>> + if (!ext4_has_feature_extents(sb))
>> + return;
>> +
>> + sbi->s_awu_min = max(sb->s_blocksize,
>> + bdev_atomic_write_unit_min_bytes(bdev));
>> + sbi->s_awu_max = min(sb->s_blocksize,
>> + bdev_atomic_write_unit_max_bytes(bdev));
>> + if (sbi->s_awu_min && sbi->s_awu_max &&
>> + sbi->s_awu_min <= sbi->s_awu_max) {
>
> This looks a bit complicated. I would just follow the XFS example and
> ensure bdev_atomic_write_unit_min_bytes() <= sb->s_blocksize <=
> bdev_atomic_write_unit_max_bytes() [which you are doing, but in a
> complicated way]
>
In here we are checking for min and max supported units against fs
blocksize at one place during mount time itself and then caching the
supported atomic write unit. The supported atomic write unit can change
when bigalloc gets introduced.
XFS caches the blockdevice min, max units and then defers these checks
against fs blocksize during file open time using xfs_inode_can_atomicwrite().
I just preferrd the 1st aproach in case of EXT4 here because it will be simpler
this way for when bigalloc also gets introduced.
BTW none of these are ondisk changes, so whenever extsize gets
introduced, we might still have to add something like
ext4_inode_can_atomic_write() (similar to XFS) based on inode extsize
value. But for now it was not needed in EXT4.
I hope the reasoning is clear and make sense.
>> + ext4_msg(sb, KERN_NOTICE, "Supports atomic writes awu_min: %u, awu_max: %u",
>> + sbi->s_awu_min, sbi->s_awu_max);
BTW - I was wondering if we should add "experimental" in above msg.
-ritesh
>> + } else {
>> + sbi->s_awu_min = 0;
>> + sbi->s_awu_max = 0;
>> + }
>> +}
>> +
>> static void ext4_fast_commit_init(struct super_block *sb)
>> {
>> struct ext4_sb_info *sbi = EXT4_SB(sb);
>> @@ -5336,6 +5366,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
>>
>> spin_lock_init(&sbi->s_bdev_wb_lock);
>>
>> + ext4_atomic_write_init(sb);
>> ext4_fast_commit_init(sb);
>>
>> sb->s_root = NULL;
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/4] ext4: Add statx support for atomic writes
2024-10-29 9:29 ` Ritesh Harjani
@ 2024-10-29 15:27 ` Ritesh Harjani
0 siblings, 0 replies; 12+ messages in thread
From: Ritesh Harjani @ 2024-10-29 15:27 UTC (permalink / raw)
To: John Garry, linux-ext4
Cc: Theodore Ts'o, Jan Kara, Darrick J . Wong, Christoph Hellwig,
Ojaswin Mujoo, Dave Chinner, linux-kernel, linux-xfs,
linux-fsdevel
Ritesh Harjani (IBM) <ritesh.list@gmail.com> writes:
> Hi John,
>
> John Garry <john.g.garry@oracle.com> writes:
>
>> On 27/10/2024 18:17, Ritesh Harjani (IBM) wrote:
>>> This patch adds base support for atomic writes via statx getattr.
>>> On bs < ps systems, we can create FS with say bs of 16k. That means
>>> both atomic write min and max unit can be set to 16k for supporting
>>> atomic writes.
>>>
>>> Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
>>> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
>>> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
>>> ---
>>> fs/ext4/ext4.h | 9 +++++++++
>>> fs/ext4/inode.c | 14 ++++++++++++++
>>> fs/ext4/super.c | 31 +++++++++++++++++++++++++++++++
>>> 3 files changed, 54 insertions(+)
>>>
>>> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
>>> index 44b0d418143c..6ee49aaacd2b 100644
>>> --- a/fs/ext4/ext4.h
>>> +++ b/fs/ext4/ext4.h
>>> @@ -1729,6 +1729,10 @@ struct ext4_sb_info {
>>> */
>>> struct work_struct s_sb_upd_work;
>>>
>>> + /* Atomic write unit values in bytes */
>>> + unsigned int s_awu_min;
>>> + unsigned int s_awu_max;
>>> +
>>> /* Ext4 fast commit sub transaction ID */
>>> atomic_t s_fc_subtid;
>>>
>>> @@ -3855,6 +3859,11 @@ static inline int ext4_buffer_uptodate(struct buffer_head *bh)
>>> return buffer_uptodate(bh);
>>> }
>>>
>>> +static inline bool ext4_can_atomic_write(struct super_block *sb)
>>> +{
>>> + return EXT4_SB(sb)->s_awu_min > 0;
>>> +}
>>> +
>>> extern int ext4_block_write_begin(handle_t *handle, struct folio *folio,
>>> loff_t pos, unsigned len,
>>> get_block_t *get_block);
>>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>>> index 54bdd4884fe6..fcdee27b9aa2 100644
>>> --- a/fs/ext4/inode.c
>>> +++ b/fs/ext4/inode.c
>>> @@ -5578,6 +5578,20 @@ int ext4_getattr(struct mnt_idmap *idmap, const struct path *path,
>>> }
>>> }
>>>
>>> + if (S_ISREG(inode->i_mode) && (request_mask & STATX_WRITE_ATOMIC)) {
>>> + struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
>>> + unsigned int awu_min, awu_max;
>>> +
>>> + if (ext4_can_atomic_write(inode->i_sb)) {
>>> + awu_min = sbi->s_awu_min;
>>> + awu_max = sbi->s_awu_max;
>>> + } else {
>>> + awu_min = awu_max = 0;
>>> + }
>>> +
>>> + generic_fill_statx_atomic_writes(stat, awu_min, awu_max);
>>> + }
>>> +
>>> flags = ei->i_flags & EXT4_FL_USER_VISIBLE;
>>> if (flags & EXT4_APPEND_FL)
>>> stat->attributes |= STATX_ATTR_APPEND;
>>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
>>> index 16a4ce704460..d6e3201a48be 100644
>>> --- a/fs/ext4/super.c
>>> +++ b/fs/ext4/super.c
>>> @@ -4425,6 +4425,36 @@ static int ext4_handle_clustersize(struct super_block *sb)
>>> return 0;
>>> }
>>>
>>> +/*
>>> + * ext4_atomic_write_init: Initializes filesystem min & max atomic write units.
>>> + * @sb: super block
>>> + * TODO: Later add support for bigalloc
>>> + */
>>> +static void ext4_atomic_write_init(struct super_block *sb)
>>> +{
>>> + struct ext4_sb_info *sbi = EXT4_SB(sb);
>>> + struct block_device *bdev = sb->s_bdev;
>>> +
>>> + if (!bdev_can_atomic_write(bdev))
>>
>> this check is duplicated, since bdev_atomic_write_unit_{min,
>> max}_bytes() has this check
>>
>
> Right, yes. I can mention a comment and remove this check perhaps.
> Looks like XFS also got it duplicated then.
>
>
If we remove above bdev check logic, than we are relying on the min max
comparison in below code to decide whether we support atomic writes or
not. It's rather cleaner and straight forward to check if bdev supports
atomic write in the beginning itself.
So I feel it's better to keep the above check as is since it's only once
during mount.
(Note - bdev_can_atomic_write() either ways gets called twice when
bdev_atomic_write_unit_min|max_bytes() is getting called).
So I would like to keep the check as is please.
Thanks for the review!
-ritesh
>>> + return;
>>> +
>>> + if (!ext4_has_feature_extents(sb))
>>> + return;
>>> +
>>> + sbi->s_awu_min = max(sb->s_blocksize,
>>> + bdev_atomic_write_unit_min_bytes(bdev));
>>> + sbi->s_awu_max = min(sb->s_blocksize,
>>> + bdev_atomic_write_unit_max_bytes(bdev));
>>> + if (sbi->s_awu_min && sbi->s_awu_max &&
>>> + sbi->s_awu_min <= sbi->s_awu_max) {
>>
>> This looks a bit complicated. I would just follow the XFS example and
>> ensure bdev_atomic_write_unit_min_bytes() <= sb->s_blocksize <=
>> bdev_atomic_write_unit_max_bytes() [which you are doing, but in a
>> complicated way]
>>
>
> In here we are checking for min and max supported units against fs
> blocksize at one place during mount time itself and then caching the
> supported atomic write unit. The supported atomic write unit can change
> when bigalloc gets introduced.
>
> XFS caches the blockdevice min, max units and then defers these checks
> against fs blocksize during file open time using xfs_inode_can_atomicwrite().
>
> I just preferrd the 1st aproach in case of EXT4 here because it will be simpler
> this way for when bigalloc also gets introduced.
>
> BTW none of these are ondisk changes, so whenever extsize gets
> introduced, we might still have to add something like
> ext4_inode_can_atomic_write() (similar to XFS) based on inode extsize
> value. But for now it was not needed in EXT4.
>
> I hope the reasoning is clear and make sense.
>
>
>>> + ext4_msg(sb, KERN_NOTICE, "Supports atomic writes awu_min: %u, awu_max: %u",
>>> + sbi->s_awu_min, sbi->s_awu_max);
>
> BTW - I was wondering if we should add "experimental" in above msg.
>
> -ritesh
>
>>> + } else {
>>> + sbi->s_awu_min = 0;
>>> + sbi->s_awu_max = 0;
>>> + }
>>> +}
>>> +
>>> static void ext4_fast_commit_init(struct super_block *sb)
>>> {
>>> struct ext4_sb_info *sbi = EXT4_SB(sb);
>>> @@ -5336,6 +5366,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
>>>
>>> spin_lock_init(&sbi->s_bdev_wb_lock);
>>>
>>> + ext4_atomic_write_init(sb);
>>> ext4_fast_commit_init(sb);
>>>
>>> sb->s_root = NULL;
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2024-10-29 15:40 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-27 18:17 [PATCH v2 0/4] ext4: Add atomic writes support for DIO Ritesh Harjani (IBM)
2024-10-27 18:17 ` [PATCH v2 1/4] ext4: Add statx support for atomic writes Ritesh Harjani (IBM)
2024-10-29 8:54 ` John Garry
2024-10-29 9:29 ` Ritesh Harjani
2024-10-29 15:27 ` Ritesh Harjani
2024-10-27 18:17 ` [PATCH v2 2/4] ext4: Check for atomic writes support in write iter Ritesh Harjani (IBM)
2024-10-29 8:55 ` John Garry
2024-10-27 18:17 ` [PATCH v2 3/4] ext4: Support setting FMODE_CAN_ATOMIC_WRITE Ritesh Harjani (IBM)
2024-10-29 8:56 ` John Garry
2024-10-27 18:17 ` [PATCH v2 4/4] ext4: Do not fallback to buffered-io for DIO atomic write Ritesh Harjani (IBM)
2024-10-29 8:57 ` John Garry
2024-10-29 9:22 ` Ritesh Harjani
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).