From: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
To: linux-ext4@vger.kernel.org
Cc: Theodore Ts'o <tytso@mit.edu>, Jan Kara <jack@suse.cz>,
John Garry <john.g.garry@oracle.com>,
djwong@kernel.org, Ojaswin Mujoo <ojaswin@linux.ibm.com>,
linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v3 0/7] ext4: Add multi-fsblock atomic write support with bigalloc
Date: Fri, 09 May 2025 23:12:46 +0530 [thread overview]
Message-ID: <87h61t65pl.fsf@gmail.com> (raw)
In-Reply-To: <cover.1746734745.git.ritesh.list@gmail.com>
"Ritesh Harjani (IBM)" <ritesh.list@gmail.com> writes:
> This is v3 of multi-fsblock atomic write support using bigalloc. This has
> started looking into much better shape now. The major chunk of the design
> changes has been kept in Patch-4 & 5.
>
> This series can now be carefully reviewed, as all the error handling related
> code paths should be properly taken care of.
>
We spotted that multi-fsblock changes might need to force a journal
commit if there were mixed mappings in the underlying region e.g. say WUWUWUW...
The issue arises when, during block allocation, the unwritten ranges are
first zeroed out, followed by the unwritten-to-written extent
conversion. This conversion is part of a journaled metadata transaction
that has not yet been committed, as the transaction is still running.
If an iomap write then modifies the data on those multi-fsblocks and a
sudden power loss occurs before the transaction commits, the
unwritten-to-written conversion will not be replayed during journal
recovery. As a result, we end up with new data written over mapped
blocks, while the alternate unwritten blocks will read zeroes. This
could cause a torn write behavior for atomic writes.
So we were thinking we might need something like this. Hopefully this
should still be ok, as mixed mapping case mostly is a non-performance
critical path. Thoughts?
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 2642e1ef128f..59b59d609976 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3517,7 +3517,8 @@ static int ext4_map_blocks_atomic_write_slow(handle_t *handle,
* underlying short holes/unwritten extents within the requested range.
*/
static int ext4_map_blocks_atomic_write(handle_t *handle, struct inode *inode,
- struct ext4_map_blocks *map, int m_flags)
+ struct ext4_map_blocks *map, int m_flags,
+ bool *force_commit)
{
ext4_lblk_t m_lblk = map->m_lblk;
unsigned int m_len = map->m_len;
@@ -3537,6 +3538,11 @@ static int ext4_map_blocks_atomic_write(handle_t *handle, struct inode *inode,
map->m_len = m_len;
map->m_flags = 0;
+ /*
+ * slow path means we have mixed mapping, that means we will need
+ * to force txn commit.
+ */
+ *force_commit = true;
return ext4_map_blocks_atomic_write_slow(handle, inode, map);
out:
return ret;
@@ -3548,6 +3554,7 @@ static int ext4_iomap_alloc(struct inode *inode, struct ext4_map_blocks *map,
handle_t *handle;
u8 blkbits = inode->i_blkbits;
int ret, dio_credits, m_flags = 0, retries = 0;
+ bool force_commit = false;
/*
* Trim the mapping request to the maximum value that we can map at
@@ -3610,7 +3617,8 @@ static int ext4_iomap_alloc(struct inode *inode, struct ext4_map_blocks *map,
m_flags = EXT4_GET_BLOCKS_IO_CREATE_EXT;
if (flags & IOMAP_ATOMIC)
- ret = ext4_map_blocks_atomic_write(handle, inode, map, m_flags);
+ ret = ext4_map_blocks_atomic_write(handle, inode, map, m_flags,
+ &force_commit);
else
ret = ext4_map_blocks(handle, inode, map, m_flags);
@@ -3626,6 +3634,9 @@ static int ext4_iomap_alloc(struct inode *inode, struct ext4_map_blocks *map,
if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries))
goto retry;
+ if (ret > 0 && force_commit)
+ ext4_force_commit(inode->i_sb);
+
return ret;
}
-ritesh
next prev parent reply other threads:[~2025-05-09 18:09 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-08 20:50 [PATCH v3 0/7] ext4: Add multi-fsblock atomic write support with bigalloc Ritesh Harjani (IBM)
2025-05-08 20:50 ` [PATCH v3 1/7] ext4: Document an edge case for overwrites Ritesh Harjani (IBM)
2025-05-09 5:19 ` Ojaswin Mujoo
2025-05-14 16:23 ` Darrick J. Wong
2025-05-08 20:50 ` [PATCH v3 2/7] ext4: Check if inode uses extents in ext4_inode_can_atomic_write() Ritesh Harjani (IBM)
2025-05-09 5:20 ` Ojaswin Mujoo
2025-05-14 16:24 ` Darrick J. Wong
2025-05-08 20:50 ` [PATCH v3 3/7] ext4: Make ext4_meta_trans_blocks() non-static for later use Ritesh Harjani (IBM)
2025-05-09 5:21 ` Ojaswin Mujoo
2025-05-14 16:24 ` Darrick J. Wong
2025-05-08 20:50 ` [PATCH v3 4/7] ext4: Add support for EXT4_GET_BLOCKS_QUERY_LEAF_BLOCKS Ritesh Harjani (IBM)
2025-05-14 16:16 ` Darrick J. Wong
2025-05-14 18:47 ` Ritesh Harjani
2025-05-08 20:50 ` [PATCH v3 5/7] ext4: Add multi-fsblock atomic write support with bigalloc Ritesh Harjani (IBM)
2025-05-14 16:19 ` Darrick J. Wong
2025-05-14 19:04 ` Ritesh Harjani
2025-05-08 20:50 ` [PATCH v3 6/7] ext4: Enable support for ext4 multi-fsblock atomic write using bigalloc Ritesh Harjani (IBM)
2025-05-14 16:21 ` Darrick J. Wong
2025-05-08 20:50 ` [PATCH v3 7/7] ext4: Add atomic block write documentation Ritesh Harjani (IBM)
2025-05-09 7:34 ` Ojaswin Mujoo
2025-05-14 16:38 ` Darrick J. Wong
2025-05-15 2:15 ` Ritesh Harjani
2025-05-15 2:18 ` Ritesh Harjani
2025-05-09 17:42 ` Ritesh Harjani [this message]
2025-05-14 16:40 ` [PATCH v3 0/7] ext4: Add multi-fsblock atomic write support with bigalloc Darrick J. Wong
2025-05-14 18:55 ` Ritesh Harjani
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87h61t65pl.fsf@gmail.com \
--to=ritesh.list@gmail.com \
--cc=djwong@kernel.org \
--cc=jack@suse.cz \
--cc=john.g.garry@oracle.com \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=ojaswin@linux.ibm.com \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).