* [PATCH v3 00/14] ext4: replace ext4_es_insert_extent() when caching on-disk extents
@ 2025-11-29 10:32 Zhang Yi
2025-11-29 10:32 ` [PATCH v3 01/14] ext4: subdivide EXT4_EXT_DATA_VALID1 Zhang Yi
` (14 more replies)
0 siblings, 15 replies; 20+ messages in thread
From: Zhang Yi @ 2025-11-29 10:32 UTC (permalink / raw)
To: linux-ext4
Cc: linux-fsdevel, linux-kernel, tytso, adilger.kernel, jack, ojaswin,
yi.zhang, yi.zhang, yizhang089, libaokun1, yangerkun
From: Zhang Yi <yi.zhang@huawei.com>
Changes since v2:
- Rebase the code onto ext4.git dev-91ef18b567da.
- Move the first cleanup patch in v2 to patch 08 to facilitate easier
backporting.
- In patch 01, correct the mismatched comments for
EXT4_EXT_DATA_ENTIRE_VALID1 and EXT4_EXT_DATA_PARTIAL_VALID1.
- Modify patch 06 and add patch 07: clean up the commit message to
avoid confusion, and don't always drop the extent cache before
splitting an extent; instead, do this only after the PARTIAL_VALID1
zeroout or when splitting the extent fails.
- In patch 08, mark zero_ex as initialized.
- In patch 09, correct the word 'tag' to 'label' in the commit message.
- In patch 11, add a return value check of __es_remove_extent() in
ext4_es_cache_extent().
- Collect RVB tags.
Thanks for the comments and suggestions from Jan, Ojaswin and Baokun!
Next, it is necessary to focus on refactoring and cleaning up the
code related to ext4_split_extent(). Ojaswin is going to take on this
work since he has already been exploring it on his local branch.
Changes since v1:
- Rebase the code onto the latest linux-next 20251120.
- Add patches 01-05, fix two stale data problems caused by
EXT4_EXT_MAY_ZEROOUT when splitting an extent.
- Add patches 06-07, fix two stale extent status entry problems also
caused by splitting an extent.
- Modify patches 08-10, extend __es_remove_extent() and
ext4_es_cache_extent() to allow them to overwrite existing extents of
the same status when caching on-disk extents, while also checking
extents of different status and raising alarms to prevent misuse.
- Add patch 13 to clear the usage of ext4_es_insert_extent(), and
remove the TODO comment in it.
v2: https://lore.kernel.org/linux-ext4/20251121060811.1685783-1-yi.zhang@huaweicloud.com/
v1: https://lore.kernel.org/linux-ext4/20251031062905.4135909-1-yi.zhang@huaweicloud.com/
Original Description
This series addresses the optimization that Jan pointed out [1]
regarding the introduction of a sequence number to
ext4_es_insert_extent(). The proposal is to replace all instances where
the cache of on-disk extents is updated by using ext4_es_cache_extent()
instead of ext4_es_insert_extent(). This change can prevent excessive
cache invalidations caused by unnecessarily increasing the extent
sequence number when reading from the on-disk extent tree.
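To make the motivation concrete, here is a minimal toy sketch of the idea, assuming a single per-inode sequence counter that readers sample and later compare. All names (toy_es_tree, toy_es_insert, toy_es_cache, toy_mapping_still_valid) are hypothetical and do not correspond to the kernel's actual data structures.

```c
#include <stdbool.h>

/* Hypothetical toy model; not the kernel's extent status tree. */
struct toy_es_tree {
	unsigned int seq;       /* bumped whenever the mapping may have changed */
	unsigned int nr_cached; /* number of cached entries */
};

/* Models a modifying insert: readers holding an older seq must revalidate. */
static inline void toy_es_insert(struct toy_es_tree *t)
{
	t->nr_cached++;
	t->seq++;
}

/* Models caching what is already on disk: nothing changed, seq stays put. */
static inline void toy_es_cache(struct toy_es_tree *t)
{
	t->nr_cached++;
}

/* A reader sampled 'seen' earlier; is its cached mapping still usable? */
static inline bool toy_mapping_still_valid(const struct toy_es_tree *t,
					   unsigned int seen)
{
	return t->seq == seen;
}
```

In this model, a reader that sampled the counter before a toy_es_cache() call keeps a valid mapping, while the same reader is invalidated by a toy_es_insert() call even though the mapping did not change.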
[1] https://lore.kernel.org/linux-ext4/ympvfypw3222g2k4xzd5pba4zhkz5jihw4td67iixvrqhuu43y@wse63ntv4s6u/
Cheers,
Yi.
Zhang Yi (14):
ext4: subdivide EXT4_EXT_DATA_VALID1
ext4: don't zero the entire extent if EXT4_EXT_DATA_PARTIAL_VALID1
ext4: don't set EXT4_GET_BLOCKS_CONVERT when splitting before
submitting I/O
ext4: correct the mapping status if the extent has been zeroed
ext4: don't cache extent during splitting extent
ext4: drop extent cache after doing PARTIAL_VALID1 zeroout
ext4: drop extent cache when splitting extent fails
ext4: cleanup zeroout in ext4_split_extent_at()
ext4: cleanup useless out label in __es_remove_extent()
ext4: make __es_remove_extent() check extent status
ext4: make ext4_es_cache_extent() support overwrite existing extents
ext4: adjust the debug info in ext4_es_cache_extent()
ext4: replace ext4_es_insert_extent() when caching on-disk extents
ext4: drop the TODO comment in ext4_es_insert_extent()
fs/ext4/extents.c | 135 ++++++++++++++++++++++++---------------
fs/ext4/extents_status.c | 124 ++++++++++++++++++++++++++---------
fs/ext4/inode.c | 18 +++---
3 files changed, 187 insertions(+), 90 deletions(-)
--
2.46.1
* [PATCH v3 01/14] ext4: subdivide EXT4_EXT_DATA_VALID1
2025-11-29 10:32 [PATCH v3 00/14] ext4: replace ext4_es_insert_extent() when caching on-disk extents Zhang Yi
@ 2025-11-29 10:32 ` Zhang Yi
2025-11-29 10:32 ` [PATCH v3 02/14] ext4: don't zero the entire extent if EXT4_EXT_DATA_PARTIAL_VALID1 Zhang Yi
` (13 subsequent siblings)
14 siblings, 0 replies; 20+ messages in thread
From: Zhang Yi @ 2025-11-29 10:32 UTC (permalink / raw)
To: linux-ext4
Cc: linux-fsdevel, linux-kernel, tytso, adilger.kernel, jack, ojaswin,
yi.zhang, yi.zhang, yizhang089, libaokun1, yangerkun
From: Zhang Yi <yi.zhang@huawei.com>
When splitting an extent, if the EXT4_GET_BLOCKS_CONVERT flag is set and
it is necessary to split the target extent in the middle,
ext4_split_extent() first handles splitting the latter half of the
extent and passes the EXT4_EXT_DATA_VALID1 flag. This flag implies that
all blocks before the split point contain valid data; however, this
assumption is incorrect.
Therefore, subdivide EXT4_EXT_DATA_VALID1 into
EXT4_EXT_DATA_ENTIRE_VALID1 and EXT4_EXT_DATA_PARTIAL_VALID1, which
indicate that the first half of the extent is either entirely valid or
only partially valid, respectively. These two flags cannot be set
simultaneously.
This patch does not use EXT4_EXT_DATA_PARTIAL_VALID1 yet; it only
replaces EXT4_EXT_DATA_VALID1 with EXT4_EXT_DATA_ENTIRE_VALID1 at the
locations where it is set, so there are no logical changes.
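The flag constraints described above can be sketched as a standalone predicate. The flag values are copied from the diff in this patch; the helper split_flag_is_sane() is a hypothetical illustration that mirrors the two BUG_ON() conditions and is not part of the patch.

```c
#include <stdbool.h>

/* Values as defined by this patch. */
#define EXT4_EXT_DATA_ENTIRE_VALID1	0x8
#define EXT4_EXT_DATA_PARTIAL_VALID1	0x10
#define EXT4_EXT_DATA_VALID1		(EXT4_EXT_DATA_ENTIRE_VALID1 | \
					 EXT4_EXT_DATA_PARTIAL_VALID1)
#define EXT4_EXT_DATA_VALID2		0x20

/* Hypothetical helper: returns true when neither BUG_ON() would fire. */
static inline bool split_flag_is_sane(int split_flag)
{
	/* ENTIRE_VALID1 and PARTIAL_VALID1 are mutually exclusive. */
	if ((split_flag & EXT4_EXT_DATA_VALID1) == EXT4_EXT_DATA_VALID1)
		return false;
	/* Neither VALID1 variant may be combined with VALID2. */
	if ((split_flag & EXT4_EXT_DATA_VALID1) &&
	    (split_flag & EXT4_EXT_DATA_VALID2))
		return false;
	return true;
}
```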
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
Cc: stable@kernel.org
---
fs/ext4/extents.c | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 2cf5759ba689..8d5ca450aa5d 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -43,8 +43,13 @@
#define EXT4_EXT_MARK_UNWRIT1 0x2 /* mark first half unwritten */
#define EXT4_EXT_MARK_UNWRIT2 0x4 /* mark second half unwritten */
-#define EXT4_EXT_DATA_VALID1 0x8 /* first half contains valid data */
-#define EXT4_EXT_DATA_VALID2 0x10 /* second half contains valid data */
+/* first half contains valid data */
+#define EXT4_EXT_DATA_ENTIRE_VALID1 0x8 /* has entirely valid data */
+#define EXT4_EXT_DATA_PARTIAL_VALID1 0x10 /* has partially valid data */
+#define EXT4_EXT_DATA_VALID1 (EXT4_EXT_DATA_ENTIRE_VALID1 | \
+ EXT4_EXT_DATA_PARTIAL_VALID1)
+
+#define EXT4_EXT_DATA_VALID2 0x20 /* second half contains valid data */
static __le32 ext4_extent_block_csum(struct inode *inode,
struct ext4_extent_header *eh)
@@ -3190,8 +3195,9 @@ static struct ext4_ext_path *ext4_split_extent_at(handle_t *handle,
unsigned int ee_len, depth;
int err = 0;
- BUG_ON((split_flag & (EXT4_EXT_DATA_VALID1 | EXT4_EXT_DATA_VALID2)) ==
- (EXT4_EXT_DATA_VALID1 | EXT4_EXT_DATA_VALID2));
+ BUG_ON((split_flag & EXT4_EXT_DATA_VALID1) == EXT4_EXT_DATA_VALID1);
+ BUG_ON((split_flag & EXT4_EXT_DATA_VALID1) &&
+ (split_flag & EXT4_EXT_DATA_VALID2));
ext_debug(inode, "logical block %llu\n", (unsigned long long)split);
@@ -3373,7 +3379,7 @@ static struct ext4_ext_path *ext4_split_extent(handle_t *handle,
split_flag1 |= EXT4_EXT_MARK_UNWRIT1 |
EXT4_EXT_MARK_UNWRIT2;
if (split_flag & EXT4_EXT_DATA_VALID2)
- split_flag1 |= EXT4_EXT_DATA_VALID1;
+ split_flag1 |= EXT4_EXT_DATA_ENTIRE_VALID1;
path = ext4_split_extent_at(handle, inode, path,
map->m_lblk + map->m_len, split_flag1, flags1);
if (IS_ERR(path))
@@ -3728,7 +3734,7 @@ static struct ext4_ext_path *ext4_split_convert_extents(handle_t *handle,
/* Convert to unwritten */
if (flags & EXT4_GET_BLOCKS_CONVERT_UNWRITTEN) {
- split_flag |= EXT4_EXT_DATA_VALID1;
+ split_flag |= EXT4_EXT_DATA_ENTIRE_VALID1;
/* Convert to initialized */
} else if (flags & EXT4_GET_BLOCKS_CONVERT) {
/*
--
2.46.1
* [PATCH v3 02/14] ext4: don't zero the entire extent if EXT4_EXT_DATA_PARTIAL_VALID1
2025-11-29 10:32 [PATCH v3 00/14] ext4: replace ext4_es_insert_extent() when caching on-disk extents Zhang Yi
2025-11-29 10:32 ` [PATCH v3 01/14] ext4: subdivide EXT4_EXT_DATA_VALID1 Zhang Yi
@ 2025-11-29 10:32 ` Zhang Yi
2025-11-29 10:32 ` [PATCH v3 03/14] ext4: don't set EXT4_GET_BLOCKS_CONVERT when splitting before submitting I/O Zhang Yi
` (12 subsequent siblings)
14 siblings, 0 replies; 20+ messages in thread
From: Zhang Yi @ 2025-11-29 10:32 UTC (permalink / raw)
To: linux-ext4
Cc: linux-fsdevel, linux-kernel, tytso, adilger.kernel, jack, ojaswin,
yi.zhang, yi.zhang, yizhang089, libaokun1, yangerkun
From: Zhang Yi <yi.zhang@huawei.com>
When allocating initialized blocks from a large unwritten extent, or
when splitting an unwritten extent during end I/O and converting it to
initialized, there is currently a potential issue of stale data if the
extent needs to be split in the middle.
 0 A       B N
[UUUUUUUUUUUU]    U: unwritten extent
[--DDDDDDDD--]    D: valid data
   |<-  ->| ----> this range needs to be initialized
ext4_split_extent() first tries to split this extent at B with the
EXT4_EXT_DATA_ENTIRE_VALID1 and EXT4_EXT_MAY_ZEROOUT flags set, but
ext4_split_extent_at() fails to split the extent due to a temporary
lack of space. It zeroes out B to N and marks the entire extent from
0 to N as written.
 0 A       B N
[WWWWWWWWWWWW]    W: written extent
[SSDDDDDDDDZZ]    Z: zeroed, S: stale data
ext4_split_extent() then tries to split this extent at A with the
EXT4_EXT_DATA_VALID2 flag set. This time, the split succeeds and leaves
a stale written extent from 0 to A.
 0 A        B N
[WW|WWWWWWWWWW]
[SS|DDDDDDDDZZ]
Fix this by passing EXT4_EXT_DATA_PARTIAL_VALID1 to
ext4_split_extent_at() when splitting at B, so that it does not convert
the entire extent to written and instead leaves it as unwritten after
zeroing out B to N. The remaining work is just like the standard
two-part split. ext4_split_extent() will pass the EXT4_EXT_DATA_VALID2
flag when it calls ext4_split_extent_at() for the second time, allowing
it to properly handle the split. If the split is successful, it will
keep the extent from 0 to A as unwritten.
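The scenario above can be replayed with a small toy model. It is only a sketch under simplifying assumptions: the extent is a 12-block array, A = 2 and B = 10 as in the diagrams, and toy_split_middle() is a hypothetical function, not kernel code. It returns how many blocks end up written while still holding stale data.

```c
#include <stdbool.h>
#include <string.h>

#define TOY_N 12	/* the extent covers blocks [0, N) */
#define TOY_A 2		/* the range to initialize starts here */
#define TOY_B 10	/* the range to initialize ends here (exclusive) */

/*
 * Hypothetical toy model. Returns the number of blocks marked written ('W')
 * that still hold stale data ('S') after both split attempts.
 */
static inline int toy_split_middle(bool partial_valid1)
{
	char status[TOY_N], data[TOY_N];
	int i, exposed = 0;

	memset(status, 'U', sizeof(status));
	for (i = 0; i < TOY_N; i++)
		data[i] = (i >= TOY_A && i < TOY_B) ? 'D' : 'S';

	/* First split at B fails for lack of space: zero out [B, N). */
	for (i = TOY_B; i < TOY_N; i++)
		data[i] = 'Z';
	/* Without PARTIAL_VALID1 the whole extent is converted to written. */
	if (!partial_valid1)
		memset(status, 'W', sizeof(status));

	/* Second split at A succeeds: [A, N) is written, [0, A) is untouched. */
	for (i = TOY_A; i < TOY_N; i++)
		status[i] = 'W';

	for (i = 0; i < TOY_N; i++)
		if (status[i] == 'W' && data[i] == 'S')
			exposed++;
	return exposed;
}
```

With partial_valid1 set to false the model reports the two stale blocks in [0, A); with it set to true that range stays unwritten and nothing stale is exposed.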
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
Cc: stable@kernel.org
---
fs/ext4/extents.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 8d5ca450aa5d..1fee84ea20af 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3310,6 +3310,15 @@ static struct ext4_ext_path *ext4_split_extent_at(handle_t *handle,
}
if (!err) {
+ /*
+ * The first half contains partially valid data, the
+ * splitting of this extent has not been completed, fix
+ * extent length and ext4_split_extent() split will the
+ * first half again.
+ */
+ if (split_flag & EXT4_EXT_DATA_PARTIAL_VALID1)
+ goto fix_extent_len;
+
/* update the extent length and mark as initialized */
ex->ee_len = cpu_to_le16(ee_len);
ext4_ext_try_to_merge(handle, inode, path, ex);
@@ -3379,7 +3388,9 @@ static struct ext4_ext_path *ext4_split_extent(handle_t *handle,
split_flag1 |= EXT4_EXT_MARK_UNWRIT1 |
EXT4_EXT_MARK_UNWRIT2;
if (split_flag & EXT4_EXT_DATA_VALID2)
- split_flag1 |= EXT4_EXT_DATA_ENTIRE_VALID1;
+ split_flag1 |= map->m_lblk > ee_block ?
+ EXT4_EXT_DATA_PARTIAL_VALID1 :
+ EXT4_EXT_DATA_ENTIRE_VALID1;
path = ext4_split_extent_at(handle, inode, path,
map->m_lblk + map->m_len, split_flag1, flags1);
if (IS_ERR(path))
--
2.46.1
* [PATCH v3 03/14] ext4: don't set EXT4_GET_BLOCKS_CONVERT when splitting before submitting I/O
2025-11-29 10:32 [PATCH v3 00/14] ext4: replace ext4_es_insert_extent() when caching on-disk extents Zhang Yi
2025-11-29 10:32 ` [PATCH v3 01/14] ext4: subdivide EXT4_EXT_DATA_VALID1 Zhang Yi
2025-11-29 10:32 ` [PATCH v3 02/14] ext4: don't zero the entire extent if EXT4_EXT_DATA_PARTIAL_VALID1 Zhang Yi
@ 2025-11-29 10:32 ` Zhang Yi
2025-11-29 10:32 ` [PATCH v3 04/14] ext4: correct the mapping status if the extent has been zeroed Zhang Yi
` (11 subsequent siblings)
14 siblings, 0 replies; 20+ messages in thread
From: Zhang Yi @ 2025-11-29 10:32 UTC (permalink / raw)
To: linux-ext4
Cc: linux-fsdevel, linux-kernel, tytso, adilger.kernel, jack, ojaswin,
yi.zhang, yi.zhang, yizhang089, libaokun1, yangerkun
From: Zhang Yi <yi.zhang@huawei.com>
When allocating blocks during within-EOF DIO and writeback with
dioread_nolock enabled, EXT4_GET_BLOCKS_PRE_IO is set to split an
existing large unwritten extent. However, EXT4_GET_BLOCKS_CONVERT is
also set when calling ext4_split_convert_extents(), which may result in
stale data issues.
Assume we have an unwritten extent, and then DIO writes the second half.
[UUUUUUUUUUUUUUUU] on-disk extent        U: unwritten extent
[UUUUUUUUUUUUUUUU] extent status tree
          |<-  ->| ----> dio write this range
First, ext4_iomap_alloc() calls ext4_map_blocks() with the
EXT4_GET_BLOCKS_PRE_IO, EXT4_GET_BLOCKS_UNWRIT_EXT and
EXT4_GET_BLOCKS_CREATE flags set. ext4_map_blocks() finds this extent
and calls ext4_split_convert_extents() with EXT4_GET_BLOCKS_CONVERT and
the above flags set.
Then, ext4_split_convert_extents() calls ext4_split_extent() with the
EXT4_EXT_MAY_ZEROOUT, EXT4_EXT_MARK_UNWRIT2 and EXT4_EXT_DATA_VALID2
flags set, and it calls ext4_split_extent_at() to split the second half
with the EXT4_EXT_DATA_VALID2, EXT4_EXT_MARK_UNWRIT1,
EXT4_EXT_MAY_ZEROOUT and EXT4_EXT_MARK_UNWRIT2 flags set. However,
ext4_split_extent_at() fails to insert the extent due to a temporary
lack of space (-ENOSPC). It zeroes out the first half but converts the
entire on-disk extent to written because the EXT4_EXT_DATA_VALID2 flag
is set, while leaving the second half as unwritten in the extent status
tree.
[0000000000SSSSSS] data                  S: stale data, 0: zeroed
[WWWWWWWWWWWWWWWW] on-disk extent        W: written extent
[WWWWWWWWWWUUUUUU] extent status tree
Finally, if the DIO fails to write data to the disk, the stale data in
the second half will be exposed once the cached extent entry is gone.
Fix this issue by not passing EXT4_GET_BLOCKS_CONVERT when splitting
an unwritten extent before submitting I/O, and make
ext4_split_convert_extents() zero out the entire extent range in this
case, and also mark the extent in the extent status tree for
consistency.
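The resulting flag selection can be sketched as follows. This is a simplified model of the branch changed by this patch only (the EXT4_GET_BLOCKS_CONVERT_UNWRITTEN branch is omitted), and the TOY_* constants are hypothetical stand-ins whose values are not the kernel's.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the EXT4_GET_BLOCKS_* request flags. */
enum { TOY_GET_BLOCKS_UNWRIT_EXT = 0x1, TOY_GET_BLOCKS_CONVERT = 0x2 };
/* Hypothetical stand-ins for the EXT4_EXT_* split flags. */
enum {
	TOY_EXT_MAY_ZEROOUT = 0x1,
	TOY_EXT_MARK_UNWRIT2 = 0x4,
	TOY_EXT_DATA_VALID2 = 0x20,
};

/* Derive split flags; inside_eof models "ee_block + ee_len <= eof_block". */
static inline int toy_split_flags(int flags, bool inside_eof)
{
	int split_flag = 0;

	if (flags & (TOY_GET_BLOCKS_UNWRIT_EXT | TOY_GET_BLOCKS_CONVERT)) {
		if (inside_eof)
			split_flag |= TOY_EXT_MAY_ZEROOUT;
		split_flag |= TOY_EXT_MARK_UNWRIT2;
		/* Only a real conversion may claim the data is valid. */
		if (flags & TOY_GET_BLOCKS_CONVERT)
			split_flag |= TOY_EXT_DATA_VALID2;
	}
	return split_flag;
}
```

The point of the sketch is the last condition: a pre-I/O split no longer carries the "second half contains valid data" claim, so a zeroout fallback has to cover the whole range.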
Fixes: b8a8684502a0 ("ext4: Introduce FALLOC_FL_ZERO_RANGE flag for fallocate")
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
Cc: stable@kernel.org
---
fs/ext4/extents.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 1fee84ea20af..91b56de60c90 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3746,15 +3746,19 @@ static struct ext4_ext_path *ext4_split_convert_extents(handle_t *handle,
/* Convert to unwritten */
if (flags & EXT4_GET_BLOCKS_CONVERT_UNWRITTEN) {
split_flag |= EXT4_EXT_DATA_ENTIRE_VALID1;
- /* Convert to initialized */
- } else if (flags & EXT4_GET_BLOCKS_CONVERT) {
+ /* Split the existing unwritten extent */
+ } else if (flags & (EXT4_GET_BLOCKS_UNWRIT_EXT |
+ EXT4_GET_BLOCKS_CONVERT)) {
/*
* It is safe to convert extent to initialized via explicit
* zeroout only if extent is fully inside i_size or new_size.
*/
split_flag |= ee_block + ee_len <= eof_block ?
EXT4_EXT_MAY_ZEROOUT : 0;
- split_flag |= (EXT4_EXT_MARK_UNWRIT2 | EXT4_EXT_DATA_VALID2);
+ split_flag |= EXT4_EXT_MARK_UNWRIT2;
+ /* Convert to initialized */
+ if (flags & EXT4_GET_BLOCKS_CONVERT)
+ split_flag |= EXT4_EXT_DATA_VALID2;
}
flags |= EXT4_GET_BLOCKS_SPLIT_NOMERGE;
return ext4_split_extent(handle, inode, path, map, split_flag, flags,
@@ -3930,7 +3934,7 @@ ext4_ext_handle_unwritten_extents(handle_t *handle, struct inode *inode,
/* get_block() before submitting IO, split the extent */
if (flags & EXT4_GET_BLOCKS_SPLIT_NOMERGE) {
path = ext4_split_convert_extents(handle, inode, map, path,
- flags | EXT4_GET_BLOCKS_CONVERT, allocated);
+ flags, allocated);
if (IS_ERR(path))
return path;
/*
--
2.46.1
* [PATCH v3 04/14] ext4: correct the mapping status if the extent has been zeroed
2025-11-29 10:32 [PATCH v3 00/14] ext4: replace ext4_es_insert_extent() when caching on-disk extents Zhang Yi
` (2 preceding siblings ...)
2025-11-29 10:32 ` [PATCH v3 03/14] ext4: don't set EXT4_GET_BLOCKS_CONVERT when splitting before submitting I/O Zhang Yi
@ 2025-11-29 10:32 ` Zhang Yi
2025-11-29 10:32 ` [PATCH v3 05/14] ext4: don't cache extent during splitting extent Zhang Yi
` (10 subsequent siblings)
14 siblings, 0 replies; 20+ messages in thread
From: Zhang Yi @ 2025-11-29 10:32 UTC (permalink / raw)
To: linux-ext4
Cc: linux-fsdevel, linux-kernel, tytso, adilger.kernel, jack, ojaswin,
yi.zhang, yi.zhang, yizhang089, libaokun1, yangerkun
From: Zhang Yi <yi.zhang@huawei.com>
Before submitting I/O and allocating blocks with the
EXT4_GET_BLOCKS_PRE_IO flag set, ext4_split_convert_extents() may
convert the target extent range to initialized due to ENOSPC, ENOMEM, or
EDQUOT errors. However, it still incorrectly marks the mapping as
unwritten. Although this may not seem to cause any practical problems,
it will result in an unnecessary extent conversion operation after I/O
completion. Therefore, it's better to correct the returned mapping
status.
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
---
fs/ext4/extents.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 91b56de60c90..daecf3f0b367 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3933,6 +3933,8 @@ ext4_ext_handle_unwritten_extents(handle_t *handle, struct inode *inode,
/* get_block() before submitting IO, split the extent */
if (flags & EXT4_GET_BLOCKS_SPLIT_NOMERGE) {
+ int depth;
+
path = ext4_split_convert_extents(handle, inode, map, path,
flags, allocated);
if (IS_ERR(path))
@@ -3948,7 +3950,13 @@ ext4_ext_handle_unwritten_extents(handle_t *handle, struct inode *inode,
err = -EFSCORRUPTED;
goto errout;
}
- map->m_flags |= EXT4_MAP_UNWRITTEN;
+ /* Don't mark unwritten if the extent has been zeroed out. */
+ path = ext4_find_extent(inode, map->m_lblk, path, flags);
+ if (IS_ERR(path))
+ return path;
+ depth = ext_depth(inode);
+ if (ext4_ext_is_unwritten(path[depth].p_ext))
+ map->m_flags |= EXT4_MAP_UNWRITTEN;
goto out;
}
/* IO end_io complete, convert the filled extent to written */
--
2.46.1
* [PATCH v3 05/14] ext4: don't cache extent during splitting extent
2025-11-29 10:32 [PATCH v3 00/14] ext4: replace ext4_es_insert_extent() when caching on-disk extents Zhang Yi
` (3 preceding siblings ...)
2025-11-29 10:32 ` [PATCH v3 04/14] ext4: correct the mapping status if the extent has been zeroed Zhang Yi
@ 2025-11-29 10:32 ` Zhang Yi
2025-11-29 10:32 ` [PATCH v3 06/14] ext4: drop extent cache after doing PARTIAL_VALID1 zeroout Zhang Yi
` (9 subsequent siblings)
14 siblings, 0 replies; 20+ messages in thread
From: Zhang Yi @ 2025-11-29 10:32 UTC (permalink / raw)
To: linux-ext4
Cc: linux-fsdevel, linux-kernel, tytso, adilger.kernel, jack, ojaswin,
yi.zhang, yi.zhang, yizhang089, libaokun1, yangerkun
From: Zhang Yi <yi.zhang@huawei.com>
Caching extents during the splitting process is risky, as it may result
in stale extents remaining in the status tree. Moreover, in most cases,
the corresponding extent block entries are likely already cached before
the split happens, making caching here not particularly useful.
Assume we have an unwritten extent, and then DIO writes the first half.
[UUUUUUUUUUUUUUUU] on-disk extent        U: unwritten extent
[UUUUUUUUUUUUUUUU] extent status tree
|<-   ->| ----> dio write this range
First, when ext4_split_extent_at() splits this extent, it truncates the
existing extent and then inserts a new one. During this process, this
extent status entry may be shrunk, and calls to ext4_find_extent() and
ext4_cache_extents() may occur, which could potentially insert the
truncated range as a hole into the extent status tree. After the split
is completed, this hole is not replaced with the correct status.
[UUUUUUU|UUUUUUUU] on-disk extent        U: unwritten extent
[UUUUUUU|HHHHHHHH] extent status tree    H: hole
Then, the outer calling functions will not correct this remaining hole
extent either. Finally, if we perform a delayed buffer write on this
latter part, it will re-insert the delayed extent and cause an error in
space accounting.
In addition, if the unwritten extent cache is not shrunk during the
splitting, ext4_cache_extents() also conflicts with existing extents
when caching extents. In the future, we will add checks when caching
extents, which will trigger a warning. Therefore, do not cache extents
that are being split.
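A toy sketch of the effect of the no-cache flag, assuming a 16-block extent truncated at block 7 as in the diagram above. TOY_EX_NOCACHE, toy_find_extent() and toy_stale_holes() are hypothetical names used only for illustration.

```c
#include <string.h>

#define TOY_LEN		16
#define TOY_SPLIT	7
#define TOY_EX_NOCACHE	0x1	/* hypothetical stand-in for EXT4_EX_NOCACHE */

/*
 * Models a lookup made while the on-disk extent is temporarily truncated
 * to [0, SPLIT): unless told not to cache, it records the rest as a hole.
 */
static inline void toy_find_extent(char es_tree[TOY_LEN], int flags)
{
	if (flags & TOY_EX_NOCACHE)
		return;
	memset(es_tree + TOY_SPLIT, 'H', TOY_LEN - TOY_SPLIT);
}

/* Returns how many blocks of the status tree are wrongly cached as holes. */
static inline int toy_stale_holes(int flags)
{
	char es_tree[TOY_LEN];
	int i, holes = 0;

	memset(es_tree, 'U', sizeof(es_tree));
	toy_find_extent(es_tree, flags);
	for (i = 0; i < TOY_LEN; i++)
		holes += es_tree[i] == 'H';
	return holes;
}
```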
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
Cc: stable@kernel.org
---
fs/ext4/extents.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index daecf3f0b367..be9fd2ab8667 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3199,6 +3199,9 @@ static struct ext4_ext_path *ext4_split_extent_at(handle_t *handle,
BUG_ON((split_flag & EXT4_EXT_DATA_VALID1) &&
(split_flag & EXT4_EXT_DATA_VALID2));
+ /* Do not cache extents that are in the process of being modified. */
+ flags |= EXT4_EX_NOCACHE;
+
ext_debug(inode, "logical block %llu\n", (unsigned long long)split);
ext4_ext_show_leaf(inode, path);
@@ -3381,6 +3384,9 @@ static struct ext4_ext_path *ext4_split_extent(handle_t *handle,
ee_len = ext4_ext_get_actual_len(ex);
unwritten = ext4_ext_is_unwritten(ex);
+ /* Do not cache extents that are in the process of being modified. */
+ flags |= EXT4_EX_NOCACHE;
+
if (map->m_lblk + map->m_len < ee_block + ee_len) {
split_flag1 = split_flag & EXT4_EXT_MAY_ZEROOUT;
flags1 = flags | EXT4_GET_BLOCKS_SPLIT_NOMERGE;
--
2.46.1
* [PATCH v3 06/14] ext4: drop extent cache after doing PARTIAL_VALID1 zeroout
2025-11-29 10:32 [PATCH v3 00/14] ext4: replace ext4_es_insert_extent() when caching on-disk extents Zhang Yi
` (4 preceding siblings ...)
2025-11-29 10:32 ` [PATCH v3 05/14] ext4: don't cache extent during splitting extent Zhang Yi
@ 2025-11-29 10:32 ` Zhang Yi
2025-11-29 17:33 ` Ojaswin Mujoo
2025-11-29 10:32 ` [PATCH v3 07/14] ext4: drop extent cache when splitting extent fails Zhang Yi
` (8 subsequent siblings)
14 siblings, 1 reply; 20+ messages in thread
From: Zhang Yi @ 2025-11-29 10:32 UTC (permalink / raw)
To: linux-ext4
Cc: linux-fsdevel, linux-kernel, tytso, adilger.kernel, jack, ojaswin,
yi.zhang, yi.zhang, yizhang089, libaokun1, yangerkun
From: Zhang Yi <yi.zhang@huawei.com>
When splitting an unwritten extent in the middle and converting it to
initialized in ext4_split_extent() with the EXT4_EXT_MAY_ZEROOUT and
EXT4_EXT_DATA_VALID2 flags set, it could leave a stale unwritten extent.
Assume we have an unwritten file and perform a buffered write in the
middle of it without dioread_nolock enabled; the blocks will then be
allocated as a written extent.
 0 A       B N
[UUUUUUUUUUUU] on-disk extent      U: unwritten extent
[UUUUUUUUUUUU] extent status tree
[--DDDDDDDD--]                     D: valid data
   |<-  ->| ----> this range needs to be initialized
ext4_split_extent() first tries to split this extent at B with the
EXT4_EXT_DATA_PARTIAL_VALID1 and EXT4_EXT_MAY_ZEROOUT flags set, but
ext4_split_extent_at() fails to split the extent due to a temporary
lack of space. It zeroes out B to N and leaves the entire extent as
unwritten.
 0 A       B N
[UUUUUUUUUUUU] on-disk extent
[UUUUUUUUUUUU] extent status tree
[--DDDDDDDDZZ]                     Z: zeroed data
ext4_split_extent() then tries to split this extent at A with the
EXT4_EXT_DATA_VALID2 flag set. This time, the split succeeds and
leaves a written extent from A to N.
 0 A       B N
[UUWWWWWWWWWW] on-disk extent      W: written extent
[UUUUUUUUUUUU] extent status tree
[--DDDDDDDDZZ]
Finally, ext4_map_create_blocks() only inserts the extent from A to B
into the extent status tree, leaving a stale unwritten extent from B to
N in the status tree.
 0 A       B N
[UUWWWWWWWWWW] on-disk extent      W: written extent
[UUWWWWWWWWUU] extent status tree
[--DDDDDDDDZZ]
Fix this issue by always dropping the cached extent status entry after
zeroing out the second part.
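The same sequence can be replayed in a toy model that compares the on-disk state with the extent status tree. It is a sketch under the assumptions of the diagrams (12 blocks, A = 2, B = 10); toy_stale_es_entries() is a hypothetical function, and '.' marks a block with no cached status.

```c
#include <stdbool.h>
#include <string.h>

enum { TOY_N = 12, TOY_A = 2, TOY_B = 10 };

/*
 * Hypothetical toy model. Returns the number of cached status entries that
 * disagree with the on-disk extent state after both split attempts.
 */
static inline int toy_stale_es_entries(bool drop_after_zeroout)
{
	char disk[TOY_N], es[TOY_N];
	int i, stale = 0;

	memset(disk, 'U', sizeof(disk));
	memset(es, 'U', sizeof(es));

	/* Split at B fails; [B, N) is zeroed and the extent stays unwritten. */
	if (drop_after_zeroout)
		memset(es + TOY_B, '.', TOY_N - TOY_B);

	/* Split at A succeeds: [A, N) becomes written on disk. */
	memset(disk + TOY_A, 'W', TOY_N - TOY_A);
	/* The caller then caches only the mapped range [A, B) as written. */
	memset(es + TOY_A, 'W', TOY_B - TOY_A);

	for (i = 0; i < TOY_N; i++)
		stale += es[i] != '.' && es[i] != disk[i];
	return stale;
}
```

Without the drop, the two blocks in [B, N) remain cached as unwritten although they are written on disk; with the drop, those blocks are simply uncached and get looked up again later.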
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
Cc: stable@kernel.org
---
fs/ext4/extents.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index be9fd2ab8667..1094e4923451 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3319,8 +3319,16 @@ static struct ext4_ext_path *ext4_split_extent_at(handle_t *handle,
* extent length and ext4_split_extent() split will the
* first half again.
*/
- if (split_flag & EXT4_EXT_DATA_PARTIAL_VALID1)
+ if (split_flag & EXT4_EXT_DATA_PARTIAL_VALID1) {
+ /*
+ * Drop extent cache to prevent stale unwritten
+ * extents remaining after zeroing out.
+ */
+ ext4_es_remove_extent(inode,
+ le32_to_cpu(zero_ex.ee_block),
+ ext4_ext_get_actual_len(&zero_ex));
goto fix_extent_len;
+ }
/* update the extent length and mark as initialized */
ex->ee_len = cpu_to_le16(ee_len);
--
2.46.1
* [PATCH v3 07/14] ext4: drop extent cache when splitting extent fails
2025-11-29 10:32 [PATCH v3 00/14] ext4: replace ext4_es_insert_extent() when caching on-disk extents Zhang Yi
` (5 preceding siblings ...)
2025-11-29 10:32 ` [PATCH v3 06/14] ext4: drop extent cache after doing PARTIAL_VALID1 zeroout Zhang Yi
@ 2025-11-29 10:32 ` Zhang Yi
2025-11-29 17:34 ` Ojaswin Mujoo
2025-11-29 10:32 ` [PATCH v3 08/14] ext4: cleanup zeroout in ext4_split_extent_at() Zhang Yi
` (7 subsequent siblings)
14 siblings, 1 reply; 20+ messages in thread
From: Zhang Yi @ 2025-11-29 10:32 UTC (permalink / raw)
To: linux-ext4
Cc: linux-fsdevel, linux-kernel, tytso, adilger.kernel, jack, ojaswin,
yi.zhang, yi.zhang, yizhang089, libaokun1, yangerkun
From: Zhang Yi <yi.zhang@huawei.com>
When splitting an extent fails, we might leave some extents still being
processed and return an error directly, which will result in stale
extent entries remaining in the extent status tree. So drop all of the
remaining potentially stale extents if the splitting fails.
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
Cc: stable@kernel.org
---
fs/ext4/extents.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 1094e4923451..945995d68c4d 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3267,7 +3267,7 @@ static struct ext4_ext_path *ext4_split_extent_at(handle_t *handle,
err = PTR_ERR(path);
if (err != -ENOSPC && err != -EDQUOT && err != -ENOMEM)
- return path;
+ goto out_path;
/*
* Get a new path to try to zeroout or fix the extent length.
@@ -3281,7 +3281,7 @@ static struct ext4_ext_path *ext4_split_extent_at(handle_t *handle,
if (IS_ERR(path)) {
EXT4_ERROR_INODE(inode, "Failed split extent on %u, err %ld",
split, PTR_ERR(path));
- return path;
+ goto out_path;
}
depth = ext_depth(inode);
ex = path[depth].p_ext;
@@ -3358,6 +3358,10 @@ static struct ext4_ext_path *ext4_split_extent_at(handle_t *handle,
ext4_free_ext_path(path);
path = ERR_PTR(err);
}
+out_path:
+ if (IS_ERR(path))
+ /* Remove all remaining potentially stale extents. */
+ ext4_es_remove_extent(inode, ee_block, ee_len);
ext4_ext_show_leaf(inode, path);
return path;
}
--
2.46.1
* [PATCH v3 08/14] ext4: cleanup zeroout in ext4_split_extent_at()
2025-11-29 10:32 [PATCH v3 00/14] ext4: replace ext4_es_insert_extent() when caching on-disk extents Zhang Yi
` (6 preceding siblings ...)
2025-11-29 10:32 ` [PATCH v3 07/14] ext4: drop extent cache when splitting extent fails Zhang Yi
@ 2025-11-29 10:32 ` Zhang Yi
2025-11-29 10:32 ` [PATCH v3 09/14] ext4: cleanup useless out label in __es_remove_extent() Zhang Yi
` (6 subsequent siblings)
14 siblings, 0 replies; 20+ messages in thread
From: Zhang Yi @ 2025-11-29 10:32 UTC (permalink / raw)
To: linux-ext4
Cc: linux-fsdevel, linux-kernel, tytso, adilger.kernel, jack, ojaswin,
yi.zhang, yi.zhang, yizhang089, libaokun1, yangerkun
From: Zhang Yi <yi.zhang@huawei.com>
zero_ex is a temporary variable used only for writing zeros and
inserting the extent status entry; it is never inserted directly into
the tree. Therefore, it can simply be copied from the target extent in
each scenario, eliminating the need to explicitly assign each field
individually.
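The selection of the range to zero can be sketched on its own. struct toy_extent is a hypothetical two-field stand-in for struct ext4_extent, and the TOY_* flag values are illustrative; only the branch order mirrors the diff in this patch.

```c
#include <string.h>

/* Hypothetical stand-in for struct ext4_extent. */
struct toy_extent {
	unsigned int block;	/* first logical block */
	unsigned short len;	/* number of blocks */
};

#define TOY_DATA_VALID1	0x18	/* either VALID1 variant */
#define TOY_DATA_VALID2	0x20

/* Pick the range to zero out: copy the whole struct instead of each field. */
static inline struct toy_extent toy_pick_zero_ex(int split_flag,
						 const struct toy_extent *ex,
						 const struct toy_extent *ex2,
						 const struct toy_extent *orig_ex)
{
	struct toy_extent zero_ex;

	if (split_flag & TOY_DATA_VALID1)
		/* first half holds valid data: zero the second half */
		memcpy(&zero_ex, ex2, sizeof(zero_ex));
	else if (split_flag & TOY_DATA_VALID2)
		/* second half holds valid data: zero the first half */
		memcpy(&zero_ex, ex, sizeof(zero_ex));
	else
		/* neither half is valid: zero the original extent */
		memcpy(&zero_ex, orig_ex, sizeof(zero_ex));
	return zero_ex;
}
```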
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
---
fs/ext4/extents.c | 87 ++++++++++++++++++++---------------------------
1 file changed, 36 insertions(+), 51 deletions(-)
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 945995d68c4d..2cfce2c01208 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3287,63 +3287,48 @@ static struct ext4_ext_path *ext4_split_extent_at(handle_t *handle,
ex = path[depth].p_ext;
if (EXT4_EXT_MAY_ZEROOUT & split_flag) {
- if (split_flag & (EXT4_EXT_DATA_VALID1|EXT4_EXT_DATA_VALID2)) {
- if (split_flag & EXT4_EXT_DATA_VALID1) {
- err = ext4_ext_zeroout(inode, ex2);
- zero_ex.ee_block = ex2->ee_block;
- zero_ex.ee_len = cpu_to_le16(
- ext4_ext_get_actual_len(ex2));
- ext4_ext_store_pblock(&zero_ex,
- ext4_ext_pblock(ex2));
- } else {
- err = ext4_ext_zeroout(inode, ex);
- zero_ex.ee_block = ex->ee_block;
- zero_ex.ee_len = cpu_to_le16(
- ext4_ext_get_actual_len(ex));
- ext4_ext_store_pblock(&zero_ex,
- ext4_ext_pblock(ex));
- }
- } else {
- err = ext4_ext_zeroout(inode, &orig_ex);
- zero_ex.ee_block = orig_ex.ee_block;
- zero_ex.ee_len = cpu_to_le16(
- ext4_ext_get_actual_len(&orig_ex));
- ext4_ext_store_pblock(&zero_ex,
- ext4_ext_pblock(&orig_ex));
- }
+ if (split_flag & EXT4_EXT_DATA_VALID1)
+ memcpy(&zero_ex, ex2, sizeof(zero_ex));
+ else if (split_flag & EXT4_EXT_DATA_VALID2)
+ memcpy(&zero_ex, ex, sizeof(zero_ex));
+ else
+ memcpy(&zero_ex, &orig_ex, sizeof(zero_ex));
+ ext4_ext_mark_initialized(&zero_ex);
- if (!err) {
+ err = ext4_ext_zeroout(inode, &zero_ex);
+ if (err)
+ goto fix_extent_len;
+
+ /*
+ * The first half contains partially valid data, the splitting
+ * of this extent has not been completed, fix extent length
+ * and ext4_split_extent() split will the first half again.
+ */
+ if (split_flag & EXT4_EXT_DATA_PARTIAL_VALID1) {
/*
- * The first half contains partially valid data, the
- * splitting of this extent has not been completed, fix
- * extent length and ext4_split_extent() split will the
- * first half again.
+ * Drop extent cache to prevent stale unwritten
+ * extents remaining after zeroing out.
*/
- if (split_flag & EXT4_EXT_DATA_PARTIAL_VALID1) {
- /*
- * Drop extent cache to prevent stale unwritten
- * extents remaining after zeroing out.
- */
- ext4_es_remove_extent(inode,
+ ext4_es_remove_extent(inode,
le32_to_cpu(zero_ex.ee_block),
ext4_ext_get_actual_len(&zero_ex));
- goto fix_extent_len;
- }
-
- /* update the extent length and mark as initialized */
- ex->ee_len = cpu_to_le16(ee_len);
- ext4_ext_try_to_merge(handle, inode, path, ex);
- err = ext4_ext_dirty(handle, inode, path + path->p_depth);
- if (!err)
- /* update extent status tree */
- ext4_zeroout_es(inode, &zero_ex);
- /* If we failed at this point, we don't know in which
- * state the extent tree exactly is so don't try to fix
- * length of the original extent as it may do even more
- * damage.
- */
- goto out;
+ goto fix_extent_len;
}
+
+ /* update the extent length and mark as initialized */
+ ex->ee_len = cpu_to_le16(ee_len);
+ ext4_ext_try_to_merge(handle, inode, path, ex);
+ err = ext4_ext_dirty(handle, inode, path + path->p_depth);
+ if (!err)
+ /* update extent status tree */
+ ext4_zeroout_es(inode, &zero_ex);
+ /*
+ * If we failed at this point, we don't know in which
+ * state the extent tree exactly is so don't try to fix
+ * length of the original extent as it may do even more
+ * damage.
+ */
+ goto out;
}
fix_extent_len:
--
2.46.1
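
A note on the zeroout selection in the patch above: the three-way choice of which range to zero can be modelled in userspace as below. This is only a sketch under stated assumptions. The flag values, the struct range type and pick_zero_range() are hypothetical names invented for illustration, and the assumption that DATA_VALID1 is a mask covering both the ENTIRE and PARTIAL variants is not confirmed by this message.

```c
#include <assert.h>

/* Hypothetical flag values; the real ones are defined in fs/ext4/extents.c. */
enum {
	DATA_ENTIRE_VALID1  = 0x1,
	DATA_PARTIAL_VALID1 = 0x2,
	/* Assumption: VALID1 is a mask covering both sub-flags. */
	DATA_VALID1         = DATA_ENTIRE_VALID1 | DATA_PARTIAL_VALID1,
	DATA_VALID2         = 0x4,
};

/* A block range [start, start + len). Hypothetical stand-in for an extent. */
struct range {
	unsigned int start;
	unsigned int len;
};

/*
 * Pick the range to zero when an extent cannot be split at 'split':
 * - data in the first half is valid  -> zero the second half
 * - data in the second half is valid -> zero the first half
 * - otherwise                        -> zero the whole original extent
 */
struct range pick_zero_range(unsigned int flags, struct range orig,
			     unsigned int split)
{
	struct range first, second;

	/* The split point must fall strictly inside the extent. */
	assert(split > orig.start && split < orig.start + orig.len);

	first.start = orig.start;
	first.len = split - orig.start;
	second.start = split;
	second.len = orig.start + orig.len - split;

	if (flags & DATA_VALID1)
		return second;
	if (flags & DATA_VALID2)
		return first;
	return orig;
}
```

For a 12-block extent split at block 10, the VALID1 case selects blocks 10-11, which matches the "zeroout B to N" step described in patch 06.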
* [PATCH v3 09/14] ext4: cleanup useless out label in __es_remove_extent()
2025-11-29 10:32 [PATCH v3 00/14] ext4: replace ext4_es_insert_extent() when caching on-disk extents Zhang Yi
` (7 preceding siblings ...)
2025-11-29 10:32 ` [PATCH v3 08/14] ext4: cleanup zeroout in ext4_split_extent_at() Zhang Yi
@ 2025-11-29 10:32 ` Zhang Yi
2025-11-29 10:32 ` [PATCH v3 10/14] ext4: make __es_remove_extent() check extent status Zhang Yi
` (5 subsequent siblings)
14 siblings, 0 replies; 20+ messages in thread
From: Zhang Yi @ 2025-11-29 10:32 UTC (permalink / raw)
To: linux-ext4
Cc: linux-fsdevel, linux-kernel, tytso, adilger.kernel, jack, ojaswin,
yi.zhang, yi.zhang, yizhang089, libaokun1, yangerkun
From: Zhang Yi <yi.zhang@huawei.com>
The out label in __es_remove_extent() just returns the err value, so we
can return directly if something bad happens. Therefore, remove the
useless out label and rename out_get_reserved to out.
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
---
fs/ext4/extents_status.c | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index e04fbf10fe4f..04d56f8f6c0c 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -1434,7 +1434,7 @@ static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
struct extent_status orig_es;
ext4_lblk_t len1, len2;
ext4_fsblk_t block;
- int err = 0;
+ int err;
bool count_reserved = true;
struct rsvd_count rc;
@@ -1443,9 +1443,9 @@ static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
es = __es_tree_search(&tree->root, lblk);
if (!es)
- goto out;
+ return 0;
if (es->es_lblk > end)
- goto out;
+ return 0;
/* Simply invalidate cache_es. */
tree->cache_es = NULL;
@@ -1480,7 +1480,7 @@ static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
es->es_lblk = orig_es.es_lblk;
es->es_len = orig_es.es_len;
- goto out;
+ return err;
}
} else {
es->es_lblk = end + 1;
@@ -1494,7 +1494,7 @@ static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
if (count_reserved)
count_rsvd(inode, orig_es.es_lblk + len1,
orig_es.es_len - len1 - len2, &orig_es, &rc);
- goto out_get_reserved;
+ goto out;
}
if (len1 > 0) {
@@ -1536,11 +1536,10 @@ static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
}
}
-out_get_reserved:
+out:
if (count_reserved)
*reserved = get_rsvd(inode, end, es, &rc);
-out:
- return err;
+ return 0;
}
/*
--
2.46.1
* [PATCH v3 10/14] ext4: make __es_remove_extent() check extent status
2025-11-29 10:32 [PATCH v3 00/14] ext4: replace ext4_es_insert_extent() when caching on-disk extents Zhang Yi
` (8 preceding siblings ...)
2025-11-29 10:32 ` [PATCH v3 09/14] ext4: cleanup useless out label in __es_remove_extent() Zhang Yi
@ 2025-11-29 10:32 ` Zhang Yi
2025-11-29 10:32 ` [PATCH v3 11/14] ext4: make ext4_es_cache_extent() support overwrite existing extents Zhang Yi
` (4 subsequent siblings)
14 siblings, 0 replies; 20+ messages in thread
From: Zhang Yi @ 2025-11-29 10:32 UTC (permalink / raw)
To: linux-ext4
Cc: linux-fsdevel, linux-kernel, tytso, adilger.kernel, jack, ojaswin,
yi.zhang, yi.zhang, yizhang089, libaokun1, yangerkun
From: Zhang Yi <yi.zhang@huawei.com>
Currently, __es_remove_extent() unconditionally removes extent status
entries within the specified range. In order to prepare for extending
the ext4_es_cache_extent() function to cache on-disk extents, which may
overwrite some existing short-length extents with the same status, allow
__es_remove_extent() to check each extent's type against the specified
status before removing it, and to return an error and pass out the
conflicting extent if the status does not match.
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
---
fs/ext4/extents_status.c | 49 +++++++++++++++++++++++++++++++++-------
1 file changed, 41 insertions(+), 8 deletions(-)
diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index 04d56f8f6c0c..818007bb613f 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -178,7 +178,8 @@ static struct kmem_cache *ext4_pending_cachep;
static int __es_insert_extent(struct inode *inode, struct extent_status *newes,
struct extent_status *prealloc);
static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
- ext4_lblk_t end, int *reserved,
+ ext4_lblk_t end, unsigned int status,
+ int *reserved, struct extent_status *res,
struct extent_status *prealloc);
static int es_reclaim_extents(struct ext4_inode_info *ei, int *nr_to_scan);
static int __es_shrink(struct ext4_sb_info *sbi, int nr_to_scan,
@@ -242,6 +243,21 @@ static inline void ext4_es_inc_seq(struct inode *inode)
WRITE_ONCE(ei->i_es_seq, ei->i_es_seq + 1);
}
+static inline int __es_check_extent_status(struct extent_status *es,
+ unsigned int status,
+ struct extent_status *res)
+{
+ if (ext4_es_type(es) & status)
+ return 0;
+
+ if (res) {
+ res->es_lblk = es->es_lblk;
+ res->es_len = es->es_len;
+ res->es_pblk = es->es_pblk;
+ }
+ return -EINVAL;
+}
+
/*
* search through the tree for an delayed extent with a given offset. If
* it can't be found, try to find next extent.
@@ -929,7 +945,7 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
pr = __alloc_pending(true);
write_lock(&EXT4_I(inode)->i_es_lock);
- err1 = __es_remove_extent(inode, lblk, end, &resv_used, es1);
+ err1 = __es_remove_extent(inode, lblk, end, 0, &resv_used, NULL, es1);
if (err1 != 0)
goto error;
/* Free preallocated extent if it didn't get used. */
@@ -1409,23 +1425,27 @@ static unsigned int get_rsvd(struct inode *inode, ext4_lblk_t end,
return rc->ndelayed;
}
-
/*
* __es_remove_extent - removes block range from extent status tree
*
* @inode - file containing range
* @lblk - first block in range
* @end - last block in range
+ * @status - the extent status to be checked
* @reserved - number of cluster reservations released
+ * @res - returns the conflicting extent if the status does not match
* @prealloc - pre-allocated es to avoid memory allocation failures
*
* If @reserved is not NULL and delayed allocation is enabled, counts
* block/cluster reservations freed by removing range and if bigalloc
- * enabled cancels pending reservations as needed. Returns 0 on success,
- * error code on failure.
+ * enabled cancels pending reservations as needed. If @status is not
+ * zero, check the extent status type while removing extents, return
+ * -EINVAL and pass out the extent through @res if it does not match.
+ * Returns 0 on success, error code on failure.
*/
static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
- ext4_lblk_t end, int *reserved,
+ ext4_lblk_t end, unsigned int status,
+ int *reserved, struct extent_status *res,
struct extent_status *prealloc)
{
struct ext4_es_tree *tree = &EXT4_I(inode)->i_es_tree;
@@ -1440,6 +1460,8 @@ static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
if (reserved == NULL || !test_opt(inode->i_sb, DELALLOC))
count_reserved = false;
+ if (status == 0)
+ status = ES_TYPE_MASK;
es = __es_tree_search(&tree->root, lblk);
if (!es)
@@ -1447,6 +1469,10 @@ static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
if (es->es_lblk > end)
return 0;
+ err = __es_check_extent_status(es, status, res);
+ if (err)
+ return err;
+
/* Simply invalidate cache_es. */
tree->cache_es = NULL;
if (count_reserved)
@@ -1509,6 +1535,9 @@ static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
}
while (es && ext4_es_end(es) <= end) {
+ err = __es_check_extent_status(es, status, res);
+ if (err)
+ return err;
if (count_reserved)
count_rsvd(inode, es->es_lblk, es->es_len, es, &rc);
node = rb_next(&es->rb_node);
@@ -1524,6 +1553,10 @@ static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
if (es && es->es_lblk < end + 1) {
ext4_lblk_t orig_len = es->es_len;
+ err = __es_check_extent_status(es, status, res);
+ if (err)
+ return err;
+
len1 = ext4_es_end(es) - end;
if (count_reserved)
count_rsvd(inode, es->es_lblk, orig_len - len1,
@@ -1581,7 +1614,7 @@ void ext4_es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
* is reclaimed.
*/
write_lock(&EXT4_I(inode)->i_es_lock);
- err = __es_remove_extent(inode, lblk, end, &reserved, es);
+ err = __es_remove_extent(inode, lblk, end, 0, &reserved, NULL, es);
if (err)
goto error;
/* Free preallocated extent if it didn't get used. */
@@ -2173,7 +2206,7 @@ void ext4_es_insert_delayed_extent(struct inode *inode, ext4_lblk_t lblk,
}
write_lock(&EXT4_I(inode)->i_es_lock);
- err1 = __es_remove_extent(inode, lblk, end, NULL, es1);
+ err1 = __es_remove_extent(inode, lblk, end, 0, NULL, NULL, es1);
if (err1 != 0)
goto error;
/* Free preallocated extent if it didn't get used. */
--
2.46.1
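
The status check added by the patch above can be approximated in userspace as follows. This is a minimal sketch, not the kernel code: the ST_* bit values, struct es and check_status() are hypothetical names, and the "0 means accept any type" rule, which the patch applies in __es_remove_extent(), is folded into the helper here for brevity.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical type bits; the kernel's EXTENT_STATUS_* values are not used. */
enum {
	ST_WRITTEN   = 0x1,
	ST_UNWRITTEN = 0x2,
	ST_DELAYED   = 0x4,
	ST_HOLE      = 0x8,
	ST_TYPE_MASK = 0xf,
};

/* Simplified extent status entry: logical start, length, type bit. */
struct es {
	unsigned int lblk;
	unsigned int len;
	unsigned int type;
};

/*
 * Return 0 if 'es' has one of the accepted types. Otherwise copy the
 * conflicting entry to '*res' (if the caller asked for it) and return
 * -EINVAL. An 'accepted' value of 0 accepts every type.
 */
int check_status(const struct es *es, unsigned int accepted, struct es *res)
{
	assert(es != NULL);

	if (accepted == 0)
		accepted = ST_TYPE_MASK;
	if (es->type & accepted)
		return 0;
	if (res)
		*res = *es;
	return -EINVAL;
}
```

Passing 0 for the accepted mask keeps the old unconditional behaviour, which is what the three existing callers in the diff rely on.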
* [PATCH v3 11/14] ext4: make ext4_es_cache_extent() support overwrite existing extents
2025-11-29 10:32 [PATCH v3 00/14] ext4: replace ext4_es_insert_extent() when caching on-disk extents Zhang Yi
` (9 preceding siblings ...)
2025-11-29 10:32 ` [PATCH v3 10/14] ext4: make __es_remove_extent() check extent status Zhang Yi
@ 2025-11-29 10:32 ` Zhang Yi
2025-11-29 10:32 ` [PATCH v3 12/14] ext4: adjust the debug info in ext4_es_cache_extent() Zhang Yi
` (3 subsequent siblings)
14 siblings, 0 replies; 20+ messages in thread
From: Zhang Yi @ 2025-11-29 10:32 UTC (permalink / raw)
To: linux-ext4
Cc: linux-fsdevel, linux-kernel, tytso, adilger.kernel, jack, ojaswin,
yi.zhang, yi.zhang, yizhang089, libaokun1, yangerkun
From: Zhang Yi <yi.zhang@huawei.com>
Currently, ext4_es_cache_extent() is used to load extents into the
extent status tree when reading on-disk extent blocks. But it inserts
information into the extent status tree if and only if there isn't
information about the specified range already. So it is only used for the
initial loading and does not support overwriting extents.
However, there are many other places in ext4 where on-disk extents are
inserted into the extent status tree, such as in ext4_map_query_blocks().
Currently, they call ext4_es_insert_extent() to perform the insertion,
but they don't modify the extents, so ext4_es_cache_extent() would be a
more appropriate choice. However, when ext4_map_query_blocks() inserts
an extent, it may overwrite a short existing extent of the same type.
Therefore, to prepare for the replacements, we need to extend
ext4_es_cache_extent() to allow it to overwrite existing extents with
the same status. So it checks the found extents before removing and
inserting. (There is one exception, a hole in the on-disk extent but a
delayed extent in the extent status tree is allowed.)
In addition, since caching extents can be more lenient than modifying
them and does not involve modifying reserved blocks, it is not necessary
to ensure that the insertion operation succeeds as strictly as in the
ext4_es_insert_extent() function.
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
---
fs/ext4/extents_status.c | 50 ++++++++++++++++++++++++++++++++++------
1 file changed, 43 insertions(+), 7 deletions(-)
diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index 818007bb613f..48f04aef2f2e 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -1014,17 +1014,24 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
}
/*
- * ext4_es_cache_extent() inserts information into the extent status
- * tree if and only if there isn't information about the range in
- * question already.
+ * ext4_es_cache_extent() inserts information into the extent status tree
+ * only if there is no existing information about the specified range or
+ * if the existing extents have the same status.
+ *
+ * Note that this interface is only used for caching on-disk extent
+ * information and cannot be used to convert existing extents in the extent
+ * status tree. To convert existing extents, use ext4_es_insert_extent()
+ * instead.
*/
void ext4_es_cache_extent(struct inode *inode, ext4_lblk_t lblk,
ext4_lblk_t len, ext4_fsblk_t pblk,
unsigned int status)
{
struct extent_status *es;
- struct extent_status newes;
+ struct extent_status chkes, newes;
ext4_lblk_t end = lblk + len - 1;
+ bool conflict = false;
+ int err;
if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
return;
@@ -1040,11 +1047,40 @@ void ext4_es_cache_extent(struct inode *inode, ext4_lblk_t lblk,
BUG_ON(end < lblk);
write_lock(&EXT4_I(inode)->i_es_lock);
-
es = __es_tree_search(&EXT4_I(inode)->i_es_tree.root, lblk);
- if (!es || es->es_lblk > end)
- __es_insert_extent(inode, &newes, NULL);
+ if (es && es->es_lblk <= end) {
+ /* Found an extent that covers the entire range. */
+ if (es->es_lblk <= lblk && es->es_lblk + es->es_len > end) {
+ if (__es_check_extent_status(es, status, &chkes))
+ conflict = true;
+ goto unlock;
+ }
+ /* Check and remove all extents in range. */
+ err = __es_remove_extent(inode, lblk, end, status, NULL,
+ &chkes, NULL);
+ if (err) {
+ if (err == -EINVAL)
+ conflict = true;
+ goto unlock;
+ }
+ }
+ __es_insert_extent(inode, &newes, NULL);
+unlock:
write_unlock(&EXT4_I(inode)->i_es_lock);
+ if (!conflict)
+ return;
+ /*
+ * A hole in the on-disk extent but a delayed extent in the extent
+ * status tree, is allowed.
+ */
+ if (status == EXTENT_STATUS_HOLE &&
+ ext4_es_type(&chkes) == EXTENT_STATUS_DELAYED)
+ return;
+
+ ext4_warning_inode(inode,
+ "ES cache extent failed: add [%d,%d,%llu,0x%x] conflict with existing [%d,%d,%llu,0x%x]\n",
+ lblk, len, pblk, status, chkes.es_lblk, chkes.es_len,
+ ext4_es_pblock(&chkes), ext4_es_status(&chkes));
}
/*
--
2.46.1
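
The decision flow that the patch above gives ext4_es_cache_extent() can be sketched in userspace as a pure classifier over a sorted array of entries. This is an illustration under stated assumptions: the ST_* values, struct es, enum outcome and cache_decision() are hypothetical, the array stands in for the rb-tree, and the sketch only classifies the request without removing or inserting anything.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type bits; not the kernel's EXTENT_STATUS_* values. */
enum { ST_WRITTEN = 0x1, ST_UNWRITTEN = 0x2, ST_DELAYED = 0x4, ST_HOLE = 0x8 };

enum outcome {
	CACHED,           /* nothing conflicts: overwrite and insert */
	ALREADY_COVERED,  /* one same-status extent covers the range */
	CONFLICT_ALLOWED, /* on-disk hole over a cached delayed extent */
	CONFLICT_WARN,    /* any other status mismatch */
};

struct es {
	unsigned int lblk;
	unsigned int len;
	unsigned int type;
};

/* Classify a request to cache 'add' against 'tree' (sorted by lblk). */
enum outcome cache_decision(const struct es *tree, size_t n, struct es add)
{
	unsigned int end = add.lblk + add.len - 1;
	const struct es *bad = NULL;
	size_t i;

	assert(add.len > 0);

	for (i = 0; i < n; i++) {
		unsigned int es_end = tree[i].lblk + tree[i].len - 1;

		if (es_end < add.lblk || tree[i].lblk > end)
			continue;		/* no overlap with the range */
		if (!(tree[i].type & add.type)) {
			bad = &tree[i];		/* first mismatching extent */
			break;
		}
		if (tree[i].lblk <= add.lblk && es_end >= end)
			return ALREADY_COVERED;
	}
	if (!bad)
		return CACHED;
	if (add.type == ST_HOLE && bad->type == ST_DELAYED)
		return CONFLICT_ALLOWED;
	return CONFLICT_WARN;
}
```

Only the CONFLICT_WARN outcome corresponds to the ext4_warning_inode() call in the diff; the hole-over-delayed case returns silently.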
* [PATCH v3 12/14] ext4: adjust the debug info in ext4_es_cache_extent()
2025-11-29 10:32 [PATCH v3 00/14] ext4: replace ext4_es_insert_extent() when caching on-disk extents Zhang Yi
` (10 preceding siblings ...)
2025-11-29 10:32 ` [PATCH v3 11/14] ext4: make ext4_es_cache_extent() support overwrite existing extents Zhang Yi
@ 2025-11-29 10:32 ` Zhang Yi
2025-11-29 10:32 ` [PATCH v3 13/14] ext4: replace ext4_es_insert_extent() when caching on-disk extents Zhang Yi
` (2 subsequent siblings)
14 siblings, 0 replies; 20+ messages in thread
From: Zhang Yi @ 2025-11-29 10:32 UTC (permalink / raw)
To: linux-ext4
Cc: linux-fsdevel, linux-kernel, tytso, adilger.kernel, jack, ojaswin,
yi.zhang, yi.zhang, yizhang089, libaokun1, yangerkun
From: Zhang Yi <yi.zhang@huawei.com>
Print a trace point after successfully inserting an extent in the
ext4_es_cache_extent() function. Additionally, similar to other extent
cache operation functions, call ext4_es_print_tree() to display the
extent debug information of the inode when in ES_DEBUG mode.
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
---
fs/ext4/extents_status.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index 48f04aef2f2e..0529c603ee88 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -1039,7 +1039,6 @@ void ext4_es_cache_extent(struct inode *inode, ext4_lblk_t lblk,
newes.es_lblk = lblk;
newes.es_len = len;
ext4_es_store_pblock_status(&newes, pblk, status);
- trace_ext4_es_cache_extent(inode, &newes);
if (!len)
return;
@@ -1065,6 +1064,8 @@ void ext4_es_cache_extent(struct inode *inode, ext4_lblk_t lblk,
}
}
__es_insert_extent(inode, &newes, NULL);
+ trace_ext4_es_cache_extent(inode, &newes);
+ ext4_es_print_tree(inode);
unlock:
write_unlock(&EXT4_I(inode)->i_es_lock);
if (!conflict)
--
2.46.1
* [PATCH v3 13/14] ext4: replace ext4_es_insert_extent() when caching on-disk extents
2025-11-29 10:32 [PATCH v3 00/14] ext4: replace ext4_es_insert_extent() when caching on-disk extents Zhang Yi
` (11 preceding siblings ...)
2025-11-29 10:32 ` [PATCH v3 12/14] ext4: adjust the debug info in ext4_es_cache_extent() Zhang Yi
@ 2025-11-29 10:32 ` Zhang Yi
2025-11-29 10:32 ` [PATCH v3 14/14] ext4: drop the TODO comment in ext4_es_insert_extent() Zhang Yi
2025-12-01 16:23 ` [PATCH v3 00/14] ext4: replace ext4_es_insert_extent() when caching on-disk extents Theodore Ts'o
14 siblings, 0 replies; 20+ messages in thread
From: Zhang Yi @ 2025-11-29 10:32 UTC (permalink / raw)
To: linux-ext4
Cc: linux-fsdevel, linux-kernel, tytso, adilger.kernel, jack, ojaswin,
yi.zhang, yi.zhang, yizhang089, libaokun1, yangerkun
From: Zhang Yi <yi.zhang@huawei.com>
In ext4, the remaining places for inserting extents into the extent
status tree within ext4_ext_determine_insert_hole() and
ext4_map_query_blocks() directly cache on-disk extents. We can use
ext4_es_cache_extent() instead of ext4_es_insert_extent() in these
cases. This will help reduce unnecessary increases in extent sequence
numbers and cache invalidations after supporting IOMAP in the future.
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
---
fs/ext4/extents.c | 3 +--
fs/ext4/inode.c | 18 +++++++++---------
2 files changed, 10 insertions(+), 11 deletions(-)
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 2cfce2c01208..27eb2c1df012 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4192,8 +4192,7 @@ static ext4_lblk_t ext4_ext_determine_insert_hole(struct inode *inode,
insert_hole:
/* Put just found gap into cache to speed up subsequent requests */
ext_debug(inode, " -> %u:%u\n", hole_start, len);
- ext4_es_insert_extent(inode, hole_start, len, ~0,
- EXTENT_STATUS_HOLE, false);
+ ext4_es_cache_extent(inode, hole_start, len, ~0, EXTENT_STATUS_HOLE);
/* Update hole_len to reflect hole size after lblk */
if (hole_start != lblk)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index eeb3ec4c2a9a..bb8165582840 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -504,8 +504,8 @@ static int ext4_map_query_blocks_next_in_leaf(handle_t *handle,
retval = ext4_ext_map_blocks(handle, inode, &map2, 0);
if (retval <= 0) {
- ext4_es_insert_extent(inode, map->m_lblk, map->m_len,
- map->m_pblk, status, false);
+ ext4_es_cache_extent(inode, map->m_lblk, map->m_len,
+ map->m_pblk, status);
return map->m_len;
}
@@ -526,13 +526,13 @@ static int ext4_map_query_blocks_next_in_leaf(handle_t *handle,
*/
if (map->m_pblk + map->m_len == map2.m_pblk &&
status == status2) {
- ext4_es_insert_extent(inode, map->m_lblk,
- map->m_len + map2.m_len, map->m_pblk,
- status, false);
+ ext4_es_cache_extent(inode, map->m_lblk,
+ map->m_len + map2.m_len, map->m_pblk,
+ status);
map->m_len += map2.m_len;
} else {
- ext4_es_insert_extent(inode, map->m_lblk, map->m_len,
- map->m_pblk, status, false);
+ ext4_es_cache_extent(inode, map->m_lblk, map->m_len,
+ map->m_pblk, status);
}
return map->m_len;
@@ -574,8 +574,8 @@ static int ext4_map_query_blocks(handle_t *handle, struct inode *inode,
map->m_len == orig_mlen) {
status = map->m_flags & EXT4_MAP_UNWRITTEN ?
EXTENT_STATUS_UNWRITTEN : EXTENT_STATUS_WRITTEN;
- ext4_es_insert_extent(inode, map->m_lblk, map->m_len,
- map->m_pblk, status, false);
+ ext4_es_cache_extent(inode, map->m_lblk, map->m_len,
+ map->m_pblk, status);
} else {
retval = ext4_map_query_blocks_next_in_leaf(handle, inode, map,
orig_mlen);
--
2.46.1
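
The merge condition in ext4_map_query_blocks_next_in_leaf() that decides what gets cached can be illustrated with a small userspace model. This is a sketch only: struct map and extent_to_cache() are hypothetical names, and the precondition that the second mapping starts logically right after the first is an assumption about the caller rather than something shown in this diff.

```c
#include <assert.h>

/* Simplified mapping: logical start, length, physical start, status. */
struct map {
	unsigned int lblk;
	unsigned int len;
	unsigned long long pblk;
	unsigned int status;
};

/*
 * Return the extent that would be cached: 'm' extended by 'm2' when the
 * two are physically contiguous and share a status, otherwise 'm' alone.
 */
struct map extent_to_cache(struct map m, struct map m2)
{
	/* Assumed precondition: m2 is the logically adjacent mapping. */
	assert(m2.lblk == m.lblk + m.len);

	if (m.pblk + m.len == m2.pblk && m.status == m2.status)
		m.len += m2.len;
	return m;
}
```

Either way a single call caches the result, so switching from ext4_es_insert_extent() to ext4_es_cache_extent() changes which interface is used and not how many extents are cached.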
* [PATCH v3 14/14] ext4: drop the TODO comment in ext4_es_insert_extent()
2025-11-29 10:32 [PATCH v3 00/14] ext4: replace ext4_es_insert_extent() when caching on-disk extents Zhang Yi
` (12 preceding siblings ...)
2025-11-29 10:32 ` [PATCH v3 13/14] ext4: replace ext4_es_insert_extent() when caching on-disk extents Zhang Yi
@ 2025-11-29 10:32 ` Zhang Yi
2025-12-01 16:23 ` [PATCH v3 00/14] ext4: replace ext4_es_insert_extent() when caching on-disk extents Theodore Ts'o
14 siblings, 0 replies; 20+ messages in thread
From: Zhang Yi @ 2025-11-29 10:32 UTC (permalink / raw)
To: linux-ext4
Cc: linux-fsdevel, linux-kernel, tytso, adilger.kernel, jack, ojaswin,
yi.zhang, yi.zhang, yizhang089, libaokun1, yangerkun
From: Zhang Yi <yi.zhang@huawei.com>
Now we have ext4_es_cache_extent() to cache on-disk extents instead of
ext4_es_insert_extent(), so drop the TODO comment.
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
---
fs/ext4/extents_status.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index 0529c603ee88..fc83e7e2ca9e 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -898,7 +898,8 @@ static int __es_insert_extent(struct inode *inode, struct extent_status *newes,
/*
* ext4_es_insert_extent() adds information to an inode's extent
- * status tree.
+ * status tree. This interface is used for modifying extents. To cache
+ * on-disk extents, use ext4_es_cache_extent() instead.
*/
void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
ext4_lblk_t len, ext4_fsblk_t pblk,
@@ -977,10 +978,6 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
}
pending = err3;
}
- /*
- * TODO: For cache on-disk extents, there is no need to increment
- * the sequence counter, this requires future optimization.
- */
ext4_es_inc_seq(inode);
error:
write_unlock(&EXT4_I(inode)->i_es_lock);
--
2.46.1
* Re: [PATCH v3 06/14] ext4: drop extent cache after doing PARTIAL_VALID1 zeroout
2025-11-29 10:32 ` [PATCH v3 06/14] ext4: drop extent cache after doing PARTIAL_VALID1 zeroout Zhang Yi
@ 2025-11-29 17:33 ` Ojaswin Mujoo
0 siblings, 0 replies; 20+ messages in thread
From: Ojaswin Mujoo @ 2025-11-29 17:33 UTC (permalink / raw)
To: Zhang Yi
Cc: linux-ext4, linux-fsdevel, linux-kernel, tytso, adilger.kernel,
jack, yi.zhang, yizhang089, libaokun1, yangerkun
On Sat, Nov 29, 2025 at 06:32:38PM +0800, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@huawei.com>
>
> When splitting an unwritten extent in the middle and converting it to
> initialized in ext4_split_extent() with the EXT4_EXT_MAY_ZEROOUT and
> EXT4_EXT_DATA_VALID2 flags set, it could leave a stale unwritten extent.
>
> Assume we have an unwritten file and do a buffered write in the middle of
> it without dioread_nolock enabled; the write will allocate blocks as a
> written extent.
>
> 0 A B N
> [UUUUUUUUUUUU] on-disk extent U: unwritten extent
> [UUUUUUUUUUUU] extent status tree
> [--DDDDDDDD--] D: valid data
> |<- ->| ----> this range needs to be initialized
>
> ext4_split_extent() first tries to split this extent at B with the
> EXT4_EXT_DATA_PARTIAL_VALID1 and EXT4_EXT_MAY_ZEROOUT flags set, but
> ext4_split_extent_at() fails to split this extent due to a temporary lack
> of space. It zeroes out B to N and leaves the entire extent as unwritten.
>
> 0 A B N
> [UUUUUUUUUUUU] on-disk extent
> [UUUUUUUUUUUU] extent status tree
> [--DDDDDDDDZZ] Z: zeroed data
>
> ext4_split_extent() then tries to split this extent at A with the
> EXT4_EXT_DATA_VALID2 flag set. This time, the split succeeds and leaves
> a written extent from A to N.
>
> 0 A B N
> [UUWWWWWWWWWW] on-disk extent W: written extent
> [UUUUUUUUUUUU] extent status tree
> [--DDDDDDDDZZ]
>
> Finally, ext4_map_create_blocks() only inserts the extent A to B into the
> extent status tree, leaving a stale unwritten extent in the status tree.
>
> 0 A B N
> [UUWWWWWWWWWW] on-disk extent W: written extent
> [UUWWWWWWWWUU] extent status tree
> [--DDDDDDDDZZ]
>
> Fix this issue by always dropping the cached extent status entry after
> zeroing out the second part.
>
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
> Reviewed-by: Baokun Li <libaokun1@huawei.com>
> Cc: stable@kernel.org
Okay so now we only drop the part that would have become stale. Looks
good to me.
Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Regards,
ojaswin
> ---
> fs/ext4/extents.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index be9fd2ab8667..1094e4923451 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -3319,8 +3319,16 @@ static struct ext4_ext_path *ext4_split_extent_at(handle_t *handle,
> * extent length and ext4_split_extent() split will the
> * first half again.
> */
> - if (split_flag & EXT4_EXT_DATA_PARTIAL_VALID1)
> + if (split_flag & EXT4_EXT_DATA_PARTIAL_VALID1) {
> + /*
> + * Drop extent cache to prevent stale unwritten
> + * extents remaining after zeroing out.
> + */
> + ext4_es_remove_extent(inode,
> + le32_to_cpu(zero_ex.ee_block),
> + ext4_ext_get_actual_len(&zero_ex));
> goto fix_extent_len;
> + }
>
> /* update the extent length and mark as initialized */
> ex->ee_len = cpu_to_le16(ee_len);
> --
> 2.46.1
>
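
The stale-entry scenario in the quoted commit message can be replayed with a toy userspace model. This is only a sketch: it uses one character per block ('U' unwritten, 'W' written, '.' not cached), fixes A = 2, B = 10 and N = 12 to match the diagrams, and replay() and its parameters are hypothetical names with no counterpart in ext4.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NBLOCKS 12

/*
 * Replay the three steps from the commit message. 'disk' and 'cache' must
 * each hold NBLOCKS + 1 bytes. Returns true if the cached view disagrees
 * with the on-disk view for any block that is still cached.
 */
bool replay(bool drop_cache_after_zeroout, char *disk, char *cache)
{
	const int a = 2, b = 10;
	int i;

	assert(disk && cache);

	/* Start: one unwritten extent, fully cached. */
	memset(disk, 'U', NBLOCKS);
	memset(cache, 'U', NBLOCKS);
	disk[NBLOCKS] = cache[NBLOCKS] = '\0';

	/* Step 1: split at B fails, B..N is zeroed, extent stays unwritten. */
	if (drop_cache_after_zeroout)
		memset(cache + b, '.', NBLOCKS - b);	/* the fix */

	/* Step 2: split at A succeeds, A..N becomes written on disk. */
	memset(disk + a, 'W', NBLOCKS - a);

	/* Step 3: the mapping code caches only A..B as written. */
	memset(cache + a, 'W', b - a);

	for (i = 0; i < NBLOCKS; i++)
		if (cache[i] != '.' && cache[i] != disk[i])
			return true;
	return false;
}
```

Without the drop, the cache ends as UUWWWWWWWWUU against an on-disk UUWWWWWWWWWW, which is the last diagram above; with the drop, blocks B..N are simply uncached and get re-read from disk on the next lookup.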
* Re: [PATCH v3 07/14] ext4: drop extent cache when splitting extent fails
2025-11-29 10:32 ` [PATCH v3 07/14] ext4: drop extent cache when splitting extent fails Zhang Yi
@ 2025-11-29 17:34 ` Ojaswin Mujoo
0 siblings, 0 replies; 20+ messages in thread
From: Ojaswin Mujoo @ 2025-11-29 17:34 UTC (permalink / raw)
To: Zhang Yi
Cc: linux-ext4, linux-fsdevel, linux-kernel, tytso, adilger.kernel,
jack, yi.zhang, yizhang089, libaokun1, yangerkun
On Sat, Nov 29, 2025 at 06:32:39PM +0800, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@huawei.com>
>
> When splitting an extent fails, we might leave some extents still being
> processed and return an error directly, which will result in stale
> extent entries remaining in the extent status tree. So drop all of the
> remaining potentially stale extents if the splitting fails.
>
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
> Reviewed-by: Baokun Li <libaokun1@huawei.com>
> Cc: stable@kernel.org
Looks good, feel free to add:
Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Regards,
ojaswin
> ---
> fs/ext4/extents.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 1094e4923451..945995d68c4d 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -3267,7 +3267,7 @@ static struct ext4_ext_path *ext4_split_extent_at(handle_t *handle,
>
> err = PTR_ERR(path);
> if (err != -ENOSPC && err != -EDQUOT && err != -ENOMEM)
> - return path;
> + goto out_path;
>
> /*
> * Get a new path to try to zeroout or fix the extent length.
> @@ -3281,7 +3281,7 @@ static struct ext4_ext_path *ext4_split_extent_at(handle_t *handle,
> if (IS_ERR(path)) {
> EXT4_ERROR_INODE(inode, "Failed split extent on %u, err %ld",
> split, PTR_ERR(path));
> - return path;
> + goto out_path;
> }
> depth = ext_depth(inode);
> ex = path[depth].p_ext;
> @@ -3358,6 +3358,10 @@ static struct ext4_ext_path *ext4_split_extent_at(handle_t *handle,
> ext4_free_ext_path(path);
> path = ERR_PTR(err);
> }
> +out_path:
> + if (IS_ERR(path))
> + /* Remove all remaining potentially stale extents. */
> + ext4_es_remove_extent(inode, ee_block, ee_len);
> ext4_ext_show_leaf(inode, path);
> return path;
> }
> --
> 2.46.1
>
* Re: [PATCH v3 00/14] ext4: replace ext4_es_insert_extent() when caching on-disk extents
2025-11-29 10:32 [PATCH v3 00/14] ext4: replace ext4_es_insert_extent() when caching on-disk extents Zhang Yi
` (13 preceding siblings ...)
2025-11-29 10:32 ` [PATCH v3 14/14] ext4: drop the TODO comment in ext4_es_insert_extent() Zhang Yi
@ 2025-12-01 16:23 ` Theodore Ts'o
2025-12-01 16:42 ` Theodore Tso
14 siblings, 1 reply; 20+ messages in thread
From: Theodore Ts'o @ 2025-12-01 16:23 UTC (permalink / raw)
To: linux-ext4, Zhang Yi
Cc: Theodore Ts'o, linux-fsdevel, linux-kernel, adilger.kernel,
jack, ojaswin, yi.zhang, yizhang089, libaokun1, yangerkun
On Sat, 29 Nov 2025 18:32:32 +0800, Zhang Yi wrote:
> Changes since v2:
> - Rebase the codes on ext4.git dev-91ef18b567da.
> - Move the first cleanup patch in v2 to patch 08 to facilitate easier
> backporting.
> - In patch 01, correct the mismatch comments for
> EXT4_EXT_DATA_ENTIRE_VALID1 and EXT4_EXT_DATA_PARTIAL_VALID1.
> - Modify patch 06 and add 07, cleanup the commit message to avoid
> confusion, and don't always drop extent cache before splitting
> extent, instead, do this only after PARTIAL_VALID1 zeroed out or
> split extent fails.
> - In patch 08, mark zero_ex to initialized.
> - In patch 09, correct the word 'tag' to 'label' in the commit message.
> - In patch 11, add return value check of __es_remove_extent() in
> ext4_es_cache_extent().
> - Collecting RVB tags.
>
> [...]
Applied, thanks!
[01/14] ext4: subdivide EXT4_EXT_DATA_VALID1
commit: 0f9885eab9182118fd7bfd8cdf8bab6f71f74699
[02/14] ext4: don't zero the entire extent if EXT4_EXT_DATA_PARTIAL_VALID1
commit: 1fec988b1f71c27c45d31cde6ffe3efdb10657b9
[03/14] ext4: don't set EXT4_GET_BLOCKS_CONVERT when splitting before submitting I/O
commit: c42e9f199c419f11938b8d411123e3f6719941d4
[04/14] ext4: correct the mapping status if the extent has been zeroed
commit: 2410e55561cc405c56b9e38d69be1b8fdb6c9722
[05/14] ext4: don't cache extent during splitting extent
commit: 4b4a6ac831ff347127e46c60a516b3ec42921242
[06/14] ext4: drop extent cache after doing PARTIAL_VALID1 zeroout
commit: 87d5cb059b8ab1623f5bcebcc0b53e43abd36ae7
[07/14] ext4: drop extent cache when splitting extent fails
commit: 889085343ddffdf9ccb6be8402469458da6b350f
[08/14] ext4: cleanup zeroout in ext4_split_extent_at()
commit: 02f8dc1707ceb87656288e6460f3ebb94200ba2c
[09/14] ext4: cleanup useless out label in __es_remove_extent()
commit: 13cbc168d9ba14822de66fc085e85416cc2fda8e
[10/14] ext4: make __es_remove_extent() check extent status
commit: ad02a3d000a512aada99cfad13d62c3edfb793de
[11/14] ext4: make ext4_es_cache_extent() support overwrite existing extents
commit: 41a414d53bfb5c91ea5c73125181568901c74a7a
[12/14] ext4: adjust the debug info in ext4_es_cache_extent()
commit: 4e84970a460d27f35f3127327c3e131476c06b03
[13/14] ext4: replace ext4_es_insert_extent() when caching on-disk extents
commit: d494567091eddfeded77017bb9b4dc677046d93d
[14/14] ext4: drop the TODO comment in ext4_es_insert_extent()
commit: 6fb67ac896900e60f46ee4efba97b372a80370e0
Best regards,
--
Theodore Ts'o <tytso@mit.edu>
* Re: [PATCH v3 00/14] ext4: replace ext4_es_insert_extent() when caching on-disk extents
2025-12-01 16:23 ` [PATCH v3 00/14] ext4: replace ext4_es_insert_extent() when caching on-disk extents Theodore Ts'o
@ 2025-12-01 16:42 ` Theodore Tso
2025-12-02 1:15 ` Zhang Yi
0 siblings, 1 reply; 20+ messages in thread
From: Theodore Tso @ 2025-12-01 16:42 UTC (permalink / raw)
To: linux-ext4, Zhang Yi
Cc: linux-fsdevel, linux-kernel, adilger.kernel, jack, ojaswin,
yi.zhang, yizhang089, libaokun1, yangerkun
On Mon, Dec 01, 2025 at 11:23:50AM -0500, Theodore Ts'o wrote:
> Applied, thanks!
n.b. This is on the dev branch, but I plan to not include it in the initial
pull request to Linus, so it can get a bit more soak testing. I'll
send to Linus after -rc1.
- Ted
* Re: [PATCH v3 00/14] ext4: replace ext4_es_insert_extent() when caching on-disk extents
2025-12-01 16:42 ` Theodore Tso
@ 2025-12-02 1:15 ` Zhang Yi
0 siblings, 0 replies; 20+ messages in thread
From: Zhang Yi @ 2025-12-02 1:15 UTC (permalink / raw)
To: Theodore Tso, linux-ext4
Cc: linux-fsdevel, linux-kernel, adilger.kernel, jack, ojaswin,
yi.zhang, yizhang089, libaokun1, yangerkun
On 12/2/2025 12:42 AM, Theodore Tso wrote:
> On Mon, Dec 01, 2025 at 11:23:50AM -0500, Theodore Ts'o wrote:
>> Applied, thanks!
>
> n.b. This is on the dev branch, but I plan to not include it in the initial
> pull request to Linus, so it can get a bit more soak testing. I'll
> send to Linus after -rc1.
>
> - Ted
Sure, thank you. It would be better to do more testing. Please let me
know if there is any regression. :-)
Thanks,
Yi.
Thread overview: 20+ messages
2025-11-29 10:32 [PATCH v3 00/14] ext4: replace ext4_es_insert_extent() when caching on-disk extents Zhang Yi
2025-11-29 10:32 ` [PATCH v3 01/14] ext4: subdivide EXT4_EXT_DATA_VALID1 Zhang Yi
2025-11-29 10:32 ` [PATCH v3 02/14] ext4: don't zero the entire extent if EXT4_EXT_DATA_PARTIAL_VALID1 Zhang Yi
2025-11-29 10:32 ` [PATCH v3 03/14] ext4: don't set EXT4_GET_BLOCKS_CONVERT when splitting before submitting I/O Zhang Yi
2025-11-29 10:32 ` [PATCH v3 04/14] ext4: correct the mapping status if the extent has been zeroed Zhang Yi
2025-11-29 10:32 ` [PATCH v3 05/14] ext4: don't cache extent during splitting extent Zhang Yi
2025-11-29 10:32 ` [PATCH v3 06/14] ext4: drop extent cache after doing PARTIAL_VALID1 zeroout Zhang Yi
2025-11-29 17:33 ` Ojaswin Mujoo
2025-11-29 10:32 ` [PATCH v3 07/14] ext4: drop extent cache when splitting extent fails Zhang Yi
2025-11-29 17:34 ` Ojaswin Mujoo
2025-11-29 10:32 ` [PATCH v3 08/14] ext4: cleanup zeroout in ext4_split_extent_at() Zhang Yi
2025-11-29 10:32 ` [PATCH v3 09/14] ext4: cleanup useless out label in __es_remove_extent() Zhang Yi
2025-11-29 10:32 ` [PATCH v3 10/14] ext4: make __es_remove_extent() check extent status Zhang Yi
2025-11-29 10:32 ` [PATCH v3 11/14] ext4: make ext4_es_cache_extent() support overwrite existing extents Zhang Yi
2025-11-29 10:32 ` [PATCH v3 12/14] ext4: adjust the debug info in ext4_es_cache_extent() Zhang Yi
2025-11-29 10:32 ` [PATCH v3 13/14] ext4: replace ext4_es_insert_extent() when caching on-disk extents Zhang Yi
2025-11-29 10:32 ` [PATCH v3 14/14] ext4: drop the TODO comment in ext4_es_insert_extent() Zhang Yi
2025-12-01 16:23 ` [PATCH v3 00/14] ext4: replace ext4_es_insert_extent() when caching on-disk extents Theodore Ts'o
2025-12-01 16:42 ` Theodore Tso
2025-12-02 1:15 ` Zhang Yi