From: Zhang Yi <yizhang089@gmail.com>
To: Ojaswin Mujoo <ojaswin@linux.ibm.com>,
linux-ext4@vger.kernel.org, Theodore Ts'o <tytso@mit.edu>
Cc: Ritesh Harjani <ritesh.list@gmail.com>,
Zhang Yi <yi.zhang@huawei.com>, Jan Kara <jack@suse.cz>,
libaokun1@huawei.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 8/8] ext4: Allow zeroout when doing written to unwritten split
Date: Sat, 17 Jan 2026 17:01:30 +0800
Message-ID: <3757a5d3-624b-4705-b1c0-e33e6adce340@gmail.com>
In-Reply-To: <16dc2c0921f482fd3dc6fa1d5bbae64eaba591eb.1768402426.git.ojaswin@linux.ibm.com>
On 1/14/2026 10:57 PM, Ojaswin Mujoo wrote:
> Currently, when we are doing an extent split and convert operation of
> written to unwritten extent (example, as done by ZERO_RANGE), we don't
> allow the zeroout fallback in case the extent tree manipulation fails.
> This is mostly because zeroout might take unusually long and the fact that
> this code path is more tolerant to failures than endio.
>
> Since we have zeroout machinery in place, we might as well use it and
> lift this restriction. To mitigate zeroout taking too long, respect the
> max zeroout limit here so that the operation finishes relatively fast.
>
> Also, add kunit tests for this case.
>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> Reviewed-by: Jan Kara <jack@suse.cz>
It looks good to me.
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
> ---
> fs/ext4/extents-test.c | 71 ++++++++++++++++++++++++++++++++++++++++++
> fs/ext4/extents.c | 33 +++++++++++++++-----
> 2 files changed, 96 insertions(+), 8 deletions(-)
>
> diff --git a/fs/ext4/extents-test.c b/fs/ext4/extents-test.c
> index 86fcac66be6f..d3a26cc8a9ad 100644
> --- a/fs/ext4/extents-test.c
> +++ b/fs/ext4/extents-test.c
> @@ -578,6 +578,41 @@ static const struct kunit_ext_test_param test_split_convert_params[] = {
> { .exp_char = 'X', .off_blk = 1, .len_blk = 1 },
> { .exp_char = 0, .off_blk = 2, .len_blk = 1 } } },
>
> + /* writ to unwrit splits */
> + { .desc = "split writ extent to 2 extents and convert 1st half unwrit (zeroout)",
> + .type = TEST_SPLIT_CONVERT,
> + .is_unwrit_at_start = 0,
> + .split_flags = EXT4_GET_BLOCKS_CONVERT_UNWRITTEN,
> + .split_map = { .m_lblk = 10, .m_len = 1 },
> + .nr_exp_ext = 1,
> + .exp_ext_state = { { .ex_lblk = 10, .ex_len = 3, .is_unwrit = 0 } },
> + .is_zeroout_test = 1,
> + .nr_exp_data_segs = 2,
> + .exp_data_state = { { .exp_char = 0, .off_blk = 0, .len_blk = 1 },
> + { .exp_char = 'X', .off_blk = 1, .len_blk = 2 }}},
> + { .desc = "split writ extent to 2 extents and convert 2nd half unwrit (zeroout)",
> + .type = TEST_SPLIT_CONVERT,
> + .is_unwrit_at_start = 0,
> + .split_flags = EXT4_GET_BLOCKS_CONVERT_UNWRITTEN,
> + .split_map = { .m_lblk = 11, .m_len = 2 },
> + .nr_exp_ext = 1,
> + .exp_ext_state = { { .ex_lblk = 10, .ex_len = 3, .is_unwrit = 0 } },
> + .is_zeroout_test = 1,
> + .nr_exp_data_segs = 2,
> + .exp_data_state = { { .exp_char = 'X', .off_blk = 0, .len_blk = 1 },
> + { .exp_char = 0, .off_blk = 1, .len_blk = 2 } } },
> + { .desc = "split writ extent to 3 extents and convert 2nd half unwrit (zeroout)",
> + .type = TEST_SPLIT_CONVERT,
> + .is_unwrit_at_start = 0,
> + .split_flags = EXT4_GET_BLOCKS_CONVERT_UNWRITTEN,
> + .split_map = { .m_lblk = 11, .m_len = 1 },
> + .nr_exp_ext = 1,
> + .exp_ext_state = { { .ex_lblk = 10, .ex_len = 3, .is_unwrit = 0 } },
> + .is_zeroout_test = 1,
> + .nr_exp_data_segs = 3,
> + .exp_data_state = { { .exp_char = 'X', .off_blk = 0, .len_blk = 1 },
> + { .exp_char = 0, .off_blk = 1, .len_blk = 1 },
> + { .exp_char = 'X', .off_blk = 2, .len_blk = 1 }}},
> };
>
> static const struct kunit_ext_test_param test_convert_initialized_params[] = {
> @@ -610,6 +645,42 @@ static const struct kunit_ext_test_param test_convert_initialized_params[] = {
> { .ex_lblk = 11, .ex_len = 1, .is_unwrit = 1 },
> { .ex_lblk = 12, .ex_len = 1, .is_unwrit = 0 } },
> .is_zeroout_test = 0 },
> +
> + /* writ to unwrit splits */
> + { .desc = "split writ extent to 2 extents and convert 1st half unwrit (zeroout)",
> + .type = TEST_CREATE_BLOCKS,
> + .is_unwrit_at_start = 0,
> + .split_flags = EXT4_GET_BLOCKS_CONVERT_UNWRITTEN,
> + .split_map = { .m_lblk = 10, .m_len = 1 },
> + .nr_exp_ext = 1,
> + .exp_ext_state = { { .ex_lblk = 10, .ex_len = 3, .is_unwrit = 0 } },
> + .is_zeroout_test = 1,
> + .nr_exp_data_segs = 2,
> + .exp_data_state = { { .exp_char = 0, .off_blk = 0, .len_blk = 1 },
> + { .exp_char = 'X', .off_blk = 1, .len_blk = 2 }}},
> + { .desc = "split writ extent to 2 extents and convert 2nd half unwrit (zeroout)",
> + .type = TEST_CREATE_BLOCKS,
> + .is_unwrit_at_start = 0,
> + .split_flags = EXT4_GET_BLOCKS_CONVERT_UNWRITTEN,
> + .split_map = { .m_lblk = 11, .m_len = 2 },
> + .nr_exp_ext = 1,
> + .exp_ext_state = { { .ex_lblk = 10, .ex_len = 3, .is_unwrit = 0 } },
> + .is_zeroout_test = 1,
> + .nr_exp_data_segs = 2,
> + .exp_data_state = { { .exp_char = 'X', .off_blk = 0, .len_blk = 1 },
> + { .exp_char = 0, .off_blk = 1, .len_blk = 2 } } },
> + { .desc = "split writ extent to 3 extents and convert 2nd half unwrit (zeroout)",
> + .type = TEST_CREATE_BLOCKS,
> + .is_unwrit_at_start = 0,
> + .split_flags = EXT4_GET_BLOCKS_CONVERT_UNWRITTEN,
> + .split_map = { .m_lblk = 11, .m_len = 1 },
> + .nr_exp_ext = 1,
> + .exp_ext_state = { { .ex_lblk = 10, .ex_len = 3, .is_unwrit = 0 } },
> + .is_zeroout_test = 1,
> + .nr_exp_data_segs = 3,
> + .exp_data_state = { { .exp_char = 'X', .off_blk = 0, .len_blk = 1 },
> + { .exp_char = 0, .off_blk = 1, .len_blk = 1 },
> + { .exp_char = 'X', .off_blk = 2, .len_blk = 1 }}},
> };
>
> static const struct kunit_ext_test_param test_handle_unwritten_params[] = {
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 8ade9c68ddd8..4c6e4e7a80b0 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -3463,6 +3463,15 @@ static struct ext4_ext_path *ext4_split_extent(handle_t *handle,
> */
> goto out_orig_err;
>
> + if (flags & EXT4_GET_BLOCKS_CONVERT_UNWRITTEN) {
> + int max_zeroout_blks =
> + EXT4_SB(inode->i_sb)->s_extent_max_zeroout_kb >>
> + (inode->i_sb->s_blocksize_bits - 10);
> +
> + if (map->m_len > max_zeroout_blks)
> + goto out_orig_err;
> + }
> +
> path = ext4_find_extent(inode, map->m_lblk, NULL, flags);
> if (IS_ERR(path))
> goto out_orig_err;
> @@ -3818,15 +3827,10 @@ static struct ext4_ext_path *ext4_split_convert_extents(handle_t *handle,
> goto convert;
>
> /*
> - * We don't use zeroout fallback for written to unwritten conversion as
> - * it is not as critical as endio and it might take unusually long.
> - * Also, it is only safe to convert extent to initialized via explicit
> + * It is only safe to convert extent to initialized via explicit
> * zeroout only if extent is fully inside i_size or new_size.
> */
> - if (!(flags & EXT4_GET_BLOCKS_CONVERT_UNWRITTEN))
> - split_flag |= ee_block + ee_len <= eof_block ?
> - EXT4_EXT_MAY_ZEROOUT :
> - 0;
> + split_flag |= ee_block + ee_len <= eof_block ? EXT4_EXT_MAY_ZEROOUT : 0;
>
> /*
> * pass SPLIT_NOMERGE explicitly so we don't end up merging extents we
> @@ -3948,7 +3952,20 @@ convert_initialized_extent(handle_t *handle, struct inode *inode,
>
> ext4_update_inode_fsync_trans(handle, inode, 1);
>
> - map->m_flags |= EXT4_MAP_UNWRITTEN;
> + /*
> + * The extent might be initialized in case of zeroout.
> + */
> + path = ext4_find_extent(inode, map->m_lblk, path, flags);
> + if (IS_ERR(path))
> + return path;
> +
> + depth = ext_depth(inode);
> + ex = path[depth].p_ext;
> +
> + if (ext4_ext_is_unwritten(ex))
> + map->m_flags |= EXT4_MAP_UNWRITTEN;
> + else
> + map->m_flags |= EXT4_MAP_MAPPED;
> if (*allocated > map->m_len)
> *allocated = map->m_len;
> map->m_len = *allocated;
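
For readers following the limit check that the patch adds to ext4_split_extent(), the arithmetic can be sketched in isolation. This is only a userspace illustration, not kernel code: the helper names max_zeroout_blocks() and zeroout_allowed() are hypothetical, and the sketch assumes blocksize_bits is at least 10 (a 1 KiB block), so the shift count is never negative.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): convert a zeroout limit given in
 * KiB into a count of filesystem blocks. blocksize_bits is log2 of the block
 * size in bytes, e.g. 12 for 4 KiB blocks. Since 1 KiB == 1 << 10 bytes,
 * shifting right by (blocksize_bits - 10) divides by "KiB per block".
 * Assumes blocksize_bits >= 10.
 */
static inline uint32_t max_zeroout_blocks(uint32_t max_zeroout_kb,
					  unsigned int blocksize_bits)
{
	return max_zeroout_kb >> (blocksize_bits - 10);
}

/*
 * Hypothetical helper mirroring the shape of the check in the hunk above:
 * a request longer than the limit is not eligible for the zeroout fallback.
 */
static inline bool zeroout_allowed(uint32_t len_blks, uint32_t max_zeroout_kb,
				   unsigned int blocksize_bits)
{
	return len_blks <= max_zeroout_blocks(max_zeroout_kb, blocksize_bits);
}
```

As an example, with a 32 KiB limit and 4 KiB blocks (blocksize_bits = 12) the limit works out to 8 blocks, so an 8-block range would be eligible for zeroout while a 9-block range would not.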