From: Zhang Yi <yi.zhang@huaweicloud.com>
To: linux-ext4@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz,
yi.zhang@huawei.com, yi.zhang@huaweicloud.com,
libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com
Subject: [PATCH v3 03/12] ext4: make ext4_es_lookup_extent() pass out the extent seq counter
Date: Fri, 10 Oct 2025 18:33:17 +0800
Message-ID: <20251010103326.3353700-4-yi.zhang@huaweicloud.com>
In-Reply-To: <20251010103326.3353700-1-yi.zhang@huaweicloud.com>
From: Zhang Yi <yi.zhang@huawei.com>
When querying extents in the extent status tree, the caller must hold
data_sem to also obtain the sequence number as a valid cookie for the
extent. However, ext4_map_blocks() currently calls
ext4_es_lookup_extent() without holding data_sem. Therefore, rely on
i_es_lock instead, which also ensures that the sequence cookie and the
extent remain consistent. Consequently, make ext4_es_lookup_extent()
pass out the sequence number when necessary.
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
---
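Note for readers: the following is a minimal user-space sketch of the
idea in the commit message, not kernel code. It assumes a one-entry
"tree", a C11 atomic_flag spinlock standing in for i_es_lock, and a
counter standing in for i_es_seq that is bumped on every modification.
All model_* names are hypothetical. It only illustrates why the extent
and the cookie are copied out inside the same locked section, and how
a caller might later compare the cookie.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct extent_status. */
struct model_extent {
	uint32_t lblk;	/* first logical block */
	uint32_t len;	/* number of blocks */
	uint64_t pblk;	/* first physical block */
};

/* Hypothetical stand-in for the per-inode extent status state. */
struct model_inode {
	atomic_flag es_lock;		/* models i_es_lock (exclusive only) */
	uint64_t es_seq;		/* models i_es_seq */
	struct model_extent cached;	/* one-entry "tree" */
	bool has_extent;
};

static void model_lock(struct model_inode *mi)
{
	while (atomic_flag_test_and_set_explicit(&mi->es_lock,
						 memory_order_acquire))
		;	/* spin until the flag is released */
}

static void model_unlock(struct model_inode *mi)
{
	atomic_flag_clear_explicit(&mi->es_lock, memory_order_release);
}

static void model_init(struct model_inode *mi)
{
	atomic_flag_clear(&mi->es_lock);
	mi->es_seq = 0;
	mi->has_extent = false;
}

/* Writer: replace the cached extent and bump the counter under the lock. */
static void model_insert(struct model_inode *mi, uint32_t lblk,
			 uint32_t len, uint64_t pblk)
{
	model_lock(mi);
	mi->cached = (struct model_extent){ .lblk = lblk, .len = len,
					    .pblk = pblk };
	mi->has_extent = true;
	mi->es_seq++;
	model_unlock(mi);
}

/*
 * Reader: copy out the extent and, if pseq is non-NULL, the counter,
 * both inside one locked section so they describe the same state.
 * Returns 1 on found, 0 on not.
 */
static int model_lookup(struct model_inode *mi, uint32_t lblk,
			struct model_extent *es, uint64_t *pseq)
{
	int found = 0;

	model_lock(mi);
	if (mi->has_extent && lblk >= mi->cached.lblk &&
	    lblk - mi->cached.lblk < mi->cached.len) {
		*es = mi->cached;
		if (pseq)
			*pseq = mi->es_seq;
		found = 1;
	}
	model_unlock(mi);
	return found;
}

/* Hypothetical caller-side check: has anything changed since the lookup? */
static bool model_cookie_valid(struct model_inode *mi, uint64_t cookie)
{
	bool ok;

	model_lock(mi);
	ok = (mi->es_seq == cookie);
	model_unlock(mi);
	return ok;
}
```

In this model, a caller that passes NULL for pseq behaves exactly as
before, which mirrors how the existing call sites in this patch are
converted.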
 fs/ext4/extents.c        | 2 +-
 fs/ext4/extents_status.c | 6 ++++--
 fs/ext4/extents_status.h | 2 +-
 fs/ext4/inode.c          | 8 ++++----
 4 files changed, 10 insertions(+), 8 deletions(-)
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index ca5499e9412b..c7d219e6c6d8 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -2213,7 +2213,7 @@ static int ext4_fill_es_cache_info(struct inode *inode,
 	while (block <= end) {
 		next = 0;
 		flags = 0;
-		if (!ext4_es_lookup_extent(inode, block, &next, &es))
+		if (!ext4_es_lookup_extent(inode, block, &next, &es, NULL))
 			break;
 		if (ext4_es_is_unwritten(&es))
 			flags |= FIEMAP_EXTENT_UNWRITTEN;
diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index c3daa57ecd35..e04fbf10fe4f 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -1039,8 +1039,8 @@ void ext4_es_cache_extent(struct inode *inode, ext4_lblk_t lblk,
  * Return: 1 on found, 0 on not
  */
 int ext4_es_lookup_extent(struct inode *inode, ext4_lblk_t lblk,
-			  ext4_lblk_t *next_lblk,
-			  struct extent_status *es)
+			  ext4_lblk_t *next_lblk, struct extent_status *es,
+			  u64 *pseq)
 {
 	struct ext4_es_tree *tree;
 	struct ext4_es_stats *stats;
@@ -1099,6 +1099,8 @@ int ext4_es_lookup_extent(struct inode *inode, ext4_lblk_t lblk,
 			} else
 				*next_lblk = 0;
 		}
+		if (pseq)
+			*pseq = EXT4_I(inode)->i_es_seq;
 	} else {
 		percpu_counter_inc(&stats->es_stats_cache_misses);
 	}
diff --git a/fs/ext4/extents_status.h b/fs/ext4/extents_status.h
index 8f9c008d11e8..f3396cf32b44 100644
--- a/fs/ext4/extents_status.h
+++ b/fs/ext4/extents_status.h
@@ -148,7 +148,7 @@ extern void ext4_es_find_extent_range(struct inode *inode,
 				      struct extent_status *es);
 extern int ext4_es_lookup_extent(struct inode *inode, ext4_lblk_t lblk,
 				 ext4_lblk_t *next_lblk,
-				 struct extent_status *es);
+				 struct extent_status *es, u64 *pseq);
 extern bool ext4_es_scan_range(struct inode *inode,
 			       int (*matching_fn)(struct extent_status *es),
 			       ext4_lblk_t lblk, ext4_lblk_t end);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index f9e4ac87211e..10792772b450 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -649,7 +649,7 @@ static int ext4_map_create_blocks(handle_t *handle, struct inode *inode,
 	 * extent status tree.
 	 */
 	if (flags & EXT4_GET_BLOCKS_PRE_IO &&
-	    ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es)) {
+	    ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es, NULL)) {
 		if (ext4_es_is_written(&es))
 			return retval;
 	}
@@ -723,7 +723,7 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
 	ext4_check_map_extents_env(inode);
 
 	/* Lookup extent status tree firstly */
-	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es)) {
+	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es, NULL)) {
 		if (ext4_es_is_written(&es) || ext4_es_is_unwritten(&es)) {
 			map->m_pblk = ext4_es_pblock(&es) +
 					map->m_lblk - es.es_lblk;
@@ -1908,7 +1908,7 @@ static int ext4_da_map_blocks(struct inode *inode, struct ext4_map_blocks *map)
 	ext4_check_map_extents_env(inode);
 
 	/* Lookup extent status tree firstly */
-	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es)) {
+	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es, NULL)) {
 		map->m_len = min_t(unsigned int, map->m_len,
 				   es.es_len - (map->m_lblk - es.es_lblk));
 
@@ -1961,7 +1961,7 @@ static int ext4_da_map_blocks(struct inode *inode, struct ext4_map_blocks *map)
 	 * is held in write mode, before inserting a new da entry in
 	 * the extent status tree.
 	 */
-	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es)) {
+	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es, NULL)) {
 		map->m_len = min_t(unsigned int, map->m_len,
 				   es.es_len - (map->m_lblk - es.es_lblk));
 
--
2.46.1
Thread overview: 16+ messages
2025-10-10 10:33 [PATCH v3 00/12] ext4: optimize online defragment Zhang Yi
2025-10-10 10:33 ` [PATCH v3 01/12] ext4: correct the checking of quota files before moving extents Zhang Yi
2025-10-10 10:33 ` [PATCH v3 02/12] ext4: introduce seq counter for the extent status entry Zhang Yi
2025-10-10 10:33 ` [PATCH v3 03/12] ext4: make ext4_es_lookup_extent() pass out the extent seq counter Zhang Yi
2025-10-10 10:33 ` [PATCH v3 04/12] ext4: pass out extent seq counter when mapping blocks Zhang Yi
2025-10-10 10:33 ` [PATCH v3 05/12] ext4: use EXT4_B_TO_LBLK() in mext_check_arguments() Zhang Yi
2025-10-10 10:33 ` [PATCH v3 06/12] ext4: add mext_check_validity() to do basic check Zhang Yi
2025-10-10 10:33 ` [PATCH v3 07/12] ext4: refactor mext_check_arguments() Zhang Yi
2025-10-10 10:33 ` [PATCH v3 08/12] ext4: rename mext_page_mkuptodate() to mext_folio_mkuptodate() Zhang Yi
2025-10-10 10:33 ` [PATCH v3 09/12] ext4: introduce mext_move_extent() Zhang Yi
2025-10-10 13:38 ` Jan Kara
2025-10-11 1:20 ` Zhang Yi
2025-10-10 10:33 ` [PATCH v3 10/12] ext4: switch to using the new extent movement method Zhang Yi
2025-10-10 13:41 ` Jan Kara
2025-10-10 10:33 ` [PATCH v3 11/12] ext4: add large folios support for moving extents Zhang Yi
2025-10-10 10:33 ` [PATCH v3 12/12] ext4: add two trace points " Zhang Yi