* [PATCH 0/4] btrfs: some cleanups for two ctree functions
@ 2025-12-09 3:27 Sun YangKai
2025-12-09 3:27 ` [PATCH 1/4] btrfs: don't set @return_any for btrfs_search_slot_for_read in btrfs_read_qgroup_config Sun YangKai
` (3 more replies)
0 siblings, 4 replies; 13+ messages in thread
From: Sun YangKai @ 2025-12-09 3:27 UTC (permalink / raw)
To: linux-btrfs; +Cc: Sun YangKai
The first three patches is cleanups for btrfs_search_slot_for_read().
There're only two callers that set @return_any to 1 and both of them
is unnecessary. So @return_any and related logic could be removed.
The last patch is cleanup for btrfs_prev_leaf(). Callers expect a
valid p->slots[0] to read the item's key. So make sure btrfs_prev_leaf()
return a valid p->slots[0] and remove related checks in callers.
Sun YangKai (4):
btrfs: don't set @return_any for btrfs_search_slot_for_read in
btrfs_read_qgroup_config
btrfs: don't set return_any @return_any for btrfs_search_slot_for_read
in get_last_extent()
btrfs: cleanup btrfs_search_slot_for_read()
btrfs: ctree: cleanup btrfs_prev_leaf()
fs/btrfs/ctree.c | 141 ++++++-------------------------------
fs/btrfs/ctree.h | 3 +-
fs/btrfs/free-space-tree.c | 2 +-
fs/btrfs/qgroup.c | 10 +--
fs/btrfs/send.c | 22 +++---
5 files changed, 40 insertions(+), 138 deletions(-)
--
2.51.2
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 1/4] btrfs: don't set @return_any for btrfs_search_slot_for_read in btrfs_read_qgroup_config
2025-12-09 3:27 [PATCH 0/4] btrfs: some cleanups for two ctree functions Sun YangKai
@ 2025-12-09 3:27 ` Sun YangKai
2025-12-09 3:27 ` [PATCH 2/4] btrfs: don't set return_any @return_any for btrfs_search_slot_for_read in get_last_extent() Sun YangKai
` (2 subsequent siblings)
3 siblings, 0 replies; 13+ messages in thread
From: Sun YangKai @ 2025-12-09 3:27 UTC (permalink / raw)
To: linux-btrfs; +Cc: Sun YangKai
The call to `btrfs_search_slot_for_read` in `btrfs_read_qgroup_config` is
intended to find the very first item in the quota root tree to initiate
iteration.
The search key is set to all zeros: (0, 0, 0).
The current call uses `return_any=1`:
`btrfs_search_slot_for_read(..., find_higher=1, return_any=1)`
With `find_higher=1`, the function searches for an item greater than
(0, 0, 0). The `return_any=1` flag provides a fallback: if no higher item
is found, it attempts to return the next lower item instead.
Since the search key (0, 0, 0) represents the absolute floor of the btrfs
key space (u64 objectid), there can be no valid key lower than it. The
`return_any` fallback logic is therefore pointless and misleading in
this context.
Change `return_any` from `1` to `0` to correctly express the intention:
find the first item strictly higher than (0, 0, 0), and if no such item
exists, simply return 'not found' (1) without attempting an unnecessary
and impossible search for a lower key.
Signed-off-by: Sun YangKai <sunk67188@gmail.com>
---
fs/btrfs/qgroup.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 9e2b53e90dcb..d780980e6790 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -415,7 +415,7 @@ int btrfs_read_qgroup_config(struct btrfs_fs_info *fs_info)
key.objectid = 0;
key.type = 0;
key.offset = 0;
- ret = btrfs_search_slot_for_read(quota_root, &key, path, 1, 1);
+ ret = btrfs_search_slot_for_read(quota_root, &key, path, 1, 0);
if (ret)
goto out;
--
2.51.2
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 2/4] btrfs: don't set return_any @return_any for btrfs_search_slot_for_read in get_last_extent()
2025-12-09 3:27 [PATCH 0/4] btrfs: some cleanups for two ctree functions Sun YangKai
2025-12-09 3:27 ` [PATCH 1/4] btrfs: don't set @return_any for btrfs_search_slot_for_read in btrfs_read_qgroup_config Sun YangKai
@ 2025-12-09 3:27 ` Sun YangKai
2025-12-09 3:27 ` [PATCH 3/4] btrfs: cleanup btrfs_search_slot_for_read() Sun YangKai
2025-12-09 3:27 ` [PATCH 4/4] btrfs: ctree: cleanup btrfs_prev_leaf() Sun YangKai
3 siblings, 0 replies; 13+ messages in thread
From: Sun YangKai @ 2025-12-09 3:27 UTC (permalink / raw)
To: linux-btrfs; +Cc: Sun YangKai
There must be a previous item at least the inode_item. So setting
@return_any is not necessary.
Signed-off-by: Sun YangKai <sunk67188@gmail.com>
---
fs/btrfs/send.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 2522faa97478..eae596b80ec0 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -6320,16 +6320,14 @@ static int get_last_extent(struct send_ctx *sctx, u64 offset)
key.objectid = sctx->cur_ino;
key.type = BTRFS_EXTENT_DATA_KEY;
key.offset = offset;
- ret = btrfs_search_slot_for_read(root, &key, path, 0, 1);
+ ret = btrfs_search_slot_for_read(root, &key, path, 0, 0);
if (ret < 0)
return ret;
- ret = 0;
+ ASSERT(ret == 0);
btrfs_item_key_to_cpu(path->nodes[0], &key, path->slots[0]);
- if (key.objectid != sctx->cur_ino || key.type != BTRFS_EXTENT_DATA_KEY)
- return ret;
-
- sctx->cur_inode_last_extent = btrfs_file_extent_end(path);
- return ret;
+ if (key.objectid == sctx->cur_ino && key.type == BTRFS_EXTENT_DATA_KEY)
+ sctx->cur_inode_last_extent = btrfs_file_extent_end(path);
+ return 0;
}
static int range_is_hole_in_parent(struct send_ctx *sctx,
--
2.51.2
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 3/4] btrfs: cleanup btrfs_search_slot_for_read()
2025-12-09 3:27 [PATCH 0/4] btrfs: some cleanups for two ctree functions Sun YangKai
2025-12-09 3:27 ` [PATCH 1/4] btrfs: don't set @return_any for btrfs_search_slot_for_read in btrfs_read_qgroup_config Sun YangKai
2025-12-09 3:27 ` [PATCH 2/4] btrfs: don't set return_any @return_any for btrfs_search_slot_for_read in get_last_extent() Sun YangKai
@ 2025-12-09 3:27 ` Sun YangKai
2025-12-09 3:27 ` [PATCH 4/4] btrfs: ctree: cleanup btrfs_prev_leaf() Sun YangKai
3 siblings, 0 replies; 13+ messages in thread
From: Sun YangKai @ 2025-12-09 3:27 UTC (permalink / raw)
To: linux-btrfs; +Cc: Sun YangKai
Now @return_any is not used by any caller, remove it and related logic.
@for_read is used as boolean, so convert it from int to bool.
No functional change.
Signed-off-by: Sun YangKai <sunk67188@gmail.com>
---
fs/btrfs/ctree.c | 47 ++++++--------------------------------
fs/btrfs/ctree.h | 3 +--
fs/btrfs/free-space-tree.c | 2 +-
fs/btrfs/qgroup.c | 10 ++++----
fs/btrfs/send.c | 12 +++++-----
5 files changed, 20 insertions(+), 54 deletions(-)
diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index ee6e5c393873..bb886f9508e2 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -2439,22 +2439,14 @@ static int btrfs_prev_leaf(struct btrfs_root *root, struct btrfs_path *path)
* instead the next or previous item should be returned.
* When find_higher is true, the next higher item is returned, the next lower
* otherwise.
- * When return_any and find_higher are both true, and no higher item is found,
- * return the next lower instead.
- * When return_any is true and find_higher is false, and no lower item is found,
- * return the next higher instead.
- * It returns 0 if any item is found, 1 if none is found (tree empty), and
- * < 0 on error
+ * It returns 0 if any item is found, 1 if none is found and < 0 on error
*/
int btrfs_search_slot_for_read(struct btrfs_root *root,
const struct btrfs_key *key,
- struct btrfs_path *p, int find_higher,
- int return_any)
+ struct btrfs_path *p, bool find_higher)
{
int ret;
- struct extent_buffer *leaf;
-again:
ret = btrfs_search_slot(NULL, root, key, p, 0, 0);
if (ret <= 0)
return ret;
@@ -2465,47 +2457,22 @@ int btrfs_search_slot_for_read(struct btrfs_root *root,
* to the first free slot in the previous leaf, i.e. at an invalid
* item.
*/
- leaf = p->nodes[0];
-
if (find_higher) {
- if (p->slots[0] >= btrfs_header_nritems(leaf)) {
- ret = btrfs_next_leaf(root, p);
- if (ret <= 0)
- return ret;
- if (!return_any)
- return 1;
- /*
- * no higher item found, return the next
- * lower instead
- */
- return_any = 0;
- find_higher = 0;
- btrfs_release_path(p);
- goto again;
- }
+ if (p->slots[0] >= btrfs_header_nritems(p->nodes[0]))
+ return btrfs_next_leaf(root, p);
} else {
if (p->slots[0] == 0) {
ret = btrfs_prev_leaf(root, p);
if (ret < 0)
return ret;
if (!ret) {
- leaf = p->nodes[0];
- if (p->slots[0] == btrfs_header_nritems(leaf))
+ if (p->slots[0] == btrfs_header_nritems(p->nodes[0]))
p->slots[0]--;
return 0;
}
- if (!return_any)
- return 1;
- /*
- * no lower item found, return the next
- * higher instead
- */
- return_any = 0;
- find_higher = 1;
- btrfs_release_path(p);
- goto again;
+ return 1;
} else {
- --p->slots[0];
+ p->slots[0]--;
}
}
return 0;
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 692370fc07b2..4b7b8ce7e211 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -595,8 +595,7 @@ int btrfs_search_old_slot(struct btrfs_root *root, const struct btrfs_key *key,
struct btrfs_path *p, u64 time_seq);
int btrfs_search_slot_for_read(struct btrfs_root *root,
const struct btrfs_key *key,
- struct btrfs_path *p, int find_higher,
- int return_any);
+ struct btrfs_path *p, bool find_higher);
void btrfs_release_path(struct btrfs_path *p);
struct btrfs_path *btrfs_alloc_path(void);
void btrfs_free_path(struct btrfs_path *p);
diff --git a/fs/btrfs/free-space-tree.c b/fs/btrfs/free-space-tree.c
index 1ad2ad384b9e..88c46950f5d2 100644
--- a/fs/btrfs/free-space-tree.c
+++ b/fs/btrfs/free-space-tree.c
@@ -1089,7 +1089,7 @@ static int populate_free_space_tree(struct btrfs_trans_handle *trans,
key.offset = 0;
extent_root = btrfs_extent_root(trans->fs_info, key.objectid);
- ret = btrfs_search_slot_for_read(extent_root, &key, path, 1, 0);
+ ret = btrfs_search_slot_for_read(extent_root, &key, path, true);
if (ret < 0)
goto out_locked;
/*
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index d780980e6790..bd458ba537ba 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -415,7 +415,7 @@ int btrfs_read_qgroup_config(struct btrfs_fs_info *fs_info)
key.objectid = 0;
key.type = 0;
key.offset = 0;
- ret = btrfs_search_slot_for_read(quota_root, &key, path, 1, 0);
+ ret = btrfs_search_slot_for_read(quota_root, &key, path, true);
if (ret)
goto out;
@@ -530,7 +530,7 @@ int btrfs_read_qgroup_config(struct btrfs_fs_info *fs_info)
key.objectid = 0;
key.type = BTRFS_QGROUP_RELATION_KEY;
key.offset = 0;
- ret = btrfs_search_slot_for_read(quota_root, &key, path, 1, 0);
+ ret = btrfs_search_slot_for_read(quota_root, &key, path, true);
if (ret)
goto out;
while (1) {
@@ -1088,7 +1088,7 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info,
key.offset = 0;
btrfs_release_path(path);
- ret = btrfs_search_slot_for_read(tree_root, &key, path, 1, 0);
+ ret = btrfs_search_slot_for_read(tree_root, &key, path, true);
if (ret > 0)
goto out_add_root;
if (unlikely(ret < 0)) {
@@ -1130,7 +1130,7 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info,
goto out_free_path;
}
ret = btrfs_search_slot_for_read(tree_root, &found_key,
- path, 1, 0);
+ path, true);
if (unlikely(ret < 0)) {
btrfs_abort_transaction(trans, ret);
goto out_free_path;
@@ -3692,7 +3692,7 @@ static int qgroup_rescan_leaf(struct btrfs_trans_handle *trans,
fs_info->qgroup_rescan_progress.objectid);
ret = btrfs_search_slot_for_read(extent_root,
&fs_info->qgroup_rescan_progress,
- path, 1, 0);
+ path, true);
btrfs_debug(fs_info,
"current progress key " BTRFS_KEY_FMT ", search_slot ret %d",
diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index eae596b80ec0..471e81a8e844 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -1234,7 +1234,7 @@ static int get_inode_path(struct btrfs_root *root,
key.type = BTRFS_INODE_REF_KEY;
key.offset = 0;
- ret = btrfs_search_slot_for_read(root, &key, p, 1, 0);
+ ret = btrfs_search_slot_for_read(root, &key, p, true);
if (ret < 0)
return ret;
if (ret)
@@ -1979,7 +1979,7 @@ static int get_first_ref(struct btrfs_root *root, u64 ino,
key.type = BTRFS_INODE_REF_KEY;
key.offset = 0;
- ret = btrfs_search_slot_for_read(root, &key, path, 1, 0);
+ ret = btrfs_search_slot_for_read(root, &key, path, true);
if (ret < 0)
return ret;
if (!ret)
@@ -2475,7 +2475,7 @@ static int send_subvol_begin(struct send_ctx *sctx)
key.offset = 0;
ret = btrfs_search_slot_for_read(send_root->fs_info->tree_root,
- &key, path, 1, 0);
+ &key, path, true);
if (ret < 0)
return ret;
if (ret)
@@ -6195,7 +6195,7 @@ static int is_extent_unchanged(struct send_ctx *sctx,
key.objectid = ekey->objectid;
key.type = BTRFS_EXTENT_DATA_KEY;
key.offset = ekey->offset;
- ret = btrfs_search_slot_for_read(sctx->parent_root, &key, path, 0, 0);
+ ret = btrfs_search_slot_for_read(sctx->parent_root, &key, path, false);
if (ret < 0)
return ret;
if (ret)
@@ -6320,7 +6320,7 @@ static int get_last_extent(struct send_ctx *sctx, u64 offset)
key.objectid = sctx->cur_ino;
key.type = BTRFS_EXTENT_DATA_KEY;
key.offset = offset;
- ret = btrfs_search_slot_for_read(root, &key, path, 0, 0);
+ ret = btrfs_search_slot_for_read(root, &key, path, false);
if (ret < 0)
return ret;
ASSERT(ret == 0);
@@ -7288,7 +7288,7 @@ static int full_send_tree(struct send_ctx *sctx)
sctx->last_reloc_trans = fs_info->last_reloc_trans;
up_read(&fs_info->commit_root_sem);
- ret = btrfs_search_slot_for_read(send_root, &key, path, 1, 0);
+ ret = btrfs_search_slot_for_read(send_root, &key, path, true);
if (ret < 0)
return ret;
if (ret)
--
2.51.2
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 4/4] btrfs: ctree: cleanup btrfs_prev_leaf()
2025-12-09 3:27 [PATCH 0/4] btrfs: some cleanups for two ctree functions Sun YangKai
` (2 preceding siblings ...)
2025-12-09 3:27 ` [PATCH 3/4] btrfs: cleanup btrfs_search_slot_for_read() Sun YangKai
@ 2025-12-09 3:27 ` Sun YangKai
2025-12-09 4:05 ` Sun Yangkai
2025-12-09 12:05 ` Filipe Manana
3 siblings, 2 replies; 13+ messages in thread
From: Sun YangKai @ 2025-12-09 3:27 UTC (permalink / raw)
To: linux-btrfs; +Cc: Sun YangKai
There's a common parttern in callers of btrfs_prev_leaf:
p->slots[0]-- if p->slots[0] points to a slot with invalid item(nritem).
So just make btrfs_prev_leaf() ensure that path->slots[0] points to a
valid slot and cleanup its over complex logic.
Reading and comparing keys in btrfs_prev_leaf() is unnecessary because
when got a ret>0 from btrfs_search_slot(), slots[0] points to where we
should insert the key. So just slots[0]-- is enough to get the previous
item.
And then remove the related logic and cleanup the callers.
ASSERT(path->slots[0] < btrfs_header_nritems(path->nodes[0]))
is enough to make sure that nritems != 0 and slots[0] points to a valid
btrfs_item.
And getting a `nritems==0` when btrfs_prev_leaf() returns 0 is a logic
error because btrfs_pref_leaf() should always
1. either find a non-empty leaf
2. or return 1
So we can use ASSERT here.
No functional changes.
Signed-off-by: Sun YangKai <sunk67188@gmail.com>
---
fs/btrfs/ctree.c | 100 +++++++++--------------------------------------
1 file changed, 19 insertions(+), 81 deletions(-)
diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index bb886f9508e2..07e6433cde61 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -2365,12 +2365,9 @@ int btrfs_search_old_slot(struct btrfs_root *root, const struct btrfs_key *key,
static int btrfs_prev_leaf(struct btrfs_root *root, struct btrfs_path *path)
{
struct btrfs_key key;
- struct btrfs_key orig_key;
- struct btrfs_disk_key found_key;
int ret;
btrfs_item_key_to_cpu(path->nodes[0], &key, 0);
- orig_key = key;
if (key.offset > 0) {
key.offset--;
@@ -2390,48 +2387,12 @@ static int btrfs_prev_leaf(struct btrfs_root *root, struct btrfs_path *path)
if (ret <= 0)
return ret;
- /*
- * Previous key not found. Even if we were at slot 0 of the leaf we had
- * before releasing the path and calling btrfs_search_slot(), we now may
- * be in a slot pointing to the same original key - this can happen if
- * after we released the path, one of more items were moved from a
- * sibling leaf into the front of the leaf we had due to an insertion
- * (see push_leaf_right()).
- * If we hit this case and our slot is > 0 and just decrement the slot
- * so that the caller does not process the same key again, which may or
- * may not break the caller, depending on its logic.
- */
- if (path->slots[0] < btrfs_header_nritems(path->nodes[0])) {
- btrfs_item_key(path->nodes[0], &found_key, path->slots[0]);
- ret = btrfs_comp_keys(&found_key, &orig_key);
- if (ret == 0) {
- if (path->slots[0] > 0) {
- path->slots[0]--;
- return 0;
- }
- /*
- * At slot 0, same key as before, it means orig_key is
- * the lowest, leftmost, key in the tree. We're done.
- */
- return 1;
- }
- }
+ /* There's no smaller keys in the whole tree. */
+ if (path->slots[0] == 0)
+ return 1;
- btrfs_item_key(path->nodes[0], &found_key, 0);
- ret = btrfs_comp_keys(&found_key, &key);
- /*
- * We might have had an item with the previous key in the tree right
- * before we released our path. And after we released our path, that
- * item might have been pushed to the first slot (0) of the leaf we
- * were holding due to a tree balance. Alternatively, an item with the
- * previous key can exist as the only element of a leaf (big fat item).
- * Therefore account for these 2 cases, so that our callers (like
- * btrfs_previous_item) don't miss an existing item with a key matching
- * the previous key we computed above.
- */
- if (ret <= 0)
- return 0;
- return 1;
+ path->slots[0]--;
+ return 0;
}
/*
@@ -2461,19 +2422,10 @@ int btrfs_search_slot_for_read(struct btrfs_root *root,
if (p->slots[0] >= btrfs_header_nritems(p->nodes[0]))
return btrfs_next_leaf(root, p);
} else {
- if (p->slots[0] == 0) {
- ret = btrfs_prev_leaf(root, p);
- if (ret < 0)
- return ret;
- if (!ret) {
- if (p->slots[0] == btrfs_header_nritems(p->nodes[0]))
- p->slots[0]--;
- return 0;
- }
- return 1;
- } else {
- p->slots[0]--;
- }
+ if (p->slots[0] == 0)
+ return btrfs_prev_leaf(root, p);
+
+ p->slots[0]--;
}
return 0;
}
@@ -4957,26 +4909,19 @@ int btrfs_previous_item(struct btrfs_root *root,
int type)
{
struct btrfs_key found_key;
- struct extent_buffer *leaf;
- u32 nritems;
int ret;
while (1) {
if (path->slots[0] == 0) {
ret = btrfs_prev_leaf(root, path);
- if (ret != 0)
+ if (ret)
return ret;
- } else {
- path->slots[0]--;
- }
- leaf = path->nodes[0];
- nritems = btrfs_header_nritems(leaf);
- if (nritems == 0)
- return 1;
- if (path->slots[0] == nritems)
+ } else
path->slots[0]--;
- btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
+ ASSERT(path->slots[0] < btrfs_header_nritems(path->nodes[0]));
+
+ btrfs_item_key_to_cpu(path->nodes[0], &found_key, path->slots[0]);
if (found_key.objectid < min_objectid)
break;
if (found_key.type == type)
@@ -4998,26 +4943,19 @@ int btrfs_previous_extent_item(struct btrfs_root *root,
struct btrfs_path *path, u64 min_objectid)
{
struct btrfs_key found_key;
- struct extent_buffer *leaf;
- u32 nritems;
int ret;
while (1) {
if (path->slots[0] == 0) {
ret = btrfs_prev_leaf(root, path);
- if (ret != 0)
+ if (ret)
return ret;
- } else {
- path->slots[0]--;
- }
- leaf = path->nodes[0];
- nritems = btrfs_header_nritems(leaf);
- if (nritems == 0)
- return 1;
- if (path->slots[0] == nritems)
+ } else
path->slots[0]--;
- btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
+ ASSERT(path->slots[0] < btrfs_header_nritems(path->nodes[0]));
+
+ btrfs_item_key_to_cpu(path->nodes[0], &found_key, path->slots[0]);
if (found_key.objectid < min_objectid)
break;
if (found_key.type == BTRFS_EXTENT_ITEM_KEY ||
--
2.51.2
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 4/4] btrfs: ctree: cleanup btrfs_prev_leaf()
2025-12-09 3:27 ` [PATCH 4/4] btrfs: ctree: cleanup btrfs_prev_leaf() Sun YangKai
@ 2025-12-09 4:05 ` Sun Yangkai
2025-12-09 12:05 ` Filipe Manana
1 sibling, 0 replies; 13+ messages in thread
From: Sun Yangkai @ 2025-12-09 4:05 UTC (permalink / raw)
To: linux-btrfs
在 2025/12/9 11:27, Sun YangKai 写道:
> There's a common parttern in callers of btrfs_prev_leaf:
> p->slots[0]-- if p->slots[0] points to a slot with invalid item(nritem).
>
> So just make btrfs_prev_leaf() ensure that path->slots[0] points to a
> valid slot and cleanup its over complex logic.
>
> Reading and comparing keys in btrfs_prev_leaf() is unnecessary because
> when got a ret>0 from btrfs_search_slot(), slots[0] points to where we
> should insert the key. So just slots[0]-- is enough to get the previous
> item.
>
> And then remove the related logic and cleanup the callers.
>
> ASSERT(path->slots[0] < btrfs_header_nritems(path->nodes[0]))
> is enough to make sure that nritems != 0 and slots[0] points to a valid
> btrfs_item.
>
> And getting a `nritems==0` when btrfs_prev_leaf() returns 0 is a logic
> error because btrfs_pref_leaf() should always
>
> 1. either find a non-empty leaf
> 2. or return 1
>
> So we can use ASSERT here.
>
> No functional changes.
>
> Signed-off-by: Sun YangKai <sunk67188@gmail.com>
> ---
> fs/btrfs/ctree.c | 100 +++++++++--------------------------------------
> 1 file changed, 19 insertions(+), 81 deletions(-)
>
> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
> index bb886f9508e2..07e6433cde61 100644
> --- a/fs/btrfs/ctree.c
> +++ b/fs/btrfs/ctree.c
> @@ -2365,12 +2365,9 @@ int btrfs_search_old_slot(struct btrfs_root *root, const struct btrfs_key *key,
> static int btrfs_prev_leaf(struct btrfs_root *root, struct btrfs_path *path)
> {
> struct btrfs_key key;
> - struct btrfs_key orig_key;
> - struct btrfs_disk_key found_key;
> int ret;
>
> btrfs_item_key_to_cpu(path->nodes[0], &key, 0);
> - orig_key = key;
>
> if (key.offset > 0) {
> key.offset--;
> @@ -2390,48 +2387,12 @@ static int btrfs_prev_leaf(struct btrfs_root *root, struct btrfs_path *path)
> if (ret <= 0)
> return ret;
>
> - /*
> - * Previous key not found. Even if we were at slot 0 of the leaf we had
> - * before releasing the path and calling btrfs_search_slot(), we now may
> - * be in a slot pointing to the same original key - this can happen if
> - * after we released the path, one of more items were moved from a
> - * sibling leaf into the front of the leaf we had due to an insertion
> - * (see push_leaf_right()).
> - * If we hit this case and our slot is > 0 and just decrement the slot
> - * so that the caller does not process the same key again, which may or
> - * may not break the caller, depending on its logic.
> - */
> - if (path->slots[0] < btrfs_header_nritems(path->nodes[0])) {
> - btrfs_item_key(path->nodes[0], &found_key, path->slots[0]);
> - ret = btrfs_comp_keys(&found_key, &orig_key);
> - if (ret == 0) {
> - if (path->slots[0] > 0) {
> - path->slots[0]--;
> - return 0;
> - }
> - /*
> - * At slot 0, same key as before, it means orig_key is
> - * the lowest, leftmost, key in the tree. We're done.
> - */
> - return 1;
> - }
> - }
> + /* There's no smaller keys in the whole tree. */
> + if (path->slots[0] == 0)
> + return 1;
>
> - btrfs_item_key(path->nodes[0], &found_key, 0);
> - ret = btrfs_comp_keys(&found_key, &key);
> - /*
> - * We might have had an item with the previous key in the tree right
> - * before we released our path. And after we released our path, that
> - * item might have been pushed to the first slot (0) of the leaf we
> - * were holding due to a tree balance. Alternatively, an item with the
> - * previous key can exist as the only element of a leaf (big fat item).
> - * Therefore account for these 2 cases, so that our callers (like
> - * btrfs_previous_item) don't miss an existing item with a key matching
> - * the previous key we computed above.
> - */
> - if (ret <= 0)
> - return 0;
> - return 1;
> + path->slots[0]--;
> + return 0;
> }
>
> /*
> @@ -2461,19 +2422,10 @@ int btrfs_search_slot_for_read(struct btrfs_root *root,
> if (p->slots[0] >= btrfs_header_nritems(p->nodes[0]))
> return btrfs_next_leaf(root, p);
> } else {
> - if (p->slots[0] == 0) {
> - ret = btrfs_prev_leaf(root, p);
> - if (ret < 0)
> - return ret;
> - if (!ret) {
> - if (p->slots[0] == btrfs_header_nritems(p->nodes[0]))
> - p->slots[0]--;
> - return 0;
> - }
> - return 1;
> - } else {
> - p->slots[0]--;
> - }
> + if (p->slots[0] == 0)
> + return btrfs_prev_leaf(root, p);
I just found we don't need to call btrfs_prev_leaf() here because we've got
ret==1 and p->slots[0] == 0 from btrfs_search_slot(), which means there's no
lower key in the whole tree so just return 1 is enough.
> +
> + p->slots[0]--;
> }
> return 0;
> }
> @@ -4957,26 +4909,19 @@ int btrfs_previous_item(struct btrfs_root *root,
> int type)
> {
> struct btrfs_key found_key;
> - struct extent_buffer *leaf;
> - u32 nritems;
> int ret;
>
> while (1) {
> if (path->slots[0] == 0) {
> ret = btrfs_prev_leaf(root, path);
> - if (ret != 0)
> + if (ret)
> return ret;
> - } else {
> - path->slots[0]--;
> - }
> - leaf = path->nodes[0];
> - nritems = btrfs_header_nritems(leaf);
> - if (nritems == 0)
> - return 1;
> - if (path->slots[0] == nritems)
> + } else
> path->slots[0]--;
>
> - btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
> + ASSERT(path->slots[0] < btrfs_header_nritems(path->nodes[0]));
> +
> + btrfs_item_key_to_cpu(path->nodes[0], &found_key, path->slots[0]);
> if (found_key.objectid < min_objectid)
> break;
> if (found_key.type == type)
> @@ -4998,26 +4943,19 @@ int btrfs_previous_extent_item(struct btrfs_root *root,
> struct btrfs_path *path, u64 min_objectid)
> {
> struct btrfs_key found_key;
> - struct extent_buffer *leaf;
> - u32 nritems;
> int ret;
>
> while (1) {
> if (path->slots[0] == 0) {
> ret = btrfs_prev_leaf(root, path);
> - if (ret != 0)
> + if (ret)
> return ret;
> - } else {
> - path->slots[0]--;
> - }
> - leaf = path->nodes[0];
> - nritems = btrfs_header_nritems(leaf);
> - if (nritems == 0)
> - return 1;
> - if (path->slots[0] == nritems)
> + } else
> path->slots[0]--;
>
> - btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
> + ASSERT(path->slots[0] < btrfs_header_nritems(path->nodes[0]));
> +
> + btrfs_item_key_to_cpu(path->nodes[0], &found_key, path->slots[0]);
> if (found_key.objectid < min_objectid)
> break;
> if (found_key.type == BTRFS_EXTENT_ITEM_KEY ||
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 4/4] btrfs: ctree: cleanup btrfs_prev_leaf()
2025-12-09 3:27 ` [PATCH 4/4] btrfs: ctree: cleanup btrfs_prev_leaf() Sun YangKai
2025-12-09 4:05 ` Sun Yangkai
@ 2025-12-09 12:05 ` Filipe Manana
2025-12-09 12:27 ` Sun Yangkai
2025-12-09 13:57 ` Sun Yangkai
1 sibling, 2 replies; 13+ messages in thread
From: Filipe Manana @ 2025-12-09 12:05 UTC (permalink / raw)
To: Sun YangKai; +Cc: linux-btrfs
On Tue, Dec 9, 2025 at 3:38 AM Sun YangKai <sunk67188@gmail.com> wrote:
>
> There's a common parttern in callers of btrfs_prev_leaf:
> p->slots[0]-- if p->slots[0] points to a slot with invalid item(nritem).
>
> So just make btrfs_prev_leaf() ensure that path->slots[0] points to a
> valid slot and cleanup its over complex logic.
>
> Reading and comparing keys in btrfs_prev_leaf() is unnecessary because
> when got a ret>0 from btrfs_search_slot(), slots[0] points to where we
> should insert the key.
Hell no! path->slots[0] can end up pointing to the original key, which is what
should be the location for the computed previous key, and the
comments there explain how that can happen.
> So just slots[0]-- is enough to get the previous
> item.
All that logic in btrfs_prev_leaf() is necessary.
We release the path and then do a btrfs_search_slot() for the computed
previous key, but after the release and before the search, the
structure of the tree may have changed, keys moved between leaves, new
keys added, previous keys removed, and so on.
I left all such cases commented in detail in btrfs_prev_leaf() for a reason...
Removing that logic is just wrong and there's no explanation here for it.
Thanks.
>
> And then remove the related logic and cleanup the callers.
>
> ASSERT(path->slots[0] < btrfs_header_nritems(path->nodes[0]))
> is enough to make sure that nritems != 0 and slots[0] points to a valid
> btrfs_item.
>
> And getting a `nritems==0` when btrfs_prev_leaf() returns 0 is a logic
> error because btrfs_pref_leaf() should always
>
> 1. either find a non-empty leaf
> 2. or return 1
>
> So we can use ASSERT here.
>
> No functional changes.
>
> Signed-off-by: Sun YangKai <sunk67188@gmail.com>
> ---
> fs/btrfs/ctree.c | 100 +++++++++--------------------------------------
> 1 file changed, 19 insertions(+), 81 deletions(-)
>
> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
> index bb886f9508e2..07e6433cde61 100644
> --- a/fs/btrfs/ctree.c
> +++ b/fs/btrfs/ctree.c
> @@ -2365,12 +2365,9 @@ int btrfs_search_old_slot(struct btrfs_root *root, const struct btrfs_key *key,
> static int btrfs_prev_leaf(struct btrfs_root *root, struct btrfs_path *path)
> {
> struct btrfs_key key;
> - struct btrfs_key orig_key;
> - struct btrfs_disk_key found_key;
> int ret;
>
> btrfs_item_key_to_cpu(path->nodes[0], &key, 0);
> - orig_key = key;
>
> if (key.offset > 0) {
> key.offset--;
> @@ -2390,48 +2387,12 @@ static int btrfs_prev_leaf(struct btrfs_root *root, struct btrfs_path *path)
> if (ret <= 0)
> return ret;
>
> - /*
> - * Previous key not found. Even if we were at slot 0 of the leaf we had
> - * before releasing the path and calling btrfs_search_slot(), we now may
> - * be in a slot pointing to the same original key - this can happen if
> - * after we released the path, one of more items were moved from a
> - * sibling leaf into the front of the leaf we had due to an insertion
> - * (see push_leaf_right()).
> - * If we hit this case and our slot is > 0 and just decrement the slot
> - * so that the caller does not process the same key again, which may or
> - * may not break the caller, depending on its logic.
> - */
> - if (path->slots[0] < btrfs_header_nritems(path->nodes[0])) {
> - btrfs_item_key(path->nodes[0], &found_key, path->slots[0]);
> - ret = btrfs_comp_keys(&found_key, &orig_key);
> - if (ret == 0) {
> - if (path->slots[0] > 0) {
> - path->slots[0]--;
> - return 0;
> - }
> - /*
> - * At slot 0, same key as before, it means orig_key is
> - * the lowest, leftmost, key in the tree. We're done.
> - */
> - return 1;
> - }
> - }
> + /* There's no smaller keys in the whole tree. */
> + if (path->slots[0] == 0)
> + return 1;
>
> - btrfs_item_key(path->nodes[0], &found_key, 0);
> - ret = btrfs_comp_keys(&found_key, &key);
> - /*
> - * We might have had an item with the previous key in the tree right
> - * before we released our path. And after we released our path, that
> - * item might have been pushed to the first slot (0) of the leaf we
> - * were holding due to a tree balance. Alternatively, an item with the
> - * previous key can exist as the only element of a leaf (big fat item).
> - * Therefore account for these 2 cases, so that our callers (like
> - * btrfs_previous_item) don't miss an existing item with a key matching
> - * the previous key we computed above.
> - */
> - if (ret <= 0)
> - return 0;
> - return 1;
> + path->slots[0]--;
> + return 0;
> }
>
> /*
> @@ -2461,19 +2422,10 @@ int btrfs_search_slot_for_read(struct btrfs_root *root,
> if (p->slots[0] >= btrfs_header_nritems(p->nodes[0]))
> return btrfs_next_leaf(root, p);
> } else {
> - if (p->slots[0] == 0) {
> - ret = btrfs_prev_leaf(root, p);
> - if (ret < 0)
> - return ret;
> - if (!ret) {
> - if (p->slots[0] == btrfs_header_nritems(p->nodes[0]))
> - p->slots[0]--;
> - return 0;
> - }
> - return 1;
> - } else {
> - p->slots[0]--;
> - }
> + if (p->slots[0] == 0)
> + return btrfs_prev_leaf(root, p);
> +
> + p->slots[0]--;
> }
> return 0;
> }
> @@ -4957,26 +4909,19 @@ int btrfs_previous_item(struct btrfs_root *root,
> int type)
> {
> struct btrfs_key found_key;
> - struct extent_buffer *leaf;
> - u32 nritems;
> int ret;
>
> while (1) {
> if (path->slots[0] == 0) {
> ret = btrfs_prev_leaf(root, path);
> - if (ret != 0)
> + if (ret)
> return ret;
> - } else {
> - path->slots[0]--;
> - }
> - leaf = path->nodes[0];
> - nritems = btrfs_header_nritems(leaf);
> - if (nritems == 0)
> - return 1;
> - if (path->slots[0] == nritems)
> + } else
> path->slots[0]--;
>
> - btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
> + ASSERT(path->slots[0] < btrfs_header_nritems(path->nodes[0]));
> +
> + btrfs_item_key_to_cpu(path->nodes[0], &found_key, path->slots[0]);
> if (found_key.objectid < min_objectid)
> break;
> if (found_key.type == type)
> @@ -4998,26 +4943,19 @@ int btrfs_previous_extent_item(struct btrfs_root *root,
> struct btrfs_path *path, u64 min_objectid)
> {
> struct btrfs_key found_key;
> - struct extent_buffer *leaf;
> - u32 nritems;
> int ret;
>
> while (1) {
> if (path->slots[0] == 0) {
> ret = btrfs_prev_leaf(root, path);
> - if (ret != 0)
> + if (ret)
> return ret;
> - } else {
> - path->slots[0]--;
> - }
> - leaf = path->nodes[0];
> - nritems = btrfs_header_nritems(leaf);
> - if (nritems == 0)
> - return 1;
> - if (path->slots[0] == nritems)
> + } else
> path->slots[0]--;
>
> - btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
> + ASSERT(path->slots[0] < btrfs_header_nritems(path->nodes[0]));
> +
> + btrfs_item_key_to_cpu(path->nodes[0], &found_key, path->slots[0]);
> if (found_key.objectid < min_objectid)
> break;
> if (found_key.type == BTRFS_EXTENT_ITEM_KEY ||
> --
> 2.51.2
>
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 4/4] btrfs: ctree: cleanup btrfs_prev_leaf()
2025-12-09 12:05 ` Filipe Manana
@ 2025-12-09 12:27 ` Sun Yangkai
2026-02-07 8:50 ` Sun YangKai
2025-12-09 13:57 ` Sun Yangkai
1 sibling, 1 reply; 13+ messages in thread
From: Sun Yangkai @ 2025-12-09 12:27 UTC (permalink / raw)
To: Filipe Manana; +Cc: linux-btrfs
在 2025/12/9 20:05, Filipe Manana 写道:
> On Tue, Dec 9, 2025 at 3:38 AM Sun YangKai <sunk67188@gmail.com> wrote:
>>
>> There's a common parttern in callers of btrfs_prev_leaf:
>> p->slots[0]-- if p->slots[0] points to a slot with invalid item(nritem).
>>
>> So just make btrfs_prev_leaf() ensure that path->slots[0] points to a
>> valid slot and cleanup its over complex logic.
>>
>> Reading and comparing keys in btrfs_prev_leaf() is unnecessary because
>> when got a ret>0 from btrfs_search_slot(), slots[0] points to where we
>> should insert the key.
>
> Hell no! path->slots[0] can end up pointing to the original key, which is what
> should be the location for the computed previous key, and the
> comments there explain how that can happen.
>
>> So just slots[0]-- is enough to get the previous
>> item.
>
> All that logic in btrfs_prev_leaf() is necessary.
>
> We release the path and then do a btrfs_search_slot() for the computed
> previous key, but after the release and before the search, the
> structure of the tree may have changed, keys moved between leaves, new
> keys added, previous keys removed, and so on.
Thanks for your reply. Here's my thoughts about this:
My assumption is that there's not a key between the computed previous key and
the original key. So...
1) When searching for the computed previous key, we may get ret==1 and
p->slots[0] points to the original key. In this case,
if (p->slots[0] == 0) // we're the lowest key in the tree.
return 1;
p->slots[0]--; // move to the previous item.
return 0;
is exactly what we want if I understand it correctly. I don't understand why
it's a special case.
2) And if there's an item matches the computed previous key, we will get ret==0
from btrfs_search_slot() and we will early return after calling
btrfs_search_slot(). If there's no such an item, then we'll never get an item
whose key is lower than the key we give to btrfs_search_slot().
So the second comment block is also not a special case.
These two cases are not related with how the items moved, added or deleted
before we call btrfs_search_slot().
Please correct me if I got anything wrong.
Thanks.
>
> I left all such cases commented in detail in btrfs_prev_leaf() for a reason...
>
> Removing that logic is just wrong and there's no explanation here for it.
>
> Thanks.
>
>
>
>>
>> And then remove the related logic and cleanup the callers.
>>
>> ASSERT(path->slots[0] < btrfs_header_nritems(path->nodes[0]))
>> is enough to make sure that nritems != 0 and slots[0] points to a valid
>> btrfs_item.
>>
>> And getting a `nritems==0` when btrfs_prev_leaf() returns 0 is a logic
>> error because btrfs_pref_leaf() should always
>>
>> 1. either find a non-empty leaf
>> 2. or return 1
>>
>> So we can use ASSERT here.
>>
>> No functional changes.
>>
>> Signed-off-by: Sun YangKai <sunk67188@gmail.com>
>> ---
>> fs/btrfs/ctree.c | 100 +++++++++--------------------------------------
>> 1 file changed, 19 insertions(+), 81 deletions(-)
>>
>> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
>> index bb886f9508e2..07e6433cde61 100644
>> --- a/fs/btrfs/ctree.c
>> +++ b/fs/btrfs/ctree.c
>> @@ -2365,12 +2365,9 @@ int btrfs_search_old_slot(struct btrfs_root *root, const struct btrfs_key *key,
>> static int btrfs_prev_leaf(struct btrfs_root *root, struct btrfs_path *path)
>> {
>> struct btrfs_key key;
>> - struct btrfs_key orig_key;
>> - struct btrfs_disk_key found_key;
>> int ret;
>>
>> btrfs_item_key_to_cpu(path->nodes[0], &key, 0);
>> - orig_key = key;
>>
>> if (key.offset > 0) {
>> key.offset--;
>> @@ -2390,48 +2387,12 @@ static int btrfs_prev_leaf(struct btrfs_root *root, struct btrfs_path *path)
>> if (ret <= 0)
>> return ret;
>>
>> - /*
>> - * Previous key not found. Even if we were at slot 0 of the leaf we had
>> - * before releasing the path and calling btrfs_search_slot(), we now may
>> - * be in a slot pointing to the same original key - this can happen if
>> - * after we released the path, one of more items were moved from a
>> - * sibling leaf into the front of the leaf we had due to an insertion
>> - * (see push_leaf_right()).
>> - * If we hit this case and our slot is > 0 and just decrement the slot
>> - * so that the caller does not process the same key again, which may or
>> - * may not break the caller, depending on its logic.
>> - */
>> - if (path->slots[0] < btrfs_header_nritems(path->nodes[0])) {
>> - btrfs_item_key(path->nodes[0], &found_key, path->slots[0]);
>> - ret = btrfs_comp_keys(&found_key, &orig_key);
>> - if (ret == 0) {
>> - if (path->slots[0] > 0) {
>> - path->slots[0]--;
>> - return 0;
>> - }
>> - /*
>> - * At slot 0, same key as before, it means orig_key is
>> - * the lowest, leftmost, key in the tree. We're done.
>> - */
>> - return 1;
>> - }
>> - }
>> + /* There's no smaller keys in the whole tree. */
>> + if (path->slots[0] == 0)
>> + return 1;
>>
>> - btrfs_item_key(path->nodes[0], &found_key, 0);
>> - ret = btrfs_comp_keys(&found_key, &key);
>> - /*
>> - * We might have had an item with the previous key in the tree right
>> - * before we released our path. And after we released our path, that
>> - * item might have been pushed to the first slot (0) of the leaf we
>> - * were holding due to a tree balance. Alternatively, an item with the
>> - * previous key can exist as the only element of a leaf (big fat item).
>> - * Therefore account for these 2 cases, so that our callers (like
>> - * btrfs_previous_item) don't miss an existing item with a key matching
>> - * the previous key we computed above.
>> - */
>> - if (ret <= 0)
>> - return 0;
>> - return 1;
>> + path->slots[0]--;
>> + return 0;
>> }
>>
>> /*
>> @@ -2461,19 +2422,10 @@ int btrfs_search_slot_for_read(struct btrfs_root *root,
>> if (p->slots[0] >= btrfs_header_nritems(p->nodes[0]))
>> return btrfs_next_leaf(root, p);
>> } else {
>> - if (p->slots[0] == 0) {
>> - ret = btrfs_prev_leaf(root, p);
>> - if (ret < 0)
>> - return ret;
>> - if (!ret) {
>> - if (p->slots[0] == btrfs_header_nritems(p->nodes[0]))
>> - p->slots[0]--;
>> - return 0;
>> - }
>> - return 1;
>> - } else {
>> - p->slots[0]--;
>> - }
>> + if (p->slots[0] == 0)
>> + return btrfs_prev_leaf(root, p);
>> +
>> + p->slots[0]--;
>> }
>> return 0;
>> }
>> @@ -4957,26 +4909,19 @@ int btrfs_previous_item(struct btrfs_root *root,
>> int type)
>> {
>> struct btrfs_key found_key;
>> - struct extent_buffer *leaf;
>> - u32 nritems;
>> int ret;
>>
>> while (1) {
>> if (path->slots[0] == 0) {
>> ret = btrfs_prev_leaf(root, path);
>> - if (ret != 0)
>> + if (ret)
>> return ret;
>> - } else {
>> - path->slots[0]--;
>> - }
>> - leaf = path->nodes[0];
>> - nritems = btrfs_header_nritems(leaf);
>> - if (nritems == 0)
>> - return 1;
>> - if (path->slots[0] == nritems)
>> + } else
>> path->slots[0]--;
>>
>> - btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
>> + ASSERT(path->slots[0] < btrfs_header_nritems(path->nodes[0]));
>> +
>> + btrfs_item_key_to_cpu(path->nodes[0], &found_key, path->slots[0]);
>> if (found_key.objectid < min_objectid)
>> break;
>> if (found_key.type == type)
>> @@ -4998,26 +4943,19 @@ int btrfs_previous_extent_item(struct btrfs_root *root,
>> struct btrfs_path *path, u64 min_objectid)
>> {
>> struct btrfs_key found_key;
>> - struct extent_buffer *leaf;
>> - u32 nritems;
>> int ret;
>>
>> while (1) {
>> if (path->slots[0] == 0) {
>> ret = btrfs_prev_leaf(root, path);
>> - if (ret != 0)
>> + if (ret)
>> return ret;
>> - } else {
>> - path->slots[0]--;
>> - }
>> - leaf = path->nodes[0];
>> - nritems = btrfs_header_nritems(leaf);
>> - if (nritems == 0)
>> - return 1;
>> - if (path->slots[0] == nritems)
>> + } else
>> path->slots[0]--;
>>
>> - btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
>> + ASSERT(path->slots[0] < btrfs_header_nritems(path->nodes[0]));
>> +
>> + btrfs_item_key_to_cpu(path->nodes[0], &found_key, path->slots[0]);
>> if (found_key.objectid < min_objectid)
>> break;
>> if (found_key.type == BTRFS_EXTENT_ITEM_KEY ||
>> --
>> 2.51.2
>>
>>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 4/4] btrfs: ctree: cleanup btrfs_prev_leaf()
2025-12-09 12:05 ` Filipe Manana
2025-12-09 12:27 ` Sun Yangkai
@ 2025-12-09 13:57 ` Sun Yangkai
2025-12-09 14:04 ` Sun Yangkai
1 sibling, 1 reply; 13+ messages in thread
From: Sun Yangkai @ 2025-12-09 13:57 UTC (permalink / raw)
To: Filipe Manana; +Cc: linux-btrfs
在 2025/12/9 20:05, Filipe Manana 写道:
> On Tue, Dec 9, 2025 at 3:38 AM Sun YangKai <sunk67188@gmail.com> wrote:
>>
>> There's a common parttern in callers of btrfs_prev_leaf:
>> p->slots[0]-- if p->slots[0] points to a slot with invalid item(nritem).
>>
>> So just make btrfs_prev_leaf() ensure that path->slots[0] points to a
>> valid slot and cleanup its over complex logic.
>>
>> Reading and comparing keys in btrfs_prev_leaf() is unnecessary because
>> when got a ret>0 from btrfs_search_slot(), slots[0] points to where we
>> should insert the key.
>
> Hell no! path->slots[0] can end up pointing to the original key, which is what
> should be the location for the computed previous key, and the
> comments there explain how that can happen.
>
>> So just slots[0]-- is enough to get the previous
>> item.
>
> All that logic in btrfs_prev_leaf() is necessary.
>
> We release the path and then do a btrfs_search_slot() for the computed
> previous key, but after the release and before the search, the
> structure of the tree may have changed, keys moved between leaves, new
> keys added, previous keys removed, and so on.
>
> I left all such cases commented in detail in btrfs_prev_leaf() for a reason...
>
> Removing that logic is just wrong and there's no explanation here for it.
>
> Thanks.
>
Oh, after go deeper into git history, I think I got your point.
I guess you mean:
1) check the key at p->slots[0] is lower than original key(or <= calculated
previous key) to make sure we're in a previous leaf;
2) before 1), we should check that if p is pointing to the original key, we
should `p->slots[0]--;` so we'll not return the original item.
I think now I totally get your point. And I still think my way is more simple
and clear because we don't bother comparing keys and only need to deal with two
different cases when we got ret==1 from btrfs_search_slot():
1) when p->slots[0] == 0, we're searching the lowest key in the whole tree. So
there's no prev leaf. Just return 1;
2) when p->slots[0] != 0, p is pointing to the original item, or other item if
the original one was removed. But this doesn't matter because after
p->slots[0]--, we'll get what we want: an item with a key lower than the
original one(of course also lower than the computed previous key). We get what
we want, return 0.
I hope I've make myself clear.
Thanks.
>
>
>>
>> And then remove the related logic and cleanup the callers.
>>
>> ASSERT(path->slots[0] < btrfs_header_nritems(path->nodes[0]))
>> is enough to make sure that nritems != 0 and slots[0] points to a valid
>> btrfs_item.
>>
>> And getting a `nritems==0` when btrfs_prev_leaf() returns 0 is a logic
>> error because btrfs_pref_leaf() should always
>>
>> 1. either find a non-empty leaf
>> 2. or return 1
>>
>> So we can use ASSERT here.
>>
>> No functional changes.
>>
>> Signed-off-by: Sun YangKai <sunk67188@gmail.com>
>> ---
>> fs/btrfs/ctree.c | 100 +++++++++--------------------------------------
>> 1 file changed, 19 insertions(+), 81 deletions(-)
>>
>> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
>> index bb886f9508e2..07e6433cde61 100644
>> --- a/fs/btrfs/ctree.c
>> +++ b/fs/btrfs/ctree.c
>> @@ -2365,12 +2365,9 @@ int btrfs_search_old_slot(struct btrfs_root *root, const struct btrfs_key *key,
>> static int btrfs_prev_leaf(struct btrfs_root *root, struct btrfs_path *path)
>> {
>> struct btrfs_key key;
>> - struct btrfs_key orig_key;
>> - struct btrfs_disk_key found_key;
>> int ret;
>>
>> btrfs_item_key_to_cpu(path->nodes[0], &key, 0);
>> - orig_key = key;
>>
>> if (key.offset > 0) {
>> key.offset--;
>> @@ -2390,48 +2387,12 @@ static int btrfs_prev_leaf(struct btrfs_root *root, struct btrfs_path *path)
>> if (ret <= 0)
>> return ret;
>>
>> - /*
>> - * Previous key not found. Even if we were at slot 0 of the leaf we had
>> - * before releasing the path and calling btrfs_search_slot(), we now may
>> - * be in a slot pointing to the same original key - this can happen if
>> - * after we released the path, one of more items were moved from a
>> - * sibling leaf into the front of the leaf we had due to an insertion
>> - * (see push_leaf_right()).
>> - * If we hit this case and our slot is > 0 and just decrement the slot
>> - * so that the caller does not process the same key again, which may or
>> - * may not break the caller, depending on its logic.
>> - */
>> - if (path->slots[0] < btrfs_header_nritems(path->nodes[0])) {
>> - btrfs_item_key(path->nodes[0], &found_key, path->slots[0]);
>> - ret = btrfs_comp_keys(&found_key, &orig_key);
>> - if (ret == 0) {
>> - if (path->slots[0] > 0) {
>> - path->slots[0]--;
>> - return 0;
>> - }
>> - /*
>> - * At slot 0, same key as before, it means orig_key is
>> - * the lowest, leftmost, key in the tree. We're done.
>> - */
>> - return 1;
>> - }
>> - }
>> + /* There's no smaller keys in the whole tree. */
>> + if (path->slots[0] == 0)
>> + return 1;
>>
>> - btrfs_item_key(path->nodes[0], &found_key, 0);
>> - ret = btrfs_comp_keys(&found_key, &key);
>> - /*
>> - * We might have had an item with the previous key in the tree right
>> - * before we released our path. And after we released our path, that
>> - * item might have been pushed to the first slot (0) of the leaf we
>> - * were holding due to a tree balance. Alternatively, an item with the
>> - * previous key can exist as the only element of a leaf (big fat item).
>> - * Therefore account for these 2 cases, so that our callers (like
>> - * btrfs_previous_item) don't miss an existing item with a key matching
>> - * the previous key we computed above.
>> - */
>> - if (ret <= 0)
>> - return 0;
>> - return 1;
>> + path->slots[0]--;
>> + return 0;
>> }
>>
>> /*
>> @@ -2461,19 +2422,10 @@ int btrfs_search_slot_for_read(struct btrfs_root *root,
>> if (p->slots[0] >= btrfs_header_nritems(p->nodes[0]))
>> return btrfs_next_leaf(root, p);
>> } else {
>> - if (p->slots[0] == 0) {
>> - ret = btrfs_prev_leaf(root, p);
>> - if (ret < 0)
>> - return ret;
>> - if (!ret) {
>> - if (p->slots[0] == btrfs_header_nritems(p->nodes[0]))
>> - p->slots[0]--;
>> - return 0;
>> - }
>> - return 1;
>> - } else {
>> - p->slots[0]--;
>> - }
>> + if (p->slots[0] == 0)
>> + return btrfs_prev_leaf(root, p);
>> +
>> + p->slots[0]--;
>> }
>> return 0;
>> }
>> @@ -4957,26 +4909,19 @@ int btrfs_previous_item(struct btrfs_root *root,
>> int type)
>> {
>> struct btrfs_key found_key;
>> - struct extent_buffer *leaf;
>> - u32 nritems;
>> int ret;
>>
>> while (1) {
>> if (path->slots[0] == 0) {
>> ret = btrfs_prev_leaf(root, path);
>> - if (ret != 0)
>> + if (ret)
>> return ret;
>> - } else {
>> - path->slots[0]--;
>> - }
>> - leaf = path->nodes[0];
>> - nritems = btrfs_header_nritems(leaf);
>> - if (nritems == 0)
>> - return 1;
>> - if (path->slots[0] == nritems)
>> + } else
>> path->slots[0]--;
>>
>> - btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
>> + ASSERT(path->slots[0] < btrfs_header_nritems(path->nodes[0]));
>> +
>> + btrfs_item_key_to_cpu(path->nodes[0], &found_key, path->slots[0]);
>> if (found_key.objectid < min_objectid)
>> break;
>> if (found_key.type == type)
>> @@ -4998,26 +4943,19 @@ int btrfs_previous_extent_item(struct btrfs_root *root,
>> struct btrfs_path *path, u64 min_objectid)
>> {
>> struct btrfs_key found_key;
>> - struct extent_buffer *leaf;
>> - u32 nritems;
>> int ret;
>>
>> while (1) {
>> if (path->slots[0] == 0) {
>> ret = btrfs_prev_leaf(root, path);
>> - if (ret != 0)
>> + if (ret)
>> return ret;
>> - } else {
>> - path->slots[0]--;
>> - }
>> - leaf = path->nodes[0];
>> - nritems = btrfs_header_nritems(leaf);
>> - if (nritems == 0)
>> - return 1;
>> - if (path->slots[0] == nritems)
>> + } else
>> path->slots[0]--;
>>
>> - btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
>> + ASSERT(path->slots[0] < btrfs_header_nritems(path->nodes[0]));
>> +
>> + btrfs_item_key_to_cpu(path->nodes[0], &found_key, path->slots[0]);
>> if (found_key.objectid < min_objectid)
>> break;
>> if (found_key.type == BTRFS_EXTENT_ITEM_KEY ||
>> --
>> 2.51.2
>>
>>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 4/4] btrfs: ctree: cleanup btrfs_prev_leaf()
2025-12-09 13:57 ` Sun Yangkai
@ 2025-12-09 14:04 ` Sun Yangkai
0 siblings, 0 replies; 13+ messages in thread
From: Sun Yangkai @ 2025-12-09 14:04 UTC (permalink / raw)
To: Filipe Manana; +Cc: linux-btrfs
在 2025/12/9 21:57, Sun Yangkai 写道:
>
>
> 在 2025/12/9 20:05, Filipe Manana 写道:
>> On Tue, Dec 9, 2025 at 3:38 AM Sun YangKai <sunk67188@gmail.com> wrote:
>>>
>>> There's a common parttern in callers of btrfs_prev_leaf:
>>> p->slots[0]-- if p->slots[0] points to a slot with invalid item(nritem).
>>>
>>> So just make btrfs_prev_leaf() ensure that path->slots[0] points to a
>>> valid slot and cleanup its over complex logic.
>>>
>>> Reading and comparing keys in btrfs_prev_leaf() is unnecessary because
>>> when got a ret>0 from btrfs_search_slot(), slots[0] points to where we
>>> should insert the key.
>>
>> Hell no! path->slots[0] can end up pointing to the original key, which is what
>> should be the location for the computed previous key, and the
>> comments there explain how that can happen.
>>
>>> So just slots[0]-- is enough to get the previous
>>> item.
>>
>> All that logic in btrfs_prev_leaf() is necessary.
>>
>> We release the path and then do a btrfs_search_slot() for the computed
>> previous key, but after the release and before the search, the
>> structure of the tree may have changed, keys moved between leaves, new
>> keys added, previous keys removed, and so on.
>>
>> I left all such cases commented in detail in btrfs_prev_leaf() for a reason...
>>
>> Removing that logic is just wrong and there's no explanation here for it.
>>
>> Thanks.
>>
>
> Oh, after go deeper into git history, I think I got your point.
>
> I guess you mean:
>
> 1) check the key at p->slots[0] is lower than original key(or <= calculated
Sorry, I mean the key at p->nodes[0] with slot==0 here
> previous key) to make sure we're in a previous leaf;
> 2) before 1), we should check that if p is pointing to the original key, we
> should `p->slots[0]--;` so we'll not return the original item.
>
> I think now I totally get your point. And I still think my way is more simple
> and clear because we don't bother comparing keys and only need to deal with two
> different cases when we got ret==1 from btrfs_search_slot():
>
> 1) when p->slots[0] == 0, we're searching the lowest key in the whole tree. So
> there's no prev leaf. Just return 1;
> 2) when p->slots[0] != 0, p is pointing to the original item, or other item if
> the original one was removed. But this doesn't matter because after
> p->slots[0]--, we'll get what we want: an item with a key lower than the
> original one(of course also lower than the computed previous key). We get what
> we want, return 0.
>
> I hope I've make myself clear.
>
> Thanks.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 4/4] btrfs: ctree: cleanup btrfs_prev_leaf()
2025-12-09 12:27 ` Sun Yangkai
@ 2026-02-07 8:50 ` Sun YangKai
2026-02-07 10:18 ` Qu Wenruo
0 siblings, 1 reply; 13+ messages in thread
From: Sun YangKai @ 2026-02-07 8:50 UTC (permalink / raw)
To: Filipe Manana; +Cc: linux-btrfs
On 2025/12/9 20:27, Sun Yangkai wrote:
>
>
> 在 2025/12/9 20:05, Filipe Manana 写道:
>> On Tue, Dec 9, 2025 at 3:38 AM Sun YangKai <sunk67188@gmail.com> wrote:
>>>
>>> There's a common parttern in callers of btrfs_prev_leaf:
>>> p->slots[0]-- if p->slots[0] points to a slot with invalid item(nritem).
>>>
>>> So just make btrfs_prev_leaf() ensure that path->slots[0] points to a
>>> valid slot and cleanup its over complex logic.
>>>
>>> Reading and comparing keys in btrfs_prev_leaf() is unnecessary because
>>> when got a ret>0 from btrfs_search_slot(), slots[0] points to where we
>>> should insert the key.
>>
>> Hell no! path->slots[0] can end up pointing to the original key, which is what
>> should be the location for the computed previous key, and the
>> comments there explain how that can happen.
>>
>>> So just slots[0]-- is enough to get the previous
>>> item.
>>
>> All that logic in btrfs_prev_leaf() is necessary.
>>
>> We release the path and then do a btrfs_search_slot() for the computed
>> previous key, but after the release and before the search, the
>> structure of the tree may have changed, keys moved between leaves, new
>> keys added, previous keys removed, and so on.
>
> Thanks for your reply. Here's my thoughts about this:
>
> My assumption is that there's not a key between the computed previous key and
> the original key. So...
>
> 1) When searching for the computed previous key, we may get ret==1 and
> p->slots[0] points to the original key. In this case,
>
> if (p->slots[0] == 0) // we're the lowest key in the tree.
> return 1;
>
> p->slots[0]--; // move to the previous item.
> return 0;
>
> is exactly what we want if I understand it correctly. I don't understand why
> it's a special case.
>
> 2) And if there's an item matches the computed previous key, we will get ret==0
> from btrfs_search_slot() and we will early return after calling
> btrfs_search_slot(). If there's no such an item, then we'll never get an item
> whose key is lower than the key we give to btrfs_search_slot().
> So the second comment block is also not a special case.
>
> These two cases are not related with how the items moved, added or deleted
> before we call btrfs_search_slot().
>
> Please correct me if I got anything wrong.
>
> Thanks.
After reviewing this patch several times, I cannot find anything wrong
with it. I'd glad to learn a special case which will break the new code.
My assumption is that let's say we have leaf A with key range [100, 180]
and leaf B with key range [200, ...] at search time (and we don't care
how those keys distribute before we release path), when searching for
key 199, we'll always get to leaf A and pointing to the position next to
the last item of leaf A. Please correct me if my assumption is wrong :)
Let's see the origin code:
if (path->slots[0] < btrfs_header_nritems(path->nodes[0])) {
btrfs_item_key(path->nodes[0], &found_key, path->slots[0]);
ret = btrfs_comp_keys(&found_key, &orig_key);
if (ret == 0) {
if (path->slots[0] > 0) {
path->slots[0]--;
return 0; // case 1
}
return 1; // case 2
}
}
btrfs_item_key(path->nodes[0], &found_key, 0);
ret = btrfs_comp_keys(&found_key, &key);
if (ret <= 0)
return 0; // case 3
return 1; // case 4
And the new code:
if (path->slots[0] == 0)
return 1; // case A
path->slots[0]--;
return 0; // case B
For all the origin return branches:
- Case 1: it will go to case B now, the same behavior;
- Case 2: it will go to case A now, the same behavior;
- Case 3: if ret == 0, then the found key matches the key,
btrfs_search_slot will return 0 and we'll not get to case 3.
So ret must less than 0, and the key of the item at slot 0 is
less than the search key so path->slots[0] is greater than
0 and we'll go to case B now. The behavior is different,
but we make sure pointing to the previous item in new code.
- Case 4: the key of item at slot 0 is larger than the key we give to
btrfs_search_slot(). In this case path->slot[0] must be 0 and
we'll go to case A now.
The original code's complexity (comparing keys) was an attempt to verify
if we landed exactly where we started, but this is unnecessary and I
don't think the previous search result and the release path things
matter because we're starting a brand new tree search and
btrfs_search_slot is the source of truth for the current tree structure.
As long as we treat all the things as a brand new search, looking for
the item with key=(origin_key - 1) or the previous item if it doesn't
exist, it will be very simple and clear :)
Thanks.
>>
>> I left all such cases commented in detail in btrfs_prev_leaf() for a reason...
>>
>> Removing that logic is just wrong and there's no explanation here for it.
>>
>> Thanks.
>>
>>
>>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 4/4] btrfs: ctree: cleanup btrfs_prev_leaf()
2026-02-07 8:50 ` Sun YangKai
@ 2026-02-07 10:18 ` Qu Wenruo
2026-02-07 15:07 ` Sun YangKai
0 siblings, 1 reply; 13+ messages in thread
From: Qu Wenruo @ 2026-02-07 10:18 UTC (permalink / raw)
To: Sun YangKai, Filipe Manana; +Cc: linux-btrfs
在 2026/2/7 19:20, Sun YangKai 写道:
>
>
> On 2025/12/9 20:27, Sun Yangkai wrote:
>>
>>
>> 在 2025/12/9 20:05, Filipe Manana 写道:
>>> On Tue, Dec 9, 2025 at 3:38 AM Sun YangKai <sunk67188@gmail.com> wrote:
>>>>
>>>> There's a common parttern in callers of btrfs_prev_leaf:
>>>> p->slots[0]-- if p->slots[0] points to a slot with invalid
>>>> item(nritem).
>>>>
>>>> So just make btrfs_prev_leaf() ensure that path->slots[0] points to a
>>>> valid slot and cleanup its over complex logic.
>>>>
>>>> Reading and comparing keys in btrfs_prev_leaf() is unnecessary because
>>>> when got a ret>0 from btrfs_search_slot(), slots[0] points to where we
>>>> should insert the key.
>>>
>>> Hell no! path->slots[0] can end up pointing to the original key,
>>> which is what
>>> should be the location for the computed previous key, and the
>>> comments there explain how that can happen.
>>>
>>>> So just slots[0]-- is enough to get the previous
>>>> item.
>>>
>>> All that logic in btrfs_prev_leaf() is necessary.
>>>
>>> We release the path and then do a btrfs_search_slot() for the computed
>>> previous key, but after the release and before the search, the
>>> structure of the tree may have changed, keys moved between leaves, new
>>> keys added, previous keys removed, and so on.
>>
>> Thanks for your reply. Here's my thoughts about this:
>>
>> My assumption is that there's not a key between the computed previous
>> key and
>> the original key. So...
>>
>> 1) When searching for the computed previous key, we may get ret==1 and
>> p->slots[0] points to the original key. In this case,
>>
>> if (p->slots[0] == 0) // we're the lowest key in the tree.
>> return 1;
>>
>> p->slots[0]--; // move to the previous item.
>> return 0;
>>
>> is exactly what we want if I understand it correctly. I don't
>> understand why
>> it's a special case.
>>
>> 2) And if there's an item matches the computed previous key, we will
>> get ret==0
>> from btrfs_search_slot() and we will early return after calling
>> btrfs_search_slot(). If there's no such an item, then we'll never get
>> an item
>> whose key is lower than the key we give to btrfs_search_slot().
>> So the second comment block is also not a special case.
>>
>> These two cases are not related with how the items moved, added or
>> deleted
>> before we call btrfs_search_slot().
>>
>> Please correct me if I got anything wrong.
>>
>> Thanks.
>
> After reviewing this patch several times, I cannot find anything wrong
> with it. I'd glad to learn a special case which will break the new code.
Check the following corner case.
Originally the leaf has the following first key
Slot 0 key X (X is a u136 value, combined objectid/type/offset)
Now we're search using key X - 1, and released the path.
COW happened, deleted everything in the tree.
Now btrfs_search_slot() returned 1, pointing the empty leaf with:
Slot 0 key 0
The old code will return 0 (went through the btrfs_item_key() and
btrfs_comp_keys() check), but the new code will return 1 (slots[0] == 0
check only).
Although I'm not sure if this makes much sense to the caller though.
And I have to admit, despite the above empty tree corner case, the new
code seems correct.
Firstly if we hit an exact match for (X - 1), then we immediately return
0, no need to bother exact match anymore.
The remaining cases are for non-exact match.
If there is a key smaller than X - 1, we will be at the next slot of
that key, thus slots[0] should not be 0.
The only case we hit slots[0] == 0 for non-exact match, is when that key
at slot 0 is already the smallest in the whole tree.
Thanks,
Qu
>
> My assumption is that let's say we have leaf A with key range [100, 180]
> and leaf B with key range [200, ...] at search time (and we don't care
> how those keys distribute before we release path), when searching for
> key 199, we'll always get to leaf A and pointing to the position next to
> the last item of leaf A. Please correct me if my assumption is wrong :)
>
> Let's see the origin code:
>
> if (path->slots[0] < btrfs_header_nritems(path->nodes[0])) {
> btrfs_item_key(path->nodes[0], &found_key, path->slots[0]);
> ret = btrfs_comp_keys(&found_key, &orig_key);
> if (ret == 0) {
> if (path->slots[0] > 0) {
> path->slots[0]--;
> return 0; // case 1
> }
> return 1; // case 2
> }
> }
>
> btrfs_item_key(path->nodes[0], &found_key, 0);
> ret = btrfs_comp_keys(&found_key, &key);
> if (ret <= 0)
> return 0; // case 3
> return 1; // case 4
>
> And the new code:
>
> if (path->slots[0] == 0)
> return 1; // case A
>
> path->slots[0]--;
> return 0; // case B
>
> For all the origin return branches:
>
> - Case 1: it will go to case B now, the same behavior;
> - Case 2: it will go to case A now, the same behavior;
> - Case 3: if ret == 0, then the found key matches the key,
> btrfs_search_slot will return 0 and we'll not get to case 3.
> So ret must less than 0, and the key of the item at slot 0 is
> less than the search key so path->slots[0] is greater than
> 0 and we'll go to case B now. The behavior is different,
> but we make sure pointing to the previous item in new code.
> - Case 4: the key of item at slot 0 is larger than the key we give to
> btrfs_search_slot(). In this case path->slot[0] must be 0 and
> we'll go to case A now.
>
> The original code's complexity (comparing keys) was an attempt to verify
> if we landed exactly where we started, but this is unnecessary and I
> don't think the previous search result and the release path things
> matter because we're starting a brand new tree search and
> btrfs_search_slot is the source of truth for the current tree structure.
> As long as we treat all the things as a brand new search, looking for
> the item with key=(origin_key - 1) or the previous item if it doesn't
> exist, it will be very simple and clear :)
>
> Thanks.
>
>>>
>>> I left all such cases commented in detail in btrfs_prev_leaf() for a
>>> reason...
>>>
>>> Removing that logic is just wrong and there's no explanation here for
>>> it.
>>>
>>> Thanks.
>>>
>>>
>>>
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 4/4] btrfs: ctree: cleanup btrfs_prev_leaf()
2026-02-07 10:18 ` Qu Wenruo
@ 2026-02-07 15:07 ` Sun YangKai
0 siblings, 0 replies; 13+ messages in thread
From: Sun YangKai @ 2026-02-07 15:07 UTC (permalink / raw)
To: Qu Wenruo, Filipe Manana; +Cc: linux-btrfs
On 2026/2/7 18:18, Qu Wenruo wrote:
>
>
> 在 2026/2/7 19:20, Sun YangKai 写道:
>>
>>
>> On 2025/12/9 20:27, Sun Yangkai wrote:
>>>
>>>
>>> 在 2025/12/9 20:05, Filipe Manana 写道:
>>>> On Tue, Dec 9, 2025 at 3:38 AM Sun YangKai <sunk67188@gmail.com> wrote:
>>>>>
>>>>> There's a common parttern in callers of btrfs_prev_leaf:
>>>>> p->slots[0]-- if p->slots[0] points to a slot with invalid
>>>>> item(nritem).
>>>>>
>>>>> So just make btrfs_prev_leaf() ensure that path->slots[0] points to a
>>>>> valid slot and cleanup its over complex logic.
>>>>>
>>>>> Reading and comparing keys in btrfs_prev_leaf() is unnecessary because
>>>>> when got a ret>0 from btrfs_search_slot(), slots[0] points to where we
>>>>> should insert the key.
>>>>
>>>> Hell no! path->slots[0] can end up pointing to the original key,
>>>> which is what
>>>> should be the location for the computed previous key, and the
>>>> comments there explain how that can happen.
>>>>
>>>>> So just slots[0]-- is enough to get the previous
>>>>> item.
>>>>
>>>> All that logic in btrfs_prev_leaf() is necessary.
>>>>
>>>> We release the path and then do a btrfs_search_slot() for the computed
>>>> previous key, but after the release and before the search, the
>>>> structure of the tree may have changed, keys moved between leaves, new
>>>> keys added, previous keys removed, and so on.
>>>
>>> Thanks for your reply. Here's my thoughts about this:
>>>
>>> My assumption is that there's not a key between the computed previous
>>> key and
>>> the original key. So...
>>>
>>> 1) When searching for the computed previous key, we may get ret==1 and
>>> p->slots[0] points to the original key. In this case,
>>>
>>> if (p->slots[0] == 0) // we're the lowest key in the tree.
>>> return 1;
>>>
>>> p->slots[0]--; // move to the previous item.
>>> return 0;
>>>
>>> is exactly what we want if I understand it correctly. I don't
>>> understand why
>>> it's a special case.
>>>
>>> 2) And if there's an item matches the computed previous key, we will
>>> get ret==0
>>> from btrfs_search_slot() and we will early return after calling
>>> btrfs_search_slot(). If there's no such an item, then we'll never get
>>> an item
>>> whose key is lower than the key we give to btrfs_search_slot().
>>> So the second comment block is also not a special case.
>>>
>>> These two cases are not related with how the items moved, added or
>>> deleted
>>> before we call btrfs_search_slot().
>>>
>>> Please correct me if I got anything wrong.
>>>
>>> Thanks.
>>
>> After reviewing this patch several times, I cannot find anything wrong
>> with it. I'd glad to learn a special case which will break the new code.
>
> Check the following corner case.
>
> Originally the leaf has the following first key
>
> Slot 0 key X (X is a u136 value, combined objectid/type/offset)
>
> Now we're search using key X - 1, and released the path.
> COW happened, deleted everything in the tree.
>
> Now btrfs_search_slot() returned 1, pointing the empty leaf with:
>
> Slot 0 key 0
Thanks a lot for providing this case which I've never thought about. It
seems the old version will try reading key at slot 0 while there's no
item. We'll get key 0 and it's safe because of we always set all the
unused space to zero in a leaf? Or we may get the old uncleared key
originally at slot 0?
> The old code will return 0 (went through the btrfs_item_key() and
> btrfs_comp_keys() check), but the new code will return 1 (slots[0] == 0
> check only).
Although I'm not sure if this case could happen in reality, but I think
return 1 is the expected behavior.
> Although I'm not sure if this makes much sense to the caller though.
>
>
> And I have to admit, despite the above empty tree corner case, the new
> code seems correct.
And much simpler :D
> Firstly if we hit an exact match for (X - 1), then we immediately return
> 0, no need to bother exact match anymore.
>
> The remaining cases are for non-exact match.
>
> If there is a key smaller than X - 1, we will be at the next slot of
> that key, thus slots[0] should not be 0.
>
> The only case we hit slots[0] == 0 for non-exact match, is when that key
> at slot 0 is already the smallest in the whole tree.
Thanks a lot for making this clear!
And I forgot that I've sent a patch v2:
https://lore.kernel.org/linux-btrfs/20251211072442.15920-2-sunk67188@gmail.com/
Maybe we can move future discussions there.
Thanks,
Sun YangKai
> Thanks,
> Qu
>
>>
>> My assumption is that let's say we have leaf A with key range [100,
>> 180] and leaf B with key range [200, ...] at search time (and we don't
>> care how those keys distribute before we release path), when searching
>> for key 199, we'll always get to leaf A and pointing to the position
>> next to the last item of leaf A. Please correct me if my assumption is
>> wrong :)
>>
>> Let's see the origin code:
>>
>> if (path->slots[0] < btrfs_header_nritems(path->nodes[0])) {
>> btrfs_item_key(path->nodes[0], &found_key, path->slots[0]);
>> ret = btrfs_comp_keys(&found_key, &orig_key);
>> if (ret == 0) {
>> if (path->slots[0] > 0) {
>> path->slots[0]--;
>> return 0; // case 1
>> }
>> return 1; // case 2
>> }
>> }
>>
>> btrfs_item_key(path->nodes[0], &found_key, 0);
>> ret = btrfs_comp_keys(&found_key, &key);
>> if (ret <= 0)
>> return 0; // case 3
>> return 1; // case 4
>>
>> And the new code:
>>
>> if (path->slots[0] == 0)
>> return 1; // case A
>>
>> path->slots[0]--;
>> return 0; // case B
>>
>> For all the origin return branches:
>>
>> - Case 1: it will go to case B now, the same behavior;
>> - Case 2: it will go to case A now, the same behavior;
>> - Case 3: if ret == 0, then the found key matches the key,
>> btrfs_search_slot will return 0 and we'll not get to case 3.
>> So ret must less than 0, and the key of the item at slot 0 is
>> less than the search key so path->slots[0] is greater than
>> 0 and we'll go to case B now. The behavior is different,
>> but we make sure pointing to the previous item in new code.
>> - Case 4: the key of item at slot 0 is larger than the key we give to
>> btrfs_search_slot(). In this case path->slot[0] must be 0 and
>> we'll go to case A now.
>>
>> The original code's complexity (comparing keys) was an attempt to
>> verify if we landed exactly where we started, but this is unnecessary
>> and I don't think the previous search result and the release path
>> things matter because we're starting a brand new tree search and
>> btrfs_search_slot is the source of truth for the current tree structure.
>> As long as we treat all the things as a brand new search, looking for
>> the item with key=(origin_key - 1) or the previous item if it doesn't
>> exist, it will be very simple and clear :)
>>
>> Thanks.
>>
>>>>
>>>> I left all such cases commented in detail in btrfs_prev_leaf() for a
>>>> reason...
>>>>
>>>> Removing that logic is just wrong and there's no explanation here
>>>> for it.
>>>>
>>>> Thanks.
>>>>
>>>>
>>>>
>>
>
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2026-02-07 15:07 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-12-09 3:27 [PATCH 0/4] btrfs: some cleanups for two ctree functions Sun YangKai
2025-12-09 3:27 ` [PATCH 1/4] btrfs: don't set @return_any for btrfs_search_slot_for_read in btrfs_read_qgroup_config Sun YangKai
2025-12-09 3:27 ` [PATCH 2/4] btrfs: don't set return_any @return_any for btrfs_search_slot_for_read in get_last_extent() Sun YangKai
2025-12-09 3:27 ` [PATCH 3/4] btrfs: cleanup btrfs_search_slot_for_read() Sun YangKai
2025-12-09 3:27 ` [PATCH 4/4] btrfs: ctree: cleanup btrfs_prev_leaf() Sun YangKai
2025-12-09 4:05 ` Sun Yangkai
2025-12-09 12:05 ` Filipe Manana
2025-12-09 12:27 ` Sun Yangkai
2026-02-07 8:50 ` Sun YangKai
2026-02-07 10:18 ` Qu Wenruo
2026-02-07 15:07 ` Sun YangKai
2025-12-09 13:57 ` Sun Yangkai
2025-12-09 14:04 ` Sun Yangkai
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox