From: Qu Wenruo <wqu@suse.com>
To: Sun YangKai <sunk67188@gmail.com>, linux-btrfs@vger.kernel.org
Cc: fdmanana@kernel.org
Subject: Re: [PATCH v2 4/4] btrfs: ctree: cleanup btrfs_prev_leaf()
Date: Sun, 8 Feb 2026 09:39:30 +1030 [thread overview]
Message-ID: <bf7c477d-ce4c-4d6e-9538-1baa33e39a02@suse.com> (raw)
In-Reply-To: <20251211072442.15920-6-sunk67188@gmail.com>
在 2025/12/11 17:52, Sun YangKai 写道:
> There's a common parttern in callers of btrfs_prev_leaf:
> p->slots[0]-- if p->slots[0] points to a slot with invalid item(nritem).
>
> So just make btrfs_prev_leaf() ensure that path->slots[0] points to a
> valid slot and cleanup its over complex logic.
>
> And then remove the related logic and cleanup the callers.
>
> This will make things much simpler.
>
> No functional changes.
>
> A. Details about changes in callers:
>
> ASSERT(path->slots[0] < btrfs_header_nritems(path->nodes[0])) in callers
> is enough to make sure that nritems != 0 and slots[0] points to a valid
> btrfs_item.
I don't think the change is safe.
There are callers doing proper checks for empty trees, e.g.
btrfs_previous_extent_item() and btrfs_previous_item(), now you will
just ignore that for kernels without CONFIG_BTRFS_ASSERT, or crash the
kernel if CONFIG_BTRFS_ASSERT is selected.
If you're just cleaning up btrfs_prev_leaf(), then you should keep the
callers the same without changing their behaviors.
I think there may be some corner case races when tree deleting and some
other work are happening at the same time.
>
> And getting a `nritems==0` when btrfs_prev_leaf() returns 0 is a logic
> error because btrfs_pref_leaf() should always
>
> 1. either find a non-empty leaf
Nope, there is no such guarantee in the first place.
btrfs_prev_leaf() will unlock the path, thus all kinds of modification
can happen between btrfs_release_path() and btrfs_search_slot().
This assumption is just wrong.
> 2. or return 1
>
> So we can use ASSERT safely here.
>
> B. Details about cleanup of btrfs_prev_leaf().
>
> The previous implementation works like this:
>
> 0) Get a previous key by "dec by 1" of the original key. Let's call it
> search key. It's obvious that search key is less than original key
> and there's no key between them.
>
> 1) Call btrfs_search_slot() with search key.
>
> 2) If we got an error or an exact match, early return.
>
> 3) If p->slots[0] points to the original item, p->slots[0]-- to make sure
> that we will not return the same item again. This may happen because
> there might be some tree balancing happened so the original item is no
> longer at slot 0.
>
> 4) Check if the key of the item at slot 0 is (less than the original key
> / less than or equal to search key) to verify if we got a previous leaf.
>
> However, 3) and 4) are over complex. We only need to check if
> p->slots[0] == 0 because:
>
> 3a) If p->slots[0] == 0, there's no key less than or equal to search key
> in the tree, which means the original key is lowest in the tree. In
> this case, there's no previous leaf, we should return 1.
>
> 3b) If p->slots[0] != 0, using p->slots[0]-- is enough to get a valid
> previous item neither missing anything nor return the original item
> again because:
>
> i) p->slots[0] == nritems, which means all keys in the leaf are less
> than search key so the leaf is a previous leaf. We only need to
> p->slots[0]-- to get a valid previous item.
>
> ii) p->slots[0] < nritems, p->slots[0] points to an item whose key
> is greater than search key(probably the original item if it was
> not deleted), and p->slots[0] - 1 points to an item whose key is
> less than search key. So use p->slots[0]-- to get the previous
> item and we will neigher miss anything nor return the original
> item again. This handles the case 3) in original implementation.
>
> Signed-off-by: Sun YangKai <sunk67188@gmail.com>
> ---
> fs/btrfs/ctree.c | 99 ++++++++++--------------------------------------
> 1 file changed, 19 insertions(+), 80 deletions(-)
>
> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
> index 0a0157db0b0c..3026d956c7fb 100644
> --- a/fs/btrfs/ctree.c
> +++ b/fs/btrfs/ctree.c
> @@ -2376,12 +2376,9 @@ int btrfs_search_old_slot(struct btrfs_root *root, const struct btrfs_key *key,
> static int btrfs_prev_leaf(struct btrfs_root *root, struct btrfs_path *path)
> {
> struct btrfs_key key;
> - struct btrfs_key orig_key;
> - struct btrfs_disk_key found_key;
> int ret;
>
> btrfs_item_key_to_cpu(path->nodes[0], &key, 0);
> - orig_key = key;
>
> if (key.offset > 0) {
> key.offset--;
> @@ -2401,48 +2398,12 @@ static int btrfs_prev_leaf(struct btrfs_root *root, struct btrfs_path *path)
> if (ret <= 0)
> return ret;
>
> - /*
> - * Previous key not found. Even if we were at slot 0 of the leaf we had
> - * before releasing the path and calling btrfs_search_slot(), we now may
> - * be in a slot pointing to the same original key - this can happen if
> - * after we released the path, one of more items were moved from a
> - * sibling leaf into the front of the leaf we had due to an insertion
> - * (see push_leaf_right()).
> - * If we hit this case and our slot is > 0 and just decrement the slot
> - * so that the caller does not process the same key again, which may or
> - * may not break the caller, depending on its logic.
> - */
Although the change is mostly fine, if you want to keep the existing
behavior, you should return 0 for empty tree to keep the old behavior.
Furthermore, you have to explain why it's safe not only in the commit
message, but also comments.
This not a small change, and very low-level. Just doing black magic is
not acceptable.
And you need something better than your commit message as comments.
In short, you should:
- Not change the callers
- Better comments
> - if (path->slots[0] < btrfs_header_nritems(path->nodes[0])) {
> - btrfs_item_key(path->nodes[0], &found_key, path->slots[0]);
> - ret = btrfs_comp_keys(&found_key, &orig_key);
> - if (ret == 0) {
> - if (path->slots[0] > 0) {
> - path->slots[0]--;
> - return 0;
> - }
> - /*
> - * At slot 0, same key as before, it means orig_key is
> - * the lowest, leftmost, key in the tree. We're done.
> - */
> - return 1;
> - }
> - }
> + /* There's no smaller keys in the whole tree. */
> + if (path->slots[0] == 0)
> + return 1;
>
> - btrfs_item_key(path->nodes[0], &found_key, 0);
> - ret = btrfs_comp_keys(&found_key, &key);
> - /*
> - * We might have had an item with the previous key in the tree right
> - * before we released our path. And after we released our path, that
> - * item might have been pushed to the first slot (0) of the leaf we
> - * were holding due to a tree balance. Alternatively, an item with the
> - * previous key can exist as the only element of a leaf (big fat item).
> - * Therefore account for these 2 cases, so that our callers (like
> - * btrfs_previous_item) don't miss an existing item with a key matching
> - * the previous key we computed above.
> - */
> - if (ret <= 0)
> - return 0;
> - return 1;
> + path->slots[0]--;
> + return 0;
> }
>
> /*
> @@ -2473,19 +2434,11 @@ int btrfs_search_slot_for_read(struct btrfs_root *root,
> if (p->slots[0] >= btrfs_header_nritems(p->nodes[0]))
> return btrfs_next_leaf(root, p);
> } else {
> - if (p->slots[0] == 0) {
> - ret = btrfs_prev_leaf(root, p);
> - if (ret < 0)
> - return ret;
> - if (!ret) {
> - if (p->slots[0] == btrfs_header_nritems(p->nodes[0]))
> - p->slots[0]--;
> - return 0;
> - }
> + /* We have no lower key in the tree. */
> + if (p->slots[0] == 0)
> return 1;
> - } else {
> - p->slots[0]--;
> - }
> +
> + p->slots[0]--;
> }
> return 0;
> }
> @@ -4969,26 +4922,19 @@ int btrfs_previous_item(struct btrfs_root *root,
> int type)
> {
> struct btrfs_key found_key;
> - struct extent_buffer *leaf;
> - u32 nritems;
> int ret;
>
> while (1) {
> if (path->slots[0] == 0) {
> ret = btrfs_prev_leaf(root, path);
> - if (ret != 0)
> + if (ret)
> return ret;
> - } else {
> - path->slots[0]--;
> - }
> - leaf = path->nodes[0];
> - nritems = btrfs_header_nritems(leaf);
> - if (nritems == 0)
> - return 1;
> - if (path->slots[0] == nritems)
> + } else
> path->slots[0]--;
>
> - btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
> + ASSERT(path->slots[0] < btrfs_header_nritems(path->nodes[0]));
> +
> + btrfs_item_key_to_cpu(path->nodes[0], &found_key, path->slots[0]);
> if (found_key.objectid < min_objectid)
> break;
> if (found_key.type == type)
> @@ -5010,26 +4956,19 @@ int btrfs_previous_extent_item(struct btrfs_root *root,
> struct btrfs_path *path, u64 min_objectid)
> {
> struct btrfs_key found_key;
> - struct extent_buffer *leaf;
> - u32 nritems;
> int ret;
>
> while (1) {
> if (path->slots[0] == 0) {
> ret = btrfs_prev_leaf(root, path);
> - if (ret != 0)
> + if (ret)
> return ret;
> - } else {
> - path->slots[0]--;
> - }
> - leaf = path->nodes[0];
> - nritems = btrfs_header_nritems(leaf);
> - if (nritems == 0)
> - return 1;
> - if (path->slots[0] == nritems)
> + } else
> path->slots[0]--;
>
> - btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
> + ASSERT(path->slots[0] < btrfs_header_nritems(path->nodes[0]));
> +
> + btrfs_item_key_to_cpu(path->nodes[0], &found_key, path->slots[0]);
> if (found_key.objectid < min_objectid)
> break;
> if (found_key.type == BTRFS_EXTENT_ITEM_KEY ||
next prev parent reply other threads:[~2026-02-07 23:09 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-12-11 7:22 [PATCH v2 0/4] btrfs: some cleanups for two ctree functions Sun YangKai
2025-12-11 7:22 ` [PATCH v2 1/4] btrfs: don't set @return_any for btrfs_search_slot_for_read in btrfs_read_qgroup_config Sun YangKai
2025-12-11 7:22 ` [PATCH v2 2/4] btrfs: don't set return_any @return_any for btrfs_search_slot_for_read in get_last_extent() Sun YangKai
2025-12-11 7:22 ` [PATCH v2 3/4] btrfs: cleanup btrfs_search_slot_for_read() Sun YangKai
2025-12-11 7:22 ` [PATCH v2 4/4] btrfs: ctree: cleanup btrfs_prev_leaf() Sun YangKai
2026-02-07 23:09 ` Qu Wenruo [this message]
2026-02-08 2:42 ` Sun YangKai
2026-02-08 3:13 ` Qu Wenruo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=bf7c477d-ce4c-4d6e-9538-1baa33e39a02@suse.com \
--to=wqu@suse.com \
--cc=fdmanana@kernel.org \
--cc=linux-btrfs@vger.kernel.org \
--cc=sunk67188@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox