* [PATCH] btrfs: fix node balancing condition in balance_level()
@ 2025-08-05 3:57 Sun YangKai
2025-08-05 18:46 ` Boris Burkov
0 siblings, 1 reply; 4+ messages in thread
From: Sun YangKai @ 2025-08-05 3:57 UTC (permalink / raw)
To: linux-btrfs; +Cc: Sun YangKai
Commit cfbb9308463f ("Btrfs: balance btree more often") intended to
trigger node balancing when node utilization drops below 50% (capacity/2)
by modifying the condition in setup_nodes_for_search(). However, an
undetected early return condition in balance_level() prevented this
behavior from taking effect.
The early return condition:
if (btrfs_header_nritems(mid) > BTRFS_NODEPTRS_PER_BLOCK(fs_info) / 4)
return 0;
caused balance_level() to abort when nodes were still more than 25% full,
effectively maintaining the original 25% threshold. This unintended
behavior persisted for over a decade. Since setup_nodes_for_search() is
the sole caller of balance_level(), remove the obsolete early return
condition to:
1. Align with the original intent of commit cfbb9308463f ("Btrfs: balance btree more often")
2. Allow proper node balancing at the 50% utilization threshold
3. Improve btree performance by more frequent node compaction
Fixes: cfbb9308463f ("Btrfs: balance btree more often")
Signed-off-by: Sun YangKai <sunk67188@gmail.com>
---
fs/btrfs/ctree.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index acd85e317564..8cc52c8b38f3 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -939,9 +939,6 @@ static noinline int balance_level(struct btrfs_trans_handle *trans,
}
return 0;
}
- if (btrfs_header_nritems(mid) >
- BTRFS_NODEPTRS_PER_BLOCK(fs_info) / 4)
- return 0;
if (pslot) {
left = btrfs_read_node_slot(parent, pslot - 1);
--
2.50.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] btrfs: fix node balancing condition in balance_level()
2025-08-05 3:57 [PATCH] btrfs: fix node balancing condition in balance_level() Sun YangKai
@ 2025-08-05 18:46 ` Boris Burkov
2025-08-06 6:07 ` Sun YangKai
2025-08-07 15:06 ` Sun YangKai
0 siblings, 2 replies; 4+ messages in thread
From: Boris Burkov @ 2025-08-05 18:46 UTC (permalink / raw)
To: Sun YangKai; +Cc: linux-btrfs
On Tue, Aug 05, 2025 at 11:57:04AM +0800, Sun YangKai wrote:
> Commit cfbb9308463f ("Btrfs: balance btree more often") intended to
> trigger node balancing when node utilization drops below 50% (capacity/2)
> by modifying the condition in setup_nodes_for_search(). However, an
> undetected early return condition in balance_level() prevented this
> behavior from taking effect.
>
> The early return condition:
> if (btrfs_header_nritems(mid) > BTRFS_NODEPTRS_PER_BLOCK(fs_info) / 4)
> return 0;
>
> caused balance_level() to abort when nodes were still more than 25% full,
> effectively maintaining the original 25% threshold. This unintended
> behavior persisted for over a decade. Since setup_nodes_for_search() is
> the sole caller of balance_level(), remove the obsolete early return
> condition to:
>
> 1. Align with the original intent of commit cfbb9308463f ("Btrfs: balance btree more often")
> 2. Allow proper node balancing at the 50% utilization threshold
> 3. Improve btree performance by more frequent node compaction
This is interesting, good catch.
However, we don't really have any evidence one way or the other which is
better. For better or worse, this logic has been around since ~2007, so
to change it I think requires more justification than the reasoning on a
faulty patch from 2009. Do you have any evidence for your #3:
> Improve btree performance by more frequent node compaction
I would advise that you run:
- fsperf for generic benchmarking.
- a targeted workload that would get compacted more now that you removed
the surprising check.
Thanks,
Boris
>
> Fixes: cfbb9308463f ("Btrfs: balance btree more often")
> Signed-off-by: Sun YangKai <sunk67188@gmail.com>
> ---
> fs/btrfs/ctree.c | 3 ---
> 1 file changed, 3 deletions(-)
>
> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
> index acd85e317564..8cc52c8b38f3 100644
> --- a/fs/btrfs/ctree.c
> +++ b/fs/btrfs/ctree.c
> @@ -939,9 +939,6 @@ static noinline int balance_level(struct btrfs_trans_handle *trans,
> }
> return 0;
> }
> - if (btrfs_header_nritems(mid) >
> - BTRFS_NODEPTRS_PER_BLOCK(fs_info) / 4)
> - return 0;
>
> if (pslot) {
> left = btrfs_read_node_slot(parent, pslot - 1);
> --
> 2.50.1
>
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] btrfs: fix node balancing condition in balance_level()
2025-08-05 18:46 ` Boris Burkov
@ 2025-08-06 6:07 ` Sun YangKai
2025-08-07 15:06 ` Sun YangKai
1 sibling, 0 replies; 4+ messages in thread
From: Sun YangKai @ 2025-08-06 6:07 UTC (permalink / raw)
To: Boris Burkov; +Cc: linux-btrfs
> This is interesting, good catch.
>
> However, we don't really have any evidence one way or the other which is
> better. For better or worse, this logic has been around since ~2007, so
> to change it I think requires more justification than the reasoning on a
>
> faulty patch from 2009. Do you have any evidence for your #3:
> > Improve btree performance by more frequent node compaction
>
> I would advise that you run:
> - fsperf for generic benchmarking.
> - a targeted workload that would get compacted more now that you removed
> the surprising check.
>
> Thanks,
> Boris
Hi Boris.
The performance might get better because we will have less nodes and even
lower trees, if we compact nodes when they are 50% full instead of 25% full.
But I've not benched yet and there's some problem in it.
I come up with a drawback. This patch will make it more frequently to
split a nearly full node into two and merge them back later again and again if
item count changes slightly and its neighbors cannot help balance some items,
which happens when:
1. left node is full or null and we are always operating the left-half part of
the mid node
2. right node is null
The leaf merge related code choose (cap/3) instead of (cap/2) to prevent
similar cases. I have no idea how often this could be triggered in real world
workload. But if it really hurts, we should try some other method to merge/
split the nodes to get better performance.
Anyway, thank you for your advice. And I'm working on fsperf.
Thanks,
Sun YangKai
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] btrfs: fix node balancing condition in balance_level()
2025-08-05 18:46 ` Boris Burkov
2025-08-06 6:07 ` Sun YangKai
@ 2025-08-07 15:06 ` Sun YangKai
1 sibling, 0 replies; 4+ messages in thread
From: Sun YangKai @ 2025-08-07 15:06 UTC (permalink / raw)
To: Boris Burkov; +Cc: linux-btrfs
Hi, Boris.
After working on node balance related code and fs_perf for days, I found that
current node balancing is not good enough with or without this patch. And I
come up with a different way to balance the nodes without adding much
complexity, which will takes me some days to implement and test it. I believe
it will be much better than the current one if I didn't mess things up.
BTW, I'm not familiar with fs benchmark, so I need some advice about how can I
get a reasonable benchmark result that can reflect the metadata operation
performance since I only have a 128G SSD can be used to benchmark, and it's
speed seems not stable enough. Currently I use kprobe to trace node balance
related functions to measure if the balance code works well.
Thanks,
Sun YangKai
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2025-08-07 15:06 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-05 3:57 [PATCH] btrfs: fix node balancing condition in balance_level() Sun YangKai
2025-08-05 18:46 ` Boris Burkov
2025-08-06 6:07 ` Sun YangKai
2025-08-07 15:06 ` Sun YangKai
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox