* [PATCH v2 0/3] btrfs: fix balance NULL derefs and chunk/bg mapping verification
@ 2026-03-14 12:37 ZhengYuan Huang
2026-03-14 12:37 ` [PATCH v2 1/3] btrfs: balance: fix null-ptr-deref in chunk_usage_filter ZhengYuan Huang
` (3 more replies)
0 siblings, 4 replies; 10+ messages in thread
From: ZhengYuan Huang @ 2026-03-14 12:37 UTC
To: dsterba, clm, idryomov
Cc: linux-btrfs, linux-kernel, baijiaju1990, r33s3n6, zzzccc427,
ZhengYuan Huang
This series fixes two NULL dereferences in btrfs balance usage filters and
the underlying mount-time verification bug that lets the corresponding
chunk/block-group inconsistency go undetected.
The balance crashes happen when metadata corruption leaves a chunk present
in the chunk tree but without a corresponding block group in the in-memory
block group cache. In that case, the usage filters call
btrfs_lookup_block_group() and dereference the returned pointer without
checking for NULL.
The first two patches add the missing NULL checks and propagate -EUCLEAN
back to userspace instead of crashing. They are split because the usage
and usage-range filters were introduced by different commits, which should
also make backporting easier, as suggested by Qu Wenruo.
The third patch fixes the root cause on the mount-time verification side.
check_chunk_block_group_mappings() is supposed to verify that every chunk
has a matching block group, but its current iteration starts with
btrfs_find_chunk_map(fs_info, 0, 1). If no chunk contains logical address
0, the lookup returns NULL immediately and the loop exits without checking
any chunk at all. As a result, the corrupted mapping can survive mount and
only crash later when balance reaches it.
This series makes btrfs reject the inconsistency earlier at mount time,
and also hardens the balance filters so the corruption is reported as
-EUCLEAN instead of triggering a NULL dereference.
Changes since v1:
- split the two balance filter fixes into separate patches
- reworked the third patch to fix the case where
check_chunk_block_group_mappings() does not actually check all chunk
mappings
[NOTE]
Some of the changelogs may repeat parts of the bug analysis, which can
make the series somewhat verbose. I did that intentionally because I was
trying to follow the usual expectation that each patch should be able to
stand on its own and explain the specific issue it fixes. In particular,
I wanted each patch to describe its own immediate cause clearly, even
where the overall trigger path overlaps with the others. If that is not
the preferred style here, I would be happy to rework the changelogs and
resend the series in a different form.
Also, in a previous reply, Qu Wenruo suggested adding a separate
chunk/block-group consistency check. After looking into that, I found
that btrfs already has a function intended for this purpose,
check_chunk_block_group_mappings(). Patch 3 is based on the observation
that this check exists, but due to its current iteration logic it can
exit without checking any chunk mappings at all.
Since I am not very familiar with all the details of btrfs internals, if
my analysis of patch 3 is flawed, or if the fix is not the right one, I
would greatly appreciate any correction or guidance, and I will revise
and resend the patch accordingly.
ZhengYuan Huang (3):
btrfs: balance: handle missing block groups in usage filter
btrfs: balance: handle missing block groups in usage range filter
btrfs: fix check_chunk_block_group_mappings() to iterate all chunk maps
fs/btrfs/block-group.c | 21 ++++++------------
fs/btrfs/volumes.c | 48 +++++++++++++++++++++++++++++++-----------
2 files changed, 42 insertions(+), 27 deletions(-)
--
2.43.0
* [PATCH v2 1/3] btrfs: balance: fix null-ptr-deref in chunk_usage_filter
2026-03-14 12:37 [PATCH v2 0/3] btrfs: fix balance NULL derefs and chunk/bg mapping verification ZhengYuan Huang
@ 2026-03-14 12:37 ` ZhengYuan Huang
2026-03-23 17:40 ` David Sterba
2026-03-14 12:37 ` [PATCH v2 2/3] btrfs: balance: fix null-ptr-deref in chunk_usage_range_filter ZhengYuan Huang
` (2 subsequent siblings)
3 siblings, 1 reply; 10+ messages in thread
From: ZhengYuan Huang @ 2026-03-14 12:37 UTC
To: dsterba, clm, idryomov
Cc: linux-btrfs, linux-kernel, baijiaju1990, r33s3n6, zzzccc427,
ZhengYuan Huang, stable
[BUG]
Running btrfs balance with a usage filter (-dusage=N) can trigger a
null-ptr-deref when metadata corruption causes a chunk to have no
corresponding block group in the in-memory cache:
KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077]
RIP: 0010:chunk_usage_filter fs/btrfs/volumes.c:3874 [inline]
RIP: 0010:should_balance_chunk fs/btrfs/volumes.c:4018 [inline]
RIP: 0010:__btrfs_balance fs/btrfs/volumes.c:4172 [inline]
RIP: 0010:btrfs_balance+0x2024/0x42b0 fs/btrfs/volumes.c:4604
...
Call Trace:
btrfs_ioctl_balance fs/btrfs/ioctl.c:3577 [inline]
btrfs_ioctl+0x25cf/0x5b90 fs/btrfs/ioctl.c:5313
vfs_ioctl fs/ioctl.c:51 [inline]
...
The bug is reproducible on next-20260312 with our dynamic metadata
fuzzing tool, which corrupts btrfs metadata at runtime.
[CAUSE]
Two separate data structures are involved:
1. The on-disk chunk tree, which records every chunk (logical address
space region) and is iterated by __btrfs_balance().
2. The in-memory block group cache (fs_info->block_group_cache_tree),
which is built at mount time by btrfs_read_block_groups() and holds
a struct btrfs_block_group for each chunk. This cache is what the
usage filter queries.
On a well-formed filesystem, these two are kept in 1:1 correspondence.
However, btrfs_read_block_groups() builds the cache from block group
items in the extent tree, not directly from the chunk tree. A corrupted
image can therefore contain a chunk item in the chunk tree whose
corresponding block group item is absent from the extent tree; that
chunk's block group is then never inserted into the in-memory cache.
When balance iterates the chunk tree and reaches such an orphaned chunk,
should_balance_chunk() calls chunk_usage_filter(), which queries the block
group cache:
cache = btrfs_lookup_block_group(fs_info, chunk_offset);
chunk_used = cache->used; /* cache may be NULL */
btrfs_lookup_block_group() returns NULL silently when no cached entry
covers chunk_offset. chunk_usage_filter() does not check the return value,
so the immediately following dereference of cache->used triggers the crash.
[FIX]
Add a NULL check after btrfs_lookup_block_group() in chunk_usage_filter().
When the lookup fails, emit a btrfs_err() message identifying the
affected bytenr and return -EUCLEAN to indicate filesystem corruption.
Since the filter function now has an error return path, change its
return type from bool to int (negative = error, 0 = do not balance,
positive = balance). Update should_balance_chunk() accordingly (bool ->
int, with the same convention) and add error propagation for the usage
filter path. Finally, handle the new negative return in __btrfs_balance()
by jumping to the existing error path, which aborts the balance
operation and reports the error to userspace.
After the fix, the same corruption is correctly detected and reported
by the filter, and the null-ptr-deref is no longer triggered.
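The return handling described above can be illustrated with a small userspace sketch. It is not the kernel code: every demo_* name, the two-entry cache and the simplified percentage threshold are assumptions made for illustration. It follows the handling visible in the diff, where a negative filter result is propagated as an error and a nonzero result makes should_balance_chunk() skip the chunk.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#ifndef EUCLEAN
#define EUCLEAN 117	/* Linux value; fallback for platforms that lack it */
#endif

/* Hypothetical stand-in for struct btrfs_block_group. */
struct demo_block_group {
	uint64_t start;
	uint64_t length;
	uint64_t used;
};

/* Hypothetical stand-in for the in-memory block group cache. */
static const struct demo_block_group demo_cache[] = {
	{ .start = 1048576, .length = 1048576, .used = 104857 },	/* ~10% used */
	{ .start = 2097152, .length = 1048576, .used = 943718 },	/* ~90% used */
};

/* Return the cached entry covering @bytenr, or NULL if there is none. */
static const struct demo_block_group *demo_lookup_block_group(uint64_t bytenr)
{
	for (size_t i = 0; i < sizeof(demo_cache) / sizeof(demo_cache[0]); i++) {
		const struct demo_block_group *bg = &demo_cache[i];

		if (bytenr >= bg->start && bytenr < bg->start + bg->length)
			return bg;
	}
	return NULL;
}

/*
 * Simplified usage filter.
 * < 0: error, 0: chunk passes the filter, > 0: chunk is filtered out.
 */
static int demo_usage_filter(uint64_t chunk_offset, unsigned int usage_pct)
{
	const struct demo_block_group *bg;
	uint64_t thresh;

	assert(usage_pct <= 100);

	bg = demo_lookup_block_group(chunk_offset);
	if (!bg)
		return -EUCLEAN;	/* chunk without a block group */

	thresh = bg->length * usage_pct / 100;
	return bg->used >= thresh;
}

/* < 0: error, 0: do not balance this chunk, > 0: balance it. */
static int demo_should_balance_chunk(uint64_t chunk_offset,
				     unsigned int usage_pct)
{
	int ret = demo_usage_filter(chunk_offset, usage_pct);

	if (ret < 0)
		return ret;	/* propagate the error to the caller */
	if (ret)
		return 0;	/* filtered out, skip the chunk */
	return 1;
}
```

With a 50% threshold, the first demo chunk is selected for balance, the second is skipped, and an offset with no cached entry (for example 4194304) yields -EUCLEAN instead of a dereference of a NULL pointer.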
Fixes: 5ce5b3c0916b ("Btrfs: usage filter")
Cc: stable@vger.kernel.org
Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
---
fs/btrfs/volumes.c | 28 +++++++++++++++++++++-------
1 file changed, 21 insertions(+), 7 deletions(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 2bec544d8ba3..7c21ac249383 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3863,14 +3863,20 @@ static bool chunk_usage_range_filter(struct btrfs_fs_info *fs_info, u64 chunk_of
return ret;
}
-static bool chunk_usage_filter(struct btrfs_fs_info *fs_info, u64 chunk_offset,
- struct btrfs_balance_args *bargs)
+static int chunk_usage_filter(struct btrfs_fs_info *fs_info, u64 chunk_offset,
+ struct btrfs_balance_args *bargs)
{
struct btrfs_block_group *cache;
u64 chunk_used, user_thresh;
bool ret = true;
cache = btrfs_lookup_block_group(fs_info, chunk_offset);
+ if (!cache) {
+ btrfs_err(fs_info,
+ "balance: chunk at bytenr %llu has no corresponding block group",
+ chunk_offset);
+ return -EUCLEAN;
+ }
chunk_used = cache->used;
if (bargs->usage_min == 0)
@@ -3986,8 +3992,8 @@ static bool chunk_soft_convert_filter(u64 chunk_type, struct btrfs_balance_args
return false;
}
-static bool should_balance_chunk(struct extent_buffer *leaf, struct btrfs_chunk *chunk,
- u64 chunk_offset)
+static int should_balance_chunk(struct extent_buffer *leaf, struct btrfs_chunk *chunk,
+ u64 chunk_offset)
{
struct btrfs_fs_info *fs_info = leaf->fs_info;
struct btrfs_balance_control *bctl = fs_info->balance_ctl;
@@ -4014,9 +4020,13 @@ static bool should_balance_chunk(struct extent_buffer *leaf, struct btrfs_chunk
}
/* usage filter */
- if ((bargs->flags & BTRFS_BALANCE_ARGS_USAGE) &&
- chunk_usage_filter(fs_info, chunk_offset, bargs)) {
- return false;
+ if (bargs->flags & BTRFS_BALANCE_ARGS_USAGE) {
+ int filter_ret = chunk_usage_filter(fs_info, chunk_offset, bargs);
+
+ if (filter_ret < 0)
+ return filter_ret;
+ if (filter_ret)
+ return false;
} else if ((bargs->flags & BTRFS_BALANCE_ARGS_USAGE_RANGE) &&
chunk_usage_range_filter(fs_info, chunk_offset, bargs)) {
return false;
@@ -4172,6 +4182,10 @@ static int __btrfs_balance(struct btrfs_fs_info *fs_info)
ret = should_balance_chunk(leaf, chunk, found_key.offset);
btrfs_release_path(path);
+ if (ret < 0) {
+ mutex_unlock(&fs_info->reclaim_bgs_lock);
+ goto error;
+ }
if (!ret) {
mutex_unlock(&fs_info->reclaim_bgs_lock);
goto loop;
--
2.43.0
* [PATCH v2 2/3] btrfs: balance: fix null-ptr-deref in chunk_usage_range_filter
2026-03-14 12:37 [PATCH v2 0/3] btrfs: fix balance NULL derefs and chunk/bg mapping verification ZhengYuan Huang
2026-03-14 12:37 ` [PATCH v2 1/3] btrfs: balance: fix null-ptr-deref in chunk_usage_filter ZhengYuan Huang
@ 2026-03-14 12:37 ` ZhengYuan Huang
2026-03-14 12:37 ` [PATCH v2 3/3] btrfs: fix check_chunk_block_group_mappings() to actually iterate all chunks ZhengYuan Huang
2026-03-23 17:33 ` [PATCH v2 0/3] btrfs: fix balance NULL derefs and chunk/bg mapping verification David Sterba
3 siblings, 0 replies; 10+ messages in thread
From: ZhengYuan Huang @ 2026-03-14 12:37 UTC
To: dsterba, clm, idryomov
Cc: linux-btrfs, linux-kernel, baijiaju1990, r33s3n6, zzzccc427,
ZhengYuan Huang, stable
[BUG]
Running btrfs balance with a usage range filter (-dusage=min..max) can
trigger a null-ptr-deref when metadata corruption causes a chunk to have
no corresponding block group in the in-memory cache:
KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077]
RIP: 0010:chunk_usage_range_filter fs/btrfs/volumes.c:3845 [inline]
RIP: 0010:should_balance_chunk fs/btrfs/volumes.c:4031 [inline]
RIP: 0010:__btrfs_balance fs/btrfs/volumes.c:4182 [inline]
RIP: 0010:btrfs_balance+0x249e/0x4320 fs/btrfs/volumes.c:4618
...
Call Trace:
btrfs_ioctl_balance fs/btrfs/ioctl.c:3577 [inline]
btrfs_ioctl+0x25cf/0x5b90 fs/btrfs/ioctl.c:5313
vfs_ioctl fs/ioctl.c:51 [inline]
...
The bug is reproducible on next-20260312 with our dynamic metadata
fuzzing tool, which corrupts btrfs metadata at runtime.
[CAUSE]
Two separate data structures are involved:
1. The on-disk chunk tree, which records every chunk (logical address
space region) and is iterated by __btrfs_balance().
2. The in-memory block group cache (fs_info->block_group_cache_tree),
which is built at mount time by btrfs_read_block_groups() and holds
a struct btrfs_block_group for each chunk. This cache is what the
usage range filter queries.
On a well-formed filesystem, these two are kept in 1:1 correspondence.
However, btrfs_read_block_groups() builds the cache from block group
items in the extent tree, not directly from the chunk tree. A corrupted
image can therefore contain a chunk item in the chunk tree whose
corresponding block group item is absent from the extent tree; that
chunk's block group is then never inserted into the in-memory cache.
When balance iterates the chunk tree and reaches such an orphaned chunk,
should_balance_chunk() calls chunk_usage_range_filter(), which queries
the block group cache:
cache = btrfs_lookup_block_group(fs_info, chunk_offset);
chunk_used = cache->used; /* cache may be NULL */
btrfs_lookup_block_group() returns NULL silently when no cached entry
covers chunk_offset. chunk_usage_range_filter() does not check the return
value, so the immediately following dereference of cache->used triggers
the crash.
[FIX]
Add a NULL check after btrfs_lookup_block_group() in
chunk_usage_range_filter(). When the lookup fails, emit a btrfs_err()
message identifying the affected bytenr and return -EUCLEAN to indicate
filesystem corruption.
Since chunk_usage_range_filter() now has an error return path, change its
return type from bool to int (negative = error, 0 = do not balance,
positive = balance). Update the BTRFS_BALANCE_ARGS_USAGE_RANGE branch in
should_balance_chunk() to propagate negative errors instead of treating
them as a normal filter result.
After the fix, the same corruption is correctly detected and reported
by the filter, and the null-ptr-deref is no longer triggered.
Fixes: bc3094673f22 ("btrfs: extend balance filter usage to take minimum and maximum")
Cc: stable@vger.kernel.org
Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
---
fs/btrfs/volumes.c | 20 +++++++++++++++-----
1 file changed, 15 insertions(+), 5 deletions(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 7c21ac249383..4958e074d420 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3832,8 +3832,8 @@ static bool chunk_profiles_filter(u64 chunk_type, struct btrfs_balance_args *bar
return true;
}
-static bool chunk_usage_range_filter(struct btrfs_fs_info *fs_info, u64 chunk_offset,
- struct btrfs_balance_args *bargs)
+static int chunk_usage_range_filter(struct btrfs_fs_info *fs_info, u64 chunk_offset,
+ struct btrfs_balance_args *bargs)
{
struct btrfs_block_group *cache;
u64 chunk_used;
@@ -3842,6 +3842,12 @@ static bool chunk_usage_range_filter(struct btrfs_fs_info *fs_info, u64 chunk_of
bool ret = true;
cache = btrfs_lookup_block_group(fs_info, chunk_offset);
+ if (!cache) {
+ btrfs_err(fs_info,
+ "balance: chunk at bytenr %llu has no corresponding block group",
+ chunk_offset);
+ return -EUCLEAN;
+ }
chunk_used = cache->used;
if (bargs->usage_min == 0)
@@ -4027,9 +4033,13 @@ static int should_balance_chunk(struct extent_buffer *leaf, struct btrfs_chunk *
return filter_ret;
if (filter_ret)
return false;
- } else if ((bargs->flags & BTRFS_BALANCE_ARGS_USAGE_RANGE) &&
- chunk_usage_range_filter(fs_info, chunk_offset, bargs)) {
- return false;
+ } else if (bargs->flags & BTRFS_BALANCE_ARGS_USAGE_RANGE) {
+ int filter_ret = chunk_usage_range_filter(fs_info, chunk_offset, bargs);
+
+ if (filter_ret < 0)
+ return filter_ret;
+ if (filter_ret)
+ return false;
}
/* devid filter */
--
2.43.0
* [PATCH v2 3/3] btrfs: fix check_chunk_block_group_mappings() to actually iterate all chunks
2026-03-14 12:37 [PATCH v2 0/3] btrfs: fix balance NULL derefs and chunk/bg mapping verification ZhengYuan Huang
2026-03-14 12:37 ` [PATCH v2 1/3] btrfs: balance: fix null-ptr-deref in chunk_usage_filter ZhengYuan Huang
2026-03-14 12:37 ` [PATCH v2 2/3] btrfs: balance: fix null-ptr-deref in chunk_usage_range_filter ZhengYuan Huang
@ 2026-03-14 12:37 ` ZhengYuan Huang
2026-03-23 17:52 ` David Sterba
2026-03-23 17:33 ` [PATCH v2 0/3] btrfs: fix balance NULL derefs and chunk/bg mapping verification David Sterba
3 siblings, 1 reply; 10+ messages in thread
From: ZhengYuan Huang @ 2026-03-14 12:37 UTC
To: dsterba, clm, idryomov
Cc: linux-btrfs, linux-kernel, baijiaju1990, r33s3n6, zzzccc427,
ZhengYuan Huang
[BUG]
A corrupted image with a chunk present in the chunk tree but whose
corresponding block group item is missing from the extent tree can be
mounted successfully, even though check_chunk_block_group_mappings()
is supposed to catch exactly this corruption at mount time. Once
mounted, running btrfs balance with a usage filter (-dusage=N or
-dusage=min..max) triggers a null-ptr-deref:
KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077]
RIP: 0010:chunk_usage_filter fs/btrfs/volumes.c:3874 [inline]
RIP: 0010:should_balance_chunk fs/btrfs/volumes.c:4018 [inline]
RIP: 0010:__btrfs_balance fs/btrfs/volumes.c:4172 [inline]
RIP: 0010:btrfs_balance+0x2024/0x42b0 fs/btrfs/volumes.c:4604
The crash occurs because __btrfs_balance() iterates the on-disk chunk
tree, finds the orphaned chunk, calls chunk_usage_filter() (or
chunk_usage_range_filter()), which queries the in-memory block group
cache via btrfs_lookup_block_group(). Since no block group was ever
inserted for this chunk, the lookup returns NULL, and the subsequent
dereference of cache->used crashes.
[CAUSE]
check_chunk_block_group_mappings() uses btrfs_find_chunk_map() to
iterate the in-memory chunk map (fs_info->mapping_tree):
map = btrfs_find_chunk_map(fs_info, start, 1);
With @start = 0 and @length = 1, btrfs_find_chunk_map() looks for a
chunk map that *contains* the logical address 0. If no chunk contains
logical address 0, btrfs_find_chunk_map(fs_info, 0, 1) returns NULL
immediately and the loop breaks after the very first iteration,
having checked zero chunks. The entire verification function is therefore
a no-op, and the corrupted image passes the mount-time check undetected.
[FIX]
Replace the btrfs_find_chunk_map() based loop with a direct in-order
walk of fs_info->mapping_tree using rb_first_cached() + rb_next(),
protected by mapping_tree_lock. This guarantees that every chunk map
in the tree is visited regardless of the logical addresses involved.
Since the mapping_tree itself is accessed under read_lock, no refcount
manipulation of each map entry is needed inside the loop, so the
btrfs_free_chunk_map() calls on the map are also removed.
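The difference between the two iteration strategies can be shown with a userspace model. This is only a sketch under stated assumptions: the chunk maps live in a sorted array rather than an rb-tree, all demo_* names are hypothetical, demo_find_chunk_map() models the lookup only as this changelog describes it (return a map intersecting the given range, else NULL), and locking and reference counting are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#ifndef EUCLEAN
#define EUCLEAN 117	/* Linux value; fallback for platforms that lack it */
#endif

/* Hypothetical stand-in for struct btrfs_chunk_map. */
struct demo_chunk_map {
	uint64_t start;
	uint64_t chunk_len;
	int has_block_group;	/* models the block group lookup result */
};

/* Return the first map intersecting [logical, logical + length), or NULL. */
static const struct demo_chunk_map *
demo_find_chunk_map(const struct demo_chunk_map *maps, size_t n,
		    uint64_t logical, uint64_t length)
{
	assert(length > 0);

	for (size_t i = 0; i < n; i++) {
		uint64_t end = maps[i].start + maps[i].chunk_len;

		if (logical < end && maps[i].start < logical + length)
			return &maps[i];
	}
	return NULL;
}

/* Lookup-driven loop: starts at logical address 0 with length 1. */
static int demo_check_by_lookup(const struct demo_chunk_map *maps, size_t n)
{
	uint64_t start = 0;

	while (1) {
		const struct demo_chunk_map *map;

		map = demo_find_chunk_map(maps, n, start, 1);
		if (!map)
			break;	/* nothing contains @start: loop ends here */
		if (!map->has_block_group)
			return -EUCLEAN;
		start = map->start + map->chunk_len;
	}
	return 0;
}

/* In-order walk: visits every map regardless of its logical address. */
static int demo_check_in_order(const struct demo_chunk_map *maps, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!maps[i].has_block_group)
			return -EUCLEAN;
	}
	return 0;
}

/* First chunk starts at 1 MiB; the second chunk has no block group. */
static const struct demo_chunk_map demo_layout[] = {
	{ .start = 1048576, .chunk_len = 1048576, .has_block_group = 1 },
	{ .start = 2097152, .chunk_len = 1048576, .has_block_group = 0 },
};
```

In this model the lookup-driven check reports success for demo_layout because no map contains logical address 0, so the loop ends before inspecting anything, while the in-order walk reaches the orphaned chunk and returns -EUCLEAN.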
Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
---
fs/btrfs/block-group.c | 21 ++++++---------------
1 file changed, 6 insertions(+), 15 deletions(-)
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 5322ef2ae015..25bd0d058be6 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -2319,29 +2319,22 @@ static struct btrfs_block_group *btrfs_create_block_group_cache(
*/
static int check_chunk_block_group_mappings(struct btrfs_fs_info *fs_info)
{
- u64 start = 0;
+ struct rb_node *node;
int ret = 0;
- while (1) {
+ read_lock(&fs_info->mapping_tree_lock);
+ for (node = rb_first_cached(&fs_info->mapping_tree); node;
+ node = rb_next(node)) {
struct btrfs_chunk_map *map;
struct btrfs_block_group *bg;
- /*
- * btrfs_find_chunk_map() will return the first chunk map
- * intersecting the range, so setting @length to 1 is enough to
- * get the first chunk.
- */
- map = btrfs_find_chunk_map(fs_info, start, 1);
- if (!map)
- break;
-
+ map = rb_entry(node, struct btrfs_chunk_map, rb_node);
bg = btrfs_lookup_block_group(fs_info, map->start);
if (unlikely(!bg)) {
btrfs_err(fs_info,
"chunk start=%llu len=%llu doesn't have corresponding block group",
map->start, map->chunk_len);
ret = -EUCLEAN;
- btrfs_free_chunk_map(map);
break;
}
if (unlikely(bg->start != map->start || bg->length != map->chunk_len ||
@@ -2354,14 +2347,12 @@ static int check_chunk_block_group_mappings(struct btrfs_fs_info *fs_info)
bg->start, bg->length,
bg->flags & BTRFS_BLOCK_GROUP_TYPE_MASK);
ret = -EUCLEAN;
- btrfs_free_chunk_map(map);
btrfs_put_block_group(bg);
break;
}
- start = map->start + map->chunk_len;
- btrfs_free_chunk_map(map);
btrfs_put_block_group(bg);
}
+ read_unlock(&fs_info->mapping_tree_lock);
return ret;
}
--
2.43.0
* Re: [PATCH v2 0/3] btrfs: fix balance NULL derefs and chunk/bg mapping verification
2026-03-14 12:37 [PATCH v2 0/3] btrfs: fix balance NULL derefs and chunk/bg mapping verification ZhengYuan Huang
` (2 preceding siblings ...)
2026-03-14 12:37 ` [PATCH v2 3/3] btrfs: fix check_chunk_block_group_mappings() to actually iterate all chunks ZhengYuan Huang
@ 2026-03-23 17:33 ` David Sterba
2026-03-24 2:53 ` ZhengYuan Huang
3 siblings, 1 reply; 10+ messages in thread
From: David Sterba @ 2026-03-23 17:33 UTC
To: ZhengYuan Huang
Cc: dsterba, clm, idryomov, linux-btrfs, linux-kernel, baijiaju1990,
r33s3n6, zzzccc427
On Sat, Mar 14, 2026 at 08:37:38PM +0800, ZhengYuan Huang wrote:
> This series fixes two NULL dereferences in btrfs balance usage filters and
> the underlying mount-time verification bug that lets the corresponding
> chunk/block-group inconsistency go undetected.
>
> The balance crashes happen when metadata corruption leaves a chunk present
> in the chunk tree but without a corresponding block group in the in-memory
> block group cache. In that case, the usage filters call
> btrfs_lookup_block_group() and dereference the returned pointer without
> checking for NULL.
>
> The first two patches add the missing NULL checks and propagate -EUCLEAN
> back to userspace instead of crashing. They are split because the usage
> and usage-range filters were introduced by different commits, which should
> also make backporting easier, as suggested by Qu Wenruo.
>
> The third patch fixes the root cause on the mount-time verification side.
> check_chunk_block_group_mappings() is supposed to verify that every chunk
> has a matching block group, but its current iteration starts with
> btrfs_find_chunk_map(fs_info, 0, 1). If no chunk contains logical address
> 0, the lookup returns NULL immediately and the loop exits without checking
> any chunk at all. As a result, the corrupted mapping can survive mount and
> only crash later when balance reaches it.
>
> This series makes btrfs reject the inconsistency earlier at mount time,
> and also hardens the balance filters so the corruption is reported as
> -EUCLEAN instead of triggering a NULL dereference.
As I understand it you're using some advanced fuzzing tool (patch 1
mentions runtime fuzzing), so the errors would not normally happen. With
fuzzing it depends on the capabilities; at runtime it is possible to
confuse the filesystem so much that simple checks can't detect it.
Here checking if block group lookups are ok makes sense in general.
There are existing checks that seem to follow the same logic, like
in unpin_extent_range().
>
> Changes since v1:
> - split the two balance filter fixes into separate patches
> - reworked the third patch to fix the case where
> check_chunk_block_group_mappings() does not actually check all chunk
> mappings
>
> [NOTE]
> Some of the changelogs may repeat parts of the bug analysis, which can
> make the series somewhat verbose. I did that intentionally because I was
> trying to follow the usual expectation that each patch should be able to
> stand on its own and explain the specific issue it fixes.
This is good, thanks. For simple fixes or cleanups it's fine to
make a vague reference to the main patch or to say "in the
previous/followup patches".
* Re: [PATCH v2 1/3] btrfs: balance: fix null-ptr-deref in chunk_usage_filter
2026-03-14 12:37 ` [PATCH v2 1/3] btrfs: balance: fix null-ptr-deref in chunk_usage_filter ZhengYuan Huang
@ 2026-03-23 17:40 ` David Sterba
2026-03-24 2:56 ` ZhengYuan Huang
0 siblings, 1 reply; 10+ messages in thread
From: David Sterba @ 2026-03-23 17:40 UTC
To: ZhengYuan Huang
Cc: dsterba, clm, idryomov, linux-btrfs, linux-kernel, baijiaju1990,
r33s3n6, zzzccc427, stable
On Sat, Mar 14, 2026 at 08:37:39PM +0800, ZhengYuan Huang wrote:
> [BUG]
> Running btrfs balance with a usage filter (-dusage=N) can trigger a
> null-ptr-deref when metadata corruption causes a chunk to have no
> corresponding block group in the in-memory cache:
>
> KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077]
> RIP: 0010:chunk_usage_filter fs/btrfs/volumes.c:3874 [inline]
> RIP: 0010:should_balance_chunk fs/btrfs/volumes.c:4018 [inline]
> RIP: 0010:__btrfs_balance fs/btrfs/volumes.c:4172 [inline]
> RIP: 0010:btrfs_balance+0x2024/0x42b0 fs/btrfs/volumes.c:4604
> ...
> Call Trace:
> btrfs_ioctl_balance fs/btrfs/ioctl.c:3577 [inline]
> btrfs_ioctl+0x25cf/0x5b90 fs/btrfs/ioctl.c:5313
> vfs_ioctl fs/ioctl.c:51 [inline]
> ...
>
> The bug is reproducible on next-20260312 with our dynamic metadata
> fuzzing tool, which corrupts btrfs metadata at runtime.
So, for example, you let a filesystem create some structures, let it
continue, damage/destroy the structures and then let it access them again?
If this is supposed to emulate a corruption, either on media or in the
IO path, then OK.
> [CAUSE]
> Two separate data structures are involved:
>
> 1. The on-disk chunk tree, which records every chunk (logical address
> space region) and is iterated by __btrfs_balance().
> 2. The in-memory block group cache (fs_info->block_group_cache_tree),
> which is built at mount time by btrfs_read_block_groups() and holds
> a struct btrfs_block_group for each chunk. This cache is what the
> usage filter queries.
>
> On a well-formed filesystem, these two are kept in 1:1 correspondence.
> However, btrfs_read_block_groups() builds the cache from block group
> items in the extent tree, not directly from the chunk tree. A corrupted
> image can therefore contain a chunk item in the chunk tree whose
> corresponding block group item is absent from the extent tree; that
> chunk's block group is then never inserted into the in-memory cache.
>
> When balance iterates the chunk tree and reaches such an orphaned chunk,
> should_balance_chunk() calls chunk_usage_filter(), which queries the block
> group cache:
>
> cache = btrfs_lookup_block_group(fs_info, chunk_offset);
> chunk_used = cache->used; /* cache may be NULL */
>
> btrfs_lookup_block_group() returns NULL silently when no cached entry
> covers chunk_offset. chunk_usage_filter() does not check the return value,
> so the immediately following dereference of cache->used triggers the crash.
>
> [FIX]
> Add a NULL check after btrfs_lookup_block_group() in chunk_usage_filter().
> When the lookup fails, emit a btrfs_err() message identifying the
> affected bytenr and return -EUCLEAN to indicate filesystem corruption.
>
> Since the filter function now has an error return path, change its
> return type from bool to int (negative = error, 0 = do not balance,
> positive = balance). Update should_balance_chunk() accordingly (bool ->
> int, with the same convention) and add error propagation for the usage
> filter path. Finally, handle the new negative return in __btrfs_balance()
> by jumping to the existing error path, which aborts the balance
> operation and reports the error to userspace.
>
> After the fix, the same corruption is correctly detected and reported
> by the filter, and the null-ptr-deref is no longer triggered.
>
> Fixes: 5ce5b3c0916b ("Btrfs: usage filter")
> Cc: stable@vger.kernel.org
> Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
> ---
> fs/btrfs/volumes.c | 28 +++++++++++++++++++++-------
> 1 file changed, 21 insertions(+), 7 deletions(-)
>
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 2bec544d8ba3..7c21ac249383 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -3863,14 +3863,20 @@ static bool chunk_usage_range_filter(struct btrfs_fs_info *fs_info, u64 chunk_of
> return ret;
> }
>
> -static bool chunk_usage_filter(struct btrfs_fs_info *fs_info, u64 chunk_offset,
> - struct btrfs_balance_args *bargs)
> +static int chunk_usage_filter(struct btrfs_fs_info *fs_info, u64 chunk_offset,
> + struct btrfs_balance_args *bargs)
> {
> struct btrfs_block_group *cache;
> u64 chunk_used, user_thresh;
> bool ret = true;
As this is bool, it does not match the changed return type anymore.
>
> cache = btrfs_lookup_block_group(fs_info, chunk_offset);
> + if (!cache) {
> + btrfs_err(fs_info,
> + "balance: chunk at bytenr %llu has no corresponding block group",
> + chunk_offset);
> + return -EUCLEAN;
> + }
> chunk_used = cache->used;
>
> if (bargs->usage_min == 0)
> @@ -3986,8 +3992,8 @@ static bool chunk_soft_convert_filter(u64 chunk_type, struct btrfs_balance_args
> return false;
> }
>
> -static bool should_balance_chunk(struct extent_buffer *leaf, struct btrfs_chunk *chunk,
> - u64 chunk_offset)
> +static int should_balance_chunk(struct extent_buffer *leaf, struct btrfs_chunk *chunk,
> + u64 chunk_offset)
> {
> struct btrfs_fs_info *fs_info = leaf->fs_info;
> struct btrfs_balance_control *bctl = fs_info->balance_ctl;
> @@ -4014,9 +4020,13 @@ static bool should_balance_chunk(struct extent_buffer *leaf, struct btrfs_chunk
> }
>
> /* usage filter */
> - if ((bargs->flags & BTRFS_BALANCE_ARGS_USAGE) &&
> - chunk_usage_filter(fs_info, chunk_offset, bargs)) {
> - return false;
> + if (bargs->flags & BTRFS_BALANCE_ARGS_USAGE) {
> + int filter_ret = chunk_usage_filter(fs_info, chunk_offset, bargs);
Same problem here. Also please use ret2 for nested return values.
> +
> + if (filter_ret < 0)
> + return filter_ret;
> + if (filter_ret)
> + return false;
> } else if ((bargs->flags & BTRFS_BALANCE_ARGS_USAGE_RANGE) &&
> chunk_usage_range_filter(fs_info, chunk_offset, bargs)) {
> return false;
* Re: [PATCH v2 3/3] btrfs: fix check_chunk_block_group_mappings() to actually iterate all chunks
2026-03-14 12:37 ` [PATCH v2 3/3] btrfs: fix check_chunk_block_group_mappings() to actually iterate all chunks ZhengYuan Huang
@ 2026-03-23 17:52 ` David Sterba
2026-03-24 2:57 ` ZhengYuan Huang
0 siblings, 1 reply; 10+ messages in thread
From: David Sterba @ 2026-03-23 17:52 UTC
To: ZhengYuan Huang
Cc: dsterba, clm, idryomov, linux-btrfs, linux-kernel, baijiaju1990,
r33s3n6, zzzccc427
On Sat, Mar 14, 2026 at 08:37:41PM +0800, ZhengYuan Huang wrote:
> [BUG]
> A corrupted image with a chunk present in the chunk tree but whose
> corresponding block group item is missing from the extent tree can be
> mounted successfully, even though check_chunk_block_group_mappings()
> is supposed to catch exactly this corruption at mount time. Once
> mounted, running btrfs balance with a usage filter (-dusage=N or
> -dusage=min..max) triggers a null-ptr-deref:
>
> KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077]
> RIP: 0010:chunk_usage_filter fs/btrfs/volumes.c:3874 [inline]
> RIP: 0010:should_balance_chunk fs/btrfs/volumes.c:4018 [inline]
> RIP: 0010:__btrfs_balance fs/btrfs/volumes.c:4172 [inline]
> RIP: 0010:btrfs_balance+0x2024/0x42b0 fs/btrfs/volumes.c:4604
>
> The crash occurs because __btrfs_balance() iterates the on-disk chunk
> tree, finds the orphaned chunk, calls chunk_usage_filter() (or
> chunk_usage_range_filter()), which queries the in-memory block group
> cache via btrfs_lookup_block_group(). Since no block group was ever
> inserted for this chunk, the lookup returns NULL, and the subsequent
> dereference of cache->used crashes.
>
> [CAUSE]
> check_chunk_block_group_mappings() uses btrfs_find_chunk_map() to
> iterate the in-memory chunk map (fs_info->mapping_tree):
>
> map = btrfs_find_chunk_map(fs_info, start, 1);
>
> With @start = 0 and @length = 1, btrfs_find_chunk_map() looks for a
> chunk map that *contains* the logical address 0. If no chunk contains
> logical address 0, btrfs_find_chunk_map(fs_info, 0, 1) returns NULL
> immediately and the loop breaks after the very first iteration,
> having checked zero chunks. The entire verification function is therefore
> a no-op, and the corrupted image passes the mount-time check undetected.
>
> [FIX]
> Replace the btrfs_find_chunk_map() based loop with a direct in-order
> walk of fs_info->mapping_tree using rb_first_cached() + rb_next(),
> protected by mapping_tree_lock. This guarantees that every chunk map
> in the tree is visited regardless of the logical addresses involved.
> Since the mapping_tree itself is accessed under read_lock, no refcount
> manipulation of each map entry is needed inside the loop, so the
> btrfs_free_chunk_map() calls on the map are also removed.
>
> Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
> ---
> fs/btrfs/block-group.c | 21 ++++++---------------
> 1 file changed, 6 insertions(+), 15 deletions(-)
>
> diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> index 5322ef2ae015..25bd0d058be6 100644
> --- a/fs/btrfs/block-group.c
> +++ b/fs/btrfs/block-group.c
> @@ -2319,29 +2319,22 @@ static struct btrfs_block_group *btrfs_create_block_group_cache(
> */
> static int check_chunk_block_group_mappings(struct btrfs_fs_info *fs_info)
> {
> - u64 start = 0;
> + struct rb_node *node;
> int ret = 0;
>
> - while (1) {
> + read_lock(&fs_info->mapping_tree_lock);
This is called during mount indirectly from open_ctree() and this is
single threaded (partially), so the lock may not be needed. It would be
needed if there's e.g. a caching thread possibly accessing the same
structures, I haven't looked closely.
> + for (node = rb_first_cached(&fs_info->mapping_tree); node;
> + node = rb_next(node)) {
> struct btrfs_chunk_map *map;
> struct btrfs_block_group *bg;
>
> - /*
> - * btrfs_find_chunk_map() will return the first chunk map
> - * intersecting the range, so setting @length to 1 is enough to
> - * get the first chunk.
> - */
> - map = btrfs_find_chunk_map(fs_info, start, 1);
> - if (!map)
> - break;
> -
> + map = rb_entry(node, struct btrfs_chunk_map, rb_node);
> bg = btrfs_lookup_block_group(fs_info, map->start);
What concerns me is this lookup. Previously the references avoided
taking the big lock. The time the lock is held may add up significantly
for all block groups but as said before it might not be necessary due to
the mount context.
> if (unlikely(!bg)) {
> btrfs_err(fs_info,
> "chunk start=%llu len=%llu doesn't have corresponding block group",
> map->start, map->chunk_len);
> ret = -EUCLEAN;
> - btrfs_free_chunk_map(map);
> break;
> }
> if (unlikely(bg->start != map->start || bg->length != map->chunk_len ||
> @@ -2354,14 +2347,12 @@ static int check_chunk_block_group_mappings(struct btrfs_fs_info *fs_info)
> bg->start, bg->length,
> bg->flags & BTRFS_BLOCK_GROUP_TYPE_MASK);
> ret = -EUCLEAN;
> - btrfs_free_chunk_map(map);
> btrfs_put_block_group(bg);
> break;
> }
> - start = map->start + map->chunk_len;
> - btrfs_free_chunk_map(map);
> btrfs_put_block_group(bg);
> }
> + read_unlock(&fs_info->mapping_tree_lock);
> return ret;
> }
>
> --
> 2.43.0
>
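The iteration problem described under [CAUSE] can be reproduced in a small userspace model. This is a sketch only: `struct model_chunk`, `model_find_containing()`, `model_count_by_lookup()` and `model_count_by_walk()` are hypothetical names, a sorted array stands in for fs_info->mapping_tree, and the lookup follows the "contains the logical address" behaviour described in the commit message rather than the real btrfs_find_chunk_map() code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a chunk map covering [start, start + len). */
struct model_chunk {
	uint64_t start;
	uint64_t len;
};

/*
 * Model of the lookup as the commit message describes it: return the
 * chunk that contains @logical, or NULL if no chunk does.
 */
static const struct model_chunk *model_find_containing(
		const struct model_chunk *chunks, size_t nr, uint64_t logical)
{
	for (size_t i = 0; i < nr; i++) {
		if (logical >= chunks[i].start &&
		    logical - chunks[i].start < chunks[i].len)
			return &chunks[i];
	}
	return NULL;
}

/* Old loop shape: start at 0, look up, then advance past the found chunk. */
static size_t model_count_by_lookup(const struct model_chunk *chunks, size_t nr)
{
	uint64_t start = 0;
	size_t visited = 0;

	while (1) {
		const struct model_chunk *c;

		c = model_find_containing(chunks, nr, start);
		if (!c)
			break;	/* nothing contains @start: the loop ends here */
		visited++;
		start = c->start + c->len;
	}
	return visited;
}

/* New loop shape: visit every entry in order, whatever its address. */
static size_t model_count_by_walk(const struct model_chunk *chunks, size_t nr)
{
	size_t visited = 0;

	for (size_t i = 0; i < nr; i++) {
		(void)chunks[i];	/* a real check would inspect the entry */
		visited++;
	}
	return visited;
}
```

With a chunk array whose first entry starts at 1 MiB, model_count_by_lookup() visits no chunks while model_count_by_walk() visits all of them, which is the behaviour the patch description attributes to the old and new loops.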
* Re: [PATCH v2 0/3] btrfs: fix balance NULL derefs and chunk/bg mapping verification
2026-03-23 17:33 ` [PATCH v2 0/3] btrfs: fix balance NULL derefs and chunk/bg mapping verification David Sterba
@ 2026-03-24 2:53 ` ZhengYuan Huang
0 siblings, 0 replies; 10+ messages in thread
From: ZhengYuan Huang @ 2026-03-24 2:53 UTC (permalink / raw)
To: dsterba
Cc: dsterba, clm, idryomov, linux-btrfs, linux-kernel, baijiaju1990,
r33s3n6, zzzccc427
On Tue, Mar 24, 2026 at 1:33 AM David Sterba <dsterba@suse.cz> wrote:
> As I understand it you're using some advanced fuzzing tool (patch 1
> mentions runtime fuzzing), so the errors would not normally happen. With
> fuzzing it depends on the capabilities, at runtime it is possible to
> confuse the filesystem so much that simple checks can't detect it.
>
> Here checking if block group lookups are ok makes sense in general.
> There are existing checks that seem to be following the same logic like
> in unpin_extent_range().
Thanks for your review.
Yes, we are using an in-house runtime fuzzing tool.
However, after further investigation of this bug, we found that it is
not limited to fuzzing-only scenarios. The issue can be reliably
triggered by using a crafted filesystem image together with normal syscalls.
So this may not be purely a fuzzing artifact, but rather a potential
robustness issue that could be hit in practice.
> This is good, thanks. For simple fixes or cleanups it's fine to
> make a vague reference to the main patch or a "in the previous/followup
> patches".
Thanks for the guidance, I’ll continue to follow this convention for
changelogs.
Thanks,
ZhengYuan Huang
* Re: [PATCH v2 1/3] btrfs: balance: fix null-ptr-deref in chunk_usage_filter
2026-03-23 17:40 ` David Sterba
@ 2026-03-24 2:56 ` ZhengYuan Huang
0 siblings, 0 replies; 10+ messages in thread
From: ZhengYuan Huang @ 2026-03-24 2:56 UTC (permalink / raw)
To: dsterba
Cc: dsterba, clm, idryomov, linux-btrfs, linux-kernel, baijiaju1990,
r33s3n6, zzzccc427, stable
On Tue, Mar 24, 2026 at 1:40 AM David Sterba <dsterba@suse.cz> wrote:
> So, for example you let a filesystem create some structures, let it
> continue, damage/destroy the structures and then let it access again?
>
> If this is supposed to emulate a corruption, either on media or in the
> IO path then OK.
Yes, this is one of the fuzzing strategies we use, where metadata is
intentionally corrupted at runtime to emulate possible media corruption
or I/O errors.
> > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> > index 2bec544d8ba3..7c21ac249383 100644
> > --- a/fs/btrfs/volumes.c
> > +++ b/fs/btrfs/volumes.c
> > @@ -3863,14 +3863,20 @@ static bool chunk_usage_range_filter(struct btrfs_fs_info *fs_info, u64 chunk_of
> > return ret;
> > }
> >
> > -static bool chunk_usage_filter(struct btrfs_fs_info *fs_info, u64 chunk_offset,
> > - struct btrfs_balance_args *bargs)
> > +static int chunk_usage_filter(struct btrfs_fs_info *fs_info, u64 chunk_offset,
> > + struct btrfs_balance_args *bargs)
> > {
> > struct btrfs_block_group *cache;
> > u64 chunk_used, user_thresh;
> > bool ret = true;
>
> As this is bool it does not match the changed return type anymore
>
> >
> > cache = btrfs_lookup_block_group(fs_info, chunk_offset);
> > + if (!cache) {
> > + btrfs_err(fs_info,
> > + "balance: chunk at bytenr %llu has no corresponding block group",
> > + chunk_offset);
> > + return -EUCLEAN;
> > + }
> > chunk_used = cache->used;
> >
> > if (bargs->usage_min == 0)
> > @@ -3986,8 +3992,8 @@ static bool chunk_soft_convert_filter(u64 chunk_type, struct btrfs_balance_args
> > return false;
> > }
> >
> > -static bool should_balance_chunk(struct extent_buffer *leaf, struct btrfs_chunk *chunk,
> > - u64 chunk_offset)
> > +static int should_balance_chunk(struct extent_buffer *leaf, struct btrfs_chunk *chunk,
> > + u64 chunk_offset)
> > {
> > struct btrfs_fs_info *fs_info = leaf->fs_info;
> > struct btrfs_balance_control *bctl = fs_info->balance_ctl;
> > @@ -4014,9 +4020,13 @@ static bool should_balance_chunk(struct extent_buffer *leaf, struct btrfs_chunk
> > }
> >
> > /* usage filter */
> > - if ((bargs->flags & BTRFS_BALANCE_ARGS_USAGE) &&
> > - chunk_usage_filter(fs_info, chunk_offset, bargs)) {
> > - return false;
> > + if (bargs->flags & BTRFS_BALANCE_ARGS_USAGE) {
> > + int filter_ret = chunk_usage_filter(fs_info, chunk_offset, bargs);
>
> Same problem here. Also please use ret2 for nested return values.
Thanks for the note, I’ll fix the return type issue and send a v3.
Thanks,
ZhengYuan Huang
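The return-type remark in the review above can be illustrated with a small userspace model. This is a sketch and not the v3 patch: `MODEL_EUCLEAN`, `struct model_block_group`, `model_usage_filter()`, `model_should_balance()` and `model_bool_collapses_error()` are hypothetical names, and the threshold comparison is an assumption made only so the example has something to decide on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: defined locally so the sketch does not depend on kernel headers. */
#define MODEL_EUCLEAN 117

/* Hypothetical stand-in for the block group fields the filter reads. */
struct model_block_group {
	uint64_t used;
};

/*
 * Tri-state filter: negative errno when the block group is missing,
 * 1 when the chunk is filtered out, 0 when it is kept.
 */
static int model_usage_filter(const struct model_block_group *cache,
			      uint64_t user_thresh)
{
	if (!cache)
		return -MODEL_EUCLEAN;
	return cache->used >= user_thresh ? 1 : 0;
}

/*
 * Caller: keeps the nested result in ret2 and propagates errors instead
 * of folding them into a yes/no answer.
 */
static int model_should_balance(const struct model_block_group *cache,
				uint64_t user_thresh)
{
	int ret2;

	ret2 = model_usage_filter(cache, user_thresh);
	if (ret2 < 0)
		return ret2;	/* error: report it to the caller */
	if (ret2)
		return 0;	/* filtered out: do not balance */
	return 1;		/* balance this chunk */
}

/* Pitfall: a bool local turns any negative errno into plain true. */
static bool model_bool_collapses_error(void)
{
	bool ret = -MODEL_EUCLEAN;

	return ret;
}
```

The last helper shows why a leftover `bool ret` does not fit an int-returning filter: the error code is converted to true and can no longer be told apart from "filtered out".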
* Re: [PATCH v2 3/3] btrfs: fix check_chunk_block_group_mappings() to actually iterate all chunks
2026-03-23 17:52 ` David Sterba
@ 2026-03-24 2:57 ` ZhengYuan Huang
0 siblings, 0 replies; 10+ messages in thread
From: ZhengYuan Huang @ 2026-03-24 2:57 UTC (permalink / raw)
To: dsterba
Cc: dsterba, clm, idryomov, linux-btrfs, linux-kernel, baijiaju1990,
r33s3n6, zzzccc427
On Tue, Mar 24, 2026 at 1:52 AM David Sterba <dsterba@suse.cz> wrote:
> This is called during mount indirectly from open_ctree() and this is
> single threaded (partially), so the lock may not be needed. It would be
> needed if there's e.g. a caching thread possibly accessing the same
> structures, I haven't looked closely.
>
> > + for (node = rb_first_cached(&fs_info->mapping_tree); node;
> > + node = rb_next(node)) {
> > struct btrfs_chunk_map *map;
> > struct btrfs_block_group *bg;
> >
> > - /*
> > - * btrfs_find_chunk_map() will return the first chunk map
> > - * intersecting the range, so setting @length to 1 is enough to
> > - * get the first chunk.
> > - */
> > - map = btrfs_find_chunk_map(fs_info, start, 1);
> > - if (!map)
> > - break;
> > -
> > + map = rb_entry(node, struct btrfs_chunk_map, rb_node);
> > bg = btrfs_lookup_block_group(fs_info, map->start);
>
> What concerns me is this lookup. Previously the references avoided
> taking the big lock. The time the lock is held may add up significantly
> for all block groups but as said before it might not be necessary due to
> the mount context.
Thanks for the suggestion, I’ll take a closer look at the locking here.
If the lock turns out to be unnecessary in this context, I’ll drop it
and include the change in v3.
Thanks,
ZhengYuan Huang