public inbox for linux-btrfs@vger.kernel.org
* [PATCH] btrfs: balance: fix null-ptr-deref in usage filters
@ 2026-03-13 14:06 ZhengYuan Huang
  2026-03-13 21:28 ` Qu Wenruo
  0 siblings, 1 reply; 2+ messages in thread
From: ZhengYuan Huang @ 2026-03-13 14:06 UTC (permalink / raw)
  To: dsterba, clm, idryomov
  Cc: linux-btrfs, linux-kernel, baijiaju1990, r33s3n6, zzzccc427,
	ZhengYuan Huang, stable

[BUG]
Running btrfs balance with a usage filter (-dusage=N or -dusage=min..max)
on a corrupted image triggers a null-ptr-deref crash.

In chunk_usage_filter():
  KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077]
  RIP: 0010:chunk_usage_filter fs/btrfs/volumes.c:3874 [inline]
  RIP: 0010:should_balance_chunk fs/btrfs/volumes.c:4018 [inline]
  RIP: 0010:__btrfs_balance fs/btrfs/volumes.c:4172 [inline]
  RIP: 0010:btrfs_balance+0x2024/0x42b0 fs/btrfs/volumes.c:4604
  ...
  Call Trace:
    btrfs_ioctl_balance fs/btrfs/ioctl.c:3577 [inline]
    btrfs_ioctl+0x25cf/0x5b90 fs/btrfs/ioctl.c:5313
    vfs_ioctl fs/ioctl.c:51 [inline]
    ...

In chunk_usage_range_filter():
  KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077]
  RIP: 0010:chunk_usage_range_filter fs/btrfs/volumes.c:3845 [inline]
  RIP: 0010:should_balance_chunk fs/btrfs/volumes.c:4031 [inline]
  RIP: 0010:__btrfs_balance fs/btrfs/volumes.c:4182 [inline]
  RIP: 0010:btrfs_balance+0x249e/0x4320 fs/btrfs/volumes.c:4618
  ...
  Call Trace:
    btrfs_ioctl_balance fs/btrfs/ioctl.c:3577 [inline]
    btrfs_ioctl+0x25cf/0x5b90 fs/btrfs/ioctl.c:5313
    vfs_ioctl fs/ioctl.c:51 [inline]
    ...

The two bugs are independently triggerable:
- chunk_usage_filter() is reached via BTRFS_BALANCE_ARGS_USAGE, set when
  the user passes a single threshold (-dusage=N).
- chunk_usage_range_filter() is reached via BTRFS_BALANCE_ARGS_USAGE_RANGE,
  set when the user passes a range (-dusage=min..max).

These two flags are mutually exclusive; either path can crash on its own
without the other being involved.

These two bugs are reproducible on next-20260312 with our dynamic
metadata fuzzing tool that corrupts btrfs metadata at runtime.

[CAUSE]
There are two separate data structures involved:

1. The on-disk chunk tree, which records every chunk (logical address
   space region) and is iterated by __btrfs_balance().
2. The in-memory block group cache (fs_info->block_group_cache_tree),
   which is built at mount time by btrfs_read_block_groups() and holds
   a struct btrfs_block_group for each chunk. This cache is what the 
   usage filters query.

On a well-formed filesystem these two are kept in 1:1 correspondence.
However, btrfs_read_block_groups() builds the cache from block group
items in the extent tree, not directly from the chunk tree.  A corrupted
image can therefore present a chunk item in the chunk tree whose
corresponding block group item is absent from the extent tree; that
chunk's block group is then never inserted into the in-memory cache.

When balance iterates the chunk tree and reaches such an orphaned chunk,
should_balance_chunk() calls chunk_usage_filter() or
chunk_usage_range_filter(), both of which query the block group cache:

  cache = btrfs_lookup_block_group(fs_info, chunk_offset);
  chunk_used = cache->used;   /* cache may be NULL */

btrfs_lookup_block_group() returns NULL silently when no cached entry
covers chunk_offset. Neither filter checks the return value, so the
immediately following dereference of cache->used triggers the crash.
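The failure mode can be modeled in plain userspace C. This is only an
illustration with hypothetical stand-in types and names (block_group,
lookup_block_group, chunk_used_checked), not the kernel code itself:

```c
#include <stddef.h>

/* Hypothetical stand-in for struct btrfs_block_group (illustration only). */
struct block_group {
	unsigned long long used;
};

/* Models btrfs_lookup_block_group(): silently returns NULL when no
 * cached block group covers the offset, e.g. for an orphaned chunk
 * whose block group item was missing from the extent tree. */
static struct block_group *lookup_block_group(struct block_group *cache_tree[],
					      size_t n, size_t idx)
{
	return idx < n ? cache_tree[idx] : NULL;
}

/* Buggy pattern, mirroring the pre-fix filters: no NULL check before
 * dereferencing the lookup result. */
static unsigned long long chunk_used_unchecked(struct block_group *bg)
{
	return bg->used;	/* null-ptr-deref when bg == NULL */
}

/* Fixed pattern: check the lookup result and report corruption. */
static int chunk_used_checked(struct block_group *bg, unsigned long long *out)
{
	if (!bg)
		return -117;	/* models -EUCLEAN */
	*out = bg->used;
	return 0;
}
```

With a populated cache entry the checked helper behaves like the old
code; with a missing entry it returns an error instead of crashing.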

[FIX]
Add a NULL check after btrfs_lookup_block_group() in both
chunk_usage_filter() and chunk_usage_range_filter(). When the lookup
fails, emit a btrfs_err() message identifying the offending bytenr and
return -EUCLEAN to indicate filesystem corruption.

Since both filter functions now have an error return path, change their
return type from bool to int (negative = error, 0 = do not balance,
positive = balance). Update should_balance_chunk() accordingly (bool ->
int, same convention) and add error propagation for both usage filter
branches. Finally, handle the new negative return in __btrfs_balance()
by jumping to the existing error path, which aborts the balance
operation and reports the error to userspace.

After the fix, the same corruption is correctly detected and reported
by the filters, and the null-ptr-deref is no longer triggered.
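The bool -> int convention described above can be sketched in isolation.
Again a hypothetical userspace model (usage_filter, should_balance,
have_bg are made-up names), not the kernel functions themselves:

```c
/* Models the post-fix return convention:
 * negative = error, 0 = do not balance, positive = balance. */
#define EUCLEAN_ERR (-117)	/* models -EUCLEAN */

/* Models a usage filter: 'have_bg' stands in for a successful block
 * group cache lookup; a nonzero return means "filter this chunk out". */
static int usage_filter(int have_bg, unsigned int usage, unsigned int thresh)
{
	if (!have_bg)
		return EUCLEAN_ERR;	/* corruption: no block group */
	return usage >= thresh;		/* full enough: filtered out */
}

/* Models should_balance_chunk(): propagate errors; a positive filter
 * result means the chunk is skipped, otherwise balance it. */
static int should_balance(int have_bg, unsigned int usage, unsigned int thresh)
{
	int ret = usage_filter(have_bg, usage, thresh);

	if (ret < 0)
		return ret;	/* propagate the error to the caller */
	if (ret)
		return 0;	/* filtered out: do not balance */
	return 1;		/* balance this chunk */
}
```

The caller can then treat a negative return as a hard failure (as
__btrfs_balance() does by jumping to its error path) while the 0/1
results keep the old skip/balance semantics.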

Fixes: 5ce5b3c0916b ("Btrfs: usage filter")
Fixes: bc3094673f22 ("btrfs: extend balance filter usage to take minimum and maximum")
Cc: stable@vger.kernel.org # v3.3+
Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
---
I was not sure whether these two bugs should be fixed in a single patch
or split into two. They share the same root cause, are very close to
each other in the code, and both depend on the same change to
should_balance_chunk(), so I kept them in one patch for now. If splitting
them would be preferred, I can respin this patch accordingly.
---
 fs/btrfs/volumes.c | 48 ++++++++++++++++++++++++++++++++++------------
 1 file changed, 36 insertions(+), 12 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 2bec544d8ba3..3aa44967c1dd 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3832,8 +3832,8 @@ static bool chunk_profiles_filter(u64 chunk_type, struct btrfs_balance_args *bar
 	return true;
 }
 
-static bool chunk_usage_range_filter(struct btrfs_fs_info *fs_info, u64 chunk_offset,
-				     struct btrfs_balance_args *bargs)
+static int chunk_usage_range_filter(struct btrfs_fs_info *fs_info, u64 chunk_offset,
+				    struct btrfs_balance_args *bargs)
 {
 	struct btrfs_block_group *cache;
 	u64 chunk_used;
@@ -3842,6 +3842,12 @@ static bool chunk_usage_range_filter(struct btrfs_fs_info *fs_info, u64 chunk_of
 	bool ret = true;
 
 	cache = btrfs_lookup_block_group(fs_info, chunk_offset);
+	if (!cache) {
+		btrfs_err(fs_info,
+			  "balance: chunk at bytenr %llu has no corresponding block group",
+			  chunk_offset);
+		return -EUCLEAN;
+	}
 	chunk_used = cache->used;
 
 	if (bargs->usage_min == 0)
@@ -3863,14 +3869,20 @@ static bool chunk_usage_range_filter(struct btrfs_fs_info *fs_info, u64 chunk_of
 	return ret;
 }
 
-static bool chunk_usage_filter(struct btrfs_fs_info *fs_info, u64 chunk_offset,
-			       struct btrfs_balance_args *bargs)
+static int chunk_usage_filter(struct btrfs_fs_info *fs_info, u64 chunk_offset,
+			      struct btrfs_balance_args *bargs)
 {
 	struct btrfs_block_group *cache;
 	u64 chunk_used, user_thresh;
 	bool ret = true;
 
 	cache = btrfs_lookup_block_group(fs_info, chunk_offset);
+	if (!cache) {
+		btrfs_err(fs_info,
+			  "balance: chunk at bytenr %llu has no corresponding block group",
+			  chunk_offset);
+		return -EUCLEAN;
+	}
 	chunk_used = cache->used;
 
 	if (bargs->usage_min == 0)
@@ -3986,8 +3998,8 @@ static bool chunk_soft_convert_filter(u64 chunk_type, struct btrfs_balance_args
 	return false;
 }
 
-static bool should_balance_chunk(struct extent_buffer *leaf, struct btrfs_chunk *chunk,
-				 u64 chunk_offset)
+static int should_balance_chunk(struct extent_buffer *leaf, struct btrfs_chunk *chunk,
+				u64 chunk_offset)
 {
 	struct btrfs_fs_info *fs_info = leaf->fs_info;
 	struct btrfs_balance_control *bctl = fs_info->balance_ctl;
@@ -4014,12 +4026,20 @@ static bool should_balance_chunk(struct extent_buffer *leaf, struct btrfs_chunk
 	}
 
 	/* usage filter */
-	if ((bargs->flags & BTRFS_BALANCE_ARGS_USAGE) &&
-	    chunk_usage_filter(fs_info, chunk_offset, bargs)) {
-		return false;
-	} else if ((bargs->flags & BTRFS_BALANCE_ARGS_USAGE_RANGE) &&
-	    chunk_usage_range_filter(fs_info, chunk_offset, bargs)) {
-		return false;
+	if (bargs->flags & BTRFS_BALANCE_ARGS_USAGE) {
+		int filter_ret = chunk_usage_filter(fs_info, chunk_offset, bargs);
+
+		if (filter_ret < 0)
+			return filter_ret;
+		if (filter_ret)
+			return 0;
+	} else if (bargs->flags & BTRFS_BALANCE_ARGS_USAGE_RANGE) {
+		int filter_ret = chunk_usage_range_filter(fs_info, chunk_offset, bargs);
+
+		if (filter_ret < 0)
+			return filter_ret;
+		if (filter_ret)
+			return 0;
 	}
 
 	/* devid filter */
@@ -4172,6 +4192,10 @@ static int __btrfs_balance(struct btrfs_fs_info *fs_info)
 		ret = should_balance_chunk(leaf, chunk, found_key.offset);
 
 		btrfs_release_path(path);
+		if (ret < 0) {
+			mutex_unlock(&fs_info->reclaim_bgs_lock);
+			goto error;
+		}
 		if (!ret) {
 			mutex_unlock(&fs_info->reclaim_bgs_lock);
 			goto loop;
-- 
2.43.0


* Re: [PATCH] btrfs: balance: fix null-ptr-deref in usage filters
  2026-03-13 14:06 [PATCH] btrfs: balance: fix null-ptr-deref in usage filters ZhengYuan Huang
@ 2026-03-13 21:28 ` Qu Wenruo
  0 siblings, 0 replies; 2+ messages in thread
From: Qu Wenruo @ 2026-03-13 21:28 UTC (permalink / raw)
  To: ZhengYuan Huang, dsterba, clm, idryomov
  Cc: linux-btrfs, linux-kernel, baijiaju1990, r33s3n6, zzzccc427,
	stable



On 2026/3/14 00:36, ZhengYuan Huang wrote:
[...]
> 
> On a well-formed filesystem these two are kept in 1:1 correspondence.
> However, btrfs_read_block_groups() builds the cache from block group
> items in the extent tree, not directly from the chunk tree.  A corrupted
> image can therefore present a chunk item in the chunk tree whose
> corresponding block group item is absent from the extent tree; that
> chunk's block group is then never inserted into the in-memory cache.

This is unexpected in the first place.

We already have a lot of extra checks regarding chunks/bgs, e.g.
btrfs_verify_dev_extents(), which verifies there is no such missing
mapping between chunks and their dev extents.

I believe you can also implement such a check between chunks and bgs,
in another patch of course.

> 
> When balance iterates the chunk tree and reaches such an orphaned chunk,
> should_balance_chunk() calls chunk_usage_filter() or
> chunk_usage_range_filter(), both of which query the block group cache:
> 
>    cache = btrfs_lookup_block_group(fs_info, chunk_offset);
>    chunk_used = cache->used;   /* cache may be NULL */
> 
> btrfs_lookup_block_group() returns NULL silently when no cached entry
> covers chunk_offset. Neither filter checks the return value, so the
> immediately following dereference of cache->used triggers the crash.
> 
> [FIX]
> Add a NULL check after btrfs_lookup_block_group() in both
> chunk_usage_filter() and chunk_usage_range_filter(). When the lookup
> fails, emit a btrfs_err() message identifying the offending bytenr and
> return -EUCLEAN to indicate filesystem corruption.
> 
> Since both filter functions now have an error return path, change their
> return type from bool to int (negative = error, 0 = do not balance,
> positive = balance). Update should_balance_chunk() accordingly (bool ->
> int, same convention) and add error propagation for both usage filter
> branches. Finally, handle the new negative return in __btrfs_balance()
> by jumping to the existing error path, which aborts the balance
> operation and reports the error to userspace.
> 
> After the fix, the same corruption is correctly detected and reported
> by the filters, and the null-ptr-deref is no longer triggered.
> 
> Fixes: 5ce5b3c0916b ("Btrfs: usage filter")
> Fixes: bc3094673f22 ("btrfs: extend balance filter usage to take minimum and maximum")
> Cc: stable@vger.kernel.org # v3.3+

You may not want to add a version that's already EOL; a plain CC to
stable should be good enough.

> Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
> ---
> I was not sure whether these two bugs should be fixed in a single patch
> or split into two. They share the same root cause, are very close to
> each other in the code, and both depend on the same change to
> should_balance_chunk(), so I kept them in one patch for now. If splitting
> them would be preferred, I can respin this patch accordingly.

Considering the two filters were introduced in different patches, one
fix for each will make backporting much easier.

But still, both are very old, so backporting may not be that easy for
older kernels anyway.

Otherwise the code looks good to me.

Thanks,
Qu

