From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Naohiro Aota <naohiro.aota@wdc.com>,
Johannes Thumshirn <johannes.thumshirn@wdc.com>,
David Sterba <dsterba@suse.com>, Sasha Levin <sashal@kernel.org>,
clm@fb.com, linux-btrfs@vger.kernel.org
Subject: [PATCH AUTOSEL 6.17-6.1] btrfs: zoned: refine extent allocator hint selection
Date: Mon, 6 Oct 2025 14:17:34 -0400 [thread overview]
Message-ID: <20251006181835.1919496-2-sashal@kernel.org> (raw)
In-Reply-To: <20251006181835.1919496-1-sashal@kernel.org>
From: Naohiro Aota <naohiro.aota@wdc.com>
[ Upstream commit 0d703963d297964451783e1a0688ebdf74cd6151 ]
The hint block group selection in the extent allocator is wrong in the
first place, as it can select the dedicated data relocation block group for
the normal data allocation.
Since we separated the normal data space_info and the data relocation
space_info, we can easily identify a block group is for data relocation or
not. Do not choose it for the normal data allocation.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
Based on my comprehensive analysis, here is my assessment:
## Backport Status: YES
### Summary of Analysis
**What the commit fixes:**
The commit fixes incorrect extent allocator hint selection in btrfs
zoned mode where the allocator could incorrectly select the dedicated
data relocation block group for normal data allocation. This violates
the design principle that relocation block groups should only be used
for data relocation operations (fs/btrfs/extent-tree.c:4322).
**Code change analysis:**
1. Adds `space_info` parameter to `prepare_allocation_zoned()` function
(lines 4299-4301)
2. Adds a critical check: `block_group->space_info == space_info` before
selecting a block group as the hint (line 4323)
3. Passes the `space_info` parameter through the call chain (line 4344)
The change is minimal (4 insertions, 2 deletions) and surgically
targeted.
**Why this is a bug affecting users:**
After commit f92ee31e031c7 (v6.16-rc1, May 2025) introduced sub-
space_info separation, btrfs zoned mode maintains separate space_info
structures for:
- Normal data block groups
- Data relocation block groups
Without this fix, `prepare_allocation_zoned()` only checks
`block_group_bits(block_group, ffe_ctl->flags)` which verifies the block
group is DATA type, but doesn't distinguish between normal data and
relocation data since both have the same flags. This can cause:
1. **Incorrect space accounting**: Normal allocations appear to have
free space when only relocation space is available
2. **ENOSPC errors**: As noted in commit f92ee31e031c7, users could
experience "strange ENOSPC" errors
3. **Write pointer violations**: On zoned devices, mixing relocation
extents and regular extents in the same zone can cause WRITE and ZONE
APPEND commands to be dispatched simultaneously, breaking the write
pointer (see commit 7b2d588572e75)
**Risk assessment:**
- **Very low risk**: The change is confined to btrfs zoned extent
allocator hint selection
- **Clear correctness**: The fix ensures block group selection respects
space_info boundaries
- **Code quality**: Reviewed by Johannes Thumshirn, an expert in btrfs
zoned support
**Dependencies:**
- **Hard dependency**: Requires sub-space_info infrastructure from
commit f92ee31e031c7 (merged in v6.16-rc1)
- **Cannot be backported to kernels < v6.16**
**Why it meets stable tree criteria:**
1. ✅ Fixes important bug affecting btrfs zoned mode users
2. ✅ Small, contained change with minimal regression risk
3. ✅ No architectural changes
4. ✅ Clear side effects are intentional (preventing wrong block group
selection)
5. ✅ Fixes correctness issue introduced by the v6.16 space_info
separation
**Recommendation:** This commit should be backported to v6.16+ stable
trees to ensure correct extent allocator behavior for btrfs zoned mode
users.
fs/btrfs/extent-tree.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 97d517cdf2df7..682d21a73a67a 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4297,7 +4297,8 @@ static int prepare_allocation_clustered(struct btrfs_fs_info *fs_info,
}
static int prepare_allocation_zoned(struct btrfs_fs_info *fs_info,
- struct find_free_extent_ctl *ffe_ctl)
+ struct find_free_extent_ctl *ffe_ctl,
+ struct btrfs_space_info *space_info)
{
if (ffe_ctl->for_treelog) {
spin_lock(&fs_info->treelog_bg_lock);
@@ -4321,6 +4322,7 @@ static int prepare_allocation_zoned(struct btrfs_fs_info *fs_info,
u64 avail = block_group->zone_capacity - block_group->alloc_offset;
if (block_group_bits(block_group, ffe_ctl->flags) &&
+ block_group->space_info == space_info &&
avail >= ffe_ctl->num_bytes) {
ffe_ctl->hint_byte = block_group->start;
break;
@@ -4342,7 +4344,7 @@ static int prepare_allocation(struct btrfs_fs_info *fs_info,
return prepare_allocation_clustered(fs_info, ffe_ctl,
space_info, ins);
case BTRFS_EXTENT_ALLOC_ZONED:
- return prepare_allocation_zoned(fs_info, ffe_ctl);
+ return prepare_allocation_zoned(fs_info, ffe_ctl, space_info);
default:
BUG();
}
--
2.51.0
next prev parent reply other threads:[~2025-10-06 18:18 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-06 18:17 [PATCH AUTOSEL 6.17-5.4] x86/build: Remove cc-option from stack alignment flags Sasha Levin
2025-10-06 18:17 ` Sasha Levin [this message]
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.1] arch: Add the macro COMPILE_OFFSETS to all the asm-offsets.c Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.12] btrfs: abort transaction on specific error places when walking log tree Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-5.4] btrfs: use smp_mb__after_atomic() when forcing COW in create_pending_snapshot() Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17] x86/bugs: Add attack vector controls for VMSCAPE Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.16] sched_ext: Keep bypass on between enable failure and scx_disable_workfn() Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.12] perf/x86/intel: Add ICL_FIXED_0_ADAPTIVE bit into INTEL_FIXED_BITS_MASK Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.12] btrfs: abort transaction if we fail to update inode in log replay dir fixup Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.1] EDAC/mc_sysfs: Increase legacy channel support to 16 Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.12] btrfs: abort transaction in the process_one_buffer() log tree walk callback Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.6] btrfs: zoned: return error from btrfs_zone_finish_endio() Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.6] perf: Skip user unwind if the task is a kernel thread Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.1] perf: Have get_perf_callchain() return NULL if crosstask and user are set Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.16] EDAC: Fix wrong executable file modes for C source files Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.6] perf: Use current->flags & PF_KTHREAD|PF_USER_WORKER instead of current->mm == NULL Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.6] btrfs: use level argument in log tree walk callback replay_one_buffer() Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.12] sched_ext: Make qmap dump operation non-destructive Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.12] seccomp: passthrough uprobe systemcall without filtering Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.12] cpuset: Use new excpus for nocpu error check when enabling root partition Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.12] btrfs: tree-checker: add inode extref checks Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17] sched/fair: update_cfs_group() for throttled cfs_rqs Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-5.15] btrfs: scrub: replace max_t()/min_t() with clamp() in scrub_throttle_dev_io() Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-5.4] x86/bugs: Fix reporting of LFENCE retpoline Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-5.10] btrfs: always drop log root tree reference in btrfs_replay_log() Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.6] audit: record fanotify event regardless of presence of rules Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17] EDAC/ie31200: Add two more Intel Alder Lake-S SoCs for EDAC support Sasha Levin
2025-10-06 18:18 ` [PATCH AUTOSEL 6.17-6.6] x86/bugs: Report correct retbleed mitigation status Sasha Levin
2025-10-06 21:55 ` [PATCH AUTOSEL 6.17-5.4] x86/build: Remove cc-option from stack alignment flags Nathan Chancellor
2025-10-06 23:13 ` Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251006181835.1919496-2-sashal@kernel.org \
--to=sashal@kernel.org \
--cc=clm@fb.com \
--cc=dsterba@suse.com \
--cc=johannes.thumshirn@wdc.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=naohiro.aota@wdc.com \
--cc=patches@lists.linux.dev \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).