From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: robbieko <robbieko@synology.com>, David Sterba <dsterba@suse.com>,
Sasha Levin <sashal@kernel.org>,
clm@fb.com, linux-btrfs@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: [PATCH AUTOSEL 7.0-6.18] btrfs: replace ASSERT with proper error handling in stripe lookup fallback
Date: Tue, 5 May 2026 05:51:33 -0400
Message-ID: <20260505095149.512052-17-sashal@kernel.org>
In-Reply-To: <20260505095149.512052-1-sashal@kernel.org>
From: robbieko <robbieko@synology.com>
[ Upstream commit 653361585d251fbca0e19ac58b04ba95dd01e378 ]
After falling back to the previous item in btrfs_delete_raid_extent(),
the code uses ASSERT(found_start <= start) to verify the found extent
actually precedes our target range. If the B-tree state is unexpected
(e.g. no overlapping extent exists), this triggers a kernel BUG/panic
in debug builds, or silently continues with wrong data otherwise.
Replace the ASSERT with a proper bounds check that returns -ENOENT if
the found extent does not actually overlap with the start position.
Signed-off-by: robbieko <robbieko@synology.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
## Phase 1: Commit Message Forensics
Step 1.1 Record: Subsystem `btrfs`; action verb `replace`; intent is to
replace an assertion in `btrfs_delete_raid_extent()` with real error
handling for stripe lookup fallback.
Step 1.2 Record: Tags in commit
`653361585d251fbca0e19ac58b04ba95dd01e378`: `Signed-off-by: robbieko
<robbieko@synology.com>`, `Reviewed-by: David Sterba
<dsterba@suse.com>`, `Signed-off-by: David Sterba <dsterba@suse.com>`.
No `Fixes:`, no `Reported-by:`, no `Tested-by:`, no `Link:`, no `Cc:
stable`.
Step 1.3 Record: The commit says that after fallback to the previous
item, `ASSERT(found_start <= start)` can BUG/panic when B-tree state is
unexpected, and non-assert builds can continue with wrong stripe data.
Root cause described: the previous item found may not actually overlap
the requested deletion range.
Step 1.4 Record: This is not hidden cleanup. It is explicitly a bug fix:
it converts an invariant-only check into a runtime bounds check
returning `-ENOENT`.
## Phase 2: Diff Analysis
Step 2.1 Record: One file changed: `fs/btrfs/raid-stripe-tree.c`, 4
insertions, 1 deletion. One function changed:
`btrfs_delete_raid_extent()`. Scope: single-file surgical fix.
Step 2.2 Record: Before, fallback loaded the previous key, computed
`found_start` and `found_end`, then asserted only `found_start <=
start`. After, it returns `-ENOENT` and exits the delete loop if
`found_start > start` or `found_end <= start`. This affects the stripe
extent deletion lookup path after `btrfs_search_slot()` chooses an item
after the deletion start and the code backs up to a candidate previous
item.
Step 2.3 Record: Bug category is logic/correctness with data-integrity
implications. Broken mechanism: the fallback candidate was assumed to
overlap the deletion start. The fix verifies both start and end bounds
before later code can truncate, split, or delete a stripe extent.
Step 2.4 Record: Fix quality is high: the check is local, easy to reason
about, and preserves existing error propagation. Regression risk is low,
but not zero: returning `-ENOENT` turns a previously silent no-op path
into a transaction abort at the caller. That is appropriate because the
caller already aborts on `btrfs_delete_raid_extent()` errors and the
condition means the stripe mapping lookup is inconsistent for the
deletion being performed.
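The bounds check described in Step 2.2 amounts to a half-open interval
containment test: the candidate is accepted only when
`found_start <= start < found_end`. The following is a minimal user-space
sketch of that condition, not kernel code. The `stripe_candidate` struct
and the `check_fallback_candidate()` helper are hypothetical names
introduced only for illustration; the field names follow how the patch
derives `found_start` and `found_end` from the key.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical user-space model of the fallback candidate; not a kernel type. */
struct stripe_candidate {
	uint64_t objectid;	/* start of the stripe extent (key.objectid) */
	uint64_t offset;	/* length of the stripe extent (key.offset) */
};

/*
 * Return 0 if the candidate covers 'start', that is
 * found_start <= start < found_end, and -ENOENT otherwise.
 * The condition mirrors the one added by the patch.
 */
static int check_fallback_candidate(const struct stripe_candidate *c,
				    uint64_t start)
{
	uint64_t found_start = c->objectid;
	uint64_t found_end = found_start + c->offset;

	if (found_start > start || found_end <= start)
		return -ENOENT;
	return 0;
}
```

In this model, a candidate covering `[100, 150)` is accepted for
`start = 100` or `start = 120`, and rejected for `start = 99` (candidate
begins after the target) and for `start = 150` (candidate ends at or
before the target).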
## Phase 3: Git History Investigation
Step 3.1 Record: `git blame` shows the fallback block and
`ASSERT(found_start <= start)` were introduced by `76643119045eed`
(`btrfs: fix deletion of a range spanning parts two RAID stripe
extents`), first contained in `v6.14-rc1`. The broader partial deletion
machinery was introduced by `6aea95ee318890`, first contained in
`v6.13-rc1`.
Step 3.2 Record: No `Fixes:` tag is present, so there was no tagged
introducer to follow. Blame identifies the direct introducer.
Step 3.3 Record: Recent `master` history shows this was patch 4 in a
six-patch raid-stripe-tree deletion fix set: preceding fixes include
copying `devid`, fixing leaf-boundary lookup, and fixing
`btrfs_previous_item()` `min_objectid`; following fixes handle `-EAGAIN`
and missing return checks. The commit applies cleanly to the current
`stable/linux-7.0.y` checkout by itself, but it is best considered with
the neighboring raid-stripe-tree deletion fixes.
Step 3.4 Record: Author `robbieko` has multiple adjacent
`fs/btrfs/raid-stripe-tree.c` fixes in `master`. Reviewer/committer
David Sterba is listed as Btrfs maintainer in `MAINTAINERS`. Johannes
Thumshirn, who reviewed other patches in the series, has substantial
prior raid-stripe-tree history in the same file.
Step 3.5 Record: No new helper, structure, API, or external dependency
is introduced. Syntactically standalone: `git apply --check` of the
candidate patch succeeded on the current `stable/linux-7.0.y` checkout.
Semantically, it complements the surrounding lookup fixes.
## Phase 4: Mailing List And External Research
Step 4.1 Record: `b4 dig -c 653361585d251...` found the original
submission at
`https://patch.msgid.link/20260413065249.2320122-5-robbieko@synology.com`.
`b4 dig -a -C` showed only v1. The thread was a six-patch series titled
`btrfs: fix multiple bugs in raid-stripe-tree deletion path`.
Step 4.2 Record: `b4 dig -w` showed the original recipients were
`robbieko` and `linux-btrfs@vger.kernel.org`. In-thread
review/maintainer discussion involved David Sterba and Johannes
Thumshirn. David added the series to `for-next`.
Step 4.3 Record: No `Reported-by:` or `Link:` tag exists for an external
bug report. No syzbot or bugzilla report was present in the commit or
mbox.
Step 4.4 Record: Related patches found in the same series:
`513f8a52eed88`, `2aef5cb1dcf9b`, `1871ae78ffa5c`, `fe0cdfd7118d8`, and
`a8d58a7c02009`. The cover letter states all six fix bugs in raid-
stripe-tree deletion/partial deletion paths.
Step 4.5 Record: No stable-specific nomination was found in the mbox.
WebFetch attempts to lore/stable and lore/all were blocked by Anubis, so
external stable-list search could not be independently verified.
## Phase 5: Code Semantic Analysis
Step 5.1 Record: Modified function: `btrfs_delete_raid_extent()`.
Step 5.2 Record: Callers found by search: production caller
`do_free_extent_accounting()` in `fs/btrfs/extent-tree.c`, plus Btrfs
raid-stripe-tree selftests. `do_free_extent_accounting()` calls
`btrfs_delete_raid_extent()` for data extents and aborts the transaction
on error.
Step 5.3 Record: Key callees in `btrfs_delete_raid_extent()` include
`btrfs_find_chunk_map()`, `btrfs_need_stripe_tree_update()`,
`btrfs_search_slot()`, `btrfs_previous_item()`, `btrfs_del_item()`,
`btrfs_duplicate_item()`, and `btrfs_partially_delete_raid_extent()`.
Step 5.4 Record: Reachability verified: file extent removal paths call
`btrfs_free_extent()`, delayed refs call `__btrfs_free_extent()`, and
when refs reach zero `do_free_extent_accounting()` calls
`btrfs_delete_raid_extent()`. `btrfs_fallocate()` calls
`btrfs_punch_hole()` for `FALLOC_FL_PUNCH_HOLE`, which calls
`btrfs_replace_file_extents()`, which calls `btrfs_drop_extents()`,
which calls `btrfs_free_extent()`. So users with write access to files
on an affected Btrfs filesystem can reach this path via hole punching;
other data extent deletion paths also reach it.
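For reference, the user-space entry point named in Step 5.4 is the
`fallocate(2)` system call with `FALLOC_FL_PUNCH_HOLE`, which must be
combined with `FALLOC_FL_KEEP_SIZE`. The sketch below only shows how such
a call is issued; `punch_hole()` is a hypothetical wrapper name. Whether
a given call actually reaches `btrfs_delete_raid_extent()` depends on the
filesystem, the raid-stripe-tree feature, and the block group profile,
none of which this snippet checks.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* needed for fallocate() and FALLOC_FL_* in glibc */
#endif
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <sys/types.h>

/*
 * Hypothetical helper: deallocate the byte range [offset, offset + len)
 * of an open file without changing its size.
 * Returns 0 on success or a negative errno value on failure.
 */
static int punch_hole(int fd, off_t offset, off_t len)
{
	int mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE;

	if (fallocate(fd, mode, offset, len) < 0)
		return -errno;
	return 0;
}
```

A caller would open a file for writing and invoke, for example,
`punch_hole(fd, 0, 4096)`. Only write access to the file is required,
which is why the analysis treats this path as reachable by ordinary
users.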
Step 5.5 Record: Similar nearby issue pattern found in the same series:
multiple small fixes to `btrfs_delete_raid_extent()` and
`btrfs_partially_delete_raid_extent()` addressing missed entries, wrong
previous-item bounds, stale leaf pointer, missing `devid`, and unchecked
return values.
## Phase 6: Stable Tree Analysis
Step 6.1 Record: The directly blamed fallback code is first in
`v6.14-rc1`; merge-base checks show it is not in `v6.13`, but is in
`v6.14`, `v6.15`, `v6.16`, and `v7.0`. It is therefore relevant to
stable trees based on `v6.14+`, including the current
`stable/linux-7.0.y` checkout.
Step 6.2 Record: Backport difficulty is low. `git apply --check` of the
candidate patch onto current `stable/linux-7.0.y` succeeded. Neighboring
series patches also applied cleanly in the same check.
Step 6.3 Record: Local stable refs `stable/linux-6.16.y` and
`stable/linux-7.0.y` do not contain the candidate or the adjacent April
2026 raid-stripe-tree deletion fixes checked by subject/ancestor tests.
## Phase 7: Subsystem And Maintainer Context
Step 7.1 Record: Subsystem is Btrfs filesystem, specifically raid-
stripe-tree. Criticality: important, because it is filesystem
data/extent mapping code, but affected population is feature-specific
rather than universal.
Step 7.2 Record: Subsystem activity is high. Recent history in
`fs/btrfs/raid-stripe-tree.c` shows many fixes and follow-up selftests.
The feature is also exposed only under `CONFIG_BTRFS_EXPERIMENTAL` sysfs
feature attributes in this tree, so impact is limited to systems using
that feature.
## Phase 8: Impact And Risk Assessment
Step 8.1 Record: Affected users are Btrfs users with `RAID_STRIPE_TREE`
enabled and stripe-tree updates needed for data block group profiles
covered by `BTRFS_RST_SUPP_BLOCK_GROUP_MASK`.
Step 8.2 Record: Trigger conditions: deleting/freeing data extents on
such filesystems when stripe lookup fallback selects a non-overlapping
candidate. Verified reachable through file hole punching and delayed-ref
extent free paths. I did not verify a concrete reproducer for the exact
bad B-tree state.
Step 8.3 Record: Failure mode severity is high. With
`CONFIG_BTRFS_ASSERT`, `ASSERT()` calls `BUG()`, so this can crash the
kernel. When the assertion is compiled out, the later deletion logic
can operate on a non-overlapping stripe extent, which is a filesystem
mapping corruption risk. If the previous candidate ends before the
target, the old code could also silently stop with success.
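The difference between the two build modes can be modeled in user space.
This is a simplified sketch under stated assumptions: `MODEL_ASSERT` is a
stand-in and not the kernel's `ASSERT()` macro, `MODEL_ASSERT_ENABLED` is
a hypothetical switch playing the role of `CONFIG_BTRFS_ASSERT`, and
`abort()` stands in for `BUG()`. Both helper functions are illustrative
names, not kernel functions.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Simplified stand-in for an assertion macro; NOT the kernel's ASSERT().
 * Define MODEL_ASSERT_ENABLED to model a build with assertions enabled.
 */
#ifdef MODEL_ASSERT_ENABLED
#define MODEL_ASSERT(expr) do { if (!(expr)) abort(); } while (0)
#else
#define MODEL_ASSERT(expr) ((void)(expr))
#endif

/* Old behavior: an invariant check only, with no runtime error path. */
static int old_fallback_check(uint64_t found_start, uint64_t start)
{
	MODEL_ASSERT(found_start <= start);
	return 0;	/* the caller proceeds with whatever candidate it has */
}

/* New behavior: a runtime bounds check that reports -ENOENT. */
static int new_fallback_check(uint64_t found_start, uint64_t found_end,
			      uint64_t start)
{
	if (found_start > start || found_end <= start)
		return -ENOENT;
	return 0;
}
```

With `MODEL_ASSERT_ENABLED` undefined, `old_fallback_check(200, 100)`
returns 0 even though the candidate starts after the target, while
`new_fallback_check(200, 260, 100)` returns `-ENOENT`. With the switch
defined, the old variant aborts the process instead, which is the
user-space analogue of the crash described above.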
Step 8.4 Record: Benefit is high for affected users because it prevents
panic or wrong stripe-tree mutation in filesystem deletion code. Risk is
low because the patch is a four-line validation check in one function
and changes invalid lookup state into an error returned to an existing
transaction-abort path.
## Phase 9: Final Synthesis
Step 9.1 Record: Evidence for backporting: real filesystem correctness
bug, possible kernel BUG in assert builds, possible wrong stripe mapping
mutation in non-assert builds, one-file minimal patch, maintainer review
by David Sterba, merged by Linus via Btrfs fixes pull, applies cleanly
to current stable checkout, affected code exists in `v6.14+`. Evidence
against: feature-specific and experimental; no concrete external bug
report or reproducer found; part of a six-patch series, so stable
maintainers should consider taking the nearby deletion fixes too.
Unresolved: no successful WebFetch of lore/stable search due to Anubis;
exact real-world frequency not verified.
Step 9.2 Record: Stable rules checklist:
1. Obviously correct and tested: mostly yes by inspection and maintainer
review; no explicit `Tested-by`.
2. Fixes a real bug: yes, invalid stripe lookup fallback handling in
deletion path.
3. Important issue: yes for affected users, because panic and filesystem
mapping corruption are serious.
4. Small and contained: yes, 4 insertions/1 deletion in one function.
5. No new features/APIs: yes.
6. Can apply to stable: yes on current `stable/linux-7.0.y`; relevant
for `v6.14+` trees.
Step 9.3 Record: No exception category applies. This is not a device ID,
quirk, DT, build, or documentation patch.
Step 9.4 Record: Decision is to backport. The limited affected
population does not outweigh the severity and low risk: this is a tiny
filesystem correctness fix for code already present in recent stable
trees.
## Verification
- [Phase 1] `git show --format=fuller --stat 653361585d251...`:
confirmed subject, message, tags, reviewer, and 4 insertions/1
deletion.
- [Phase 2] `git show 653361585d251... -- fs/btrfs/raid-stripe-tree.c`:
confirmed exact hunk replacing `ASSERT(found_start <= start)` with
`found_start > start || found_end <= start` check returning `-ENOENT`.
- [Phase 3] `git blame -L 114,146 -- fs/btrfs/raid-stripe-tree.c`:
confirmed fallback block introduced by `76643119045eed`.
- [Phase 3] `git show 76643119045eed`: confirmed it added the range-
spanning fallback logic.
- [Phase 3] `git describe --contains 76643119045eed`: confirmed first
containment at `v6.14-rc1`.
- [Phase 3] `git show 6aea95ee318890`: confirmed earlier partial
deletion implementation and its ASSERT-related context.
- [Phase 3] `git log master --oneline -30 -- fs/btrfs/raid-stripe-tree.c`:
confirmed related series commits before and after candidate.
- [Phase 3] `git apply --check` using the candidate diff: confirmed
clean application on current `stable/linux-7.0.y`.
- [Phase 4] `b4 dig -c 653361585d251...`: found original lore patch URL.
- [Phase 4] `b4 dig -a -C`: found v1 only.
- [Phase 4] `b4 dig -w`: confirmed original recipients.
- [Phase 4] Saved and read b4 mbox: confirmed six-patch series, David
Sterba maintainer discussion, David’s “Added to for-next”, Johannes’
question about `ENOENT` vs `EUCLEAN`, and David’s acceptance of
`ENOENT`.
- [Phase 5] `rg btrfs_delete_raid_extent`: confirmed production caller
in `do_free_extent_accounting()` and selftests.
- [Phase 5] Read `fs/btrfs/extent-tree.c`: confirmed errors from
`btrfs_delete_raid_extent()` abort transaction.
- [Phase 5] Read `fs/btrfs/file.c`: confirmed `btrfs_fallocate()` hole
punching can reach extent replacement/drop code and
`btrfs_free_extent()`.
- [Phase 5] Read `fs/btrfs/messages.h`: confirmed `ASSERT()` calls
`BUG()` when `CONFIG_BTRFS_ASSERT` is enabled and compiles away
otherwise.
- [Phase 6] `git merge-base --is-ancestor` checks: confirmed blamed code
is in `v6.14+` and candidate is not in local stable refs checked.
- [Phase 7] `rg MAINTAINERS`: confirmed David Sterba is a Btrfs
maintainer.
- [Phase 7] `rg RAID_STRIPE_TREE fs/btrfs/sysfs.c`: confirmed raid-
stripe-tree feature attribute is under `CONFIG_BTRFS_EXPERIMENTAL`.
- [Phase 8] Read `fs/btrfs/raid-stripe-tree.h`: confirmed
`btrfs_need_stripe_tree_update()` limits affected data profiles and
requires `RAID_STRIPE_TREE`.
- UNVERIFIED: exact real-world frequency and a concrete reproducer for
the bad fallback state.
- UNVERIFIED: lore/stable search results, because WebFetch was blocked
by Anubis.
**YES**
fs/btrfs/raid-stripe-tree.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c
index dd924048c6659..a2e9ac2d97988 100644
--- a/fs/btrfs/raid-stripe-tree.c
+++ b/fs/btrfs/raid-stripe-tree.c
@@ -151,7 +151,10 @@ int btrfs_delete_raid_extent(struct btrfs_trans_handle *trans, u64 start, u64 le
 			btrfs_item_key_to_cpu(leaf, &key, slot);
 			found_start = key.objectid;
 			found_end = found_start + key.offset;
-			ASSERT(found_start <= start);
+			if (found_start > start || found_end <= start) {
+				ret = -ENOENT;
+				break;
+			}
 		}
 
 		if (key.type != BTRFS_RAID_STRIPE_KEY)
--
2.53.0