patches.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Filipe Manana <fdmanana@suse.com>, Boris Burkov <boris@bur.io>,
	Qu Wenruo <wqu@suse.com>, David Sterba <dsterba@suse.com>,
	Sasha Levin <sashal@kernel.org>,
	clm@fb.com, linux-btrfs@vger.kernel.org
Subject: [PATCH AUTOSEL 6.17-6.12] btrfs: abort transaction on specific error places when walking log tree
Date: Mon,  6 Oct 2025 14:17:36 -0400	[thread overview]
Message-ID: <20251006181835.1919496-4-sashal@kernel.org> (raw)
In-Reply-To: <20251006181835.1919496-1-sashal@kernel.org>

From: Filipe Manana <fdmanana@suse.com>

[ Upstream commit 6ebd726b104fa99d47c0d45979e6a6109844ac18 ]

We do several things while walking a log tree (for replaying and for
freeing a log tree) like reading extent buffers and cleaning them up,
but we don't immediately abort the transaction, or turn the fs into an
error state, when one of these things fails. Instead we the transaction
abort or turn the fs into error state in the caller of the entry point
function that walks a log tree - walk_log_tree() - which means we don't
get to know exactly where an error came from.

Improve on this by doing a transaction abort / turn fs into error state
after each such failure so that when it happens we have a better
understanding where the failure comes from. This deliberately leaves
the transaction abort / turn fs into error state in the callers of
walk_log_tree() as to ensure we don't get into an inconsistent state in
case we forget to do it deeper in call chain. It also deliberately does
not do it after errors from the calls to the callback defined in
struct walk_control::process_func(), as we will do it later on another
patch.

Reviewed-by: Boris Burkov <boris@bur.io>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

## YES

### Extensive Analysis and Rationale:

#### Commit Context and History:
This commit (6ebd726b104fa) is part of a comprehensive series by Filipe
Manana from July-September 2025 focused on improving error handling and
debugging capabilities in btrfs log tree operations. The series
includes:

1. **July 16, 2025**: fc5799986fbca "btrfs: error on missing block group
   when unaccounting log tree extent buffers" - explicitly marked for
   stable (CC: stable@vger.kernel.org # 6.12+) and addresses space leaks
2. **July 16, 2025**: 6ebd726b104fa (this commit) - adds transaction
   aborts at specific error points
3. **July 18, 2025**: 912c257c88cd8 "btrfs: abort transaction where
   errors happen during log tree replay" - extends error handling to
   replay_one_buffer() callback
4. **Sept 5, 2025**: 2753e49176240 "btrfs: dump detailed info and
   specific messages on log replay failures" - replaces all abort calls
   with enhanced debugging helper

#### Code Changes Analysis:

**In `clean_log_buffer()` (fs/btrfs/tree-log.c:2630):**
- Previously: `btrfs_pin_reserved_extent()` and `unaccount_log_buffer()`
  errors were returned but no transaction abort occurred
- After: Adds `btrfs_abort_transaction(trans, ret)` when pinning fails,
  and `btrfs_handle_fs_error()` when unaccounting fails
- Impact: Prevents continuing with log tree cleanup after extent
  pinning/accounting failures, which could lead to metadata space leaks

**In `walk_down_log_tree()` (fs/btrfs/tree-log.c:2674, 2690, 2705):**
Three specific error points now abort the transaction:
1. Line 2677: `btrfs_find_create_tree_block()` failure - couldn't
   allocate/find log tree block
2. Line 2690: `btrfs_read_extent_buffer()` failure at level 1 - couldn't
   read log leaf
3. Line 2705: `btrfs_read_extent_buffer()` failure at other levels -
   couldn't read log node

Each error path now calls either `btrfs_abort_transaction(trans, ret)`
(when transaction context exists) or `btrfs_handle_fs_error(fs_info,
ret, NULL)` (when freeing log without transaction).

#### Why This Should Be Backported:

1. **Dependency Chain**: This commit directly follows fc5799986fbca
   which changed `unaccount_log_buffer()` from void to returning int.
   Without this commit, those new error returns are not properly
   handled, defeating the purpose of that stable-marked fix.

2. **Error Containment**: The commit prevents silent error propagation
   that could lead to:
   - Filesystem inconsistencies during log replay
   - Metadata space accounting errors
   - Corrupted log trees that cause mount failures
   - Hard-to-debug production issues where errors occur but aren't
     immediately visible

3. **Minimal Risk**: The changes are defensive in nature:
   - Only adds error handling on already-failing code paths
   - Doesn't modify success path logic
   - Follows existing btrfs patterns for transaction abort
   - All error conditions already existed; this just makes them explicit

4. **Production Debugging Value**: The commit message explicitly states
   this helps "figuring out where failures come from" - critical for:
   - Debugging customer issues in production environments
   - Reducing time to identify root causes of log replay failures
   - Preventing cascading failures from silent errors

5. **Expert Review**: Reviewed by experienced btrfs developers (Boris
   Burkov, Qu Wenruo) and maintained by David Sterba, indicating
   thorough vetting

6. **Series Coherence**: Follow-up commits (especially 2753e49176240)
   depend on these abort points existing, creating a clean error
   handling architecture

#### Risk Assessment:

**Low regression risk because:**
- Transaction abort after failure is safer than continuing with
  corrupted state
- The abort paths are only triggered when operations already failed
- No new failure modes are introduced
- The commit doesn't change buffer management, extent handling, or tree
  traversal logic
- Similar patterns exist throughout btrfs code (12+ other
  btrfs_abort_transaction calls already in tree-log.c as shown by grep)

**Potential impacts:**
- Systems with underlying I/O issues may see transaction aborts sooner
  (this is good - fail fast)
- Log replay will be more strict about errors (this is the intended
  behavior)
- No performance impact on success paths

#### Conclusion:

This commit should be backported because it:
1. Is part of a critical error handling improvement series where one
   commit (fc5799986fbca) is already marked for stable
2. Prevents filesystem corruption by properly handling errors during log
   tree operations
3. Has minimal regression risk (only adds abort on already-failing
   paths)
4. Provides significant debugging value for production issues
5. Is already being backported (commit 8369fd86a93b1 exists in the
   analyzed tree)
6. Follows stable tree rules: important bugfix/hardening, minimal risk,
   confined to btrfs subsystem

The commit improves error handling in a critical filesystem subsystem
(log tree replay/free) where silent failures could lead to data loss or
corruption. While not fixing a specific reported bug, it's a defensive
change that prevents potential corruption scenarios and aids in
debugging real-world issues.

 fs/btrfs/tree-log.c | 33 ++++++++++++++++++++++++++++-----
 1 file changed, 28 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 7a63afedd01e6..6d92326a1a0c7 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -2630,15 +2630,24 @@ static int unaccount_log_buffer(struct btrfs_fs_info *fs_info, u64 start)
 static int clean_log_buffer(struct btrfs_trans_handle *trans,
 			    struct extent_buffer *eb)
 {
+	int ret;
+
 	btrfs_tree_lock(eb);
 	btrfs_clear_buffer_dirty(trans, eb);
 	wait_on_extent_buffer_writeback(eb);
 	btrfs_tree_unlock(eb);
 
-	if (trans)
-		return btrfs_pin_reserved_extent(trans, eb);
+	if (trans) {
+		ret = btrfs_pin_reserved_extent(trans, eb);
+		if (ret)
+			btrfs_abort_transaction(trans, ret);
+		return ret;
+	}
 
-	return unaccount_log_buffer(eb->fs_info, eb->start);
+	ret = unaccount_log_buffer(eb->fs_info, eb->start);
+	if (ret)
+		btrfs_handle_fs_error(eb->fs_info, ret, NULL);
+	return ret;
 }
 
 static noinline int walk_down_log_tree(struct btrfs_trans_handle *trans,
@@ -2674,8 +2683,14 @@ static noinline int walk_down_log_tree(struct btrfs_trans_handle *trans,
 		next = btrfs_find_create_tree_block(fs_info, bytenr,
 						    btrfs_header_owner(cur),
 						    *level - 1);
-		if (IS_ERR(next))
-			return PTR_ERR(next);
+		if (IS_ERR(next)) {
+			ret = PTR_ERR(next);
+			if (trans)
+				btrfs_abort_transaction(trans, ret);
+			else
+				btrfs_handle_fs_error(fs_info, ret, NULL);
+			return ret;
+		}
 
 		if (*level == 1) {
 			ret = wc->process_func(root, next, wc, ptr_gen,
@@ -2690,6 +2705,10 @@ static noinline int walk_down_log_tree(struct btrfs_trans_handle *trans,
 				ret = btrfs_read_extent_buffer(next, &check);
 				if (ret) {
 					free_extent_buffer(next);
+					if (trans)
+						btrfs_abort_transaction(trans, ret);
+					else
+						btrfs_handle_fs_error(fs_info, ret, NULL);
 					return ret;
 				}
 
@@ -2705,6 +2724,10 @@ static noinline int walk_down_log_tree(struct btrfs_trans_handle *trans,
 		ret = btrfs_read_extent_buffer(next, &check);
 		if (ret) {
 			free_extent_buffer(next);
+			if (trans)
+				btrfs_abort_transaction(trans, ret);
+			else
+				btrfs_handle_fs_error(fs_info, ret, NULL);
 			return ret;
 		}
 
-- 
2.51.0


  parent reply	other threads:[~2025-10-06 18:19 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-06 18:17 [PATCH AUTOSEL 6.17-5.4] x86/build: Remove cc-option from stack alignment flags Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.1] btrfs: zoned: refine extent allocator hint selection Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.1] arch: Add the macro COMPILE_OFFSETS to all the asm-offsets.c Sasha Levin
2025-10-06 18:17 ` Sasha Levin [this message]
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-5.4] btrfs: use smp_mb__after_atomic() when forcing COW in create_pending_snapshot() Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17] x86/bugs: Add attack vector controls for VMSCAPE Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.16] sched_ext: Keep bypass on between enable failure and scx_disable_workfn() Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.12] perf/x86/intel: Add ICL_FIXED_0_ADAPTIVE bit into INTEL_FIXED_BITS_MASK Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.12] btrfs: abort transaction if we fail to update inode in log replay dir fixup Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.1] EDAC/mc_sysfs: Increase legacy channel support to 16 Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.12] btrfs: abort transaction in the process_one_buffer() log tree walk callback Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.6] btrfs: zoned: return error from btrfs_zone_finish_endio() Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.6] perf: Skip user unwind if the task is a kernel thread Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.1] perf: Have get_perf_callchain() return NULL if crosstask and user are set Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.16] EDAC: Fix wrong executable file modes for C source files Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.6] perf: Use current->flags & PF_KTHREAD|PF_USER_WORKER instead of current->mm == NULL Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.6] btrfs: use level argument in log tree walk callback replay_one_buffer() Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.12] sched_ext: Make qmap dump operation non-destructive Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.12] seccomp: passthrough uprobe systemcall without filtering Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.12] cpuset: Use new excpus for nocpu error check when enabling root partition Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.12] btrfs: tree-checker: add inode extref checks Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17] sched/fair: update_cfs_group() for throttled cfs_rqs Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-5.15] btrfs: scrub: replace max_t()/min_t() with clamp() in scrub_throttle_dev_io() Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-5.4] x86/bugs: Fix reporting of LFENCE retpoline Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-5.10] btrfs: always drop log root tree reference in btrfs_replay_log() Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17-6.6] audit: record fanotify event regardless of presence of rules Sasha Levin
2025-10-06 18:17 ` [PATCH AUTOSEL 6.17] EDAC/ie31200: Add two more Intel Alder Lake-S SoCs for EDAC support Sasha Levin
2025-10-06 18:18 ` [PATCH AUTOSEL 6.17-6.6] x86/bugs: Report correct retbleed mitigation status Sasha Levin
2025-10-06 21:55 ` [PATCH AUTOSEL 6.17-5.4] x86/build: Remove cc-option from stack alignment flags Nathan Chancellor
2025-10-06 23:13   ` Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251006181835.1919496-4-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=boris@bur.io \
    --cc=clm@fb.com \
    --cc=dsterba@suse.com \
    --cc=fdmanana@suse.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=patches@lists.linux.dev \
    --cc=stable@vger.kernel.org \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).