public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH AUTOSEL 6.18-5.10] btrfs: scrub: always update btrfs_scrub_progress::last_physical
       [not found] <20251206140252.645973-1-sashal@kernel.org>
@ 2025-12-06 14:02 ` Sasha Levin
  0 siblings, 0 replies; 2+ messages in thread
From: Sasha Levin @ 2025-12-06 14:02 UTC (permalink / raw)
  To: patches, stable; +Cc: Qu Wenruo, David Sterba, Sasha Levin, clm, linux-btrfs

From: Qu Wenruo <wqu@suse.com>

[ Upstream commit 54df8b80cc63aa0f22c4590cad11542731ed43ff ]

[BUG]
When a scrub failed immediately without any byte scrubbed, the returned
btrfs_scrub_progress::last_physical will always be 0, even if there is a
non-zero @start passed into btrfs_scrub_dev() for resume cases.

This will reset the progress and make later scrub resume start from the
beginning.

[CAUSE]
The function btrfs_scrub_dev() accepts a @progress parameter to copy its
updated progress to the caller, there are cases where we either don't
touch progress::last_physical at all or copy 0 into last_physical:

- last_physical not updated at all
  If some error happened before scrubbing any super block or chunk, we
  will not copy the progress, leaving the @last_physical untouched.

  E.g. failed to allocate @sctx, scrubbing a missing device or even
  there is already a running scrub and so on.

  All those cases won't touch @progress at all, resulting the
  last_physical untouched and will be left as 0 for most cases.

- Error out before scrubbing any bytes
  In those case we allocated @sctx, and sctx->stat.last_physical is all
  zero (initialized by kvzalloc()).
  Unfortunately some critical errors happened during
  scrub_enumerate_chunks() or scrub_supers() before any stripe is really
  scrubbed.

  In that case although we will copy sctx->stat back to @progress, since
  no byte is really scrubbed, last_physical will be overwritten to 0.

[FIX]
Make sure the parameter @progress always has its @last_physical member
updated to @start parameter inside btrfs_scrub_dev().

At the very beginning of the function, set @progress->last_physical to
@start, so that even if we error out without doing progress copying,
last_physical is still at @start.

Then after we got @sctx allocated, set sctx->stat.last_physical to
@start, this will make sure even if we didn't get any byte scrubbed, at
the progress copying stage the @last_physical is not left as zero.

This should resolve the resume progress reset problem.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:



 fs/btrfs/scrub.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index ba20d9286a340..cff67ed630196 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -3039,6 +3039,10 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start,
 	unsigned int nofs_flag;
 	bool need_commit = false;
 
+	/* Set the basic fallback @last_physical before we got a sctx. */
+	if (progress)
+		progress->last_physical = start;
+
 	if (btrfs_fs_closing(fs_info))
 		return -EAGAIN;
 
@@ -3057,6 +3061,7 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start,
 	sctx = scrub_setup_ctx(fs_info, is_dev_replace);
 	if (IS_ERR(sctx))
 		return PTR_ERR(sctx);
+	sctx->stat.last_physical = start;
 
 	ret = scrub_workers_get(fs_info);
 	if (ret)
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* [PATCH AUTOSEL 6.18-5.10] btrfs: scrub: always update btrfs_scrub_progress::last_physical
       [not found] <20251209001610.611575-1-sashal@kernel.org>
@ 2025-12-09  0:15 ` Sasha Levin
  0 siblings, 0 replies; 2+ messages in thread
From: Sasha Levin @ 2025-12-09  0:15 UTC (permalink / raw)
  To: patches, stable; +Cc: Qu Wenruo, David Sterba, Sasha Levin, clm, linux-btrfs

From: Qu Wenruo <wqu@suse.com>

[ Upstream commit 54df8b80cc63aa0f22c4590cad11542731ed43ff ]

[BUG]
When a scrub failed immediately without any byte scrubbed, the returned
btrfs_scrub_progress::last_physical will always be 0, even if there is a
non-zero @start passed into btrfs_scrub_dev() for resume cases.

This will reset the progress and make later scrub resume start from the
beginning.

[CAUSE]
The function btrfs_scrub_dev() accepts a @progress parameter to copy its
updated progress to the caller, there are cases where we either don't
touch progress::last_physical at all or copy 0 into last_physical:

- last_physical not updated at all
  If some error happened before scrubbing any super block or chunk, we
  will not copy the progress, leaving the @last_physical untouched.

  E.g. failed to allocate @sctx, scrubbing a missing device or even
  there is already a running scrub and so on.

  All those cases won't touch @progress at all, resulting the
  last_physical untouched and will be left as 0 for most cases.

- Error out before scrubbing any bytes
  In those case we allocated @sctx, and sctx->stat.last_physical is all
  zero (initialized by kvzalloc()).
  Unfortunately some critical errors happened during
  scrub_enumerate_chunks() or scrub_supers() before any stripe is really
  scrubbed.

  In that case although we will copy sctx->stat back to @progress, since
  no byte is really scrubbed, last_physical will be overwritten to 0.

[FIX]
Make sure the parameter @progress always has its @last_physical member
updated to @start parameter inside btrfs_scrub_dev().

At the very beginning of the function, set @progress->last_physical to
@start, so that even if we error out without doing progress copying,
last_physical is still at @start.

Then after we got @sctx allocated, set sctx->stat.last_physical to
@start, this will make sure even if we didn't get any byte scrubbed, at
the progress copying stage the @last_physical is not left as zero.

This should resolve the resume progress reset problem.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

## ANALYSIS SUMMARY

### 1. Commit Message Analysis

The commit describes a clear bug with well-structured [BUG], [CAUSE],
and [FIX] sections:
- **Bug**: When scrub fails early (before scrubbing any bytes),
  `last_physical` is returned as 0 instead of the `start` parameter,
  causing resume to restart from the beginning
- **Root cause**: Either `last_physical` isn't touched at all (early
  error paths) or it's left as 0 (kvzalloc zeroes sctx->stat)
- **Notable**: No explicit "Cc: stable@vger.kernel.org" or "Fixes:" tag,
  but has "Reviewed-by: David Sterba" (btrfs maintainer)

### 2. Code Change Analysis

The fix adds just **2 lines of code** (plus a comment):

```c
/* Set the basic fallback @last_physical before we got a sctx. */
if (progress)
    progress->last_physical = start;
```

And after `sctx` allocation:
```c
sctx->stat.last_physical = start;
```

**Technical mechanism**: The function `btrfs_scrub_dev()` takes a
`start` parameter indicating where to begin (or resume) scrubbing. The
`progress` struct is returned to userspace even on error (see
`btrfs_ioctl_scrub()` comment: "Copy scrub args to user space even if
btrfs_scrub_dev() returned an error...Later user space can...resume
scrub from where it left off"). Without this fix, if scrub fails early,
`last_physical` is 0, causing btrfs-progs to restart from the beginning.

### 3. Classification

- **Bug fix**: Yes - fixes incorrect initialization of a progress
  tracking field
- **Not an exception category**: Regular bug fix, not device
  IDs/quirks/DT

### 4. Scope and Risk Assessment

- **Scope**: 2 lines in 1 function (`btrfs_scrub_dev()`)
- **Risk**: **EXTREMELY LOW** - just initializing a value to `start`
  instead of leaving it as 0
- The fix is purely additive and defensive; it doesn't change the normal
  code path where scrubbing succeeds
- If scrubbing proceeds normally, `last_physical` gets updated with
  actual progress anyway

### 5. User Impact

- **Who**: Users with large btrfs filesystems who use scrub resume
- **Severity**: Moderate usability issue - scrub on multi-TB filesystems
  can take hours/days; restarting from 0 wastes significant time
- **Real-world**: The ioctl is explicitly designed for resume
  functionality (per the code comment), and btrfs-progs relies on this

### 6. Stability Indicators

- ✅ **Reviewed-by**: David Sterba (btrfs maintainer)
- ✅ **Signed-off-by**: David Sterba (btrfs maintainer)
- The fix has been properly reviewed through the btrfs maintainer tree

### 7. Dependency Check

- No dependencies on other commits
- The affected code structure (`btrfs_scrub_dev`,
  `btrfs_scrub_progress`, `start` parameter) has existed for many years
  and is present in all stable kernels

## Verdict

This commit is an excellent candidate for stable backport:

1. **Fixes a real bug**: Scrub resume is broken when scrub fails early
2. **Small and surgical**: Just 2 lines of initialization code
3. **Obviously correct**: Setting `last_physical = start` is clearly the
   right behavior
4. **Low risk**: Cannot break anything - just ensures proper
   initialization of a progress field
5. **Benefits stable users**: Large filesystem users who rely on scrub
   resume functionality
6. **Maintainer reviewed**: Reviewed and signed off by btrfs maintainer

The lack of explicit "Cc: stable" tag appears to be an oversight - the
bug clearly affects all kernels with scrub resume functionality, and the
fix is exactly the type of small, safe bugfix appropriate for stable
trees.

**YES**

 fs/btrfs/scrub.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index ba20d9286a340..cff67ed630196 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -3039,6 +3039,10 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start,
 	unsigned int nofs_flag;
 	bool need_commit = false;
 
+	/* Set the basic fallback @last_physical before we got a sctx. */
+	if (progress)
+		progress->last_physical = start;
+
 	if (btrfs_fs_closing(fs_info))
 		return -EAGAIN;
 
@@ -3057,6 +3061,7 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start,
 	sctx = scrub_setup_ctx(fs_info, is_dev_replace);
 	if (IS_ERR(sctx))
 		return PTR_ERR(sctx);
+	sctx->stat.last_physical = start;
 
 	ret = scrub_workers_get(fs_info);
 	if (ret)
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2025-12-09  0:16 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20251206140252.645973-1-sashal@kernel.org>
2025-12-06 14:02 ` [PATCH AUTOSEL 6.18-5.10] btrfs: scrub: always update btrfs_scrub_progress::last_physical Sasha Levin
     [not found] <20251209001610.611575-1-sashal@kernel.org>
2025-12-09  0:15 ` Sasha Levin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox