* [PATCH 01/21] xfs_scrub_media_fail: reduce security lockdowns to avoid postfix problems
2026-06-04 6:04 [PATCHSET] xfs_scrub: codex-inspired bug fixes for 7.1 Darrick J. Wong
@ 2026-06-04 6:04 ` Darrick J. Wong
2026-06-04 12:16 ` Andrey Albershteyn
2026-06-04 6:04 ` [PATCH 02/21] xfs_scrub_all: fix broken command line string array construction Darrick J. Wong
` (19 subsequent siblings)
20 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2026-06-04 6:04 UTC (permalink / raw)
To: aalbersh, djwong; +Cc: linux-xfs, linux-xfs, hch
From: Darrick J. Wong <djwong@kernel.org>
The same lockdown logic of commit 9042fcc08eed6a ("xfs_scrub_fail:
tighten up the security on the background systemd service") was applied
to the media scan failure reporting service. Therefore, it's also
broken on systems that have setgid mailer programs (e.g. postfix).
Fix this by applying the same change from commit 15fd6fc686d5ce here
too.
Cc: <linux-xfs@vger.kernel.org> # v6.17.0
Fixes: 15fd6fc686d5ce ("xfs_scrub_fail: reduce security lockdowns to avoid postfix problems")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
scrub/xfs_scrub_media_fail@.service.in | 58 ++------------------------------
1 file changed, 3 insertions(+), 55 deletions(-)
diff --git a/scrub/xfs_scrub_media_fail@.service.in b/scrub/xfs_scrub_media_fail@.service.in
index 97c0e090721d65..e354dda146f1e3 100644
--- a/scrub/xfs_scrub_media_fail@.service.in
+++ b/scrub/xfs_scrub_media_fail@.service.in
@@ -19,58 +19,6 @@ SupplementaryGroups=systemd-journal
# can control resource usage.
Slice=system-xfs_scrub.slice
-# No realtime scheduling
-RestrictRealtime=true
-
-# Make the entire filesystem readonly and /home inaccessible, then bind mount
-# the filesystem we're supposed to be checking into our private /tmp dir.
-ProtectSystem=full
-ProtectHome=yes
-PrivateTmp=true
-RestrictSUIDSGID=true
-
-# Emailing reports requires network access, but not the ability to change the
-# hostname.
-ProtectHostname=true
-
-# Don't let the program mess with the kernel configuration at all
-ProtectKernelLogs=true
-ProtectKernelModules=true
-ProtectKernelTunables=true
-ProtectControlGroups=true
-ProtectProc=invisible
-RestrictNamespaces=true
-
-# Can't hide /proc because journalctl needs it to find various pieces of log
-# information
-#ProcSubset=pid
-
-# Only allow the default personality Linux
-LockPersonality=true
-
-# No writable memory pages
-MemoryDenyWriteExecute=true
-
-# Don't let our mounts leak out to the host
-PrivateMounts=true
-
-# Restrict system calls to the native arch and only enough to get things going
-SystemCallArchitectures=native
-SystemCallFilter=@system-service
-SystemCallFilter=~@privileged
-SystemCallFilter=~@resources
-SystemCallFilter=~@mount
-
-# xfs_scrub needs these privileges to run, and no others
-CapabilityBoundingSet=
-NoNewPrivileges=true
-
-# Failure reporting shouldn't create world-readable files
-UMask=0077
-
-# Clean up any IPC objects when this unit stops
-RemoveIPC=true
-
-# No access to hardware device files
-PrivateDevices=true
-ProtectClock=true
+# No further restrictions because some installations may have MTAs such as
+# postfix, which require the ability to run setgid programs and other
+# foolishness.
^ permalink raw reply related [flat|nested] 46+ messages in thread* Re: [PATCH 01/21] xfs_scrub_media_fail: reduce security lockdowns to avoid postfix problems
2026-06-04 6:04 ` [PATCH 01/21] xfs_scrub_media_fail: reduce security lockdowns to avoid postfix problems Darrick J. Wong
@ 2026-06-04 12:16 ` Andrey Albershteyn
2026-06-04 16:48 ` Darrick J. Wong
0 siblings, 1 reply; 46+ messages in thread
From: Andrey Albershteyn @ 2026-06-04 12:16 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: aalbersh, linux-xfs, hch
On 2026-06-03 23:04:36, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> The same lockdown logic of commit 9042fcc08eed6a ("xfs_scrub_fail:
> tighten up the security on the background systemd service") was applied
> to the media scan failure reporting service. Therefore, it's also
> broken on systems that have setgid mailer programs (e.g. postfix).
> Fix this by applying the same change from commit 15fd6fc686d5ce here
> too.
>
> Cc: <linux-xfs@vger.kernel.org> # v6.17.0
> Fixes: 15fd6fc686d5ce ("xfs_scrub_fail: reduce security lockdowns to avoid postfix problems")
Fixes: 45ec29cfba02 ("xfs_scrub_all: support metadata+media scans of all filesystems")
This one, no?
Otherwise looks good to me
Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org>
--
- Andrey
^ permalink raw reply [flat|nested] 46+ messages in thread* Re: [PATCH 01/21] xfs_scrub_media_fail: reduce security lockdowns to avoid postfix problems
2026-06-04 12:16 ` Andrey Albershteyn
@ 2026-06-04 16:48 ` Darrick J. Wong
0 siblings, 0 replies; 46+ messages in thread
From: Darrick J. Wong @ 2026-06-04 16:48 UTC (permalink / raw)
To: Andrey Albershteyn; +Cc: linux-xfs, hch
On Thu, Jun 04, 2026 at 02:16:24PM +0200, Andrey Albershteyn wrote:
> On 2026-06-03 23:04:36, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> >
> > The same lockdown logic of commit 9042fcc08eed6a ("xfs_scrub_fail:
> > tighten up the security on the background systemd service") was applied
> > to the media scan failure reporting service. Therefore, it's also
> > broken on systems that have setgid mailer programs (e.g. postfix).
> > Fix this by applying the same change from commit 15fd6fc686d5ce here
> > too.
> >
> > Cc: <linux-xfs@vger.kernel.org> # v6.17.0
> > Fixes: 15fd6fc686d5ce ("xfs_scrub_fail: reduce security lockdowns to avoid postfix problems")
>
> Fixes: 45ec29cfba02 ("xfs_scrub_all: support metadata+media scans of all filesystems")
> This one, no?
Hrm. 45ec is indeed the commit that introduced the overly strict
security posture, but 15fd came after that, and failed to fix the other
two _fail services. I'm not particularly fussed about which commit the
Fixes trailer points to, but 15fd is a more recent commit.
<shrug>
> Otherwise looks good to me
> Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org>
Thanks!
--D
>
> --
> - Andrey
>
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH 02/21] xfs_scrub_all: fix broken command line string array construction
2026-06-04 6:04 [PATCHSET] xfs_scrub: codex-inspired bug fixes for 7.1 Darrick J. Wong
2026-06-04 6:04 ` [PATCH 01/21] xfs_scrub_media_fail: reduce security lockdowns to avoid postfix problems Darrick J. Wong
@ 2026-06-04 6:04 ` Darrick J. Wong
2026-06-04 12:17 ` Andrey Albershteyn
2026-06-04 6:05 ` [PATCH 03/21] xfs_scrub_all_fail: reduce security lockdowns to avoid postfix problems Darrick J. Wong
` (18 subsequent siblings)
20 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2026-06-04 6:04 UTC (permalink / raw)
To: aalbersh, djwong; +Cc: linux-xfs, linux-xfs, hch
From: Darrick J. Wong <djwong@kernel.org>
In scrub_subprocess::__init__(), we construct cmd as an array of strings
to be passed (eventually) to exec. Unfortunately, adding a string to an
array of strings doesn't do what you'd think in Python:
>>> x = ['moo', 'cow']
>>> x += '-x'
>>> x
['moo', 'cow', '-', 'x']
Fix this problem, which Codex has sharp enough eyes to catch.
Cc: <linux-xfs@vger.kernel.org> # v6.10.0
Fixes: a290edcfa0a391 ("xfs_scrub_all: encapsulate all the subprocess code in an object")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
scrub/xfs_scrub_all.py.in | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/scrub/xfs_scrub_all.py.in b/scrub/xfs_scrub_all.py.in
index 3f618587de8b25..9afb7993abc06e 100644
--- a/scrub/xfs_scrub_all.py.in
+++ b/scrub/xfs_scrub_all.py.in
@@ -100,7 +100,7 @@ class scrub_subprocess(scrub_control):
else:
cmd += '@scrub_args@'.split()
if scrub_media:
- cmd += '-x'
+ cmd += ['-x']
cmd += [mnt]
self.cmdline = cmd
self.proc = None
^ permalink raw reply related [flat|nested] 46+ messages in thread* Re: [PATCH 02/21] xfs_scrub_all: fix broken command line string array construction
2026-06-04 6:04 ` [PATCH 02/21] xfs_scrub_all: fix broken command line string array construction Darrick J. Wong
@ 2026-06-04 12:17 ` Andrey Albershteyn
0 siblings, 0 replies; 46+ messages in thread
From: Andrey Albershteyn @ 2026-06-04 12:17 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: linux-xfs, hch
On 2026-06-03 23:04:52, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> In scrub_subprocess::__init__(), we construct cmd as an array of strings
> to be passed (eventually) to exec. Unfortunately, adding a string to an
> array of strings doesn't do what you'd think in Python:
>
> >>> x = ['moo', 'cow']
> >>> x += '-x'
> >>> x
> ['moo', 'cow', '-', 'x']
>
> Fix this problem, which Codex has sharp enough eyes to catch.
>
> Cc: <linux-xfs@vger.kernel.org> # v6.10.0
> Fixes: a290edcfa0a391 ("xfs_scrub_all: encapsulate all the subprocess code in an object")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Looks good to me
Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org>
--
- Andrey
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH 03/21] xfs_scrub_all_fail: reduce security lockdowns to avoid postfix problems
2026-06-04 6:04 [PATCHSET] xfs_scrub: codex-inspired bug fixes for 7.1 Darrick J. Wong
2026-06-04 6:04 ` [PATCH 01/21] xfs_scrub_media_fail: reduce security lockdowns to avoid postfix problems Darrick J. Wong
2026-06-04 6:04 ` [PATCH 02/21] xfs_scrub_all: fix broken command line string array construction Darrick J. Wong
@ 2026-06-04 6:05 ` Darrick J. Wong
2026-06-04 12:21 ` Andrey Albershteyn
2026-06-04 6:05 ` [PATCH 04/21] xfs_scrub_fail: send content headers for xfs_scrub_all failures Darrick J. Wong
` (17 subsequent siblings)
20 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2026-06-04 6:05 UTC (permalink / raw)
To: aalbersh, djwong; +Cc: linux-xfs, linux-xfs, hch
From: Darrick J. Wong <djwong@kernel.org>
The same lockdown logic of commit 9042fcc08eed6a ("xfs_scrub_fail:
tighten up the security on the background systemd service") was applied
to the media scan failure reporting service. Therefore, it's also
broken on systems that have setgid mailer programs (e.g. postfix). Fix
this by applying the same change from commit 15fd6fc686d5ce here too.
While we're at it, put this service in the background scrub slice for
resource control.
Cc: <linux-xfs@vger.kernel.org> # v6.17.0
Fixes: 15fd6fc686d5ce ("xfs_scrub_fail: reduce security lockdowns to avoid postfix problems")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
scrub/xfs_scrub_all_fail.service.in | 59 ++++-------------------------------
1 file changed, 6 insertions(+), 53 deletions(-)
diff --git a/scrub/xfs_scrub_all_fail.service.in b/scrub/xfs_scrub_all_fail.service.in
index 53479db8477175..e11ca5460457a7 100644
--- a/scrub/xfs_scrub_all_fail.service.in
+++ b/scrub/xfs_scrub_all_fail.service.in
@@ -15,57 +15,10 @@ User=mail
Group=mail
SupplementaryGroups=systemd-journal
-# No realtime scheduling
-RestrictRealtime=true
+# Create the service underneath the scrub background service slice so that we
+# can control resource usage.
+Slice=system-xfs_scrub.slice
-# Make the entire filesystem readonly and /home inaccessible.
-ProtectSystem=full
-ProtectHome=yes
-PrivateTmp=true
-RestrictSUIDSGID=true
-
-# Emailing reports requires network access, but not the ability to change the
-# hostname.
-ProtectHostname=true
-
-# Don't let the program mess with the kernel configuration at all
-ProtectKernelLogs=true
-ProtectKernelModules=true
-ProtectKernelTunables=true
-ProtectControlGroups=true
-ProtectProc=invisible
-RestrictNamespaces=true
-
-# Can't hide /proc because journalctl needs it to find various pieces of log
-# information
-#ProcSubset=pid
-
-# Only allow the default personality Linux
-LockPersonality=true
-
-# No writable memory pages
-MemoryDenyWriteExecute=true
-
-# Don't let our mounts leak out to the host
-PrivateMounts=true
-
-# Restrict system calls to the native arch and only enough to get things going
-SystemCallArchitectures=native
-SystemCallFilter=@system-service
-SystemCallFilter=~@privileged
-SystemCallFilter=~@resources
-SystemCallFilter=~@mount
-
-# xfs_scrub needs these privileges to run, and no others
-CapabilityBoundingSet=
-NoNewPrivileges=true
-
-# Failure reporting shouldn't create world-readable files
-UMask=0077
-
-# Clean up any IPC objects when this unit stops
-RemoveIPC=true
-
-# No access to hardware device files
-PrivateDevices=true
-ProtectClock=true
+# No further restrictions because some installations may have MTAs such as
+# postfix, which require the ability to run setgid programs and other
+# foolishness.
^ permalink raw reply related [flat|nested] 46+ messages in thread* Re: [PATCH 03/21] xfs_scrub_all_fail: reduce security lockdowns to avoid postfix problems
2026-06-04 6:05 ` [PATCH 03/21] xfs_scrub_all_fail: reduce security lockdowns to avoid postfix problems Darrick J. Wong
@ 2026-06-04 12:21 ` Andrey Albershteyn
2026-06-04 16:49 ` Darrick J. Wong
0 siblings, 1 reply; 46+ messages in thread
From: Andrey Albershteyn @ 2026-06-04 12:21 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: linux-xfs, hch
On 2026-06-03 23:05:07, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> The same lockdown logic of commit 9042fcc08eed6a ("xfs_scrub_fail:
> tighten up the security on the background systemd service") was applied
> to the media scan failure reporting service. Therefore, it's also
> broken on systems that have setgid mailer programs (e.g. postfix). Fix
> this by applying the same change from commit 15fd6fc686d5ce here too.
>
> While we're at it, put this service in the background scrub slice for
> resource control.
>
> Cc: <linux-xfs@vger.kernel.org> # v6.17.0
> Fixes: 15fd6fc686d5ce ("xfs_scrub_fail: reduce security lockdowns to avoid postfix problems")
Fixes: e040916f649f ("xfs_scrub_all: failure reporting for the xfs_scrub_all job")
?
Otherwise, looks good to me
Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org>
--
- Andrey
^ permalink raw reply [flat|nested] 46+ messages in thread* Re: [PATCH 03/21] xfs_scrub_all_fail: reduce security lockdowns to avoid postfix problems
2026-06-04 12:21 ` Andrey Albershteyn
@ 2026-06-04 16:49 ` Darrick J. Wong
0 siblings, 0 replies; 46+ messages in thread
From: Darrick J. Wong @ 2026-06-04 16:49 UTC (permalink / raw)
To: Andrey Albershteyn; +Cc: linux-xfs, hch
On Thu, Jun 04, 2026 at 02:21:28PM +0200, Andrey Albershteyn wrote:
> On 2026-06-03 23:05:07, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> >
> > The same lockdown logic of commit 9042fcc08eed6a ("xfs_scrub_fail:
> > tighten up the security on the background systemd service") was applied
> > to the media scan failure reporting service. Therefore, it's also
> > broken on systems that have setgid mailer programs (e.g. postfix). Fix
> > this by applying the same change from commit 15fd6fc686d5ce here too.
> >
> > While we're at it, put this service in the background scrub slice for
> > resource control.
> >
> > Cc: <linux-xfs@vger.kernel.org> # v6.17.0
> > Fixes: 15fd6fc686d5ce ("xfs_scrub_fail: reduce security lockdowns to avoid postfix problems")
>
> Fixes: e040916f649f ("xfs_scrub_all: failure reporting for the xfs_scrub_all job")
> ?
Same general comment as before -- 15fd should have fixed up e0409, but
I overlooked that.
> Otherwise, looks good to me
> Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org>
Thanks for the review!
--D
> --
> - Andrey
>
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH 04/21] xfs_scrub_fail: send content headers for xfs_scrub_all failures
2026-06-04 6:04 [PATCHSET] xfs_scrub: codex-inspired bug fixes for 7.1 Darrick J. Wong
` (2 preceding siblings ...)
2026-06-04 6:05 ` [PATCH 03/21] xfs_scrub_all_fail: reduce security lockdowns to avoid postfix problems Darrick J. Wong
@ 2026-06-04 6:05 ` Darrick J. Wong
2026-06-04 12:23 ` Andrey Albershteyn
2026-06-04 6:05 ` [PATCH 05/21] xfs_scrub: fix uninitialized variable Darrick J. Wong
` (16 subsequent siblings)
20 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2026-06-04 6:05 UTC (permalink / raw)
To: aalbersh, djwong; +Cc: linux-xfs, linux-xfs, hch
From: Darrick J. Wong <djwong@kernel.org>
Codex noted that fail_mail() doesn't include the same content headers
that fail_mail_mntpoint() does. As a result, utf8 encoded content for
xfs_scrub_all failures might not be interpreted correctly by mailers,
whereas they will for xfs_scrub@ failures. Fix this by emitting C-T-E
and C-T headers so that mailers will see this as 8-bit encoded utf8.
One could argue that the choice of utf8 is actually system-dependent and
ought to added by the MTA, etc. but I suspect that 99.9% of Linux
systems out there are using utf8 and not some other legacy encoding.
Cc: <linux-xfs@vger.kernel.org> # v6.10.0
Fixes: e040916f649f71 ("xfs_scrub_all: failure reporting for the xfs_scrub_all job")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
scrub/xfs_scrub_fail.in | 2 ++
1 file changed, 2 insertions(+)
diff --git a/scrub/xfs_scrub_fail.in b/scrub/xfs_scrub_fail.in
index 089b438f03c018..7893e208027483 100755
--- a/scrub/xfs_scrub_fail.in
+++ b/scrub/xfs_scrub_fail.in
@@ -47,6 +47,8 @@ fail_mail() {
To: ${recipient}
From: <${service}@${hostname}>
Subject: ${service} failure
+Content-Transfer-Encoding: 8bit
+Content-Type: text/plain; charset=UTF-8
So sorry, the automatic ${service} on ${hostname} failed.
^ permalink raw reply related [flat|nested] 46+ messages in thread* Re: [PATCH 04/21] xfs_scrub_fail: send content headers for xfs_scrub_all failures
2026-06-04 6:05 ` [PATCH 04/21] xfs_scrub_fail: send content headers for xfs_scrub_all failures Darrick J. Wong
@ 2026-06-04 12:23 ` Andrey Albershteyn
0 siblings, 0 replies; 46+ messages in thread
From: Andrey Albershteyn @ 2026-06-04 12:23 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: linux-xfs, hch
On 2026-06-03 23:05:23, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Codex noted that fail_mail() doesn't include the same content headers
> that fail_mail_mntpoint() does. As a result, utf8 encoded content for
> xfs_scrub_all failures might not be interpreted correctly by mailers,
> whereas they will for xfs_scrub@ failures. Fix this by emitting C-T-E
> and C-T headers so that mailers will see this as 8-bit encoded utf8.
>
> One could argue that the choice of utf8 is actually system-dependent and
> ought to added by the MTA, etc. but I suspect that 99.9% of Linux
> systems out there are using utf8 and not some other legacy encoding.
>
> Cc: <linux-xfs@vger.kernel.org> # v6.10.0
> Fixes: e040916f649f71 ("xfs_scrub_all: failure reporting for the xfs_scrub_all job")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Looks good to me
Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org>
--
- Andrey
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH 05/21] xfs_scrub: fix uninitialized variable
2026-06-04 6:04 [PATCHSET] xfs_scrub: codex-inspired bug fixes for 7.1 Darrick J. Wong
` (3 preceding siblings ...)
2026-06-04 6:05 ` [PATCH 04/21] xfs_scrub_fail: send content headers for xfs_scrub_all failures Darrick J. Wong
@ 2026-06-04 6:05 ` Darrick J. Wong
2026-06-04 12:24 ` Andrey Albershteyn
2026-06-04 6:05 ` [PATCH 06/21] xfs_scrub: fix integer overflows Darrick J. Wong
` (15 subsequent siblings)
20 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2026-06-04 6:05 UTC (permalink / raw)
To: aalbersh, djwong; +Cc: linux-xfs, linux-xfs, hch
From: Darrick J. Wong <djwong@kernel.org>
If the action list is empty, we'll return random stack garbage because
we never set ret. Found by Codex.
Cc: <linux-xfs@vger.kernel.org> # v6.10.0
Fixes: a18f915ce76bc8 ("xfs_scrub: use repair_item to direct repair activities")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
scrub/repair.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/scrub/repair.c b/scrub/repair.c
index b2c4232fe8cea1..c3a8b3179600a6 100644
--- a/scrub/repair.c
+++ b/scrub/repair.c
@@ -719,7 +719,7 @@ action_list_process(
{
struct action_item *aitem;
struct action_item *n;
- int ret;
+ int ret = 0;
list_for_each_entry_safe(aitem, n, &alist->list, list) {
if (scrub_excessive_errors(ctx))
^ permalink raw reply related [flat|nested] 46+ messages in thread* Re: [PATCH 05/21] xfs_scrub: fix uninitialized variable
2026-06-04 6:05 ` [PATCH 05/21] xfs_scrub: fix uninitialized variable Darrick J. Wong
@ 2026-06-04 12:24 ` Andrey Albershteyn
0 siblings, 0 replies; 46+ messages in thread
From: Andrey Albershteyn @ 2026-06-04 12:24 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: linux-xfs, hch
On 2026-06-03 23:05:38, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> If the action list is empty, we'll return random stack garbage because
> we never set ret. Found by Codex.
>
> Cc: <linux-xfs@vger.kernel.org> # v6.10.0
> Fixes: a18f915ce76bc8 ("xfs_scrub: use repair_item to direct repair activities")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Looks good to me
Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org>
--
- Andrey
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH 06/21] xfs_scrub: fix integer overflows
2026-06-04 6:04 [PATCHSET] xfs_scrub: codex-inspired bug fixes for 7.1 Darrick J. Wong
` (4 preceding siblings ...)
2026-06-04 6:05 ` [PATCH 05/21] xfs_scrub: fix uninitialized variable Darrick J. Wong
@ 2026-06-04 6:05 ` Darrick J. Wong
2026-06-04 12:25 ` Andrey Albershteyn
2026-06-04 6:06 ` [PATCH 07/21] xfs_scrub: don't count internal log space in the data device used count Darrick J. Wong
` (14 subsequent siblings)
20 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2026-06-04 6:05 UTC (permalink / raw)
To: aalbersh, djwong; +Cc: linux-xfs, linux-xfs, hch
From: Darrick J. Wong <djwong@kernel.org>
agblocks is a 32-bit variable, which means that on a filesystem with a
large AG block count (e.g. 2^31 blocks on a 1k fsblock fs) the
multiplication with 95 could cause an integer overflow. Let's just do
64-bit arithmetic here. Codex found this.
Cc: <linux-xfs@vger.kernel.org> # v6.10.0
Fixes: 34bed605490f93 ("xfs_scrub: tune fstrim minlen parameter based on free space histograms")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
scrub/phase8.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/scrub/phase8.c b/scrub/phase8.c
index 0967832da890e3..5bfe0ffb4868a2 100644
--- a/scrub/phase8.c
+++ b/scrub/phase8.c
@@ -149,7 +149,7 @@ fstrim_compute_minlen(
* We can't calculate or query that value directly, so we guesstimate
* that it's 95% of the AG size.
*/
- ag_max_usable = ctx->mnt.fsgeom.agblocks * 95 / 100;
+ ag_max_usable = (uint64_t)ctx->mnt.fsgeom.agblocks * 95 / 100;
if (debug > 1) {
struct histogram_strings hstr = {
^ permalink raw reply related [flat|nested] 46+ messages in thread* Re: [PATCH 06/21] xfs_scrub: fix integer overflows
2026-06-04 6:05 ` [PATCH 06/21] xfs_scrub: fix integer overflows Darrick J. Wong
@ 2026-06-04 12:25 ` Andrey Albershteyn
0 siblings, 0 replies; 46+ messages in thread
From: Andrey Albershteyn @ 2026-06-04 12:25 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: linux-xfs, hch
On 2026-06-03 23:05:54, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> agblocks is a 32-bit variable, which means that on a filesystem with a
> large AG block count (e.g. 2^31 blocks on a 1k fsblock fs) the
> multiplication with 95 could cause an integer overflow. Let's just do
> 64-bit arithmetic here. Codex found this.
>
> Cc: <linux-xfs@vger.kernel.org> # v6.10.0
> Fixes: 34bed605490f93 ("xfs_scrub: tune fstrim minlen parameter based on free space histograms")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Looks good to me
Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org>
--
- Andrey
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH 07/21] xfs_scrub: don't count internal log space in the data device used count
2026-06-04 6:04 [PATCHSET] xfs_scrub: codex-inspired bug fixes for 7.1 Darrick J. Wong
` (5 preceding siblings ...)
2026-06-04 6:05 ` [PATCH 06/21] xfs_scrub: fix integer overflows Darrick J. Wong
@ 2026-06-04 6:06 ` Darrick J. Wong
2026-06-11 8:25 ` Andrey Albershteyn
2026-06-04 6:06 ` [PATCH 08/21] xfs_scrub: widen scrub and repair dependency mask Darrick J. Wong
` (13 subsequent siblings)
20 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2026-06-04 6:06 UTC (permalink / raw)
To: aalbersh, djwong; +Cc: linux-xfs, linux-xfs, hch
From: Darrick J. Wong <djwong@kernel.org>
Internal logs are not considered free space, so for the data device
they've already been subtracted from xfs_fsop_counts::freedata by
XFS_IOC_FSCOUNTS. This causes overcounting in the phase 7 reporting.
Cc: <linux-xfs@vger.kernel.org> # v7.0.0
Fixes: 95a92c5cecb2e1 ("xfs_scrub: perform media scanning of the log region")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
scrub/phase7.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/scrub/phase7.c b/scrub/phase7.c
index d52c142cc36114..e16ca28aa28371 100644
--- a/scrub/phase7.c
+++ b/scrub/phase7.c
@@ -244,10 +244,6 @@ phase7_func(
stat_data = totalcount.dbytes;
stat_rt = totalcount.rbytes;
- /* only count internal logs for data device summary */
- if (!ctx->fsinfo.fs_log)
- used_data += cvt_off_fsb_to_b(&ctx->mnt, l_blocks);
-
/*
* Complain if the counts are off by more than 10% unless
* the inaccuracy is less than 32MB worth of blocks or 100 inodes.
^ permalink raw reply related [flat|nested] 46+ messages in thread* Re: [PATCH 07/21] xfs_scrub: don't count internal log space in the data device used count
2026-06-04 6:06 ` [PATCH 07/21] xfs_scrub: don't count internal log space in the data device used count Darrick J. Wong
@ 2026-06-11 8:25 ` Andrey Albershteyn
0 siblings, 0 replies; 46+ messages in thread
From: Andrey Albershteyn @ 2026-06-11 8:25 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: linux-xfs, hch
On 2026-06-03 23:06:09, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Internal logs are not considered free space, so for the data device
> they've already been subtracted from xfs_fsop_counts::freedata by
> XFS_IOC_FSCOUNTS. This causes overcounting in the phase 7 reporting.
>
> Cc: <linux-xfs@vger.kernel.org> # v7.0.0
> Fixes: 95a92c5cecb2e1 ("xfs_scrub: perform media scanning of the log region")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Looks good to me
Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org>
--
- Andrey
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH 08/21] xfs_scrub: widen scrub and repair dependency mask
2026-06-04 6:04 [PATCHSET] xfs_scrub: codex-inspired bug fixes for 7.1 Darrick J. Wong
` (6 preceding siblings ...)
2026-06-04 6:06 ` [PATCH 07/21] xfs_scrub: don't count internal log space in the data device used count Darrick J. Wong
@ 2026-06-04 6:06 ` Darrick J. Wong
2026-06-04 12:37 ` Andrey Albershteyn
2026-06-04 6:06 ` [PATCH 09/21] xfs_scrub: fix work estimation for rtgroups filesystems Darrick J. Wong
` (12 subsequent siblings)
20 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2026-06-04 6:06 UTC (permalink / raw)
To: aalbersh, djwong; +Cc: linux-xfs, linux-xfs, hch
From: Darrick J. Wong <djwong@kernel.org>
Codex observes:
"Since `XFS_SCRUB_TYPE_NR` is now 33 (`XFS_SCRUB_TYPE_RTREFCBT` is 32),
any scheduled scrub item drives this loop to `j == 32`, where `1U << j`
shifts a 32-bit unsigned value by its width. That is undefined behavior
(and will trip UBSan or allow miscompilation of dependency scheduling);
the dependency mask and shift need to use a type wide enough for all
scrub types, e.g. `1ULL`/`uint64_t`."
The presence of a shift overflow is correct, since there are now 33
scrub type codes. However, the local loop variable isn't the only
problem here -- scrub_deps and repair_deps are masks, and they aren't
wide enough. As a result, there's a lurking logic bomb because RTREFCBT
will never get selected.
Fortunately this is benign because no scrub/repair type depends on
RTREFCBT, which is why we haven't seen any complaints. But let's
eliminate an avenue for us to forget to fix the problem. Widen the
masks and the local variable to fix the problem.
Cc: <linux-xfs@vger.kernel.org> # v6.14.0
Fixes: b71fbe4c3da263 ("xfs: scrub the realtime refcount btree")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
scrub/scrub_private.h | 2 +-
scrub/repair.c | 8 ++++----
scrub/scrub.c | 8 ++++----
3 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/scrub/scrub_private.h b/scrub/scrub_private.h
index c5e0a7c08be056..629ef4405225eb 100644
--- a/scrub/scrub_private.h
+++ b/scrub/scrub_private.h
@@ -118,6 +118,6 @@ scrub_item_schedule_retry(struct scrub_item *sri, unsigned int scrub_type)
bool scrub_item_call_kernel_again(struct scrub_item *sri, uint8_t work_mask,
const struct scrub_item *old);
bool scrub_item_schedule_work(struct scrub_item *sri, uint8_t state_flags,
- const unsigned int *schedule_deps);
+ const uint64_t *schedule_deps);
#endif /* XFS_SCRUB_SCRUB_PRIVATE_H_ */
diff --git a/scrub/repair.c b/scrub/repair.c
index c3a8b3179600a6..5d821655877b80 100644
--- a/scrub/repair.c
+++ b/scrub/repair.c
@@ -28,8 +28,8 @@
* the filesystem if the btree root pointers in the AG headers are wrong.
* Dependencies cannot cross scrub groups.
*/
-#define DEP(x) (1U << (x))
-static const unsigned int repair_deps[XFS_SCRUB_TYPE_NR] = {
+#define DEP(x) (1ULL << (x))
+static const uint64_t repair_deps[XFS_SCRUB_TYPE_NR] = {
[XFS_SCRUB_TYPE_BMBTD] = DEP(XFS_SCRUB_TYPE_INODE),
[XFS_SCRUB_TYPE_BMBTA] = DEP(XFS_SCRUB_TYPE_INODE),
[XFS_SCRUB_TYPE_BMBTC] = DEP(XFS_SCRUB_TYPE_INODE),
@@ -250,7 +250,7 @@ repair_item_dependencies_ok(
const struct scrub_item *sri,
unsigned int scrub_type)
{
- unsigned int dep_mask = repair_deps[scrub_type];
+ uint64_t dep_mask = repair_deps[scrub_type];
unsigned int b;
for (b = 0; dep_mask && b < XFS_SCRUB_TYPE_NR; b++, dep_mask >>= 1) {
@@ -434,7 +434,7 @@ repair_item_boost_priorities(
unsigned int scrub_type;
foreach_scrub_type(scrub_type) {
- unsigned int dep_mask = repair_deps[scrub_type];
+ uint64_t dep_mask = repair_deps[scrub_type];
unsigned int b;
if (repair_item_count_needsrepair(sri) == 0 || !dep_mask)
diff --git a/scrub/scrub.c b/scrub/scrub.c
index de687af687d32d..12ed87f07d9f6c 100644
--- a/scrub/scrub.c
+++ b/scrub/scrub.c
@@ -28,8 +28,8 @@
* Bitmap showing the correctness dependencies between scrub types for scrubs.
* Dependencies cannot cross scrub groups.
*/
-#define DEP(x) (1U << (x))
-static const unsigned int scrub_deps[XFS_SCRUB_TYPE_NR] = {
+#define DEP(x) (1ULL << (x))
+static const uint64_t scrub_deps[XFS_SCRUB_TYPE_NR] = {
[XFS_SCRUB_TYPE_AGF] = DEP(XFS_SCRUB_TYPE_SB),
[XFS_SCRUB_TYPE_AGFL] = DEP(XFS_SCRUB_TYPE_SB) |
DEP(XFS_SCRUB_TYPE_AGF),
@@ -472,7 +472,7 @@ bool
scrub_item_schedule_work(
struct scrub_item *sri,
uint8_t state_flags,
- const unsigned int *schedule_deps)
+ const uint64_t *schedule_deps)
{
unsigned int scrub_type;
unsigned int nr = 0;
@@ -486,7 +486,7 @@ scrub_item_schedule_work(
continue;
foreach_scrub_type(j) {
- if (schedule_deps[scrub_type] & (1U << j))
+ if (schedule_deps[scrub_type] & (1ULL << j))
sri->sri_state[j] |= SCRUB_ITEM_BARRIER;
}
^ permalink raw reply related [flat|nested] 46+ messages in thread* Re: [PATCH 08/21] xfs_scrub: widen scrub and repair dependency mask
2026-06-04 6:06 ` [PATCH 08/21] xfs_scrub: widen scrub and repair dependency mask Darrick J. Wong
@ 2026-06-04 12:37 ` Andrey Albershteyn
0 siblings, 0 replies; 46+ messages in thread
From: Andrey Albershteyn @ 2026-06-04 12:37 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: linux-xfs, hch
On 2026-06-03 23:06:25, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Codex observes:
>
> "Since `XFS_SCRUB_TYPE_NR` is now 33 (`XFS_SCRUB_TYPE_RTREFCBT` is 32),
> any scheduled scrub item drives this loop to `j == 32`, where `1U << j`
> shifts a 32-bit unsigned value by its width. That is undefined behavior
> (and will trip UBSan or allow miscompilation of dependency scheduling);
> the dependency mask and shift need to use a type wide enough for all
> scrub types, e.g. `1ULL`/`uint64_t`."
>
> The presence of a shift overflow is correct, since there are now 33
> scrub type codes. However, the local loop variable isn't the only
> problem here -- scrub_deps and repair_deps are masks, and they aren't
> wide enough. As a result, there's a lurking logic bomb because RTREFCBT
> will never get selected.
>
> Fortunately this is benign because no scrub/repair type depends on
> RTREFCBT, which is why we haven't seen any complaints. But let's
> eliminate an avenue for us to forget to fix the problem. Widen the
> masks and the local variable to fix the problem.
>
> Cc: <linux-xfs@vger.kernel.org> # v6.14.0
> Fixes: b71fbe4c3da263 ("xfs: scrub the realtime refcount btree")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Looks good to me
Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org>
--
- Andrey
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH 09/21] xfs_scrub: fix work estimation for rtgroups filesystems
2026-06-04 6:04 [PATCHSET] xfs_scrub: codex-inspired bug fixes for 7.1 Darrick J. Wong
` (7 preceding siblings ...)
2026-06-04 6:06 ` [PATCH 08/21] xfs_scrub: widen scrub and repair dependency mask Darrick J. Wong
@ 2026-06-04 6:06 ` Darrick J. Wong
2026-06-04 13:48 ` Andrey Albershteyn
2026-06-04 6:06 ` [PATCH 10/21] xfs_scrub: don't report media errors in specially-owned areas as file data Darrick J. Wong
` (11 subsequent siblings)
20 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2026-06-04 6:06 UTC (permalink / raw)
To: aalbersh, djwong; +Cc: linux-xfs, linux-xfs, hch
From: Darrick J. Wong <djwong@kernel.org>
Codex noticed that phase2_func queues up scrub work items for every
rtgroup in the system and tracks progress towards that number of work
items, but that it doesn't include the rtgroups in the estimate. This
leads to the user-papercut of the progressbar quickly advancing to 100%
and then wedging there for most of the rt metadata scan. Let's fix this
by amending the group estimator function.
Cc: <linux-xfs@vger.kernel.org> # v6.13.0
Fixes: 241d915d69d4ae ("xfs_scrub: scrub realtime allocation group metadata")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
scrub/xfs_scrub.h | 2 +-
scrub/phase2.c | 2 +-
scrub/scrub.c | 15 +++++++++++++--
3 files changed, 15 insertions(+), 4 deletions(-)
diff --git a/scrub/xfs_scrub.h b/scrub/xfs_scrub.h
index 851b4f37db48e4..ce98693a674d9d 100644
--- a/scrub/xfs_scrub.h
+++ b/scrub/xfs_scrub.h
@@ -131,7 +131,7 @@ int phase7_func(struct scrub_ctx *ctx);
int phase8_func(struct scrub_ctx *ctx);
/* Progress estimator functions */
-unsigned int scrub_estimate_ag_work(struct scrub_ctx *ctx);
+unsigned int scrub_estimate_group_work(struct scrub_ctx *ctx);
unsigned int scrub_estimate_iscan_work(struct scrub_ctx *ctx);
int phase2_estimate(struct scrub_ctx *ctx, uint64_t *items,
unsigned int *nr_threads, int *rshift);
diff --git a/scrub/phase2.c b/scrub/phase2.c
index c7828c332e7c3a..8de77429f05d60 100644
--- a/scrub/phase2.c
+++ b/scrub/phase2.c
@@ -416,7 +416,7 @@ phase2_estimate(
unsigned int *nr_threads,
int *rshift)
{
- *items = scrub_estimate_ag_work(ctx);
+ *items = scrub_estimate_group_work(ctx);
*nr_threads = scrub_nproc(ctx);
*rshift = 0;
return 0;
diff --git a/scrub/scrub.c b/scrub/scrub.c
index 12ed87f07d9f6c..c9b75eb041e460 100644
--- a/scrub/scrub.c
+++ b/scrub/scrub.c
@@ -534,9 +534,9 @@ scrub_item_check_file(
return 0;
}
-/* How many items do we have to check? */
+/* How many items do we have to check for per-AG and per-rtgroup work? */
unsigned int
-scrub_estimate_ag_work(
+scrub_estimate_group_work(
struct scrub_ctx *ctx)
{
const struct xfrog_scrub_descr *sc;
@@ -553,10 +553,21 @@ scrub_estimate_ag_work(
case XFROG_SCRUB_GROUP_FS:
estimate++;
break;
+ case XFROG_SCRUB_GROUP_RTGROUP:
+ /*
+ * rtbitmap and rtsummary exist on non-rt/non-rtgroup
+ * filesystems, but we schedule the other rtgroup
+ * metadata for scanning (even if it won't do
+ * anything), so we must include those in the
+ * estimation as well.
+ */
+ estimate += max(1, ctx->mnt.fsgeom.rgcount);
+ break;
default:
break;
}
}
+
return estimate;
}
^ permalink raw reply related [flat|nested] 46+ messages in thread* Re: [PATCH 09/21] xfs_scrub: fix work estimation for rtgroups filesystems
2026-06-04 6:06 ` [PATCH 09/21] xfs_scrub: fix work estimation for rtgroups filesystems Darrick J. Wong
@ 2026-06-04 13:48 ` Andrey Albershteyn
0 siblings, 0 replies; 46+ messages in thread
From: Andrey Albershteyn @ 2026-06-04 13:48 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: linux-xfs, hch
On 2026-06-03 23:06:40, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Codex noticed that phase2_func queues up scrub work items for every
> rtgroup in the system and tracks progress towards that number of work
> items, but that it doesn't include the rtgroups in the estimate. This
> leads to the user-papercut of the progressbar quickly advancing to 100%
> and then wedging there for most of the rt metadata scan. Let's fix this
> by amending the group estimator function.
>
> Cc: <linux-xfs@vger.kernel.org> # v6.13.0
> Fixes: 241d915d69d4ae ("xfs_scrub: scrub realtime allocation group metadata")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Looks good to me
Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org>
--
- Andrey
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH 10/21] xfs_scrub: don't report media errors in specially-owned areas as file data
2026-06-04 6:04 [PATCHSET] xfs_scrub: codex-inspired bug fixes for 7.1 Darrick J. Wong
` (8 preceding siblings ...)
2026-06-04 6:06 ` [PATCH 09/21] xfs_scrub: fix work estimation for rtgroups filesystems Darrick J. Wong
@ 2026-06-04 6:06 ` Darrick J. Wong
2026-06-11 9:14 ` Andrey Albershteyn
2026-06-04 6:07 ` [PATCH 11/21] xfs_scrub: return SCRUB_RET_OPERROR if unicode collision detection fails to initialize Darrick J. Wong
` (10 subsequent siblings)
20 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2026-06-04 6:06 UTC (permalink / raw)
To: aalbersh, djwong; +Cc: linux-xfs, hch
From: Darrick J. Wong <djwong@kernel.org>
If a media error occurs for some disk space that has a "special" owner,
we mustn't try to walk parent pointers because special owners are not
files. If it's an extent map, we shouldn't accidentally fall into
reporting that as file data loss. This was, like the rest, found by
Codex.
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
scrub/phase6.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/scrub/phase6.c b/scrub/phase6.c
index c273420d42ca14..cd9bb26bf88628 100644
--- a/scrub/phase6.c
+++ b/scrub/phase6.c
@@ -436,6 +436,8 @@ report_ioerr_fsmap(
} else {
str_corrupt(ctx, buf, _("media error in %s."), type);
}
+
+ return 0;
}
if (can_use_pptrs(ctx)) {
@@ -454,6 +456,8 @@ report_ioerr_fsmap(
attr ? _("extended attribute") :
_("file data"));
str_corrupt(ctx, buf, _("media error in extent map"));
+
+ return 0;
}
/*
^ permalink raw reply related [flat|nested] 46+ messages in thread* Re: [PATCH 10/21] xfs_scrub: don't report media errors in specially-owned areas as file data
2026-06-04 6:06 ` [PATCH 10/21] xfs_scrub: don't report media errors in specially-owned areas as file data Darrick J. Wong
@ 2026-06-11 9:14 ` Andrey Albershteyn
2026-06-11 14:05 ` Darrick J. Wong
0 siblings, 1 reply; 46+ messages in thread
From: Andrey Albershteyn @ 2026-06-11 9:14 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: linux-xfs, hch
On 2026-06-03 23:06:56, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> If a media error occurs for some disk space that has a "special" owner,
> we mustn't try to walk parent pointers because special owners are not
> files. If it's an extent map, we shouldn't accidentally fall into
> reporting that as file data loss. This was, like the rest, found by
> Codex.
>
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Looks good to me
Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org>
--
- Andrey
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [PATCH 10/21] xfs_scrub: don't report media errors in specially-owned areas as file data
2026-06-11 9:14 ` Andrey Albershteyn
@ 2026-06-11 14:05 ` Darrick J. Wong
0 siblings, 0 replies; 46+ messages in thread
From: Darrick J. Wong @ 2026-06-11 14:05 UTC (permalink / raw)
To: Andrey Albershteyn; +Cc: linux-xfs, hch
On Thu, Jun 11, 2026 at 11:14:39AM +0200, Andrey Albershteyn wrote:
> On 2026-06-03 23:06:56, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> >
> > If a media error occurs for some disk space that has a "special" owner,
> > we mustn't try to walk parent pointers because special owners are not
> > files. If it's an extent map, we shouldn't accidentally fall into
> > reporting that as file data loss. This was, like the rest, found by
> > Codex.
> >
> > Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
>
> Looks good to me
> Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org>
Hey, thanks for reviewing this! I have a few more fixes on the way...
--D
> --
> - Andrey
>
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH 11/21] xfs_scrub: return SCRUB_RET_OPERROR if unicode collision detection fails to initialize
2026-06-04 6:04 [PATCHSET] xfs_scrub: codex-inspired bug fixes for 7.1 Darrick J. Wong
` (9 preceding siblings ...)
2026-06-04 6:06 ` [PATCH 10/21] xfs_scrub: don't report media errors in specially-owned areas as file data Darrick J. Wong
@ 2026-06-04 6:07 ` Darrick J. Wong
2026-06-04 13:55 ` Andrey Albershteyn
2026-06-04 6:07 ` [PATCH 12/21] xfs_scrub: fix nonsense advice after a scrub finds errors Darrick J. Wong
` (9 subsequent siblings)
20 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2026-06-04 6:07 UTC (permalink / raw)
To: aalbersh, djwong; +Cc: linux-xfs, linux-xfs, hch
From: Darrick J. Wong <djwong@kernel.org>
If we can't initialize libicu, we should report that as an operational
failure via exitcode instead of "success". Codex found this one.
Cc: <linux-xfs@vger.kernel.org> # v5.11.0
Fixes: ff8717e52b7364 ("xfs_scrub: load and unload libicu properly")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
scrub/xfs_scrub.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c
index aae24ec83c1c75..80b56f8baef611 100644
--- a/scrub/xfs_scrub.c
+++ b/scrub/xfs_scrub.c
@@ -749,6 +749,7 @@ main(
fprintf(stderr,
_("%s: couldn't initialize Unicode library.\n"),
progname);
+ ret = SCRUB_RET_OPERROR;
goto out_unicrash;
}
^ permalink raw reply related [flat|nested] 46+ messages in thread* [PATCH 12/21] xfs_scrub: fix nonsense advice after a scrub finds errors
2026-06-04 6:04 [PATCHSET] xfs_scrub: codex-inspired bug fixes for 7.1 Darrick J. Wong
` (10 preceding siblings ...)
2026-06-04 6:07 ` [PATCH 11/21] xfs_scrub: return SCRUB_RET_OPERROR if unicode collision detection fails to initialize Darrick J. Wong
@ 2026-06-04 6:07 ` Darrick J. Wong
2026-06-04 13:59 ` Andrey Albershteyn
2026-06-04 6:07 ` [PATCH 13/21] xfs_scrub: don't allow NAN as fstrim percentage Darrick J. Wong
` (8 subsequent siblings)
20 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2026-06-04 6:07 UTC (permalink / raw)
To: aalbersh, djwong; +Cc: linux-xfs, linux-xfs, hch
From: Darrick J. Wong <djwong@kernel.org>
If a user ran xfs_scrub -p (aka optimization-only mode) and it found
errors, we advise the user to run without -n. That's silly, because
they didn't run with -n, they ran with -p. Fix the messaging here.
Cc: <linux-xfs@vger.kernel.org> # v6.10.0
Fixes: 84e248f3e83f36 ("xfs_scrub: add an optimization-only mode")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
scrub/xfs_scrub.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c
index 80b56f8baef611..383aa8752b2ce7 100644
--- a/scrub/xfs_scrub.c
+++ b/scrub/xfs_scrub.c
@@ -610,12 +610,20 @@ report_outcome(
* are not corruptions.
*/
if (ctx->scrub_setup_succeeded && actionable_errors > 0) {
- char *msg;
+ char *msg = NULL;
- if (ctx->mode != SCRUB_MODE_REPAIR)
+ switch (ctx->mode) {
+ case SCRUB_MODE_DRY_RUN:
msg = _("%s: Re-run xfs_scrub without -n.\n");
- else
+ break;
+ case SCRUB_MODE_PREEN:
+ msg = _("%s: Re-run xfs_scrub without -p.\n");
+ break;
+ case SCRUB_MODE_NONE:
+ case SCRUB_MODE_REPAIR:
msg = _("%s: Unmount and run xfs_repair.\n");
+ break;
+ }
fprintf(stderr, msg, ctx->mntpoint);
}
^ permalink raw reply related [flat|nested] 46+ messages in thread* Re: [PATCH 12/21] xfs_scrub: fix nonsense advice after a scrub finds errors
2026-06-04 6:07 ` [PATCH 12/21] xfs_scrub: fix nonsense advice after a scrub finds errors Darrick J. Wong
@ 2026-06-04 13:59 ` Andrey Albershteyn
0 siblings, 0 replies; 46+ messages in thread
From: Andrey Albershteyn @ 2026-06-04 13:59 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: linux-xfs, hch
On 2026-06-03 23:07:27, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> If a user ran xfs_scrub -p (aka optimization-only mode) and it found
> errors, we advise the user to run without -n. That's silly, because
> they didn't run with -n, they ran with -p. Fix the messaging here.
>
> Cc: <linux-xfs@vger.kernel.org> # v6.10.0
> Fixes: 84e248f3e83f36 ("xfs_scrub: add an optimization-only mode")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Looks good to me
Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org>
--
- Andrey
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH 13/21] xfs_scrub: don't allow NAN as fstrim percentage
2026-06-04 6:04 [PATCHSET] xfs_scrub: codex-inspired bug fixes for 7.1 Darrick J. Wong
` (11 preceding siblings ...)
2026-06-04 6:07 ` [PATCH 12/21] xfs_scrub: fix nonsense advice after a scrub finds errors Darrick J. Wong
@ 2026-06-04 6:07 ` Darrick J. Wong
2026-06-04 14:03 ` Andrey Albershteyn
2026-06-04 6:07 ` [PATCH 14/21] xfs_scrub: reset bulkstat pointer on retry Darrick J. Wong
` (7 subsequent siblings)
20 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2026-06-04 6:07 UTC (permalink / raw)
To: aalbersh, djwong; +Cc: linux-xfs, linux-xfs, hch
From: Darrick J. Wong <djwong@kernel.org>
Don't let users pass us -o fstrim_pct=NaN, because strtod will happily
return what looks like success but isn't actually a number. Codex found
this fun little nit.
Cc: <linux-xfs@vger.kernel.org> # v6.10.0
Fixes: 34bed605490f93 ("xfs_scrub: tune fstrim minlen parameter based on free space histograms")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
scrub/xfs_scrub.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c
index 383aa8752b2ce7..9e73dac92e0502 100644
--- a/scrub/xfs_scrub.c
+++ b/scrub/xfs_scrub.c
@@ -7,6 +7,7 @@
#include <pthread.h>
#include <stdlib.h>
#include <paths.h>
+#include <math.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/statvfs.h>
@@ -683,7 +684,7 @@ parse_o_opts(
errno = 0;
dval = strtod(val, &endp);
- if (*endp) {
+ if (*endp || isnan(dval)) {
fprintf(stderr,
_("-o fstrim_pct must be a floating point number\n"));
usage();
^ permalink raw reply related [flat|nested] 46+ messages in thread* Re: [PATCH 13/21] xfs_scrub: don't allow NAN as fstrim percentage
2026-06-04 6:07 ` [PATCH 13/21] xfs_scrub: don't allow NAN as fstrim percentage Darrick J. Wong
@ 2026-06-04 14:03 ` Andrey Albershteyn
0 siblings, 0 replies; 46+ messages in thread
From: Andrey Albershteyn @ 2026-06-04 14:03 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: linux-xfs, hch
On 2026-06-03 23:07:43, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Don't let users pass us -o fstrim_pct=NaN, because strtod will happily
> return what looks like success but isn't actually a number. Codex found
> this fun little nit.
>
> Cc: <linux-xfs@vger.kernel.org> # v6.10.0
> Fixes: 34bed605490f93 ("xfs_scrub: tune fstrim minlen parameter based on free space histograms")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Looks good to me
Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org>
--
- Andrey
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH 14/21] xfs_scrub: reset bulkstat pointer on retry
2026-06-04 6:04 [PATCHSET] xfs_scrub: codex-inspired bug fixes for 7.1 Darrick J. Wong
` (12 preceding siblings ...)
2026-06-04 6:07 ` [PATCH 13/21] xfs_scrub: don't allow NAN as fstrim percentage Darrick J. Wong
@ 2026-06-04 6:07 ` Darrick J. Wong
2026-06-04 14:09 ` Andrey Albershteyn
2026-06-04 6:08 ` [PATCH 15/21] xfs_scrub: don't return garbage value from bulkstat_the_rest Darrick J. Wong
` (6 subsequent siblings)
20 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2026-06-04 6:07 UTC (permalink / raw)
To: aalbersh, djwong; +Cc: linux-xfs, linux-xfs, hch
From: Darrick J. Wong <djwong@kernel.org>
Codex points out that we don't reset the struct bulkstat array pointer
if we decide to re-issue the inumbers and bulkstat calls. Fix that, and
fix the @error variable being returned uninitialized if hdr.ocount is
zero.
Cc: <linux-xfs@vger.kernel.org> # v5.18.0
Fixes: 9f4d6358e28ba1 ("xfs_scrub: don't revisit scanned inodes when reprocessing a stale inode")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
scrub/inodes.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/scrub/inodes.c b/scrub/inodes.c
index 4ed7cd99635bda..bfd4abc44b56de 100644
--- a/scrub/inodes.c
+++ b/scrub/inodes.c
@@ -471,11 +471,11 @@ scan_ag_bulkstat(
struct xfs_inumbers_req *ireq = ichunk_to_inumbers(ichunk);
struct xfs_bulkstat_req *breq = ichunk_to_bulkstat(ichunk);
struct scan_inodes *si = ichunk->si;
- struct xfs_bulkstat *bs = &breq->bulkstat[0];
+ struct xfs_bulkstat *bs;
struct xfs_inumbers *inumbers = &ireq->inumbers[0];
uint64_t last_ino = 0;
int i;
- int error;
+ int error = 0;
int stale_count = 0;
DEFINE_DESCR(dsc_bulkstat, ctx, render_ino_from_bulkstat);
DEFINE_DESCR(dsc_inumbers, ctx, render_inumbers_from_agno);
@@ -486,7 +486,9 @@ scan_ag_bulkstat(
bulkstat_for_inumbers(ctx, inumbers, breq);
/* Iterate all the inodes. */
- for (i = 0; !si->aborted && i < breq->hdr.ocount; i++, bs++) {
+ for (i = 0, bs = &breq->bulkstat[0];
+ i < breq->hdr.ocount && !si->aborted;
+ i++, bs++) {
uint64_t scan_ino = bs->bs_ino;
/* ensure forward progress if we retried */
^ permalink raw reply related [flat|nested] 46+ messages in thread* Re: [PATCH 14/21] xfs_scrub: reset bulkstat pointer on retry
2026-06-04 6:07 ` [PATCH 14/21] xfs_scrub: reset bulkstat pointer on retry Darrick J. Wong
@ 2026-06-04 14:09 ` Andrey Albershteyn
0 siblings, 0 replies; 46+ messages in thread
From: Andrey Albershteyn @ 2026-06-04 14:09 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: linux-xfs, hch
On 2026-06-03 23:07:58, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Codex points out that we don't reset the struct bulkstat array pointer
> if we decide to re-issue the inumbers and bulkstat calls. Fix that, and
> fix the @error variable being returned uninitialized if hdr.ocount is
> zero.
>
> Cc: <linux-xfs@vger.kernel.org> # v5.18.0
> Fixes: 9f4d6358e28ba1 ("xfs_scrub: don't revisit scanned inodes when reprocessing a stale inode")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Looks good to me
Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org>
--
- Andrey
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH 15/21] xfs_scrub: don't return garbage value from bulkstat_the_rest
2026-06-04 6:04 [PATCHSET] xfs_scrub: codex-inspired bug fixes for 7.1 Darrick J. Wong
` (13 preceding siblings ...)
2026-06-04 6:07 ` [PATCH 14/21] xfs_scrub: reset bulkstat pointer on retry Darrick J. Wong
@ 2026-06-04 6:08 ` Darrick J. Wong
2026-06-04 14:13 ` Andrey Albershteyn
2026-06-04 6:08 ` [PATCH 16/21] xfs_scrub: don't continue with phase1 if autofsck=none Darrick J. Wong
` (5 subsequent siblings)
20 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2026-06-04 6:08 UTC (permalink / raw)
To: aalbersh, djwong; +Cc: linux-xfs, linux-xfs, hch
From: Darrick J. Wong <djwong@kernel.org>
bulkstat_the_rest returns an inode mask, but Codex noticed that we
carelessly return an error number if we can't allocate a bulkstat
request. Fix this by returning 0, which forces the caller to singlestep
the rest of the bulkstat array.
Cc: <linux-xfs@vger.kernel.org> # v6.14.0
Fixes: 7ae92e1cb0aeeb ("xfs_scrub: try harder to fill the bulkstat array with bulkstat()")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
scrub/inodes.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/scrub/inodes.c b/scrub/inodes.c
index bfd4abc44b56de..ab5cf393327f1a 100644
--- a/scrub/inodes.c
+++ b/scrub/inodes.c
@@ -152,8 +152,13 @@ bulkstat_the_rest(
error = -xfrog_bulkstat_alloc_req(
orig_breq->hdr.icount - orig_breq->hdr.ocount,
start_ino, &new_breq);
- if (error)
- return error;
+ if (error) {
+ /*
+ * Couldn't allocate memory for bulkstat, so we return 0 and
+ * let the caller single-step.
+ */
+ return 0;
+ }
new_breq->hdr.flags = orig_breq->hdr.flags;
do {
^ permalink raw reply related [flat|nested] 46+ messages in thread* Re: [PATCH 15/21] xfs_scrub: don't return garbage value from bulkstat_the_rest
2026-06-04 6:08 ` [PATCH 15/21] xfs_scrub: don't return garbage value from bulkstat_the_rest Darrick J. Wong
@ 2026-06-04 14:13 ` Andrey Albershteyn
0 siblings, 0 replies; 46+ messages in thread
From: Andrey Albershteyn @ 2026-06-04 14:13 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: linux-xfs, hch
On 2026-06-03 23:08:14, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> bulkstat_the_rest returns an inode mask, but Codex noticed that we
> carelessly return an error number if we can't allocate a bulkstat
> request. Fix this by returning 0, which forces the caller to singlestep
> the rest of the bulkstat array.
>
> Cc: <linux-xfs@vger.kernel.org> # v6.14.0
> Fixes: 7ae92e1cb0aeeb ("xfs_scrub: try harder to fill the bulkstat array with bulkstat()")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Looks good to me
Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org>
--
- Andrey
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH 16/21] xfs_scrub: don't continue with phase1 if autofsck=none
2026-06-04 6:04 [PATCHSET] xfs_scrub: codex-inspired bug fixes for 7.1 Darrick J. Wong
` (14 preceding siblings ...)
2026-06-04 6:08 ` [PATCH 15/21] xfs_scrub: don't return garbage value from bulkstat_the_rest Darrick J. Wong
@ 2026-06-04 6:08 ` Darrick J. Wong
2026-06-04 14:19 ` Andrey Albershteyn
2026-06-04 6:08 ` [PATCH 17/21] xfs_scrub: don't crash trying to complain about clean health Darrick J. Wong
` (4 subsequent siblings)
20 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2026-06-04 6:08 UTC (permalink / raw)
To: aalbersh, djwong; +Cc: linux-xfs, linux-xfs, hch
From: Darrick J. Wong <djwong@kernel.org>
If xfs_scrub was started with -o autofsck and the filesystem's autofsck
property says not to run xfs_scrub, we should exit phase 1 early so that
the rest of the validation checks don't need to run. Codex points out
that the current code can fail pointlessly in this manner.
Cc: <linux-xfs@vger.kernel.org> # v6.10.0
Fixes: 9451b5ee0d0d2d ("xfs_scrub: allow sysadmin to control background scrubs")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
scrub/phase1.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/scrub/phase1.c b/scrub/phase1.c
index f4d50d016c2ebd..9e60a4d948f2db 100644
--- a/scrub/phase1.c
+++ b/scrub/phase1.c
@@ -342,6 +342,8 @@ _("Not an XFS filesystem."));
*/
if (ctx->mode == SCRUB_MODE_NONE)
mode_from_autofsck(ctx);
+ if (ctx->mode == SCRUB_MODE_NONE)
+ return 0;
/* Do we have kernel-assisted metadata scrubbing? */
if (!can_scrub_fs_metadata(ctx) || !can_scrub_inode(ctx) ||
^ permalink raw reply related [flat|nested] 46+ messages in thread* Re: [PATCH 16/21] xfs_scrub: don't continue with phase1 if autofsck=none
2026-06-04 6:08 ` [PATCH 16/21] xfs_scrub: don't continue with phase1 if autofsck=none Darrick J. Wong
@ 2026-06-04 14:19 ` Andrey Albershteyn
0 siblings, 0 replies; 46+ messages in thread
From: Andrey Albershteyn @ 2026-06-04 14:19 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: linux-xfs, hch
On 2026-06-03 23:08:29, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> If xfs_scrub was started with -o autofsck and the filesystem's autofsck
> property says not to run xfs_scrub, we should exit phase 1 early so that
> the rest of the validation checks don't need to run. Codex points out
> that the current code can fail pointlessly in this manner.
>
> Cc: <linux-xfs@vger.kernel.org> # v6.10.0
> Fixes: 9451b5ee0d0d2d ("xfs_scrub: allow sysadmin to control background scrubs")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Looks good to me
Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org>
--
- Andrey
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH 17/21] xfs_scrub: don't crash trying to complain about clean health
2026-06-04 6:04 [PATCHSET] xfs_scrub: codex-inspired bug fixes for 7.1 Darrick J. Wong
` (15 preceding siblings ...)
2026-06-04 6:08 ` [PATCH 16/21] xfs_scrub: don't continue with phase1 if autofsck=none Darrick J. Wong
@ 2026-06-04 6:08 ` Darrick J. Wong
2026-06-04 14:24 ` Andrey Albershteyn
2026-06-04 6:09 ` [PATCH 18/21] xfs_scrub: fix inverted return value from ptvar_get Darrick J. Wong
` (3 subsequent siblings)
20 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2026-06-04 6:08 UTC (permalink / raw)
To: aalbersh, djwong; +Cc: linux-xfs, linux-xfs, hch
From: Darrick J. Wong <djwong@kernel.org>
Codex points out that the str_info call in report_to_kernel can crash
because of the null argument. This whole callsite is fubar because
the third argument is supposed to be a format string, and it's supposed
to print the mountpoint too. Fix all this.
Cc: <linux-xfs@vger.kernel.org> # v6.9.0
Fixes: 2025aa7a29e60c ("xfs_scrub: upload clean bills of health")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
scrub/phase1.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/scrub/phase1.c b/scrub/phase1.c
index 9e60a4d948f2db..34a2b3aec030eb 100644
--- a/scrub/phase1.c
+++ b/scrub/phase1.c
@@ -74,7 +74,8 @@ report_to_kernel(
*/
if (repair_item_count_needsrepair(&sri) != 0 &&
!debug_tweak_on("XFS_SCRUB_FORCE_REPAIR")) {
- str_info(ctx, _("Couldn't upload clean bill of health."), NULL);
+ str_info(ctx, ctx->mntpoint,
+_("Couldn't upload clean bill of health."));
}
return 0;
^ permalink raw reply related [flat|nested] 46+ messages in thread* Re: [PATCH 17/21] xfs_scrub: don't crash trying to complain about clean health
2026-06-04 6:08 ` [PATCH 17/21] xfs_scrub: don't crash trying to complain about clean health Darrick J. Wong
@ 2026-06-04 14:24 ` Andrey Albershteyn
0 siblings, 0 replies; 46+ messages in thread
From: Andrey Albershteyn @ 2026-06-04 14:24 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: linux-xfs, hch
On 2026-06-03 23:08:45, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Codex points out that the str_info call in report_to_kernel can crash
> because of the null argument. This whole callsite is fubar because
> the third argument is supposed to be a format string, and it's supposed
> to print the mountpoint too. Fix all this.
>
> Cc: <linux-xfs@vger.kernel.org> # v6.9.0
> Fixes: 2025aa7a29e60c ("xfs_scrub: upload clean bills of health")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Looks good to me
Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org>
--
- Andrey
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH 18/21] xfs_scrub: fix inverted return value from ptvar_get
2026-06-04 6:04 [PATCHSET] xfs_scrub: codex-inspired bug fixes for 7.1 Darrick J. Wong
` (16 preceding siblings ...)
2026-06-04 6:08 ` [PATCH 17/21] xfs_scrub: don't crash trying to complain about clean health Darrick J. Wong
@ 2026-06-04 6:09 ` Darrick J. Wong
2026-06-04 14:28 ` Andrey Albershteyn
2026-06-04 6:09 ` [PATCH 19/21] xfs_scrub: don't obscure repair failures in repair_list_schedule Darrick J. Wong
` (2 subsequent siblings)
20 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2026-06-04 6:09 UTC (permalink / raw)
To: aalbersh, djwong; +Cc: linux-xfs, linux-xfs, hch
From: Darrick J. Wong <djwong@kernel.org>
Codex noticed that this callsite to ptvar_get got the sign of the error
code backwards, unlike all the other callers. Fix that.
Cc: <linux-xfs@vger.kernel.org> # v6.10.0
Fixes: 4f4d99615d2377 ("xfs_scrub: improve thread scheduling repair items during phase 4")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
scrub/phase3.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/scrub/phase3.c b/scrub/phase3.c
index 046a42c1da8beb..93900fb813e489 100644
--- a/scrub/phase3.c
+++ b/scrub/phase3.c
@@ -68,9 +68,9 @@ defer_inode_repair(
alist = ptvar_get(ictx->repair_ptlists, &ret);
if (ret) {
- str_liberror(ictx->ctx, ret,
+ str_liberror(ictx->ctx, -ret,
_("getting per-thread inode repair list"));
- return ret;
+ return -ret;
}
action_list_add(alist, aitem);
^ permalink raw reply related [flat|nested] 46+ messages in thread* Re: [PATCH 18/21] xfs_scrub: fix inverted return value from ptvar_get
2026-06-04 6:09 ` [PATCH 18/21] xfs_scrub: fix inverted return value from ptvar_get Darrick J. Wong
@ 2026-06-04 14:28 ` Andrey Albershteyn
0 siblings, 0 replies; 46+ messages in thread
From: Andrey Albershteyn @ 2026-06-04 14:28 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: linux-xfs, hch
On 2026-06-03 23:09:00, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Codex noticed that this callsite to ptvar_get got the sign of the error
> code backwards, unlike all the other callers. Fix that.
>
> Cc: <linux-xfs@vger.kernel.org> # v6.10.0
> Fixes: 4f4d99615d2377 ("xfs_scrub: improve thread scheduling repair items during phase 4")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Looks good to me
Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org>
--
- Andrey
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH 19/21] xfs_scrub: don't obscure repair failures in repair_list_schedule
2026-06-04 6:04 [PATCHSET] xfs_scrub: codex-inspired bug fixes for 7.1 Darrick J. Wong
` (17 preceding siblings ...)
2026-06-04 6:09 ` [PATCH 18/21] xfs_scrub: fix inverted return value from ptvar_get Darrick J. Wong
@ 2026-06-04 6:09 ` Darrick J. Wong
2026-06-04 15:27 ` Andrey Albershteyn
2026-06-04 6:09 ` [PATCH 20/21] xfs_scrub: bitmap iteration functions must retur Darrick J. Wong
2026-06-04 6:09 ` [PATCH 21/21] xfs_scrub: read verification isn't ok if it hit runtime errors Darrick J. Wong
20 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2026-06-04 6:09 UTC (permalink / raw)
To: aalbersh, djwong; +Cc: linux-xfs, linux-xfs, hch
From: Darrick J. Wong <djwong@kernel.org>
Codex points out that if we bail out of the loop in repair_everything
due to an operational error, the workqueue cleanup blows away the error
code and we return as if nothing went wrong. Let's fix that so that
operational errors cause the program to exit with failure.
Cc: <linux-xfs@vger.kernel.org> # v5.3.0
Fixes: 71296cf87ff589 ("libfrog: split workqueue destroy functions")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
scrub/phase4.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/scrub/phase4.c b/scrub/phase4.c
index 88cb53aeac9076..6bd16c23939eb4 100644
--- a/scrub/phase4.c
+++ b/scrub/phase4.c
@@ -181,7 +181,7 @@ repair_everything(
{
struct workqueue wq;
int fixed_anything;
- int ret;
+ int ret, ret2;
ret = -workqueue_create(&wq, (struct xfs_mount *)ctx,
scrub_nproc_workqueue(ctx));
@@ -211,9 +211,12 @@ repair_everything(
fixed_anything++;
} while (fixed_anything > 0);
- ret = -workqueue_terminate(&wq);
- if (ret)
- str_liberror(ctx, ret, _("finishing repair work"));
+ ret2 = -workqueue_terminate(&wq);
+ if (ret2) {
+ str_liberror(ctx, ret2, _("finishing repair work"));
+ if (ret >= 0)
+ ret = ret2;
+ }
workqueue_destroy(&wq);
if (ret < 0)
^ permalink raw reply related [flat|nested] 46+ messages in thread* Re: [PATCH 19/21] xfs_scrub: don't obscure repair failures in repair_list_schedule
2026-06-04 6:09 ` [PATCH 19/21] xfs_scrub: don't obscure repair failures in repair_list_schedule Darrick J. Wong
@ 2026-06-04 15:27 ` Andrey Albershteyn
0 siblings, 0 replies; 46+ messages in thread
From: Andrey Albershteyn @ 2026-06-04 15:27 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: linux-xfs, hch
On 2026-06-03 23:09:16, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Codex points out that if we bail out of the loop in repair_everything
> due to an operational error, the workqueue cleanup blows away the error
> code and we return as if nothing went wrong. Let's fix that so that
> operational errors cause the program to exit with failure.
>
> Cc: <linux-xfs@vger.kernel.org> # v5.3.0
> Fixes: 71296cf87ff589 ("libfrog: split workqueue destroy functions")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Looks good to me
Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org>
--
- Andrey
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH 20/21] xfs_scrub: bitmap iteration functions must retur
2026-06-04 6:04 [PATCHSET] xfs_scrub: codex-inspired bug fixes for 7.1 Darrick J. Wong
` (18 preceding siblings ...)
2026-06-04 6:09 ` [PATCH 19/21] xfs_scrub: don't obscure repair failures in repair_list_schedule Darrick J. Wong
@ 2026-06-04 6:09 ` Darrick J. Wong
2026-06-04 15:30 ` Andrey Albershteyn
2026-06-04 6:09 ` [PATCH 21/21] xfs_scrub: read verification isn't ok if it hit runtime errors Darrick J. Wong
20 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2026-06-04 6:09 UTC (permalink / raw)
To: aalbersh, djwong; +Cc: linux-xfs, linux-xfs, hch
From: Darrick J. Wong <djwong@kernel.org>
bitmap_iterate doesn't generate its own errors, so don't invert the sign
of the errors returned from retry_deferred_inode_range. This keeps them
all positive, which is something that Codex is good at noticing and I'm
not. :P
Cc: <linux-xfs@vger.kernel.org> # v6.10.0
Fixes: 5a30504f0c60e1 ("xfs_scrub: defer phase5 file scans if dirloop fails")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
scrub/phase5.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/scrub/phase5.c b/scrub/phase5.c
index 52bbbca4b9f06a..329d208d882ef1 100644
--- a/scrub/phase5.c
+++ b/scrub/phase5.c
@@ -527,7 +527,7 @@ retry_deferred_inodes(
ncs->new_deferred = NULL;
ncs->fixed_something = false;
- error = -bitmap_iterate(ncs->cur_deferred,
+ error = bitmap_iterate(ncs->cur_deferred,
retry_deferred_inode_range, ncs);
if (error)
return error;
@@ -546,7 +546,7 @@ retry_deferred_inodes(
ncs->new_deferred = NULL;
ncs->last_call = true;
- error = -bitmap_iterate(ncs->cur_deferred,
+ error = bitmap_iterate(ncs->cur_deferred,
retry_deferred_inode_range, ncs);
if (error)
return error;
^ permalink raw reply related [flat|nested] 46+ messages in thread* Re: [PATCH 20/21] xfs_scrub: bitmap iteration functions must retur
2026-06-04 6:09 ` [PATCH 20/21] xfs_scrub: bitmap iteration functions must retur Darrick J. Wong
@ 2026-06-04 15:30 ` Andrey Albershteyn
0 siblings, 0 replies; 46+ messages in thread
From: Andrey Albershteyn @ 2026-06-04 15:30 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: linux-xfs, hch
On 2026-06-03 23:09:31, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> bitmap_iterate doesn't generate its own errors, so don't invert the sign
> of the errors returned from retry_deferred_inode_range. This keeps them
> all positive, which is something that Codex is good at noticing and I'm
> not. :P
>
> Cc: <linux-xfs@vger.kernel.org> # v6.10.0
> Fixes: 5a30504f0c60e1 ("xfs_scrub: defer phase5 file scans if dirloop fails")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Looks good to me
Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org>
--
- Andrey
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH 21/21] xfs_scrub: read verification isn't ok if it hit runtime errors
2026-06-04 6:04 [PATCHSET] xfs_scrub: codex-inspired bug fixes for 7.1 Darrick J. Wong
` (19 preceding siblings ...)
2026-06-04 6:09 ` [PATCH 20/21] xfs_scrub: bitmap iteration functions must retur Darrick J. Wong
@ 2026-06-04 6:09 ` Darrick J. Wong
2026-06-04 15:35 ` Andrey Albershteyn
20 siblings, 1 reply; 46+ messages in thread
From: Darrick J. Wong @ 2026-06-04 6:09 UTC (permalink / raw)
To: aalbersh, djwong; +Cc: linux-xfs, linux-xfs, hch
From: Darrick J. Wong <djwong@kernel.org>
Codex complains that a read-verify pool couldn't possibly be ok if
someone set the runtime_error field to a nonzero value. That's true, so
fix the predicate.
Cc: <linux-xfs@vger.kernel.org> # v7.0.0
Fixes: 58fca77c2d2ae8 ("xfs_scrub: move failmap and other outputs into read_verify_pool")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
scrub/read_verify.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/scrub/read_verify.c b/scrub/read_verify.c
index 7dd588c1482af5..01e96f5bef40f6 100644
--- a/scrub/read_verify.c
+++ b/scrub/read_verify.c
@@ -557,7 +557,7 @@ bool
read_verify_ok(
const struct read_verify_pool *rvp)
{
- return rvp->failmap == NULL && !rvp->truncated;
+ return rvp->failmap == NULL && !rvp->truncated && !rvp->runtime_error;
}
/* Did the verification unexpectedly stop early due to short reads? */
^ permalink raw reply related [flat|nested] 46+ messages in thread* Re: [PATCH 21/21] xfs_scrub: read verification isn't ok if it hit runtime errors
2026-06-04 6:09 ` [PATCH 21/21] xfs_scrub: read verification isn't ok if it hit runtime errors Darrick J. Wong
@ 2026-06-04 15:35 ` Andrey Albershteyn
0 siblings, 0 replies; 46+ messages in thread
From: Andrey Albershteyn @ 2026-06-04 15:35 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: linux-xfs, hch
On 2026-06-03 23:09:47, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Codex complains that a read-verify pool couldn't possibly be ok if
> someone set the runtime_error field to a nonzero value. That's true, so
> fix the predicate.
>
> Cc: <linux-xfs@vger.kernel.org> # v7.0.0
> Fixes: 58fca77c2d2ae8 ("xfs_scrub: move failmap and other outputs into read_verify_pool")
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Looks good to me
Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org>
--
- Andrey
^ permalink raw reply [flat|nested] 46+ messages in thread