public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCHBOMB v8] xfsprogs: autonomous self healing of filesystems
@ 2026-03-03  0:25 Darrick J. Wong
  2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
                   ` (2 more replies)
  0 siblings, 3 replies; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:25 UTC (permalink / raw)
  To: Andrey Albershteyn; +Cc: cem, hch, linux-xfs

Hi all,

This patchset contains the userspace and QA changes (xfs_healer) needed
to put to use all the new kernel functionality to deliver live
information about filesystem health events (xfs_healthmon.c) to
userspace.

In userspace, we create a new daemon program that will read the event
objects and initiate repairs automatically.  This daemon is managed
entirely by systemd and will not block unmounting of the filesystem
unless repairs are ongoing.  They are auto-started by a starter
service that uses fanotify.

When the patchsets under this cover letter are merged, online fsck for
XFS will at long last be fully feature complete.  The passive scan parts
have been done since mid-2024, this final part adds proactive repair.

v8: clean up userspace for merging now that the kernel part is upstream
v7: more cleanups of the media verification ioctl, improve comments, and
    reuse the bio
v6: fix pi-breaking bugs, make verify failures trigger health reports
    and filter bio status flags better
v5: add verify-media ioctl, collapse small helper funcs with only
    one caller
v4: drop multiple client support so we can make direct calls into
    healthmon instead of chasing pointers and doing indirect calls
v3: drag out of rfc status

--D

^ permalink raw reply	[flat|nested] 112+ messages in thread

* [PATCHSET v8] xfsprogs: autonomous self healing of filesystems
  2026-03-03  0:25 [PATCHBOMB v8] xfsprogs: autonomous self healing of filesystems Darrick J. Wong
@ 2026-03-03  0:33 ` Darrick J. Wong
  2026-03-03  0:34   ` [PATCH 01/26] libfrog: add a function to grab the path from an open fd and a file handle Darrick J. Wong
                     ` (25 more replies)
  2026-03-03  0:33 ` [PATCHSET v8 1/2] fstests: test generic file IO error reporting Darrick J. Wong
  2026-03-03  0:33 ` [PATCHSET v8 2/2] fstests: autonomous self healing of filesystems Darrick J. Wong
  2 siblings, 26 replies; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:33 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: hch, linux-xfs

Hi all,

This patchset builds new functionality to deliver live information about
filesystem health events to userspace.  This is done by creating an
anonymous file that can be read() for events by userspace programs.
Events are captured by hooking various parts of XFS and iomap so that
metadata health failures, file I/O errors, and major changes in
filesystem state (unmounts, shutdowns, etc.) can be observed by
programs.

When an event occurs, the hook functions queue an event object to each
event anonfd for later processing.  Programs must have CAP_SYS_ADMIN
to open the anonfd and there's a maximum event lag to prevent resource
overconsumption.  The events themselves can be read() from the anonfd
as C structs for the xfs_healer daemon.

In userspace, we create a new daemon program that will read the event
objects and initiate repairs automatically.  This daemon is managed
entirely by systemd and will not block unmounting of the filesystem
unless repairs are ongoing.  They are auto-started by a starter
service that uses fanotify.

v8: clean up userspace for merging now that the kernel part is upstream
v7: more cleanups of the media verification ioctl, improve comments, and
    reuse the bio
v6: fix pi-breaking bugs, make verify failures trigger health reports
v5: add verify-media ioctl, collapse small helper funcs with only
    one caller
v4: drop multiple client support so we can make direct calls into
    healthmon instead of chasing pointers and doing indirect calls
v3: drag out of rfc status

If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.

With a bit of luck, this should all go splendidly.
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=health-monitoring

xfsprogs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=health-monitoring

fstests git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=health-monitoring
---
Commits in this patchset:
 * libfrog: add a function to grab the path from an open fd and a file handle
 * libfrog: create healthmon event log library functions
 * libfrog: add support code for starting systemd services programmatically
 * libfrog: hoist a couple of service helper functions
 * man2: document the healthmon ioctl
 * man2: document the media verification ioctl
 * xfs_io: monitor filesystem health events
 * xfs_io: add a media verify command
 * xfs_healer: create daemon to listen for health events
 * xfs_healer: enable repairing filesystems
 * xfs_healer: use getparents to look up file names
 * xfs_healer: create a per-mount background monitoring service
 * xfs_healer: create a service to start the per-mount healer service
 * xfs_healer: don't start service if kernel support unavailable
 * xfs_healer: use the autofsck fsproperty to select mode
 * xfs_healer: run full scrub after lost corruption events or targeted repair failure
 * xfs_healer: use getmntent to find moved filesystems
 * xfs_healer: validate that repair fds point to the monitored fs
 * xfs_healer: add a manual page
 * xfs_scrub: use the verify media ioctl during phase 6 if possible
 * xfs_scrub: perform media scanning of the log region
 * xfs_io: add listmount command
 * xfs_io: print systemd service names
 * mkfs: enable online repair if all backrefs are enabled
 * debian: enable xfs_healer on the root filesystem by default
 * debian/control: listify the build dependencies
---
 healer/xfs_healer.h                            |   88 +++
 io/io.h                                        |    8 
 libfrog/flagmap.h                              |   23 +
 libfrog/fsproperties.h                         |    5 
 libfrog/getparents.h                           |    4 
 libfrog/healthevent.h                          |   55 ++
 libfrog/systemd.h                              |   55 ++
 scrub/disk.h                                   |   11 
 Makefile                                       |    5 
 configure.ac                                   |   12 
 debian/control                                 |   14 +
 debian/postinst                                |    8 
 debian/prerm                                   |   13 +
 debian/rules                                   |    3 
 healer/Makefile                                |   70 +++
 healer/fsrepair.c                              |  342 ++++++++++++++
 healer/system-xfs_healer.slice                 |   31 +
 healer/weakhandle.c                            |  266 +++++++++++
 healer/xfs_healer.c                            |  605 ++++++++++++++++++++++++
 healer/xfs_healer@.service.in                  |  108 ++++
 healer/xfs_healer_start.c                      |  372 +++++++++++++++
 healer/xfs_healer_start.service.in             |   85 +++
 include/builddefs.in                           |   12 
 io/Makefile                                    |   16 +
 io/healthmon.c                                 |  186 +++++++
 io/init.c                                      |    3 
 io/listmount.c                                 |  383 +++++++++++++++
 io/scrub.c                                     |   75 +++
 io/verify_media.c                              |  180 +++++++
 libfrog/Makefile                               |   10 
 libfrog/flagmap.c                              |   79 +++
 libfrog/getparents.c                           |   93 +++-
 libfrog/healthevent.c                          |  477 +++++++++++++++++++
 libfrog/systemd.c                              |  181 +++++++
 m4/package_libcdev.m4                          |   97 ++++
 man/man2/ioctl_xfs_health_fd_on_monitored_fs.2 |   75 +++
 man/man2/ioctl_xfs_health_monitor.2            |  464 ++++++++++++++++++
 man/man2/ioctl_xfs_verify_media.2              |  185 +++++++
 man/man8/Makefile                              |   40 +-
 man/man8/xfs_healer.8                          |  109 ++++
 man/man8/xfs_healer_start.8                    |   37 +
 man/man8/xfs_io.8                              |  134 +++++
 mkfs/xfs_mkfs.c                                |    9 
 scrub/Makefile                                 |   13 -
 scrub/disk.c                                   |   40 ++
 scrub/phase1.c                                 |   25 +
 scrub/phase6.c                                 |   11 
 scrub/read_verify.c                            |    2 
 scrub/xfs_scrub.c                              |   32 -
 49 files changed, 5090 insertions(+), 61 deletions(-)
 create mode 100644 healer/xfs_healer.h
 create mode 100644 libfrog/flagmap.h
 create mode 100644 libfrog/healthevent.h
 create mode 100644 libfrog/systemd.h
 create mode 100644 debian/prerm
 create mode 100644 healer/Makefile
 create mode 100644 healer/fsrepair.c
 create mode 100644 healer/system-xfs_healer.slice
 create mode 100644 healer/weakhandle.c
 create mode 100644 healer/xfs_healer.c
 create mode 100644 healer/xfs_healer@.service.in
 create mode 100644 healer/xfs_healer_start.c
 create mode 100644 healer/xfs_healer_start.service.in
 create mode 100644 io/healthmon.c
 create mode 100644 io/listmount.c
 create mode 100644 io/verify_media.c
 create mode 100644 libfrog/flagmap.c
 create mode 100644 libfrog/healthevent.c
 create mode 100644 libfrog/systemd.c
 create mode 100644 man/man2/ioctl_xfs_health_fd_on_monitored_fs.2
 create mode 100644 man/man2/ioctl_xfs_health_monitor.2
 create mode 100644 man/man2/ioctl_xfs_verify_media.2
 create mode 100644 man/man8/xfs_healer.8
 create mode 100644 man/man8/xfs_healer_start.8


^ permalink raw reply	[flat|nested] 112+ messages in thread

* [PATCHSET v8 1/2] fstests: test generic file IO error reporting
  2026-03-03  0:25 [PATCHBOMB v8] xfsprogs: autonomous self healing of filesystems Darrick J. Wong
  2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
@ 2026-03-03  0:33 ` Darrick J. Wong
  2026-03-03  0:40   ` [PATCH 1/1] generic: test fsnotify filesystem " Darrick J. Wong
  2026-03-03  0:33 ` [PATCHSET v8 2/2] fstests: autonomous self healing of filesystems Darrick J. Wong
  2 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:33 UTC (permalink / raw)
  To: zlang, djwong
  Cc: linux-fsdevel, hch, gabriel, amir73il, jack, fstests, linux-xfs

Hi all,

Refactor the iomap file I/O error handling code so that failures are
reported in a generic way to fsnotify.  Then connect the XFS health
reporting to the same fsnotify, and now XFS can notify userspace of
problems.

If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.

With a bit of luck, this should all go splendidly.
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=filesystem-error-reporting

fstests git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=filesystem-error-reporting
---
Commits in this patchset:
 * generic: test fsnotify filesystem error reporting
---
 src/Makefile           |    2 
 src/fs-monitor.c       |  155 +++++++++++++++++++++++++++++++++
 tests/generic/1838     |  228 ++++++++++++++++++++++++++++++++++++++++++++++++
 tests/generic/1838.out |   20 ++++
 4 files changed, 404 insertions(+), 1 deletion(-)
 create mode 100644 src/fs-monitor.c
 create mode 100755 tests/generic/1838
 create mode 100644 tests/generic/1838.out


^ permalink raw reply	[flat|nested] 112+ messages in thread

* [PATCHSET v8 2/2] fstests: autonomous self healing of filesystems
  2026-03-03  0:25 [PATCHBOMB v8] xfsprogs: autonomous self healing of filesystems Darrick J. Wong
  2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
  2026-03-03  0:33 ` [PATCHSET v8 1/2] fstests: test generic file IO error reporting Darrick J. Wong
@ 2026-03-03  0:33 ` Darrick J. Wong
  2026-03-03  0:41   ` [PATCH 01/13] xfs: test health monitoring code Darrick J. Wong
                     ` (13 more replies)
  2 siblings, 14 replies; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:33 UTC (permalink / raw)
  To: zlang, djwong; +Cc: hch, fstests, linux-xfs

Hi all,

This series adds functionality and regression tests for the automated
self healing daemon for xfs.

v8: clean up userspace for merging now that the kernel part is upstream
v7: more cleanups of the media verification ioctl, improve comments, and
    reuse the bio
v6: fix pi-breaking bugs, make verify failures trigger health reports
v5: add verify-media ioctl, collapse small helper funcs with only
    one caller
v4: drop multiple client support so we can make direct calls into
    healthmon instead of chasing pointers and doing indirect calls
v3: drag out of rfc status

If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.

With a bit of luck, this should all go splendidly.
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=health-monitoring

xfsprogs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=health-monitoring

fstests git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=health-monitoring
---
Commits in this patchset:
 * xfs: test health monitoring code
 * xfs: test for metadata corruption error reporting via healthmon
 * xfs: test io error reporting via healthmon
 * xfs: set up common code for testing xfs_healer
 * xfs: test xfs_healer's event handling
 * xfs: test xfs_healer can fix a filesystem
 * xfs: test xfs_healer can report file I/O errors
 * xfs: test xfs_healer can report file media errors
 * xfs: test xfs_healer can report filesystem shutdowns
 * xfs: test xfs_healer can initiate full filesystem repairs
 * xfs: test xfs_healer can follow mount moves
 * xfs: test xfs_healer wont repair the wrong filesystem
 * xfs: test xfs_healer background service
---
 common/config       |   14 +++
 common/rc           |   15 ++++
 common/systemd      |   32 ++++++++
 common/xfs          |  114 ++++++++++++++++++++++++++++
 doc/group-names.txt |    1 
 tests/xfs/1878      |   93 +++++++++++++++++++++++
 tests/xfs/1878.out  |   10 ++
 tests/xfs/1879      |   93 +++++++++++++++++++++++
 tests/xfs/1879.out  |    8 ++
 tests/xfs/1882      |   44 +++++++++++
 tests/xfs/1882.out  |    2 
 tests/xfs/1884      |   89 ++++++++++++++++++++++
 tests/xfs/1884.out  |    2 
 tests/xfs/1885      |   53 +++++++++++++
 tests/xfs/1885.out  |    5 +
 tests/xfs/1896      |  210 +++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/1896.out  |   21 +++++
 tests/xfs/1897      |  172 ++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/1897.out  |    7 ++
 tests/xfs/1898      |   37 +++++++++
 tests/xfs/1898.out  |    4 +
 tests/xfs/1899      |  108 ++++++++++++++++++++++++++
 tests/xfs/1899.out  |    3 +
 tests/xfs/1900      |  115 ++++++++++++++++++++++++++++
 tests/xfs/1900.out  |    2 
 tests/xfs/1901      |  137 +++++++++++++++++++++++++++++++++
 tests/xfs/1901.out  |    2 
 tests/xfs/1902      |  152 +++++++++++++++++++++++++++++++++++++
 tests/xfs/1902.out  |    2 
 tests/xfs/802       |    4 -
 30 files changed, 1549 insertions(+), 2 deletions(-)
 create mode 100755 tests/xfs/1878
 create mode 100644 tests/xfs/1878.out
 create mode 100755 tests/xfs/1879
 create mode 100644 tests/xfs/1879.out
 create mode 100755 tests/xfs/1882
 create mode 100644 tests/xfs/1882.out
 create mode 100755 tests/xfs/1884
 create mode 100644 tests/xfs/1884.out
 create mode 100755 tests/xfs/1885
 create mode 100644 tests/xfs/1885.out
 create mode 100755 tests/xfs/1896
 create mode 100644 tests/xfs/1896.out
 create mode 100755 tests/xfs/1897
 create mode 100755 tests/xfs/1897.out
 create mode 100755 tests/xfs/1898
 create mode 100755 tests/xfs/1898.out
 create mode 100755 tests/xfs/1899
 create mode 100644 tests/xfs/1899.out
 create mode 100755 tests/xfs/1900
 create mode 100755 tests/xfs/1900.out
 create mode 100755 tests/xfs/1901
 create mode 100755 tests/xfs/1901.out
 create mode 100755 tests/xfs/1902
 create mode 100755 tests/xfs/1902.out


^ permalink raw reply	[flat|nested] 112+ messages in thread

* [PATCH 01/26] libfrog: add a function to grab the path from an open fd and a file handle
  2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
@ 2026-03-03  0:34   ` Darrick J. Wong
  2026-03-03 15:44     ` Christoph Hellwig
  2026-03-03  0:34   ` [PATCH 02/26] libfrog: create healthmon event log library functions Darrick J. Wong
                     ` (24 subsequent siblings)
  25 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:34 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

handle_walk_paths operates on a file handle, but requires that the fs
has been registered with libhandle via path_to_fshandle.  For a normal
libhandle client this is the desirable behavior because the application
*should* maintain an open fd to the filesystem mount.

However for xfs_healer this isn't going to work well because the healer
mustn't pin the mount while it's running.  It's smart enough to know how
to find and reconnect to the mountpoint, but libhandle doesn't have any
such concept.

Therefore, alter the libfrog getparents code so that xfs_healer can pass
in the mountpoint and reconnected fd without needing libhandle.  All
we're really doing here is trying to obtain a user-visible path for a
file that encountered problems for logging purposes; if it fails, we'll
fall back to logging the inode number.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 libfrog/getparents.h |    4 ++
 libfrog/getparents.c |   93 ++++++++++++++++++++++++++++++++++++++++++--------
 2 files changed, 82 insertions(+), 15 deletions(-)


diff --git a/libfrog/getparents.h b/libfrog/getparents.h
index 8098d594219b4c..e1df30889c7606 100644
--- a/libfrog/getparents.h
+++ b/libfrog/getparents.h
@@ -39,4 +39,8 @@ int fd_to_path(int fd, size_t ioctl_bufsize, char *path, size_t pathlen);
 int handle_to_path(const void *hanp, size_t hlen, size_t ioctl_bufsize,
 		char *path, size_t pathlen);
 
+int handle_walk_paths_fd(const char *mntpt, int mntfd, const void *hanp,
+		size_t hanlen, size_t ioctl_bufsize, walk_path_fn fn,
+		void *arg);
+
 #endif /* __LIBFROG_GETPARENTS_H_ */
diff --git a/libfrog/getparents.c b/libfrog/getparents.c
index 9118b0ff32db0d..e8f545392634e4 100644
--- a/libfrog/getparents.c
+++ b/libfrog/getparents.c
@@ -112,9 +112,13 @@ fd_walk_parents(
 	return ret;
 }
 
-/* Walk all parent pointers of this handle.  Returns 0 or positive errno. */
-int
-handle_walk_parents(
+/*
+ * Walk all parent pointers of this handle using the given fd to query the
+ * filesystem.  Returns 0 or positive errno.
+ */
+static int
+handle_walk_parents_fd(
+	int			fd,
 	const void		*hanp,
 	size_t			hlen,
 	size_t			bufsize,
@@ -123,21 +127,11 @@ handle_walk_parents(
 {
 	struct xfs_getparents_by_handle	gph = { };
 	void			*buf;
-	char			*mntpt;
-	int			fd;
 	int			ret;
 
 	if (hlen != sizeof(struct xfs_handle))
 		return EINVAL;
 
-	/*
-	 * This function doesn't modify the handle, but we don't want to have
-	 * to bump the libhandle major version just to change that.
-	 */
-	fd = handle_to_fsfd((void *)hanp, &mntpt);
-	if (fd < 0)
-		return errno;
-
 	buf = alloc_records(&gph.gph_request, bufsize);
 	if (!buf)
 		return errno;
@@ -158,6 +152,29 @@ handle_walk_parents(
 	return ret;
 }
 
+/* Walk all parent pointers of this handle.  Returns 0 or positive errno. */
+int
+handle_walk_parents(
+	const void		*hanp,
+	size_t			hlen,
+	size_t			bufsize,
+	walk_parent_fn		fn,
+	void			*arg)
+{
+	char			*mntpt;
+	int			fd;
+
+	/*
+	 * This function doesn't modify the handle, but we don't want to have
+	 * to bump the libhandle major version just to change that.
+	 */
+	fd = handle_to_fsfd((void *)hanp, &mntpt);
+	if (fd < 0)
+		return errno;
+
+	return handle_walk_parents_fd(fd, hanp, hlen, bufsize, fn, arg);
+}
+
 struct walk_ppaths_info {
 	/* Callback */
 	walk_path_fn		fn;
@@ -169,7 +186,11 @@ struct walk_ppaths_info {
 	/* Path that we're constructing. */
 	struct path_list	*path;
 
+	/* Use this much memory per call. */
 	size_t			ioctl_bufsize;
+
+	/* Use this fd for calling the getparents ioctl. */
+	int			mntfd;
 };
 
 /*
@@ -200,8 +221,14 @@ find_parent_component(
 		return errno;
 	path_list_add_parent_component(wpi->path, pc);
 
-	ret = handle_walk_parents(&rec->p_handle, sizeof(rec->p_handle),
-			wpi->ioctl_bufsize, find_parent_component, wpi);
+	if (wpi->mntfd >= 0)
+		ret = handle_walk_parents_fd(wpi->mntfd, &rec->p_handle,
+				sizeof(rec->p_handle), wpi->ioctl_bufsize,
+				find_parent_component, wpi);
+	else
+		ret = handle_walk_parents(&rec->p_handle,
+				sizeof(rec->p_handle), wpi->ioctl_bufsize,
+				find_parent_component, wpi);
 
 	path_list_del_component(wpi->path, pc);
 	path_component_free(pc);
@@ -222,6 +249,7 @@ handle_walk_paths(
 {
 	struct walk_ppaths_info	wpi = {
 		.ioctl_bufsize	= ioctl_bufsize,
+		.mntfd		= -1,
 	};
 	int			ret;
 
@@ -246,6 +274,41 @@ handle_walk_paths(
 	return ret;
 }
 
+/*
+ * Call the given function on all known paths from the vfs root to the inode
+ * described in the handle using an already open mountpoint and fd.  Returns 0
+ * for success or positive errno.
+ */
+int
+handle_walk_paths_fd(
+	const char		*mntpt,
+	int			mntfd,
+	const void		*hanp,
+	size_t			hlen,
+	size_t			ioctl_bufsize,
+	walk_path_fn		fn,
+	void			*arg)
+{
+	struct walk_ppaths_info	wpi = {
+		.ioctl_bufsize	= ioctl_bufsize,
+		.mntfd		= mntfd,
+		.mntpt		= (char *)mntpt,
+	};
+	int			ret;
+
+	wpi.path = path_list_init();
+	if (!wpi.path)
+		return errno;
+	wpi.fn = fn;
+	wpi.arg = arg;
+
+	ret = handle_walk_parents_fd(mntfd, hanp, hlen, ioctl_bufsize,
+			find_parent_component, &wpi);
+
+	path_list_free(wpi.path);
+	return ret;
+}
+
 /*
  * Call the given function on all known paths from the vfs root to the inode
  * referred to by the file description.  Returns 0 or positive errno.


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 02/26] libfrog: create healthmon event log library functions
  2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
  2026-03-03  0:34   ` [PATCH 01/26] libfrog: add a function to grab the path from an open fd and a file handle Darrick J. Wong
@ 2026-03-03  0:34   ` Darrick J. Wong
  2026-03-03 15:44     ` Christoph Hellwig
  2026-03-03  0:34   ` [PATCH 03/26] libfrog: add support code for starting systemd services programmatically Darrick J. Wong
                     ` (23 subsequent siblings)
  25 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:34 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Add some helper functions to log health monitoring events so that xfs_io
and xfs_healer can share logging code.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 libfrog/flagmap.h     |   20 +++
 libfrog/healthevent.h |   43 ++++++
 libfrog/Makefile      |    4 +
 libfrog/flagmap.c     |   62 ++++++++
 libfrog/healthevent.c |  360 +++++++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 489 insertions(+)
 create mode 100644 libfrog/flagmap.h
 create mode 100644 libfrog/healthevent.h
 create mode 100644 libfrog/flagmap.c
 create mode 100644 libfrog/healthevent.c


diff --git a/libfrog/flagmap.h b/libfrog/flagmap.h
new file mode 100644
index 00000000000000..8031d75a7c02a8
--- /dev/null
+++ b/libfrog/flagmap.h
@@ -0,0 +1,20 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2025-2026 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#ifndef LIBFROG_FLAGMAP_H_
+#define LIBFROG_FLAGMAP_H_
+
+struct flag_map {
+	unsigned long long	flag;
+	const char		*string;
+};
+
+void mask_to_string(const struct flag_map *map, unsigned long long mask,
+		const char *delimiter, char *buf, size_t bufsize);
+
+const char *value_to_string(const struct flag_map *map,
+		unsigned long long value);
+
+#endif /* LIBFROG_FLAGMAP_H_ */
diff --git a/libfrog/healthevent.h b/libfrog/healthevent.h
new file mode 100644
index 00000000000000..6de41bc797100c
--- /dev/null
+++ b/libfrog/healthevent.h
@@ -0,0 +1,43 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2025-2026 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#ifndef LIBFROG_HEALTHEVENT_H_
+#define LIBFROG_HEALTHEVENT_H_
+
+struct hme_prefix {
+	/*
+	 * Format a complete file path into this buffer to prevent the logging
+	 * code from printing the mountpoint and a file handle.  Only works for
+	 * file-related events.
+	 */
+	char		path[MAXPATHLEN];
+
+	/* Set this to the mountpoint */
+	const char	*mountpoint;
+};
+
+static inline bool hme_prefix_has_path(const struct hme_prefix *pfx)
+{
+	return pfx->path[0] != 0;
+}
+
+static inline void hme_prefix_clear_path(struct hme_prefix *pfx)
+{
+	pfx->path[0] = 0;
+}
+
+static inline void
+hme_prefix_init(
+	struct hme_prefix	*pfx,
+	const char		*mountpoint)
+{
+	pfx->mountpoint = mountpoint;
+	hme_prefix_clear_path(pfx);
+}
+
+void hme_report_event(const struct hme_prefix *pfx,
+		const struct xfs_health_monitor_event *hme);
+
+#endif /* LIBFROG_HEALTHEVENT_H_ */
diff --git a/libfrog/Makefile b/libfrog/Makefile
index 927bd8d0957fab..bccd9289e5dd79 100644
--- a/libfrog/Makefile
+++ b/libfrog/Makefile
@@ -19,11 +19,13 @@ bulkstat.c \
 convert.c \
 crc32.c \
 file_exchange.c \
+flagmap.c \
 fsgeom.c \
 fsproperties.c \
 fsprops.c \
 getparents.c \
 histogram.c \
+healthevent.c \
 file_attr.c \
 list_sort.c \
 linux.c \
@@ -51,11 +53,13 @@ dahashselftest.h \
 div64.h \
 fakelibattr.h \
 file_exchange.h \
+flagmap.h \
 fsgeom.h \
 fsproperties.h \
 fsprops.h \
 getparents.h \
 handle_priv.h \
+healthevent.h \
 histogram.h \
 file_attr.h \
 logging.h \
diff --git a/libfrog/flagmap.c b/libfrog/flagmap.c
new file mode 100644
index 00000000000000..631c4bbc8f1dc0
--- /dev/null
+++ b/libfrog/flagmap.c
@@ -0,0 +1,62 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2026 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#include "xfs.h"
+
+#include "platform_defs.h"
+#include "libfrog/flagmap.h"
+
+/*
+ * Given a mapping of bits to strings and a bitmask, format the bitmask as a
+ * list of strings and hexadecimal number representing bits not mapped to any
+ * string.  The output will be truncated if buf is not large enough.
+ */
+void
+mask_to_string(
+	const struct flag_map	*map,
+	unsigned long long	mask,
+	const char		*delimiter,
+	char			*buf,
+	size_t			bufsize)
+{
+	const char		*tag = "";
+	unsigned long long	seen = 0;
+	int			w;
+
+	for (; map->string; map++) {
+		seen |= map->flag;
+
+		if (mask & map->flag) {
+			w = snprintf(buf, bufsize, "%s%s", tag, _(map->string));
+			if (w > bufsize)
+				return;
+
+			buf += w;
+			bufsize -= w;
+
+			tag = delimiter;
+		}
+	}
+
+	if (mask & ~seen)
+		snprintf(buf, bufsize, "%s0x%llx", tag, mask & ~seen);
+}
+
+/*
+ * Given a mapping of values to strings and a value, return the matching string
+ * or confusion.
+ */
+const char *
+value_to_string(
+	const struct flag_map	*map,
+	unsigned long long	value)
+{
+	for (; map->string; map++) {
+		if (value == map->flag)
+			return _(map->string);
+	}
+
+	return _("unknown value");
+}
diff --git a/libfrog/healthevent.c b/libfrog/healthevent.c
new file mode 100644
index 00000000000000..8520cb3218fb03
--- /dev/null
+++ b/libfrog/healthevent.c
@@ -0,0 +1,360 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2025-2026 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#include "xfs.h"
+
+#include "platform_defs.h"
+#include "libfrog/healthevent.h"
+#include "libfrog/flagmap.h"
+
+/*
+ * The healthmon log string format is as follows:
+ *
+ * WHICH OBJECT: STATUS
+ *
+ * /mnt: 32 events lost
+ * /mnt agno 0x5 bnobt, rmapbt: sick
+ * /mnt rgno 0x5 bitmap: sick
+ * /mnt ino 13 gen 0x3 bmbtd: sick
+ * /mnt/a bmbtd: sick
+ * /mnt ino 13 gen 0x3 pos 4096 len 4096: directio_write failed
+ * /mnt/a pos 4096 len 4096: directio_read failed
+ * /mnt datadev daddr 0x13 bbcount 0x5: media error
+ * /mnt: filesystem shut down due to shenanigans, badness
+ */
+
+static const struct flag_map device_domains[] = {
+	{ XFS_HEALTH_MONITOR_DOMAIN_DATADEV,	N_("datadev") },
+	{ XFS_HEALTH_MONITOR_DOMAIN_RTDEV,	N_("rtdev") },
+	{ XFS_HEALTH_MONITOR_DOMAIN_LOGDEV,	N_("logdev") },
+	{0, NULL},
+};
+
+static inline const char *
+device_domain_string(
+	uint32_t		domain)
+{
+	return value_to_string(device_domains, domain);
+}
+
+static const struct flag_map fileio_types[] = {
+	{ XFS_HEALTH_MONITOR_TYPE_BUFREAD,	N_("buffered_read") },
+	{ XFS_HEALTH_MONITOR_TYPE_BUFWRITE,	N_("buffered_write") },
+	{ XFS_HEALTH_MONITOR_TYPE_DIOREAD,	N_("directio_read") },
+	{ XFS_HEALTH_MONITOR_TYPE_DIOWRITE,	N_("directio_write") },
+	{ XFS_HEALTH_MONITOR_TYPE_DATALOST,	N_("media") },
+	{0, NULL},
+};
+
+static inline const char *
+fileio_type_string(
+	uint32_t		type)
+{
+	return value_to_string(fileio_types, type);
+}
+
+static const struct flag_map health_types[] = {
+	{ XFS_HEALTH_MONITOR_TYPE_SICK,		N_("sick") },
+	{ XFS_HEALTH_MONITOR_TYPE_CORRUPT,	N_("corrupt") },
+	{ XFS_HEALTH_MONITOR_TYPE_HEALTHY,	N_("healthy") },
+	{0, NULL},
+};
+
+static inline const char *
+health_type_string(
+	uint32_t		type)
+{
+	return value_to_string(health_types, type);
+}
+
+/* Report that the kernel lost events. */
+static void
+report_lost(
+	const struct hme_prefix			*pfx,
+	const struct xfs_health_monitor_event	*hme)
+{
+	printf("%s: %llu %s\n", pfx->mountpoint,
+			(unsigned long long)hme->e.lost.count,
+			_("events lost"));
+	fflush(stdout);
+}
+
+/* Report that the monitor is running. */
+static void
+report_running(
+	const struct hme_prefix			*pfx,
+	const struct xfs_health_monitor_event	*hme)
+{
+	printf("%s: %s\n", pfx->mountpoint, _("monitoring started"));
+	fflush(stdout);
+}
+
+/* Report that the filesystem was unmounted. */
+static void
+report_unmounted(
+	const struct hme_prefix			*pfx,
+	const struct xfs_health_monitor_event	*hme)
+{
+	printf("%s: %s\n", pfx->mountpoint, _("filesystem unmounted"));
+	fflush(stdout);
+}
+
+static const struct flag_map shutdown_reasons[] = {
+	{ XFS_HEALTH_SHUTDOWN_META_IO_ERROR,	N_("metadata I/O error") },
+	{ XFS_HEALTH_SHUTDOWN_LOG_IO_ERROR,	N_("log I/O error") },
+	{ XFS_HEALTH_SHUTDOWN_FORCE_UMOUNT,	N_("forced unmount") },
+	{ XFS_HEALTH_SHUTDOWN_CORRUPT_INCORE,	N_("in-memory state corruption") },
+	{ XFS_HEALTH_SHUTDOWN_CORRUPT_ONDISK,	N_("ondisk metadata corruption") },
+	{ XFS_HEALTH_SHUTDOWN_DEVICE_REMOVED,	N_("device removed") },
+	{0, NULL},
+};
+
+/* Report an abortive shutdown of the filesystem. */
+static void
+report_shutdown(
+	const struct hme_prefix			*pfx,
+	const struct xfs_health_monitor_event	*hme)
+{
+	char					buf[512];
+
+	mask_to_string(shutdown_reasons, hme->e.shutdown.reasons, ", ", buf,
+			sizeof(buf));
+
+	printf("%s: %s %s\n", pfx->mountpoint,
+			_("filesystem shut down due to"), buf);
+	fflush(stdout);
+}
+
+static const struct flag_map inode_structs[] = {
+	{ XFS_BS_SICK_INODE,	N_("core") },
+	{ XFS_BS_SICK_BMBTD,	N_("datafork") },
+	{ XFS_BS_SICK_BMBTA,	N_("attrfork") },
+	{ XFS_BS_SICK_BMBTC,	N_("cowfork") },
+	{ XFS_BS_SICK_DIR,	N_("directory") },
+	{ XFS_BS_SICK_XATTR,	N_("xattr") },
+	{ XFS_BS_SICK_SYMLINK,	N_("symlink") },
+	{ XFS_BS_SICK_PARENT,	N_("parent") },
+	{ XFS_BS_SICK_DIRTREE,	N_("dirtree") },
+	{0, NULL},
+};
+
+/* Report inode metadata corruption */
+static void
+report_inode(
+	const struct hme_prefix			*pfx,
+	const struct xfs_health_monitor_event	*hme)
+{
+	char					buf[512];
+
+	mask_to_string(inode_structs, hme->e.inode.mask, ", ", buf,
+			sizeof(buf));
+
+	if (hme_prefix_has_path(pfx))
+		printf("%s %s: %s\n",
+				pfx->path,
+				buf,
+				health_type_string(hme->type));
+	else
+		printf("%s %s %llu %s 0x%x %s: %s\n",
+				pfx->mountpoint,
+				_("ino"),
+				(unsigned long long)hme->e.inode.ino,
+				_("gen"),
+				hme->e.inode.gen,
+				buf,
+				health_type_string(hme->type));
+	fflush(stdout);
+}
+
+static const struct flag_map ag_structs[] = {
+	{ XFS_AG_GEOM_SICK_SB,		N_("super") },
+	{ XFS_AG_GEOM_SICK_AGF,		N_("agf") },
+	{ XFS_AG_GEOM_SICK_AGFL,	N_("agfl") },
+	{ XFS_AG_GEOM_SICK_AGI,		N_("agi") },
+	{ XFS_AG_GEOM_SICK_BNOBT,	N_("bnobt") },
+	{ XFS_AG_GEOM_SICK_CNTBT,	N_("cntbt") },
+	{ XFS_AG_GEOM_SICK_INOBT,	N_("inobt") },
+	{ XFS_AG_GEOM_SICK_FINOBT,	N_("finobt") },
+	{ XFS_AG_GEOM_SICK_RMAPBT,	N_("rmapbt") },
+	{ XFS_AG_GEOM_SICK_REFCNTBT,	N_("refcountbt") },
+	{ XFS_AG_GEOM_SICK_INODES,	N_("inodes") },
+	{0, NULL},
+};
+
+/* Report AG metadata corruption */
+static void
+report_ag(
+	const struct hme_prefix			*pfx,
+	const struct xfs_health_monitor_event	*hme)
+{
+	char					buf[512];
+
+	mask_to_string(ag_structs, hme->e.group.mask, ", ", buf,
+			sizeof(buf));
+
+	printf("%s %s 0x%x %s: %s\n",
+			pfx->mountpoint,
+			_("agno"),
+			hme->e.group.gno,
+			buf,
+			health_type_string(hme->type));
+	fflush(stdout);
+}
+
+static const struct flag_map rtgroup_structs[] = {
+	{ XFS_RTGROUP_GEOM_SICK_SUPER,		N_("super") },
+	{ XFS_RTGROUP_GEOM_SICK_BITMAP,		N_("bitmap") },
+	{ XFS_RTGROUP_GEOM_SICK_SUMMARY,	N_("summary") },
+	{ XFS_RTGROUP_GEOM_SICK_RMAPBT,		N_("rmapbt") },
+	{ XFS_RTGROUP_GEOM_SICK_REFCNTBT,	N_("refcountbt") },
+	{0, NULL},
+};
+
+/* Report rtgroup metadata corruption */
+static void
+report_rtgroup(
+	const struct hme_prefix			*pfx,
+	const struct xfs_health_monitor_event	*hme)
+{
+	char					buf[512];
+
+	mask_to_string(rtgroup_structs, hme->e.group.mask, ", ", buf,
+			sizeof(buf));
+
+	printf("%s %s 0x%x %s: %s\n",
+			pfx->mountpoint,
+			_("rgno"),
+			hme->e.group.gno,
+			buf, health_type_string(hme->type));
+	fflush(stdout);
+}
+
+static const struct flag_map fs_structs[] = {
+	{ XFS_FSOP_GEOM_SICK_COUNTERS,		N_("fscounters") },
+	{ XFS_FSOP_GEOM_SICK_UQUOTA,		N_("usrquota") },
+	{ XFS_FSOP_GEOM_SICK_GQUOTA,		N_("grpquota") },
+	{ XFS_FSOP_GEOM_SICK_PQUOTA,		N_("prjquota") },
+	{ XFS_FSOP_GEOM_SICK_RT_BITMAP,		N_("bitmap") },
+	{ XFS_FSOP_GEOM_SICK_RT_SUMMARY,	N_("summary") },
+	{ XFS_FSOP_GEOM_SICK_QUOTACHECK,	N_("quotacheck") },
+	{ XFS_FSOP_GEOM_SICK_NLINKS,		N_("nlinks") },
+	{ XFS_FSOP_GEOM_SICK_METADIR,		N_("metadir") },
+	{ XFS_FSOP_GEOM_SICK_METAPATH,		N_("metapath") },
+	{0, NULL},
+};
+
+/* Report fs-wide metadata corruption */
+static void
+report_fs(
+	const struct hme_prefix			*pfx,
+	const struct xfs_health_monitor_event	*hme)
+{
+	char					buf[512];
+
+	mask_to_string(fs_structs, hme->e.fs.mask, ", ", buf, sizeof(buf));
+
+	printf("%s %s: %s\n",
+			pfx->mountpoint,
+			buf,
+			health_type_string(hme->type));
+	fflush(stdout);
+}
+
+/* Report device media corruption */
+static void
+report_device_error(
+	const struct hme_prefix			*pfx,
+	const struct xfs_health_monitor_event	*hme)
+{
+	printf("%s %s %s 0x%llx %s 0x%llx: %s\n", pfx->mountpoint,
+			device_domain_string(hme->domain),
+			_("daddr"),
+			(unsigned long long)hme->e.media.daddr,
+			_("bbcount"),
+			(unsigned long long)hme->e.media.bbcount,
+			_("media error"));
+	fflush(stdout);
+}
+
+/* Report file range errors */
+static void
+report_file_range(
+	const struct hme_prefix			*pfx,
+	const struct xfs_health_monitor_event	*hme)
+{
+	if (hme_prefix_has_path(pfx))
+		printf("%s ", pfx->path);
+	else
+		printf("%s %s %llu %s 0x%x ",
+				pfx->mountpoint,
+				_("ino"),
+				(unsigned long long)hme->e.filerange.ino,
+				_("gen"),
+				hme->e.filerange.gen);
+	if (hme->type != XFS_HEALTH_MONITOR_TYPE_DATALOST &&
+	    hme->e.filerange.error)
+		printf("%s %llu %s %llu: %s: %s\n",
+				_("pos"),
+				(unsigned long long)hme->e.filerange.pos,
+				_("len"),
+				(unsigned long long)hme->e.filerange.len,
+				fileio_type_string(hme->type),
+				strerror(hme->e.filerange.error));
+	else
+		printf("%s %llu %s %llu: %s %s\n",
+				_("pos"),
+				(unsigned long long)hme->e.filerange.pos,
+				_("len"),
+				(unsigned long long)hme->e.filerange.len,
+				fileio_type_string(hme->type),
+				_("failed"));
+	fflush(stdout);
+}
+
+/* Log a health monitoring event to stdout. */
+void
+hme_report_event(
+	const struct hme_prefix			*pfx,
+	const struct xfs_health_monitor_event	*hme)
+{
+	switch (hme->domain) {
+	case XFS_HEALTH_MONITOR_DOMAIN_MOUNT:
+		switch (hme->type) {
+		case XFS_HEALTH_MONITOR_TYPE_LOST:
+			report_lost(pfx, hme);
+			return;
+		case XFS_HEALTH_MONITOR_TYPE_RUNNING:
+			report_running(pfx, hme);
+			return;
+		case XFS_HEALTH_MONITOR_TYPE_UNMOUNT:
+			report_unmounted(pfx, hme);
+			return;
+		case XFS_HEALTH_MONITOR_TYPE_SHUTDOWN:
+			report_shutdown(pfx, hme);
+			return;
+		}
+		break;
+	case XFS_HEALTH_MONITOR_DOMAIN_INODE:
+		report_inode(pfx, hme);
+		break;
+	case XFS_HEALTH_MONITOR_DOMAIN_AG:
+		report_ag(pfx, hme);
+		break;
+	case XFS_HEALTH_MONITOR_DOMAIN_RTGROUP:
+		report_rtgroup(pfx, hme);
+		break;
+	case XFS_HEALTH_MONITOR_DOMAIN_FS:
+		report_fs(pfx, hme);
+		break;
+	case XFS_HEALTH_MONITOR_DOMAIN_DATADEV:
+	case XFS_HEALTH_MONITOR_DOMAIN_RTDEV:
+	case XFS_HEALTH_MONITOR_DOMAIN_LOGDEV:
+		report_device_error(pfx, hme);
+		break;
+	case XFS_HEALTH_MONITOR_DOMAIN_FILERANGE:
+		report_file_range(pfx, hme);
+		break;
+	}
+}


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 03/26] libfrog: add support code for starting systemd services programmatically
  2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
  2026-03-03  0:34   ` [PATCH 01/26] libfrog: add a function to grab the path from an open fd and a file handle Darrick J. Wong
  2026-03-03  0:34   ` [PATCH 02/26] libfrog: create healthmon event log library functions Darrick J. Wong
@ 2026-03-03  0:34   ` Darrick J. Wong
  2026-03-03 15:45     ` Christoph Hellwig
  2026-03-03  0:34   ` [PATCH 04/26] libfrog: hoist a couple of service helper functions Darrick J. Wong
                     ` (22 subsequent siblings)
  25 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:34 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Add some simple routines for computing the name of systemd service
instances and starting systemd services.  These will be used by the
xfs_healer_start service to start per-filesystem xfs_healer service
instances.

Note that we run systemd helper programs as subprocesses for a couple of
reasons.  First, the path-escaping functionality is not a part of any
library-accessible API, which means it can only be accessed via
systemd-escape(1).  Second, although the service startup functionality
can be reached via dbus, doing so would introduce a new library
dependency.  Systemd is also undergoing a dbus -> varlink RPC transition
so we avoid that mess by calling the cli systemctl(1) program.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 libfrog/systemd.h     |   20 +++++
 configure.ac          |    1 
 include/builddefs.in  |    1 
 libfrog/Makefile      |    6 ++
 libfrog/systemd.c     |  181 +++++++++++++++++++++++++++++++++++++++++++++++++
 m4/package_libcdev.m4 |   19 +++++
 6 files changed, 228 insertions(+)
 create mode 100644 libfrog/systemd.h
 create mode 100644 libfrog/systemd.c


diff --git a/libfrog/systemd.h b/libfrog/systemd.h
new file mode 100644
index 00000000000000..4f414bc3c1e9c3
--- /dev/null
+++ b/libfrog/systemd.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (c) 2026 Oracle.  All rights reserved.
+ * All Rights Reserved.
+ */
+#ifndef __LIBFROG_SYSTEMD_H__
+#define __LIBFROG_SYSTEMD_H__
+
+int systemd_path_instance_unit_name(const char *unit_template,
+		const char *path, char *unitname, size_t unitnamelen);
+
+enum systemd_unit_manage {
+	UM_STOP,
+	UM_START,
+	UM_RESTART,
+};
+
+int systemd_manage_unit(enum systemd_unit_manage how, const char *unitname);
+
+#endif /* __LIBFROG_SYSTEMD_H__ */
diff --git a/configure.ac b/configure.ac
index a8b8f7d5066fb6..8d2bbb9ef88bb9 100644
--- a/configure.ac
+++ b/configure.ac
@@ -182,6 +182,7 @@ AC_CONFIG_UDEV_RULE_DIR
 AC_HAVE_BLKID_TOPO
 AC_HAVE_TRIVIAL_AUTO_VAR_INIT
 AC_STRERROR_R_RETURNS_STRING
+AC_HAVE_CLOSE_RANGE
 
 if test "$enable_ubsan" = "yes" || test "$enable_ubsan" = "probe"; then
         AC_PACKAGE_CHECK_UBSAN
diff --git a/include/builddefs.in b/include/builddefs.in
index b38a099b7d525a..4a2cb757c0bdb3 100644
--- a/include/builddefs.in
+++ b/include/builddefs.in
@@ -118,6 +118,7 @@ HAVE_UDEV = @have_udev@
 UDEV_RULE_DIR = @udev_rule_dir@
 HAVE_LIBURCU_ATOMIC64 = @have_liburcu_atomic64@
 STRERROR_R_RETURNS_STRING = @strerror_r_returns_string@
+HAVE_CLOSE_RANGE = @have_close_range@
 
 GCCFLAGS = -funsigned-char -fno-strict-aliasing -Wall
 #	   -Wbitwise -Wno-transparent-union -Wno-old-initializer -Wno-decl
diff --git a/libfrog/Makefile b/libfrog/Makefile
index bccd9289e5dd79..89a0332ae85372 100644
--- a/libfrog/Makefile
+++ b/libfrog/Makefile
@@ -36,6 +36,7 @@ ptvar.c \
 radix-tree.c \
 randbytes.c \
 scrub.c \
+systemd.c \
 util.c \
 workqueue.c \
 zones.c
@@ -70,6 +71,7 @@ radix-tree.h \
 randbytes.h \
 scrub.h \
 statx.h \
+systemd.h \
 workqueue.h \
 zones.h
 
@@ -90,6 +92,10 @@ ifeq ($(HAVE_GETRANDOM_NONBLOCK),yes)
 LCFLAGS += -DHAVE_GETRANDOM_NONBLOCK
 endif
 
+ifeq ($(HAVE_CLOSE_RANGE),yes)
+CFLAGS += -DHAVE_CLOSE_RANGE
+endif
+
 default: ltdepend $(LTLIBRARY) $(GETTEXT_PY)
 
 crc32table.h: gen_crc32table.c crc32defs.h
diff --git a/libfrog/systemd.c b/libfrog/systemd.c
new file mode 100644
index 00000000000000..0e04ee04fc2682
--- /dev/null
+++ b/libfrog/systemd.c
@@ -0,0 +1,181 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2026 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#include "xfs.h"
+#include <unistd.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/wait.h>
+
+#include "libfrog/systemd.h"
+
+/* Close all fds except for the three standard ones. */
+static void
+close_fds(void)
+{
+	int max_fd = sysconf(_SC_OPEN_MAX);
+	int fd;
+#ifdef HAVE_CLOSE_RANGE
+	int ret;
+#endif
+
+	if (max_fd < 1)
+		max_fd = 1024;
+
+#ifdef HAVE_CLOSE_RANGE
+	ret = close_range(STDERR_FILENO + 1, max_fd, 0);
+	if (!ret)
+		return;
+#endif
+
+	for (fd = STDERR_FILENO + 1; fd < max_fd; fd++)
+		close(fd);
+}
+
+/*
+ * Compute the systemd instance unit name for a given path.
+ *
+ * The escaping logic is implemented directly in systemctl so there's no
+ * library or dbus service that we can call.
+ */
+int
+systemd_path_instance_unit_name(
+	const char		*unit_template,
+	const char		*path,
+	char			*unitname,
+	size_t			unitnamelen)
+{
+	size_t			i;
+	ssize_t			bytes;
+	pid_t			child_pid;
+	int			pipe_fds[2];
+	int			child_status;
+	int			ret;
+
+	ret = pipe(pipe_fds);
+	if (ret)
+		return -1;
+
+	child_pid = fork();
+	if (child_pid < 0)
+		return -1;
+
+	if (!child_pid) {
+		/* child process */
+		char		*argv[] = {
+			"systemd-escape",
+			"--template",
+			(char *)unit_template,
+			"--path",
+			(char *)path,
+			NULL,
+		};
+
+		ret = dup2(pipe_fds[1], STDOUT_FILENO);
+		if (ret < 0) {
+			perror(path);
+			goto fail;
+		}
+
+		close_fds();
+
+		ret = execvp("systemd-escape", argv);
+		if (ret)
+			perror(path);
+
+fail:
+		exit(EXIT_FAILURE);
+	}
+
+	/*
+	 * Close our connection to stdin so that the read won't hang if the
+	 * child exits without writing anything to stdout.
+	 */
+	close(pipe_fds[1]);
+	bytes = read(pipe_fds[0], unitname, unitnamelen - 1);
+	close(pipe_fds[0]);
+
+	waitpid(child_pid, &child_status, 0);
+	if (!WIFEXITED(child_status) || WEXITSTATUS(child_status) != 0) {
+		errno = 0;
+		return -1;
+	}
+
+	/* Terminate string at first newline or end of buffer. */
+	for (i = 0; i < bytes; i++) {
+		if (unitname[i] == '\n') {
+			unitname[i] = 0;
+			break;
+		}
+	}
+	if (i == bytes)
+		unitname[unitnamelen - 1] = 0;
+
+	return 0;
+}
+
+static const char *systemd_unit_manage_string(enum systemd_unit_manage how)
+{
+	switch (how) {
+	case UM_STOP:
+		return "stop";
+	case UM_START:
+		return "start";
+	case UM_RESTART:
+		return "restart";
+	}
+
+	/* shut up gcc */
+	return NULL;
+}
+
+/*
+ * Start/stop/restart a systemd unit and let it run in the background.
+ *
+ * systemctl start wraps a lot of logic around starting a unit, so it's less
+ * work for xfsprogs to invoke systemctl instead of calling through dbus.
+ */
+int
+systemd_manage_unit(
+	enum systemd_unit_manage	how,
+	const char			*unitname)
+{
+	pid_t				child_pid;
+	int				child_status;
+	int				ret;
+
+	child_pid = fork();
+	if (child_pid < 0)
+		return -1;
+
+	if (!child_pid) {
+		/* child starts the process */
+		char		*argv[] = {
+			"systemctl",
+			(char *)systemd_unit_manage_string(how),
+			"--no-block",
+			(char *)unitname,
+			NULL,
+		};
+
+		close_fds();
+
+		ret = execvp("systemctl", argv);
+		if (ret)
+			perror("systemctl");
+
+		exit(EXIT_FAILURE);
+	}
+
+	/* parent waits for process */
+	waitpid(child_pid, &child_status, 0);
+
+	/* systemctl (stop/start/restart) --no-block should return quickly */
+	if (WIFEXITED(child_status) && WEXITSTATUS(child_status) == 0)
+		return 0;
+
+	errno = ENOMEM;
+	return -1;
+}
diff --git a/m4/package_libcdev.m4 b/m4/package_libcdev.m4
index c5538c30d2518a..b3d87229d3367a 100644
--- a/m4/package_libcdev.m4
+++ b/m4/package_libcdev.m4
@@ -347,3 +347,22 @@ puts(strerror_r(0, buf, sizeof(buf)));
     CFLAGS="$OLD_CFLAGS"
     AC_SUBST(strerror_r_returns_string)
   ])
+
+#
+# Check if close_range exists
+#
+AC_DEFUN([AC_HAVE_CLOSE_RANGE],
+  [AC_MSG_CHECKING([for close_range])
+    AC_LINK_IFELSE(
+    [AC_LANG_PROGRAM([[
+#define _GNU_SOURCE
+#include <unistd.h>
+#include <linux/close_range.h>
+  ]], [[
+close_range(0, 0, 0);
+  ]])
+    ], have_close_range=yes
+       AC_MSG_RESULT(yes),
+       AC_MSG_RESULT(no))
+    AC_SUBST(have_close_range)
+  ])


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 04/26] libfrog: hoist a couple of service helper functions
  2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
                     ` (2 preceding siblings ...)
  2026-03-03  0:34   ` [PATCH 03/26] libfrog: add support code for starting systemd services programmatically Darrick J. Wong
@ 2026-03-03  0:34   ` Darrick J. Wong
  2026-03-03 15:45     ` Christoph Hellwig
  2026-03-03  0:35   ` [PATCH 05/26] man2: document the healthmon ioctl Darrick J. Wong
                     ` (21 subsequent siblings)
  25 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:34 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Hoist a couple of service/daemon-related helper functions to libfrog so
that we can share the code between xfs_scrub and xfs_healer.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 libfrog/systemd.h |   28 ++++++++++++++++++++++++++++
 scrub/xfs_scrub.c |   32 +++++++++-----------------------
 2 files changed, 37 insertions(+), 23 deletions(-)


diff --git a/libfrog/systemd.h b/libfrog/systemd.h
index 4f414bc3c1e9c3..c96df4afa39aa6 100644
--- a/libfrog/systemd.h
+++ b/libfrog/systemd.h
@@ -17,4 +17,32 @@ enum systemd_unit_manage {
 
 int systemd_manage_unit(enum systemd_unit_manage how, const char *unitname);
 
+static inline bool systemd_is_service(void)
+{
+	return getenv("SERVICE_MODE") != NULL;
+}
+
+/* Special processing for a service/daemon program that is exiting. */
+static inline int
+systemd_service_exit(int ret)
+{
+	/*
+	 * We have to sleep 2 seconds here because journald uses the pid to
+	 * connect our log messages to the systemd service.  This is critical
+	 * for capturing all the log messages if the service fails, because
+	 * failure analysis tools use the service name to gather log messages
+	 * for reporting.
+	 */
+	sleep(2);
+
+	/*
+	 * If we're being run as a service, the return code must fit the LSB
+	 * init script action error guidelines, which is to say that we
+	 * compress all errors to 1 ("generic or unspecified error", LSB 5.0
+	 * section 22.2) and hope the admin will scan the log for what actually
+	 * happened.
+	 */
+	return ret != 0 ? EXIT_FAILURE : EXIT_SUCCESS;
+}
+
 #endif /* __LIBFROG_SYSTEMD_H__ */
diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c
index 3dba972a7e8d2a..79937aa8cce4c4 100644
--- a/scrub/xfs_scrub.c
+++ b/scrub/xfs_scrub.c
@@ -19,6 +19,7 @@
 #include "unicrash.h"
 #include "progress.h"
 #include "libfrog/histogram.h"
+#include "libfrog/systemd.h"
 
 /*
  * XFS Online Metadata Scrub (and Repair)
@@ -866,8 +867,7 @@ main(
 	if (stdout_isatty && !progress_fp)
 		progress_fp = fdopen(1, "w+");
 
-	if (getenv("SERVICE_MODE"))
-		is_service = true;
+	is_service = systemd_is_service();
 
 	/* Initialize overall phase stats. */
 	error = phase_start(&all_pi, 0, NULL);
@@ -960,29 +960,15 @@ main(
 	hist_free(&ctx.datadev_hist);
 	hist_free(&ctx.rtdev_hist);
 
-	/*
-	 * If we're being run as a service, the return code must fit the LSB
-	 * init script action error guidelines, which is to say that we
-	 * compress all errors to 1 ("generic or unspecified error", LSB 5.0
-	 * section 22.2) and hope the admin will scan the log for what
-	 * actually happened.
-	 *
-	 * We have to sleep 2 seconds here because journald uses the pid to
-	 * connect our log messages to the systemd service.  This is critical
-	 * for capturing all the log messages if the scrub fails, because the
-	 * fail service uses the service name to gather log messages for the
-	 * error report.
-	 *
-	 * Note: We don't count a lack of kernel support as a service failure
-	 * because we haven't determined that there's anything wrong with the
-	 * filesystem.
-	 */
 	if (is_service) {
-		sleep(2);
+		/*
+		 * Note: We don't count a lack of kernel support as a service
+		 * failure because we haven't determined that there's anything
+		 * wrong with the filesystem.
+		 */
 		if (!ctx.scrub_setup_succeeded)
-			return 0;
-		if (ret != SCRUB_RET_SUCCESS)
-			return 1;
+			ret = 0;
+		return systemd_service_exit(ret);
 	}
 
 	return ret;


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 05/26] man2: document the healthmon ioctl
  2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
                     ` (3 preceding siblings ...)
  2026-03-03  0:34   ` [PATCH 04/26] libfrog: hoist a couple of service helper functions Darrick J. Wong
@ 2026-03-03  0:35   ` Darrick J. Wong
  2026-03-03 15:46     ` Christoph Hellwig
  2026-03-03  0:35   ` [PATCH 06/26] man2: document the media verification ioctl Darrick J. Wong
                     ` (20 subsequent siblings)
  25 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:35 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Document the XFS_IOC_HEALTH_MONITOR and
XFS_IOC_HEALTH_FD_ON_MONITORED_FS ioctls.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 man/man2/ioctl_xfs_health_fd_on_monitored_fs.2 |   75 ++++
 man/man2/ioctl_xfs_health_monitor.2            |  464 ++++++++++++++++++++++++
 2 files changed, 539 insertions(+)
 create mode 100644 man/man2/ioctl_xfs_health_fd_on_monitored_fs.2
 create mode 100644 man/man2/ioctl_xfs_health_monitor.2


diff --git a/man/man2/ioctl_xfs_health_fd_on_monitored_fs.2 b/man/man2/ioctl_xfs_health_fd_on_monitored_fs.2
new file mode 100644
index 00000000000000..bbc5ce9bbabf53
--- /dev/null
+++ b/man/man2/ioctl_xfs_health_fd_on_monitored_fs.2
@@ -0,0 +1,75 @@
+.\" Copyright (c) 2025-2026, Oracle.  All rights reserved.
+.\"
+.\" %%%LICENSE_START(GPLv2+_DOC_FULL)
+.\" SPDX-License-Identifier: GPL-2.0+
+.\" %%%LICENSE_END
+.TH IOCTL-XFS-HEALTH-FD-ON-MONITORED-FS 2 2026-01-04 "XFS"
+.SH NAME
+ioctl_xfs_health_fd_on_monitored_fs \- check if the given fd belongs to the same fs being monitored
+.SH SYNOPSIS
+.br
+.B #include <xfs/xfs_fs.h>
+.PP
+.BI "int ioctl(int " healthmon_fd ", XFS_IOC_HEALTH_FD_ON_MONITORED_FS, struct xfs_health_file_on_monitored_fs *" arg );
+.SH DESCRIPTION
+This XFS healthmon fd ioctl asks the kernel driver if the file descriptor
+passed in via
+.I arg
+points to a file on the same filesystem that is being monitored by
+.IR healthmon_fd .
+The file descriptor is conveyed in a structure of the following form:
+.PP
+.in +4n
+.nf
+struct xfs_health_file_on_monitored_fs {
+	__s32 fd;
+	__u32 flags;
+};
+.fi
+.in
+.PP
+The field
+.I flags
+must be zero.
+.PP
+The field
+.I fd
+is a descriptor of an open file.
+.PP
+The argument
+.I healthmon_fd
+must be a file opened via the
+.B XFS_IOC_HEALTH_MONITOR
+ioctl.
+.SH RETURN VALUE
+On error, \-1 is returned, and
+.I errno
+is set to indicate the error.
+If the file descriptor points to a file on the same filesystem that is being
+monitored, 0 is returned.
+.PP
+.SH ERRORS
+Error codes can be one of, but are not limited to, the following:
+.TP
+.B ESTALE
+The open file is not on the same filesystem that is being monitored.
+.TP
+.B EINVAL
+One or more of the arguments specified is invalid.
+.TP
+.B EBADF
+.I arg.fd
+does not refer to an open file.
+.TP
+.B EFAULT
+The
+.I arg
+structure could not be copied into the kernel.
+.TP
+.B ENOTTY
+.I healthmon_fd
+is not a XFS health monitoring file.
+.SH CONFORMING TO
+This API is specific to XFS filesystem on the Linux kernel.
+.SH SEE ALSO
+.BR ioctl_xfs_health_monitor (2)
diff --git a/man/man2/ioctl_xfs_health_monitor.2 b/man/man2/ioctl_xfs_health_monitor.2
new file mode 100644
index 00000000000000..269c434515d960
--- /dev/null
+++ b/man/man2/ioctl_xfs_health_monitor.2
@@ -0,0 +1,464 @@
+.\" Copyright (c) 2025-2026, Oracle.  All rights reserved.
+.\"
+.\" %%%LICENSE_START(GPLv2+_DOC_FULL)
+.\" SPDX-License-Identifier: GPL-2.0+
+.\" %%%LICENSE_END
+.TH IOCTL-XFS-HEALTH-MONITOR 2 2026-01-04 "XFS"
+.SH NAME
+ioctl_xfs_health_monitor \- read filesystem health events from the kernel
+.SH SYNOPSIS
+.br
+.B #include <xfs/xfs_fs.h>
+.PP
+.BI "int ioctl(int " dest_fd ", XFS_IOC_HEALTH_MONITOR, struct xfs_health_monitor *" arg );
+.SH DESCRIPTION
+This XFS ioctl asks the kernel driver to create a pseudo-file from which
+information about adverse filesystem health events can be read.
+This new file will be installed into the file descriptor table of the calling
+process as a read-only file, and will have the close-on-exec flag set.
+.PP
+The specific behaviors of this health monitor file are requested via a
+structure of the following form:
+.PP
+.in +4n
+.nf
+struct xfs_health_monitor {
+	__u64 flags;
+	__u8  format;
+	__u8  pad[23];
+};
+.fi
+.in
+.PP
+The field
+.I pad
+must be zero.
+.PP
+The field
+.I format
+controls the format of the event data that can be read:
+.RS 0.4i
+.TP
+.B XFS_HEALTH_MONITOR_FMT_V0
+Event data will be presented in discrete objects of type struct
+xfs_health_monitor_event.
+See below for more information.
+.RE
+
+.PD 1
+.PP
+The field
+.I flags
+control the behavior of the monitor.
+.RS 0.4i
+.TP
+.B XFS_HEALTH_MONITOR_VERBOSE
+Return all health events, including affirmations of healthy metadata.
+.RE
+.SH RETURN VALUE
+On error, \-1 is returned, and
+.I errno
+is set to indicate the error.
+Otherwise, the return value is a new file descriptor.
+.PP
+.SH ERRORS
+Error codes can be one of, but are not limited to, the following:
+.TP
+.B EEXIST
+Health monitoring is already active for this filesystem.
+.TP
+.B EPERM
+The caller does not have permission to open a health monitor.
+Calling programs must have administrative capability, run in the initial user
+namespace, and the
+.I fd
+passed to ioctl must be the root directory of an XFS filesystem.
+.TP
+.B EINVAL
+One or more of the arguments specified is invalid.
+.TP
+.B EFAULT
+The argument could not be copied into the kernel.
+.TP
+.B ENOMEM
+There was not sufficient memory to construct the health monitor.
+.SH EVENT FORMAT
+Calling programs retrieve XFS health events by calling
+.BR read (2)
+on the returned file descriptor.
+The read buffer must be large enough to hold at least one event object.
+Partial objects will not be returned; instead, a short read will occur.
+
+Events will be returned in the following format:
+
+.PP
+.in +4n
+.nf
+struct xfs_health_monitor_event {
+	__u32	domain;
+	__u32	type;
+	__u64	time_ns;
+
+	union {
+		struct xfs_health_monitor_lost lost;
+		struct xfs_health_monitor_fs fs;
+		struct xfs_health_monitor_group group;
+		struct xfs_health_monitor_inode inode;
+		struct xfs_health_monitor_shutdown shutdown;
+		struct xfs_health_monitor_media media;
+		struct xfs_health_monitor_filerange filerange;
+	} e;
+
+	__u64	pad[2];
+};
+.fi
+.in
+.PP
+The field
+.I time_ns
+records the timestamp at which the health event was generated, in units of
+nanoseconds since the Unix epoch.
+.PP
+The field
+.I pad
+will be zero.
+.PP
+The field
+.I domain
+indicates the scope of the filesystem affected by the event:
+.RS 0.4i
+.TP
+.B XFS_HEALTH_MONITOR_DOMAIN_MOUNT
+The entire filesystem is affected.
+.TP
+.B XFS_HEALTH_MONITOR_DOMAIN_FS
+Metadata concerning the entire filesystem is affected.
+Details are available through the
+.I fs
+field.
+.TP
+.B XFS_HEALTH_MONITOR_DOMAIN_AG
+Metadata concerning a specific allocation group is affected.
+Details are available through the
+.I group
+field.
+.TP
+.B XFS_HEALTH_MONITOR_DOMAIN_RTGROUP
+Metadata concerning a specific realtime allocation group is affected.
+Details are available through the
+.I group
+field.
+.TP
+.B XFS_HEALTH_MONITOR_DOMAIN_INODE
+File metadata is affected.
+Details are available through the
+.I inode
+field.
+.TP
+.B XFS_HEALTH_MONITOR_DOMAIN_DATADEV
+The main data volume is affected.
+Details are available through the
+.I media
+field.
+.TP
+.B XFS_HEALTH_MONITOR_DOMAIN_RTDEV
+The realtime volume is affected.
+Details are available through the
+.I media
+field.
+.TP
+.B XFS_HEALTH_MONITOR_DOMAIN_LOGDEV
+The external log is affected.
+Details are available through the
+.I media
+field.
+.TP
+.B XFS_HEALTH_MONITOR_DOMAIN_FILERANGE
+File data is affected.
+Details are available through the
+.I filerange
+field.
+.RE
+
+.PP
+The field
+.I type
+indicates what was affected by a health event:
+.RS 0.4i
+.PP
+The following types apply to events from the
+.B MOUNT
+domain.
+.RS 0.4i
+.TP
+.B XFS_HEALTH_MONITOR_TYPE_RUNNING
+This filesystem health monitor is now running.
+.TP
+.B XFS_HEALTH_MONITOR_TYPE_LOST
+Health events were lost.
+Details are available through the
+.I lost
+field.
+.TP
+.B XFS_HEALTH_MONITOR_TYPE_UNMOUNT
+The filesystem is being unmounted.
+.TP
+.B XFS_HEALTH_MONITOR_TYPE_SHUTDOWN
+The filesystem has shut down due to problems.
+Details are available through the
+.I shutdown
+field.
+.RE
+.PP
+The following three types apply to events from the
+.BR FS ,
+.BR AG ,
+.BR RTGROUP ,
+and
+.B INODE
+domains.
+.RS 0.4i
+.TP
+.B XFS_HEALTH_MONITOR_TYPE_SICK
+Filesystem metadata has been scanned by online fsck and found to be corrupt.
+.TP
+.B XFS_HEALTH_MONITOR_TYPE_CORRUPT
+A metadata corruption problem was encountered during a filesystem operation
+outside of fsck.
+.TP
+.B XFS_HEALTH_MONITOR_TYPE_HEALTHY
+Filesystem metadata has either been scanned by online fsck and found to be
+in good condition, or it has been repaired to good condition.
+.RE
+.PP
+The following type applies to events from the
+.BR DATADEV ,
+.BR RTDEV ,
+and
+.B LOGDEV
+domains.
+.RS 0.4i
+.TP
+.B XFS_HEALTH_MONITOR_TYPE_MEDIA_ERROR
+A media error has been observed on one of the storage devices that can be
+attached to an XFS filesystem.
+.RE
+.PP
+The following types apply to events from the
+.B FILERANGE
+domain.
+.RS 0.4i
+.TP
+.B XFS_HEALTH_MONITOR_TYPE_BUFREAD
+An attempt to read (or readahead) from a file failed with an I/O error.
+.TP
+.B XFS_HEALTH_MONITOR_TYPE_BUFWRITE
+An attempt to write dirty data to storage failed with an I/O error.
+.TP
+.B XFS_HEALTH_MONITOR_TYPE_DIOREAD
+A direct read of file data from storage failed with an I/O error.
+.TP
+.B XFS_HEALTH_MONITOR_TYPE_DIOWRITE
+A direct write of file data to storage failed with an I/O error.
+.TP
+.B XFS_HEALTH_MONITOR_TYPE_DATALOST
+A latent media error was discovered on the storage backing part of this file.
+.RE
+.RE
+
+.PP
+The union
+.I e
+contains further details about the health event:
+
+.RS 0.4i
+.PP
+The kernel will use no more than 32KiB of memory per monitoring file to queue
+health events.
+If this limit is exceeded, an event will be generated to describe how many
+events were lost:
+
+.in +4n
+.nf
+struct xfs_health_monitor_lost {
+	__u64	count;
+};
+.fi
+.in
+.PP
+The
+.I count
+field records the number of events lost.
+
+.PP
+If whole-filesystem metadata experiences a health event, the exact type of
+that metadata is recorded as follows:
+
+.in +4n
+.nf
+struct xfs_health_monitor_fs {
+	__u32	mask;
+};
+.fi
+.in
+.PP
+The
+.I mask
+field will contain
+.I XFS_FSOP_GEOM_SICK_*
+flags that are documented in the
+.BR ioctl_xfs_fsgeometry (2)
+manual page.
+
+.PP
+If an allocation group (realtime or data) experiences a health event,
+the exact type and location of the metadata is recorded as follows:
+
+.in +4n
+.nf
+struct xfs_health_monitor_group {
+	__u32	mask;
+	__u32	gno;
+};
+.fi
+.in
+.PP
+The
+.I mask
+field will contain
+.I XFS_AG_SICK_*
+flags that are documented in the
+.BR ioctl_xfs_ag_geometry (2)
+manual page, or the
+.I XFS_RTGROUP_SICK_*
+flags that are documented by the
+.BR ioctl_xfs_rtgroup_geometry (2)
+manual page.
+.PP
+The
+.I gno
+field will contain the group number.
+
+.PP
+If a file experiences a health event, the exact type and handle to the file
+is recorded as follows:
+
+.in +4n
+.nf
+struct xfs_health_monitor_inode {
+	__u32	mask;
+	__u32	gen;
+	__u64	ino;
+};
+.fi
+.in
+.PP
+The
+.I mask
+field will contain
+.I XFS_BS_SICK_*
+flags that are documented by the
+.BR ioctl_xfs_bulkstat (2)
+manual page.
+.PP
+The
+.I ino
+and
+.I gen
+fields describe a handle to the affected file.
+
+.PP
+If the filesystem shuts down abnormally, the exact reasons are recorded as
+follows:
+
+.in +4n
+.nf
+struct xfs_health_monitor_shutdown {
+	__u32	reasons;
+};
+.fi
+.in
+.PP
+The
+.I reasons
+field is a combination of the following values:
+.RS 0.4i
+.TP
+.B XFS_HEALTH_SHUTDOWN_META_IO_ERROR
+Metadata I/O errors were encountered.
+.TP
+.B XFS_HEALTH_SHUTDOWN_LOG_IO_ERROR
+Log I/O errors were encountered.
+.TP
+.B XFS_HEALTH_SHUTDOWN_FORCE_UMOUNT
+The filesystem was forcibly shut down by an administrator.
+.TP
+.B XFS_HEALTH_SHUTDOWN_CORRUPT_INCORE
+In-memory metadata are corrupt.
+.TP
+.B XFS_HEALTH_SHUTDOWN_CORRUPT_ONDISK
+On-disk metadata are corrupt.
+.TP
+.B XFS_HEALTH_SHUTDOWN_DEVICE_REMOVED
+Storage devices were removed.
+.RE
+
+.PP
+If a media error is discovered on the storage device, the exact location is
+recorded as follows:
+
+.in +4n
+.nf
+struct xfs_health_monitor_media {
+	__u64	daddr;
+	__u64	bbcount;
+};
+.fi
+.in
+.PP
+The
+.I daddr
+and
+.I bbcount
+fields describe the range of the storage that were lost.
+Both are provided in units of 512-byte blocks.
+
+.PP
+If a problem is discovered with regular file data, the handle of the file
+and the exact range of the file are recorded as follows:
+
+.in +4n
+.nf
+struct xfs_health_monitor_filerange {
+	__u64	pos;
+	__u64	len;
+	__u64	ino;
+	__u32	gen;
+	__u32	error;
+};
+.fi
+.in
+.PP
+The
+.I ino
+and
+.I gen
+fields describe a handle to the affected file.
+The
+.I pos
+and
+.I len
+fields describe the range of the file data that are affected.
+Both are provided in units of bytes.
+.PP
+The
+.I error
+field describes the error that occurred.
+See the
+.BR errno (3)
+manual page for more information.
+.RE
+.SH CONFORMING TO
+This API is specific to XFS filesystem on the Linux kernel.
+.SH SEE ALSO
+.BR ioctl_xfs_health_samefs (2)


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 06/26] man2: document the media verification ioctl
  2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
                     ` (4 preceding siblings ...)
  2026-03-03  0:35   ` [PATCH 05/26] man2: document the healthmon ioctl Darrick J. Wong
@ 2026-03-03  0:35   ` Darrick J. Wong
  2026-03-03 15:46     ` Christoph Hellwig
  2026-03-03  0:35   ` [PATCH 07/26] xfs_io: monitor filesystem health events Darrick J. Wong
                     ` (19 subsequent siblings)
  25 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:35 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Document XFS_IOC_VERIFY_MEDIA, which is a new ioctl for xfs_scrub to
perform media scans on the disks underneath the filesystem.  This will
enable media errors to be reported to xfs_healer and fsnotify.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 man/man2/ioctl_xfs_verify_media.2 |  185 +++++++++++++++++++++++++++++++++++++
 1 file changed, 185 insertions(+)
 create mode 100644 man/man2/ioctl_xfs_verify_media.2


diff --git a/man/man2/ioctl_xfs_verify_media.2 b/man/man2/ioctl_xfs_verify_media.2
new file mode 100644
index 00000000000000..bd0d4579f5a364
--- /dev/null
+++ b/man/man2/ioctl_xfs_verify_media.2
@@ -0,0 +1,185 @@
+.\" Copyright (c) 2025-2026, Oracle.  All rights reserved.
+.\"
+.\" %%%LICENSE_START(GPLv2+_DOC_FULL)
+.\" SPDX-License-Identifier: GPL-2.0+
+.\" %%%LICENSE_END
+.TH IOCTL-XFS-VERIFY-MEDIA 2 2026-01-09 "XFS"
+.SH NAME
+ioctl_xfs_verify_media \- verify the media of the devices backing XFS
+.SH SYNOPSIS
+.br
+.B #include <xfs/xfs_fs.h>
+.PP
+.BI "int ioctl(int " fd ", XFS_IOC_VERIFY_MEDIA, struct xfs_verify_media *" arg );
+.SH DESCRIPTION
+Verify the media of a storage device backing an XFS filesystem.
+If errors are found, report the error to the kernel so that it can generate
+health events for the health monitoring system and fsnotify.
+The verification request is conveyed in a structure of the following form:
+.PP
+.in +4n
+.nf
+struct xfs_verify_error {
+	__u32	me_dev;
+	__u32	me_flags;
+	__u64	me_start_daddr;
+	__u64	me_end_daddr;
+	__u32	me_ioerror;
+	__u32	me_pad;
+};
+.fi
+.in
+.PP
+The field
+.I me_pad
+must be zero.
+.PP
+The field
+.I me_ioerror
+will be set if the ioctl returns success.
+.PP
+The fields
+.I me_start_daddr
+and
+.I me_end_daddr
+are the range of the storage device to verify.
+Both values must be in units of 512-byte blocks.
+The
+.I me_start_daddr
+field is inclusive, and the
+.I me_end_daddr
+field is exclusive.
+If
+.I me_end_daddr
+is larger than the size of the device, the kernel will set it to the size of
+the device.
+
+If the system call returns success and any part of the storage device range was
+successfully verified, the
+.I me_start_daddr
+field will be updated to reflect the successful verification.
+If after this update the
+.I me_start_daddr
+is equal to
+.IR me_end_daddr ,
+then the entire range was verified successfully.
+
+If not, then a media error was encountered and the caller should generate a
+series of secondary calls to this ioctl with smaller ranges to discover the
+exact location and type of media error.
+The type of media error will be written to the
+.I me_ioerror
+field.
+
+.PP
+The field
+.I me_dev
+must be one of the following values:
+.RS 0.4i
+.TP
+.B XFS_DEV_DATA
+Verify the data device.
+.TP
+.B XFS_DEV_LOG
+Verify the external log device.
+.TP
+.B XFS_DEV_RT
+Verify the realtime device.
+.RE
+.PP
+The field
+.I me_flags
+is a bitmask of one of the following values:
+.RS 0.4i
+.TP
+.B XFS_VERIFY_MEDIA_REPORT
+Report all media errors to fsnotify.
+.RE
+
+The
+.IR me_max_io_size
+field, if nonzero, will be used as advice for the maximum size of the IO to
+send to the device.
+
+The
+.I me_rest_us
+field will cause the kernel to pause for this many microseconds between IO
+requests.
+
+.SH RETURN VALUE
+On runtime error, \-1 is returned, and
+.I errno
+is set to indicate the error.
+If 0 is returned, then
+.I start_daddr
+or
+.I ioerror
+will be updated.
+.PP
+.SH ERRORS
+Error codes can be one of, but are not limited to, the following:
+.TP
+.B EPERM
+The calling process does not have sufficient privilege.
+.TP
+.B EINVAL
+One or more of the arguments specified is invalid.
+.TP
+.B EFAULT
+The
+.I arg
+structure could not be copied into the kernel.
+.TP
+.B ENODEV
+The device is not present.
+.TP
+.B ENOMEM
+There was not enough memory to perform the verification.
+
+.SH I/O ERRORS
+The
+.I ioerror
+field could be set to one of the following:
+.TP
+.B 0
+The verification I/O succeeded.
+.TP
+.B EOPNOTSUPP
+.TP
+.B ETIMEDOUT
+The kernel timed out the verification I/O command.
+.TP
+.B ENOLINK
+The transportation link to the storage device was down temporarily.
+.TP
+.B EREMOTEIO
+The storage target controller suffered a critical error.
+.TP
+.B ENODATA
+The storage target media suffered a critical error.
+.TP
+.B EILSEQ
+Storage protection metadata did not validate successfully.
+.TP
+.B ENOMEM
+There was not enough memory to allocate an I/O request.
+.TP
+.B ENODEV
+The storage device is offline.
+.TP
+.B ETIME
+The storage device timed out the I/O command.
+.TP
+.B EINVAL
+The I/O request was rejected by the device for being invalid.
+.TP
+.B EIO
+An I/O error occurred but no specific details are available.
+.RE
+.PP
+This list is not exhaustive and may grow in the future.
+
+.SH CONFORMING TO
+This API is specific to XFS filesystem on the Linux kernel.
+.SH SEE ALSO
+.BR ioctl_xfs_health_monitor (2)


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 07/26] xfs_io: monitor filesystem health events
  2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
                     ` (5 preceding siblings ...)
  2026-03-03  0:35   ` [PATCH 06/26] man2: document the media verification ioctl Darrick J. Wong
@ 2026-03-03  0:35   ` Darrick J. Wong
  2026-03-03 15:46     ` Christoph Hellwig
  2026-03-03  0:35   ` [PATCH 08/26] xfs_io: add a media verify command Darrick J. Wong
                     ` (18 subsequent siblings)
  25 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:35 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Create a subcommand to monitor for health events generated by the kernel.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 io/io.h           |    1 
 io/Makefile       |    1 
 io/healthmon.c    |  186 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 io/init.c         |    1 
 man/man8/xfs_io.8 |   25 +++++++
 5 files changed, 214 insertions(+)
 create mode 100644 io/healthmon.c


diff --git a/io/io.h b/io/io.h
index 35fb8339eeb5aa..2f5262bce6acbb 100644
--- a/io/io.h
+++ b/io/io.h
@@ -162,3 +162,4 @@ extern void		bulkstat_init(void);
 void			exchangerange_init(void);
 void			fsprops_init(void);
 void			aginfo_init(void);
+void			healthmon_init(void);
diff --git a/io/Makefile b/io/Makefile
index 444e2d6a557d5d..8e3783353a52b5 100644
--- a/io/Makefile
+++ b/io/Makefile
@@ -25,6 +25,7 @@ CFILES = \
 	fsuuid.c \
 	fsync.c \
 	getrusage.c \
+	healthmon.c \
 	imap.c \
 	init.c \
 	inject.c \
diff --git a/io/healthmon.c b/io/healthmon.c
new file mode 100644
index 00000000000000..5bf54ff6c717e6
--- /dev/null
+++ b/io/healthmon.c
@@ -0,0 +1,186 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (c) 2024-2026 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#include "libxfs.h"
+#include "libfrog/fsgeom.h"
+#include "libfrog/paths.h"
+#include "libfrog/healthevent.h"
+#include "command.h"
+#include "init.h"
+#include "io.h"
+
+static void
+healthmon_help(void)
+{
+	printf(_(
+"Monitor filesystem health events"
+"\n"
+"-c             Replace the open file with the monitor file.\n"
+"-d delay_ms    Sleep this many milliseconds between reads.\n"
+"-p             Only probe for the existence of the ioctl.\n"
+"-v             Request all events.\n"
+"\n"));
+}
+
+static inline int
+monitor_sleep(
+	int			delay_ms)
+{
+	struct timespec		ts;
+
+	if (!delay_ms)
+		return 0;
+
+	ts.tv_sec = delay_ms / 1000;
+	ts.tv_nsec = (delay_ms % 1000) * 1000000;
+
+	return nanosleep(&ts, NULL);
+}
+
+static int
+monitor(
+	size_t			bufsize,
+	bool			consume,
+	int			delay_ms,
+	bool			verbose,
+	bool			only_probe)
+{
+	struct xfs_health_monitor	hmo = {
+		.format		= XFS_HEALTH_MONITOR_FMT_V0,
+	};
+	struct hme_prefix	pfx;
+	void			*buf;
+	ssize_t			bytes_read;
+	int			mon_fd;
+	int			ret = 1;
+
+	hme_prefix_init(&pfx, file->name);
+
+	if (verbose)
+		hmo.flags |= XFS_HEALTH_MONITOR_ALL;
+
+	mon_fd = ioctl(file->fd, XFS_IOC_HEALTH_MONITOR, &hmo);
+	if (mon_fd < 0) {
+		perror("XFS_IOC_HEALTH_MONITOR");
+		return 1;
+	}
+
+	if (only_probe) {
+		ret = 0;
+		goto out_mon;
+	}
+
+	buf = malloc(bufsize);
+	if (!buf) {
+		perror("malloc");
+		goto out_mon;
+	}
+
+	if (consume) {
+		close(file->fd);
+		file->fd = mon_fd;
+	}
+
+	monitor_sleep(delay_ms);
+	while ((bytes_read = read(mon_fd, buf, bufsize)) > 0) {
+		struct xfs_health_monitor_event *hme = buf;
+
+		while (bytes_read >= sizeof(*hme)) {
+			hme_report_event(&pfx, hme);
+			hme++;
+			bytes_read -= sizeof(*hme);
+		}
+		if (bytes_read > 0) {
+			printf("healthmon: %zu bytes remain?\n", bytes_read);
+			fflush(stdout);
+		}
+
+		monitor_sleep(delay_ms);
+	}
+	if (bytes_read < 0) {
+		perror("healthmon");
+		goto out_buf;
+	}
+
+	ret = 0;
+
+out_buf:
+	free(buf);
+out_mon:
+	close(mon_fd);
+	return ret;
+}
+
+static int
+healthmon_f(
+	int			argc,
+	char			**argv)
+{
+	size_t			bufsize = 4096;
+	bool			consume = false;
+	bool			verbose = false;
+	bool			only_probe = false;
+	int			delay_ms = 0;
+	int			c;
+
+	while ((c = getopt(argc, argv, "b:cd:pv")) != EOF) {
+		switch (c) {
+		case 'b':
+			errno = 0;
+			c = atoi(optarg);
+			if (c < 0 || errno) {
+				printf("%s: bufsize must be positive\n",
+						optarg);
+				exitcode = 1;
+				return 0;
+			}
+			bufsize = c;
+			break;
+		case 'c':
+			consume = true;
+			break;
+		case 'd':
+			errno = 0;
+			delay_ms = atoi(optarg);
+			if (delay_ms < 0 || errno) {
+				printf("%s: delay must be positive msecs\n",
+						optarg);
+				exitcode = 1;
+				return 0;
+			}
+			break;
+		case 'p':
+			only_probe = true;
+			break;
+		case 'v':
+			verbose = true;
+			break;
+		default:
+			exitcode = 1;
+			healthmon_help();
+			return 0;
+		}
+	}
+
+	return monitor(bufsize, consume, delay_ms, verbose, only_probe);
+}
+
+static struct cmdinfo healthmon_cmd = {
+	.name		= "healthmon",
+	.cfunc		= healthmon_f,
+	.argmin		= 0,
+	.argmax		= -1,
+	.flags		= CMD_FLAG_ONESHOT | CMD_NOMAP_OK,
+	.args		= "[-c] [-d delay_ms] [-v]",
+	.help		= healthmon_help,
+};
+
+void
+healthmon_init(void)
+{
+	healthmon_cmd.oneline = _("monitor filesystem health events");
+
+	add_command(&healthmon_cmd);
+}
diff --git a/io/init.c b/io/init.c
index 49e9e7cb88214b..cb5573f45ccfbc 100644
--- a/io/init.c
+++ b/io/init.c
@@ -92,6 +92,7 @@ init_commands(void)
 	crc32cselftest_init();
 	exchangerange_init();
 	fsprops_init();
+	healthmon_init();
 }
 
 /*
diff --git a/man/man8/xfs_io.8 b/man/man8/xfs_io.8
index 0a673322fde3a1..f7f2956a54a7aa 100644
--- a/man/man8/xfs_io.8
+++ b/man/man8/xfs_io.8
@@ -1356,6 +1356,31 @@ .SH FILESYSTEM COMMANDS
 .B thaw
 Undo the effects of a filesystem freeze operation.
 Only available in expert mode and requires privileges.
+.TP
+.BI "healthmon [ \-c " bufsize " ] [ \-c ] [ \-d " delay_ms " ] [ \-p ] [ \-v ]"
+Watch for filesystem health events and write them to the console.
+.RE
+.RS 1.0i
+.PD 0
+.TP
+.BI "\-b " bufsize
+Use a buffer of this size to read events from the kernel.
+.TP
+.BI \-c
+Close the open file and replace it with the monitor file.
+.TP
+.BI "\-d " delay_ms
+Sleep for this long between read attempts.
+.TP
+.B \-p
+Probe for the existence of the functionality by opening the monitoring fd and
+closing it immediately.
+.TP
+.BI \-v
+Request all health events, even if nothing changed.
+.PD
+.RE
+
 .TP
 .BI "inject [ " tag " ]"
 Inject errors into a filesystem to observe filesystem behavior at


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 08/26] xfs_io: add a media verify command
  2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
                     ` (6 preceding siblings ...)
  2026-03-03  0:35   ` [PATCH 07/26] xfs_io: monitor filesystem health events Darrick J. Wong
@ 2026-03-03  0:35   ` Darrick J. Wong
  2026-03-03 15:46     ` Christoph Hellwig
  2026-03-03  0:36   ` [PATCH 09/26] xfs_healer: create daemon to listen for health events Darrick J. Wong
                     ` (17 subsequent siblings)
  25 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:35 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Add a subcommand to invoke the media verification ioctl to make sure
that we can actually check the storage underneath an xfs filesystem.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 io/io.h           |    1 
 io/Makefile       |    3 +
 io/init.c         |    1 
 io/verify_media.c |  180 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 man/man8/xfs_io.8 |   42 ++++++++++++
 5 files changed, 226 insertions(+), 1 deletion(-)
 create mode 100644 io/verify_media.c


diff --git a/io/io.h b/io/io.h
index 2f5262bce6acbb..0f12b3cfed5e76 100644
--- a/io/io.h
+++ b/io/io.h
@@ -163,3 +163,4 @@ void			exchangerange_init(void);
 void			fsprops_init(void);
 void			aginfo_init(void);
 void			healthmon_init(void);
+void			verifymedia_init(void);
diff --git a/io/Makefile b/io/Makefile
index 8e3783353a52b5..79d5e172b8f31f 100644
--- a/io/Makefile
+++ b/io/Makefile
@@ -51,7 +51,8 @@ CFILES = \
 	sync.c \
 	sync_file_range.c \
 	truncate.c \
-	utimes.c
+	utimes.c \
+	verify_media.c
 
 LLDLIBS = $(LIBXCMD) $(LIBHANDLE) $(LIBFROG) $(LIBPTHREAD) $(LIBUUID)
 LTDEPENDENCIES = $(LIBXCMD) $(LIBHANDLE) $(LIBFROG)
diff --git a/io/init.c b/io/init.c
index cb5573f45ccfbc..f2a551ef559200 100644
--- a/io/init.c
+++ b/io/init.c
@@ -93,6 +93,7 @@ init_commands(void)
 	exchangerange_init();
 	fsprops_init();
 	healthmon_init();
+	verifymedia_init();
 }
 
 /*
diff --git a/io/verify_media.c b/io/verify_media.c
new file mode 100644
index 00000000000000..e67567f675abfd
--- /dev/null
+++ b/io/verify_media.c
@@ -0,0 +1,180 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (c) 2026 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#include "command.h"
+#include "input.h"
+#include "init.h"
+#include "io.h"
+
+static void
+verifymedia_help(void)
+{
+	printf(_(
+"\n"
+" Verify the media of the devices backing the filesystem.\n"
+"\n"
+" -d -- Verify the data device (default).\n"
+" -l -- Verify the log device.\n"
+" -r -- Verify the realtime device.\n"
+" -R -- Report media errors to fsnotify.\n"
+" -s -- Sleep this many usecs between IOs.\n"
+"\n"
+" start is the byte offset of the start of the range to verify.  If the start\n"
+" is specified, the end may (optionally) be specified as well."
+"\n"
+" end is the byte offset of the end of the range to verify.\n"
+"\n"
+" If neither start nor end are specified, the media verification will\n"
+" check the entire device."
+"\n"));
+}
+
+static int
+verifymedia_f(
+	int			argc,
+	char			**argv)
+{
+	xfs_daddr_t		orig_start_daddr = 0;
+	struct xfs_verify_media me = {
+		.me_start_daddr	= orig_start_daddr,
+		.me_end_daddr	= ~0ULL,
+		.me_dev		= XFS_DEV_DATA,
+	};
+	struct timeval		t1, t2;
+	long long		l;
+	size_t			fsblocksize, fssectsize;
+	const char		*verifydev = _("datadev");
+	int			c, ret;
+
+	init_cvtnum(&fsblocksize, &fssectsize);
+
+	while ((c = getopt(argc, argv, "b:dlrRs:")) != EOF) {
+		switch (c) {
+		case 'd':
+			me.me_dev = XFS_DEV_DATA;
+			verifydev = _("datadev");
+			break;
+		case 'l':
+			me.me_dev = XFS_DEV_LOG;
+			verifydev = _("logdev");
+			break;
+		case 'r':
+			me.me_dev = XFS_DEV_RT;
+			verifydev = _("rtdev");
+			break;
+		case 'b':
+			l = cvtnum(fsblocksize, fssectsize, optarg);
+			if (l < 0 || l > UINT_MAX) {
+				printf("non-numeric maxio argument -- %s\n",
+						optarg);
+				exitcode = 1;
+				return 0;
+			}
+			me.me_max_io_size = l;
+			break;
+		case 'R':
+			me.me_flags |= XFS_VERIFY_MEDIA_REPORT;
+			break;
+		case 's':
+			l = atoi(optarg);
+			if (l < 0) {
+				printf("non-numeric rest_us argument -- %s\n",
+						optarg);
+				exitcode = 1;
+				return 0;
+			}
+			me.me_rest_us = l;
+			break;
+		default:
+			verifymedia_help();
+			exitcode = 1;
+			return 0;
+		}
+	}
+
+	/* Range start (optional) */
+	if (optind < argc) {
+		l = cvtnum(fsblocksize, fssectsize, argv[optind]);
+		if (l < 0) {
+			printf("non-numeric start argument -- %s\n",
+					argv[optind]);
+			exitcode = 1;
+			return 0;
+		}
+
+		orig_start_daddr = l / 512;
+		me.me_start_daddr = orig_start_daddr;
+		optind++;
+	}
+
+	/* Range end (optional if range start was specified) */
+	if (optind < argc) {
+		l = cvtnum(fsblocksize, fssectsize, argv[optind]);
+		if (l < 0) {
+			printf("non-numeric end argument -- %s\n",
+					argv[optind]);
+			exitcode = 1;
+			return 0;
+		}
+
+		me.me_end_daddr = ((l + 511) / 512);
+		optind++;
+	}
+
+	if (optind < argc) {
+		printf("too many arguments -- %s\n", argv[optind]);
+		exitcode = 1;
+		return 0;
+	}
+
+	gettimeofday(&t1, NULL);
+	ret = ioctl(file->fd, XFS_IOC_VERIFY_MEDIA, &me);
+	gettimeofday(&t2, NULL);
+	t2 = tsub(t2, t1);
+	if (ret < 0) {
+		fprintf(stderr,
+ "%s: ioctl(XFS_IOC_VERIFY_MEDIA) [\"%s\"]: %s\n",
+				progname, file->name, strerror(errno));
+		exitcode = 1;
+		return 0;
+	}
+
+	if (me.me_ioerror) {
+		fprintf(stderr,
+ "%s: verify error at offset %llu length %llu: %s\n",
+				verifydev,
+				BBTOB(me.me_start_daddr),
+				BBTOB(me.me_end_daddr - me.me_start_daddr),
+				strerror(me.me_ioerror));
+	} else {
+		unsigned long long	total;
+
+		if (me.me_end_daddr > orig_start_daddr)
+			total = BBTOB(me.me_end_daddr - orig_start_daddr);
+		else
+			total = 0;
+		report_io_times("verified", &t2, BBTOB(orig_start_daddr),
+				BBTOB(me.me_start_daddr - orig_start_daddr),
+				total, 1, false);
+	}
+
+	return 0;
+}
+
+static struct cmdinfo verifymedia_cmd = {
+	.name		= "verifymedia",
+	.cfunc		= verifymedia_f,
+	.argmin		= 0,
+	.argmax		= -1,
+	.flags		= CMD_FLAG_ONESHOT | CMD_NOMAP_OK,
+	.args		= "[-lr] [start [end]]",
+	.help		= verifymedia_help,
+};
+
+void
+verifymedia_init(void)
+{
+	add_command(&verifymedia_cmd);
+}
diff --git a/man/man8/xfs_io.8 b/man/man8/xfs_io.8
index f7f2956a54a7aa..2090cd4c0b2641 100644
--- a/man/man8/xfs_io.8
+++ b/man/man8/xfs_io.8
@@ -1389,6 +1389,48 @@ .SH FILESYSTEM COMMANDS
 argument, displays the list of error tags available.
 Only available in expert mode and requires privileges.
 
+.TP
+.BI "verifymedia [ \-bdlrsR ] [ " start " [ " end " ]]"
+Check for media errors on the storage devices backing XFS.
+The
+.I start
+and
+.I end
+parameters are the range of physical storage to verify, in bytes.
+The
+.I start
+parameter is inclusive.
+The
+.I end
+parameter is exclusive.
+If neither
+.IR start " nor " end
+are specified, the entire device will be verified.
+.RE
+.RS 1.0i
+.PD 0
+.TP
+.B \-b
+Don't issue any IOs larger than this size.
+.TP
+.B \-d
+Verify the data device.
+This is the default.
+.TP
+.B \-l
+Verify the log device instead of the data device.
+.TP
+.B \-r
+Verify the realtime device instead of the data device.
+.TP
+.B \-R
+Report media errors to fsnotify.
+.TP
+.B \-s
+Sleep this many microseconds between IO requests.
+.PD
+.RE
+
 .TP
 .BI "rginfo [ \-r " rgno " ]"
 Show information about or update the state of realtime allocation groups.


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 09/26] xfs_healer: create daemon to listen for health events
  2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
                     ` (7 preceding siblings ...)
  2026-03-03  0:35   ` [PATCH 08/26] xfs_io: add a media verify command Darrick J. Wong
@ 2026-03-03  0:36   ` Darrick J. Wong
  2026-03-03 15:47     ` Christoph Hellwig
  2026-03-03  0:36   ` [PATCH 10/26] xfs_healer: enable repairing filesystems Darrick J. Wong
                     ` (16 subsequent siblings)
  25 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:36 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Create a daemon program that can listen for and log health events.
Eventually this will be used to self-heal filesystems in real time.

Because events can take a while to process, the main thread reads event
objects from the healthmon fd and dispatches them to a background
workqueue as quickly as it can.  This split of responsibilities is
necessary because the kernel event queue will drop events if the queue
fills up, and each event can take some time to process (logging,
repairs, etc.) so we don't want to lose events.

To be clear, xfs_healer and xfs_scrub are complementary tools:

Scrub walks the whole filesystem, finds stuff that needs fixing or
rebuilding, and rebuilds it.  This is sort of analogous to a patrol
scrub.

Healer listens for metadata corruption messages from the kernel and
issues a targeted repair of that structure.  This is kind of like an
ondemand scrub.

My end goal is that xfs_healer (the service) is active all the time and
can respond instantly to a corruption report, whereas xfs_scrub (the
service) gets run periodically as a cron job.

xfs_healer can decide that it's overwhelmed with problems and start
xfs_scrub to deal with the mess.  Ideally you don't crash the filesystem
and then have to use xfs_repair to smash your way back to a mountable
filesystem.

By default we run xfs_healer as a background service, which means that
we only start two threads -- one to read the events, and another to
process them.  In other words, we try not to use all available hardware
resources for repairs.  The foreground mode switch starts up a large
number of threads to try to increase parallelism, which may or may not
be useful for repairs depending on how much metadata the kernel needs to
scan.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 healer/xfs_healer.h  |   47 ++++++
 Makefile             |    5 +
 configure.ac         |    6 +
 healer/Makefile      |   35 +++++
 healer/xfs_healer.c  |  377 ++++++++++++++++++++++++++++++++++++++++++++++++++
 include/builddefs.in |    1 
 6 files changed, 471 insertions(+)
 create mode 100644 healer/xfs_healer.h
 create mode 100644 healer/Makefile
 create mode 100644 healer/xfs_healer.c


diff --git a/healer/xfs_healer.h b/healer/xfs_healer.h
new file mode 100644
index 00000000000000..bcddde5db0cc47
--- /dev/null
+++ b/healer/xfs_healer.h
@@ -0,0 +1,47 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2025-2026 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#ifndef XFS_HEALER_XFS_HEALER_H_
+#define XFS_HEALER_XFS_HEALER_H_
+
+extern char *progname;
+
+/*
+ * When running in environments with restrictive security policies, healer
+ * might not be allowed to access the global mount tree.  However, processes
+ * are usually still allowed to see their own mount tree, so use this path for
+ * all mount table queries.
+ */
+#define _PATH_PROC_MOUNTS	"/proc/self/mounts"
+
+struct healer_ctx {
+	/* CLI options, must be int */
+	int			debug;
+	int			log;
+	int			everything;
+	int			foreground;
+
+	/* fd and fs geometry for mount */
+	struct xfs_fd		mnt;
+
+	/* Shared reference to the user's mountpoint for logging */
+	const char		*mntpoint;
+
+	/* Shared reference to the getmntent fsname for reconnecting */
+	const char		*fsname;
+
+	/* file stream of monitor and buffer */
+	FILE			*mon_fp;
+	char			*mon_buf;
+
+	/* coordinates logging printfs */
+	pthread_mutex_t		conlock;
+
+	/* event queue */
+	struct workqueue	event_queue;
+	bool			queue_active;
+};
+
+#endif /* XFS_HEALER_XFS_HEALER_H_ */
diff --git a/Makefile b/Makefile
index c73aa391bc5f43..1f499c30f3457e 100644
--- a/Makefile
+++ b/Makefile
@@ -69,6 +69,10 @@ ifeq ("$(ENABLE_SCRUB)","yes")
 TOOL_SUBDIRS += scrub
 endif
 
+ifeq ("$(ENABLE_HEALER)","yes")
+TOOL_SUBDIRS += healer
+endif
+
 ifneq ("$(XGETTEXT)","")
 TOOL_SUBDIRS += po
 endif
@@ -100,6 +104,7 @@ mkfs: libxcmd
 spaceman: libxcmd libhandle
 scrub: libhandle libxcmd
 rtcp: libfrog
+healer: libhandle
 
 ifeq ($(HAVE_BUILDDEFS), yes)
 include $(BUILDRULES)
diff --git a/configure.ac b/configure.ac
index 8d2bbb9ef88bb9..78bb87b159b10b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -110,6 +110,12 @@ AC_ARG_ENABLE(libicu,
 [  --enable-libicu=[yes/no]  Enable Unicode name scanning in xfs_scrub (libicu) [default=probe]],,
 	enable_libicu=probe)
 
+# Enable xfs_healer build
+AC_ARG_ENABLE(healer,
+[  --enable-healer=[yes/no]  Enable build of xfs_healer utility [[default=yes]]],,
+	enable_healer=yes)
+AC_SUBST(enable_healer)
+
 #
 # If the user specified a libdir ending in lib64 do not append another
 # 64 to the library names.
diff --git a/healer/Makefile b/healer/Makefile
new file mode 100644
index 00000000000000..e82c820883669a
--- /dev/null
+++ b/healer/Makefile
@@ -0,0 +1,35 @@
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (C) 2024-2026 Oracle.  All Rights Reserved.
+#
+
+TOPDIR = ..
+builddefs=$(TOPDIR)/include/builddefs
+include $(builddefs)
+
+INSTALL_HEALER = install-healer
+
+LTCOMMAND = xfs_healer
+
+CFILES = \
+xfs_healer.c
+
+HFILES = \
+xfs_healer.h
+
+LLDLIBS += $(LIBHANDLE) $(LIBFROG) $(LIBURCU) $(LIBPTHREAD)
+LTDEPENDENCIES += $(LIBHANDLE) $(LIBFROG)
+LLDFLAGS = -static
+
+default: depend $(LTCOMMAND)
+
+include $(BUILDRULES)
+
+install: $(INSTALL_HEALER)
+
+install-healer: default
+	$(INSTALL) -m 755 -d $(PKG_LIBEXEC_DIR)
+	$(INSTALL) -m 755 $(LTCOMMAND) $(PKG_LIBEXEC_DIR)
+
+install-dev:
+
+-include .dep
diff --git a/healer/xfs_healer.c b/healer/xfs_healer.c
new file mode 100644
index 00000000000000..c69df9ed04699e
--- /dev/null
+++ b/healer/xfs_healer.c
@@ -0,0 +1,377 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2025-2026 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#include "xfs.h"
+#include <pthread.h>
+#include <stdlib.h>
+
+#include "platform_defs.h"
+#include "libfrog/fsgeom.h"
+#include "libfrog/paths.h"
+#include "libfrog/healthevent.h"
+#include "libfrog/workqueue.h"
+#include "xfs_healer.h"
+
+/* Program name; needed for libfrog error reports. */
+char				*progname = "xfs_healer";
+
+/* Return a health monitoring fd. */
+static int
+open_health_monitor(
+	struct healer_ctx		*ctx,
+	int				mnt_fd)
+{
+	struct xfs_health_monitor	hmo = {
+		.format			= XFS_HEALTH_MONITOR_FMT_V0,
+	};
+
+	if (ctx->everything)
+		hmo.flags |= XFS_HEALTH_MONITOR_VERBOSE;
+
+	return ioctl(mnt_fd, XFS_IOC_HEALTH_MONITOR, &hmo);
+}
+
+/* Decide if this event can only be reported upon, and not acted upon. */
+static bool
+event_not_actionable(
+	const struct xfs_health_monitor_event	*hme)
+{
+	switch (hme->type) {
+	case XFS_HEALTH_MONITOR_TYPE_LOST:
+	case XFS_HEALTH_MONITOR_TYPE_RUNNING:
+	case XFS_HEALTH_MONITOR_TYPE_UNMOUNT:
+	case XFS_HEALTH_MONITOR_TYPE_SHUTDOWN:
+		return true;
+	}
+
+	return false;
+}
+
+/* Should this event be logged? */
+static bool
+event_loggable(
+	const struct healer_ctx			*ctx,
+	const struct xfs_health_monitor_event	*hme)
+{
+	return ctx->log || event_not_actionable(hme);
+}
+
+/* Handle an event asynchronously. */
+static void
+handle_event(
+	struct workqueue		*wq,
+	uint32_t			index,
+	void				*arg)
+{
+	struct hme_prefix		pfx;
+	struct xfs_health_monitor_event	*hme = arg;
+	struct healer_ctx		*ctx = wq->wq_ctx;
+	const bool loggable = event_loggable(ctx, hme);
+
+	hme_prefix_init(&pfx, ctx->mntpoint);
+
+	/*
+	 * Non-actionable events should always be logged, because they are 100%
+	 * informational.
+	 */
+	if (loggable) {
+		pthread_mutex_lock(&ctx->conlock);
+		hme_report_event(&pfx, hme);
+		pthread_mutex_unlock(&ctx->conlock);
+	}
+
+	free(hme);
+}
+
+static unsigned int
+healer_nproc(
+	const struct healer_ctx	*ctx)
+{
+	/*
+	 * By default, use one event handler thread.  In foreground mode,
+	 * create one thread per cpu.
+	 */
+	return ctx->foreground ? platform_nproc() : 1;
+}
+
+/* Set ourselves up to monitor the given mountpoint for health events. */
+static int
+setup_monitor(
+	struct healer_ctx	*ctx)
+{
+	const long		BUF_SIZE = sysconf(_SC_PAGE_SIZE) * 2;
+	int			mon_fd;
+	int			ret;
+
+	ret = xfd_open(&ctx->mnt, ctx->mntpoint, O_RDONLY);
+	if (ret) {
+		perror(ctx->mntpoint);
+		return -1;
+	}
+
+	/*
+	 * Open the health monitor, then close the mountpoint to avoid pinning
+	 * it.  We can reconnect later if need be.
+	 */
+	mon_fd = open_health_monitor(ctx, ctx->mnt.fd);
+	close(ctx->mnt.fd);
+	ctx->mnt.fd = -1;
+	if (mon_fd < 0) {
+		switch (errno) {
+		case ENOTTY:
+		case EOPNOTSUPP:
+			fprintf(stderr, "%s: %s\n", ctx->mntpoint,
+ _("XFS health monitoring not supported."));
+			break;
+		case EEXIST:
+			fprintf(stderr, "%s: %s\n", ctx->mntpoint,
+ _("XFS health monitoring already running."));
+			break;
+		default:
+			perror(ctx->mntpoint);
+			break;
+		}
+		return -1;
+	}
+
+	/*
+	 * mon_fp consumes mon_fd.  We intentionally leave mon_fp attached to
+	 * the context so that we keep the monitoring fd open until we've torn
+	 * down all the background threads.
+	 */
+	ctx->mon_fp = fdopen(mon_fd, "r");
+	if (!ctx->mon_fp) {
+		close(mon_fd);
+		perror(ctx->mntpoint);
+		return -1;
+	}
+
+	/* Increase the buffer size so that we can reduce kernel calls */
+	ctx->mon_buf = malloc(BUF_SIZE);
+	if (ctx->mon_buf)
+		setvbuf(ctx->mon_fp, ctx->mon_buf, _IOFBF, BUF_SIZE);
+
+	/*
+	 * Queue up to 1MB of events before we stop trying to read events from
+	 * the kernel as quickly as we can.  Note that the kernel won't accrue
+	 * more than 32K of internal events before it starts dropping them.
+	 */
+	ret = workqueue_create_bound(&ctx->event_queue, ctx, healer_nproc(ctx),
+			1048576 / sizeof(struct xfs_health_monitor_event));
+	if (ret) {
+		errno = ret;
+		fprintf(stderr, "%s: %s: %s\n", ctx->mntpoint,
+				_("worker threadpool setup"), strerror(errno));
+		return -1;
+	}
+	ctx->queue_active = true;
+
+	return 0;
+}
+
+/* Monitor the given mountpoint for health events. */
+static void
+monitor(
+	struct healer_ctx	*ctx)
+{
+	bool			mounted = true;
+	size_t			nr;
+
+	do {
+		struct xfs_health_monitor_event	*hme;
+		int		ret;
+
+		hme = malloc(sizeof(*hme));
+		if (!hme) {
+			pthread_mutex_lock(&ctx->conlock);
+			fprintf(stderr, "%s: %s\n", ctx->mntpoint,
+					_("could not allocate event object"));
+			pthread_mutex_unlock(&ctx->conlock);
+			break;
+		}
+
+		nr = fread(hme, sizeof(*hme), 1, ctx->mon_fp);
+		if (nr == 0) {
+			free(hme);
+			break;
+		}
+
+		if (hme->type == XFS_HEALTH_MONITOR_TYPE_UNMOUNT)
+			mounted = false;
+
+		/* handle_event owns hme if the workqueue_add succeeds */
+		ret = workqueue_add(&ctx->event_queue, handle_event, 0, hme);
+		if (ret) {
+			pthread_mutex_lock(&ctx->conlock);
+			fprintf(stderr, "%s: %s: %s\n", ctx->mntpoint,
+					_("could not queue event object"),
+					strerror(ret));
+			pthread_mutex_unlock(&ctx->conlock);
+			free(hme);
+			break;
+		}
+	} while (nr > 0 && mounted);
+}
+
+/* Tear down all the resources that we created for monitoring */
+static void
+teardown_monitor(
+	struct healer_ctx	*ctx)
+{
+	if (ctx->queue_active) {
+		workqueue_terminate(&ctx->event_queue);
+		workqueue_destroy(&ctx->event_queue);
+	}
+	if (ctx->mon_fp) {
+		fclose(ctx->mon_fp);
+		ctx->mon_fp = NULL;
+	}
+	free(ctx->mon_buf);
+	ctx->mon_buf = NULL;
+}
+
+/*
+ * Find the filesystem source name for the mount that we're monitoring.  We
+ * don't use the fs_table_ helpers because we might be running in a restricted
+ * environment where we cannot access device files at all.
+ */
+static char *
+find_fsname(
+	const char	*mntpoint)
+{
+	struct mntent	*mnt;
+	FILE		*mtp;
+	char		*ret = NULL;
+	char		rpath[PATH_MAX], rmnt_dir[PATH_MAX];
+
+	if (!realpath(mntpoint, rpath))
+		return NULL;
+
+	mtp = setmntent(_PATH_PROC_MOUNTS, "r");
+	if (mtp == NULL)
+		return NULL;
+
+	while ((mnt = getmntent(mtp)) != NULL) {
+		if (strcmp(mnt->mnt_type, "xfs"))
+			continue;
+		if (!realpath(mnt->mnt_dir, rmnt_dir))
+			continue;
+
+		if (!strcmp(rpath, rmnt_dir)) {
+			ret = strdup(mnt->mnt_fsname);
+			break;
+		}
+	}
+
+	endmntent(mtp);
+	return ret;
+}
+
+static void __attribute__((noreturn))
+usage(void)
+{
+	fprintf(stderr, "%s %s %s\n", _("Usage:"), progname,
+			_("[OPTIONS] mountpoint"));
+	fprintf(stderr, "\n");
+	fprintf(stderr, _("Options:\n"));
+	fprintf(stderr, _("  --debug       Enable debugging messages.\n"));
+	fprintf(stderr, _("  --everything  Capture all events.\n"));
+	fprintf(stderr, _("  --foreground  Process events as soon as possible.\n"));
+	fprintf(stderr, _("  --quiet       Do not log health events to stdout.\n"));
+	fprintf(stderr, _("  -V            Print version.\n"));
+
+	exit(EXIT_FAILURE);
+}
+
+enum long_opt_nr {
+	LOPT_DEBUG,
+	LOPT_EVERYTHING,
+	LOPT_FOREGROUND,
+	LOPT_HELP,
+	LOPT_QUIET,
+
+	LOPT_MAX,
+};
+
+int
+main(
+	int			argc,
+	char			**argv)
+{
+	struct healer_ctx	ctx = {
+		.conlock	= (pthread_mutex_t)PTHREAD_MUTEX_INITIALIZER,
+		.log		= 1,
+	};
+	int			option_index;
+	int			vflag = 0;
+	int			c;
+	int			ret;
+
+	progname = basename(argv[0]);
+	setlocale(LC_ALL, "");
+	bindtextdomain(PACKAGE, LOCALEDIR);
+	textdomain(PACKAGE);
+
+	struct option long_options[] = {
+		[LOPT_DEBUG]	   = {"debug", no_argument, &ctx.debug, 1 },
+		[LOPT_EVERYTHING]  = {"everything", no_argument, &ctx.everything, 1 },
+		[LOPT_FOREGROUND]  = {"foreground", no_argument, &ctx.foreground, 1 },
+		[LOPT_HELP]	   = {"help", no_argument, NULL, 0 },
+		[LOPT_QUIET]	   = {"quiet", no_argument, &ctx.log, 0 },
+
+		[LOPT_MAX]	   = {NULL, 0, NULL, 0 },
+	};
+
+	while ((c = getopt_long(argc, argv, "V", long_options, &option_index))
+			!= EOF) {
+		switch (c) {
+		case 0:
+			switch (option_index) {
+			case LOPT_HELP:
+				usage();
+				break;
+			default:
+				break;
+			}
+			break;
+		case 'V':
+			vflag++;
+			break;
+		default:
+			usage();
+			break;
+		}
+	}
+
+	if (vflag) {
+		fprintf(stdout, "%s %s %s\n", progname, _("version"), VERSION);
+		fflush(stdout);
+		return EXIT_SUCCESS;
+	}
+
+	if (optind != argc - 1)
+		usage();
+
+	ctx.mntpoint = argv[optind];
+	ctx.fsname = find_fsname(ctx.mntpoint);
+	if (!ctx.fsname) {
+		fprintf(stderr, "%s: %s\n", ctx.mntpoint,
+				_("Not a XFS mount point."));
+		ret = -1;
+		goto out;
+	}
+
+	ret = setup_monitor(&ctx);
+	if (ret)
+		goto out_events;
+
+	monitor(&ctx);
+
+out_events:
+	teardown_monitor(&ctx);
+	free((char *)ctx.fsname);
+out:
+	return ret != 0 ? EXIT_FAILURE : EXIT_SUCCESS;
+}
diff --git a/include/builddefs.in b/include/builddefs.in
index 4a2cb757c0bdb3..99373ec86215cf 100644
--- a/include/builddefs.in
+++ b/include/builddefs.in
@@ -91,6 +91,7 @@ ENABLE_SHARED	= @enable_shared@
 ENABLE_GETTEXT	= @enable_gettext@
 ENABLE_EDITLINE	= @enable_editline@
 ENABLE_SCRUB	= @enable_scrub@
+ENABLE_HEALER	= @enable_healer@
 
 HAVE_ZIPPED_MANPAGES = @have_zipped_manpages@
 


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 10/26] xfs_healer: enable repairing filesystems
  2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
                     ` (8 preceding siblings ...)
  2026-03-03  0:36   ` [PATCH 09/26] xfs_healer: create daemon to listen for health events Darrick J. Wong
@ 2026-03-03  0:36   ` Darrick J. Wong
  2026-03-03 15:47     ` Christoph Hellwig
  2026-03-03  0:36   ` [PATCH 11/26] xfs_healer: use getparents to look up file names Darrick J. Wong
                     ` (15 subsequent siblings)
  25 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:36 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Make it so that our health monitoring daemon can initiate repairs in
response to reports of corrupt filesystem metadata.  Repairs are
initiated from the background workers as explained in the previous
patch.

Note that just like xfs_scrub, xfs_healer's ability to repair metadata
relies heavily on back references such as reverse mappings and directory
parent pointers to add redundancy to the filesystem.  Check for these
two features and whine a bit if they are missing, just like scrub.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 healer/xfs_healer.h   |   28 ++++++
 libfrog/flagmap.h     |    3 +
 libfrog/healthevent.h |   12 ++
 healer/Makefile       |    2 
 healer/fsrepair.c     |  249 +++++++++++++++++++++++++++++++++++++++++++++++++
 healer/weakhandle.c   |  115 +++++++++++++++++++++++
 healer/xfs_healer.c   |   56 +++++++++++
 libfrog/flagmap.c     |   17 +++
 libfrog/healthevent.c |  117 +++++++++++++++++++++++
 9 files changed, 599 insertions(+)
 create mode 100644 healer/fsrepair.c
 create mode 100644 healer/weakhandle.c


diff --git a/healer/xfs_healer.h b/healer/xfs_healer.h
index bcddde5db0cc47..a4de1ad32a408f 100644
--- a/healer/xfs_healer.h
+++ b/healer/xfs_healer.h
@@ -8,6 +8,9 @@
 
 extern char *progname;
 
+struct weakhandle;
+struct hme_prefix;
+
 /*
  * When running in environments with restrictive security policies, healer
  * might not be allowed to access the global mount tree.  However, processes
@@ -22,6 +25,7 @@ struct healer_ctx {
 	int			log;
 	int			everything;
 	int			foreground;
+	int			want_repair;
 
 	/* fd and fs geometry for mount */
 	struct xfs_fd		mnt;
@@ -32,6 +36,9 @@ struct healer_ctx {
 	/* Shared reference to the getmntent fsname for reconnecting */
 	const char		*fsname;
 
+	/* weak file handle so we can reattach to filesystem */
+	struct weakhandle	*wh;
+
 	/* file stream of monitor and buffer */
 	FILE			*mon_fp;
 	char			*mon_buf;
@@ -44,4 +51,25 @@ struct healer_ctx {
 	bool			queue_active;
 };
 
+static inline bool healer_has_rmapbt(const struct healer_ctx *ctx)
+{
+	return ctx->mnt.fsgeom.flags & XFS_FSOP_GEOM_FLAGS_RMAPBT;
+}
+
+static inline bool healer_has_parent(const struct healer_ctx *ctx)
+{
+	return ctx->mnt.fsgeom.flags & XFS_FSOP_GEOM_FLAGS_PARENT;
+}
+
+/* repair.c */
+int repair_metadata(struct healer_ctx *ctx, const struct hme_prefix *pfx,
+		const struct xfs_health_monitor_event *hme);
+bool healer_can_repair(struct healer_ctx *ctx);
+
+/* weakhandle.c */
+int weakhandle_alloc(int fd, const char *mountpoint, const char *fsname,
+		struct weakhandle **whp);
+int weakhandle_reopen(struct weakhandle *wh, int *fd);
+void weakhandle_free(struct weakhandle **whp);
+
 #endif /* XFS_HEALER_XFS_HEALER_H_ */
diff --git a/libfrog/flagmap.h b/libfrog/flagmap.h
index 8031d75a7c02a8..05110c3544dc97 100644
--- a/libfrog/flagmap.h
+++ b/libfrog/flagmap.h
@@ -14,6 +14,9 @@ struct flag_map {
 void mask_to_string(const struct flag_map *map, unsigned long long mask,
 		const char *delimiter, char *buf, size_t bufsize);
 
+const char *lowest_set_mask_string(const struct flag_map *map,
+		unsigned long long mask);
+
 const char *value_to_string(const struct flag_map *map,
 		unsigned long long value);
 
diff --git a/libfrog/healthevent.h b/libfrog/healthevent.h
index 6de41bc797100c..4f3c8ba639ec4c 100644
--- a/libfrog/healthevent.h
+++ b/libfrog/healthevent.h
@@ -40,4 +40,16 @@ hme_prefix_init(
 void hme_report_event(const struct hme_prefix *pfx,
 		const struct xfs_health_monitor_event *hme);
 
+enum repair_outcome {
+	REPAIR_SUCCESS,
+	REPAIR_FAILED,
+	REPAIR_PROBABLY_OK,
+	REPAIR_UNNECESSARY,
+};
+
+void report_health_repair(const struct hme_prefix *pfx,
+		const struct xfs_health_monitor_event *hme,
+		uint32_t event_mask,
+		enum repair_outcome outcome);
+
 #endif /* LIBFROG_HEALTHEVENT_H_ */
diff --git a/healer/Makefile b/healer/Makefile
index e82c820883669a..981192b81af626 100644
--- a/healer/Makefile
+++ b/healer/Makefile
@@ -11,6 +11,8 @@ INSTALL_HEALER = install-healer
 LTCOMMAND = xfs_healer
 
 CFILES = \
+fsrepair.c \
+weakhandle.c \
 xfs_healer.c
 
 HFILES = \
diff --git a/healer/fsrepair.c b/healer/fsrepair.c
new file mode 100644
index 00000000000000..907afca3dba8a7
--- /dev/null
+++ b/healer/fsrepair.c
@@ -0,0 +1,249 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2025-2026 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#include "xfs.h"
+
+#include "platform_defs.h"
+#include "libfrog/fsgeom.h"
+#include "libfrog/workqueue.h"
+#include "libfrog/healthevent.h"
+#include "xfs_healer.h"
+
+/* Translate scrub output flags to outcome. */
+static enum repair_outcome from_repair_oflags(uint32_t oflags)
+{
+	if (oflags & (XFS_SCRUB_OFLAG_CORRUPT | XFS_SCRUB_OFLAG_INCOMPLETE))
+		return REPAIR_FAILED;
+
+	if (oflags & XFS_SCRUB_OFLAG_XFAIL)
+		return REPAIR_PROBABLY_OK;
+
+	if (oflags & XFS_SCRUB_OFLAG_NO_REPAIR_NEEDED)
+		return REPAIR_UNNECESSARY;
+
+	return REPAIR_SUCCESS;
+}
+
+struct u32_scrub {
+	uint32_t	event_mask;
+	uint32_t	scrub_type;
+};
+
+#define foreach_scrub_type(cur, mask, coll) \
+	for ((cur) = (coll); (cur)->scrub_type != 0; (cur)++) \
+		if ((mask) & (cur)->event_mask)
+
+/* Call the kernel to repair some inode metadata. */
+static inline enum repair_outcome
+xfs_repair_metadata(
+	int			fd,
+	uint32_t		scrub_type,
+	uint32_t		group,
+	uint64_t		ino,
+	uint32_t		gen)
+{
+	struct xfs_scrub_metadata sm = {
+		.sm_type = scrub_type,
+		.sm_flags = XFS_SCRUB_IFLAG_REPAIR,
+		.sm_ino = ino,
+		.sm_gen = gen,
+		.sm_agno = group,
+	};
+	int			ret;
+
+	ret = ioctl(fd, XFS_IOC_SCRUB_METADATA, &sm);
+	if (ret)
+		return REPAIR_FAILED;
+
+	return from_repair_oflags(sm.sm_flags);
+}
+
+/* React to a fs-domain corruption event by repairing it. */
+static void
+try_repair_wholefs(
+	struct healer_ctx			*ctx,
+	const struct hme_prefix			*pfx,
+	int					mnt_fd,
+	const struct xfs_health_monitor_event	*hme)
+{
+#define X(code, type) { XFS_FSOP_GEOM_SICK_ ## code, XFS_SCRUB_TYPE_ ## type }
+	static const struct u32_scrub		FS_STRUCTURES[] = {
+		X(COUNTERS,	FSCOUNTERS),
+		X(UQUOTA,	UQUOTA),
+		X(GQUOTA,	GQUOTA),
+		X(PQUOTA,	PQUOTA),
+		X(RT_BITMAP,	RTBITMAP),
+		X(RT_SUMMARY,	RTSUM),
+		X(QUOTACHECK,	QUOTACHECK),
+		X(NLINKS,	NLINKS),
+		{0,		0},
+	};
+#undef X
+	const struct u32_scrub	*f;
+
+	foreach_scrub_type(f, hme->e.fs.mask, FS_STRUCTURES) {
+		enum repair_outcome	outcome =
+			xfs_repair_metadata(mnt_fd, f->scrub_type, 0, 0, 0);
+
+		pthread_mutex_lock(&ctx->conlock);
+		report_health_repair(pfx, hme, f->event_mask, outcome);
+		pthread_mutex_unlock(&ctx->conlock);
+	}
+}
+
+/* React to an ag corruption event by repairing it. */
+static void
+try_repair_ag(
+	struct healer_ctx			*ctx,
+	const struct hme_prefix			*pfx,
+	int					mnt_fd,
+	const struct xfs_health_monitor_event	*hme)
+{
+#define X(code, type) { XFS_AG_GEOM_SICK_ ## code, XFS_SCRUB_TYPE_ ## type }
+	static const struct u32_scrub		AG_STRUCTURES[] = {
+		X(SB,		SB),
+		X(AGF,		AGF),
+		X(AGFL,		AGFL),
+		X(AGI,		AGI),
+		X(BNOBT,	BNOBT),
+		X(CNTBT,	CNTBT),
+		X(INOBT,	INOBT),
+		X(FINOBT,	FINOBT),
+		X(RMAPBT,	RMAPBT),
+		X(REFCNTBT,	REFCNTBT),
+		{0,		0},
+	};
+#undef X
+	const struct u32_scrub *f;
+
+	foreach_scrub_type(f, hme->e.group.mask, AG_STRUCTURES) {
+		enum repair_outcome	outcome =
+			xfs_repair_metadata(mnt_fd, f->scrub_type,
+					hme->e.group.gno, 0, 0);
+
+		pthread_mutex_lock(&ctx->conlock);
+		report_health_repair(pfx, hme, f->event_mask, outcome);
+		pthread_mutex_unlock(&ctx->conlock);
+	}
+}
+
+/* React to a rtgroup corruption event by repairing it. */
+static void
+try_repair_rtgroup(
+	struct healer_ctx			*ctx,
+	const struct hme_prefix			*pfx,
+	int					mnt_fd,
+	const struct xfs_health_monitor_event	*hme)
+{
+#define X(code, type) { XFS_RTGROUP_GEOM_SICK_ ## code, XFS_SCRUB_TYPE_ ## type }
+	static const struct u32_scrub		RTG_STRUCTURES[] = {
+		X(SUPER,	RGSUPER),
+		X(BITMAP,	RTBITMAP),
+		X(SUMMARY,	RTSUM),
+		X(RMAPBT,	RTRMAPBT),
+		X(REFCNTBT,	RTREFCBT),
+		{0,		0},
+	};
+#undef X
+	const struct u32_scrub *f;
+
+	foreach_scrub_type(f, hme->e.group.mask, RTG_STRUCTURES) {
+		enum repair_outcome	outcome =
+			xfs_repair_metadata(mnt_fd, f->scrub_type,
+					hme->e.group.gno, 0, 0);
+
+		pthread_mutex_lock(&ctx->conlock);
+		report_health_repair(pfx, hme, f->event_mask, outcome);
+		pthread_mutex_unlock(&ctx->conlock);
+	}
+}
+
+/* React to a inode-domain corruption event by repairing it. */
+static void
+try_repair_inode(
+	struct healer_ctx			*ctx,
+	const struct hme_prefix			*pfx,
+	int					mnt_fd,
+	const struct xfs_health_monitor_event	*hme)
+{
+#define X(code, type) { XFS_BS_SICK_ ## code, XFS_SCRUB_TYPE_ ## type }
+	static const struct u32_scrub		INODE_STRUCTURES[] = {
+		X(INODE,	INODE),
+		X(BMBTD,	BMBTD),
+		X(BMBTA,	BMBTA),
+		X(BMBTC,	BMBTC),
+		X(DIR,		DIR),
+		X(XATTR,	XATTR),
+		X(SYMLINK,	SYMLINK),
+		X(PARENT,	PARENT),
+		X(DIRTREE,	DIRTREE),
+		{0,		0},
+	};
+#undef X
+	const struct u32_scrub *f;
+
+	foreach_scrub_type(f, hme->e.inode.mask, INODE_STRUCTURES) {
+		enum repair_outcome	outcome =
+			xfs_repair_metadata(mnt_fd, f->scrub_type,
+					0, hme->e.inode.ino, hme->e.inode.gen);
+
+		pthread_mutex_lock(&ctx->conlock);
+		report_health_repair(pfx, hme, f->event_mask, outcome);
+		pthread_mutex_unlock(&ctx->conlock);
+	}
+}
+
+/* Repair a metadata corruption. */
+int
+repair_metadata(
+	struct healer_ctx			*ctx,
+	const struct hme_prefix			*pfx,
+	const struct xfs_health_monitor_event	*hme)
+{
+	int					repair_fd;
+	int					ret;
+
+	ret = weakhandle_reopen(ctx->wh, &repair_fd);
+	if (ret) {
+		fprintf(stderr, "%s: %s: %s\n", ctx->mntpoint,
+				_("cannot open filesystem to repair"),
+				strerror(errno));
+		return ret;
+	}
+
+	switch (hme->domain) {
+	case XFS_HEALTH_MONITOR_DOMAIN_FS:
+		try_repair_wholefs(ctx, pfx, repair_fd, hme);
+		break;
+	case XFS_HEALTH_MONITOR_DOMAIN_AG:
+		try_repair_ag(ctx, pfx, repair_fd, hme);
+		break;
+	case XFS_HEALTH_MONITOR_DOMAIN_RTGROUP:
+		try_repair_rtgroup(ctx, pfx, repair_fd, hme);
+		break;
+	case XFS_HEALTH_MONITOR_DOMAIN_INODE:
+		try_repair_inode(ctx, pfx, repair_fd, hme);
+		break;
+	}
+
+	close(repair_fd);
+	return 0;
+}
+
+/* Ask the kernel if it supports repairs. */
+bool
+healer_can_repair(
+	struct healer_ctx	*ctx)
+{
+	struct xfs_scrub_metadata sm = {
+		.sm_type = XFS_SCRUB_TYPE_PROBE,
+		.sm_flags = XFS_SCRUB_IFLAG_REPAIR,
+	};
+	int			ret;
+
+	/* assume any errno means not supported */
+	ret = ioctl(ctx->mnt.fd, XFS_IOC_SCRUB_METADATA, &sm);
+	return ret ? false : true;
+}
diff --git a/healer/weakhandle.c b/healer/weakhandle.c
new file mode 100644
index 00000000000000..53df43b03e16cc
--- /dev/null
+++ b/healer/weakhandle.c
@@ -0,0 +1,115 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2025-2026 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#include "xfs.h"
+#include <pthread.h>
+#include <stdlib.h>
+
+#include "platform_defs.h"
+#include "handle.h"
+#include "libfrog/fsgeom.h"
+#include "libfrog/workqueue.h"
+#include "xfs_healer.h"
+
+struct weakhandle {
+	/* Shared reference to the user's mountpoint for logging */
+	const char		*mntpoint;
+
+	/* Shared reference to the getmntent fsname for reconnecting */
+	const char		*fsname;
+
+	/* handle to root dir */
+	void			*hanp;
+	size_t			hlen;
+};
+
+/* Capture a handle for a given filesystem, but don't attach to the fd. */
+int
+weakhandle_alloc(
+	int			fd,
+	const char		*mountpoint,
+	const char		*fsname,
+	struct weakhandle	**whp)
+{
+	struct weakhandle	*wh;
+	int			ret;
+
+	*whp = NULL;
+
+	if (fd < 0 || !mountpoint) {
+		errno = EINVAL;
+		return -1;
+	}
+
+	wh = calloc(1, sizeof(struct weakhandle));
+	if (!wh)
+		return -1;
+
+	wh->mntpoint = mountpoint;
+	wh->fsname = fsname;
+
+	ret = fd_to_handle(fd, &wh->hanp, &wh->hlen);
+	if (ret)
+		goto out_wh;
+
+	*whp = wh;
+	return 0;
+
+out_wh:
+	free(wh);
+	return -1;
+}
+
+/* Reopen a file handle obtained via weak reference. */
+int
+weakhandle_reopen(
+	struct weakhandle	*wh,
+	int			*fd)
+{
+	void			*hanp;
+	size_t			hlen;
+	int			mnt_fd;
+	int			ret;
+
+	*fd = -1;
+
+	mnt_fd = open(wh->mntpoint, O_RDONLY);
+	if (mnt_fd < 0)
+		return -1;
+
+	ret = fd_to_handle(mnt_fd, &hanp, &hlen);
+	if (ret)
+		goto out_mntfd;
+
+	if (hlen != wh->hlen || memcmp(hanp, wh->hanp, hlen)) {
+		errno = ESTALE;
+		goto out_handle;
+	}
+
+	free_handle(hanp, hlen);
+	*fd = mnt_fd;
+	return 0;
+
+out_handle:
+	free_handle(hanp, hlen);
+out_mntfd:
+	close(mnt_fd);
+	return -1;
+}
+
+/* Tear down a weak handle */
+void
+weakhandle_free(
+	struct weakhandle	**whp)
+{
+	struct weakhandle	*wh = *whp;
+
+	if (wh) {
+		free_handle(wh->hanp, wh->hlen);
+		free(wh);
+	}
+
+	*whp = NULL;
+}
diff --git a/healer/xfs_healer.c b/healer/xfs_healer.c
index c69df9ed04699e..0a99ae3ed50135 100644
--- a/healer/xfs_healer.c
+++ b/healer/xfs_healer.c
@@ -58,6 +58,18 @@ event_loggable(
 	return ctx->log || event_not_actionable(hme);
 }
 
+/* Are we going to try a repair? */
+static inline bool
+event_repairable(
+	const struct healer_ctx			*ctx,
+	const struct xfs_health_monitor_event	*hme)
+{
+	if (event_not_actionable(hme))
+		return false;
+
+	return ctx->want_repair && hme->type == XFS_HEALTH_MONITOR_TYPE_SICK;
+}
+
 /* Handle an event asynchronously. */
 static void
 handle_event(
@@ -69,6 +81,7 @@ handle_event(
 	struct xfs_health_monitor_event	*hme = arg;
 	struct healer_ctx		*ctx = wq->wq_ctx;
 	const bool loggable = event_loggable(ctx, hme);
+	const bool will_repair = event_repairable(ctx, hme);
 
 	hme_prefix_init(&pfx, ctx->mntpoint);
 
@@ -82,6 +95,10 @@ handle_event(
 		pthread_mutex_unlock(&ctx->conlock);
 	}
 
+	/* Initiate a repair if appropriate. */
+	if (will_repair)
+		repair_metadata(ctx, &pfx, hme);
+
 	free(hme);
 }
 
@@ -111,6 +128,41 @@ setup_monitor(
 		return -1;
 	}
 
+	if (ctx->want_repair) {
+		/* Check that the kernel supports repairs at all. */
+		if (!healer_can_repair(ctx)) {
+			fprintf(stderr, "%s: %s\n", ctx->mntpoint,
+ _("XFS online repair is not supported, exiting"));
+			close(ctx->mnt.fd);
+			return -1;
+		}
+
+		/* Check for backref metadata that makes repair effective. */
+		if (!healer_has_rmapbt(ctx))
+			fprintf(stderr, "%s: %s\n", ctx->mntpoint,
+ _("XFS online repair is less effective without rmap btrees."));
+
+		if (!healer_has_parent(ctx))
+			fprintf(stderr, "%s: %s\n", ctx->mntpoint,
+ _("XFS online repair is less effective without parent pointers."));
+
+	}
+
+	/*
+	 * Open weak-referenced file handle to mountpoint so that we can
+	 * reconnect to the mountpoint to start repairs.
+	 */
+	if (ctx->want_repair) {
+		ret = weakhandle_alloc(ctx->mnt.fd, ctx->mntpoint,
+				ctx->fsname, &ctx->wh);
+		if (ret) {
+			fprintf(stderr, "%s: %s: %s\n", ctx->mntpoint,
+					_("creating weak fshandle"),
+					strerror(errno));
+			return -1;
+		}
+	}
+
 	/*
 	 * Open the health monitor, then close the mountpoint to avoid pinning
 	 * it.  We can reconnect later if need be.
@@ -229,6 +281,7 @@ teardown_monitor(
 		ctx->mon_fp = NULL;
 	}
 	free(ctx->mon_buf);
+	weakhandle_free(&ctx->wh);
 	ctx->mon_buf = NULL;
 }
 
@@ -280,6 +333,7 @@ usage(void)
 	fprintf(stderr, _("  --everything  Capture all events.\n"));
 	fprintf(stderr, _("  --foreground  Process events as soon as possible.\n"));
 	fprintf(stderr, _("  --quiet       Do not log health events to stdout.\n"));
+	fprintf(stderr, _("  --repair      Always repair corrupt metadata.\n"));
 	fprintf(stderr, _("  -V            Print version.\n"));
 
 	exit(EXIT_FAILURE);
@@ -291,6 +345,7 @@ enum long_opt_nr {
 	LOPT_FOREGROUND,
 	LOPT_HELP,
 	LOPT_QUIET,
+	LOPT_REPAIR,
 
 	LOPT_MAX,
 };
@@ -320,6 +375,7 @@ main(
 		[LOPT_FOREGROUND]  = {"foreground", no_argument, &ctx.foreground, 1 },
 		[LOPT_HELP]	   = {"help", no_argument, NULL, 0 },
 		[LOPT_QUIET]	   = {"quiet", no_argument, &ctx.log, 0 },
+		[LOPT_REPAIR]	   = {"repair", no_argument, &ctx.want_repair, 1 },
 
 		[LOPT_MAX]	   = {NULL, 0, NULL, 0 },
 	};
diff --git a/libfrog/flagmap.c b/libfrog/flagmap.c
index 631c4bbc8f1dc0..ce413297780a2a 100644
--- a/libfrog/flagmap.c
+++ b/libfrog/flagmap.c
@@ -44,6 +44,23 @@ mask_to_string(
 		snprintf(buf, bufsize, "%s0x%llx", tag, mask & ~seen);
 }
 
+/*
+ * Given a mapping of bits to strings and a bitmask, return the string
+ * corresponding to the lowest set bit in the mask.
+ */
+const char *
+lowest_set_mask_string(
+	const struct flag_map	*map,
+	unsigned long long	mask)
+{
+	for (; map->string; map++) {
+		if (mask & map->flag)
+			return _(map->string);
+	}
+
+	return _("unknown flag");
+}
+
 /*
  * Given a mapping of values to strings and a value, return the matching string
  * or confusion.
diff --git a/libfrog/healthevent.c b/libfrog/healthevent.c
index 8520cb3218fb03..193738332dbd71 100644
--- a/libfrog/healthevent.c
+++ b/libfrog/healthevent.c
@@ -358,3 +358,120 @@ hme_report_event(
 		break;
 	}
 }
+
+static const char *
+repair_outcome_string(
+	enum repair_outcome	o)
+{
+	switch (o) {
+	case REPAIR_FAILED:
+		return _("Repair unsuccessful; offline repair required.");
+	case REPAIR_PROBABLY_OK:
+		return _("Seems correct but cross-referencing failed; offline repair recommended.");
+	case REPAIR_UNNECESSARY:
+		return _("No modification needed.");
+	case REPAIR_SUCCESS:
+		return _("Repairs successful.");
+	}
+
+	return NULL;
+}
+
+/* Report inode metadata repair */
+static void
+report_inode_repair(
+	const struct hme_prefix			*pfx,
+	const struct xfs_health_monitor_event	*hme,
+	uint32_t				domain_mask,
+	enum repair_outcome			outcome)
+{
+	if (hme_prefix_has_path(pfx))
+		printf("%s %s: %s\n",
+				pfx->path,
+				lowest_set_mask_string(inode_structs,
+						       domain_mask),
+				repair_outcome_string(outcome));
+	else
+		printf("%s %s %llu %s 0x%x %s: %s\n",
+				pfx->mountpoint,
+				_("ino"),
+				(unsigned long long)hme->e.inode.ino,
+				_("gen"),
+				hme->e.inode.gen,
+				lowest_set_mask_string(inode_structs,
+						       domain_mask),
+				repair_outcome_string(outcome));
+	fflush(stdout);
+}
+
+/* Report AG metadata repair */
+static void
+report_ag_repair(
+	const struct hme_prefix			*pfx,
+	const struct xfs_health_monitor_event	*hme,
+	uint32_t				domain_mask,
+	enum repair_outcome			outcome)
+{
+	printf("%s %s 0x%x %s: %s\n", pfx->mountpoint,
+			_("agno"),
+			hme->e.group.gno,
+			lowest_set_mask_string(ag_structs, domain_mask),
+			repair_outcome_string(outcome));
+	fflush(stdout);
+}
+
+/* Report rtgroup metadata repair */
+static void
+report_rtgroup_repair(
+	const struct hme_prefix			*pfx,
+	const struct xfs_health_monitor_event	*hme,
+	uint32_t				domain_mask,
+	enum repair_outcome			outcome)
+{
+	printf("%s %s 0x%x %s: %s\n", pfx->mountpoint,
+			_("rgno"),
+			hme->e.group.gno,
+			lowest_set_mask_string(rtgroup_structs, domain_mask),
+			repair_outcome_string(outcome));
+	fflush(stdout);
+}
+
+/* Report fs-wide metadata repair */
+static void
+report_fs_repair(
+	const struct hme_prefix			*pfx,
+	const struct xfs_health_monitor_event	*hme,
+	uint32_t				domain_mask,
+	enum repair_outcome			outcome)
+{
+	printf("%s %s: %s\n", pfx->mountpoint,
+			lowest_set_mask_string(fs_structs, domain_mask),
+			repair_outcome_string(outcome));
+	fflush(stdout);
+}
+
+/* Log a repair event to stdout. */
+void
+report_health_repair(
+	const struct hme_prefix			*pfx,
+	const struct xfs_health_monitor_event	*hme,
+	uint32_t				domain_mask,
+	enum repair_outcome			outcome)
+{
+	switch (hme->domain) {
+	case XFS_HEALTH_MONITOR_DOMAIN_INODE:
+		report_inode_repair(pfx, hme, domain_mask, outcome);
+		break;
+	case XFS_HEALTH_MONITOR_DOMAIN_AG:
+		report_ag_repair(pfx, hme, domain_mask, outcome);
+		break;
+	case XFS_HEALTH_MONITOR_DOMAIN_RTGROUP:
+		report_rtgroup_repair(pfx, hme, domain_mask, outcome);
+		break;
+	case XFS_HEALTH_MONITOR_DOMAIN_FS:
+		report_fs_repair(pfx, hme, domain_mask, outcome);
+		break;
+	default:
+		break;
+	}
+}


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 11/26] xfs_healer: use getparents to look up file names
  2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
                     ` (9 preceding siblings ...)
  2026-03-03  0:36   ` [PATCH 10/26] xfs_healer: enable repairing filesystems Darrick J. Wong
@ 2026-03-03  0:36   ` Darrick J. Wong
  2026-03-03 15:48     ` Christoph Hellwig
  2026-03-03  0:36   ` [PATCH 12/26] xfs_healer: create a per-mount background monitoring service Darrick J. Wong
                     ` (14 subsequent siblings)
  25 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:36 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

If the kernel tells about something that happened to a file, use the
GETPARENTS ioctl to try to look up the path to that file for more
ergonomic reporting.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 healer/xfs_healer.h |    6 ++++
 healer/fsrepair.c   |   16 ++++++++-
 healer/weakhandle.c |   86 +++++++++++++++++++++++++++++++++++++++++++++++++++
 healer/xfs_healer.c |   45 ++++++++++++++++++++++++++-
 4 files changed, 149 insertions(+), 4 deletions(-)


diff --git a/healer/xfs_healer.h b/healer/xfs_healer.h
index a4de1ad32a408f..6d12921245934c 100644
--- a/healer/xfs_healer.h
+++ b/healer/xfs_healer.h
@@ -61,6 +61,10 @@ static inline bool healer_has_parent(const struct healer_ctx *ctx)
 	return ctx->mnt.fsgeom.flags & XFS_FSOP_GEOM_FLAGS_PARENT;
 }
 
+void lookup_path(struct healer_ctx *ctx,
+		const struct xfs_health_monitor_event *hme,
+		struct hme_prefix *pfx);
+
 /* repair.c */
 int repair_metadata(struct healer_ctx *ctx, const struct hme_prefix *pfx,
 		const struct xfs_health_monitor_event *hme);
@@ -71,5 +75,7 @@ int weakhandle_alloc(int fd, const char *mountpoint, const char *fsname,
 		struct weakhandle **whp);
 int weakhandle_reopen(struct weakhandle *wh, int *fd);
 void weakhandle_free(struct weakhandle **whp);
+int weakhandle_getpath_for(struct weakhandle *wh, uint64_t ino, uint32_t gen,
+		char *path, size_t pathlen);
 
 #endif /* XFS_HEALER_XFS_HEALER_H_ */
diff --git a/healer/fsrepair.c b/healer/fsrepair.c
index 907afca3dba8a7..4534104f8a6ac1 100644
--- a/healer/fsrepair.c
+++ b/healer/fsrepair.c
@@ -164,7 +164,7 @@ try_repair_rtgroup(
 static void
 try_repair_inode(
 	struct healer_ctx			*ctx,
-	const struct hme_prefix			*pfx,
+	const struct hme_prefix			*orig_pfx,
 	int					mnt_fd,
 	const struct xfs_health_monitor_event	*hme)
 {
@@ -182,13 +182,25 @@ try_repair_inode(
 		{0,		0},
 	};
 #undef X
-	const struct u32_scrub *f;
+	struct hme_prefix	new_pfx;
+	const struct hme_prefix	*pfx = orig_pfx;
+	const struct u32_scrub	*f;
 
 	foreach_scrub_type(f, hme->e.inode.mask, INODE_STRUCTURES) {
 		enum repair_outcome	outcome =
 			xfs_repair_metadata(mnt_fd, f->scrub_type,
 					0, hme->e.inode.ino, hme->e.inode.gen);
 
+		/*
+		 * Try again to find the file path, maybe we fixed the dir
+		 * tree.
+		 */
+		if (!hme_prefix_has_path(pfx)) {
+			lookup_path(ctx, hme, &new_pfx);
+			if (hme_prefix_has_path(&new_pfx))
+				pfx = &new_pfx;
+		}
+
 		pthread_mutex_lock(&ctx->conlock);
 		report_health_repair(pfx, hme, f->event_mask, outcome);
 		pthread_mutex_unlock(&ctx->conlock);
diff --git a/healer/weakhandle.c b/healer/weakhandle.c
index 53df43b03e16cc..8950e0eb1e5a43 100644
--- a/healer/weakhandle.c
+++ b/healer/weakhandle.c
@@ -11,6 +11,8 @@
 #include "handle.h"
 #include "libfrog/fsgeom.h"
 #include "libfrog/workqueue.h"
+#include "libfrog/getparents.h"
+#include "libfrog/paths.h"
 #include "xfs_healer.h"
 
 struct weakhandle {
@@ -113,3 +115,87 @@ weakhandle_free(
 
 	*whp = NULL;
 }
+
+struct bufvec {
+	char	*buf;
+	size_t	len;
+};
+
+static int
+render_path(
+	const char		*mntpt,
+	const struct path_list	*path,
+	void			*arg)
+{
+	struct bufvec		*args = arg;
+	int			mntpt_len = strlen(mntpt);
+	ssize_t			ret;
+
+	/* Trim trailing slashes from the mountpoint */
+	while (mntpt_len > 0 && mntpt[mntpt_len - 1] == '/')
+		mntpt_len--;
+
+	ret = snprintf(args->buf, args->len, "%.*s", mntpt_len, mntpt);
+	if (ret < 0 || ret >= args->len)
+		return 0;
+
+	ret = path_list_to_string(path, args->buf + ret, args->len - ret);
+	if (ret < 0)
+		return 0;
+
+	/* magic code that means we found one */
+	return ECANCELED;
+}
+
+/* Render any path to this weakhandle into the specified buffer. */
+int
+weakhandle_getpath_for(
+	struct weakhandle	*wh,
+	uint64_t		ino,
+	uint32_t		gen,
+	char			*path,
+	size_t			pathlen)
+{
+	struct xfs_handle	fakehandle;
+	struct bufvec		bv = {
+		.buf		= path,
+		.len		= pathlen,
+	};
+	int			mnt_fd;
+	int			ret;
+
+	if (wh->hlen != sizeof(fakehandle)) {
+		errno = EINVAL;
+		return -1;
+	}
+	memcpy(&fakehandle, wh->hanp, sizeof(fakehandle));
+	fakehandle.ha_fid.fid_ino = ino;
+	fakehandle.ha_fid.fid_gen = gen;
+
+	ret = weakhandle_reopen(wh, &mnt_fd);
+	if (ret)
+		return ret;
+
+	/*
+	 * In the common case, files only have one parent; and what's the
+	 * chance that we'll need to walk past the second parent to find *one*
+	 * path that goes to the rootdir?  With a max filename length of 255
+	 * bytes, we pick 600 for the buffer size.
+	 */
+	ret = handle_walk_paths_fd(wh->mntpoint, mnt_fd, &fakehandle,
+			sizeof(fakehandle), 600, render_path, &bv);
+	switch (ret) {
+	case ECANCELED:
+		/* found a path */
+		ret = 0;
+		break;
+	default:
+		/* didn't find one */
+		errno = ENOENT;
+		ret = -1;
+		break;
+	}
+
+	close(mnt_fd);
+	return ret;
+}
diff --git a/healer/xfs_healer.c b/healer/xfs_healer.c
index 0a99ae3ed50135..c9892168c706cb 100644
--- a/healer/xfs_healer.c
+++ b/healer/xfs_healer.c
@@ -33,6 +33,39 @@ open_health_monitor(
 	return ioctl(mnt_fd, XFS_IOC_HEALTH_MONITOR, &hmo);
 }
 
+/* Report either the file handle or its path, if we can. */
+void
+lookup_path(
+	struct healer_ctx			*ctx,
+	const struct xfs_health_monitor_event	*hme,
+	struct hme_prefix			*pfx)
+{
+	uint64_t				ino = 0;
+	uint32_t				gen = 0;
+	int					ret;
+
+	if (!healer_has_parent(ctx))
+		return;
+
+	switch (hme->domain) {
+	case XFS_HEALTH_MONITOR_DOMAIN_INODE:
+		ino = hme->e.inode.ino;
+		gen = hme->e.inode.gen;
+		break;
+	case XFS_HEALTH_MONITOR_DOMAIN_FILERANGE:
+		ino = hme->e.filerange.ino;
+		gen = hme->e.filerange.gen;
+		break;
+	default:
+		return;
+	}
+
+	ret = weakhandle_getpath_for(ctx->wh, ino, gen, pfx->path,
+			sizeof(pfx->path));
+	if (ret)
+		hme_prefix_clear_path(pfx);
+}
+
 /* Decide if this event can only be reported upon, and not acted upon. */
 static bool
 event_not_actionable(
@@ -85,6 +118,13 @@ handle_event(
 
 	hme_prefix_init(&pfx, ctx->mntpoint);
 
+	/*
+	 * Try to look up the file name for the file we're about to log or
+	 * about to repair (which always logs).
+	 */
+	if (loggable || will_repair)
+		lookup_path(ctx, hme, &pfx);
+
 	/*
 	 * Non-actionable events should always be logged, because they are 100%
 	 * informational.
@@ -150,9 +190,10 @@ setup_monitor(
 
 	/*
 	 * Open weak-referenced file handle to mountpoint so that we can
-	 * reconnect to the mountpoint to start repairs.
+	 * reconnect to the mountpoint to start repairs or to look up file
+	 * paths for logging.
 	 */
-	if (ctx->want_repair) {
+	if (ctx->want_repair || healer_has_parent(ctx)) {
 		ret = weakhandle_alloc(ctx->mnt.fd, ctx->mntpoint,
 				ctx->fsname, &ctx->wh);
 		if (ret) {


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 12/26] xfs_healer: create a per-mount background monitoring service
  2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
                     ` (10 preceding siblings ...)
  2026-03-03  0:36   ` [PATCH 11/26] xfs_healer: use getparents to look up file names Darrick J. Wong
@ 2026-03-03  0:36   ` Darrick J. Wong
  2026-03-03 15:48     ` Christoph Hellwig
  2026-03-03  0:37   ` [PATCH 13/26] xfs_healer: create a service to start the per-mount healer service Darrick J. Wong
                     ` (13 subsequent siblings)
  25 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:36 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Create a systemd service definition for our self-healing filesystem
daemon so that we can run it for every mounted filesystem.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 healer/Makefile                |   19 +++++++
 healer/system-xfs_healer.slice |   31 ++++++++++++
 healer/xfs_healer.c            |    3 +
 healer/xfs_healer@.service.in  |  107 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 158 insertions(+), 2 deletions(-)
 create mode 100644 healer/system-xfs_healer.slice
 create mode 100644 healer/xfs_healer@.service.in


diff --git a/healer/Makefile b/healer/Makefile
index 981192b81af626..86d6f50781f9b6 100644
--- a/healer/Makefile
+++ b/healer/Makefile
@@ -22,7 +22,20 @@ LLDLIBS += $(LIBHANDLE) $(LIBFROG) $(LIBURCU) $(LIBPTHREAD)
 LTDEPENDENCIES += $(LIBHANDLE) $(LIBFROG)
 LLDFLAGS = -static
 
-default: depend $(LTCOMMAND)
+ifeq ($(HAVE_SYSTEMD),yes)
+INSTALL_HEALER += install-systemd
+SYSTEMD_SERVICES=\
+	system-xfs_healer.slice \
+	xfs_healer@.service
+OPTIONAL_TARGETS += $(SYSTEMD_SERVICES)
+endif
+
+default: depend $(LTCOMMAND) $(SYSTEMD_SERVICES)
+
+%.service: %.service.in $(builddefs)
+	@echo "    [SED]    $@"
+	$(Q)$(SED) -e "s|@pkg_libexec_dir@|$(PKG_LIBEXEC_DIR)|g" \
+		   < $< > $@
 
 include $(BUILDRULES)
 
@@ -32,6 +45,10 @@ install-healer: default
 	$(INSTALL) -m 755 -d $(PKG_LIBEXEC_DIR)
 	$(INSTALL) -m 755 $(LTCOMMAND) $(PKG_LIBEXEC_DIR)
 
+install-systemd: default
+	$(INSTALL) -m 755 -d $(SYSTEMD_SYSTEM_UNIT_DIR)
+	$(INSTALL) -m 644 $(SYSTEMD_SERVICES) $(SYSTEMD_SYSTEM_UNIT_DIR)
+
 install-dev:
 
 -include .dep
diff --git a/healer/system-xfs_healer.slice b/healer/system-xfs_healer.slice
new file mode 100644
index 00000000000000..b8f5bca03963ff
--- /dev/null
+++ b/healer/system-xfs_healer.slice
@@ -0,0 +1,31 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Copyright (c) 2024-2026 Oracle.  All Rights Reserved.
+# Author: Darrick J. Wong <djwong@kernel.org>
+
+[Unit]
+Description=xfs_healer background service slice
+Before=slices.target
+
+[Slice]
+
+# If the CPU usage cgroup controller is available, don't use more than 2 cores
+# for all background processes.  One thread to read events, another to run
+# repairs.
+CPUQuota=200%
+CPUAccounting=true
+
+[Install]
+# As of systemd 249, the systemd cgroupv2 configuration code will drop resource
+# controllers from the root and system.slice cgroups at startup if it doesn't
+# find any direct dependencies that require a given controller.  Newly
+# activated units with resource control directives are created under the system
+# slice but do not cause a reconfiguration of the slice's resource controllers.
+# Hence we cannot put CPUQuota= into the xfs_healer service units directly.
+#
+# For the CPUQuota directive to have any effect, we must therefore create an
+# explicit definition file for the slice that systemd creates to contain the
+# xfs_healer instance units (e.g. xfs_healer@.service) and we must configure
+# this slice as a dependency of the system slice to establish the direct
+# dependency relation.
+WantedBy=system.slice
diff --git a/healer/xfs_healer.c b/healer/xfs_healer.c
index c9892168c706cb..2f431f18c69318 100644
--- a/healer/xfs_healer.c
+++ b/healer/xfs_healer.c
@@ -12,6 +12,7 @@
 #include "libfrog/paths.h"
 #include "libfrog/healthevent.h"
 #include "libfrog/workqueue.h"
+#include "libfrog/systemd.h"
 #include "xfs_healer.h"
 
 /* Program name; needed for libfrog error reports. */
@@ -470,5 +471,5 @@ main(
 	teardown_monitor(&ctx);
 	free((char *)ctx.fsname);
 out:
-	return ret != 0 ? EXIT_FAILURE : EXIT_SUCCESS;
+	return systemd_service_exit(ret);
 }
diff --git a/healer/xfs_healer@.service.in b/healer/xfs_healer@.service.in
new file mode 100644
index 00000000000000..385257872b0cbb
--- /dev/null
+++ b/healer/xfs_healer@.service.in
@@ -0,0 +1,107 @@
+# SPDX-License-Identifier: GPL-2.0-or-later
+#
+# Copyright (c) 2024-2026 Oracle.  All Rights Reserved.
+# Author: Darrick J. Wong <djwong@kernel.org>
+
+[Unit]
+Description=Self Healing of XFS Metadata for %f
+
+# Explicitly require the capabilities that this program needs
+ConditionCapability=CAP_SYS_ADMIN
+ConditionCapability=CAP_DAC_OVERRIDE
+
+# Must be a mountpoint
+ConditionPathIsMountPoint=%f
+RequiresMountsFor=%f
+
+[Service]
+Type=exec
+Environment=SERVICE_MODE=1
+ExecStart=@pkg_libexec_dir@/xfs_healer %f
+SyslogIdentifier=%N
+
+# Create the service underneath the healer background service slice so that we
+# can control resource usage.
+Slice=system-xfs_healer.slice
+
+# No realtime CPU scheduling
+RestrictRealtime=true
+
+# xfs_healer avoids pinning mounted filesystems by recording the file handle
+# for the provided mountpoint (%f) before opening the health monitor, after
+# which it closes the fd for the mountpoint.  If repairs are needed, it will
+# reopen the mountpoint, resample the file handle, and proceed only if the
+# handles match.  If the filesystem is unmounted, the daemon exits.  If the
+# mountpoint moves, repairs will not be attempted against the wrong filesystem.
+#
+# Due to this resampling behavior, xfs_healer must see the same filesystem
+# mount tree inside the service container as outside, with the same ro/rw
+# state.  BindPaths doesn't work on the paths that are made readonly by
+# ProtectSystem and ProtectHome, so it is not possible to set either option.
+# DynamicUser sets ProtectSystem, so that also cannot be used.  We cannot use
+# BindPaths to bind the desired mountpoint somewhere under /tmp like xfs_scrub
+# does because that pins the mount.
+#
+# Regrettably, this leaves xfs_healer less hardened than xfs_scrub.
+# Surprisingly, this doesn't affect xfs_healer's score dramatically.
+DynamicUser=false
+ProtectSystem=false
+ProtectHome=no
+PrivateTmp=true
+PrivateDevices=true
+
+# Don't let healer complain about paths in /etc/projects that have been hidden
+# by our sandboxing.  healer doesn't care about project ids anyway.
+InaccessiblePaths=-/etc/projects
+
+# No network access
+PrivateNetwork=true
+ProtectHostname=true
+RestrictAddressFamilies=none
+IPAddressDeny=any
+
+# Don't let the program mess with the kernel configuration at all
+ProtectKernelLogs=true
+ProtectKernelModules=true
+ProtectKernelTunables=true
+ProtectControlGroups=true
+ProtectProc=invisible
+RestrictNamespaces=true
+
+# Hide everything in /proc, even /proc/mounts
+ProcSubset=pid
+
+# Only allow the default personality Linux
+LockPersonality=true
+
+# No writable memory pages
+MemoryDenyWriteExecute=true
+
+# Don't let our mounts leak out to the host
+PrivateMounts=true
+
+# Restrict system calls to the native arch and only enough to get things going
+SystemCallArchitectures=native
+SystemCallFilter=@system-service
+SystemCallFilter=~@privileged
+SystemCallFilter=~@resources
+SystemCallFilter=~@mount
+
+# xfs_healer needs these privileges to open the rootdir and monitor
+CapabilityBoundingSet=CAP_SYS_ADMIN CAP_DAC_OVERRIDE
+AmbientCapabilities=CAP_SYS_ADMIN CAP_DAC_OVERRIDE
+NoNewPrivileges=true
+
+# xfs_healer doesn't create files
+UMask=7777
+
+# No access to hardware /dev files except for block devices
+ProtectClock=true
+DevicePolicy=closed
+
+[Install]
+WantedBy=multi-user.target
+# If someone tries to enable the template itself, translate that into enabling
+# this service on the root directory at systemd startup time.  In the
+# initramfs, the udev rules in xfs_healer.rules run before systemd starts.
+DefaultInstance=-


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 13/26] xfs_healer: create a service to start the per-mount healer service
  2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
                     ` (11 preceding siblings ...)
  2026-03-03  0:36   ` [PATCH 12/26] xfs_healer: create a per-mount background monitoring service Darrick J. Wong
@ 2026-03-03  0:37   ` Darrick J. Wong
  2026-03-03 15:49     ` Christoph Hellwig
  2026-03-03  0:37   ` [PATCH 14/26] xfs_healer: don't start service if kernel support unavailable Darrick J. Wong
                     ` (12 subsequent siblings)
  25 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:37 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Create a daemon to wait for xfs mount events via fsnotify and start up
the per-mount healer service.  It's important that we're running in the
same mount namespace as the mount, so we're a fanotify client to avoid
having to filter the mount namespaces ourselves.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 libfrog/systemd.h                  |   23 +-
 configure.ac                       |    5 
 healer/Makefile                    |   27 ++-
 healer/xfs_healer_start.c          |  372 ++++++++++++++++++++++++++++++++++++
 healer/xfs_healer_start.service.in |   85 ++++++++
 include/builddefs.in               |    7 +
 m4/package_libcdev.m4              |   78 ++++++++
 7 files changed, 583 insertions(+), 14 deletions(-)
 create mode 100644 healer/xfs_healer_start.c
 create mode 100644 healer/xfs_healer_start.service.in


diff --git a/libfrog/systemd.h b/libfrog/systemd.h
index c96df4afa39aa6..8a0970282d1080 100644
--- a/libfrog/systemd.h
+++ b/libfrog/systemd.h
@@ -22,6 +22,20 @@ static inline bool systemd_is_service(void)
 	return getenv("SERVICE_MODE") != NULL;
 }
 
+/* Special processing for a service/daemon program that is exiting. */
+static inline int
+systemd_service_exit_now(int ret)
+{
+	/*
+	 * If we're being run as a service, the return code must fit the LSB
+	 * init script action error guidelines, which is to say that we
+	 * compress all errors to 1 ("generic or unspecified error", LSB 5.0
+	 * section 22.2) and hope the admin will scan the log for what actually
+	 * happened.
+	 */
+	return ret != 0 ? EXIT_FAILURE : EXIT_SUCCESS;
+}
+
 /* Special processing for a service/daemon program that is exiting. */
 static inline int
 systemd_service_exit(int ret)
@@ -35,14 +49,7 @@ systemd_service_exit(int ret)
 	 */
 	sleep(2);
 
-	/*
-	 * If we're being run as a service, the return code must fit the LSB
-	 * init script action error guidelines, which is to say that we
-	 * compress all errors to 1 ("generic or unspecified error", LSB 5.0
-	 * section 22.2) and hope the admin will scan the log for what actually
-	 * happened.
-	 */
-	return ret != 0 ? EXIT_FAILURE : EXIT_SUCCESS;
+	return systemd_service_exit_now(ret);
 }
 
 #endif /* __LIBFROG_SYSTEMD_H__ */
diff --git a/configure.ac b/configure.ac
index 78bb87b159b10b..fb1e1973d2f559 100644
--- a/configure.ac
+++ b/configure.ac
@@ -189,6 +189,11 @@ AC_HAVE_BLKID_TOPO
 AC_HAVE_TRIVIAL_AUTO_VAR_INIT
 AC_STRERROR_R_RETURNS_STRING
 AC_HAVE_CLOSE_RANGE
+AC_HAVE_LISTMOUNT
+if test "$have_listmount" = "yes"; then
+	AC_HAVE_LISTMOUNT_NS_FD
+fi
+AC_HAVE_FANOTIFY_MOUNTINFO
 
 if test "$enable_ubsan" = "yes" || test "$enable_ubsan" = "probe"; then
         AC_PACKAGE_CHECK_UBSAN
diff --git a/healer/Makefile b/healer/Makefile
index 86d6f50781f9b6..53cc787c6fcd0c 100644
--- a/healer/Makefile
+++ b/healer/Makefile
@@ -9,6 +9,7 @@ include $(builddefs)
 INSTALL_HEALER = install-healer
 
 LTCOMMAND = xfs_healer
+BUILD_TARGETS = $(LTCOMMAND)
 
 CFILES = \
 fsrepair.c \
@@ -24,13 +25,27 @@ LLDFLAGS = -static
 
 ifeq ($(HAVE_SYSTEMD),yes)
 INSTALL_HEALER += install-systemd
-SYSTEMD_SERVICES=\
+XFS_HEALER_SVCNAME=xfs_healer@.service
+SYSTEMD_SERVICES = \
 	system-xfs_healer.slice \
-	xfs_healer@.service
-OPTIONAL_TARGETS += $(SYSTEMD_SERVICES)
-endif
+	$(XFS_HEALER_SVCNAME)
+endif # HAVE_SYSTEMD
 
-default: depend $(LTCOMMAND) $(SYSTEMD_SERVICES)
+ifeq ($(HAVE_HEALER_START_DEPS),yes)
+CFLAGS += -DXFS_HEALER_SVCNAME=\"$(XFS_HEALER_SVCNAME)\"
+ ifeq ($(HAVE_LISTMOUNT_NS_FD),yes)
+  CFLAGS += -DHAVE_LISTMOUNT_NS_FD
+ endif # listmount mnt_ns_fd
+
+BUILD_TARGETS += xfs_healer_start
+SYSTEMD_SERVICES += xfs_healer_start.service
+endif # xfs_healer_start deps
+
+default: depend $(BUILD_TARGETS) $(SYSTEMD_SERVICES)
+
+xfs_healer_start: $(SUBDIRS) xfs_healer_start.o $(LTDEPENDENCIES)
+	@echo "    [LD]     $@"
+	$(Q)$(LTLINK) -o $@ $(LDFLAGS) xfs_healer_start.o $(LDLIBS)
 
 %.service: %.service.in $(builddefs)
 	@echo "    [SED]    $@"
@@ -43,7 +58,7 @@ install: $(INSTALL_HEALER)
 
 install-healer: default
 	$(INSTALL) -m 755 -d $(PKG_LIBEXEC_DIR)
-	$(INSTALL) -m 755 $(LTCOMMAND) $(PKG_LIBEXEC_DIR)
+	$(INSTALL) -m 755 $(BUILD_TARGETS) $(PKG_LIBEXEC_DIR)
 
 install-systemd: default
 	$(INSTALL) -m 755 -d $(SYSTEMD_SYSTEM_UNIT_DIR)
diff --git a/healer/xfs_healer_start.c b/healer/xfs_healer_start.c
new file mode 100644
index 00000000000000..a4242adc17e6e8
--- /dev/null
+++ b/healer/xfs_healer_start.c
@@ -0,0 +1,372 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2026 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#include "xfs.h"
+
+#include <errno.h>
+#include <err.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <fcntl.h>
+#include <sys/fanotify.h>
+#include <sys/types.h>
+#include <unistd.h>
+#include <linux/mount.h>
+#include <sys/syscall.h>
+#include <string.h>
+#include <limits.h>
+
+#include "platform_defs.h"
+#include "libfrog/systemd.h"
+
+#define DEFAULT_MOUNTNS_FILE		"/proc/self/ns/mnt"
+
+static int debug = 0;
+static const char *progname = "xfs_healer_start";
+
+/* Start the xfs_healer service for a given mountpoint. */
+static void
+start_healer(
+	const char	*mntpoint)
+{
+	char		unitname[PATH_MAX];
+	int		ret;
+
+	ret = systemd_path_instance_unit_name(XFS_HEALER_SVCNAME, mntpoint,
+			unitname, PATH_MAX);
+	if (ret) {
+		fprintf(stderr, "%s: %s\n", mntpoint,
+				_("Could not determine xfs_healer unit name."));
+		return;
+	}
+
+	/*
+	 * Restart so that we aren't foiled by an existing unit that's slowly
+	 * working its way off a cycled mount.
+	 */
+	ret = systemd_manage_unit(UM_RESTART, unitname);
+	if (ret) {
+		fprintf(stderr, "%s: %s: %s\n", mntpoint,
+				_("Could not start xfs_healer service unit"),
+				unitname);
+		return;
+	}
+
+	printf("%s: %s\n", mntpoint, _("xfs_healer service started."));
+	fflush(stdout);
+}
+
+#define REQUIRED_STATMOUNT_FIELDS (STATMOUNT_FS_TYPE | \
+				   STATMOUNT_MNT_POINT | \
+				   STATMOUNT_MNT_ROOT)
+
+/* Process a newly discovered mountpoint. */
+static void
+examine_mount(
+	int			mnt_ns_fd,
+	uint64_t		mnt_id)
+{
+	struct mnt_id_req	req = {
+		.size		= sizeof(req),
+		.mnt_id		= mnt_id,
+#ifdef HAVE_LISTMOUNT_NS_FD
+		.mnt_ns_fd	= mnt_ns_fd,
+#else
+		.spare		= mnt_ns_fd,
+#endif
+		.param		= REQUIRED_STATMOUNT_FIELDS,
+	};
+	size_t			smbuf_size = sizeof(struct statmount) + 4096;
+	struct statmount	*smbuf = alloca(smbuf_size);
+	int			ret;
+
+	ret = syscall(SYS_statmount, &req, smbuf, smbuf_size, 0);
+	if (ret) {
+		perror("statmount");
+		return;
+	}
+
+	if (debug) {
+		printf("mount: id 0x%llx fstype %s mountpoint %s mntroot %s\n",
+				(unsigned long long)mnt_id,
+				(smbuf->mask & STATMOUNT_FS_TYPE) ?
+					smbuf->str + smbuf->fs_type : "null",
+				(smbuf->mask & STATMOUNT_MNT_POINT) ?
+					smbuf->str + smbuf->mnt_point : "null",
+				(smbuf->mask & STATMOUNT_MNT_ROOT) ?
+					smbuf->str + smbuf->mnt_root : "null");
+		fflush(stdout);
+	}
+
+	/* Look for mount points for the root dir of an XFS filesystem. */
+	if ((smbuf->mask & REQUIRED_STATMOUNT_FIELDS) !=
+			   REQUIRED_STATMOUNT_FIELDS)
+		return;
+
+	if (!strcmp(smbuf->str + smbuf->fs_type, "xfs") &&
+	    !strcmp(smbuf->str + smbuf->mnt_root, "/"))
+		start_healer(smbuf->str + smbuf->mnt_point);
+}
+
+/* Translate fanotify mount events into something we can process. */
+static void
+handle_mount_event(
+	const struct fanotify_event_metadata	*event,
+	int					mnt_ns_fd)
+{
+	const struct fanotify_event_info_header	*info;
+	const struct fanotify_event_info_mnt	*mnt;
+	int					off;
+
+	if (event->fd != FAN_NOFD) {
+		if (debug)
+			fprintf(stderr, "Expected FAN_NOFD, got fd=%d\n",
+					event->fd);
+		return;
+	}
+
+	switch (event->mask) {
+	case FAN_MNT_ATTACH:
+		if (debug) {
+			printf("FAN_MNT_ATTACH (len=%d)\n", event->event_len);
+			fflush(stdout);
+		}
+		break;
+	default:
+		/* should never get here */
+		return;
+	}
+
+	for (off = sizeof(*event) ; off < event->event_len;
+	     off += info->len) {
+		info = (struct fanotify_event_info_header *)
+			((char *) event + off);
+
+		switch (info->info_type) {
+		case FAN_EVENT_INFO_TYPE_MNT:
+			mnt = (struct fanotify_event_info_mnt *) info;
+
+			if (debug) {
+				printf( "Mount record: len=%d mnt_id=0x%llx\n",
+						mnt->hdr.len, mnt->mnt_id);
+				fflush(stdout);
+			}
+
+			examine_mount(mnt_ns_fd, mnt->mnt_id);
+			break;
+
+		default:
+			if (debug)
+				fprintf(stderr,
+ "Unexpected fanotify event info_type=%d len=%d\n",
+						info->info_type, info->len);
+			break;
+		}
+	}
+}
+
+/* Extract mount attachment notifications from fanotify. */
+static void
+handle_notifications(
+	char				*buffer,
+	ssize_t				len,
+	int				mnt_ns_fd)
+{
+	struct fanotify_event_metadata	*event =
+		(struct fanotify_event_metadata *) buffer;
+
+	for (; FAN_EVENT_OK(event, len); event = FAN_EVENT_NEXT(event, len)) {
+
+		switch (event->mask) {
+		case FAN_MNT_ATTACH:
+			handle_mount_event(event, mnt_ns_fd);
+			break;
+		default:
+			if (debug)
+				fprintf(stderr,
+ "Unexpected fanotify mark: 0x%llx\n",
+					(unsigned long long)event->mask);
+			break;
+		}
+	}
+}
+
+/* Start healer services for existing XFS mounts. */
+static int
+start_existing_mounts(
+	int			mnt_ns_fd)
+{
+	struct mnt_id_req	req = {
+		.size		= sizeof(struct mnt_id_req),
+#ifdef HAVE_LISTMOUNT_NS_FD
+		.mnt_ns_fd	= mnt_ns_fd,
+#else
+		.spare		= mnt_ns_fd,
+#endif
+		.mnt_id		= LSMT_ROOT,
+	};
+	uint64_t		mnt_ids[32];
+	int			i;
+	int			ret;
+
+	while ((ret = syscall(SYS_listmount, &req, &mnt_ids, 32, 0)) > 0) {
+		for (i = 0; i < ret; i++)
+			examine_mount(mnt_ns_fd, mnt_ids[i]);
+
+		req.param = mnt_ids[ret - 1];
+	}
+
+	if (ret < 0) {
+		if (errno == ENOSYS)
+			fprintf(stderr, "%s\n",
+ _("This program requires the listmount system call."));
+		else
+			perror("listmount");
+		return -1;
+	}
+
+	return 0;
+}
+
+static void __attribute__((noreturn))
+usage(void)
+{
+	fprintf(stderr, "%s %s %s\n", _("Usage:"), progname, _("[OPTIONS]"));
+	fprintf(stderr, "\n");
+	fprintf(stderr, _("Options:\n"));
+	fprintf(stderr, _("  --debug      Enable debugging messages.\n"));
+	fprintf(stderr, _("  --mountns    Path to the mount namespace file.\n"));
+	fprintf(stderr, _("  --supported  Make sure we can actually run.\n"));
+	fprintf(stderr, _("  -V           Print version.\n"));
+
+	exit(EXIT_FAILURE);
+}
+
+enum long_opt_nr {
+	LOPT_DEBUG,
+	LOPT_HELP,
+	LOPT_MOUNTNS,
+	LOPT_SUPPORTED,
+
+	LOPT_MAX,
+};
+
+int
+main(
+	int		argc,
+	char		*argv[])
+{
+	char		buffer[BUFSIZ];
+	const char	*mntns = NULL;
+	int		mnt_ns_fd;
+	int		fan_fd;
+	int		c;
+	int		option_index;
+	int		support_check = 0;
+	int		ret = 0;
+
+	struct option long_options[] = {
+		[LOPT_SUPPORTED] = {"supported", no_argument, &support_check, 1 },
+		[LOPT_DEBUG]	 = {"debug", no_argument, &debug, 1 },
+		[LOPT_HELP]	 = {"help", no_argument, NULL, 0 },
+		[LOPT_MOUNTNS]	 = {"mountns", required_argument, NULL, 0 },
+		[LOPT_MAX]	 = {NULL, 0, NULL, 0 },
+	};
+
+	while ((c = getopt_long(argc, argv, "V", long_options, &option_index))
+			!= EOF) {
+		switch (c) {
+		case 0:
+			switch (option_index) {
+			case LOPT_MOUNTNS:
+				mntns = optarg;
+				break;
+			case LOPT_HELP:
+				usage();
+				break;
+			default:
+				break;
+			}
+			break;
+		case 'V':
+			fprintf(stdout, "%s %s %s\n", progname, _("version"),
+					VERSION);
+			fflush(stdout);
+			return EXIT_SUCCESS;
+		default:
+			usage();
+			break;
+		}
+	}
+
+	/*
+	 * Try to open the mount namespace file for the current process.
+	 * fanotify requires this mount namespace file to send mount attachment
+	 * events, so this is required for correct functionality.
+	 */
+	mnt_ns_fd = open(mntns ? mntns : DEFAULT_MOUNTNS_FILE, O_RDONLY);
+	if (mnt_ns_fd < 0) {
+		if (errno == ENOENT && !mntns) {
+			perror(DEFAULT_MOUNTNS_FILE);
+			fprintf(stderr, "%s\n",
+ _("This program requires mount namespace support."));
+		} else {
+			perror(mntns ? mntns : DEFAULT_MOUNTNS_FILE);
+		}
+		ret = 1;
+		goto out;
+	}
+
+	fan_fd = fanotify_init(FAN_REPORT_MNT, O_RDONLY);
+	if (fan_fd < 0) {
+		perror("fanotify_init");
+		if (errno == EINVAL)
+			fprintf(stderr, "%s\n",
+ _("This program requires fanotify mount event support."));
+		ret = 1;
+		goto out;
+	}
+
+	ret = fanotify_mark(fan_fd, FAN_MARK_ADD | FAN_MARK_MNTNS,
+			FAN_MNT_ATTACH, mnt_ns_fd, NULL);
+	if (ret) {
+		perror("fanotify_mark");
+		goto out;
+	}
+
+	if (support_check) {
+		/*
+		 * We're being run as an ExecCondition process and we've
+		 * decided to start the main service.  There is no need to wait
+		 * for journald because the ExecStart version of ourselves will
+		 * take care of the waiting for us.
+		 */
+		return systemd_service_exit_now(0);
+	}
+
+	if (debug) {
+		printf("fanotify active\n");
+		fflush(stdout);
+	}
+
+	ret = start_existing_mounts(mnt_ns_fd);
+	if (ret)
+		goto out;
+
+	while (1) {
+		ssize_t bytes_read = read(fan_fd, buffer, BUFSIZ);
+
+		if (bytes_read < 0) {
+			perror("fanotify");
+			ret = 1;
+			break;
+		}
+
+		handle_notifications(buffer, bytes_read, mnt_ns_fd);
+	}
+
+out:
+	return systemd_service_exit(ret);
+}
diff --git a/healer/xfs_healer_start.service.in b/healer/xfs_healer_start.service.in
new file mode 100644
index 00000000000000..6fd34eafa48c33
--- /dev/null
+++ b/healer/xfs_healer_start.service.in
@@ -0,0 +1,85 @@
+# SPDX-License-Identifier: GPL-2.0-or-later
+#
+# Copyright (c) 2026 Oracle.  All Rights Reserved.
+# Author: Darrick J. Wong <djwong@kernel.org>
+
+[Unit]
+Description=Start Self Healing of XFS Metadata
+
+[Service]
+Type=exec
+Environment=SERVICE_MODE=1
+ExecCondition=@pkg_libexec_dir@/xfs_healer_start --supported
+ExecStart=@pkg_libexec_dir@/xfs_healer_start
+
+# This service starts more services, so we want it to try to restart any time
+# the program exits or crashes.
+Restart=on-failure
+
+# Create the service underneath the healer background service slice so that we
+# can control resource usage.
+Slice=system-xfs_healer.slice
+
+# No realtime CPU scheduling
+RestrictRealtime=true
+
+# Must run with full privileges in a shared mount namespace so that we can
+# see new mounts and tell systemd to start the per-mount healer service.
+DynamicUser=false
+ProtectSystem=false
+ProtectHome=no
+PrivateTmp=true
+PrivateDevices=true
+
+# Don't let healer complain about paths in /etc/projects that have been hidden
+# by our sandboxing.  healer doesn't care about project ids anyway.
+InaccessiblePaths=-/etc/projects
+
+# No network access except to the systemd control socket
+PrivateNetwork=true
+ProtectHostname=true
+RestrictAddressFamilies=AF_UNIX
+IPAddressDeny=any
+
+# Don't let the program mess with the kernel configuration at all
+ProtectKernelLogs=true
+ProtectKernelModules=true
+ProtectKernelTunables=true
+ProtectControlGroups=true
+ProtectProc=invisible
+RestrictNamespaces=true
+
+# Hide everything in /proc, even /proc/mounts
+ProcSubset=pid
+
+# Only allow the default personality Linux
+LockPersonality=true
+
+# No writable memory pages
+MemoryDenyWriteExecute=true
+
+# Don't let our mounts leak out to the host
+PrivateMounts=true
+
+# Restrict system calls to the native arch and fanotify
+SystemCallArchitectures=native
+SystemCallFilter=@system-service
+SystemCallFilter=~@privileged
+SystemCallFilter=~@resources
+SystemCallFilter=~@mount
+SystemCallFilter=fanotify_init fanotify_mark
+
+# xfs_healer_start needs these privileges to open the rootdir and monitor
+CapabilityBoundingSet=CAP_SYS_ADMIN CAP_DAC_OVERRIDE
+AmbientCapabilities=CAP_SYS_ADMIN CAP_DAC_OVERRIDE
+NoNewPrivileges=true
+
+# xfs_healer_start doesn't create files
+UMask=7777
+
+# No access to hardware /dev files except for block devices
+ProtectClock=true
+DevicePolicy=closed
+
+[Install]
+WantedBy=multi-user.target
diff --git a/include/builddefs.in b/include/builddefs.in
index 99373ec86215cf..51d24dd854bc17 100644
--- a/include/builddefs.in
+++ b/include/builddefs.in
@@ -120,6 +120,9 @@ UDEV_RULE_DIR = @udev_rule_dir@
 HAVE_LIBURCU_ATOMIC64 = @have_liburcu_atomic64@
 STRERROR_R_RETURNS_STRING = @strerror_r_returns_string@
 HAVE_CLOSE_RANGE = @have_close_range@
+HAVE_LISTMOUNT = @have_listmount@
+HAVE_LISTMOUNT_NS_FD = @have_listmount_ns_fd@
+HAVE_FANOTIFY_MOUNTINFO = @have_fanotify_mountinfo@
 
 GCCFLAGS = -funsigned-char -fno-strict-aliasing -Wall
 #	   -Wbitwise -Wno-transparent-union -Wno-old-initializer -Wno-decl
@@ -152,6 +155,10 @@ ifeq ($(HAVE_LIBURCU_ATOMIC64),yes)
 PCFLAGS += -DHAVE_LIBURCU_ATOMIC64
 endif
 
+ifeq ($(ENABLE_HEALER)$(HAVE_SYSTEMD)$(HAVE_LISTMOUNT)$(HAVE_FANOTIFY_MOUNTINFO),yesyesyesyes)
+HAVE_HEALER_START_DEPS = yes
+endif
+
 SANITIZER_CFLAGS += @addrsan_cflags@ @threadsan_cflags@ @ubsan_cflags@ @autovar_init_cflags@
 SANITIZER_LDFLAGS += @addrsan_ldflags@ @threadsan_ldflags@ @ubsan_ldflags@
 
diff --git a/m4/package_libcdev.m4 b/m4/package_libcdev.m4
index b3d87229d3367a..a1ece2ad71dab7 100644
--- a/m4/package_libcdev.m4
+++ b/m4/package_libcdev.m4
@@ -366,3 +366,81 @@ close_range(0, 0, 0);
        AC_MSG_RESULT(no))
     AC_SUBST(have_close_range)
   ])
+
+#
+# Check if listmount and statmount exist.  Note that statmount came first (6.8)
+# and listmount came later (6.9).
+#
+AC_DEFUN([AC_HAVE_LISTMOUNT],
+  [AC_MSG_CHECKING([for listmount])
+    AC_LINK_IFELSE(
+    [AC_LANG_PROGRAM([[
+#define _GNU_SOURCE
+#include <unistd.h>
+#include <linux/mount.h>
+#include <sys/syscall.h>
+#include <alloca.h>
+  ]], [[
+	struct mnt_id_req	req = {
+		.size		= sizeof(req),
+	};
+	struct statmount	smbuf;
+
+	syscall(SYS_statmount, &req, &smbuf, 0, 0);
+	syscall(SYS_listmount, &req, NULL, 0, 0);
+  ]])
+    ], have_listmount=yes
+       AC_MSG_RESULT(yes),
+       AC_MSG_RESULT(no))
+    AC_SUBST(have_listmount)
+  ])
+
+#
+# Check if mnt_id_req::mnt_ns_fd exists.  This replaced mnt_id_req::spare in
+# 6.18.
+#
+AC_DEFUN([AC_HAVE_LISTMOUNT_NS_FD],
+  [AC_MSG_CHECKING([for struct mnt_id_req::mnt_ns_fd])
+    AC_LINK_IFELSE(
+    [AC_LANG_PROGRAM([[
+#define _GNU_SOURCE
+#include <unistd.h>
+#include <linux/mount.h>
+#include <sys/syscall.h>
+#include <alloca.h>
+  ]], [[
+	struct mnt_id_req	req = {
+		.mnt_ns_fd	= 555,
+	};
+
+	syscall(SYS_listmount, &req, NULL, 0, 0);
+  ]])
+    ], have_listmount_ns_fd=yes
+       AC_MSG_RESULT(yes),
+       AC_MSG_RESULT(no))
+    AC_SUBST(have_listmount_ns_fd)
+  ])
+
+#
+# Check if fanotify will give us mount notifications
+#
+AC_DEFUN([AC_HAVE_FANOTIFY_MOUNTINFO],
+  [AC_MSG_CHECKING([for fanotify mount events])
+    AC_LINK_IFELSE(
+    [AC_LANG_PROGRAM([[
+#define _GNU_SOURCE
+#include <stdlib.h>
+#include <fcntl.h>
+#include <sys/fanotify.h>
+  ]], [[
+	struct fanotify_event_info_mnt info;
+
+	int fan_fd = fanotify_init(FAN_REPORT_MNT, 0);
+	fanotify_mark(fan_fd, FAN_MARK_ADD | FAN_MARK_MNTNS, FAN_MNT_ATTACH,
+			-1, NULL);
+  ]])
+    ], have_fanotify_mountinfo=yes
+       AC_MSG_RESULT(yes),
+       AC_MSG_RESULT(no))
+    AC_SUBST(have_fanotify_mountinfo)
+  ])


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 14/26] xfs_healer: don't start service if kernel support unavailable
  2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
                     ` (12 preceding siblings ...)
  2026-03-03  0:37   ` [PATCH 13/26] xfs_healer: create a service to start the per-mount healer service Darrick J. Wong
@ 2026-03-03  0:37   ` Darrick J. Wong
  2026-03-03 15:49     ` Christoph Hellwig
  2026-03-03  0:37   ` [PATCH 15/26] xfs_healer: use the autofsck fsproperty to select mode Darrick J. Wong
                     ` (11 subsequent siblings)
  25 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:37 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Use ExecCondition= in the system service to check if kernel support for
the health monitor is available.  If not, we don't want to run the
service, have it fail, and generate a bunch of silly log messages.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 healer/xfs_healer.h           |    1 +
 healer/xfs_healer.c           |   55 +++++++++++++++++++++++++++++++----------
 healer/xfs_healer@.service.in |    1 +
 3 files changed, 43 insertions(+), 14 deletions(-)


diff --git a/healer/xfs_healer.h b/healer/xfs_healer.h
index 6d12921245934c..93cca394e9fdd1 100644
--- a/healer/xfs_healer.h
+++ b/healer/xfs_healer.h
@@ -26,6 +26,7 @@ struct healer_ctx {
 	int			everything;
 	int			foreground;
 	int			want_repair;
+	int			support_check;
 
 	/* fd and fs geometry for mount */
 	struct xfs_fd		mnt;
diff --git a/healer/xfs_healer.c b/healer/xfs_healer.c
index 2f431f18c69318..69e5368f6ee794 100644
--- a/healer/xfs_healer.c
+++ b/healer/xfs_healer.c
@@ -154,8 +154,14 @@ healer_nproc(
 	return ctx->foreground ? platform_nproc() : 1;
 }
 
+enum mon_state {
+	MON_START,
+	MON_EXIT,
+	MON_ERROR,
+};
+
 /* Set ourselves up to monitor the given mountpoint for health events. */
-static int
+static enum mon_state
 setup_monitor(
 	struct healer_ctx	*ctx)
 {
@@ -166,7 +172,7 @@ setup_monitor(
 	ret = xfd_open(&ctx->mnt, ctx->mntpoint, O_RDONLY);
 	if (ret) {
 		perror(ctx->mntpoint);
-		return -1;
+		return MON_ERROR;
 	}
 
 	if (ctx->want_repair) {
@@ -175,7 +181,7 @@ setup_monitor(
 			fprintf(stderr, "%s: %s\n", ctx->mntpoint,
  _("XFS online repair is not supported, exiting"));
 			close(ctx->mnt.fd);
-			return -1;
+			return MON_ERROR;
 		}
 
 		/* Check for backref metadata that makes repair effective. */
@@ -201,7 +207,7 @@ setup_monitor(
 			fprintf(stderr, "%s: %s: %s\n", ctx->mntpoint,
 					_("creating weak fshandle"),
 					strerror(errno));
-			return -1;
+			return MON_ERROR;
 		}
 	}
 
@@ -227,7 +233,17 @@ setup_monitor(
 			perror(ctx->mntpoint);
 			break;
 		}
-		return -1;
+		return MON_ERROR;
+	}
+
+	/*
+	 * At this point, we know that the kernel is capable of repairing the
+	 * filesystem and telling us that it needs repairs.  If the user only
+	 * wanted us to check for the capability, we're done.
+	 */
+	if (ctx->support_check) {
+		close(mon_fd);
+		return MON_EXIT;
 	}
 
 	/*
@@ -239,7 +255,7 @@ setup_monitor(
 	if (!ctx->mon_fp) {
 		close(mon_fd);
 		perror(ctx->mntpoint);
-		return -1;
+		return MON_ERROR;
 	}
 
 	/* Increase the buffer size so that we can reduce kernel calls */
@@ -258,11 +274,11 @@ setup_monitor(
 		errno = ret;
 		fprintf(stderr, "%s: %s: %s\n", ctx->mntpoint,
 				_("worker threadpool setup"), strerror(errno));
-		return -1;
+		return MON_ERROR;
 	}
 	ctx->queue_active = true;
 
-	return 0;
+	return MON_START;
 }
 
 /* Monitor the given mountpoint for health events. */
@@ -376,6 +392,7 @@ usage(void)
 	fprintf(stderr, _("  --foreground  Process events as soon as possible.\n"));
 	fprintf(stderr, _("  --quiet       Do not log health events to stdout.\n"));
 	fprintf(stderr, _("  --repair      Always repair corrupt metadata.\n"));
+	fprintf(stderr, _("  --supported   Check that health monitoring is supported.\n"));
 	fprintf(stderr, _("  -V            Print version.\n"));
 
 	exit(EXIT_FAILURE);
@@ -388,6 +405,7 @@ enum long_opt_nr {
 	LOPT_HELP,
 	LOPT_QUIET,
 	LOPT_REPAIR,
+	LOPT_SUPPORTED,
 
 	LOPT_MAX,
 };
@@ -418,6 +436,7 @@ main(
 		[LOPT_HELP]	   = {"help", no_argument, NULL, 0 },
 		[LOPT_QUIET]	   = {"quiet", no_argument, &ctx.log, 0 },
 		[LOPT_REPAIR]	   = {"repair", no_argument, &ctx.want_repair, 1 },
+		[LOPT_SUPPORTED]   = {"supported", no_argument, &ctx.support_check, 1 },
 
 		[LOPT_MAX]	   = {NULL, 0, NULL, 0 },
 	};
@@ -461,15 +480,23 @@ main(
 		goto out;
 	}
 
-	ret = setup_monitor(&ctx);
-	if (ret)
-		goto out_events;
+	switch (setup_monitor(&ctx)) {
+	case MON_ERROR:
+		ret = -1;
+		break;
+	case MON_EXIT:
+		ret = 0;
+		break;
+	case MON_START:
+		ret = 0;
+		monitor(&ctx);
+		break;
+	}
 
-	monitor(&ctx);
-
-out_events:
 	teardown_monitor(&ctx);
 	free((char *)ctx.fsname);
 out:
+	if (ctx.support_check)
+		return systemd_service_exit_now(ret);
 	return systemd_service_exit(ret);
 }
diff --git a/healer/xfs_healer@.service.in b/healer/xfs_healer@.service.in
index 385257872b0cbb..53f89cf9c4333d 100644
--- a/healer/xfs_healer@.service.in
+++ b/healer/xfs_healer@.service.in
@@ -17,6 +17,7 @@ RequiresMountsFor=%f
 [Service]
 Type=exec
 Environment=SERVICE_MODE=1
+ExecCondition=@pkg_libexec_dir@/xfs_healer --supported %f
 ExecStart=@pkg_libexec_dir@/xfs_healer %f
 SyslogIdentifier=%N
 


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 15/26] xfs_healer: use the autofsck fsproperty to select mode
  2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
                     ` (13 preceding siblings ...)
  2026-03-03  0:37   ` [PATCH 14/26] xfs_healer: don't start service if kernel support unavailable Darrick J. Wong
@ 2026-03-03  0:37   ` Darrick J. Wong
  2026-03-03 15:50     ` Christoph Hellwig
  2026-03-03  0:38   ` [PATCH 16/26] xfs_healer: run full scrub after lost corruption events or targeted repair failure Darrick J. Wong
                     ` (10 subsequent siblings)
  25 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:37 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Make the xfs_healer background service query the autofsck filesystem
property to figure out which operating mode it should use.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 healer/xfs_healer.h    |    1 
 libfrog/fsproperties.h |    5 ++
 healer/xfs_healer.c    |  102 +++++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 105 insertions(+), 3 deletions(-)


diff --git a/healer/xfs_healer.h b/healer/xfs_healer.h
index 93cca394e9fdd1..3a8f9a67beb7b6 100644
--- a/healer/xfs_healer.h
+++ b/healer/xfs_healer.h
@@ -27,6 +27,7 @@ struct healer_ctx {
 	int			foreground;
 	int			want_repair;
 	int			support_check;
+	int			autofsck;
 
 	/* fd and fs geometry for mount */
 	struct xfs_fd		mnt;
diff --git a/libfrog/fsproperties.h b/libfrog/fsproperties.h
index 11d6530bc9a6d6..1cf90d058765b2 100644
--- a/libfrog/fsproperties.h
+++ b/libfrog/fsproperties.h
@@ -52,6 +52,11 @@ bool fsprop_validate(const char *name, const char *value);
 
 #define FSPROP_AUTOFSCK_NAME		"autofsck"
 
+/* filesystem property name for fgetxattr */
+#define VFS_FSPROP_AUTOFSCK_NAME	(FSPROP_NAMESPACE \
+					 FSPROP_NAME_PREFIX \
+					 FSPROP_AUTOFSCK_NAME)
+
 enum fsprop_autofsck {
 	FSPROP_AUTOFSCK_UNSET = 0,	/* do not set property */
 	FSPROP_AUTOFSCK_NONE,		/* no background scrubs */
diff --git a/healer/xfs_healer.c b/healer/xfs_healer.c
index 69e5368f6ee794..975d28789d5e14 100644
--- a/healer/xfs_healer.c
+++ b/healer/xfs_healer.c
@@ -6,6 +6,7 @@
 #include "xfs.h"
 #include <pthread.h>
 #include <stdlib.h>
+#include <sys/xattr.h>
 
 #include "platform_defs.h"
 #include "libfrog/fsgeom.h"
@@ -13,6 +14,7 @@
 #include "libfrog/healthevent.h"
 #include "libfrog/workqueue.h"
 #include "libfrog/systemd.h"
+#include "libfrog/fsproperties.h"
 #include "xfs_healer.h"
 
 /* Program name; needed for libfrog error reports. */
@@ -154,6 +156,63 @@ healer_nproc(
 	return ctx->foreground ? platform_nproc() : 1;
 }
 
+enum want_repair {
+	WR_REPAIR,
+	WR_LOG_ONLY,
+	WR_EXIT,
+};
+
+/* Determine want_repair from the autofsck filesystem property. */
+static enum want_repair
+want_repair_from_autofsck(
+	struct healer_ctx	*ctx)
+{
+	char			valuebuf[FSPROP_MAX_VALUELEN + 1] = { 0 };
+	enum fsprop_autofsck	shval;
+	ssize_t			ret;
+
+	/*
+	 * Any OS error (including ENODATA) or string parsing error is treated
+	 * the same as an unrecognized value.
+	 */
+	ret = fgetxattr(ctx->mnt.fd, VFS_FSPROP_AUTOFSCK_NAME, valuebuf,
+			FSPROP_MAX_VALUELEN);
+	if (ret < 0)
+		goto no_advice;
+
+	shval = fsprop_autofsck_read(valuebuf);
+	switch (shval) {
+	case FSPROP_AUTOFSCK_NONE:
+		/* don't run at all */
+		ret = WR_EXIT;
+		break;
+	case FSPROP_AUTOFSCK_CHECK:
+	case FSPROP_AUTOFSCK_OPTIMIZE:
+		/* log events, do not repair */
+		ret = WR_LOG_ONLY;
+		break;
+	case FSPROP_AUTOFSCK_REPAIR:
+		/* repair stuff */
+		ret = WR_REPAIR;
+		break;
+	case FSPROP_AUTOFSCK_UNSET:
+		goto no_advice;
+	}
+
+	return ret;
+
+no_advice:
+	/*
+	 * For an unrecognized value, log but do not fix runtime corruption if
+	 * backref metadata are enabled.  If no backref metadata are available,
+	 * the fs is too old so don't run at all.
+	 */
+	if (healer_has_rmapbt(ctx) || healer_has_parent(ctx))
+		return WR_LOG_ONLY;
+
+	return WR_EXIT;
+}
+
 enum mon_state {
 	MON_START,
 	MON_EXIT,
@@ -175,15 +234,46 @@ setup_monitor(
 		return MON_ERROR;
 	}
 
-	if (ctx->want_repair) {
-		/* Check that the kernel supports repairs at all. */
-		if (!healer_can_repair(ctx)) {
+	if (ctx->autofsck) {
+		switch (want_repair_from_autofsck(ctx)) {
+		case WR_EXIT:
+			printf("%s: %s\n", ctx->mntpoint,
+ _("Disabling daemon per autofsck directive."));
+			fflush(stdout);
+			close(ctx->mnt.fd);
+			return MON_EXIT;
+		case WR_REPAIR:
+			ctx->want_repair = 1;
+			printf("%s: %s\n", ctx->mntpoint,
+ _("Automatically repairing per autofsck directive."));
+			fflush(stdout);
+			break;
+		case WR_LOG_ONLY:
+			ctx->want_repair = 0;
+			ctx->log = 1;
+			printf("%s: %s\n", ctx->mntpoint,
+ _("Only logging errors per autofsck directive."));
+			fflush(stdout);
+			break;
+		}
+	}
+
+	/* Check that the kernel supports repairs at all. */
+	if (ctx->want_repair && !healer_can_repair(ctx)) {
+		if (!ctx->autofsck) {
 			fprintf(stderr, "%s: %s\n", ctx->mntpoint,
  _("XFS online repair is not supported, exiting"));
 			close(ctx->mnt.fd);
 			return MON_ERROR;
 		}
 
+		printf("%s: %s\n", ctx->mntpoint,
+ _("XFS online repair is not supported, will report only"));
+		fflush(stdout);
+		ctx->want_repair = 0;
+	}
+
+	if (ctx->want_repair) {
 		/* Check for backref metadata that makes repair effective. */
 		if (!healer_has_rmapbt(ctx))
 			fprintf(stderr, "%s: %s\n", ctx->mntpoint,
@@ -390,6 +480,7 @@ usage(void)
 	fprintf(stderr, _("  --debug       Enable debugging messages.\n"));
 	fprintf(stderr, _("  --everything  Capture all events.\n"));
 	fprintf(stderr, _("  --foreground  Process events as soon as possible.\n"));
+	fprintf(stderr, _("  --no-autofsck Do not use the \"autofsck\" fs property to decide to repair.\n"));
 	fprintf(stderr, _("  --quiet       Do not log health events to stdout.\n"));
 	fprintf(stderr, _("  --repair      Always repair corrupt metadata.\n"));
 	fprintf(stderr, _("  --supported   Check that health monitoring is supported.\n"));
@@ -403,6 +494,7 @@ enum long_opt_nr {
 	LOPT_EVERYTHING,
 	LOPT_FOREGROUND,
 	LOPT_HELP,
+	LOPT_NO_AUTOFSCK,
 	LOPT_QUIET,
 	LOPT_REPAIR,
 	LOPT_SUPPORTED,
@@ -418,6 +510,7 @@ main(
 	struct healer_ctx	ctx = {
 		.conlock	= (pthread_mutex_t)PTHREAD_MUTEX_INITIALIZER,
 		.log		= 1,
+		.autofsck	= 1,
 	};
 	int			option_index;
 	int			vflag = 0;
@@ -434,6 +527,7 @@ main(
 		[LOPT_EVERYTHING]  = {"everything", no_argument, &ctx.everything, 1 },
 		[LOPT_FOREGROUND]  = {"foreground", no_argument, &ctx.foreground, 1 },
 		[LOPT_HELP]	   = {"help", no_argument, NULL, 0 },
+		[LOPT_NO_AUTOFSCK] = {"no-autofsck", no_argument, &ctx.autofsck, 0 },
 		[LOPT_QUIET]	   = {"quiet", no_argument, &ctx.log, 0 },
 		[LOPT_REPAIR]	   = {"repair", no_argument, &ctx.want_repair, 1 },
 		[LOPT_SUPPORTED]   = {"supported", no_argument, &ctx.support_check, 1 },
@@ -470,6 +564,8 @@ main(
 
 	if (optind != argc - 1)
 		usage();
+	if (ctx.want_repair)
+		ctx.autofsck = 0;
 
 	ctx.mntpoint = argv[optind];
 	ctx.fsname = find_fsname(ctx.mntpoint);


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 16/26] xfs_healer: run full scrub after lost corruption events or targeted repair failure
  2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
                     ` (14 preceding siblings ...)
  2026-03-03  0:37   ` [PATCH 15/26] xfs_healer: use the autofsck fsproperty to select mode Darrick J. Wong
@ 2026-03-03  0:38   ` Darrick J. Wong
  2026-03-03 15:50     ` Christoph Hellwig
  2026-03-03  0:38   ` [PATCH 17/26] xfs_healer: use getmntent to find moved filesystems Darrick J. Wong
                     ` (9 subsequent siblings)
  25 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:38 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

If we fail to perform a spot repair of metadata or the kernel tells us
that it lost corruption events due to queue limits, initiate a full run
of the online fsck service to try to fix the error.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 healer/xfs_healer.h  |    3 ++
 healer/Makefile      |    2 +
 healer/fsrepair.c    |   81 +++++++++++++++++++++++++++++++++++++++++++++-----
 healer/weakhandle.c  |   13 ++++++++
 healer/xfs_healer.c  |    7 ++++
 include/builddefs.in |    1 +
 scrub/Makefile       |    7 ++--
 7 files changed, 102 insertions(+), 12 deletions(-)


diff --git a/healer/xfs_healer.h b/healer/xfs_healer.h
index 3a8f9a67beb7b6..5e9fd7fec904ab 100644
--- a/healer/xfs_healer.h
+++ b/healer/xfs_healer.h
@@ -71,6 +71,7 @@ void lookup_path(struct healer_ctx *ctx,
 int repair_metadata(struct healer_ctx *ctx, const struct hme_prefix *pfx,
 		const struct xfs_health_monitor_event *hme);
 bool healer_can_repair(struct healer_ctx *ctx);
+void run_full_repair(struct healer_ctx *ctx);
 
 /* weakhandle.c */
 int weakhandle_alloc(int fd, const char *mountpoint, const char *fsname,
@@ -79,5 +80,7 @@ int weakhandle_reopen(struct weakhandle *wh, int *fd);
 void weakhandle_free(struct weakhandle **whp);
 int weakhandle_getpath_for(struct weakhandle *wh, uint64_t ino, uint32_t gen,
 		char *path, size_t pathlen);
+int weakhandle_instance_unit_name(struct weakhandle *wh, const char *template,
+		char *unitname, size_t unitnamelen);
 
 #endif /* XFS_HEALER_XFS_HEALER_H_ */
diff --git a/healer/Makefile b/healer/Makefile
index 53cc787c6fcd0c..f7ee911fe11f92 100644
--- a/healer/Makefile
+++ b/healer/Makefile
@@ -19,6 +19,8 @@ xfs_healer.c
 HFILES = \
 xfs_healer.h
 
+CFLAGS+=-DXFS_SCRUB_SVCNAME=\"$(XFS_SCRUB_SVCNAME)\"
+
 LLDLIBS += $(LIBHANDLE) $(LIBFROG) $(LIBURCU) $(LIBPTHREAD)
 LTDEPENDENCIES += $(LIBHANDLE) $(LIBFROG)
 LLDFLAGS = -static
diff --git a/healer/fsrepair.c b/healer/fsrepair.c
index 4534104f8a6ac1..9f8c128e395ebc 100644
--- a/healer/fsrepair.c
+++ b/healer/fsrepair.c
@@ -9,8 +9,14 @@
 #include "libfrog/fsgeom.h"
 #include "libfrog/workqueue.h"
 #include "libfrog/healthevent.h"
+#include "libfrog/systemd.h"
 #include "xfs_healer.h"
 
+enum what_next {
+	NEED_FULL_REPAIR,
+	REPAIR_DONE,
+};
+
 /* Translate scrub output flags to outcome. */
 static enum repair_outcome from_repair_oflags(uint32_t oflags)
 {
@@ -61,7 +67,7 @@ xfs_repair_metadata(
 }
 
 /* React to a fs-domain corruption event by repairing it. */
-static void
+static enum what_next
 try_repair_wholefs(
 	struct healer_ctx			*ctx,
 	const struct hme_prefix			*pfx,
@@ -90,11 +96,16 @@ try_repair_wholefs(
 		pthread_mutex_lock(&ctx->conlock);
 		report_health_repair(pfx, hme, f->event_mask, outcome);
 		pthread_mutex_unlock(&ctx->conlock);
+
+		if (outcome == REPAIR_FAILED)
+			return NEED_FULL_REPAIR;
 	}
+
+	return REPAIR_DONE;
 }
 
 /* React to an ag corruption event by repairing it. */
-static void
+static enum what_next
 try_repair_ag(
 	struct healer_ctx			*ctx,
 	const struct hme_prefix			*pfx,
@@ -126,11 +137,16 @@ try_repair_ag(
 		pthread_mutex_lock(&ctx->conlock);
 		report_health_repair(pfx, hme, f->event_mask, outcome);
 		pthread_mutex_unlock(&ctx->conlock);
+
+		if (outcome == REPAIR_FAILED)
+			return NEED_FULL_REPAIR;
 	}
+
+	return REPAIR_DONE;
 }
 
 /* React to a rtgroup corruption event by repairing it. */
-static void
+static enum what_next
 try_repair_rtgroup(
 	struct healer_ctx			*ctx,
 	const struct hme_prefix			*pfx,
@@ -157,11 +173,16 @@ try_repair_rtgroup(
 		pthread_mutex_lock(&ctx->conlock);
 		report_health_repair(pfx, hme, f->event_mask, outcome);
 		pthread_mutex_unlock(&ctx->conlock);
+
+		if (outcome == REPAIR_FAILED)
+			return NEED_FULL_REPAIR;
 	}
+
+	return REPAIR_DONE;
 }
 
 /* React to a inode-domain corruption event by repairing it. */
-static void
+static enum what_next
 try_repair_inode(
 	struct healer_ctx			*ctx,
 	const struct hme_prefix			*orig_pfx,
@@ -204,7 +225,12 @@ try_repair_inode(
 		pthread_mutex_lock(&ctx->conlock);
 		report_health_repair(pfx, hme, f->event_mask, outcome);
 		pthread_mutex_unlock(&ctx->conlock);
+
+		if (outcome == REPAIR_FAILED)
+			return NEED_FULL_REPAIR;
 	}
+
+	return REPAIR_DONE;
 }
 
 /* Repair a metadata corruption. */
@@ -214,6 +240,7 @@ repair_metadata(
 	const struct hme_prefix			*pfx,
 	const struct xfs_health_monitor_event	*hme)
 {
+	enum what_next				what_next;
 	int					repair_fd;
 	int					ret;
 
@@ -227,19 +254,25 @@ repair_metadata(
 
 	switch (hme->domain) {
 	case XFS_HEALTH_MONITOR_DOMAIN_FS:
-		try_repair_wholefs(ctx, pfx, repair_fd, hme);
+		what_next = try_repair_wholefs(ctx, pfx, repair_fd, hme);
 		break;
 	case XFS_HEALTH_MONITOR_DOMAIN_AG:
-		try_repair_ag(ctx, pfx, repair_fd, hme);
+		what_next = try_repair_ag(ctx, pfx, repair_fd, hme);
 		break;
 	case XFS_HEALTH_MONITOR_DOMAIN_RTGROUP:
-		try_repair_rtgroup(ctx, pfx, repair_fd, hme);
+		what_next = try_repair_rtgroup(ctx, pfx, repair_fd, hme);
 		break;
 	case XFS_HEALTH_MONITOR_DOMAIN_INODE:
-		try_repair_inode(ctx, pfx, repair_fd, hme);
+		what_next = try_repair_inode(ctx, pfx, repair_fd, hme);
 		break;
+	default:
+		what_next = REPAIR_DONE;
 	}
 
+	/* Transform into a full repair if we failed to fix this item. */
+	if (what_next == NEED_FULL_REPAIR)
+		run_full_repair(ctx);
+
 	close(repair_fd);
 	return 0;
 }
@@ -259,3 +292,35 @@ healer_can_repair(
 	ret = ioctl(ctx->mnt.fd, XFS_IOC_SCRUB_METADATA, &sm);
 	return ret ? false : true;
 }
+
+/* Run a full repair of the filesystem using the background fsck service. */
+void
+run_full_repair(
+	struct healer_ctx	*ctx)
+{
+	char			unitname[PATH_MAX];
+	int			ret;
+
+	ret = weakhandle_instance_unit_name(ctx->wh, XFS_SCRUB_SVCNAME,
+			unitname, PATH_MAX);
+	if (ret) {
+		fprintf(stderr, "%s: %s\n", ctx->mntpoint,
+				_("Could not determine xfs_scrub unit name."));
+		return;
+	}
+
+	/*
+	 * Scrub could already be repairing something, so try to start the unit
+	 * and be content if it's already running.
+	 */
+	ret = systemd_manage_unit(UM_START, unitname);
+	if (ret) {
+		fprintf(stderr, "%s: %s: %s\n", ctx->mntpoint,
+				_("Could not start xfs_scrub service unit"),
+				unitname);
+		return;
+	}
+
+	printf("%s: %s\n", ctx->mntpoint, _("Full repairs in progress."));
+	fflush(stdout);
+}
diff --git a/healer/weakhandle.c b/healer/weakhandle.c
index 8950e0eb1e5a43..849aa2882700d4 100644
--- a/healer/weakhandle.c
+++ b/healer/weakhandle.c
@@ -13,6 +13,7 @@
 #include "libfrog/workqueue.h"
 #include "libfrog/getparents.h"
 #include "libfrog/paths.h"
+#include "libfrog/systemd.h"
 #include "xfs_healer.h"
 
 struct weakhandle {
@@ -199,3 +200,15 @@ weakhandle_getpath_for(
 	close(mnt_fd);
 	return ret;
 }
+
+/* Compute the systemd instance unit name for this mountpoint. */
+int
+weakhandle_instance_unit_name(
+	struct weakhandle	*wh,
+	const char		*template,
+	char			*unitname,
+	size_t			unitnamelen)
+{
+	return systemd_path_instance_unit_name(template, wh->mntpoint,
+			unitname, unitnamelen);
+}
diff --git a/healer/xfs_healer.c b/healer/xfs_healer.c
index 975d28789d5e14..2901ed0bbe219e 100644
--- a/healer/xfs_healer.c
+++ b/healer/xfs_healer.c
@@ -138,6 +138,13 @@ handle_event(
 		pthread_mutex_unlock(&ctx->conlock);
 	}
 
+	/*
+	 * If we didn't ask for all the metadata reports (including the healthy
+	 * ones) and the kernel tells us it lost something, run the full scan.
+	 */
+	if (hme->type == XFS_HEALTH_MONITOR_TYPE_LOST && !ctx->everything)
+		run_full_repair(ctx);
+
 	/* Initiate a repair if appropriate. */
 	if (will_repair)
 		repair_metadata(ctx, &pfx, hme);
diff --git a/include/builddefs.in b/include/builddefs.in
index 51d24dd854bc17..b5ace90f53a46e 100644
--- a/include/builddefs.in
+++ b/include/builddefs.in
@@ -62,6 +62,7 @@ MKFS_CFG_DIR	= @datadir@/@pkg_name@/mkfs
 PKG_STATE_DIR	= @localstatedir@/lib/@pkg_name@
 
 XFS_SCRUB_ALL_AUTO_MEDIA_SCAN_STAMP=$(PKG_STATE_DIR)/xfs_scrub_all_media.stamp
+XFS_SCRUB_SVCNAME=xfs_scrub@.service
 
 CC		= @cc@
 BUILD_CC	= @BUILD_CC@
diff --git a/scrub/Makefile b/scrub/Makefile
index ff79a265762332..aee49bfce100e2 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -8,7 +8,6 @@ include $(builddefs)
 
 SCRUB_PREREQS=$(HAVE_GETFSMAP)
 
-scrub_svcname=xfs_scrub@.service
 scrub_media_svcname=xfs_scrub_media@.service
 
 ifeq ($(SCRUB_PREREQS),yes)
@@ -21,7 +20,7 @@ XFS_SCRUB_SERVICE_ARGS = -b -o autofsck
 ifeq ($(HAVE_SYSTEMD),yes)
 INSTALL_SCRUB += install-systemd
 SYSTEMD_SERVICES=\
-	$(scrub_svcname) \
+	$(XFS_SCRUB_SVCNAME) \
 	xfs_scrub_fail@.service \
 	$(scrub_media_svcname) \
 	xfs_scrub_media_fail@.service \
@@ -123,7 +122,7 @@ xfs_scrub_all.timer: xfs_scrub_all.timer.in $(builddefs)
 $(XFS_SCRUB_ALL_PROG): $(XFS_SCRUB_ALL_PROG).in $(builddefs) $(TOPDIR)/libfrog/gettext.py
 	@echo "    [SED]    $@"
 	$(Q)$(SED) -e "s|@sbindir@|$(PKG_SBIN_DIR)|g" \
-		   -e "s|@scrub_svcname@|$(scrub_svcname)|g" \
+		   -e "s|@scrub_svcname@|$(XFS_SCRUB_SVCNAME)|g" \
 		   -e "s|@scrub_media_svcname@|$(scrub_media_svcname)|g" \
 		   -e "s|@pkg_version@|$(PKG_VERSION)|g" \
 		   -e "s|@stampfile@|$(XFS_SCRUB_ALL_AUTO_MEDIA_SCAN_STAMP)|g" \
@@ -137,7 +136,7 @@ $(XFS_SCRUB_ALL_PROG): $(XFS_SCRUB_ALL_PROG).in $(builddefs) $(TOPDIR)/libfrog/g
 xfs_scrub_fail: xfs_scrub_fail.in $(builddefs)
 	@echo "    [SED]    $@"
 	$(Q)$(SED) -e "s|@sbindir@|$(PKG_SBIN_DIR)|g" \
-		   -e "s|@scrub_svcname@|$(scrub_svcname)|g" \
+		   -e "s|@scrub_svcname@|$(XFS_SCRUB_SVCNAME)|g" \
 		   -e "s|@pkg_version@|$(PKG_VERSION)|g"  < $< > $@
 	$(Q)chmod a+x $@
 


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 17/26] xfs_healer: use getmntent to find moved filesystems
  2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
                     ` (15 preceding siblings ...)
  2026-03-03  0:38   ` [PATCH 16/26] xfs_healer: run full scrub after lost corruption events or targeted repair failure Darrick J. Wong
@ 2026-03-03  0:38   ` Darrick J. Wong
  2026-03-03 15:51     ` Christoph Hellwig
  2026-03-03  0:38   ` [PATCH 18/26] xfs_healer: validate that repair fds point to the monitored fs Darrick J. Wong
                     ` (8 subsequent siblings)
  25 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:38 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

It's possible that a mounted filesystem can move mountpoints between the
time of the initial mount (at which point xfs_healer starts) and when
it actually wants to start a repair.  When this happens,
weakhandle::mountpoint becomes obsolete and opening it will either fail
with ENOENT or the handle revalidation will return ESTALE.

However, we do still have a means to find the mounted filesystem -- the
fsname parameter (aka the path to the data device at mount time).  This
is record in /proc/mounts, which means that we can iterate getmntent to
see if we can find the mount elsewhere.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 healer/weakhandle.c |   50 ++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 46 insertions(+), 4 deletions(-)


diff --git a/healer/weakhandle.c b/healer/weakhandle.c
index 849aa2882700d4..8ca4ef847188ba 100644
--- a/healer/weakhandle.c
+++ b/healer/weakhandle.c
@@ -65,10 +65,14 @@ weakhandle_alloc(
 	return -1;
 }
 
-/* Reopen a file handle obtained via weak reference. */
-int
-weakhandle_reopen(
+/*
+ * Reopen a file handle obtained via weak reference, using the given path to a
+ * mount point.
+ */
+static int
+weakhandle_reopen_from(
 	struct weakhandle	*wh,
+	const char		*path,
 	int			*fd)
 {
 	void			*hanp;
@@ -78,7 +82,7 @@ weakhandle_reopen(
 
 	*fd = -1;
 
-	mnt_fd = open(wh->mntpoint, O_RDONLY);
+	mnt_fd = open(path, O_RDONLY);
 	if (mnt_fd < 0)
 		return -1;
 
@@ -102,6 +106,44 @@ weakhandle_reopen(
 	return -1;
 }
 
+/* Reopen a file handle obtained via weak reference. */
+int
+weakhandle_reopen(
+	struct weakhandle	*wh,
+	int			*fd)
+{
+	FILE			*mtab;
+	struct mntent		*mount;
+	int			ret;
+
+	ret = weakhandle_reopen_from(wh, wh->mntpoint, fd);
+	if (!ret)
+		return 0;
+
+	mtab = setmntent(_PATH_PROC_MOUNTS, "r");
+	if (!mtab)
+		return -1;
+
+	while ((mount = getmntent(mtab)) != NULL) {
+		if (strcmp(mount->mnt_type, "xfs"))
+			continue;
+		if (strcmp(mount->mnt_fsname, wh->fsname))
+			continue;
+
+		ret = weakhandle_reopen_from(wh, mount->mnt_dir, fd);
+		if (!ret)
+			break;
+	}
+
+	if (*fd < 0) {
+		errno = ESTALE;
+		ret = -1;
+	}
+
+	endmntent(mtab);
+	return ret;
+}
+
 /* Tear down a weak handle */
 void
 weakhandle_free(


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 18/26] xfs_healer: validate that repair fds point to the monitored fs
  2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
                     ` (16 preceding siblings ...)
  2026-03-03  0:38   ` [PATCH 17/26] xfs_healer: use getmntent to find moved filesystems Darrick J. Wong
@ 2026-03-03  0:38   ` Darrick J. Wong
  2026-03-03 15:52     ` Christoph Hellwig
  2026-03-03  0:38   ` [PATCH 19/26] xfs_healer: add a manual page Darrick J. Wong
                     ` (7 subsequent siblings)
  25 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:38 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

When xfs_healer reopens a mountpoint to perform a repair, it should
validate that the opened fd points to a file on the same filesystem as
the one being monitored.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 healer/xfs_healer.h |    4 +++-
 healer/fsrepair.c   |   18 +++++++++++++++++-
 healer/weakhandle.c |   20 +++++++++++++++-----
 3 files changed, 35 insertions(+), 7 deletions(-)


diff --git a/healer/xfs_healer.h b/healer/xfs_healer.h
index 5e9fd7fec904ab..d613c4f65fe9eb 100644
--- a/healer/xfs_healer.h
+++ b/healer/xfs_healer.h
@@ -76,7 +76,9 @@ void run_full_repair(struct healer_ctx *ctx);
 /* weakhandle.c */
 int weakhandle_alloc(int fd, const char *mountpoint, const char *fsname,
 		struct weakhandle **whp);
-int weakhandle_reopen(struct weakhandle *wh, int *fd);
+typedef bool (*weakhandle_fd_t)(int mnt_fd, void *data);
+int weakhandle_reopen(struct weakhandle *wh, int *fd,
+		weakhandle_fd_t is_acceptable, void *data);
 void weakhandle_free(struct weakhandle **whp);
 int weakhandle_getpath_for(struct weakhandle *wh, uint64_t ino, uint32_t gen,
 		char *path, size_t pathlen);
diff --git a/healer/fsrepair.c b/healer/fsrepair.c
index 9f8c128e395ebc..002e5e78fcf22e 100644
--- a/healer/fsrepair.c
+++ b/healer/fsrepair.c
@@ -233,6 +233,22 @@ try_repair_inode(
 	return REPAIR_DONE;
 }
 
+/* Make sure the reopened file is on the same fs as the monitor. */
+static bool
+is_same_fs(
+	int				mnt_fd,
+	void				*data)
+{
+	struct xfs_health_file_on_monitored_fs hms = {
+		.fd = mnt_fd,
+	};
+	FILE				*mon_fp = data;
+	int				ret;
+
+	ret = ioctl(fileno(mon_fp), XFS_IOC_HEALTH_FD_ON_MONITORED_FS, &hms);
+	return ret == 0;
+}
+
 /* Repair a metadata corruption. */
 int
 repair_metadata(
@@ -244,7 +260,7 @@ repair_metadata(
 	int					repair_fd;
 	int					ret;
 
-	ret = weakhandle_reopen(ctx->wh, &repair_fd);
+	ret = weakhandle_reopen(ctx->wh, &repair_fd, is_same_fs, ctx->mon_fp);
 	if (ret) {
 		fprintf(stderr, "%s: %s: %s\n", ctx->mntpoint,
 				_("cannot open filesystem to repair"),
diff --git a/healer/weakhandle.c b/healer/weakhandle.c
index 8ca4ef847188ba..4b0e2e991702ca 100644
--- a/healer/weakhandle.c
+++ b/healer/weakhandle.c
@@ -73,7 +73,9 @@ static int
 weakhandle_reopen_from(
 	struct weakhandle	*wh,
 	const char		*path,
-	int			*fd)
+	int			*fd,
+	weakhandle_fd_t		is_acceptable,
+	void			*data)
 {
 	void			*hanp;
 	size_t			hlen;
@@ -95,6 +97,11 @@ weakhandle_reopen_from(
 		goto out_handle;
 	}
 
+	if (is_acceptable && !is_acceptable(mnt_fd, data)) {
+		errno = ESTALE;
+		goto out_handle;
+	}
+
 	free_handle(hanp, hlen);
 	*fd = mnt_fd;
 	return 0;
@@ -110,13 +117,15 @@ weakhandle_reopen_from(
 int
 weakhandle_reopen(
 	struct weakhandle	*wh,
-	int			*fd)
+	int			*fd,
+	weakhandle_fd_t		is_acceptable,
+	void			*data)
 {
 	FILE			*mtab;
 	struct mntent		*mount;
 	int			ret;
 
-	ret = weakhandle_reopen_from(wh, wh->mntpoint, fd);
+	ret = weakhandle_reopen_from(wh, wh->mntpoint, fd, is_acceptable, data);
 	if (!ret)
 		return 0;
 
@@ -130,7 +139,8 @@ weakhandle_reopen(
 		if (strcmp(mount->mnt_fsname, wh->fsname))
 			continue;
 
-		ret = weakhandle_reopen_from(wh, mount->mnt_dir, fd);
+		ret = weakhandle_reopen_from(wh, mount->mnt_dir, fd,
+				is_acceptable, data);
 		if (!ret)
 			break;
 	}
@@ -215,7 +225,7 @@ weakhandle_getpath_for(
 	fakehandle.ha_fid.fid_ino = ino;
 	fakehandle.ha_fid.fid_gen = gen;
 
-	ret = weakhandle_reopen(wh, &mnt_fd);
+	ret = weakhandle_reopen(wh, &mnt_fd, NULL, NULL);
 	if (ret)
 		return ret;
 


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 19/26] xfs_healer: add a manual page
  2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
                     ` (17 preceding siblings ...)
  2026-03-03  0:38   ` [PATCH 18/26] xfs_healer: validate that repair fds point to the monitored fs Darrick J. Wong
@ 2026-03-03  0:38   ` Darrick J. Wong
  2026-03-03 15:52     ` Christoph Hellwig
  2026-03-03  0:39   ` [PATCH 20/26] xfs_scrub: use the verify media ioctl during phase 6 if possible Darrick J. Wong
                     ` (6 subsequent siblings)
  25 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:38 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Add a new section 8 manpage for this service daemon so others can read
about what this program is supposed to do.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 man/man8/Makefile           |   40 +++++++++++++---
 man/man8/xfs_healer.8       |  109 +++++++++++++++++++++++++++++++++++++++++++
 man/man8/xfs_healer_start.8 |   37 +++++++++++++++
 3 files changed, 180 insertions(+), 6 deletions(-)
 create mode 100644 man/man8/xfs_healer.8
 create mode 100644 man/man8/xfs_healer_start.8


diff --git a/man/man8/Makefile b/man/man8/Makefile
index 5be76ab727a1fe..05710f85ae89ad 100644
--- a/man/man8/Makefile
+++ b/man/man8/Makefile
@@ -7,13 +7,41 @@ include $(TOPDIR)/include/builddefs
 
 MAN_SECTION	= 8
 
-ifneq ("$(ENABLE_SCRUB)","yes")
-  MAN_PAGES = $(filter-out xfs_scrub%,$(shell echo *.$(MAN_SECTION)))
-else
-  MAN_PAGES = $(shell echo *.$(MAN_SECTION))
-  MAN_PAGES += xfs_scrub_all.8
+MAN_PAGES = \
+	fsck.xfs.8 \
+	mkfs.xfs.8 \
+	xfs_admin.8 \
+	xfs_bmap.8 \
+	xfs_copy.8 \
+	xfs_db.8 \
+	xfs_estimate.8 \
+	xfs_freeze.8 \
+	xfs_fsr.8 \
+	xfs_growfs.8 \
+	xfs_info.8 \
+	xfs_io.8 \
+	xfs_logprint.8 \
+	xfs_mdrestore.8 \
+	xfs_metadump.8 \
+	xfs_mkfile.8 \
+	xfs_ncheck.8 \
+	xfs_property.8 \
+	xfs_protofile.8 \
+	xfs_quota.8 \
+	xfs_repair.8 \
+	xfs_rtcp.8 \
+	xfs_spaceman.8
+
+ifeq ($(ENABLE_HEALER),yes)
+  MAN_PAGES += xfs_healer.8
 endif
-MAN_PAGES	+= mkfs.xfs.8
+ifeq ($(HAVE_HEALER_START_DEPS),yes)
+  MAN_PAGES += xfs_healer_start.8
+endif
+ifeq ($(ENABLE_SCRUB),yes)
+  MAN_PAGES += xfs_scrub.8 xfs_scrub_all.8
+endif
+
 MAN_DEST	= $(PKG_MAN_DIR)/man$(MAN_SECTION)
 LSRCFILES	= $(MAN_PAGES)
 DIRT		= mkfs.xfs.8 xfs_scrub_all.8
diff --git a/man/man8/xfs_healer.8 b/man/man8/xfs_healer.8
new file mode 100644
index 00000000000000..eea799f7811a4d
--- /dev/null
+++ b/man/man8/xfs_healer.8
@@ -0,0 +1,109 @@
+.TH xfs_healer 8
+.SH NAME
+xfs_healer \- automatically heal damage to XFS filesystem metadata
+.SH SYNOPSIS
+.B xfs_healer
+[
+.B OPTIONS
+]
+.I mount-point
+.br
+.B xfs_healer \-V
+.SH DESCRIPTION
+.B xfs_healer
+is a daemon that tries to automatically repair damaged XFS filesystem metadata.
+.PP
+.B WARNING!
+This program is
+.BR EXPERIMENTAL ","
+which means that its behavior and interface
+could change at any time!
+.PP
+.B xfs_healer
+asks the kernel to report all observations of corrupt metadata, media errors,
+filesystem shutdowns, and file I/O errors.
+The program can respond to runtime metadata corruption errors by initiating
+targeted repairs of the suspect metadata or a full online fsck of the
+filesystem.
+
+Normally this program runs as a systemd service.
+The service is activated via the
+.I xfs_healer_start
+service if systemd is supported.
+
+The kernel may not support repairing or optimizing the filesystem.
+If this is the case, the filesystem must be unmounted and
+.BR xfs_repair (8)
+run on the filesystem to fix the problems.
+.SH OPTIONS
+.TP
+.BI \-\-everything
+Ask the kernel to send us good metadata health events, not only events related
+to metadata corruption, media errors, shutdowns, and I/O errors.
+.TP
+.B \-\-foreground
+Start enough event handling threads to allow consumption of all online CPUs.
+If not specified, start exactly one event handling thread.
+.TP
+.B \-\-no-autofsck
+Do not use the
+.I autofsck
+filesystem property to decide whether or not to repair corrupt metadata.
+If the
+.B \-\-repair
+option is given, then all corruptions will be repaired.
+If the
+.B \-\-repair
+option is not given, then the program will never try to repair the filesystem.
+.TP
+.B \-\-quiet
+Do not print every event to standard output.
+.TP
+.B \-\-repair
+Always try to repair each piece of corrupt metadata when the kernel tells us
+about it.
+If an individual repair fails or the kernel tells us that health events were
+lost, the
+.I xfs_scrub
+service for this mount point will be launched.
+The default is not to try to repair anything.
+If this option is specified but the kernel does not support repairs, the
+program will exit.
+.TP
+.B \-\-supported
+Check if the filesystem supports sending health events.
+Exits with 0 if it does, and non-zero if not.
+.TP
+.BI \-V
+Prints the version number and exit.
+
+.SH AUTOFSCK
+By default, this program will read the
+.I autofsck
+filesystem property to decide if it should try to repair corruptions.
+If the property is set to the value
+.B repair
+then corruptions will be repaired.
+If the property is not set but the filesystem supports all back-reference
+metadata (reverse mappings and parent pointers), then corruptions will be
+repaired.
+
+See the
+.BR xfs_scrub (8)
+manual page for more details on this filesystem property.
+
+.SH CAVEATS
+.B xfs_healer
+is an immature utility!
+Do not run this program unless you have backups of your data!
+This program takes advantage of in-kernel scrubbing to verify a given
+data structure with locks held and can keep the filesystem busy for a
+long time.
+The kernel must be new enough to support the SCRUB_METADATA ioctl.
+.PP
+If errors are found and cannot be repaired, the filesystem must be
+unmounted and repaired.
+.SH SEE ALSO
+.BR xfs_repair (8)
+and
+.BR xfs_scrub (8).
diff --git a/man/man8/xfs_healer_start.8 b/man/man8/xfs_healer_start.8
new file mode 100644
index 00000000000000..9e424432a513fe
--- /dev/null
+++ b/man/man8/xfs_healer_start.8
@@ -0,0 +1,37 @@
+.TH xfs_healer_start 8
+.SH NAME
+xfs_healer_start \- starts xfs_healer instances
+.SH SYNOPSIS
+.B xfs_healer_start
+[
+.B OPTIONS
+]
+.br
+.B xfs_healer \-V
+.SH DESCRIPTION
+.B xfs_healer_start
+starts the xfs_healer service whenever the kernel mounts an XFS filesystem in
+the current mount namespace.
+.PP
+.B WARNING!
+This program is
+.BR EXPERIMENTAL ","
+which means that its behavior and interface
+could change at any time!
+
+Normally this program runs as a systemd service.
+
+.SH OPTIONS
+.TP
+.B \-\-supported
+Check if the kernel supports listening for mount events.
+Exits with 0 if it does, and non-zero if not.
+.TP
+.BI "\-\-mountns " path
+Monitor the given mount namespace.
+Defaults to the mount namespace associated with the process itself.
+.TP
+.BI \-V
+Prints the version number and exit.
+.SH SEE ALSO
+.BR xfs_healer (8).


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 20/26] xfs_scrub: use the verify media ioctl during phase 6 if possible
  2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
                     ` (18 preceding siblings ...)
  2026-03-03  0:38   ` [PATCH 19/26] xfs_healer: add a manual page Darrick J. Wong
@ 2026-03-03  0:39   ` Darrick J. Wong
  2026-03-03 15:53     ` Christoph Hellwig
  2026-03-03  0:39   ` [PATCH 21/26] xfs_scrub: perform media scanning of the log region Darrick J. Wong
                     ` (5 subsequent siblings)
  25 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:39 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

If the kernel suppots the XFS_IOC_VERIFY_MEDIA ioctl, use that to
perform the phase 6 media scan instead of pwrite or the SCSI VERIFY
command.  This enables better integration with xfs_healer and fsnotify;
and reduces the amount of work that userspace has to do.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 scrub/disk.h        |   11 ++++++++++-
 scrub/disk.c        |   40 +++++++++++++++++++++++++++++++++++++++-
 scrub/phase1.c      |   25 +++++++++++++++++++++++++
 scrub/read_verify.c |    2 +-
 4 files changed, 75 insertions(+), 3 deletions(-)


diff --git a/scrub/disk.h b/scrub/disk.h
index 73c73ab57fb5c7..2ae27b73839ad3 100644
--- a/scrub/disk.h
+++ b/scrub/disk.h
@@ -10,18 +10,27 @@
 struct disk {
 	struct stat	d_sb;
 	int		d_fd;
+	int		d_verify_fd;
 	unsigned int	d_lbalog;
 	unsigned int	d_lbasize;	/* bytes */
 	unsigned int	d_flags;
 	unsigned int	d_blksize;	/* bytes */
 	uint64_t	d_size;		/* bytes */
 	uint64_t	d_start;	/* bytes */
+	unsigned int	d_verify_disk;
 };
 
 unsigned int disk_heads(struct disk *disk);
 struct disk *disk_open(const char *pathname);
 int disk_close(struct disk *disk);
 ssize_t disk_read_verify(struct disk *disk, void *buf, uint64_t startblock,
-		uint64_t blockcount);
+		uint64_t blockcount, bool single_step);
+
+static inline void
+disk_config_xfs_verify(struct disk *disk, int mnt_fd, unsigned int verify_disk)
+{
+	disk->d_verify_fd = mnt_fd;
+	disk->d_verify_disk = verify_disk;
+}
 
 #endif /* XFS_SCRUB_DISK_H_ */
diff --git a/scrub/disk.c b/scrub/disk.c
index 2cf84d91887587..4e78bd1cebbdc8 100644
--- a/scrub/disk.c
+++ b/scrub/disk.c
@@ -190,6 +190,7 @@ disk_open(
 	disk = calloc(1, sizeof(struct disk));
 	if (!disk)
 		return NULL;
+	disk->d_verify_fd = -1;
 
 	disk->d_fd = open(pathname, O_RDONLY | O_DIRECT | O_NOATIME);
 	if (disk->d_fd < 0)
@@ -266,6 +267,18 @@ disk_close(
 #define LBASIZE(d)		(1ULL << (d)->d_lbalog)
 #define BTOLBA(d, bytes)	(((uint64_t)(bytes) + LBASIZE(d) - 1) >> (d)->d_lbalog)
 
+#ifndef BTOBB
+# define BTOBB(bytes)		((uint64_t)((bytes) + 511) >> 9)
+#endif
+
+#ifndef BTOBBT
+# define BTOBBT(bytes)		((uint64_t)(bytes) >> 9)
+#endif
+
+#ifndef BBTOB
+# define BBTOB(bytes)		((uint64_t)(bytes) << 9)
+#endif
+
 /* Simulate disk errors. */
 static int
 disk_simulate_read_error(
@@ -329,7 +342,8 @@ disk_read_verify(
 	struct disk		*disk,
 	void			*buf,
 	uint64_t		start,
-	uint64_t		length)
+	uint64_t		length,
+	bool			single_step)
 {
 	if (debug) {
 		int		ret;
@@ -345,6 +359,30 @@ disk_read_verify(
 			return length;
 	}
 
+	if (disk->d_verify_fd >= 0) {
+		const uint64_t	orig_start_daddr = BTOBBT(start);
+		struct xfs_verify_media me = {
+			.me_start_daddr	= orig_start_daddr,
+			.me_end_daddr	= BTOBB(start + length),
+			.me_dev		= disk->d_verify_disk,
+			.me_rest_us	= bg_mode > 2 ? bg_mode - 1 : 0,
+		};
+		int		ret;
+
+		if (single_step)
+			me.me_flags |= XFS_VERIFY_MEDIA_REPORT;
+
+		ret = ioctl(disk->d_verify_fd, XFS_IOC_VERIFY_MEDIA, &me);
+		if (ret < 0)
+			return ret;
+		if (me.me_ioerror) {
+			errno = me.me_ioerror;
+			return -1;
+		}
+
+		return BBTOB(me.me_start_daddr - orig_start_daddr);
+	}
+
 	/* Convert to logical block size. */
 	if (disk->d_flags & DISK_FLAG_SCSI_VERIFY)
 		return disk_scsi_verify(disk, BTOLBAT(disk, start),
diff --git a/scrub/phase1.c b/scrub/phase1.c
index 10e9aa1892b701..093e7a01b9542f 100644
--- a/scrub/phase1.c
+++ b/scrub/phase1.c
@@ -213,6 +213,29 @@ mode_from_autofsck(
 	goto summarize;
 }
 
+/* Does the XFS driver support media scanning its own disks? */
+static void
+configure_xfs_verify(
+	struct scrub_ctx	*ctx)
+{
+	struct xfs_verify_media	me = {
+		.me_start_daddr	= 1,
+		.me_end_daddr	= 0,
+		.me_dev		= XFS_DEV_DATA,
+	};
+	int			ret;
+
+	ret = ioctl(ctx->mnt.fd, XFS_IOC_VERIFY_MEDIA, &me);
+	if (ret < 0)
+		return;
+
+	disk_config_xfs_verify(ctx->datadev, ctx->mnt.fd, XFS_DEV_DATA);
+	if (ctx->logdev)
+		disk_config_xfs_verify(ctx->logdev, ctx->mnt.fd, XFS_DEV_LOG);
+	if (ctx->rtdev)
+		disk_config_xfs_verify(ctx->rtdev, ctx->mnt.fd, XFS_DEV_RT);
+}
+
 /*
  * Bind to the mountpoint, read the XFS geometry, bind to the block devices.
  * Anything we've already built will be cleaned up by scrub_cleanup.
@@ -379,6 +402,8 @@ _("Unable to find realtime device path."));
 		}
 	}
 
+	configure_xfs_verify(ctx);
+
 	/*
 	 * Everything's set up, which means any failures recorded after
 	 * this point are most probably corruption errors (as opposed to
diff --git a/scrub/read_verify.c b/scrub/read_verify.c
index 1219efe2590182..9e1f3ec0ed1186 100644
--- a/scrub/read_verify.c
+++ b/scrub/read_verify.c
@@ -201,7 +201,7 @@ read_verify(
 		dbg_printf("diskverify %d %"PRIu64" %zu\n", rvp->disk->d_fd,
 				rv->io_start, len);
 		sz = disk_read_verify(rvp->disk, rvp->readbuf, rv->io_start,
-				len);
+				len, io_max_size <= rvp->miniosz);
 		if (sz == len && io_max_size < rvp->miniosz) {
 			/*
 			 * If the verify request was 100% successful and less


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 21/26] xfs_scrub: perform media scanning of the log region
  2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
                     ` (19 preceding siblings ...)
  2026-03-03  0:39   ` [PATCH 20/26] xfs_scrub: use the verify media ioctl during phase 6 if possible Darrick J. Wong
@ 2026-03-03  0:39   ` Darrick J. Wong
  2026-03-03 15:54     ` Christoph Hellwig
  2026-03-03  0:39   ` [PATCH 22/26] xfs_io: add listmount command Darrick J. Wong
                     ` (4 subsequent siblings)
  25 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:39 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Scan the log area for media errors because a defect in a region could
prevent the user from being able to perform log recovery.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 scrub/phase6.c |   11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)


diff --git a/scrub/phase6.c b/scrub/phase6.c
index abf6f9713f1a4d..59e05e8aa2f54d 100644
--- a/scrub/phase6.c
+++ b/scrub/phase6.c
@@ -616,9 +616,14 @@ check_rmap(
 			map->fmr_flags);
 
 	/* "Unknown" extents should be verified; they could be data. */
-	if ((map->fmr_flags & FMR_OF_SPECIAL_OWNER) &&
-			map->fmr_owner == XFS_FMR_OWN_UNKNOWN)
-		map->fmr_flags &= ~FMR_OF_SPECIAL_OWNER;
+	if ((map->fmr_flags & FMR_OF_SPECIAL_OWNER)) {
+		switch (map->fmr_owner) {
+		case XFS_FMR_OWN_UNKNOWN:
+		case XFS_FMR_OWN_LOG:
+			map->fmr_flags &= ~FMR_OF_SPECIAL_OWNER;
+			break;
+		}
+	}
 
 	/*
 	 * We only care about read-verifying data extents that have been


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 22/26] xfs_io: add listmount command
  2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
                     ` (20 preceding siblings ...)
  2026-03-03  0:39   ` [PATCH 21/26] xfs_scrub: perform media scanning of the log region Darrick J. Wong
@ 2026-03-03  0:39   ` Darrick J. Wong
  2026-03-03 15:56     ` Christoph Hellwig
  2026-03-03  0:39   ` [PATCH 23/26] xfs_io: print systemd service names Darrick J. Wong
                     ` (3 subsequent siblings)
  25 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:39 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Add a command to list all mounts, now that we use this in
xfs_healer_start.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 io/io.h           |    6 +
 io/Makefile       |    8 +
 io/init.c         |    1 
 io/listmount.c    |  383 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 man/man8/xfs_io.8 |   43 ++++++
 5 files changed, 441 insertions(+)
 create mode 100644 io/listmount.c


diff --git a/io/io.h b/io/io.h
index 0f12b3cfed5e76..5f1f278d14a033 100644
--- a/io/io.h
+++ b/io/io.h
@@ -164,3 +164,9 @@ void			fsprops_init(void);
 void			aginfo_init(void);
 void			healthmon_init(void);
 void			verifymedia_init(void);
+
+#ifdef HAVE_LISTMOUNT
+void			listmount_init(void);
+#else
+# define		listmount_init()	do { } while (0)
+#endif
diff --git a/io/Makefile b/io/Makefile
index 79d5e172b8f31f..4c3359c4d4f7f4 100644
--- a/io/Makefile
+++ b/io/Makefile
@@ -90,6 +90,14 @@ ifeq ($(HAVE_GETFSMAP),yes)
 CFILES += fsmap.c
 endif
 
+ifeq ($(HAVE_LISTMOUNT),yes)
+CFILES += listmount.c
+LCFLAGS += -DHAVE_LISTMOUNT
+ ifeq ($(HAVE_LISTMOUNT_NS_FD),yes)
+  CFLAGS += -DHAVE_LISTMOUNT_NS_FD
+ endif # listmount mnt_ns_fd
+endif
+
 default: depend $(LTCOMMAND)
 
 include $(BUILDRULES)
diff --git a/io/init.c b/io/init.c
index f2a551ef559200..ba60cb2199639b 100644
--- a/io/init.c
+++ b/io/init.c
@@ -94,6 +94,7 @@ init_commands(void)
 	fsprops_init();
 	healthmon_init();
 	verifymedia_init();
+	listmount_init();
 }
 
 /*
diff --git a/io/listmount.c b/io/listmount.c
new file mode 100644
index 00000000000000..f600ce562e63ea
--- /dev/null
+++ b/io/listmount.c
@@ -0,0 +1,383 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2026 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong <djwong@kernel.org>
+ */
+#include "xfs.h"
+
+#include "libfrog/flagmap.h"
+#include "command.h"
+#include "input.h"
+#include "init.h"
+#include "io.h"
+
+/* copied from linux/mount.h in linux 6.18 */
+struct statmount_fixed {
+	__u32 size;		/* Total size, including strings */
+	__u32 mnt_opts;		/* [str] Options (comma separated, escaped) */
+	__u64 mask;		/* What results were written */
+	__u32 sb_dev_major;	/* Device ID */
+	__u32 sb_dev_minor;
+	__u64 sb_magic;		/* ..._SUPER_MAGIC */
+	__u32 sb_flags;		/* SB_{RDONLY,SYNCHRONOUS,DIRSYNC,LAZYTIME} */
+	__u32 fs_type;		/* [str] Filesystem type */
+	__u64 mnt_id;		/* Unique ID of mount */
+	__u64 mnt_parent_id;	/* Unique ID of parent (for root == mnt_id) */
+	__u32 mnt_id_old;	/* Reused IDs used in proc/.../mountinfo */
+	__u32 mnt_parent_id_old;
+	__u64 mnt_attr;		/* MOUNT_ATTR_... */
+	__u64 mnt_propagation;	/* MS_{SHARED,SLAVE,PRIVATE,UNBINDABLE} */
+	__u64 mnt_peer_group;	/* ID of shared peer group */
+	__u64 mnt_master;	/* Mount receives propagation from this ID */
+	__u64 propagate_from;	/* Propagation from in current namespace */
+	__u32 mnt_root;		/* [str] Root of mount relative to root of fs */
+	__u32 mnt_point;	/* [str] Mountpoint relative to current root */
+	__u64 mnt_ns_id;	/* ID of the mount namespace */
+	__u32 fs_subtype;	/* [str] Subtype of fs_type (if any) */
+	__u32 sb_source;	/* [str] Source string of the mount */
+	__u32 opt_num;		/* Number of fs options */
+	__u32 opt_array;	/* [str] Array of nul terminated fs options */
+	__u32 opt_sec_num;	/* Number of security options */
+	__u32 opt_sec_array;	/* [str] Array of nul terminated security options */
+	__u64 supported_mask;	/* Mask flags that this kernel supports */
+	__u32 mnt_uidmap_num;	/* Number of uid mappings */
+	__u32 mnt_uidmap;	/* [str] Array of uid mappings (as seen from callers namespace) */
+	__u32 mnt_gidmap_num;	/* Number of gid mappings */
+	__u32 mnt_gidmap;	/* [str] Array of gid mappings (as seen from callers namespace) */
+	__u64 __spare2[43];
+	char str[];		/* Variable size part containing strings */
+};
+
+#ifndef STATMOUNT_MNT_NS_ID
+#define STATMOUNT_MNT_NS_ID		0x00000040U	/* Want/got mnt_ns_id */
+#endif
+
+#ifndef STATMOUNT_MNT_OPTS
+#define STATMOUNT_MNT_OPTS		0x00000080U	/* Want/got mnt_opts */
+#endif
+
+#ifndef STATMOUNT_FS_SUBTYPE
+#define STATMOUNT_FS_SUBTYPE		0x00000100U	/* Want/got fs_subtype */
+#endif
+
+#ifndef STATMOUNT_SB_SOURCE
+#define STATMOUNT_SB_SOURCE		0x00000200U	/* Want/got sb_source */
+#endif
+
+#ifndef STATMOUNT_OPT_ARRAY
+#define STATMOUNT_OPT_ARRAY		0x00000400U	/* Want/got opt_... */
+#endif
+
+#ifndef STATMOUNT_OPT_SEC_ARRAY
+#define STATMOUNT_OPT_SEC_ARRAY		0x00000800U	/* Want/got opt_sec... */
+#endif
+
+#ifndef STATMOUNT_SUPPORTED_MASK
+#define STATMOUNT_SUPPORTED_MASK	0x00001000U	/* Want/got supported mask flags */
+#endif
+
+static const struct flag_map statmount_funcs[] = {
+	{ STATMOUNT_SB_BASIC,		N_("sb_basic") },
+	{ STATMOUNT_MNT_BASIC,		N_("mnt_basic") },
+	{ STATMOUNT_PROPAGATE_FROM,	N_("propagate_from") },
+	{ STATMOUNT_MNT_ROOT,		N_("mnt_root") },
+	{ STATMOUNT_MNT_POINT,		N_("mnt_point") },
+	{ STATMOUNT_FS_TYPE,		N_("fs_type") },
+	{ STATMOUNT_MNT_NS_ID,		N_("mnt_ns_id") },
+	{ STATMOUNT_MNT_OPTS,		N_("mnt_opts") },
+	{ STATMOUNT_FS_SUBTYPE,		N_("fs_subtype") },
+	{ STATMOUNT_SB_SOURCE,		N_("sb_source") },
+	{ STATMOUNT_OPT_ARRAY,		N_("opt_array") },
+	{ STATMOUNT_OPT_SEC_ARRAY,	N_("opt_sec_array") },
+	{ STATMOUNT_SUPPORTED_MASK,	N_("supported_mask") },
+	{0, NULL},
+};
+
+static const struct flag_map mount_attrs[] = {
+	{ MOUNT_ATTR_RDONLY,		N_("rdonly") },
+	{ MOUNT_ATTR_NOSUID,		N_("nosuid") },
+	{ MOUNT_ATTR_NODEV,		N_("nodev") },
+	{ MOUNT_ATTR_NOEXEC,		N_("noexec") },
+	{ MOUNT_ATTR__ATIME,		N_("atime") },
+	{ MOUNT_ATTR_RELATIME,		N_("relatime") },
+	{ MOUNT_ATTR_NOATIME,		N_("noatime") },
+	{ MOUNT_ATTR_STRICTATIME,	N_("strictatime") },
+	{ MOUNT_ATTR_NODIRATIME,	N_("nodiratime") },
+	{ MOUNT_ATTR_IDMAP,		N_("idmap") },
+	{ MOUNT_ATTR_NOSYMFOLLOW,	N_("nosymfollow") },
+	{0, NULL},
+};
+
+static const struct flag_map mount_prop_flags[] = {
+	{ MS_SHARED,			N_("shared") },
+	{ MS_SLAVE,			N_("nopeer") },
+	{ MS_PRIVATE,			N_("private") },
+	{ MS_UNBINDABLE,		N_("unbindable") },
+	{0, NULL},
+};
+
+static void
+listmount_help(void)
+{
+	printf(_(
+"\n"
+" List all mounted filesystems.\n"
+"\n"
+" -f   -- statmount mask flags to set.  Defaults to all possible flags.\n"
+" -i   -- mount id to use.  Defaults to the root of the mount namespace.\n"
+" -n   -- path to a procfs mount namespace file.\n"
+" -t   -- only display mount info for this fs type.\n"
+));
+}
+
+static int
+listmount(
+	const struct mnt_id_req	*req,
+	uint64_t		*mnt_ids,
+	size_t			nr_mnt_ids)
+{
+	return syscall(SYS_listmount, req, mnt_ids, nr_mnt_ids, 0);
+}
+
+static int
+statmount(
+	const struct mnt_id_req	*req,
+	struct statmount_fixed	*smbuf,
+	size_t			smbuf_size)
+{
+	return syscall(SYS_statmount, req, smbuf, smbuf_size, 0);
+}
+
+static void
+dump_mountinfo(
+	int			mnt_ns_fd,
+	uint64_t		statmount_flags,
+	bool			rawflag,
+	uint64_t		row_id,
+	const char		*fstype,
+	uint64_t		mnt_id)
+{
+	struct mnt_id_req	req = {
+		.size		= sizeof(req),
+		.mnt_id		= mnt_id,
+#ifdef HAVE_LISTMOUNT_NS_FD
+		.mnt_ns_fd	= mnt_ns_fd,
+#else
+		.spare		= mnt_ns_fd,
+#endif
+		.param		= statmount_flags,
+	};
+	char			buf[4096];
+	size_t			smbuf_size = getpagesize();
+	struct statmount_fixed	*smbuf = malloc(smbuf_size);
+	int			ret;
+
+	if (!smbuf) {
+		perror("malloc");
+		return;
+	}
+
+	if (fstype)
+		req.param |= STATMOUNT_FS_TYPE | STATMOUNT_FS_SUBTYPE;
+
+	ret = statmount(&req, smbuf, smbuf_size);
+	if (ret) {
+		perror("statmount");
+		goto out_smbuf;
+	}
+
+	if (fstype) {
+		char	real_fstype[256];
+
+		if (!(smbuf->mask & STATMOUNT_FS_TYPE))
+			return;
+
+		if (smbuf->mask & STATMOUNT_FS_SUBTYPE)
+			snprintf(real_fstype, sizeof(fstype), "%s.%s",
+					smbuf->str + smbuf->fs_type,
+					smbuf->str + smbuf->fs_subtype);
+		else
+			snprintf(real_fstype, sizeof(fstype), "%s",
+					smbuf->str + smbuf->fs_type);
+		if (strcmp(fstype, real_fstype))
+			return;
+	}
+
+	printf("mnt_id[%llu]: 0x%llx\n", (unsigned long long)row_id,
+			(unsigned long long)mnt_id);
+
+	if (rawflag) {
+		printf("\tmask: 0x%llx\n", (unsigned long long)smbuf->mask);
+	} else {
+		mask_to_string(statmount_funcs, smbuf->mask, ",", buf,
+				sizeof(buf));
+		printf("\tmask: {%s}\n", buf);
+	}
+
+	if (smbuf->mask & STATMOUNT_SB_BASIC) {
+		printf("\tsb_dev_major: %u\n", smbuf->sb_dev_major);
+		printf("\tsb_dev_minor: %u\n", smbuf->sb_dev_minor);
+		printf("\tsb_magic: 0x%llx\n",
+				(unsigned long long)smbuf->sb_magic);
+		printf("\tsb_flags: 0x%x\n", smbuf->sb_flags);
+	}
+
+	if (smbuf->mask & STATMOUNT_MNT_BASIC) {
+		printf("\tmnt_id: 0x%llx\n",
+				(unsigned long long)smbuf->mnt_id);
+		printf("\tmnt_parent_id: 0x%llx\n",
+				(unsigned long long)smbuf->mnt_parent_id);
+		printf("\tmnt_id_old: %u\n", smbuf->mnt_id_old);
+		printf("\tmnt_parent_id_old: %u\n", smbuf->mnt_parent_id_old);
+		if (rawflag) {
+			printf("\tmnt_attr: 0x%llx\n",
+					(unsigned long long)smbuf->mnt_attr);
+			printf("\tmnt_propagation: 0x%llx\n",
+					(unsigned long long)smbuf->mnt_propagation);
+		} else {
+			mask_to_string(mount_attrs, smbuf->mnt_attr, ",", buf,
+					sizeof(buf));
+			printf("\tmnt_attr: {%s}\n", buf);
+			mask_to_string(mount_prop_flags, smbuf->mnt_propagation,
+					",", buf, sizeof(buf));
+			printf("\tmnt_propagation: {%s}\n", buf);
+		}
+		printf("\tmnt_peer_group: 0x%llx\n",
+				(unsigned long long)smbuf->mnt_peer_group);
+		printf("\tmnt_master: 0x%llx\n",
+				(unsigned long long)smbuf->mnt_master);
+	}
+
+	if (smbuf->mask & STATMOUNT_PROPAGATE_FROM)
+		printf("\tpropagate_from: 0x%llx\n",
+				(unsigned long long)smbuf->propagate_from);
+
+	if (smbuf->mask & STATMOUNT_MNT_ROOT)
+		printf("\tmnt_root: %s\n", smbuf->str + smbuf->mnt_root);
+	if (smbuf->mask & STATMOUNT_MNT_POINT)
+		printf("\tmnt_point: %s\n", smbuf->str + smbuf->mnt_point);
+	if (smbuf->mask & STATMOUNT_FS_TYPE)
+		printf("\tfs_type: %s\n", smbuf->str + smbuf->fs_type);
+	if (smbuf->mask & STATMOUNT_FS_SUBTYPE)
+		printf("\tfs_subtype: %s\n", smbuf->str + smbuf->fs_subtype);
+
+	if (smbuf->mask & STATMOUNT_MNT_NS_ID)
+		printf("\tmnt_ns_id: 0x%llx\n",
+				(unsigned long long)smbuf->mnt_ns_id);
+
+	if (smbuf->mask & STATMOUNT_MNT_OPTS)
+		printf("\tmnt_opts: %s\n", smbuf->str + smbuf->mnt_opts);
+	if (smbuf->mask & STATMOUNT_SB_SOURCE)
+		printf("\tsb_source: %s\n", smbuf->str + smbuf->sb_source);
+
+	if (smbuf->mask & STATMOUNT_SUPPORTED_MASK) {
+		if (rawflag) {
+			printf("\tsupported_mask: 0x%llx\n",
+					(unsigned long long)smbuf->supported_mask);
+		} else {
+			mask_to_string(statmount_funcs, smbuf->supported_mask,
+					",", buf, sizeof(buf));
+			printf("\tsupported_mask: {%s}\n", buf);
+		}
+	}
+
+out_smbuf:
+	free(smbuf);
+}
+
+#define NR_MNT_IDS		7
+
+static int
+listmount_f(
+	int			argc,
+	char			**argv)
+{
+	struct mnt_id_req	req = {
+		.size		= sizeof(struct mnt_id_req),
+		.mnt_id		= LSMT_ROOT,
+	};
+	uint64_t		mnt_ids[NR_MNT_IDS];
+	uint64_t		statmount_flags = -1ULL;
+	const char		*fstype = NULL;
+	unsigned long long	rows = 0;
+	/*
+	 * Believe it or not, listmount and statmount treat a zero fd as a
+	 * null fd even though Linus roared about that with the BPF people.
+	 * Here, zero means "use the current process' mount ns".
+	 */
+	int			mnt_ns_fd = 0;
+	int			rawflag = 0;
+	int			c;
+	int			ret;
+
+	while ((c = getopt(argc, argv, "f:i:n:rt:")) > 0) {
+		switch (c) {
+		case 'f':
+			errno = 0;
+			statmount_flags = strtoull(optarg, NULL, 0);
+			if (errno) {
+				perror(optarg);
+				return 1;
+			}
+			break;
+		case 'i':
+			errno = 0;
+			req.mnt_id = strtoull(optarg, NULL, 0);
+			if (errno) {
+				perror(optarg);
+				return 1;
+			}
+			break;
+		case 'n':
+			mnt_ns_fd = open(optarg, O_RDONLY);
+			if (mnt_ns_fd < 0) {
+				perror(optarg);
+				return 1;
+			}
+#ifdef HAVE_LISTMOUNT_NS_FD
+			req.mnt_ns_fd = mnt_ns_fd;
+#else
+			req.spare = mnt_ns_fd;
+#endif
+			break;
+		case 'r':
+			rawflag++;
+			break;
+		case 't':
+			fstype = optarg;
+			break;
+		default:
+			listmount_help();
+			return 1;
+		}
+	}
+
+	while ((ret = listmount(&req, mnt_ids, NR_MNT_IDS)) > 0) {
+		for (c = 0; c < ret; c++)
+			dump_mountinfo(mnt_ns_fd, statmount_flags, rawflag,
+					rows++, fstype, mnt_ids[c]);
+
+		req.param = mnt_ids[ret - 1];
+	}
+
+	if (ret < 0)
+		perror("listmount");
+
+	return 0;
+}
+
+static const struct cmdinfo listmount_cmd = {
+	.name		= "listmount",
+	.cfunc		= listmount_f,
+	.argmin		= -1,
+	.argmax		= -1,
+	.flags		= CMD_NOFILE_OK | CMD_FOREIGN_OK | CMD_NOMAP_OK,
+	.oneline	= N_("list mounted filesystems"),
+	.help		= listmount_help,
+};
+
+void
+listmount_init(void)
+{
+	add_command(&listmount_cmd);
+}
diff --git a/man/man8/xfs_io.8 b/man/man8/xfs_io.8
index 2090cd4c0b2641..2b0dbfbe848bce 100644
--- a/man/man8/xfs_io.8
+++ b/man/man8/xfs_io.8
@@ -1766,6 +1766,49 @@ .SH FILESYSTEM COMMANDS
 .TP
 .BI "removefsprops " name " [ " names "... ]"
 Remove the given filesystem properties.
+.TP
+.BI "listmount [ \-f " mask " ] [ \-i " mnt_id " ] [ \-n " path " ] [ \-r ] [ \-t" fstype " ]"
+Print information about the mounted filesystems in a particular mount
+namespace.
+The information returned by this call corresponds to the information returned
+by the
+.BR statmount (2)
+system call.
+
+.RE
+.RS 1.0i
+.PD 0
+.TP
+.BI "\-f " mask
+Pass this numeric argument as the mask argument to
+.BR statmount (8).
+Defaults to all bits set, to retrieve all possible information.
+
+.TP
+.BI "\-i " mnt_id
+Only return information for mounts below this mount in the mount tree.
+Defaults to the root directory.
+
+.TP
+.BI "\-n " path
+Return information for the mount namespace given by this procfs path.
+For a given process, the path will most likely look like
+.BI /proc/ $pid /ns/mnt
+though any path can be provided.
+Defaults to the mount namespace of the
+.B xfs_io
+process itself.
+
+.TP
+.B \-r
+Print raw bitmasks instead of converting them to strings.
+
+.TP
+.BI "\-t " fstype
+Only return information for filesystems of this type.
+If not specified, no filtering is performed.
+.RE
+.PD
 
 .SH OTHER COMMANDS
 .TP


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 23/26] xfs_io: print systemd service names
  2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
                     ` (21 preceding siblings ...)
  2026-03-03  0:39   ` [PATCH 22/26] xfs_io: add listmount command Darrick J. Wong
@ 2026-03-03  0:39   ` Darrick J. Wong
  2026-03-03 15:57     ` Christoph Hellwig
  2026-03-03  0:40   ` [PATCH 24/26] mkfs: enable online repair if all backrefs are enabled Darrick J. Wong
                     ` (2 subsequent siblings)
  25 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:39 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Add an xfs_io subcommand so that we can emit systemd service names for
XFS services targetting filesystems paths instead of opencoding the
computation in things like fstests.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 healer/Makefile      |    1 -
 include/builddefs.in |    2 +
 io/Makefile          |    4 +++
 io/scrub.c           |   75 ++++++++++++++++++++++++++++++++++++++++++++++++++
 man/man8/xfs_io.8    |   24 ++++++++++++++++
 scrub/Makefile       |    6 +---
 6 files changed, 107 insertions(+), 5 deletions(-)


diff --git a/healer/Makefile b/healer/Makefile
index f7ee911fe11f92..4116e338cd1dee 100644
--- a/healer/Makefile
+++ b/healer/Makefile
@@ -27,7 +27,6 @@ LLDFLAGS = -static
 
 ifeq ($(HAVE_SYSTEMD),yes)
 INSTALL_HEALER += install-systemd
-XFS_HEALER_SVCNAME=xfs_healer@.service
 SYSTEMD_SERVICES = \
 	system-xfs_healer.slice \
 	$(XFS_HEALER_SVCNAME)
diff --git a/include/builddefs.in b/include/builddefs.in
index b5ace90f53a46e..439b0dbf3ea813 100644
--- a/include/builddefs.in
+++ b/include/builddefs.in
@@ -63,6 +63,8 @@ PKG_STATE_DIR	= @localstatedir@/lib/@pkg_name@
 
 XFS_SCRUB_ALL_AUTO_MEDIA_SCAN_STAMP=$(PKG_STATE_DIR)/xfs_scrub_all_media.stamp
 XFS_SCRUB_SVCNAME=xfs_scrub@.service
+XFS_SCRUB_MEDIA_SVCNAME=xfs_scrub_media@.service
+XFS_HEALER_SVCNAME=xfs_healer@.service
 
 CC		= @cc@
 BUILD_CC	= @BUILD_CC@
diff --git a/io/Makefile b/io/Makefile
index 4c3359c4d4f7f4..7ac45236cc8c9c 100644
--- a/io/Makefile
+++ b/io/Makefile
@@ -98,6 +98,10 @@ LCFLAGS += -DHAVE_LISTMOUNT
  endif # listmount mnt_ns_fd
 endif
 
+CFLAGS+=-DXFS_SCRUB_SVCNAME=\"$(XFS_SCRUB_SVCNAME)\"
+CFLAGS+=-DXFS_SCRUB_MEDIA_SVCNAME=\"$(XFS_SCRUB_MEDIA_SVCNAME)\"
+CFLAGS+=-DXFS_HEALER_SVCNAME=\"$(XFS_HEALER_SVCNAME)\"
+
 default: depend $(LTCOMMAND)
 
 include $(BUILDRULES)
diff --git a/io/scrub.c b/io/scrub.c
index a137f402b94d48..f343ac05484b6c 100644
--- a/io/scrub.c
+++ b/io/scrub.c
@@ -13,12 +13,14 @@
 #include "libfrog/fsgeom.h"
 #include "libfrog/scrub.h"
 #include "libfrog/logging.h"
+#include "libfrog/systemd.h"
 #include "io.h"
 #include "list.h"
 
 static struct cmdinfo scrub_cmd;
 static struct cmdinfo repair_cmd;
 static const struct cmdinfo scrubv_cmd;
+static const struct cmdinfo svcname_cmd;
 
 static void
 scrub_help(void)
@@ -356,6 +358,7 @@ scrub_init(void)
 
 	add_command(&scrub_cmd);
 	add_command(&scrubv_cmd);
+	add_command(&svcname_cmd);
 }
 
 static void
@@ -730,3 +733,75 @@ static const struct cmdinfo scrubv_cmd = {
 	.oneline	= N_("vectored metadata scrub"),
 	.help		= scrubv_help,
 };
+
+static void
+svcname_help(void)
+{
+	printf(_(
+"\n"
+" Print the systemd service instance name for the given paths.\n"
+"\n"
+" -h         Print the instance name for a xfs_healer instance.\n"
+" -m         Print the instance name for a xfs_scrub_media instance.\n"
+" -s         Print the instance name for a xfs_scrub instance.\n"
+" -t templ   Use templ as a template for the service name.\n"
+"\n"
+" Example:\n"
+" 'svcname -s /mnt' - print the xfs_scrub service name for /mnt.\n"));
+}
+
+static int
+svcname_f(
+	int		argc,
+	char		**argv)
+{
+	const char	*template = XFS_SCRUB_SVCNAME;
+	int		c;
+	int		error;
+
+	while ((c = getopt(argc, argv, "shmt:")) != EOF) {
+		switch (c) {
+		case 's':
+			template = XFS_SCRUB_SVCNAME;
+			break;
+		case 'm':
+			template = XFS_SCRUB_MEDIA_SVCNAME;
+			break;
+		case 'h':
+			template = XFS_HEALER_SVCNAME;
+			break;
+		case 't':
+			template = optarg;
+			break;
+		default:
+			svcname_help();
+			return 0;
+		}
+	}
+
+	for (c = optind; c < argc; c++) {
+		char	unitname[PATH_MAX];
+
+		error = systemd_path_instance_unit_name(template, argv[c],
+				unitname, sizeof(unitname));
+		if (error) {
+			if (errno)
+				perror(argv[c]);
+		} else {
+			printf("%s\n", unitname);
+		}
+	}
+
+	return 0;
+}
+
+static const struct cmdinfo svcname_cmd = {
+	.name		= "svcname",
+	.cfunc		= svcname_f,
+	.argmin		= -1,
+	.argmax		= -1,
+	.flags		= CMD_NOFILE_OK | CMD_NOMAP_OK,
+	.args		= N_("[-h | -m | -s | -t template] path [paths...]"),
+	.oneline	= N_("print systemd service names"),
+	.help		= svcname_help,
+};
diff --git a/man/man8/xfs_io.8 b/man/man8/xfs_io.8
index 2b0dbfbe848bce..04bd1e6427f1c6 100644
--- a/man/man8/xfs_io.8
+++ b/man/man8/xfs_io.8
@@ -1824,6 +1824,30 @@ .SH OTHER COMMANDS
 See the
 .B print
 command.
+.TP
+.BI "svcname [ \-h | \-m | \-s | \-t " template " ] " path
+Print the systemd service name for a given filesystem path.
+.RE
+.RS 1.0i
+.PD 0
+.TP
+.B \-h
+Print the systemd service name for xfs_healer.
+
+.TP
+.B \-m
+Print the systemd service name for xfs_scrub_media.
+
+.TP
+.B \-s
+Print the systemd service name for xfs_scrub.
+
+.TP
+.BI "\-t " template
+Print the systemd service name for the given template.
+.RE
+.PD
+
 .TP
 .B quit
 Exit
diff --git a/scrub/Makefile b/scrub/Makefile
index aee49bfce100e2..6ace458118fc92 100644
--- a/scrub/Makefile
+++ b/scrub/Makefile
@@ -8,8 +8,6 @@ include $(builddefs)
 
 SCRUB_PREREQS=$(HAVE_GETFSMAP)
 
-scrub_media_svcname=xfs_scrub_media@.service
-
 ifeq ($(SCRUB_PREREQS),yes)
 LTCOMMAND = xfs_scrub
 INSTALL_SCRUB = install-scrub
@@ -22,7 +20,7 @@ INSTALL_SCRUB += install-systemd
 SYSTEMD_SERVICES=\
 	$(XFS_SCRUB_SVCNAME) \
 	xfs_scrub_fail@.service \
-	$(scrub_media_svcname) \
+	$(XFS_SCRUB_MEDIA_SVCNAME) \
 	xfs_scrub_media_fail@.service \
 	xfs_scrub_all.service \
 	xfs_scrub_all_fail.service \
@@ -123,7 +121,7 @@ $(XFS_SCRUB_ALL_PROG): $(XFS_SCRUB_ALL_PROG).in $(builddefs) $(TOPDIR)/libfrog/g
 	@echo "    [SED]    $@"
 	$(Q)$(SED) -e "s|@sbindir@|$(PKG_SBIN_DIR)|g" \
 		   -e "s|@scrub_svcname@|$(XFS_SCRUB_SVCNAME)|g" \
-		   -e "s|@scrub_media_svcname@|$(scrub_media_svcname)|g" \
+		   -e "s|@scrub_media_svcname@|$(XFS_SCRUB_MEDIA_SVCNAME)|g" \
 		   -e "s|@pkg_version@|$(PKG_VERSION)|g" \
 		   -e "s|@stampfile@|$(XFS_SCRUB_ALL_AUTO_MEDIA_SCAN_STAMP)|g" \
 		   -e "s|@scrub_service_args@|$(XFS_SCRUB_SERVICE_ARGS)|g" \


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 24/26] mkfs: enable online repair if all backrefs are enabled
  2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
                     ` (22 preceding siblings ...)
  2026-03-03  0:39   ` [PATCH 23/26] xfs_io: print systemd service names Darrick J. Wong
@ 2026-03-03  0:40   ` Darrick J. Wong
  2026-03-03 15:58     ` Christoph Hellwig
  2026-03-03  0:40   ` [PATCH 25/26] debian: enable xfs_healer on the root filesystem by default Darrick J. Wong
  2026-03-03  0:40   ` [PATCH 26/26] debian/control: listify the build dependencies Darrick J. Wong
  25 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:40 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

If all backreferences are enabled in the filesystem, then enable online
repair by default if the user didn't supply any other autofsck setting.
Users might as well get full self-repair capability if they're paying
for the extra metadata.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 mkfs/xfs_mkfs.c |    9 +++++++++
 1 file changed, 9 insertions(+)


diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
index a11994027c2df1..87bdf0e22b96f8 100644
--- a/mkfs/xfs_mkfs.c
+++ b/mkfs/xfs_mkfs.c
@@ -6289,6 +6289,15 @@ main(
 	if (mp->m_sb.sb_agcount > 1)
 		rewrite_secondary_superblocks(mp);
 
+	/*
+	 * If the filesystem has full backreferences and the user didn't
+	 * express an autofsck preference, enable online repair because they
+	 * might as well get some useful functionality from the extra metadata.
+	 */
+	if (cli.autofsck == FSPROP_AUTOFSCK_UNSET &&
+	    cli.sb_feat.rmapbt && cli.sb_feat.parent_pointers)
+		cli.autofsck = FSPROP_AUTOFSCK_REPAIR;
+
 	if (cli.autofsck != FSPROP_AUTOFSCK_UNSET)
 		set_autofsck(mp, &cli);
 


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 25/26] debian: enable xfs_healer on the root filesystem by default
  2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
                     ` (23 preceding siblings ...)
  2026-03-03  0:40   ` [PATCH 24/26] mkfs: enable online repair if all backrefs are enabled Darrick J. Wong
@ 2026-03-03  0:40   ` Darrick J. Wong
  2026-03-03 15:58     ` Christoph Hellwig
  2026-03-03  0:40   ` [PATCH 26/26] debian/control: listify the build dependencies Darrick J. Wong
  25 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:40 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Now that we're finished building autonomous repair, enable the service
on the root filesystem by default.  The root filesystem is mounted by
the initrd prior to starting systemd, which is why the udev rule cannot
autostart the service for the root filesystem.

dh_installsystemd won't activate a template service (aka one with an
at-sign in the name) even if it provides a DefaultInstance directive to
make that possible.  Use a fugly shim for this.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 debian/postinst |    8 ++++++++
 debian/prerm    |   13 +++++++++++++
 debian/rules    |    3 ++-
 3 files changed, 23 insertions(+), 1 deletion(-)
 create mode 100644 debian/prerm


diff --git a/debian/postinst b/debian/postinst
index d11c8d94a3cbe4..966dbb7626cab3 100644
--- a/debian/postinst
+++ b/debian/postinst
@@ -21,5 +21,13 @@ case "${1}" in
 esac
 
 #DEBHELPER#
+#
+# dh_installsystemd doesn't handle template services even if we supply a
+# default instance, so we'll install it here.
+if [ -z "${DPKG_ROOT:-}" ] && [ -d /run/systemd/system ] ; then
+	if [ "$1" = "configure" ] || [ "$1" = "abort-upgrade" ] || [ "$1" = "abort-deconfigure" ] || [ "$1" = "abort-remove" ] ; then
+		/bin/systemctl enable xfs_healer@.service || true
+	fi
+fi
 
 exit 0
diff --git a/debian/prerm b/debian/prerm
new file mode 100644
index 00000000000000..c526dcdd1d7103
--- /dev/null
+++ b/debian/prerm
@@ -0,0 +1,13 @@
+#!/bin/sh
+
+set -e
+
+# dh_installsystemd doesn't handle template services even if we supply a
+# default instance, so we'll install it here.
+if [ -z "${DPKG_ROOT:-}" ] && [ "$1" = remove ] && [ -d /run/systemd/system ] ; then
+	/bin/systemctl disable xfs_healer@.service || true
+fi
+
+#DEBHELPER#
+
+exit 0
diff --git a/debian/rules b/debian/rules
index 7c9f90e6c483ff..aaf99a95ce3df5 100755
--- a/debian/rules
+++ b/debian/rules
@@ -97,4 +97,5 @@ override_dh_installdocs:
 	dh_installdocs -XCHANGES
 
 override_dh_installsystemd:
-	dh_installsystemd -p xfsprogs --no-restart-after-upgrade --no-stop-on-upgrade system-xfs_scrub.slice xfs_scrub_all.timer
+	dh_installsystemd -p xfsprogs --no-restart-after-upgrade --no-stop-on-upgrade system-xfs_scrub.slice xfs_scrub_all.timer system-xfs_healer.slice
+	dh_installsystemd -p xfsprogs --restart-after-upgrade xfs_healer_start.service


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 26/26] debian/control: listify the build dependencies
  2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
                     ` (24 preceding siblings ...)
  2026-03-03  0:40   ` [PATCH 25/26] debian: enable xfs_healer on the root filesystem by default Darrick J. Wong
@ 2026-03-03  0:40   ` Darrick J. Wong
  2026-03-03 15:58     ` Christoph Hellwig
  25 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:40 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

This will make it less gross to add more build deps later.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 debian/control |   14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)


diff --git a/debian/control b/debian/control
index 66b0a47a36ee24..7837019804e93a 100644
--- a/debian/control
+++ b/debian/control
@@ -3,7 +3,19 @@ Section: admin
 Priority: optional
 Maintainer: XFS Development Team <linux-xfs@vger.kernel.org>
 Uploaders: Nathan Scott <nathans@debian.org>, Anibal Monsalve Salazar <anibal@debian.org>, Bastian Germann <bage@debian.org>
-Build-Depends: libinih-dev (>= 53), uuid-dev, debhelper (>= 12), gettext, libtool, libedit-dev, libblkid-dev (>= 2.17), linux-libc-dev, libdevmapper-dev, libicu-dev, pkg-config, liburcu-dev, systemd-dev | systemd (<< 253-2~)
+Build-Depends: debhelper (>= 12),
+ gettext,
+ libblkid-dev (>= 2.17),
+ libdevmapper-dev,
+ libedit-dev,
+ libicu-dev,
+ libinih-dev (>= 53),
+ libtool,
+ liburcu-dev,
+ linux-libc-dev,
+ pkg-config,
+ systemd-dev | systemd (<< 253-2~),
+ uuid-dev
 Standards-Version: 4.0.0
 Homepage: https://xfs.wiki.kernel.org/
 


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 1/1] generic: test fsnotify filesystem error reporting
  2026-03-03  0:33 ` [PATCHSET v8 1/2] fstests: test generic file IO error reporting Darrick J. Wong
@ 2026-03-03  0:40   ` Darrick J. Wong
  2026-03-03  9:21     ` Amir Goldstein
  2026-03-03 14:54     ` Christoph Hellwig
  0 siblings, 2 replies; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:40 UTC (permalink / raw)
  To: zlang, djwong
  Cc: linux-fsdevel, hch, gabriel, amir73il, jack, fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Test the fsnotify filesystem error reporting.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 src/Makefile           |    2 
 src/fs-monitor.c       |  155 +++++++++++++++++++++++++++++++++
 tests/generic/1838     |  228 ++++++++++++++++++++++++++++++++++++++++++++++++
 tests/generic/1838.out |   20 ++++
 4 files changed, 404 insertions(+), 1 deletion(-)
 create mode 100644 src/fs-monitor.c
 create mode 100755 tests/generic/1838
 create mode 100644 tests/generic/1838.out


diff --git a/src/Makefile b/src/Makefile
index 577d816ae859b6..1c761da0ccff20 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -36,7 +36,7 @@ LINUX_TARGETS = xfsctl bstat t_mtab getdevicesize preallo_rw_pattern_reader \
 	fscrypt-crypt-util bulkstat_null_ocount splice-test chprojid_fail \
 	detached_mounts_propagation ext4_resize t_readdir_3 splice2pipe \
 	uuid_ioctl t_snapshot_deleted_subvolume fiemap-fault min_dio_alignment \
-	rw_hint
+	rw_hint fs-monitor
 
 EXTRA_EXECS = dmerror fill2attr fill2fs fill2fs_check scaleread.sh \
 	      btrfs_crc32c_forged_name.py popdir.pl popattr.py \
diff --git a/src/fs-monitor.c b/src/fs-monitor.c
new file mode 100644
index 00000000000000..fef596a3966933
--- /dev/null
+++ b/src/fs-monitor.c
@@ -0,0 +1,155 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2021, Collabora Ltd.
+ */
+
+#include <errno.h>
+#include <err.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <fcntl.h>
+#include <sys/fanotify.h>
+#include <sys/types.h>
+#include <unistd.h>
+#ifndef __GLIBC__
+#include <asm-generic/int-ll64.h>
+#endif
+
+#ifndef FAN_FS_ERROR
+#define FAN_FS_ERROR		0x00008000
+#define FAN_EVENT_INFO_TYPE_ERROR	5
+
+struct fanotify_event_info_error {
+	struct fanotify_event_info_header hdr;
+	__s32 error;
+	__u32 error_count;
+};
+#endif
+
+#ifndef FILEID_INO32_GEN
+#define FILEID_INO32_GEN	1
+#endif
+
+#ifndef FILEID_INVALID
+#define	FILEID_INVALID		0xff
+#endif
+
+static void print_fh(struct file_handle *fh)
+{
+	int i;
+	uint32_t *h = (uint32_t *) fh->f_handle;
+
+	printf("\tfh: ");
+	for (i = 0; i < fh->handle_bytes; i++)
+		printf("%hhx", fh->f_handle[i]);
+	printf("\n");
+
+	printf("\tdecoded fh: ");
+	if (fh->handle_type == FILEID_INO32_GEN)
+		printf("inode=%u gen=%u\n", h[0], h[1]);
+	else if (fh->handle_type == FILEID_INVALID && !fh->handle_bytes)
+		printf("Type %d (Superblock error)\n", fh->handle_type);
+	else
+		printf("Type %d (Unknown)\n", fh->handle_type);
+
+}
+
+static void handle_notifications(char *buffer, int len)
+{
+	struct fanotify_event_metadata *event =
+		(struct fanotify_event_metadata *) buffer;
+	struct fanotify_event_info_header *info;
+	struct fanotify_event_info_error *err;
+	struct fanotify_event_info_fid *fid;
+	int off;
+
+	for (; FAN_EVENT_OK(event, len); event = FAN_EVENT_NEXT(event, len)) {
+
+		if (event->mask != FAN_FS_ERROR) {
+			printf("unexpected FAN MARK: %llx\n",
+							(unsigned long long)event->mask);
+			goto next_event;
+		}
+
+		if (event->fd != FAN_NOFD) {
+			printf("Unexpected fd (!= FAN_NOFD)\n");
+			goto next_event;
+		}
+
+		printf("FAN_FS_ERROR (len=%d)\n", event->event_len);
+
+		for (off = sizeof(*event) ; off < event->event_len;
+		     off += info->len) {
+			info = (struct fanotify_event_info_header *)
+				((char *) event + off);
+
+			switch (info->info_type) {
+			case FAN_EVENT_INFO_TYPE_ERROR:
+				err = (struct fanotify_event_info_error *) info;
+
+				printf("\tGeneric Error Record: len=%d\n",
+				       err->hdr.len);
+				printf("\terror: %d\n", err->error);
+				printf("\terror_count: %d\n", err->error_count);
+				break;
+
+			case FAN_EVENT_INFO_TYPE_FID:
+				fid = (struct fanotify_event_info_fid *) info;
+
+				printf("\tfsid: %x%x\n",
+#if defined(__GLIBC__)
+				       fid->fsid.val[0], fid->fsid.val[1]);
+#else
+				       fid->fsid.__val[0], fid->fsid.__val[1]);
+#endif
+				print_fh((struct file_handle *) &fid->handle);
+				break;
+
+			default:
+				printf("\tUnknown info type=%d len=%d:\n",
+				       info->info_type, info->len);
+			}
+		}
+next_event:
+		printf("---\n\n");
+		fflush(stdout);
+	}
+}
+
+int main(int argc, char **argv)
+{
+	int fd;
+
+	char buffer[BUFSIZ];
+
+	if (argc < 2) {
+		printf("Missing path argument\n");
+		return 1;
+	}
+
+	fd = fanotify_init(FAN_CLASS_NOTIF|FAN_REPORT_FID, O_RDONLY);
+	if (fd < 0) {
+		perror("fanotify_init");
+		errx(1, "fanotify_init");
+	}
+
+	if (fanotify_mark(fd, FAN_MARK_ADD|FAN_MARK_FILESYSTEM,
+			  FAN_FS_ERROR, AT_FDCWD, argv[1])) {
+		perror("fanotify_mark");
+		errx(1, "fanotify_mark");
+	}
+
+	printf("fanotify active\n");
+	fflush(stdout);
+
+	while (1) {
+		int n = read(fd, buffer, BUFSIZ);
+
+		if (n < 0)
+			errx(1, "read");
+
+		handle_notifications(buffer, n);
+	}
+
+	return 0;
+}
diff --git a/tests/generic/1838 b/tests/generic/1838
new file mode 100755
index 00000000000000..087851ddcbdb44
--- /dev/null
+++ b/tests/generic/1838
@@ -0,0 +1,228 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0-or-later
+# Copyright (c) 2024-2026 Oracle.  All Rights Reserved.
+#
+# FS QA Test No. 1838
+#
+# Check that fsnotify can report file IO errors.
+
+. ./common/preamble
+_begin_fstest auto quick eio selfhealing
+
+# Override the default cleanup function.
+_cleanup()
+{
+	cd /
+	test -n "$fsmonitor_pid" && kill -TERM $fsmonitor_pid
+	rm -f $tmp.*
+	_dmerror_cleanup
+}
+
+# Import common functions.
+. ./common/fuzzy
+. ./common/filter
+. ./common/dmerror
+. ./common/systemd
+
+case "$FSTYP" in
+xfs)
+	# added as a part of xfs health monitoring
+	_require_xfs_io_command healthmon
+	# no out of place writes
+	_require_no_xfs_always_cow
+	;;
+ext4)
+	# added at the same time as uevents
+	modprobe fs-$FSTYP
+	test -e /sys/fs/ext4/features/uevents || \
+		_notrun "$FSTYP does not support fsnotify ioerrors"
+	;;
+*)
+	_notrun "$FSTYP does not support fsnotify ioerrors"
+	;;
+esac
+
+_require_scratch
+_require_dm_target error
+_require_test_program fs-monitor
+_require_xfs_io_command "fiemap"
+_require_odirect
+
+# fsnotify only gives us a file handle, the error number, and the number of
+# times it was seen in between event deliveries.   The handle is mostly useless
+# since we have no generic way to map that to a file path.  Therefore we can
+# only coalesce all the I/O errors into one report.
+filter_fsnotify_errors() {
+	_filter_scratch | \
+		grep -E '(FAN_FS_ERROR|Generic Error Record|error: 5)' | \
+		sed -e "s/len=[0-9]*/len=XXX/g" | \
+		sort | \
+		uniq
+}
+
+_scratch_mkfs >> $seqres.full
+
+#
+# The dm-error map added by this test doesn't work on zoned devices because
+# table sizes need to be aligned to the zone size, and even for zoned on
+# conventional this test will get confused because of the internal RT device.
+#
+# That check requires a mounted file system, so do a dummy mount before setting
+# up DM.
+#
+_scratch_mount
+test $FSTYP = xfs && _require_xfs_scratch_non_zoned
+_scratch_unmount
+
+_dmerror_init
+_dmerror_mount >> $seqres.full 2>&1
+
+test $FSTYP = xfs && _xfs_force_bdev data $SCRATCH_MNT
+
+# Write a file with 4 file blocks worth of data, figure out the LBA to target
+victim=$SCRATCH_MNT/a
+file_blksz=$(_get_file_block_size $SCRATCH_MNT)
+$XFS_IO_PROG -f -c "pwrite -S 0x58 0 $((4 * file_blksz))" -c "fsync" $victim >> $seqres.full
+
+awk_len_prog='{print $4}'
+bmap_str="$($XFS_IO_PROG -c "fiemap -v" $victim | grep "^[[:space:]]*0:")"
+echo "$bmap_str" >> $seqres.full
+
+phys="$(echo "$bmap_str" | $AWK_PROG '{print $3}')"
+len="$(echo "$bmap_str" | $AWK_PROG "$awk_len_prog")"
+
+fs_blksz=$(_get_block_size $SCRATCH_MNT)
+echo "file_blksz:$file_blksz:fs_blksz:$fs_blksz" >> $seqres.full
+kernel_sectors_per_fs_block=$((fs_blksz / 512))
+
+# Did we get at least 4 fs blocks worth of extent?
+min_len_sectors=$(( 4 * kernel_sectors_per_fs_block ))
+test "$len" -lt $min_len_sectors && \
+	_fail "could not format a long enough extent on an empty fs??"
+
+phys_start=$(echo "$phys" | sed -e 's/\.\..*//g')
+
+echo "$phys:$len:$fs_blksz:$phys_start" >> $seqres.full
+echo "victim file:" >> $seqres.full
+od -tx1 -Ad -c $victim >> $seqres.full
+
+# Set the dmerror table so that all IO will pass through.
+_dmerror_reset_table
+
+cat >> $seqres.full << ENDL
+dmerror before:
+$DMERROR_TABLE
+$DMERROR_RTTABLE
+<end table>
+ENDL
+
+# All sector numbers that we feed to the kernel must be in units of 512b, but
+# they also must be aligned to the device's logical block size.
+logical_block_size=`$here/src/min_dio_alignment $SCRATCH_MNT $SCRATCH_DEV`
+kernel_sectors_per_device_lba=$((logical_block_size / 512))
+
+# Mark as bad one of the device LBAs in the middle of the extent.  Target the
+# second LBA of the third block of the four-block file extent that we allocated
+# earlier, but without overflowing into the fourth file block.
+bad_sector=$(( phys_start + (2 * kernel_sectors_per_fs_block) ))
+bad_len=$kernel_sectors_per_device_lba
+if (( kernel_sectors_per_device_lba < kernel_sectors_per_fs_block )); then
+	bad_sector=$((bad_sector + kernel_sectors_per_device_lba))
+fi
+if (( (bad_sector % kernel_sectors_per_device_lba) != 0)); then
+	echo "bad_sector $bad_sector not congruent with device logical block size $logical_block_size"
+fi
+
+# Remount to flush the page cache, start fsnotify, and make the LBA bad
+_dmerror_unmount
+_dmerror_mount
+
+$here/src/fs-monitor $SCRATCH_MNT > $tmp.fsmonitor &
+fsmonitor_pid=$!
+sleep 1
+
+_dmerror_mark_range_bad $bad_sector $bad_len
+
+cat >> $seqres.full << ENDL
+dmerror after marking bad:
+$DMERROR_TABLE
+$DMERROR_RTTABLE
+<end table>
+ENDL
+
+_dmerror_load_error_table
+
+# See if buffered reads pick it up
+echo "Try buffered read"
+$XFS_IO_PROG -c "pread 0 $((4 * file_blksz))" $victim >> $seqres.full
+
+# See if directio reads pick it up
+echo "Try directio read"
+$XFS_IO_PROG -d -c "pread 0 $((4 * file_blksz))" $victim >> $seqres.full
+
+# See if directio writes pick it up
+echo "Try directio write"
+$XFS_IO_PROG -d -c "pwrite -S 0x58 0 $((4 * file_blksz))" -c fsync $victim >> $seqres.full
+
+# See if buffered writes pick it up
+echo "Try buffered write"
+$XFS_IO_PROG -c "pwrite -S 0x58 0 $((4 * file_blksz))" -c fsync $victim >> $seqres.full
+
+# Now mark the bad range good so that unmount won't fail due to IO errors.
+echo "Fix device"
+_dmerror_mark_range_good $bad_sector $bad_len
+_dmerror_load_error_table
+
+cat >> $seqres.full << ENDL
+dmerror after marking good:
+$DMERROR_TABLE
+$DMERROR_RTTABLE
+<end table>
+ENDL
+
+# Unmount filesystem to start fresh
+echo "Kill fsnotify"
+_dmerror_unmount
+sleep 1
+kill -TERM $fsmonitor_pid
+unset fsmonitor_pid
+echo fsnotify log >> $seqres.full
+cat $tmp.fsmonitor >> $seqres.full
+cat $tmp.fsmonitor | filter_fsnotify_errors
+
+# Start fsnotify again so that can verify that the errors don't persist after
+# we flip back to the good dm table.
+echo "Remount and restart fsnotify"
+_dmerror_mount
+$here/src/fs-monitor $SCRATCH_MNT > $tmp.fsmonitor &
+fsmonitor_pid=$!
+sleep 1
+
+# See if buffered reads pick it up
+echo "Try buffered read again"
+$XFS_IO_PROG -c "pread 0 $((4 * file_blksz))" $victim >> $seqres.full
+
+# See if directio reads pick it up
+echo "Try directio read again"
+$XFS_IO_PROG -d -c "pread 0 $((4 * file_blksz))" $victim >> $seqres.full
+
+# See if directio writes pick it up
+echo "Try directio write again"
+$XFS_IO_PROG -d -c "pwrite -S 0x58 0 $((4 * file_blksz))" -c fsync $victim >> $seqres.full
+
+# See if buffered writes pick it up
+echo "Try buffered write again"
+$XFS_IO_PROG -c "pwrite -S 0x58 0 $((4 * file_blksz))" -c fsync $victim >> $seqres.full
+
+# Unmount fs and kill fsnotify, then wait for it to finish
+echo "Kill fsnotify again"
+_dmerror_unmount
+sleep 1
+kill -TERM $fsmonitor_pid
+unset fsmonitor_pid
+cat $tmp.fsmonitor >> $seqres.full
+cat $tmp.fsmonitor | filter_fsnotify_errors
+
+# success, all done
+status=0
+exit
diff --git a/tests/generic/1838.out b/tests/generic/1838.out
new file mode 100644
index 00000000000000..adae590fe0b2ea
--- /dev/null
+++ b/tests/generic/1838.out
@@ -0,0 +1,20 @@
+QA output created by 1838
+Try buffered read
+pread: Input/output error
+Try directio read
+pread: Input/output error
+Try directio write
+pwrite: Input/output error
+Try buffered write
+fsync: Input/output error
+Fix device
+Kill fsnotify
+	Generic Error Record: len=XXX
+	error: 5
+FAN_FS_ERROR (len=XXX)
+Remount and restart fsnotify
+Try buffered read again
+Try directio read again
+Try directio write again
+Try buffered write again
+Kill fsnotify again


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 01/13] xfs: test health monitoring code
  2026-03-03  0:33 ` [PATCHSET v8 2/2] fstests: autonomous self healing of filesystems Darrick J. Wong
@ 2026-03-03  0:41   ` Darrick J. Wong
  2026-03-09 17:21     ` Zorro Lang
  2026-03-03  0:41   ` [PATCH 02/13] xfs: test for metadata corruption error reporting via healthmon Darrick J. Wong
                     ` (12 subsequent siblings)
  13 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:41 UTC (permalink / raw)
  To: zlang, djwong; +Cc: hch, fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Add some functionality tests for the new health monitoring code.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 doc/group-names.txt |    1 +
 tests/xfs/1885      |   53 +++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/1885.out  |    5 +++++
 3 files changed, 59 insertions(+)
 create mode 100755 tests/xfs/1885
 create mode 100644 tests/xfs/1885.out


diff --git a/doc/group-names.txt b/doc/group-names.txt
index 10b49e50517797..158f84d36d3154 100644
--- a/doc/group-names.txt
+++ b/doc/group-names.txt
@@ -117,6 +117,7 @@ samefs			overlayfs when all layers are on the same fs
 scrub			filesystem metadata scrubbers
 seed			btrfs seeded filesystems
 seek			llseek functionality
+selfhealing		self healing filesystem code
 selftest		tests with fixed results, used to validate testing setup
 send			btrfs send/receive
 shrinkfs		decreasing the size of a filesystem
diff --git a/tests/xfs/1885 b/tests/xfs/1885
new file mode 100755
index 00000000000000..1d75ef19c7c9d9
--- /dev/null
+++ b/tests/xfs/1885
@@ -0,0 +1,53 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2024-2026 Oracle.  All Rights Reserved.
+#
+# FS QA Test 1885
+#
+# Make sure that healthmon handles module refcount correctly.
+#
+. ./common/preamble
+_begin_fstest auto selfhealing
+
+. ./common/filter
+. ./common/module
+
+refcount_file="/sys/module/xfs/refcnt"
+test -e "$refcount_file" || _notrun "cannot find xfs module refcount"
+
+_require_test
+_require_xfs_io_command healthmon
+
+# Capture mod refcount without the test fs mounted
+_test_unmount
+init_refcount="$(cat "$refcount_file")"
+
+# Capture mod refcount with the test fs mounted
+_test_mount
+nomon_mount_refcount="$(cat "$refcount_file")"
+
+# Capture mod refcount with test fs mounted and the healthmon fd open.
+# Pause the xfs_io process so that it doesn't actually respond to events.
+$XFS_IO_PROG -c 'healthmon -c -v' $TEST_DIR >> $seqres.full &
+sleep 0.5
+kill -STOP %1
+mon_mount_refcount="$(cat "$refcount_file")"
+
+# Capture mod refcount with only the healthmon fd open.
+_test_unmount
+mon_nomount_refcount="$(cat "$refcount_file")"
+
+# Capture mod refcount after continuing healthmon (which should exit due to the
+# unmount) and killing it.
+kill -CONT %1
+kill %1
+wait
+nomon_nomount_refcount="$(cat "$refcount_file")"
+
+_within_tolerance "mount refcount" "$nomon_mount_refcount" "$((init_refcount + 1))" 0 -v
+_within_tolerance "mount + healthmon refcount" "$mon_mount_refcount" "$((init_refcount + 2))" 0 -v
+_within_tolerance "healthmon refcount" "$mon_nomount_refcount" "$((init_refcount + 1))" 0 -v
+_within_tolerance "end refcount" "$nomon_nomount_refcount" "$init_refcount" 0 -v
+
+status=0
+exit
diff --git a/tests/xfs/1885.out b/tests/xfs/1885.out
new file mode 100644
index 00000000000000..f152cef0525609
--- /dev/null
+++ b/tests/xfs/1885.out
@@ -0,0 +1,5 @@
+QA output created by 1885
+mount refcount is in range
+mount + healthmon refcount is in range
+healthmon refcount is in range
+end refcount is in range


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 02/13] xfs: test for metadata corruption error reporting via healthmon
  2026-03-03  0:33 ` [PATCHSET v8 2/2] fstests: autonomous self healing of filesystems Darrick J. Wong
  2026-03-03  0:41   ` [PATCH 01/13] xfs: test health monitoring code Darrick J. Wong
@ 2026-03-03  0:41   ` Darrick J. Wong
  2026-03-03  0:41   ` [PATCH 03/13] xfs: test io " Darrick J. Wong
                     ` (11 subsequent siblings)
  13 siblings, 0 replies; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:41 UTC (permalink / raw)
  To: zlang, djwong; +Cc: hch, fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Check if we can detect runtime metadata corruptions via the health
monitor.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 common/rc          |   10 ++++++
 tests/xfs/1879     |   93 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/1879.out |    8 ++++
 3 files changed, 111 insertions(+)
 create mode 100755 tests/xfs/1879
 create mode 100644 tests/xfs/1879.out


diff --git a/common/rc b/common/rc
index fd4ca9641822cf..38d4b500b3b51f 100644
--- a/common/rc
+++ b/common/rc
@@ -3013,6 +3013,16 @@ _require_xfs_io_command()
 		echo $testio | grep -q "Inappropriate ioctl" && \
 			_notrun "xfs_io $command support is missing"
 		;;
+	"healthmon")
+		testio=`$XFS_IO_PROG -c "$command -p $param" $TEST_DIR 2>&1`
+		echo $testio | grep -q "bad argument count" && \
+			_notrun "xfs_io $command $param support is missing"
+		echo $testio | grep -q "Inappropriate ioctl" && \
+			_notrun "xfs_io $command $param ioctl support is missing"
+		echo $testio | grep -q "Operation not supported" && \
+			_notrun "xfs_io $command $param kernel support is missing"
+		param_checked="$param"
+		;;
 	"label")
 		testio=`$XFS_IO_PROG -c "label" $TEST_DIR 2>&1`
 		;;
diff --git a/tests/xfs/1879 b/tests/xfs/1879
new file mode 100755
index 00000000000000..75bc8e3b5f4316
--- /dev/null
+++ b/tests/xfs/1879
@@ -0,0 +1,93 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2024-2026 Oracle.  All Rights Reserved.
+#
+# FS QA Test No. 1879
+#
+# Corrupt some metadata and try to access it with the health monitoring program
+# running.  Check that healthmon observes a metadata error.
+#
+. ./common/preamble
+_begin_fstest auto quick eio selfhealing
+
+_cleanup()
+{
+	cd /
+	rm -rf $tmp.* $testdir
+}
+
+. ./common/filter
+
+_require_scratch_nocheck
+_require_scratch_xfs_crc # can't detect minor corruption w/o crc
+_require_xfs_io_command healthmon
+
+# Disable the scratch rt device to avoid test failures relating to the rt
+# bitmap consuming all the free space in our small data device.
+unset SCRATCH_RTDEV
+
+echo "Format and mount"
+_scratch_mkfs -d agcount=1 | _filter_mkfs 2> $tmp.mkfs >> $seqres.full
+. $tmp.mkfs
+_scratch_mount
+mkdir $SCRATCH_MNT/a/
+# Enough entries to get to a single block directory
+for ((i = 0; i < ( (isize + 255) / 256); i++)); do
+	path="$(printf "%s/a/%0255d" "$SCRATCH_MNT" "$i")"
+	touch "$path"
+done
+inum="$(stat -c %i "$SCRATCH_MNT/a")"
+_scratch_unmount
+
+# Fuzz the directory block so that the touch below will be guaranteed to trip
+# a runtime sickness report in exactly the manner we desire.
+_scratch_xfs_db -x -c "inode $inum" -c "dblock 0" -c 'fuzz bhdr.hdr.owner add' -c print &>> $seqres.full
+
+# Try to allocate space to trigger a metadata corruption event
+echo "Runtime corruption detection"
+_scratch_mount
+$XFS_IO_PROG -c 'healthmon -c -v' $SCRATCH_MNT > $tmp.healthmon &
+sleep 1	# wait for program to start up
+touch $SCRATCH_MNT/a/farts &>> $seqres.full
+_scratch_unmount
+
+wait	# for healthmon to finish
+
+# Did we get errors?
+check_healthmon()
+{
+	cat $tmp.healthmon >> $seqres.full
+	_filter_scratch < $tmp.healthmon | \
+		grep -E '(sick|corrupt)' | \
+		sed -e 's|SCRATCH_MNT/a|VICTIM|g' \
+		    -e 's|SCRATCH_MNT ino [0-9]* gen 0x[0-9a-f]*|VICTIM|g' | \
+		sort | \
+		uniq
+}
+check_healthmon
+
+# Run scrub to trigger a health event from there too.
+echo "Scrub corruption detection"
+_scratch_mount
+if _supports_xfs_scrub $SCRATCH_MNT $SCRATCH_DEV; then
+	$XFS_IO_PROG -c 'healthmon -c -v' $SCRATCH_MNT > $tmp.healthmon &
+	sleep 1	# wait for program to start up
+	$XFS_SCRUB_PROG -n $SCRATCH_MNT &>> $seqres.full
+	_scratch_unmount
+
+	wait	# for healthmon to finish
+
+	# Did we get errors?
+	check_healthmon
+else
+	# mock the output since we don't support scrub
+	_scratch_unmount
+	cat << ENDL
+VICTIM directory: corrupt
+VICTIM directory: sick
+VICTIM parent: corrupt
+ENDL
+fi
+
+status=0
+exit
diff --git a/tests/xfs/1879.out b/tests/xfs/1879.out
new file mode 100644
index 00000000000000..2f6acbe1c4fb22
--- /dev/null
+++ b/tests/xfs/1879.out
@@ -0,0 +1,8 @@
+QA output created by 1879
+Format and mount
+Runtime corruption detection
+VICTIM directory: sick
+Scrub corruption detection
+VICTIM directory: corrupt
+VICTIM directory: sick
+VICTIM parent: corrupt


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 03/13] xfs: test io error reporting via healthmon
  2026-03-03  0:33 ` [PATCHSET v8 2/2] fstests: autonomous self healing of filesystems Darrick J. Wong
  2026-03-03  0:41   ` [PATCH 01/13] xfs: test health monitoring code Darrick J. Wong
  2026-03-03  0:41   ` [PATCH 02/13] xfs: test for metadata corruption error reporting via healthmon Darrick J. Wong
@ 2026-03-03  0:41   ` Darrick J. Wong
  2026-03-03  0:41   ` [PATCH 04/13] xfs: set up common code for testing xfs_healer Darrick J. Wong
                     ` (10 subsequent siblings)
  13 siblings, 0 replies; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:41 UTC (permalink / raw)
  To: zlang, djwong; +Cc: hch, fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Create a new test to make sure the kernel can report IO errors via
health monitoring.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 tests/xfs/1878     |   93 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/1878.out |   10 ++++++
 2 files changed, 103 insertions(+)
 create mode 100755 tests/xfs/1878
 create mode 100644 tests/xfs/1878.out


diff --git a/tests/xfs/1878 b/tests/xfs/1878
new file mode 100755
index 00000000000000..1ff6ae040fb193
--- /dev/null
+++ b/tests/xfs/1878
@@ -0,0 +1,93 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2024-2026 Oracle.  All Rights Reserved.
+#
+# FS QA Test No. 1878
+#
+# Attempt to read and write a file in buffered and directio mode with the
+# health monitoring program running.  Check that healthmon observes all four
+# types of IO errors.
+#
+. ./common/preamble
+_begin_fstest auto quick eio selfhealing
+
+_cleanup()
+{
+	cd /
+	rm -rf $tmp.* $testdir
+	_dmerror_cleanup
+}
+
+. ./common/filter
+. ./common/dmerror
+
+_require_scratch_nocheck
+_require_xfs_io_command healthmon
+_require_dm_target error
+
+filter_healer_errors() {
+	_filter_scratch | \
+		grep -E '(buffered|directio)' | \
+		sed \
+		    -e 's/ino [0-9]*/ino NUM/g' \
+		    -e 's/gen 0x[0-9a-f]*/gen NUM/g' \
+		    -e 's/pos [0-9]*/pos NUM/g' \
+		    -e 's/len [0-9]*/len NUM/g' \
+		    -e 's|SCRATCH_MNT/a|VICTIM|g' \
+		    -e 's|SCRATCH_MNT ino NUM gen NUM|VICTIM|g' | \
+		uniq
+}
+
+# Disable the scratch rt device to avoid test failures relating to the rt
+# bitmap consuming all the free space in our small data device.
+unset SCRATCH_RTDEV
+
+echo "Format and mount"
+_scratch_mkfs > $seqres.full 2>&1
+_dmerror_init no_log
+_dmerror_mount
+
+_require_fs_space $SCRATCH_MNT 65536
+
+# Create a file with written regions far enough apart that the pagecache can't
+# possibly be caching the regions with a single folio.
+testfile=$SCRATCH_MNT/fsync-err-test
+$XFS_IO_PROG -f \
+	-c 'pwrite -b 1m 0 1m' \
+	-c 'pwrite -b 1m 10g 1m' \
+	-c 'pwrite -b 1m 20g 1m' \
+	-c fsync $testfile >> $seqres.full
+
+# First we check if directio errors get reported
+$XFS_IO_PROG -c 'healthmon -c -v' $SCRATCH_MNT >> $tmp.healthmon &
+sleep 1	# wait for program to start up
+_dmerror_load_error_table
+$XFS_IO_PROG -d -c 'pwrite -b 256k 12k 16k' $testfile >> $seqres.full
+$XFS_IO_PROG -d -c 'pread -b 256k 10g 16k' $testfile >> $seqres.full
+_dmerror_load_working_table
+
+_dmerror_unmount
+wait	# for healthmon to finish
+_dmerror_mount
+
+# Next we check if buffered io errors get reported.  We have to write something
+# before loading the error table to ensure the dquots get loaded.
+$XFS_IO_PROG -c 'pwrite -b 256k 20g 1k' -c fsync $testfile >> $seqres.full
+$XFS_IO_PROG -c 'healthmon -c -v' $SCRATCH_MNT >> $tmp.healthmon &
+sleep 1	# wait for program to start up
+_dmerror_load_error_table
+$XFS_IO_PROG -c 'pread -b 256k 12k 16k' $testfile >> $seqres.full
+$XFS_IO_PROG -c 'pwrite -b 256k 20g 16k' -c fsync $testfile >> $seqres.full
+_dmerror_load_working_table
+
+_dmerror_unmount
+wait	# for healthmon to finish
+
+# Did we get errors?
+cat $tmp.healthmon >> $seqres.full
+filter_healer_errors < $tmp.healthmon
+
+_dmerror_cleanup
+
+status=0
+exit
diff --git a/tests/xfs/1878.out b/tests/xfs/1878.out
new file mode 100644
index 00000000000000..f64c440b1a6ed1
--- /dev/null
+++ b/tests/xfs/1878.out
@@ -0,0 +1,10 @@
+QA output created by 1878
+Format and mount
+pwrite: Input/output error
+pread: Input/output error
+pread: Input/output error
+fsync: Input/output error
+VICTIM pos NUM len NUM: directio_write: Input/output error
+VICTIM pos NUM len NUM: directio_read: Input/output error
+VICTIM pos NUM len NUM: buffered_read: Input/output error
+VICTIM pos NUM len NUM: buffered_write: Input/output error


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 04/13] xfs: set up common code for testing xfs_healer
  2026-03-03  0:33 ` [PATCHSET v8 2/2] fstests: autonomous self healing of filesystems Darrick J. Wong
                     ` (2 preceding siblings ...)
  2026-03-03  0:41   ` [PATCH 03/13] xfs: test io " Darrick J. Wong
@ 2026-03-03  0:41   ` Darrick J. Wong
  2026-03-03  0:42   ` [PATCH 05/13] xfs: test xfs_healer's event handling Darrick J. Wong
                     ` (9 subsequent siblings)
  13 siblings, 0 replies; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:41 UTC (permalink / raw)
  To: zlang, djwong; +Cc: hch, fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Add a bunch of common code so that we can test the xfs_healer daemon.
Most of the changes here are to make it easier to manage the systemd
service units for xfs_healer and xfs_scrub.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 common/config  |   14 +++++++
 common/rc      |    5 ++
 common/systemd |   32 ++++++++++++++++
 common/xfs     |  114 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/802  |    4 +-
 5 files changed, 167 insertions(+), 2 deletions(-)


diff --git a/common/config b/common/config
index 1420e35ddfee42..8468a60081f50c 100644
--- a/common/config
+++ b/common/config
@@ -161,6 +161,20 @@ export XFS_ADMIN_PROG="$(type -P xfs_admin)"
 export XFS_GROWFS_PROG=$(type -P xfs_growfs)
 export XFS_SPACEMAN_PROG="$(type -P xfs_spaceman)"
 export XFS_SCRUB_PROG="$(type -P xfs_scrub)"
+
+XFS_HEALER_PROG="$(type -P xfs_healer)"
+XFS_HEALER_START_PROG="$(type -P xfs_healer_start)"
+
+# If not found, try the ones installed in libexec
+if [ ! -x "$XFS_HEALER_PROG" ] && [ -e /usr/libexec/xfsprogs/xfs_healer ]; then
+	XFS_HEALER_PROG=/usr/libexec/xfsprogs/xfs_healer
+fi
+if [ ! -x "$XFS_HEALER_START_PROG" ] && [ -e /usr/libexec/xfsprogs/xfs_healer_start ]; then
+	XFS_HEALER_START_PROG=/usr/libexec/xfsprogs/xfs_healer_start
+fi
+export XFS_HEALER_PROG
+export XFS_HEALER_START_PROG
+
 export XFS_PARALLEL_REPAIR_PROG="$(type -P xfs_prepair)"
 export XFS_PARALLEL_REPAIR64_PROG="$(type -P xfs_prepair64)"
 export __XFSDUMP_PROG="$(type -P xfsdump)"
diff --git a/common/rc b/common/rc
index 38d4b500b3b51f..91db0fb09da891 100644
--- a/common/rc
+++ b/common/rc
@@ -3026,6 +3026,11 @@ _require_xfs_io_command()
 	"label")
 		testio=`$XFS_IO_PROG -c "label" $TEST_DIR 2>&1`
 		;;
+	"verifymedia")
+		testio=`$XFS_IO_PROG -x -c "verifymedia $* 0 0" 2>&1`
+		echo $testio | grep -q "invalid option" && \
+			_notrun "xfs_io $command support is missing"
+		;;
 	"open")
 		# -c "open $f" is broken in xfs_io <= 4.8. Along with the fix,
 		# a new -C flag was introduced to execute one shot commands.
diff --git a/common/systemd b/common/systemd
index b2e24f267b2d93..b4c77c78a8da44 100644
--- a/common/systemd
+++ b/common/systemd
@@ -44,6 +44,18 @@ _systemd_unit_active() {
 	test "$(systemctl is-active "$1")" = "active"
 }
 
+# Wait for up to a certain number of seconds for a service to reach inactive
+# state.
+_systemd_unit_wait() {
+	local svcname="$1"
+	local timeout="${2:-30}"
+
+	for ((i = 0; i < (timeout * 2); i++)); do
+		test "$(systemctl is-active "$svcname")" = "inactive" && break
+		sleep 0.5
+	done
+}
+
 _require_systemd_unit_active() {
 	_require_systemd_unit_defined "$1"
 	_systemd_unit_active "$1" || \
@@ -71,3 +83,23 @@ _systemd_unit_status() {
 	_systemd_installed || return 1
 	systemctl status "$1"
 }
+
+# Start a running systemd unit
+_systemd_unit_start() {
+	systemctl start "$1"
+}
+# Stop a running systemd unit
+_systemd_unit_stop() {
+	systemctl stop "$1"
+}
+
+# Mask or unmask a running systemd unit
+_systemd_unit_mask() {
+	systemctl mask "$1"
+}
+_systemd_unit_unmask() {
+	systemctl unmask "$1"
+}
+_systemd_unit_masked() {
+	systemctl status "$1" 2>/dev/null | grep -q 'Loaded: masked'
+}
diff --git a/common/xfs b/common/xfs
index 7fa0db2e26b4c9..a4a538fde3f173 100644
--- a/common/xfs
+++ b/common/xfs
@@ -2301,3 +2301,117 @@ _filter_bmap_gno()
 		if ($ag =~ /\d+/) {print "$ag "} ;
         '
 }
+
+# Compute the systemd service instance name for a background service and path
+_xfs_systemd_svcname()
+{
+	local arg
+	local template
+	local out
+	local svc="$1"
+	shift
+
+	case "$svc" in
+	--scrub)	arg="-s"; template="xfs_scrub@.service";;
+	--healer)	arg="-h"; template="xfs_healer@.service";;
+	*)		arg="-t $svc"; template="$svc";;
+	esac
+
+	# xfs_io should be able to do all the magic to make this work...
+	out="$($XFS_IO_PROG -c "svcname ${arg} $*" / 2>/dev/null)"
+	if [ -n "$out" ]; then
+		echo "$out"
+		return
+	fi
+
+	# ...but if not, we can fall back to brute force systemd invocations.
+	systemd-escape --template "$template" --path "$*"
+}
+
+# Compute the xfs_healer systemd service instance name for a given path
+_xfs_healer_svcname()
+{
+	_xfs_systemd_svcname --healer "$@"
+}
+
+# Compute the xfs_scrub systemd service instance name for a given path
+_xfs_scrub_svcname()
+{
+	_xfs_systemd_svcname --scrub "$@"
+}
+
+# Run the xfs_healer program on some filesystem
+_xfs_healer() {
+	$XFS_HEALER_PROG "$@"
+}
+
+# Run the xfs_healer program on the scratch fs
+_scratch_xfs_healer() {
+	_xfs_healer "$@" "$SCRATCH_MNT"
+}
+
+# Turn off the background xfs_healer service if any so that it doesn't fix
+# injected metadata errors; then start a background copy of xfs_healer to
+# capture that.
+_invoke_xfs_healer() {
+	local mount="$1"
+	local logfile="$2"
+	shift; shift
+
+	if _systemd_is_running; then
+		local svc="$(_xfs_healer_svcname "$mount")"
+		_systemd_unit_stop "$svc" &>> $seqres.full
+	fi
+
+	$XFS_HEALER_PROG "$mount" "$@" &> "$logfile" &
+	XFS_HEALER_PID=$!
+
+	# Wait 30s for the healer program to really start up
+	for ((i = 0; i < 60; i++)); do
+		test -e "$logfile" && \
+			grep -q 'monitoring started' "$logfile" && \
+			break
+		sleep 0.5
+	done
+}
+
+# Run our own copy of xfs_healer against the scratch device.  Note that
+# unmounting the scratch fs causes the healer daemon to exit, so we don't need
+# to kill it explicitly from _cleanup.
+_scratch_invoke_xfs_healer() {
+	_invoke_xfs_healer "$SCRATCH_MNT" "$@"
+}
+
+# Unmount the filesystem to kill the xfs_healer instance started by
+# _invoke_xfs_healer, and wait up to a certain amount of time for it to exit.
+_kill_xfs_healer() {
+	local unmount="$1"
+	local timeout="${2:-30}"
+	local i
+
+	# Unmount fs to kill healer, then wait for it to finish
+	for ((i = 0; i < (timeout * 2); i++)); do
+		$unmount &>> $seqres.full && break
+		sleep 0.5
+	done
+
+	test -n "$XFS_HEALER_PID" && \
+		kill $XFS_HEALER_PID &>> $seqres.full
+	wait
+	unset XFS_HEALER_PID
+}
+
+# Unmount the scratch fs to kill a _scratch_invoke_xfs_healer instance.
+_scratch_kill_xfs_healer() {
+	local unmount="${1:-_scratch_unmount}"
+	shift
+
+	_kill_xfs_healer "$unmount" "$@"
+}
+
+# Does this mounted filesystem support xfs_healer?
+_require_xfs_healer()
+{
+	_xfs_healer --supported "$@" &>/dev/null || \
+		_notrun "health monitoring not supported on this kernel"
+}
diff --git a/tests/xfs/802 b/tests/xfs/802
index fc4767acb66a55..18312b15b645bd 100755
--- a/tests/xfs/802
+++ b/tests/xfs/802
@@ -105,8 +105,8 @@ run_scrub_service() {
 }
 
 echo "Scrub Scratch FS"
-scratch_path=$(systemd-escape --path "$SCRATCH_MNT")
-run_scrub_service xfs_scrub@$scratch_path
+svc="$(_xfs_scrub_svcname "$SCRATCH_MNT")"
+run_scrub_service "$svc"
 find_scrub_trace "$SCRATCH_MNT"
 
 # Remove the xfs_scrub_all media scan stamp directory (if specified) because we


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 05/13] xfs: test xfs_healer's event handling
  2026-03-03  0:33 ` [PATCHSET v8 2/2] fstests: autonomous self healing of filesystems Darrick J. Wong
                     ` (3 preceding siblings ...)
  2026-03-03  0:41   ` [PATCH 04/13] xfs: set up common code for testing xfs_healer Darrick J. Wong
@ 2026-03-03  0:42   ` Darrick J. Wong
  2026-03-03  0:42   ` [PATCH 06/13] xfs: test xfs_healer can fix a filesystem Darrick J. Wong
                     ` (8 subsequent siblings)
  13 siblings, 0 replies; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:42 UTC (permalink / raw)
  To: zlang, djwong; +Cc: hch, fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Make sure that xfs_healer can handle every type of event that the kernel
can throw at it by initiating a full scrub of a test filesystem.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 tests/xfs/1882     |   44 ++++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/1882.out |    2 ++
 2 files changed, 46 insertions(+)
 create mode 100755 tests/xfs/1882
 create mode 100644 tests/xfs/1882.out


diff --git a/tests/xfs/1882 b/tests/xfs/1882
new file mode 100755
index 00000000000000..2fb4589418401e
--- /dev/null
+++ b/tests/xfs/1882
@@ -0,0 +1,44 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2024-2026 Oracle.  All Rights Reserved.
+#
+# FS QA Test 1882
+#
+# Make sure that xfs_healer correctly handles all the reports that it gets
+# from the kernel.  We simulate this by using the --everything mode so we get
+# all the events, not just the sickness reports.
+#
+. ./common/preamble
+_begin_fstest auto selfhealing
+
+. ./common/filter
+. ./common/fuzzy
+. ./common/systemd
+. ./common/populate
+
+_require_scrub
+_require_xfs_io_command "scrub"		# online check support
+_require_command "$XFS_HEALER_PROG" "xfs_healer"
+_require_scratch
+
+# Does this fs support health monitoring?
+_scratch_mkfs >> $seqres.full
+_scratch_mount
+_require_xfs_healer $SCRATCH_MNT
+_scratch_unmount
+
+# Create a sample fs with all the goodies
+_scratch_populate_cached nofill &>> $seqres.full
+_scratch_mount
+
+_scratch_invoke_xfs_healer "$tmp.healer" --everything
+
+# Run scrub to make some noise
+_scratch_scrub -b -n >> $seqres.full
+
+_scratch_kill_xfs_healer
+cat $tmp.healer >> $seqres.full
+
+echo Silence is golden
+status=0
+exit
diff --git a/tests/xfs/1882.out b/tests/xfs/1882.out
new file mode 100644
index 00000000000000..9b31ccb735cabd
--- /dev/null
+++ b/tests/xfs/1882.out
@@ -0,0 +1,2 @@
+QA output created by 1882
+Silence is golden


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 06/13] xfs: test xfs_healer can fix a filesystem
  2026-03-03  0:33 ` [PATCHSET v8 2/2] fstests: autonomous self healing of filesystems Darrick J. Wong
                     ` (4 preceding siblings ...)
  2026-03-03  0:42   ` [PATCH 05/13] xfs: test xfs_healer's event handling Darrick J. Wong
@ 2026-03-03  0:42   ` Darrick J. Wong
  2026-03-03  0:42   ` [PATCH 07/13] xfs: test xfs_healer can report file I/O errors Darrick J. Wong
                     ` (7 subsequent siblings)
  13 siblings, 0 replies; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:42 UTC (permalink / raw)
  To: zlang, djwong; +Cc: hch, fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Make sure that xfs_healer can actually fix an injected metadata corruption.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 tests/xfs/1884     |   89 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/1884.out |    2 +
 2 files changed, 91 insertions(+)
 create mode 100755 tests/xfs/1884
 create mode 100644 tests/xfs/1884.out


diff --git a/tests/xfs/1884 b/tests/xfs/1884
new file mode 100755
index 00000000000000..1fa6457ad25203
--- /dev/null
+++ b/tests/xfs/1884
@@ -0,0 +1,89 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2024-2026 Oracle.  All Rights Reserved.
+#
+# FS QA Test 1884
+#
+# Ensure that autonomous self healing fixes the filesystem correctly.
+#
+. ./common/preamble
+_begin_fstest auto selfhealing
+
+. ./common/filter
+. ./common/fuzzy
+. ./common/systemd
+
+_require_scrub
+_require_xfs_io_command "repair"	# online repair support
+_require_xfs_db_command "blocktrash"
+_require_command "$XFS_HEALER_PROG" "xfs_healer"
+_require_command "$XFS_PROPERTY_PROG" "xfs_property"
+_require_scratch
+
+_scratch_mkfs >> $seqres.full
+_scratch_mount
+
+_xfs_has_feature $SCRATCH_MNT rmapbt || \
+	_notrun "reverse mapping required to test directory auto-repair"
+_xfs_has_feature $SCRATCH_MNT parent || \
+	_notrun "parent pointers required to test directory auto-repair"
+_require_xfs_healer $SCRATCH_MNT --repair
+
+# Configure the filesystem for automatic repair of the filesystem.
+$XFS_PROPERTY_PROG $SCRATCH_MNT set autofsck=repair >> $seqres.full
+
+# Create a largeish directory
+dblksz=$(_xfs_get_dir_blocksize "$SCRATCH_MNT")
+echo testdata > $SCRATCH_MNT/a
+mkdir -p "$SCRATCH_MNT/some/victimdir"
+for ((i = 0; i < (dblksz / 255); i++)); do
+	fname="$(printf "%0255d" "$i")"
+	ln $SCRATCH_MNT/a $SCRATCH_MNT/some/victimdir/$fname
+done
+
+# Did we get at least two dir blocks?
+dirsize=$(stat -c '%s' $SCRATCH_MNT/some/victimdir)
+test "$dirsize" -gt "$dblksz" || echo "failed to create two-block directory"
+
+# Break the directory, remount filesystem
+_scratch_unmount
+_scratch_xfs_db -x \
+	-c 'path /some/victimdir' \
+	-c 'bmap' \
+	-c 'dblock 1' \
+	-c 'blocktrash -z -0 -o 0 -x 2048 -y 2048 -n 2048' >> $seqres.full
+_scratch_mount
+
+_scratch_invoke_xfs_healer "$tmp.healer" --repair
+
+# Access the broken directory to trigger a repair, then poll the directory
+# for 5 seconds to see if it gets fixed without us needing to intervene.
+ls $SCRATCH_MNT/some/victimdir > /dev/null 2> $tmp.err
+_filter_scratch < $tmp.err
+try=0
+while [ $try -lt 50 ] && grep -q 'Structure needs cleaning' $tmp.err; do
+	echo "try $try saw corruption" >> $seqres.full
+	sleep 0.1
+	ls $SCRATCH_MNT/some/victimdir > /dev/null 2> $tmp.err
+	try=$((try + 1))
+done
+echo "try $try no longer saw corruption or gave up" >> $seqres.full
+_filter_scratch < $tmp.err
+
+# List the dirents of /victimdir to see if it stops reporting corruption
+ls $SCRATCH_MNT/some/victimdir > /dev/null 2> $tmp.err
+try=0
+while [ $try -lt 50 ] && grep -q 'Structure needs cleaning' $tmp.err; do
+	echo "retry $try still saw corruption" >> $seqres.full
+	sleep 0.1
+	ls $SCRATCH_MNT/some/victimdir > /dev/null 2> $tmp.err
+	try=$((try + 1))
+done
+echo "retry $try no longer saw corruption or gave up" >> $seqres.full
+
+# Unmount to kill the healer
+_scratch_kill_xfs_healer
+cat $tmp.healer >> $seqres.full
+
+status=0
+exit
diff --git a/tests/xfs/1884.out b/tests/xfs/1884.out
new file mode 100644
index 00000000000000..929e33da01f92c
--- /dev/null
+++ b/tests/xfs/1884.out
@@ -0,0 +1,2 @@
+QA output created by 1884
+ls: reading directory 'SCRATCH_MNT/some/victimdir': Structure needs cleaning


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 07/13] xfs: test xfs_healer can report file I/O errors
  2026-03-03  0:33 ` [PATCHSET v8 2/2] fstests: autonomous self healing of filesystems Darrick J. Wong
                     ` (5 preceding siblings ...)
  2026-03-03  0:42   ` [PATCH 06/13] xfs: test xfs_healer can fix a filesystem Darrick J. Wong
@ 2026-03-03  0:42   ` Darrick J. Wong
  2026-03-03  0:42   ` [PATCH 08/13] xfs: test xfs_healer can report file media errors Darrick J. Wong
                     ` (6 subsequent siblings)
  13 siblings, 0 replies; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:42 UTC (permalink / raw)
  To: zlang, djwong; +Cc: hch, fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Make sure that xfs_healer can actually report file I/O errors.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 tests/xfs/1896     |  210 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/1896.out |   21 +++++
 2 files changed, 231 insertions(+)
 create mode 100755 tests/xfs/1896
 create mode 100644 tests/xfs/1896.out


diff --git a/tests/xfs/1896 b/tests/xfs/1896
new file mode 100755
index 00000000000000..911e1d5ee8a576
--- /dev/null
+++ b/tests/xfs/1896
@@ -0,0 +1,210 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0-or-later
+# Copyright (c) 2024-2026 Oracle.  All Rights Reserved.
+#
+# FS QA Test No. 1896
+#
+# Check that xfs_healer can report file IO errors.
+
+. ./common/preamble
+_begin_fstest auto quick scrub eio selfhealing
+
+# Override the default cleanup function.
+_cleanup()
+{
+	cd /
+	rm -f $tmp.*
+	_dmerror_cleanup
+}
+
+# Import common functions.
+. ./common/fuzzy
+. ./common/filter
+. ./common/dmerror
+. ./common/systemd
+
+_require_scratch
+_require_scrub
+_require_command "$XFS_HEALER_PROG" "xfs_healer"
+_require_dm_target error
+_require_no_xfs_always_cow	# no out of place writes
+
+# Ignore everything from the healer except for the four IO error log messages.
+# Strip out file handle and range information because the blocksize can vary.
+# Writeback and readahead can trigger multiple error messages due to retries,
+# hence the uniq.
+filter_healer_errors() {
+	_filter_scratch | \
+		grep -E '(buffered|directio)' | \
+		sed \
+		    -e 's/ino [0-9]*/ino NUM/g' \
+		    -e 's/gen 0x[0-9a-f]*/gen NUM/g' \
+		    -e 's/pos [0-9]*/pos NUM/g' \
+		    -e 's/len [0-9]*/len NUM/g' \
+		    -e 's|SCRATCH_MNT/a|VICTIM|g' \
+		    -e 's|SCRATCH_MNT ino NUM gen NUM|VICTIM|g' | \
+		sort | \
+		uniq
+}
+
+_scratch_mkfs >> $seqres.full
+
+#
+# The dm-error map added by this test doesn't work on zoned devices because
+# table sizes need to be aligned to the zone size, and even for zoned on
+# conventional this test will get confused because of the internal RT device.
+#
+# That check requires a mounted file system, so do a dummy mount before setting
+# up DM.
+#
+_scratch_mount
+_require_xfs_scratch_non_zoned
+_require_xfs_healer $SCRATCH_MNT
+_scratch_unmount
+
+_dmerror_init
+_dmerror_mount >> $seqres.full 2>&1
+
+# Write a file with 4 file blocks worth of data, figure out the LBA to target
+victim=$SCRATCH_MNT/a
+file_blksz=$(_get_file_block_size $SCRATCH_MNT)
+$XFS_IO_PROG -f -c "pwrite -S 0x58 0 $((4 * file_blksz))" -c "fsync" $victim >> $seqres.full
+unset errordev
+
+awk_len_prog='{print $6}'
+if _xfs_is_realtime_file $victim; then
+	if ! _xfs_has_feature $SCRATCH_MNT rtgroups; then
+		awk_len_prog='{print $4}'
+	fi
+	errordev="RT"
+fi
+bmap_str="$($XFS_IO_PROG -c "bmap -elpv" $victim | grep "^[[:space:]]*0:")"
+echo "$errordev:$bmap_str" >> $seqres.full
+
+phys="$(echo "$bmap_str" | $AWK_PROG '{print $3}')"
+len="$(echo "$bmap_str" | $AWK_PROG "$awk_len_prog")"
+
+fs_blksz=$(_get_block_size $SCRATCH_MNT)
+echo "file_blksz:$file_blksz:fs_blksz:$fs_blksz" >> $seqres.full
+kernel_sectors_per_fs_block=$((fs_blksz / 512))
+
+# Did we get at least 4 fs blocks worth of extent?
+min_len_sectors=$(( 4 * kernel_sectors_per_fs_block ))
+test "$len" -lt $min_len_sectors && \
+	_fail "could not format a long enough extent on an empty fs??"
+
+phys_start=$(echo "$phys" | sed -e 's/\.\..*//g')
+
+echo "$errordev:$phys:$len:$fs_blksz:$phys_start" >> $seqres.full
+echo "victim file:" >> $seqres.full
+od -tx1 -Ad -c $victim >> $seqres.full
+
+# Set the dmerror table so that all IO will pass through.
+_dmerror_reset_table
+
+cat >> $seqres.full << ENDL
+dmerror before:
+$DMERROR_TABLE
+$DMERROR_RTTABLE
+<end table>
+ENDL
+
+# All sector numbers that we feed to the kernel must be in units of 512b, but
+# they also must be aligned to the device's logical block size.
+logical_block_size=`$here/src/min_dio_alignment $SCRATCH_MNT $SCRATCH_DEV`
+kernel_sectors_per_device_lba=$((logical_block_size / 512))
+
+# Mark as bad one of the device LBAs in the middle of the extent.  Target the
+# second LBA of the third block of the four-block file extent that we allocated
+# earlier, but without overflowing into the fourth file block.
+bad_sector=$(( phys_start + (2 * kernel_sectors_per_fs_block) ))
+bad_len=$kernel_sectors_per_device_lba
+if (( kernel_sectors_per_device_lba < kernel_sectors_per_fs_block )); then
+	bad_sector=$((bad_sector + kernel_sectors_per_device_lba))
+fi
+if (( (bad_sector % kernel_sectors_per_device_lba) != 0)); then
+	echo "bad_sector $bad_sector not congruent with device logical block size $logical_block_size"
+fi
+
+# Remount to flush the page cache, start the healer, and make the LBA bad
+_dmerror_unmount
+_dmerror_mount
+
+_scratch_invoke_xfs_healer "$tmp.healer"
+
+_dmerror_mark_range_bad $bad_sector $bad_len $errordev
+
+cat >> $seqres.full << ENDL
+dmerror after marking bad:
+$DMERROR_TABLE
+$DMERROR_RTTABLE
+<end table>
+ENDL
+
+_dmerror_load_error_table
+
+# See if buffered reads pick it up
+echo "Try buffered read"
+$XFS_IO_PROG -c "pread 0 $((4 * file_blksz))" $victim >> $seqres.full
+
+# See if directio reads pick it up
+echo "Try directio read"
+$XFS_IO_PROG -d -c "pread 0 $((4 * file_blksz))" $victim >> $seqres.full
+
+# See if directio writes pick it up
+echo "Try directio write"
+$XFS_IO_PROG -d -c "pwrite -S 0x58 0 $((4 * file_blksz))" -c fsync $victim >> $seqres.full
+
+# See if buffered writes pick it up
+echo "Try buffered write"
+$XFS_IO_PROG -c "pwrite -S 0x58 0 $((4 * file_blksz))" -c fsync $victim >> $seqres.full
+
+# Now mark the bad range good so that unmount won't fail due to IO errors.
+echo "Fix device"
+_dmerror_mark_range_good $bad_sector $bad_len $errordev
+_dmerror_load_error_table
+
+cat >> $seqres.full << ENDL
+dmerror after marking good:
+$DMERROR_TABLE
+$DMERROR_RTTABLE
+<end table>
+ENDL
+
+# Unmount filesystem to start fresh
+echo "Kill healer"
+_scratch_kill_xfs_healer _dmerror_unmount
+cat $tmp.healer >> $seqres.full
+cat $tmp.healer | filter_healer_errors
+
+# Start the healer again so that can verify that the errors don't persist after
+# we flip back to the good dm table.
+echo "Remount and restart healer"
+_dmerror_mount
+_scratch_invoke_xfs_healer "$tmp.healer"
+
+# See if buffered reads pick it up
+echo "Try buffered read again"
+$XFS_IO_PROG -c "pread 0 $((4 * file_blksz))" $victim >> $seqres.full
+
+# See if directio reads pick it up
+echo "Try directio read again"
+$XFS_IO_PROG -d -c "pread 0 $((4 * file_blksz))" $victim >> $seqres.full
+
+# See if directio writes pick it up
+echo "Try directio write again"
+$XFS_IO_PROG -d -c "pwrite -S 0x58 0 $((4 * file_blksz))" -c fsync $victim >> $seqres.full
+
+# See if buffered writes pick it up
+echo "Try buffered write again"
+$XFS_IO_PROG -c "pwrite -S 0x58 0 $((4 * file_blksz))" -c fsync $victim >> $seqres.full
+
+# Unmount fs to kill healer, then wait for it to finish
+echo "Kill healer again"
+_scratch_kill_xfs_healer _dmerror_unmount
+cat $tmp.healer >> $seqres.full
+cat $tmp.healer | filter_healer_errors
+
+# success, all done
+status=0
+exit
diff --git a/tests/xfs/1896.out b/tests/xfs/1896.out
new file mode 100644
index 00000000000000..1378d4fad44522
--- /dev/null
+++ b/tests/xfs/1896.out
@@ -0,0 +1,21 @@
+QA output created by 1896
+Try buffered read
+pread: Input/output error
+Try directio read
+pread: Input/output error
+Try directio write
+pwrite: Input/output error
+Try buffered write
+fsync: Input/output error
+Fix device
+Kill healer
+VICTIM pos NUM len NUM: buffered_read: Input/output error
+VICTIM pos NUM len NUM: buffered_write: Input/output error
+VICTIM pos NUM len NUM: directio_read: Input/output error
+VICTIM pos NUM len NUM: directio_write: Input/output error
+Remount and restart healer
+Try buffered read again
+Try directio read again
+Try directio write again
+Try buffered write again
+Kill healer again


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 08/13] xfs: test xfs_healer can report file media errors
  2026-03-03  0:33 ` [PATCHSET v8 2/2] fstests: autonomous self healing of filesystems Darrick J. Wong
                     ` (6 preceding siblings ...)
  2026-03-03  0:42   ` [PATCH 07/13] xfs: test xfs_healer can report file I/O errors Darrick J. Wong
@ 2026-03-03  0:42   ` Darrick J. Wong
  2026-03-03  0:43   ` [PATCH 09/13] xfs: test xfs_healer can report filesystem shutdowns Darrick J. Wong
                     ` (5 subsequent siblings)
  13 siblings, 0 replies; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:42 UTC (permalink / raw)
  To: zlang, djwong; +Cc: hch, fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Make sure that xfs_healer can actually report media errors as found by the
kernel.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 tests/xfs/1897     |  172 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/1897.out |    7 ++
 2 files changed, 179 insertions(+)
 create mode 100755 tests/xfs/1897
 create mode 100755 tests/xfs/1897.out


diff --git a/tests/xfs/1897 b/tests/xfs/1897
new file mode 100755
index 00000000000000..4670c333a2d82c
--- /dev/null
+++ b/tests/xfs/1897
@@ -0,0 +1,172 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0-or-later
+# Copyright (c) 2024-2026 Oracle.  All Rights Reserved.
+#
+# FS QA Test No. 1897
+#
+# Check that xfs_healer can report media errors.
+
+. ./common/preamble
+_begin_fstest auto quick scrub eio selfhealing
+
+_cleanup()
+{
+	cd /
+	rm -f $tmp.*
+	_dmerror_cleanup
+}
+
+. ./common/fuzzy
+. ./common/filter
+. ./common/dmerror
+. ./common/systemd
+
+_require_scratch
+_require_scrub
+_require_dm_target error
+_require_command "$XFS_HEALER_PROG" "xfs_healer"
+_require_xfs_io_command verifymedia
+
+filter_healer() {
+	_filter_scratch | \
+		grep -E '(media failed|media error)' | \
+		sed \
+		    -e 's/datadev/DEVICE/g' \
+		    -e 's/rtdev/DEVICE/g' \
+		    -e 's/ino [0-9]*/ino NUM/g' \
+		    -e 's/gen 0x[0-9a-f]*/gen NUM/g' \
+		    -e 's/pos [0-9]*/pos NUM/g' \
+		    -e 's/len [0-9]*/len NUM/g' \
+		    -e 's/0x[0-9a-f]*/NUM/g' \
+		    -e 's|SCRATCH_MNT/a|VICTIM|g' \
+		    -e 's|SCRATCH_MNT ino NUM gen NUM|VICTIM|g'
+}
+
+filter_verify() {
+	sed -e 's/\([a-z]*dev\): verify error at offset \([0-9]*\) length \([0-9]*\)/DEVICE: verify error at offset XXX length XXX/g'
+}
+
+_scratch_mkfs >> $seqres.full
+
+# The dm-error map added by this test doesn't work on zoned devices because
+# table sizes need to be aligned to the zone size, and even for zoned on
+# conventional this test will get confused because of the internal RT device.
+#
+# That check requires a mounted file system, so do a dummy mount before setting
+# up DM.
+_scratch_mount
+_require_xfs_scratch_non_zoned
+_require_xfs_healer $SCRATCH_MNT
+_scratch_unmount
+
+_dmerror_init
+_dmerror_mount
+
+# Write a file with 4 file blocks worth of data, figure out the LBA to target
+victim=$SCRATCH_MNT/a
+file_blksz=$(_get_file_block_size $SCRATCH_MNT)
+$XFS_IO_PROG -f -c "pwrite -S 0x58 0 $((4 * file_blksz))" -c "fsync" $victim >> $seqres.full
+unset errordev
+verifymediadev="-d"
+
+awk_len_prog='{print $6}'
+if _xfs_is_realtime_file $victim; then
+	if ! _xfs_has_feature $SCRATCH_MNT rtgroups; then
+		awk_len_prog='{print $4}'
+	fi
+	errordev="RT"
+	verifymediadev="-r"
+fi
+bmap_str="$($XFS_IO_PROG -c "bmap -elpv" $victim | grep "^[[:space:]]*0:")"
+echo "$errordev:$bmap_str" >> $seqres.full
+
+phys="$(echo "$bmap_str" | $AWK_PROG '{print $3}')"
+len="$(echo "$bmap_str" | $AWK_PROG "$awk_len_prog")"
+
+fs_blksz=$(_get_block_size $SCRATCH_MNT)
+echo "file_blksz:$file_blksz:fs_blksz:$fs_blksz" >> $seqres.full
+kernel_sectors_per_fs_block=$((fs_blksz / 512))
+
+# Did we get at least 4 fs blocks worth of extent?
+min_len_sectors=$(( 4 * kernel_sectors_per_fs_block ))
+test "$len" -lt $min_len_sectors && \
+	_fail "could not format a long enough extent on an empty fs??"
+
+phys_start=$(echo "$phys" | sed -e 's/\.\..*//g')
+
+echo "$errordev:$phys:$len:$fs_blksz:$phys_start" >> $seqres.full
+echo "victim file:" >> $seqres.full
+od -tx1 -Ad -c $victim >> $seqres.full
+
+# Set the dmerror table so that all IO will pass through.
+_dmerror_reset_table
+
+cat >> $seqres.full << ENDL
+dmerror before:
+$DMERROR_TABLE
+$DMERROR_RTTABLE
+<end table>
+ENDL
+
+# All sector numbers that we feed to the kernel must be in units of 512b, but
+# they also must be aligned to the device's logical block size.
+logical_block_size=`$here/src/min_dio_alignment $SCRATCH_MNT $SCRATCH_DEV`
+kernel_sectors_per_device_lba=$((logical_block_size / 512))
+
+# Pretend as bad one of the device LBAs in the middle of the extent.  Target
+# the second LBA of the third block of the four-block file extent that we
+# allocated earlier, but without overflowing into the fourth file block.
+bad_sector=$(( phys_start + (2 * kernel_sectors_per_fs_block) ))
+bad_len=$kernel_sectors_per_device_lba
+if (( kernel_sectors_per_device_lba < kernel_sectors_per_fs_block )); then
+	bad_sector=$((bad_sector + kernel_sectors_per_device_lba))
+fi
+if (( (bad_sector % kernel_sectors_per_device_lba) != 0)); then
+	echo "bad_sector $bad_sector not congruent with device logical block size $logical_block_size"
+fi
+_dmerror_mark_range_bad $bad_sector $bad_len $errordev
+
+cat >> $seqres.full << ENDL
+dmerror after marking bad:
+$DMERROR_TABLE
+$DMERROR_RTTABLE
+<end table>
+ENDL
+
+_dmerror_load_error_table
+
+echo "Simulate media error"
+_scratch_invoke_xfs_healer "$tmp.healer"
+echo "verifymedia $verifymediadev -R $((bad_sector * 512)) $(((bad_sector + bad_len) * 512))" >> $seqres.full
+$XFS_IO_PROG -x -c "verifymedia $verifymediadev -R $((bad_sector * 512)) $(((bad_sector + bad_len) * 512))" $SCRATCH_MNT 2>&1 | filter_verify
+
+# Now mark the bad range good so that a retest shows no media failure.
+_dmerror_mark_range_good $bad_sector $bad_len $errordev
+_dmerror_load_error_table
+
+cat >> $seqres.full << ENDL
+dmerror after marking good:
+$DMERROR_TABLE
+$DMERROR_RTTABLE
+<end table>
+ENDL
+
+echo "No more media error"
+echo "verifymedia $verifymediadev -R $((bad_sector * 512)) $(((bad_sector + bad_len) * 512))" >> $seqres.full
+$XFS_IO_PROG -x -c "verifymedia $verifymediadev -R $((bad_sector * 512)) $(((bad_sector + bad_len) * 512))" $SCRATCH_MNT >> $seqres.full
+
+# Unmount filesystem to start fresh
+echo "Kill healer"
+_scratch_kill_xfs_healer _dmerror_unmount
+
+# filesystems without rmap do not translate media errors to lost file ranges
+# so fake the output
+_xfs_has_feature "$SCRATCH_DEV" rmapbt || \
+	echo "VICTIM pos 0 len 0: media failed" >> $tmp.healer
+
+cat $tmp.healer >> $seqres.full
+cat $tmp.healer | filter_healer
+
+# success, all done
+status=0
+exit
diff --git a/tests/xfs/1897.out b/tests/xfs/1897.out
new file mode 100755
index 00000000000000..1bb615c3119dce
--- /dev/null
+++ b/tests/xfs/1897.out
@@ -0,0 +1,7 @@
+QA output created by 1897
+Simulate media error
+DEVICE: verify error at offset XXX length XXX: Input/output error
+No more media error
+Kill healer
+SCRATCH_MNT DEVICE daddr NUM bbcount NUM: media error
+VICTIM pos NUM len NUM: media failed


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 09/13] xfs: test xfs_healer can report filesystem shutdowns
  2026-03-03  0:33 ` [PATCHSET v8 2/2] fstests: autonomous self healing of filesystems Darrick J. Wong
                     ` (7 preceding siblings ...)
  2026-03-03  0:42   ` [PATCH 08/13] xfs: test xfs_healer can report file media errors Darrick J. Wong
@ 2026-03-03  0:43   ` Darrick J. Wong
  2026-03-03  0:43   ` [PATCH 10/13] xfs: test xfs_healer can initiate full filesystem repairs Darrick J. Wong
                     ` (4 subsequent siblings)
  13 siblings, 0 replies; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:43 UTC (permalink / raw)
  To: zlang, djwong; +Cc: hch, fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Make sure that xfs_healer can actually report abnormal filesystem shutdowns.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 tests/xfs/1898     |   37 +++++++++++++++++++++++++++++++++++++
 tests/xfs/1898.out |    4 ++++
 2 files changed, 41 insertions(+)
 create mode 100755 tests/xfs/1898
 create mode 100755 tests/xfs/1898.out


diff --git a/tests/xfs/1898 b/tests/xfs/1898
new file mode 100755
index 00000000000000..2b6c72093e7021
--- /dev/null
+++ b/tests/xfs/1898
@@ -0,0 +1,37 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0-or-later
+# Copyright (c) 2024-2026 Oracle.  All Rights Reserved.
+#
+# FS QA Test No. 1898
+#
+# Check that xfs_healer can report filesystem shutdowns.
+
+. ./common/preamble
+_begin_fstest auto quick scrub eio selfhealing
+
+. ./common/fuzzy
+. ./common/filter
+. ./common/systemd
+
+_require_scratch_nocheck
+_require_scrub
+_require_command "$XFS_HEALER_PROG" "xfs_healer"
+
+_scratch_mkfs >> $seqres.full
+_scratch_mount
+_require_xfs_healer $SCRATCH_MNT
+$XFS_IO_PROG -f -c "pwrite -S 0x58 0 500k" -c "fsync" $victim >> $seqres.full
+
+echo "Start healer and shut down"
+_scratch_invoke_xfs_healer "$tmp.healer"
+_scratch_shutdown -f
+
+# Unmount filesystem to start fresh
+echo "Kill healer"
+_scratch_kill_xfs_healer
+cat $tmp.healer >> $seqres.full
+cat $tmp.healer | _filter_scratch | grep 'shut down'
+
+# success, all done
+status=0
+exit
diff --git a/tests/xfs/1898.out b/tests/xfs/1898.out
new file mode 100755
index 00000000000000..f71f848da810ce
--- /dev/null
+++ b/tests/xfs/1898.out
@@ -0,0 +1,4 @@
+QA output created by 1898
+Start healer and shut down
+Kill healer
+SCRATCH_MNT: filesystem shut down due to forced unmount


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 10/13] xfs: test xfs_healer can initiate full filesystem repairs
  2026-03-03  0:33 ` [PATCHSET v8 2/2] fstests: autonomous self healing of filesystems Darrick J. Wong
                     ` (8 preceding siblings ...)
  2026-03-03  0:43   ` [PATCH 09/13] xfs: test xfs_healer can report filesystem shutdowns Darrick J. Wong
@ 2026-03-03  0:43   ` Darrick J. Wong
  2026-03-03  0:43   ` [PATCH 11/13] xfs: test xfs_healer can follow mount moves Darrick J. Wong
                     ` (3 subsequent siblings)
  13 siblings, 0 replies; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:43 UTC (permalink / raw)
  To: zlang, djwong; +Cc: hch, fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Make sure that when xfs_healer can't perform a spot repair, it will actually
start up xfs_scrub to perform a full scan and repair.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 tests/xfs/1899     |  108 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/1899.out |    3 +
 2 files changed, 111 insertions(+)
 create mode 100755 tests/xfs/1899
 create mode 100644 tests/xfs/1899.out


diff --git a/tests/xfs/1899 b/tests/xfs/1899
new file mode 100755
index 00000000000000..5d35ca8265645f
--- /dev/null
+++ b/tests/xfs/1899
@@ -0,0 +1,108 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2024-2026 Oracle.  All Rights Reserved.
+#
+# FS QA Test 1899
+#
+# Ensure that autonomous self healing works fixes the filesystem correctly
+# even if the spot repair doesn't work and it falls back to a full fsck.
+#
+. ./common/preamble
+_begin_fstest auto selfhealing
+
+. ./common/filter
+. ./common/fuzzy
+. ./common/systemd
+
+_require_scrub
+_require_xfs_io_command "repair"	# online repair support
+_require_xfs_db_command "blocktrash"
+_require_command "$XFS_HEALER_PROG" "xfs_healer"
+_require_command "$XFS_PROPERTY_PROG" "xfs_property"
+_require_scratch
+_require_systemd_unit_defined "xfs_scrub@.service"
+
+_scratch_mkfs >> $seqres.full
+_scratch_mount
+
+_xfs_has_feature $SCRATCH_MNT rmapbt || \
+	_notrun "reverse mapping required to test directory auto-repair"
+_xfs_has_feature $SCRATCH_MNT parent || \
+	_notrun "parent pointers required to test directory auto-repair"
+_require_xfs_healer $SCRATCH_MNT --repair
+
+filter_healer() {
+	_filter_scratch | \
+		grep 'Full repairs in progress' | \
+		uniq
+}
+
+# Configure the filesystem for automatic repair of the filesystem.
+$XFS_PROPERTY_PROG $SCRATCH_MNT set autofsck=repair >> $seqres.full
+
+# Create a largeish directory
+dblksz=$(_xfs_get_dir_blocksize "$SCRATCH_MNT")
+echo testdata > $SCRATCH_MNT/a
+mkdir -p "$SCRATCH_MNT/some/victimdir"
+for ((i = 0; i < (dblksz / 255); i++)); do
+	fname="$(printf "%0255d" "$i")"
+	ln $SCRATCH_MNT/a $SCRATCH_MNT/some/victimdir/$fname
+done
+
+# Did we get at least two dir blocks?
+dirsize=$(stat -c '%s' $SCRATCH_MNT/some/victimdir)
+test "$dirsize" -gt "$dblksz" || echo "failed to create two-block directory"
+
+# Break the directory, remount filesystem
+_scratch_unmount
+_scratch_xfs_db -x \
+	-c 'path /some/victimdir' \
+	-c 'bmap' \
+	-c 'dblock 1' \
+	-c 'blocktrash -z -0 -o 0 -x 2048 -y 2048 -n 2048' \
+	-c 'path /a' \
+	-c 'bmap -a' \
+	-c 'ablock 1' \
+	-c 'blocktrash -z -0 -o 0 -x 2048 -y 2048 -n 2048' \
+	>> $seqres.full
+_scratch_mount
+
+_scratch_invoke_xfs_healer "$tmp.healer" --repair
+
+# Access the broken directory to trigger a repair, then poll the directory
+# for 5 seconds to see if it gets fixed without us needing to intervene.
+ls $SCRATCH_MNT/some/victimdir > /dev/null 2> $tmp.err
+_filter_scratch < $tmp.err
+try=0
+while [ $try -lt 50 ] && grep -q 'Structure needs cleaning' $tmp.err; do
+	echo "try $try saw corruption" >> $seqres.full
+	sleep 0.1
+	ls $SCRATCH_MNT/some/victimdir > /dev/null 2> $tmp.err
+	try=$((try + 1))
+done
+echo "try $try no longer saw corruption or gave up" >> $seqres.full
+_filter_scratch < $tmp.err
+
+# Wait for the background fixer to finish
+svc="$(_xfs_scrub_svcname "$SCRATCH_MNT")"
+_systemd_unit_wait "$svc"
+
+# List the dirents of /victimdir and parent pointers of /a to see if they both
+# stop reporting corruption
+(ls $SCRATCH_MNT/some/victimdir ; $XFS_IO_PROG -c 'parent') > /dev/null 2> $tmp.err
+try=0
+while [ $try -lt 50 ] && grep -q 'Structure needs cleaning' $tmp.err; do
+	echo "retry $try still saw corruption" >> $seqres.full
+	sleep 0.1
+	(ls $SCRATCH_MNT/some/victimdir ; $XFS_IO_PROG -c 'parent') > /dev/null 2> $tmp.err
+	try=$((try + 1))
+done
+echo "retry $try no longer saw corruption or gave up" >> $seqres.full
+
+# Unmount to kill the healer
+_scratch_kill_xfs_healer
+cat $tmp.healer >> $seqres.full
+cat $tmp.healer | filter_healer
+
+status=0
+exit
diff --git a/tests/xfs/1899.out b/tests/xfs/1899.out
new file mode 100644
index 00000000000000..5345fd400f3627
--- /dev/null
+++ b/tests/xfs/1899.out
@@ -0,0 +1,3 @@
+QA output created by 1899
+ls: reading directory 'SCRATCH_MNT/some/victimdir': Structure needs cleaning
+SCRATCH_MNT: Full repairs in progress.


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 11/13] xfs: test xfs_healer can follow mount moves
  2026-03-03  0:33 ` [PATCHSET v8 2/2] fstests: autonomous self healing of filesystems Darrick J. Wong
                     ` (9 preceding siblings ...)
  2026-03-03  0:43   ` [PATCH 10/13] xfs: test xfs_healer can initiate full filesystem repairs Darrick J. Wong
@ 2026-03-03  0:43   ` Darrick J. Wong
  2026-03-03  0:43   ` [PATCH 12/13] xfs: test xfs_healer wont repair the wrong filesystem Darrick J. Wong
                     ` (2 subsequent siblings)
  13 siblings, 0 replies; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:43 UTC (permalink / raw)
  To: zlang, djwong; +Cc: hch, fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Make sure that when xfs_healer needs to reopen a filesystem to repair it,
it can still find the filesystem even if it has been mount --move'd.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 tests/xfs/1900     |  115 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/1900.out |    2 +
 2 files changed, 117 insertions(+)
 create mode 100755 tests/xfs/1900
 create mode 100755 tests/xfs/1900.out


diff --git a/tests/xfs/1900 b/tests/xfs/1900
new file mode 100755
index 00000000000000..9a8f9fabd124ad
--- /dev/null
+++ b/tests/xfs/1900
@@ -0,0 +1,115 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2024-2026 Oracle.  All Rights Reserved.
+#
+# FS QA Test 1900
+#
+# Ensure that autonomous self healing fixes the filesystem correctly even if
+# the original mount has moved somewhere else.
+#
+. ./common/preamble
+_begin_fstest auto selfhealing
+
+. ./common/filter
+. ./common/fuzzy
+. ./common/systemd
+
+_cleanup()
+{
+	command -v _kill_fsstress &>/dev/null && _kill_fsstress
+	cd /
+	rm -r -f $tmp.*
+	if [ -n "$new_dir" ]; then
+		_unmount "$new_dir" &>/dev/null
+		rm -rf "$new_dir"
+	fi
+}
+
+_require_test
+_require_scrub
+_require_xfs_io_command "repair"	# online repair support
+_require_xfs_db_command "blocktrash"
+_require_command "$XFS_HEALER_PROG" "xfs_healer"
+_require_command "$XFS_PROPERTY_PROG" "xfs_property"
+_require_scratch
+
+_scratch_mkfs >> $seqres.full
+_scratch_mount
+
+_xfs_has_feature $SCRATCH_MNT rmapbt || \
+	_notrun "reverse mapping required to test directory auto-repair"
+_xfs_has_feature $SCRATCH_MNT parent || \
+	_notrun "parent pointers required to test directory auto-repair"
+_require_xfs_healer $SCRATCH_MNT --repair
+
+# Configure the filesystem for automatic repair of the filesystem.
+$XFS_PROPERTY_PROG $SCRATCH_MNT set autofsck=repair >> $seqres.full
+
+# Create a largeish directory
+dblksz=$(_xfs_get_dir_blocksize "$SCRATCH_MNT")
+echo testdata > $SCRATCH_MNT/a
+mkdir -p "$SCRATCH_MNT/some/victimdir"
+for ((i = 0; i < (dblksz / 255); i++)); do
+	fname="$(printf "%0255d" "$i")"
+	ln $SCRATCH_MNT/a $SCRATCH_MNT/some/victimdir/$fname
+done
+
+# Did we get at least two dir blocks?
+dirsize=$(stat -c '%s' $SCRATCH_MNT/some/victimdir)
+test "$dirsize" -gt "$dblksz" || echo "failed to create two-block directory"
+
+# Break the directory, remount filesystem
+_scratch_unmount
+_scratch_xfs_db -x \
+	-c 'path /some/victimdir' \
+	-c 'bmap' \
+	-c 'dblock 1' \
+	-c 'blocktrash -z -0 -o 0 -x 2048 -y 2048 -n 2048' >> $seqres.full
+_scratch_mount
+
+_scratch_invoke_xfs_healer "$tmp.healer" --repair
+
+# Move the scratch filesystem to a completely different mountpoint so that
+# we can test if the healer can find it again.
+new_dir=$TEST_DIR/moocow
+mkdir -p $new_dir
+_mount --bind $SCRATCH_MNT $new_dir
+_unmount $SCRATCH_MNT
+
+df -t xfs >> $seqres.full
+
+# Access the broken directory to trigger a repair, then poll the directory
+# for 5 seconds to see if it gets fixed without us needing to intervene.
+ls $new_dir/some/victimdir > /dev/null 2> $tmp.err
+_filter_scratch < $tmp.err | _filter_test_dir
+try=0
+while [ $try -lt 50 ] && grep -q 'Structure needs cleaning' $tmp.err; do
+	echo "try $try saw corruption" >> $seqres.full
+	sleep 0.1
+	ls $new_dir/some/victimdir > /dev/null 2> $tmp.err
+	try=$((try + 1))
+done
+echo "try $try no longer saw corruption or gave up" >> $seqres.full
+_filter_scratch < $tmp.err | _filter_test_dir
+
+# List the dirents of /victimdir to see if it stops reporting corruption
+ls $new_dir/some/victimdir > /dev/null 2> $tmp.err
+try=0
+while [ $try -lt 50 ] && grep -q 'Structure needs cleaning' $tmp.err; do
+	echo "retry $try still saw corruption" >> $seqres.full
+	sleep 0.1
+	ls $SCRATCH_MNT/some/victimdir > /dev/null 2> $tmp.err
+	try=$((try + 1))
+done
+echo "retry $try no longer saw corruption or gave up" >> $seqres.full
+
+new_dir_unmount() {
+	_unmount $new_dir
+}
+
+# Unmount to kill the healer
+_scratch_kill_xfs_healer new_dir_unmount
+cat $tmp.healer >> $seqres.full
+
+status=0
+exit
diff --git a/tests/xfs/1900.out b/tests/xfs/1900.out
new file mode 100755
index 00000000000000..604c9eb5eb10f4
--- /dev/null
+++ b/tests/xfs/1900.out
@@ -0,0 +1,2 @@
+QA output created by 1900
+ls: reading directory 'TEST_DIR/moocow/some/victimdir': Structure needs cleaning


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 12/13] xfs: test xfs_healer wont repair the wrong filesystem
  2026-03-03  0:33 ` [PATCHSET v8 2/2] fstests: autonomous self healing of filesystems Darrick J. Wong
                     ` (10 preceding siblings ...)
  2026-03-03  0:43   ` [PATCH 11/13] xfs: test xfs_healer can follow mount moves Darrick J. Wong
@ 2026-03-03  0:43   ` Darrick J. Wong
  2026-03-03  0:44   ` [PATCH 13/13] xfs: test xfs_healer background service Darrick J. Wong
  2026-03-03  0:47   ` [PATCH 14/13] xfs: test xfs_healer startup service Darrick J. Wong
  13 siblings, 0 replies; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:43 UTC (permalink / raw)
  To: zlang, djwong; +Cc: hch, fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Make sure that when xfs_healer needs to reopen a filesystem to repair it, it
won't latch on to another xfs filesystem that has been mounted atop the same
mountpoint.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 tests/xfs/1901     |  137 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/1901.out |    2 +
 2 files changed, 139 insertions(+)
 create mode 100755 tests/xfs/1901
 create mode 100755 tests/xfs/1901.out


diff --git a/tests/xfs/1901 b/tests/xfs/1901
new file mode 100755
index 00000000000000..c92dcf9a3b3d48
--- /dev/null
+++ b/tests/xfs/1901
@@ -0,0 +1,137 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025-2026 Oracle.  All Rights Reserved.
+#
+# FS QA Test 1901
+#
+# Ensure that autonomous self healing won't fix the wrong filesystem if a
+# snapshot of the original filesystem is now mounted on the same directory as
+# the original.
+#
+. ./common/preamble
+_begin_fstest auto selfhealing
+
+. ./common/filter
+. ./common/fuzzy
+. ./common/systemd
+
+_cleanup()
+{
+	command -v _kill_fsstress &>/dev/null && _kill_fsstress
+	cd /
+	rm -r -f $tmp.*
+	test -e "$mntpt" && _unmount "$mntpt" &>/dev/null
+	test -e "$mntpt" && _unmount "$mntpt" &>/dev/null
+	test -e "$loop1" && _destroy_loop_device "$loop1"
+	test -e "$loop2" && _destroy_loop_device "$loop2"
+	test -e "$testdir" && rm -r -f "$testdir"
+}
+
+_require_test
+_require_scrub
+_require_xfs_io_command "repair"	# online repair support
+_require_xfs_db_command "blocktrash"
+_require_command "$XFS_HEALER_PROG" "xfs_healer"
+_require_command "$XFS_PROPERTY_PROG" "xfs_property"
+
+testdir=$TEST_DIR/$seq
+mntpt=$testdir/mount
+disk1=$testdir/disk1
+disk2=$testdir/disk2
+
+mkdir -p "$mntpt"
+$XFS_IO_PROG -f -c "truncate 300m" $disk1
+$XFS_IO_PROG -f -c "truncate 300m" $disk2
+loop1="$(_create_loop_device "$disk1")"
+
+filter_mntpt() {
+	sed -e "s|$mntpt|MNTPT|g"
+}
+
+_mkfs_dev "$loop1" >> $seqres.full
+_mount "$loop1" "$mntpt" || _notrun "cannot mount victim filesystem"
+
+_xfs_has_feature $mntpt rmapbt || \
+	_notrun "reverse mapping required to test directory auto-repair"
+_xfs_has_feature $mntpt parent || \
+	_notrun "parent pointers required to test directory auto-repair"
+_require_xfs_healer $mntpt --repair
+
+# Configure the filesystem for automatic repair of the filesystem.
+$XFS_PROPERTY_PROG $mntpt set autofsck=repair >> $seqres.full
+
+# Create a largeish directory
+dblksz=$(_xfs_get_dir_blocksize "$mntpt")
+echo testdata > $mntpt/a
+mkdir -p "$mntpt/some/victimdir"
+for ((i = 0; i < (dblksz / 255); i++)); do
+	fname="$(printf "%0255d" "$i")"
+	ln $mntpt/a $mntpt/some/victimdir/$fname
+done
+
+# Did we get at least two dir blocks?
+dirsize=$(stat -c '%s' $mntpt/some/victimdir)
+test "$dirsize" -gt "$dblksz" || echo "failed to create two-block directory"
+
+# Clone the fs, break the directory, remount filesystem
+_unmount "$mntpt"
+
+cp --sparse=always "$disk1" "$disk2" || _fail "cannot copy disk1"
+loop2="$(_create_loop_device_like_bdev "$disk2" "$loop1")"
+
+$XFS_DB_PROG "$loop1" -x \
+	-c 'path /some/victimdir' \
+	-c 'bmap' \
+	-c 'dblock 1' \
+	-c 'blocktrash -z -0 -o 0 -x 2048 -y 2048 -n 2048' >> $seqres.full
+_mount "$loop1" "$mntpt" || _fail "cannot mount broken fs"
+
+_invoke_xfs_healer "$mntpt" "$tmp.healer" --repair
+
+# Stop the healer process so that it can't read error events while we do some
+# shenanigans.
+test -n "$XFS_HEALER_PID" || _fail "nobody set XFS_HEALER_PID?"
+kill -STOP $XFS_HEALER_PID
+
+
+echo "LOG $XFS_HEALER_PID SO FAR:" >> $seqres.full
+cat $tmp.healer >> $seqres.full
+
+# Access the broken directory to trigger a repair event, which will not yet be
+# processed.
+ls $mntpt/some/victimdir > /dev/null 2> $tmp.err
+filter_mntpt < $tmp.err
+
+ps auxfww | grep xfs_healer >> $seqres.full
+
+echo "LOG AFTER TRYING TO POKE:" >> $seqres.full
+cat $tmp.healer >> $seqres.full
+
+# Mount the clone filesystem to the same mountpoint so that the healer cannot
+# actually reopen it to perform repairs.
+_mount "$loop2" "$mntpt" -o nouuid || _fail "cannot mount decoy fs"
+
+grep -w xfs /proc/mounts >> $seqres.full
+
+# Continue the healer process so it can handle events now.  Wait a few seconds
+# while it fails to reopen disk1's mount point to repair things.
+kill -CONT $XFS_HEALER_PID
+sleep 2
+
+new_dir_unmount() {
+	_unmount "$mntpt"
+	_unmount "$mntpt"
+}
+
+# Unmount to kill the healer
+_kill_xfs_healer new_dir_unmount
+echo "LOG AFTER FAILURE" >> $seqres.full
+cat $tmp.healer >> $seqres.full
+
+# Did the healer log complaints about not being able to reopen the mountpoint
+# to enact repairs?
+grep -q 'Stale file handle' $tmp.healer || \
+	echo "Should have seen stale file handle complaints"
+
+status=0
+exit
diff --git a/tests/xfs/1901.out b/tests/xfs/1901.out
new file mode 100755
index 00000000000000..ff83e03725307a
--- /dev/null
+++ b/tests/xfs/1901.out
@@ -0,0 +1,2 @@
+QA output created by 1901
+ls: reading directory 'MNTPT/some/victimdir': Structure needs cleaning


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 13/13] xfs: test xfs_healer background service
  2026-03-03  0:33 ` [PATCHSET v8 2/2] fstests: autonomous self healing of filesystems Darrick J. Wong
                     ` (11 preceding siblings ...)
  2026-03-03  0:43   ` [PATCH 12/13] xfs: test xfs_healer wont repair the wrong filesystem Darrick J. Wong
@ 2026-03-03  0:44   ` Darrick J. Wong
  2026-03-03  0:47   ` [PATCH 14/13] xfs: test xfs_healer startup service Darrick J. Wong
  13 siblings, 0 replies; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:44 UTC (permalink / raw)
  To: zlang, djwong; +Cc: hch, fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Make sure that when xfs_healer can monitor and repair filesystems when it's
running as a systemd service, which is the intended usage model.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 tests/xfs/1902     |  152 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/1902.out |    2 +
 2 files changed, 154 insertions(+)
 create mode 100755 tests/xfs/1902
 create mode 100755 tests/xfs/1902.out


diff --git a/tests/xfs/1902 b/tests/xfs/1902
new file mode 100755
index 00000000000000..d327995df8c5b0
--- /dev/null
+++ b/tests/xfs/1902
@@ -0,0 +1,152 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2024-2026 Oracle.  All Rights Reserved.
+#
+# FS QA Test 1902
+#
+# Ensure that autonomous self healing fixes the filesystem correctly when
+# running in a systemd service
+#
+# unreliable_in_parallel: this test runs the xfs_healer systemd service, which
+# cannot be isolated to a specific testcase with the way check-parallel is
+# implemented.
+#
+. ./common/preamble
+_begin_fstest auto selfhealing unreliable_in_parallel
+
+_cleanup()
+{
+	cd /
+	if [ -n "$new_svcfile" ]; then
+		rm -f "$new_svcfile"
+		systemctl daemon-reload
+	fi
+	rm -r -f $tmp.*
+}
+
+. ./common/filter
+. ./common/fuzzy
+. ./common/systemd
+
+_require_systemd_is_running
+_require_systemd_unit_defined xfs_healer@.service
+_require_scrub
+_require_xfs_io_command "repair"	# online repair support
+_require_xfs_db_command "blocktrash"
+_require_command "$XFS_HEALER_PROG" "xfs_healer"
+_require_command "$XFS_PROPERTY_PROG" "xfs_property"
+_require_scratch
+
+_scratch_mkfs >> $seqres.full
+_scratch_mount
+
+_xfs_has_feature $SCRATCH_MNT rmapbt || \
+	_notrun "reverse mapping required to test directory auto-repair"
+_xfs_has_feature $SCRATCH_MNT parent || \
+	_notrun "parent pointers required to test directory auto-repair"
+_require_xfs_healer $SCRATCH_MNT --repair
+
+# Configure the filesystem for automatic repair of the filesystem.
+$XFS_PROPERTY_PROG $SCRATCH_MNT set autofsck=repair >> $seqres.full
+
+# Create a largeish directory
+dblksz=$(_xfs_get_dir_blocksize "$SCRATCH_MNT")
+echo testdata > $SCRATCH_MNT/a
+mkdir -p "$SCRATCH_MNT/some/victimdir"
+for ((i = 0; i < (dblksz / 255); i++)); do
+	fname="$(printf "%0255d" "$i")"
+	ln $SCRATCH_MNT/a $SCRATCH_MNT/some/victimdir/$fname
+done
+
+# Did we get at least two dir blocks?
+dirsize=$(stat -c '%s' $SCRATCH_MNT/some/victimdir)
+test "$dirsize" -gt "$dblksz" || echo "failed to create two-block directory"
+
+# Break the directory
+_scratch_unmount
+_scratch_xfs_db -x \
+	-c 'path /some/victimdir' \
+	-c 'bmap' \
+	-c 'dblock 1' \
+	-c 'blocktrash -z -0 -o 0 -x 2048 -y 2048 -n 2048' >> $seqres.full
+
+# Find the existing xfs_healer@ service definition, figure out where we're
+# going to land our test-specific override
+orig_svcfile="$(_systemd_unit_path "xfs_healer@-.service")"
+test -f "$orig_svcfile" || \
+	_notrun "cannot find xfs_healer@ service file"
+
+new_svcdir="$(_systemd_runtime_dir)"
+test -d "$new_svcdir" || \
+	_notrun "cannot find runtime systemd service dir"
+
+# We need to make some local mods to the xfs_healer@ service definition
+# so we fork it and create a new service just for this test.
+new_healer_template="xfs_healer_fstest@.service"
+new_healer_svc="$(_xfs_systemd_svcname "$new_healer_template" "$SCRATCH_MNT")"
+_systemd_unit_status "$new_healer_svc" 2>&1 | \
+	grep -E -q '(could not be found|Loaded: not-found)' || \
+	_notrun "systemd service \"$new_healer_svc\" found, will not mess with this"
+
+new_svcfile="$new_svcdir/$new_healer_template"
+cp "$orig_svcfile" "$new_svcfile"
+
+# Pick up all the CLI args except for --repair and --no-autofsck because we're
+# going to force it to --autofsck below
+execargs="$(grep '^ExecStart=' $new_svcfile | \
+	    sed -e 's/^ExecStart=\S*//g' \
+	        -e 's/--no-autofsck//g' \
+		-e 's/--repair//g')"
+sed -e '/ExecStart=/d' -e '/BindPaths=/d' -e '/ExecCondition=/d' -i $new_svcfile
+cat >> "$new_svcfile" << ENDL
+
+[Service]
+ExecCondition=$XFS_HEALER_PROG --supported %f
+ExecStart=$XFS_HEALER_PROG $execargs
+ENDL
+_systemd_reload
+
+# Emit the results of our editing to the full log.
+systemctl cat "$new_healer_svc" >> $seqres.full
+
+# Remount, with service activation
+_scratch_mount
+
+old_healer_svc="$(_xfs_healer_svcname "$SCRATCH_MNT")"
+_systemd_unit_stop "$old_healer_svc" &>> $seqres.full
+_systemd_unit_start "$new_healer_svc" &>> $seqres.full
+
+_systemd_unit_status "$new_healer_svc" 2>&1 | grep -q 'Active: active' || \
+	echo "systemd service \"$new_healer_svc\" not running??"
+
+# Access the broken directory to trigger a repair, then poll the directory
+# for 5 seconds to see if it gets fixed without us needing to intervene.
+ls $SCRATCH_MNT/some/victimdir > /dev/null 2> $tmp.err
+_filter_scratch < $tmp.err
+try=0
+while [ $try -lt 50 ] && grep -q 'Structure needs cleaning' $tmp.err; do
+	echo "try $try saw corruption" >> $seqres.full
+	sleep 0.1
+	ls $SCRATCH_MNT/some/victimdir > /dev/null 2> $tmp.err
+	try=$((try + 1))
+done
+echo "try $try no longer saw corruption or gave up" >> $seqres.full
+_filter_scratch < $tmp.err
+
+# List the dirents of /victimdir to see if it stops reporting corruption
+ls $SCRATCH_MNT/some/victimdir > /dev/null 2> $tmp.err
+try=0
+while [ $try -lt 50 ] && grep -q 'Structure needs cleaning' $tmp.err; do
+	echo "retry $try still saw corruption" >> $seqres.full
+	sleep 0.1
+	ls $SCRATCH_MNT/some/victimdir > /dev/null 2> $tmp.err
+	try=$((try + 1))
+done
+echo "retry $try no longer saw corruption or gave up" >> $seqres.full
+
+# Unmount to kill the healer
+_scratch_kill_xfs_healer
+journalctl -u "$new_healer_svc" >> $seqres.full
+
+status=0
+exit
diff --git a/tests/xfs/1902.out b/tests/xfs/1902.out
new file mode 100755
index 00000000000000..84f9b9e50e1e02
--- /dev/null
+++ b/tests/xfs/1902.out
@@ -0,0 +1,2 @@
+QA output created by 1902
+ls: reading directory 'SCRATCH_MNT/some/victimdir': Structure needs cleaning


^ permalink raw reply related	[flat|nested] 112+ messages in thread

* [PATCH 14/13] xfs: test xfs_healer startup service
  2026-03-03  0:33 ` [PATCHSET v8 2/2] fstests: autonomous self healing of filesystems Darrick J. Wong
                     ` (12 preceding siblings ...)
  2026-03-03  0:44   ` [PATCH 13/13] xfs: test xfs_healer background service Darrick J. Wong
@ 2026-03-03  0:47   ` Darrick J. Wong
  13 siblings, 0 replies; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03  0:47 UTC (permalink / raw)
  To: zlang; +Cc: hch, fstests, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Make sure that xfs_healer_start can actually start up xfs_healer service
instances when a filesystem is mounted.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 tests/xfs/1903     |  124 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/1903.out |    6 +++
 2 files changed, 130 insertions(+)
 create mode 100755 tests/xfs/1903
 create mode 100644 tests/xfs/1903.out

diff --git a/tests/xfs/1903 b/tests/xfs/1903
new file mode 100755
index 00000000000000..d71d75a6af3f9d
--- /dev/null
+++ b/tests/xfs/1903
@@ -0,0 +1,124 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2026 Oracle.  All Rights Reserved.
+#
+# FS QA Test No. 1903
+#
+# Check that the xfs_healer startup service starts the per-mount xfs_healer
+# service for the scratch filesystem.  IOWs, this is basic testing for the
+# xfs_healer systemd background services.
+#
+
+# unreliable_in_parallel: this appears to try to run healer services on all
+# mounted filesystems - that's a problem when there are a hundred other test
+# filesystems mounted running other tests...
+
+. ./common/preamble
+_begin_fstest auto selfhealing unreliable_in_parallel
+
+_cleanup()
+{
+	cd /
+	test -n "$new_healerstart_svc" &&
+		_systemd_unit_stop "$new_healerstart_svc"
+	test -n "$was_masked" && \
+		_systemd_unit_mask "$healer_svc" &>> $seqres.full
+	if [ -n "$new_svcfile" ]; then
+		rm -f "$new_svcfile"
+		systemctl daemon-reload
+	fi
+	rm -r -f $tmp.*
+}
+
+. ./common/filter
+. ./common/populate
+. ./common/fuzzy
+. ./common/systemd
+
+_require_systemd_is_running
+_require_systemd_unit_defined xfs_healer@.service
+_require_systemd_unit_defined xfs_healer_start.service
+_require_scratch
+_require_scrub
+_require_xfs_io_command "scrub"
+_require_xfs_spaceman_command "health"
+_require_populate_commands
+_require_command "$XFS_HEALER_PROG" "xfs_healer"
+_require_command $ATTR_PROG "attr"
+
+_xfs_skip_online_rebuild
+_xfs_skip_offline_rebuild
+
+orig_svcfile="$(_systemd_unit_path "xfs_healer_start.service")"
+test -f "$orig_svcfile" || \
+	_notrun "cannot find xfs_healer_start service file"
+
+new_svcdir="$(_systemd_runtime_dir)"
+test -d "$new_svcdir" || \
+	_notrun "cannot find runtime systemd service dir"
+
+# We need to make some local mods to the xfs_healer_start service definition
+# so we fork it and create a new service just for this test.
+new_healerstart_svc="xfs_healer_start_fstest.service"
+_systemd_unit_status "$new_healerstart_svc" 2>&1 | \
+	grep -E -q '(could not be found|Loaded: not-found)' || \
+	_notrun "systemd service \"$new_healerstart_svc\" found, will not mess with this"
+
+find_healer_trace() {
+	local path="$1"
+
+	sleep 2		# wait for delays in startup
+	$XFS_HEALER_PROG --supported "$path" 2>&1 | grep -q 'already running' || \
+		echo "cannot find evidence that xfs_healer is running for $path"
+}
+
+echo "Format and populate"
+_scratch_mkfs >> $seqres.full
+_scratch_mount
+_require_xfs_healer $SCRATCH_MNT
+
+# Configure the filesystem for background checks of the filesystem.
+$ATTR_PROG -R -s xfs:autofsck -V check $SCRATCH_MNT >> $seqres.full
+
+was_masked=
+healer_svc="$(_xfs_healer_svcname "$SCRATCH_MNT")"
+
+# Preserve the xfs_healer@ mask state -- we don't want this permanently
+# changing global state.
+if _systemd_unit_masked "$healer_svc"; then
+	_systemd_unit_unmask "$healer_svc" &>> $seqres.full
+	was_masked=1
+fi
+
+echo "Start healer on scratch FS"
+_systemd_unit_start "$healer_svc"
+find_healer_trace "$SCRATCH_MNT"
+_systemd_unit_stop "$healer_svc"
+
+new_svcfile="$new_svcdir/$new_healerstart_svc"
+cp "$orig_svcfile" "$new_svcfile"
+
+sed -e '/ExecStart=/d' -e '/BindPaths=/d' -e '/ExecCondition=/d' -i $new_svcfile
+cat >> "$new_svcfile" << ENDL
+[Service]
+ExecCondition=$XFS_HEALER_START_PROG --supported
+ExecStart=$XFS_HEALER_START_PROG
+ENDL
+_systemd_reload
+
+# Emit the results of our editing to the full log.
+systemctl cat "$new_healerstart_svc" >> $seqres.full
+
+echo "Start healer for everything"
+_systemd_unit_start "$new_healerstart_svc"
+find_healer_trace "$SCRATCH_MNT"
+
+echo "Restart healer for scratch FS"
+_scratch_cycle_mount
+find_healer_trace "$SCRATCH_MNT"
+
+echo "Healer testing done" | tee -a $seqres.full
+
+# success, all done
+status=0
+exit
diff --git a/tests/xfs/1903.out b/tests/xfs/1903.out
new file mode 100644
index 00000000000000..07810f60ca10c6
--- /dev/null
+++ b/tests/xfs/1903.out
@@ -0,0 +1,6 @@
+QA output created by 1903
+Format and populate
+Start healer on scratch FS
+Start healer for everything
+Restart healer for scratch FS
+Healer testing done

^ permalink raw reply related	[flat|nested] 112+ messages in thread

* Re: [PATCH 1/1] generic: test fsnotify filesystem error reporting
  2026-03-03  0:40   ` [PATCH 1/1] generic: test fsnotify filesystem " Darrick J. Wong
@ 2026-03-03  9:21     ` Amir Goldstein
  2026-03-03 14:51       ` Christoph Hellwig
  2026-03-03 14:54     ` Christoph Hellwig
  1 sibling, 1 reply; 112+ messages in thread
From: Amir Goldstein @ 2026-03-03  9:21 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: zlang, linux-fsdevel, hch, gabriel, jack, fstests, linux-xfs

On Tue, Mar 3, 2026 at 1:40 AM Darrick J. Wong <djwong@kernel.org> wrote:
>
> From: Darrick J. Wong <djwong@kernel.org>
>
> Test the fsnotify filesystem error reporting.

For the record, I feel that I need to say to all the people whom we pushed back
on fanotify tests in fstests until there was a good enough reason to do so,
that this seems like a good reason to do so ;)

But also for future test writers, please note that FAN_FS_ERROR is an
exception to the rule and please keep writing new fanotify/inotify tests in LTP
(until there is a good enough reason...)

>
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> ---
>  src/Makefile           |    2
>  src/fs-monitor.c       |  155 +++++++++++++++++++++++++++++++++
>  tests/generic/1838     |  228 ++++++++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/1838.out |   20 ++++
>  4 files changed, 404 insertions(+), 1 deletion(-)
>  create mode 100644 src/fs-monitor.c
>  create mode 100755 tests/generic/1838
>  create mode 100644 tests/generic/1838.out
>
>
...

> diff --git a/tests/generic/1838 b/tests/generic/1838
> new file mode 100755
> index 00000000000000..087851ddcbdb44
> --- /dev/null
> +++ b/tests/generic/1838
> @@ -0,0 +1,228 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0-or-later
> +# Copyright (c) 2024-2026 Oracle.  All Rights Reserved.
> +#
> +# FS QA Test No. 1838
> +#
> +# Check that fsnotify can report file IO errors.
> +
> +. ./common/preamble
> +_begin_fstest auto quick eio selfhealing
> +
> +# Override the default cleanup function.
> +_cleanup()
> +{
> +       cd /
> +       test -n "$fsmonitor_pid" && kill -TERM $fsmonitor_pid
> +       rm -f $tmp.*
> +       _dmerror_cleanup
> +}
> +
> +# Import common functions.
> +. ./common/fuzzy
> +. ./common/filter
> +. ./common/dmerror
> +. ./common/systemd
> +
> +case "$FSTYP" in
> +xfs)
> +       # added as a part of xfs health monitoring
> +       _require_xfs_io_command healthmon
> +       # no out of place writes
> +       _require_no_xfs_always_cow
> +       ;;
> +ext4)
> +       # added at the same time as uevents
> +       modprobe fs-$FSTYP
> +       test -e /sys/fs/ext4/features/uevents || \
> +               _notrun "$FSTYP does not support fsnotify ioerrors"
> +       ;;
> +*)
> +       _notrun "$FSTYP does not support fsnotify ioerrors"
> +       ;;
> +esac
> +

_require_fsnotify_errors ?

> +_require_scratch
> +_require_dm_target error
> +_require_test_program fs-monitor
> +_require_xfs_io_command "fiemap"
> +_require_odirect
> +
> +# fsnotify only gives us a file handle, the error number, and the number of
> +# times it was seen in between event deliveries.   The handle is mostly useless
> +# since we have no generic way to map that to a file path.  Therefore we can
> +# only coalesce all the I/O errors into one report.
> +filter_fsnotify_errors() {
> +       _filter_scratch | \
> +               grep -E '(FAN_FS_ERROR|Generic Error Record|error: 5)' | \
> +               sed -e "s/len=[0-9]*/len=XXX/g" | \
> +               sort | \
> +               uniq
> +}

move to common/filter?

Apart from those nits, no further comments.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 1/1] generic: test fsnotify filesystem error reporting
  2026-03-03  9:21     ` Amir Goldstein
@ 2026-03-03 14:51       ` Christoph Hellwig
  2026-03-03 14:56         ` Amir Goldstein
  2026-03-04 10:10         ` Jan Kara
  0 siblings, 2 replies; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 14:51 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Darrick J. Wong, zlang, linux-fsdevel, hch, gabriel, jack,
	fstests, linux-xfs

On Tue, Mar 03, 2026 at 10:21:04AM +0100, Amir Goldstein wrote:
> On Tue, Mar 3, 2026 at 1:40 AM Darrick J. Wong <djwong@kernel.org> wrote:
> >
> > From: Darrick J. Wong <djwong@kernel.org>
> >
> > Test the fsnotify filesystem error reporting.
> 
> For the record, I feel that I need to say to all the people whom we pushed back
> on fanotify tests in fstests until there was a good enough reason to do so,
> that this seems like a good reason to do so ;)

Who pushed backed on that?  Because IMHO hiding stuff in ltp is a sure
way it doesn't get exercisesd regularly?


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 1/1] generic: test fsnotify filesystem error reporting
  2026-03-03  0:40   ` [PATCH 1/1] generic: test fsnotify filesystem " Darrick J. Wong
  2026-03-03  9:21     ` Amir Goldstein
@ 2026-03-03 14:54     ` Christoph Hellwig
  2026-03-03 16:06       ` Gabriel Krisman Bertazi
  2026-03-03 16:49       ` Darrick J. Wong
  1 sibling, 2 replies; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 14:54 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: zlang, linux-fsdevel, hch, gabriel, amir73il, jack, fstests,
	linux-xfs

> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright 2021, Collabora Ltd.
> + */

Where is this coming from?

> +#ifndef __GLIBC__
> +#include <asm-generic/int-ll64.h>
> +#endif

And what is this for?  Looks pretty whacky.

> +case "$FSTYP" in
> +xfs)
> +	# added as a part of xfs health monitoring
> +	_require_xfs_io_command healthmon
> +	# no out of place writes
> +	_require_no_xfs_always_cow
> +	;;
> +ext4)
> +	# added at the same time as uevents
> +	modprobe fs-$FSTYP
> +	test -e /sys/fs/ext4/features/uevents || \
> +		_notrun "$FSTYP does not support fsnotify ioerrors"
> +	;;
> +*)
> +	_notrun "$FSTYP does not support fsnotify ioerrors"
> +	;;
> +esac

Please abstract this out into a documented helper in common/

> +#
> +# The dm-error map added by this test doesn't work on zoned devices because
> +# table sizes need to be aligned to the zone size, and even for zoned on
> +# conventional this test will get confused because of the internal RT device.
> +#
> +# That check requires a mounted file system, so do a dummy mount before setting
> +# up DM.
> +#
> +_scratch_mount
> +test $FSTYP = xfs && _require_xfs_scratch_non_zoned
> +_scratch_unmount

Hmm, this is a bit sad.  Can we align the map?  Or should we carve in
and add proper error injection to the block code, which has been
somewhere on my todo list forever because dm-error and friends are
so painful to setup.  Maybe I need to expedite that.


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 1/1] generic: test fsnotify filesystem error reporting
  2026-03-03 14:51       ` Christoph Hellwig
@ 2026-03-03 14:56         ` Amir Goldstein
  2026-03-04 10:10         ` Jan Kara
  1 sibling, 0 replies; 112+ messages in thread
From: Amir Goldstein @ 2026-03-03 14:56 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Darrick J. Wong, zlang, linux-fsdevel, hch, gabriel, jack,
	fstests, linux-xfs

On Tue, Mar 3, 2026 at 3:51 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Tue, Mar 03, 2026 at 10:21:04AM +0100, Amir Goldstein wrote:
> > On Tue, Mar 3, 2026 at 1:40 AM Darrick J. Wong <djwong@kernel.org> wrote:
> > >
> > > From: Darrick J. Wong <djwong@kernel.org>
> > >
> > > Test the fsnotify filesystem error reporting.
> >
> > For the record, I feel that I need to say to all the people whom we pushed back
> > on fanotify tests in fstests until there was a good enough reason to do so,
> > that this seems like a good reason to do so ;)
>
> Who pushed backed on that?  Because IMHO hiding stuff in ltp is a sure
> way it doesn't get exercisesd regularly?
>

Jan and myself pushed back on adding generic fanotify tests to fstest
because we already have most fanotify tests in LTP.

LTP is run by many testers on many boxes and many release
kernels and we are happy with this project to host tests for the
subsystem that we maintain.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 01/26] libfrog: add a function to grab the path from an open fd and a file handle
  2026-03-03  0:34   ` [PATCH 01/26] libfrog: add a function to grab the path from an open fd and a file handle Darrick J. Wong
@ 2026-03-03 15:44     ` Christoph Hellwig
  0 siblings, 0 replies; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 15:44 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: aalbersh, hch, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 02/26] libfrog: create healthmon event log library functions
  2026-03-03  0:34   ` [PATCH 02/26] libfrog: create healthmon event log library functions Darrick J. Wong
@ 2026-03-03 15:44     ` Christoph Hellwig
  0 siblings, 0 replies; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 15:44 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: aalbersh, hch, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 03/26] libfrog: add support code for starting systemd services programmatically
  2026-03-03  0:34   ` [PATCH 03/26] libfrog: add support code for starting systemd services programmatically Darrick J. Wong
@ 2026-03-03 15:45     ` Christoph Hellwig
  2026-03-03 15:59       ` Darrick J. Wong
  0 siblings, 1 reply; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 15:45 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: aalbersh, hch, linux-xfs

On Mon, Mar 02, 2026 at 04:34:36PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Add some simple routines for computing the name of systemd service
> instances and starting systemd services.  These will be used by the
> xfs_healer_start service to start per-filesystem xfs_healer service
> instances.
> 
> Note that we run systemd helper programs as subprocesses for a couple of
> reasons.  First, the path-escaping functionality is not a part of any
> library-accessible API, which means it can only be accessed via
> systemd-escape(1).  Second, although the service startup functionality
> can be reached via dbus, doing so would introduce a new library
> dependency.  Systemd is also undergoing a dbus -> varlink RPC transition
> so we avoid that mess by calling the cli systemctl(1) program.

Just curious: did you run this past the systemd folks?  Shelling out
always feel a bit iffy, and they're usually happy to help on how to
integrate with their services, so just asking might result in a better
way.


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 04/26] libfrog: hoist a couple of service helper functions
  2026-03-03  0:34   ` [PATCH 04/26] libfrog: hoist a couple of service helper functions Darrick J. Wong
@ 2026-03-03 15:45     ` Christoph Hellwig
  0 siblings, 0 replies; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 15:45 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: aalbersh, hch, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 05/26] man2: document the healthmon ioctl
  2026-03-03  0:35   ` [PATCH 05/26] man2: document the healthmon ioctl Darrick J. Wong
@ 2026-03-03 15:46     ` Christoph Hellwig
  0 siblings, 0 replies; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 15:46 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: aalbersh, hch, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 06/26] man2: document the media verification ioctl
  2026-03-03  0:35   ` [PATCH 06/26] man2: document the media verification ioctl Darrick J. Wong
@ 2026-03-03 15:46     ` Christoph Hellwig
  0 siblings, 0 replies; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 15:46 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: aalbersh, hch, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 07/26] xfs_io: monitor filesystem health events
  2026-03-03  0:35   ` [PATCH 07/26] xfs_io: monitor filesystem health events Darrick J. Wong
@ 2026-03-03 15:46     ` Christoph Hellwig
  0 siblings, 0 replies; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 15:46 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: aalbersh, hch, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 08/26] xfs_io: add a media verify command
  2026-03-03  0:35   ` [PATCH 08/26] xfs_io: add a media verify command Darrick J. Wong
@ 2026-03-03 15:46     ` Christoph Hellwig
  0 siblings, 0 replies; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 15:46 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: aalbersh, hch, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 09/26] xfs_healer: create daemon to listen for health events
  2026-03-03  0:36   ` [PATCH 09/26] xfs_healer: create daemon to listen for health events Darrick J. Wong
@ 2026-03-03 15:47     ` Christoph Hellwig
  0 siblings, 0 replies; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 15:47 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: aalbersh, hch, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 10/26] xfs_healer: enable repairing filesystems
  2026-03-03  0:36   ` [PATCH 10/26] xfs_healer: enable repairing filesystems Darrick J. Wong
@ 2026-03-03 15:47     ` Christoph Hellwig
  0 siblings, 0 replies; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 15:47 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: aalbersh, hch, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 11/26] xfs_healer: use getparents to look up file names
  2026-03-03  0:36   ` [PATCH 11/26] xfs_healer: use getparents to look up file names Darrick J. Wong
@ 2026-03-03 15:48     ` Christoph Hellwig
  0 siblings, 0 replies; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 15:48 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: aalbersh, hch, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 12/26] xfs_healer: create a per-mount background monitoring service
  2026-03-03  0:36   ` [PATCH 12/26] xfs_healer: create a per-mount background monitoring service Darrick J. Wong
@ 2026-03-03 15:48     ` Christoph Hellwig
  0 siblings, 0 replies; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 15:48 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: aalbersh, hch, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 13/26] xfs_healer: create a service to start the per-mount healer service
  2026-03-03  0:37   ` [PATCH 13/26] xfs_healer: create a service to start the per-mount healer service Darrick J. Wong
@ 2026-03-03 15:49     ` Christoph Hellwig
  2026-03-03 16:52       ` Darrick J. Wong
  0 siblings, 1 reply; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 15:49 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: aalbersh, hch, linux-xfs

> +/* Start healer services for existing XFS mounts. */
> +static int
> +start_existing_mounts(
> +	int			mnt_ns_fd)
> +{
> +	struct mnt_id_req	req = {
> +		.size		= sizeof(struct mnt_id_req),
> +#ifdef HAVE_LISTMOUNT_NS_FD
> +		.mnt_ns_fd	= mnt_ns_fd,
> +#else
> +		.spare		= mnt_ns_fd,
> +#endif
> +		.mnt_id		= LSMT_ROOT,
> +	};
> +	uint64_t		mnt_ids[32];
> +	int			i;


> +	while ((ret = syscall(SYS_listmount, &req, &mnt_ids, 32, 0)) > 0) {

Should this use a wrapper so we can switch to the type safe libc
version once it becomes available?


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 14/26] xfs_healer: don't start service if kernel support unavailable
  2026-03-03  0:37   ` [PATCH 14/26] xfs_healer: don't start service if kernel support unavailable Darrick J. Wong
@ 2026-03-03 15:49     ` Christoph Hellwig
  0 siblings, 0 replies; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 15:49 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: aalbersh, hch, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 15/26] xfs_healer: use the autofsck fsproperty to select mode
  2026-03-03  0:37   ` [PATCH 15/26] xfs_healer: use the autofsck fsproperty to select mode Darrick J. Wong
@ 2026-03-03 15:50     ` Christoph Hellwig
  0 siblings, 0 replies; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 15:50 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: aalbersh, hch, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 16/26] xfs_healer: run full scrub after lost corruption events or targeted repair failure
  2026-03-03  0:38   ` [PATCH 16/26] xfs_healer: run full scrub after lost corruption events or targeted repair failure Darrick J. Wong
@ 2026-03-03 15:50     ` Christoph Hellwig
  0 siblings, 0 replies; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 15:50 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: aalbersh, hch, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 17/26] xfs_healer: use getmntent to find moved filesystems
  2026-03-03  0:38   ` [PATCH 17/26] xfs_healer: use getmntent to find moved filesystems Darrick J. Wong
@ 2026-03-03 15:51     ` Christoph Hellwig
  2026-03-03 17:26       ` Darrick J. Wong
  0 siblings, 1 reply; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 15:51 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: aalbersh, hch, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

but in a way just grabing a weak handle at mount time and never
dropping it would seem more useful?

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 18/26] xfs_healer: validate that repair fds point to the monitored fs
  2026-03-03  0:38   ` [PATCH 18/26] xfs_healer: validate that repair fds point to the monitored fs Darrick J. Wong
@ 2026-03-03 15:52     ` Christoph Hellwig
  0 siblings, 0 replies; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 15:52 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: aalbersh, hch, linux-xfs

On Mon, Mar 02, 2026 at 04:38:31PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> When xfs_healer reopens a mountpoint to perform a repair, it should
> validate that the opened fd points to a file on the same filesystem as
> the one being monitored.

.. and if we'd always keep the week handle around we would not need
this?


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 19/26] xfs_healer: add a manual page
  2026-03-03  0:38   ` [PATCH 19/26] xfs_healer: add a manual page Darrick J. Wong
@ 2026-03-03 15:52     ` Christoph Hellwig
  0 siblings, 0 replies; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 15:52 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: aalbersh, hch, linux-xfs

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 20/26] xfs_scrub: use the verify media ioctl during phase 6 if possible
  2026-03-03  0:39   ` [PATCH 20/26] xfs_scrub: use the verify media ioctl during phase 6 if possible Darrick J. Wong
@ 2026-03-03 15:53     ` Christoph Hellwig
  2026-03-03 16:59       ` Darrick J. Wong
  0 siblings, 1 reply; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 15:53 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: aalbersh, hch, linux-xfs

>  	if (disk->d_fd < 0)
> @@ -266,6 +267,18 @@ disk_close(
>  #define LBASIZE(d)		(1ULL << (d)->d_lbalog)
>  #define BTOLBA(d, bytes)	(((uint64_t)(bytes) + LBASIZE(d) - 1) >> (d)->d_lbalog)
>  
> +#ifndef BTOBB
> +# define BTOBB(bytes)		((uint64_t)((bytes) + 511) >> 9)
> +#endif
> +
> +#ifndef BTOBBT
> +# define BTOBBT(bytes)		((uint64_t)(bytes) >> 9)
> +#endif
> +
> +#ifndef BBTOB
> +# define BBTOB(bytes)		((uint64_t)(bytes) << 9)
> +#endif

Is this really something that should be in scrub and not in
common code?  And why the ifndef?  the 9 and the derived from that
511 would also really benefit from symbolic names.

> +	if (disk->d_verify_fd >= 0) {
> +		const uint64_t	orig_start_daddr = BTOBBT(start);
> +		struct xfs_verify_media me = {
> +			.me_start_daddr	= orig_start_daddr,
> +			.me_end_daddr	= BTOBB(start + length),
> +			.me_dev		= disk->d_verify_disk,
> +			.me_rest_us	= bg_mode > 2 ? bg_mode - 1 : 0,
> +		};
> +		int		ret;
> +
> +		if (single_step)
> +			me.me_flags |= XFS_VERIFY_MEDIA_REPORT;
> +
> +		ret = ioctl(disk->d_verify_fd, XFS_IOC_VERIFY_MEDIA, &me);
> +		if (ret < 0)
> +			return ret;
> +		if (me.me_ioerror) {
> +			errno = me.me_ioerror;
> +			return -1;
> +		}
> +
> +		return BBTOB(me.me_start_daddr - orig_start_daddr);
> +	}

split this whole block into a helper for readabiltity?


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 21/26] xfs_scrub: perform media scanning of the log region
  2026-03-03  0:39   ` [PATCH 21/26] xfs_scrub: perform media scanning of the log region Darrick J. Wong
@ 2026-03-03 15:54     ` Christoph Hellwig
  0 siblings, 0 replies; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 15:54 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: aalbersh, hch, linux-xfs

On Mon, Mar 02, 2026 at 04:39:18PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Scan the log area for media errors because a defect in a region could
> prevent the user from being able to perform log recovery.

Looks good.  Not that we really could do anything here..

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 22/26] xfs_io: add listmount command
  2026-03-03  0:39   ` [PATCH 22/26] xfs_io: add listmount command Darrick J. Wong
@ 2026-03-03 15:56     ` Christoph Hellwig
  2026-03-03 17:08       ` Darrick J. Wong
  0 siblings, 1 reply; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 15:56 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: aalbersh, hch, linux-xfs

On Mon, Mar 02, 2026 at 04:39:33PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Add a command to list all mounts, now that we use this in
> xfs_healer_start.

> +/* copied from linux/mount.h in linux 6.18 */
> +struct statmount_fixed {

Shouldn't this use the kernel uapi header and/or a copy of it?

And be split out of the .c file into a separate header for maintainance
and eventually nuking once the kernel requirement becomes new enough?

> +static int
> +listmount(
> +	const struct mnt_id_req	*req,
> +	uint64_t		*mnt_ids,
> +	size_t			nr_mnt_ids)
> +{
> +	return syscall(SYS_listmount, req, mnt_ids, nr_mnt_ids, 0);
> +}

Same comment as for the other listmount instance here.

> +
> +static int
> +statmount(
> +	const struct mnt_id_req	*req,
> +	struct statmount_fixed	*smbuf,
> +	size_t			smbuf_size)
> +{
> +	return syscall(SYS_statmount, req, smbuf, smbuf_size, 0);
> +}

and similar here.


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 23/26] xfs_io: print systemd service names
  2026-03-03  0:39   ` [PATCH 23/26] xfs_io: print systemd service names Darrick J. Wong
@ 2026-03-03 15:57     ` Christoph Hellwig
  2026-03-03 17:29       ` Darrick J. Wong
  0 siblings, 1 reply; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 15:57 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: aalbersh, hch, linux-xfs


The idea of this is good, but is xfs_io really the right place for
it?  I'd expect scrub or healer to just output this somewhow.


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 24/26] mkfs: enable online repair if all backrefs are enabled
  2026-03-03  0:40   ` [PATCH 24/26] mkfs: enable online repair if all backrefs are enabled Darrick J. Wong
@ 2026-03-03 15:58     ` Christoph Hellwig
  2026-03-03 17:32       ` Darrick J. Wong
  0 siblings, 1 reply; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 15:58 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: aalbersh, hch, linux-xfs

On Mon, Mar 02, 2026 at 04:40:05PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> If all backreferences are enabled in the filesystem, then enable online
> repair by default if the user didn't supply any other autofsck setting.
> Users might as well get full self-repair capability if they're paying
> for the extra metadata.

Does this cause scrub to run by default or just healer on demand?
People might not be happy about the former.


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 25/26] debian: enable xfs_healer on the root filesystem by default
  2026-03-03  0:40   ` [PATCH 25/26] debian: enable xfs_healer on the root filesystem by default Darrick J. Wong
@ 2026-03-03 15:58     ` Christoph Hellwig
  2026-03-03 17:14       ` Darrick J. Wong
  0 siblings, 1 reply; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 15:58 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: aalbersh, hch, linux-xfs

On Mon, Mar 02, 2026 at 04:40:20PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Now that we're finished building autonomous repair, enable the service
> on the root filesystem by default.  The root filesystem is mounted by
> the initrd prior to starting systemd, which is why the udev rule cannot
> autostart the service for the root filesystem.
> 
> dh_installsystemd won't activate a template service (aka one with an
> at-sign in the name) even if it provides a DefaultInstance directive to
> make that possible.  Use a fugly shim for this.

Given that this is brand new code it feels a bit too early.  But maybe
that's just me.


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 26/26] debian/control: listify the build dependencies
  2026-03-03  0:40   ` [PATCH 26/26] debian/control: listify the build dependencies Darrick J. Wong
@ 2026-03-03 15:58     ` Christoph Hellwig
  2026-03-03 17:09       ` Darrick J. Wong
  0 siblings, 1 reply; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 15:58 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: aalbersh, hch, linux-xfs

On Mon, Mar 02, 2026 at 04:40:36PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> This will make it less gross to add more build deps later.

Looks good, but should this go to the beginning of the series?


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 03/26] libfrog: add support code for starting systemd services programmatically
  2026-03-03 15:45     ` Christoph Hellwig
@ 2026-03-03 15:59       ` Darrick J. Wong
  2026-03-05  2:39         ` Darrick J. Wong
  0 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03 15:59 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: aalbersh, hch, linux-xfs

On Tue, Mar 03, 2026 at 07:45:37AM -0800, Christoph Hellwig wrote:
> On Mon, Mar 02, 2026 at 04:34:36PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > Add some simple routines for computing the name of systemd service
> > instances and starting systemd services.  These will be used by the
> > xfs_healer_start service to start per-filesystem xfs_healer service
> > instances.
> > 
> > Note that we run systemd helper programs as subprocesses for a couple of
> > reasons.  First, the path-escaping functionality is not a part of any
> > library-accessible API, which means it can only be accessed via
> > systemd-escape(1).  Second, although the service startup functionality
> > can be reached via dbus, doing so would introduce a new library
> > dependency.  Systemd is also undergoing a dbus -> varlink RPC transition
> > so we avoid that mess by calling the cli systemctl(1) program.
> 
> Just curious: did you run this past the systemd folks?  Shelling out
> always feel a bit iffy, and they're usually happy to help on how to
> integrate with their services, so just asking might result in a better
> way.

I'll do that, though even if they add a dbus/varlink endpoint for path
escaping, it'll be a few releases more until we can depend on it
existing in the distros. :(

Service startup can be done fairly easily with something like this,
though obviously you'd want to do real error checking here:

static DBusConnection*
connect_to_system_bus(void)
{
	// Connect to the system bus (requires root or polkit permissions)
	DBusConnection *conn = dbus_bus_get(DBUS_BUS_SYSTEM, error);

	return conn;
}

int
systemd_stop_service(
	const char *service_name)
{
	DBusError error;
	DBusConnection *conn = connect_to_system_bus();

	const char *manager_path = "/org/freedesktop/systemd1";
	const char *manager_interface = "org.freedesktop.systemd1.Manager";
	const char *method = "StopUnit";

	DBusMessage *msg = dbus_message_new_method_call(
		"org.freedesktop.systemd1",
		manager_path,
		manager_interface,
		method
	);

	const char *mode = "replace"; // Stop and replace existing job
	dbus_message_append_args(msg, DBUS_TYPE_STRING, &service_name,
			DBUS_TYPE_STRING, &mode, DBUS_TYPE_INVALID);

	DBusMessage *reply =
			dbus_connection_send_with_reply_and_block(conn,
					msg, 5000, &error);

	dbus_message_unref(reply);
	dbus_message_unref(msg);
	dbus_connection_unref(conn);

	return 0;
}

with the previously mentioned problem that now xfsprogs grows build and
packaging dependencies on libdbus.  My guess is that it'll be a long
time till they deprecate starting services over dbus.  AFAICT systemd in
Trixie doesn't even expose varlink endpoints yet.

(Note that xfs_scrub_all already has a runtime dependency on
python3-dbus)

--D

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 1/1] generic: test fsnotify filesystem error reporting
  2026-03-03 14:54     ` Christoph Hellwig
@ 2026-03-03 16:06       ` Gabriel Krisman Bertazi
  2026-03-03 16:12         ` Christoph Hellwig
  2026-03-03 16:49       ` Darrick J. Wong
  1 sibling, 1 reply; 112+ messages in thread
From: Gabriel Krisman Bertazi @ 2026-03-03 16:06 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Darrick J. Wong, zlang, linux-fsdevel, hch, amir73il, jack,
	fstests, linux-xfs

Christoph Hellwig <hch@infradead.org> writes:

>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Copyright 2021, Collabora Ltd.
>> + */
>
> Where is this coming from?

This code is heavily based, if not the same, to what I originally wrote
as a kernel tree "samples/fs-monitor.c" when I was employed by
Collabora.  I appreciate Darrick keeping the note actually.

>
>> +#ifndef __GLIBC__
>> +#include <asm-generic/int-ll64.h>
>> +#endif
>
> And what is this for?  Looks pretty whacky.

Comes from kernel commit 3193e8942fc7 ("samples: fix building fs-monitor
on musl systems") to fix building with musl.  We don't need it here.

-- 
Gabriel Krisman Bertazi

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 1/1] generic: test fsnotify filesystem error reporting
  2026-03-03 16:06       ` Gabriel Krisman Bertazi
@ 2026-03-03 16:12         ` Christoph Hellwig
  2026-03-03 16:38           ` Darrick J. Wong
  0 siblings, 1 reply; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 16:12 UTC (permalink / raw)
  To: Gabriel Krisman Bertazi
  Cc: Christoph Hellwig, Darrick J. Wong, zlang, linux-fsdevel, hch,
	amir73il, jack, fstests, linux-xfs

On Tue, Mar 03, 2026 at 11:06:52AM -0500, Gabriel Krisman Bertazi wrote:
> Christoph Hellwig <hch@infradead.org> writes:
> 
> >> +// SPDX-License-Identifier: GPL-2.0
> >> +/*
> >> + * Copyright 2021, Collabora Ltd.
> >> + */
> >
> > Where is this coming from?
> 
> This code is heavily based, if not the same, to what I originally wrote
> as a kernel tree "samples/fs-monitor.c" when I was employed by
> Collabora.  I appreciate Darrick keeping the note actually.

The note is good.  But if we import code from somewhere, we should
document where it is coming from, both for attribution and to ease
any future resyncs if needed.

> >> +#ifndef __GLIBC__
> >> +#include <asm-generic/int-ll64.h>
> >> +#endif
> >
> > And what is this for?  Looks pretty whacky.
> 
> Comes from kernel commit 3193e8942fc7 ("samples: fix building fs-monitor
> on musl systems") to fix building with musl.  We don't need it here.

In the place that needs it it really should have a comment explainig
the logic behind it.


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 1/1] generic: test fsnotify filesystem error reporting
  2026-03-03 16:12         ` Christoph Hellwig
@ 2026-03-03 16:38           ` Darrick J. Wong
  0 siblings, 0 replies; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03 16:38 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Gabriel Krisman Bertazi, zlang, linux-fsdevel, hch, amir73il,
	jack, fstests, linux-xfs

On Tue, Mar 03, 2026 at 08:12:55AM -0800, Christoph Hellwig wrote:
> On Tue, Mar 03, 2026 at 11:06:52AM -0500, Gabriel Krisman Bertazi wrote:
> > Christoph Hellwig <hch@infradead.org> writes:
> > 
> > >> +// SPDX-License-Identifier: GPL-2.0
> > >> +/*
> > >> + * Copyright 2021, Collabora Ltd.
> > >> + */
> > >
> > > Where is this coming from?
> > 
> > This code is heavily based, if not the same, to what I originally wrote
> > as a kernel tree "samples/fs-monitor.c" when I was employed by
> > Collabora.  I appreciate Darrick keeping the note actually.
> 
> The note is good.  But if we import code from somewhere, we should
> document where it is coming from, both for attribution and to ease
> any future resyncs if needed.

Yeah, I copied this straight from the kernel tree, which is why it
contains this wart:

> > >> +#ifndef __GLIBC__
> > >> +#include <asm-generic/int-ll64.h>
> > >> +#endif
> > >
> > > And what is this for?  Looks pretty whacky.
> > 
> > Comes from kernel commit 3193e8942fc7 ("samples: fix building fs-monitor
> > on musl systems") to fix building with musl.  We don't need it here.
> 
> In the place that needs it it really should have a comment explainig
> the logic behind it.

I don't know that people *don't* try to run fstests with musl.  But as
they seem surprisingly patient with continuously fixing up xfsprogs,
perhaps it's ok to clean this up on the way into fstests.

I'll add more attribution for the c file pointing back to where it came
from.

--D

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 1/1] generic: test fsnotify filesystem error reporting
  2026-03-03 14:54     ` Christoph Hellwig
  2026-03-03 16:06       ` Gabriel Krisman Bertazi
@ 2026-03-03 16:49       ` Darrick J. Wong
  2026-03-03 16:53         ` Christoph Hellwig
  1 sibling, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03 16:49 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: zlang, linux-fsdevel, hch, gabriel, amir73il, jack, fstests,
	linux-xfs

On Tue, Mar 03, 2026 at 06:54:29AM -0800, Christoph Hellwig wrote:
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright 2021, Collabora Ltd.
> > + */
> 
> Where is this coming from?
> 
> > +#ifndef __GLIBC__
> > +#include <asm-generic/int-ll64.h>
> > +#endif
> 
> And what is this for?  Looks pretty whacky.
> 
> > +case "$FSTYP" in
> > +xfs)
> > +	# added as a part of xfs health monitoring
> > +	_require_xfs_io_command healthmon
> > +	# no out of place writes
> > +	_require_no_xfs_always_cow
> > +	;;
> > +ext4)
> > +	# added at the same time as uevents
> > +	modprobe fs-$FSTYP
> > +	test -e /sys/fs/ext4/features/uevents || \
> > +		_notrun "$FSTYP does not support fsnotify ioerrors"
> > +	;;
> > +*)
> > +	_notrun "$FSTYP does not support fsnotify ioerrors"
> > +	;;
> > +esac
> 
> Please abstract this out into a documented helper in common/

Ok.  I'm not sure how to check for feature support on ext4 anymore since
the uevents patch didn't get merged, and then I clearly forgot to rip
that out of this helper here.

> > +#
> > +# The dm-error map added by this test doesn't work on zoned devices because
> > +# table sizes need to be aligned to the zone size, and even for zoned on
> > +# conventional this test will get confused because of the internal RT device.
> > +#
> > +# That check requires a mounted file system, so do a dummy mount before setting
> > +# up DM.
> > +#
> > +_scratch_mount
> > +test $FSTYP = xfs && _require_xfs_scratch_non_zoned
> > +_scratch_unmount
> 
> Hmm, this is a bit sad.  Can we align the map?  Or should we carve in
> and add proper error injection to the block code, which has been
> somewhere on my todo list forever because dm-error and friends are
> so painful to setup.  Maybe I need to expedite that.

I think it's theoretically possible to figure out that there's a zone
size and then round outwards the error-target part of the dm table to
align with a zone.  I have a lot more doubts about whether or not doing
that in bash/awk is a good idea though.  It'd be a lot easier if either
the block layer did error injection or if someone just fixes those
limitations in dm itself.

--D

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 13/26] xfs_healer: create a service to start the per-mount healer service
  2026-03-03 15:49     ` Christoph Hellwig
@ 2026-03-03 16:52       ` Darrick J. Wong
  2026-03-03 16:54         ` Christoph Hellwig
  0 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03 16:52 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: aalbersh, hch, linux-xfs

On Tue, Mar 03, 2026 at 07:49:33AM -0800, Christoph Hellwig wrote:
> > +/* Start healer services for existing XFS mounts. */
> > +static int
> > +start_existing_mounts(
> > +	int			mnt_ns_fd)
> > +{
> > +	struct mnt_id_req	req = {
> > +		.size		= sizeof(struct mnt_id_req),
> > +#ifdef HAVE_LISTMOUNT_NS_FD
> > +		.mnt_ns_fd	= mnt_ns_fd,
> > +#else
> > +		.spare		= mnt_ns_fd,
> > +#endif
> > +		.mnt_id		= LSMT_ROOT,
> > +	};
> > +	uint64_t		mnt_ids[32];
> > +	int			i;
> 
> 
> > +	while ((ret = syscall(SYS_listmount, &req, &mnt_ids, 32, 0)) > 0) {
> 
> Should this use a wrapper so we can switch to the type safe libc
> version once it becomes available?

What kind of wrapper?

static inline void
set_mnt_id_req_ns_fd(struct mnt_id_req *r, int mnt_ns_fd)
{
#ifdef HAVE_LISTMOUNT_NS_FD
	r->mnt_ns_fd = mnt_ns_fd;
#else
	r->spare = mnt_ns_fd;
#endif
}

or did you have something else in mind?  The manual page for listmount
says that glibc provides no wrapper[1].

--D

[1] https://www.man7.org/linux//man-pages/man2/listmount.2.html

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 1/1] generic: test fsnotify filesystem error reporting
  2026-03-03 16:49       ` Darrick J. Wong
@ 2026-03-03 16:53         ` Christoph Hellwig
  2026-03-03 17:59           ` Darrick J. Wong
  0 siblings, 1 reply; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 16:53 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Christoph Hellwig, zlang, linux-fsdevel, hch, gabriel, amir73il,
	jack, fstests, linux-xfs

On Tue, Mar 03, 2026 at 08:49:01AM -0800, Darrick J. Wong wrote:
> > > +ext4)
> > > +	# added at the same time as uevents
> > > +	modprobe fs-$FSTYP
> > > +	test -e /sys/fs/ext4/features/uevents || \
> > > +		_notrun "$FSTYP does not support fsnotify ioerrors"
> > > +	;;
> > > +*)
> > > +	_notrun "$FSTYP does not support fsnotify ioerrors"
> > > +	;;
> > > +esac
> > 
> > Please abstract this out into a documented helper in common/
> 
> Ok.  I'm not sure how to check for feature support on ext4 anymore since
> the uevents patch didn't get merged, and then I clearly forgot to rip
> that out of this helper here.

Oh.  Well, drop that then and move the xfs side and the default n
into a common helper instead of hardcoding it in the test.

> > and add proper error injection to the block code, which has been
> > somewhere on my todo list forever because dm-error and friends are
> > so painful to setup.  Maybe I need to expedite that.
> 
> I think it's theoretically possible to figure out that there's a zone
> size and then round outwards the error-target part of the dm table to
> align with a zone.

It's the sysfs chunk size.  btrfs/237 harcodes reading that out,
which could be easily lifted into a helper.

> I have a lot more doubts about whether or not doing
> that in bash/awk is a good idea though.  It'd be a lot easier if either
> the block layer did error injection or if someone just fixes those
> limitations in dm itself.

I'll sign up to do the block layer stuff.  Doing so should allow us
to run a lot more of the error injetion tests on zoned xfs, which
would be good.  So I guess you should keep it as-is for now,
and I'll do a sweep later.


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 13/26] xfs_healer: create a service to start the per-mount healer service
  2026-03-03 16:52       ` Darrick J. Wong
@ 2026-03-03 16:54         ` Christoph Hellwig
  2026-03-03 17:06           ` Darrick J. Wong
  0 siblings, 1 reply; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-03 16:54 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, aalbersh, hch, linux-xfs

On Tue, Mar 03, 2026 at 08:52:21AM -0800, Darrick J. Wong wrote:
> > > +	while ((ret = syscall(SYS_listmount, &req, &mnt_ids, 32, 0)) > 0) {
> > 
> > Should this use a wrapper so we can switch to the type safe libc
> > version once it becomes available?
> 
> What kind of wrapper?

For calling the listmount system call.

> or did you have something else in mind?  The manual page for listmount
> says that glibc provides no wrapper[1].

Ånd there are no plans to provide one? :(  Even if so having a libfrog
wrapper would be nice rather than open coding syscall() in at least
two places in this series.


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 20/26] xfs_scrub: use the verify media ioctl during phase 6 if possible
  2026-03-03 15:53     ` Christoph Hellwig
@ 2026-03-03 16:59       ` Darrick J. Wong
  0 siblings, 0 replies; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03 16:59 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: aalbersh, hch, linux-xfs

On Tue, Mar 03, 2026 at 07:53:45AM -0800, Christoph Hellwig wrote:
> >  	if (disk->d_fd < 0)
> > @@ -266,6 +267,18 @@ disk_close(
> >  #define LBASIZE(d)		(1ULL << (d)->d_lbalog)
> >  #define BTOLBA(d, bytes)	(((uint64_t)(bytes) + LBASIZE(d) - 1) >> (d)->d_lbalog)
> >  
> > +#ifndef BTOBB
> > +# define BTOBB(bytes)		((uint64_t)((bytes) + 511) >> 9)
> > +#endif
> > +
> > +#ifndef BTOBBT
> > +# define BTOBBT(bytes)		((uint64_t)(bytes) >> 9)
> > +#endif
> > +
> > +#ifndef BBTOB
> > +# define BBTOB(bytes)		((uint64_t)(bytes) << 9)
> > +#endif
> 
> Is this really something that should be in scrub and not in
> common code?  And why the ifndef?  the 9 and the derived from that
> 511 would also really benefit from symbolic names.

Hrmm, that's a good question, why /did/ I duplicate that from xfs_fs.h?
I have no idea why and it builds fine without it so I'll drop it.

> > +	if (disk->d_verify_fd >= 0) {
> > +		const uint64_t	orig_start_daddr = BTOBBT(start);
> > +		struct xfs_verify_media me = {
> > +			.me_start_daddr	= orig_start_daddr,
> > +			.me_end_daddr	= BTOBB(start + length),
> > +			.me_dev		= disk->d_verify_disk,
> > +			.me_rest_us	= bg_mode > 2 ? bg_mode - 1 : 0,
> > +		};
> > +		int		ret;
> > +
> > +		if (single_step)
> > +			me.me_flags |= XFS_VERIFY_MEDIA_REPORT;
> > +
> > +		ret = ioctl(disk->d_verify_fd, XFS_IOC_VERIFY_MEDIA, &me);
> > +		if (ret < 0)
> > +			return ret;
> > +		if (me.me_ioerror) {
> > +			errno = me.me_ioerror;
> > +			return -1;
> > +		}
> > +
> > +		return BBTOB(me.me_start_daddr - orig_start_daddr);
> > +	}
> 
> split this whole block into a helper for readabiltity?

Will do.

--D

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 13/26] xfs_healer: create a service to start the per-mount healer service
  2026-03-03 16:54         ` Christoph Hellwig
@ 2026-03-03 17:06           ` Darrick J. Wong
  0 siblings, 0 replies; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03 17:06 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: aalbersh, hch, linux-xfs

On Tue, Mar 03, 2026 at 08:54:12AM -0800, Christoph Hellwig wrote:
> On Tue, Mar 03, 2026 at 08:52:21AM -0800, Darrick J. Wong wrote:
> > > > +	while ((ret = syscall(SYS_listmount, &req, &mnt_ids, 32, 0)) > 0) {
> > > 
> > > Should this use a wrapper so we can switch to the type safe libc
> > > version once it becomes available?
> > 
> > What kind of wrapper?
> 
> For calling the listmount system call.
> 
> > or did you have something else in mind?  The manual page for listmount
> > says that glibc provides no wrapper[1].
> 
> Ånd there are no plans to provide one? :(  Even if so having a libfrog
> wrapper would be nice rather than open coding syscall() in at least
> two places in this series.

Oh, I see.  Yes, I could create a libfrog helper to wrap the listmount
callsites.

I can't tell what sorts of discussions glibc may or may not have had
because sourceware is barely reachable due to AIDDOS attacks or whatever
the reason du jour is, and given that the archives are pipermail they're
probably not searchable anyway. :(

Google, FWIW, shows a discussion from November 2023 that seems to have
dried up, and the glibc gitweb doesn't produce any hits for listmount or
statmount.

So my guess is that we can just make our own libfrog wrapper and if libc
support ever shows up we can always port.

--D

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 22/26] xfs_io: add listmount command
  2026-03-03 15:56     ` Christoph Hellwig
@ 2026-03-03 17:08       ` Darrick J. Wong
  0 siblings, 0 replies; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03 17:08 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: aalbersh, hch, linux-xfs

On Tue, Mar 03, 2026 at 07:56:35AM -0800, Christoph Hellwig wrote:
> On Mon, Mar 02, 2026 at 04:39:33PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > Add a command to list all mounts, now that we use this in
> > xfs_healer_start.
> 
> > +/* copied from linux/mount.h in linux 6.18 */
> > +struct statmount_fixed {
> 
> Shouldn't this use the kernel uapi header and/or a copy of it?
> 
> And be split out of the .c file into a separate header for maintainance
> and eventually nuking once the kernel requirement becomes new enough?
> 
> > +static int
> > +listmount(
> > +	const struct mnt_id_req	*req,
> > +	uint64_t		*mnt_ids,
> > +	size_t			nr_mnt_ids)
> > +{
> > +	return syscall(SYS_listmount, req, mnt_ids, nr_mnt_ids, 0);
> > +}
> 
> Same comment as for the other listmount instance here.
> 
> > +
> > +static int
> > +statmount(
> > +	const struct mnt_id_req	*req,
> > +	struct statmount_fixed	*smbuf,
> > +	size_t			smbuf_size)
> > +{
> > +	return syscall(SYS_statmount, req, smbuf, smbuf_size, 0);
> > +}
> 
> and similar here.

Yeah, I'll move all this into a libfrog .c/.h file.

--D

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 26/26] debian/control: listify the build dependencies
  2026-03-03 15:58     ` Christoph Hellwig
@ 2026-03-03 17:09       ` Darrick J. Wong
  0 siblings, 0 replies; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03 17:09 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: aalbersh, hch, linux-xfs

On Tue, Mar 03, 2026 at 07:58:52AM -0800, Christoph Hellwig wrote:
> On Mon, Mar 02, 2026 at 04:40:36PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > This will make it less gross to add more build deps later.
> 
> Looks good, but should this go to the beginning of the series?

Fine with me.

--D

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 25/26] debian: enable xfs_healer on the root filesystem by default
  2026-03-03 15:58     ` Christoph Hellwig
@ 2026-03-03 17:14       ` Darrick J. Wong
  2026-03-04 13:01         ` Christoph Hellwig
  0 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03 17:14 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: aalbersh, hch, linux-xfs

On Tue, Mar 03, 2026 at 07:58:34AM -0800, Christoph Hellwig wrote:
> On Mon, Mar 02, 2026 at 04:40:20PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > Now that we're finished building autonomous repair, enable the service
> > on the root filesystem by default.  The root filesystem is mounted by
> > the initrd prior to starting systemd, which is why the udev rule cannot
> > autostart the service for the root filesystem.
> > 
> > dh_installsystemd won't activate a template service (aka one with an
> > at-sign in the name) even if it provides a DefaultInstance directive to
> > make that possible.  Use a fugly shim for this.
> 
> Given that this is brand new code it feels a bit too early.  But maybe
> that's just me.

A lot depends on the distro -- RHEL and SUSE require the sysadmin to
activate services.  Debian turns on any service shipping in a package by
default, which is sort of funny since they don't enable online fsck in
their kernel at all, so all the healer services fail the --supported
checks and deactivate immediately.

(By contrast stock OL doesn't enable the service but enables scrub in
the kernel; and UEK enables everything.)

--D

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 17/26] xfs_healer: use getmntent to find moved filesystems
  2026-03-03 15:51     ` Christoph Hellwig
@ 2026-03-03 17:26       ` Darrick J. Wong
  2026-03-04 13:03         ` Christoph Hellwig
  0 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03 17:26 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: aalbersh, hch, linux-xfs

On Tue, Mar 03, 2026 at 07:51:45AM -0800, Christoph Hellwig wrote:
> Looks good:
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> 
> but in a way just grabing a weak handle at mount time and never
> dropping it would seem more useful?

Er... that /is/ what xfs_healer does -- it opens the rootdir, creates
the weakhandle, the weakhandle makes a copy of the rootdir handle, and
then xfs_healer closes the rootdir fd.

Later when xfs_healer needs to do a repair, it asks the weakhandle to
reopen the old mountpoint path, compare that fd's handle to the sample
in the weakhandle, and tell us if the fd matches.

Or did you mean that xfs_healer should keep the rootdir fd open for the
duration of its existence, that way weakhandle reconnection is trivial?

[from the next patch]

> > When xfs_healer reopens a mountpoint to perform a repair, it should
> > validate that the opened fd points to a file on the same filesystem as
> > the one being monitored.
> 
> .. and if we'd always keep the week handle around we would not need
> this?

The trouble with keeping the rootdir fd around is that now we pin the
mount and nobody can unmount the disk until they manually kill
xfs_healer.  IOWs, struct weakhandle is basically a wrapper around
struct xfs_handle with some cleverness to avoid maintaining an open fd
to the xfs filesystem when it's not needed.

(As opposed to libhandle, which maintains an open fd to the xfs
filesystem until you kill the program.)

<shrug> I might not be understanding the questions though.

--D

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 23/26] xfs_io: print systemd service names
  2026-03-03 15:57     ` Christoph Hellwig
@ 2026-03-03 17:29       ` Darrick J. Wong
  2026-03-04 13:04         ` Christoph Hellwig
  0 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03 17:29 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: aalbersh, hch, linux-xfs

On Tue, Mar 03, 2026 at 07:57:18AM -0800, Christoph Hellwig wrote:
> 
> The idea of this is good, but is xfs_io really the right place for
> it?  I'd expect scrub or healer to just output this somewhow.

I suppose they could, but I didn't want to clutter up the argv parsing
any more than I had to; and xfs_healer gets installed to /usr/libexec
which would make fstests' use of them to find the service name more
complex.

(That was a long way of saying "can't we just keep using xfs_io as a
dumping ground for QA-related xfs stuff?" ;))

--D

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 24/26] mkfs: enable online repair if all backrefs are enabled
  2026-03-03 15:58     ` Christoph Hellwig
@ 2026-03-03 17:32       ` Darrick J. Wong
  2026-03-05 22:22         ` Darrick J. Wong
  0 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03 17:32 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: aalbersh, hch, linux-xfs

On Tue, Mar 03, 2026 at 07:58:06AM -0800, Christoph Hellwig wrote:
> On Mon, Mar 02, 2026 at 04:40:05PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > If all backreferences are enabled in the filesystem, then enable online
> > repair by default if the user didn't supply any other autofsck setting.
> > Users might as well get full self-repair capability if they're paying
> > for the extra metadata.
> 
> Does this cause scrub to run by default or just healer on demand?
> People might not be happy about the former.

Ultimately it's up to the distro to decide if (a) they turn on the
kernel support and (b) enable the systemd services by default.  Setting
the fsproperty just means that you'll get different levels of
online repair functionality if the user/sysadmin/crond actually invoke
the services.

(That said, I was wondering if it was time to get rid of all the Kconfig
options...)

--D

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 1/1] generic: test fsnotify filesystem error reporting
  2026-03-03 16:53         ` Christoph Hellwig
@ 2026-03-03 17:59           ` Darrick J. Wong
  0 siblings, 0 replies; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-03 17:59 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: zlang, linux-fsdevel, hch, gabriel, amir73il, jack, fstests,
	linux-xfs

On Tue, Mar 03, 2026 at 08:53:12AM -0800, Christoph Hellwig wrote:
> On Tue, Mar 03, 2026 at 08:49:01AM -0800, Darrick J. Wong wrote:
> > > > +ext4)
> > > > +	# added at the same time as uevents
> > > > +	modprobe fs-$FSTYP
> > > > +	test -e /sys/fs/ext4/features/uevents || \
> > > > +		_notrun "$FSTYP does not support fsnotify ioerrors"
> > > > +	;;
> > > > +*)
> > > > +	_notrun "$FSTYP does not support fsnotify ioerrors"
> > > > +	;;
> > > > +esac
> > > 
> > > Please abstract this out into a documented helper in common/
> > 
> > Ok.  I'm not sure how to check for feature support on ext4 anymore since
> > the uevents patch didn't get merged, and then I clearly forgot to rip
> > that out of this helper here.
> 
> Oh.  Well, drop that then and move the xfs side and the default n
> into a common helper instead of hardcoding it in the test.

/me discovers that Baolin Liu added a "err_report_sec" sysfs knob to
ext4 in 7.0-rc1, so I can just change the helper to look for that.  I'll
move the logic to common/rc.

> > > and add proper error injection to the block code, which has been
> > > somewhere on my todo list forever because dm-error and friends are
> > > so painful to setup.  Maybe I need to expedite that.
> > 
> > I think it's theoretically possible to figure out that there's a zone
> > size and then round outwards the error-target part of the dm table to
> > align with a zone.
> 
> It's the sysfs chunk size.  btrfs/237 harcodes reading that out,
> which could be easily lifted into a helper.
> 
> > I have a lot more doubts about whether or not doing
> > that in bash/awk is a good idea though.  It'd be a lot easier if either
> > the block layer did error injection or if someone just fixes those
> > limitations in dm itself.
> 
> I'll sign up to do the block layer stuff.  Doing so should allow us
> to run a lot more of the error injetion tests on zoned xfs, which
> would be good.  So I guess you should keep it as-is for now,
> and I'll do a sweep later.

Ok, thanks. :)

--D

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 1/1] generic: test fsnotify filesystem error reporting
  2026-03-03 14:51       ` Christoph Hellwig
  2026-03-03 14:56         ` Amir Goldstein
@ 2026-03-04 10:10         ` Jan Kara
  1 sibling, 0 replies; 112+ messages in thread
From: Jan Kara @ 2026-03-04 10:10 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Amir Goldstein, Darrick J. Wong, zlang, linux-fsdevel, hch,
	gabriel, jack, fstests, linux-xfs

On Tue 03-03-26 06:51:19, Christoph Hellwig wrote:
> On Tue, Mar 03, 2026 at 10:21:04AM +0100, Amir Goldstein wrote:
> > On Tue, Mar 3, 2026 at 1:40 AM Darrick J. Wong <djwong@kernel.org> wrote:
> > >
> > > From: Darrick J. Wong <djwong@kernel.org>
> > >
> > > Test the fsnotify filesystem error reporting.
> > 
> > For the record, I feel that I need to say to all the people whom we pushed back
> > on fanotify tests in fstests until there was a good enough reason to do so,
> > that this seems like a good reason to do so ;)
> 
> Who pushed backed on that?  Because IMHO hiding stuff in ltp is a sure
> way it doesn't get exercisesd regularly?

Amir wrote it well, I'd just add the 0-day runs LTP, distro people run LTP
and lot of other test bots also run LTP so I wouldn't say fsnotify tests
are not exercised regularly. For record I don't expect regular filesystem
developers to need to run fsnotify tests as the code is generally well
separated from individual filesystems. Filesystem error reporting is kind
of special in this regard so I agree having it in fstests makes sense.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 25/26] debian: enable xfs_healer on the root filesystem by default
  2026-03-03 17:14       ` Darrick J. Wong
@ 2026-03-04 13:01         ` Christoph Hellwig
  2026-03-05 22:10           ` Darrick J. Wong
  0 siblings, 1 reply; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-04 13:01 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, aalbersh, hch, linux-xfs

On Tue, Mar 03, 2026 at 09:14:00AM -0800, Darrick J. Wong wrote:
> A lot depends on the distro -- RHEL and SUSE require the sysadmin to
> activate services.  Debian turns on any service shipping in a package by
> default, which is sort of funny since they don't enable online fsck in
> their kernel at all, so all the healer services fail the --supported
> checks and deactivate immediately.

So this patch doesn't make much sense right now?

Either way it really should have these details in the commit log.


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 17/26] xfs_healer: use getmntent to find moved filesystems
  2026-03-03 17:26       ` Darrick J. Wong
@ 2026-03-04 13:03         ` Christoph Hellwig
  2026-03-04 16:30           ` Darrick J. Wong
  0 siblings, 1 reply; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-04 13:03 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, aalbersh, hch, linux-xfs

On Tue, Mar 03, 2026 at 09:26:54AM -0800, Darrick J. Wong wrote:
> Or did you mean that xfs_healer should keep the rootdir fd open for the
> duration of its existence, that way weakhandle reconnection is trivial?
> 
> [from the next patch]
> 
> > > When xfs_healer reopens a mountpoint to perform a repair, it should
> > > validate that the opened fd points to a file on the same filesystem as
> > > the one being monitored.
> > 
> > .. and if we'd always keep the week handle around we would not need
> > this?
> 
> The trouble with keeping the rootdir fd around is that now we pin the
> mount and nobody can unmount the disk until they manually kill
> xfs_healer.  IOWs, struct weakhandle is basically a wrapper around
> struct xfs_handle with some cleverness to avoid maintaining an open fd
> to the xfs filesystem when it's not needed.

Ok.  I've officially forgot what all the kernel code did.  I somehow
expected a weak handle to be a fd that the kernel could close on us,
which would be much more handy here.


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 23/26] xfs_io: print systemd service names
  2026-03-03 17:29       ` Darrick J. Wong
@ 2026-03-04 13:04         ` Christoph Hellwig
  2026-03-04 16:35           ` Darrick J. Wong
  0 siblings, 1 reply; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-04 13:04 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, aalbersh, hch, linux-xfs

On Tue, Mar 03, 2026 at 09:29:16AM -0800, Darrick J. Wong wrote:
> (That was a long way of saying "can't we just keep using xfs_io as a
> dumping ground for QA-related xfs stuff?" ;))

I really hate messing it up with things that are no I/O at all,
and not related to issuing I/O or related syscalls.  Maybe just add
a new little binary for it?


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 17/26] xfs_healer: use getmntent to find moved filesystems
  2026-03-04 13:03         ` Christoph Hellwig
@ 2026-03-04 16:30           ` Darrick J. Wong
  2026-03-05 14:00             ` Christoph Hellwig
  0 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-04 16:30 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: aalbersh, hch, linux-xfs

On Wed, Mar 04, 2026 at 05:03:26AM -0800, Christoph Hellwig wrote:
> On Tue, Mar 03, 2026 at 09:26:54AM -0800, Darrick J. Wong wrote:
> > Or did you mean that xfs_healer should keep the rootdir fd open for the
> > duration of its existence, that way weakhandle reconnection is trivial?
> > 
> > [from the next patch]
> > 
> > > > When xfs_healer reopens a mountpoint to perform a repair, it should
> > > > validate that the opened fd points to a file on the same filesystem as
> > > > the one being monitored.
> > > 
> > > .. and if we'd always keep the week handle around we would not need
> > > this?
> > 
> > The trouble with keeping the rootdir fd around is that now we pin the
> > mount and nobody can unmount the disk until they manually kill
> > xfs_healer.  IOWs, struct weakhandle is basically a wrapper around
> > struct xfs_handle with some cleverness to avoid maintaining an open fd
> > to the xfs filesystem when it's not needed.
> 
> Ok.  I've officially forgot what all the kernel code did.  I somehow
> expected a weak handle to be a fd that the kernel could close on us,
> which would be much more handy here.

Yeah.  I tried creating a(nother) anon_inode that has the same sort of
weak link to the xfs_mount that the healthmon fd has, for the purpose of
forwarding scrub ioctls.  That got twisty fast because the scrub code
wants to be able to call things like mnt_want_write_file and file_inode,
but the file doesn't point to an xfs inode and abusing the anon inode
file to make that work just became too gross to stomach. :/

--D

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 23/26] xfs_io: print systemd service names
  2026-03-04 13:04         ` Christoph Hellwig
@ 2026-03-04 16:35           ` Darrick J. Wong
  2026-03-05 13:55             ` Christoph Hellwig
  0 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-04 16:35 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: aalbersh, hch, linux-xfs

On Wed, Mar 04, 2026 at 05:04:18AM -0800, Christoph Hellwig wrote:
> On Tue, Mar 03, 2026 at 09:29:16AM -0800, Darrick J. Wong wrote:
> > (That was a long way of saying "can't we just keep using xfs_io as a
> > dumping ground for QA-related xfs stuff?" ;))
> 
> I really hate messing it up with things that are no I/O at all,
> and not related to issuing I/O or related syscalls.  Maybe just add
> a new little binary for it?

How about xfs_db, since normal users shouldn't need to compute the
service unit names?

--D

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 03/26] libfrog: add support code for starting systemd services programmatically
  2026-03-03 15:59       ` Darrick J. Wong
@ 2026-03-05  2:39         ` Darrick J. Wong
  2026-03-05 13:57           ` Christoph Hellwig
  0 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-05  2:39 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: aalbersh, hch, linux-xfs

On Tue, Mar 03, 2026 at 07:59:15AM -0800, Darrick J. Wong wrote:
> On Tue, Mar 03, 2026 at 07:45:37AM -0800, Christoph Hellwig wrote:
> > On Mon, Mar 02, 2026 at 04:34:36PM -0800, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <djwong@kernel.org>
> > > 
> > > Add some simple routines for computing the name of systemd service
> > > instances and starting systemd services.  These will be used by the
> > > xfs_healer_start service to start per-filesystem xfs_healer service
> > > instances.
> > > 
> > > Note that we run systemd helper programs as subprocesses for a couple of
> > > reasons.  First, the path-escaping functionality is not a part of any
> > > library-accessible API, which means it can only be accessed via
> > > systemd-escape(1).  Second, although the service startup functionality
> > > can be reached via dbus, doing so would introduce a new library
> > > dependency.  Systemd is also undergoing a dbus -> varlink RPC transition
> > > so we avoid that mess by calling the cli systemctl(1) program.
> > 
> > Just curious: did you run this past the systemd folks?  Shelling out
> > always feel a bit iffy, and they're usually happy to help on how to
> > integrate with their services, so just asking might result in a better
> > way.
> 
> I'll do that, though even if they add a dbus/varlink endpoint for path
> escaping, it'll be a few releases more until we can depend on it
> existing in the distros. :(
> 
> Service startup can be done fairly easily with something like this,
> though obviously you'd want to do real error checking here:
> 
> static DBusConnection*
> connect_to_system_bus(void)
> {
> 	// Connect to the system bus (requires root or polkit permissions)
> 	DBusConnection *conn = dbus_bus_get(DBUS_BUS_SYSTEM, error);
> 
> 	return conn;
> }
> 
> int
> systemd_stop_service(
> 	const char *service_name)
> {
> 	DBusError error;
> 	DBusConnection *conn = connect_to_system_bus();
> 
> 	const char *manager_path = "/org/freedesktop/systemd1";
> 	const char *manager_interface = "org.freedesktop.systemd1.Manager";
> 	const char *method = "StopUnit";
> 
> 	DBusMessage *msg = dbus_message_new_method_call(
> 		"org.freedesktop.systemd1",
> 		manager_path,
> 		manager_interface,
> 		method
> 	);
> 
> 	const char *mode = "replace"; // Stop and replace existing job
> 	dbus_message_append_args(msg, DBUS_TYPE_STRING, &service_name,
> 			DBUS_TYPE_STRING, &mode, DBUS_TYPE_INVALID);
> 
> 	DBusMessage *reply =
> 			dbus_connection_send_with_reply_and_block(conn,
> 					msg, 5000, &error);
> 
> 	dbus_message_unref(reply);
> 	dbus_message_unref(msg);
> 	dbus_connection_unref(conn);
> 
> 	return 0;
> }
> 
> with the previously mentioned problem that now xfsprogs grows build and
> packaging dependencies on libdbus.  My guess is that it'll be a long
> time till they deprecate starting services over dbus.  AFAICT systemd in
> Trixie doesn't even expose varlink endpoints yet.
> 
> (Note that xfs_scrub_all already has a runtime dependency on
> python3-dbus)

Interim update -- just from looking at what
'systemctl restart --no-block' does, there's quite a lot of complexity
that the CLI hides.  If you ask it to start a service, it gives you back
a job object, then you can sit and wait to see what the result of the
job is, etc.  The above vibecoding was actually enough to start the
service, but TBH I think the dangers of shelling out are <cough> roughly
on the same level as all the crap you have to add to talk to systemd
over libdbus.

I'll try their mailing list, but first I have to wait for them to
approve my subscription, and only then can I ask...

--D

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 23/26] xfs_io: print systemd service names
  2026-03-04 16:35           ` Darrick J. Wong
@ 2026-03-05 13:55             ` Christoph Hellwig
  2026-03-05 22:00               ` Darrick J. Wong
  0 siblings, 1 reply; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-05 13:55 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, aalbersh, hch, linux-xfs

On Wed, Mar 04, 2026 at 08:35:02AM -0800, Darrick J. Wong wrote:
> On Wed, Mar 04, 2026 at 05:04:18AM -0800, Christoph Hellwig wrote:
> > On Tue, Mar 03, 2026 at 09:29:16AM -0800, Darrick J. Wong wrote:
> > > (That was a long way of saying "can't we just keep using xfs_io as a
> > > dumping ground for QA-related xfs stuff?" ;))
> > 
> > I really hate messing it up with things that are no I/O at all,
> > and not related to issuing I/O or related syscalls.  Maybe just add
> > a new little binary for it?
> 
> How about xfs_db, since normal users shouldn't need to compute the
> service unit names?

Still seems totally out of place for something not touching the
on-disk structures.  What's the problem with adding a new trivial
binary for it?  Or even just publishing the name in a file in
/usr/share?


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 03/26] libfrog: add support code for starting systemd services programmatically
  2026-03-05  2:39         ` Darrick J. Wong
@ 2026-03-05 13:57           ` Christoph Hellwig
  0 siblings, 0 replies; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-05 13:57 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, aalbersh, hch, linux-xfs

On Wed, Mar 04, 2026 at 06:39:04PM -0800, Darrick J. Wong wrote:
> Interim update -- just from looking at what
> 'systemctl restart --no-block' does, there's quite a lot of complexity
> that the CLI hides.  If you ask it to start a service, it gives you back
> a job object, then you can sit and wait to see what the result of the
> job is, etc.  The above vibecoding was actually enough to start the
> service, but TBH I think the dangers of shelling out are <cough> roughly
> on the same level as all the crap you have to add to talk to systemd
> over libdbus.
> 
> I'll try their mailing list, but first I have to wait for them to
> approve my subscription, and only then can I ask...

Eh.  But yeah, I guess just sticking to the CLI might end up best.


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 17/26] xfs_healer: use getmntent to find moved filesystems
  2026-03-04 16:30           ` Darrick J. Wong
@ 2026-03-05 14:00             ` Christoph Hellwig
  2026-03-05 17:55               ` Darrick J. Wong
  0 siblings, 1 reply; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-05 14:00 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, aalbersh, hch, linux-xfs

On Wed, Mar 04, 2026 at 08:30:20AM -0800, Darrick J. Wong wrote:
> Yeah.  I tried creating a(nother) anon_inode that has the same sort of
> weak link to the xfs_mount that the healthmon fd has, for the purpose of
> forwarding scrub ioctls.  That got twisty fast because the scrub code
> wants to be able to call things like mnt_want_write_file and file_inode,
> but the file doesn't point to an xfs inode and abusing the anon inode
> file to make that work just became too gross to stomach. :/

Yeah, don't see how that work.  I guess IFF we wanted that it would
have to be a VFS-level weak FD concept, and that seems out of scope for
this at the moment unfortunately.


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 17/26] xfs_healer: use getmntent to find moved filesystems
  2026-03-05 14:00             ` Christoph Hellwig
@ 2026-03-05 17:55               ` Darrick J. Wong
  0 siblings, 0 replies; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-05 17:55 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: aalbersh, hch, linux-xfs

On Thu, Mar 05, 2026 at 06:00:33AM -0800, Christoph Hellwig wrote:
> On Wed, Mar 04, 2026 at 08:30:20AM -0800, Darrick J. Wong wrote:
> > Yeah.  I tried creating a(nother) anon_inode that has the same sort of
> > weak link to the xfs_mount that the healthmon fd has, for the purpose of
> > forwarding scrub ioctls.  That got twisty fast because the scrub code
> > wants to be able to call things like mnt_want_write_file and file_inode,
> > but the file doesn't point to an xfs inode and abusing the anon inode
> > file to make that work just became too gross to stomach. :/
> 
> Yeah, don't see how that work.  I guess IFF we wanted that it would
> have to be a VFS-level weak FD concept, and that seems out of scope for
> this at the moment unfortunately.

<nod> I'll add a comment summarizing this part of the discussion to the
commit message for adding repair functionality, so that we record the
justification for all the getmntent trickery.

/me observes that 7.0 adds a STATMOUNT_BY_FD flag to statmount() that
enables us to get the statmount data (and hence mnt_id) for an open
file.  We could record that (in addition to the datadev path) to try to
reconnect after a mount --move without having to parse /proc/mtab like
getmntent does.

--D

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 23/26] xfs_io: print systemd service names
  2026-03-05 13:55             ` Christoph Hellwig
@ 2026-03-05 22:00               ` Darrick J. Wong
  2026-03-06 14:20                 ` Christoph Hellwig
  0 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-05 22:00 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: aalbersh, hch, linux-xfs

On Thu, Mar 05, 2026 at 05:55:11AM -0800, Christoph Hellwig wrote:
> On Wed, Mar 04, 2026 at 08:35:02AM -0800, Darrick J. Wong wrote:
> > On Wed, Mar 04, 2026 at 05:04:18AM -0800, Christoph Hellwig wrote:
> > > On Tue, Mar 03, 2026 at 09:29:16AM -0800, Darrick J. Wong wrote:
> > > > (That was a long way of saying "can't we just keep using xfs_io as a
> > > > dumping ground for QA-related xfs stuff?" ;))
> > > 
> > > I really hate messing it up with things that are no I/O at all,
> > > and not related to issuing I/O or related syscalls.  Maybe just add
> > > a new little binary for it?
> > 
> > How about xfs_db, since normal users shouldn't need to compute the
> > service unit names?
> 
> Still seems totally out of place for something not touching the
> on-disk structures.  What's the problem with adding a new trivial
> binary for it?  Or even just publishing the name in a file in
> /usr/share?

Eh I'll just put it in xfs_{scrub,healer} as a --svcname argument.

$ xfs_scrub --svcname /home
xfs_scrub@home.service
$ xfs_scrub --svcname -x /home
xfs_scrub_media@home.service
$ xfs_healer --svcname /home
xfs_healer@home.service

--D

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 25/26] debian: enable xfs_healer on the root filesystem by default
  2026-03-04 13:01         ` Christoph Hellwig
@ 2026-03-05 22:10           ` Darrick J. Wong
  2026-03-05 22:18             ` Darrick J. Wong
  0 siblings, 1 reply; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-05 22:10 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: aalbersh, hch, linux-xfs

On Wed, Mar 04, 2026 at 05:01:59AM -0800, Christoph Hellwig wrote:
> On Tue, Mar 03, 2026 at 09:14:00AM -0800, Darrick J. Wong wrote:
> > A lot depends on the distro -- RHEL and SUSE require the sysadmin to
> > activate services.  Debian turns on any service shipping in a package by
> > default, which is sort of funny since they don't enable online fsck in
> > their kernel at all, so all the healer services fail the --supported
> > checks and deactivate immediately.
> 
> So this patch doesn't make much sense right now?
> 
> Either way it really should have these details in the commit log.

<shrug> I'll amend the commit message:

    Note that this won't do much right now because Debian doesn't enable
    online fsck in their kernels, so the ExecCondition will return false
    and the service won't actually activate.

--D

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 25/26] debian: enable xfs_healer on the root filesystem by default
  2026-03-05 22:10           ` Darrick J. Wong
@ 2026-03-05 22:18             ` Darrick J. Wong
  0 siblings, 0 replies; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-05 22:18 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: aalbersh, hch, linux-xfs

On Thu, Mar 05, 2026 at 02:10:50PM -0800, Darrick J. Wong wrote:
> On Wed, Mar 04, 2026 at 05:01:59AM -0800, Christoph Hellwig wrote:
> > On Tue, Mar 03, 2026 at 09:14:00AM -0800, Darrick J. Wong wrote:
> > > A lot depends on the distro -- RHEL and SUSE require the sysadmin to
> > > activate services.  Debian turns on any service shipping in a package by
> > > default, which is sort of funny since they don't enable online fsck in
> > > their kernel at all, so all the healer services fail the --supported
> > > checks and deactivate immediately.
> > 
> > So this patch doesn't make much sense right now?
> > 
> > Either way it really should have these details in the commit log.
> 
> <shrug> I'll amend the commit message:
> 
>     Note that this won't do much right now because Debian doesn't enable
>     online fsck in their kernels, so the ExecCondition will return false
>     and the service won't actually activate.

Though now I see that the first part of the commit message is also out
of date (we don't do udev anymore) so let's just replace the whole thing
with:

"debian: enable xfs_healer on the root filesystem by default

"Now that we're finished building autonomous repair, enable the healer
service on the root filesystem by default.  The root filesystem is
mounted by the initrd prior to starting systemd, which is why the
xfs_healer_start service cannot autostart the service for the root
filesystem.

"dh_installsystemd won't activate a template service (aka one with an
at-sign in the name) even if it provides a DefaultInstance directive to
make that possible.  Hence we enable this explicitly via the postinst
script.

"Note that Debian enables services by default upon package installation,
so this is consistent with their policies.  Their kernel doesn't enable
online fsck, so healer won't do much more than monitor for corruptions
and log them."

--D

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 24/26] mkfs: enable online repair if all backrefs are enabled
  2026-03-03 17:32       ` Darrick J. Wong
@ 2026-03-05 22:22         ` Darrick J. Wong
  0 siblings, 0 replies; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-05 22:22 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: aalbersh, hch, linux-xfs

On Tue, Mar 03, 2026 at 09:32:03AM -0800, Darrick J. Wong wrote:
> On Tue, Mar 03, 2026 at 07:58:06AM -0800, Christoph Hellwig wrote:
> > On Mon, Mar 02, 2026 at 04:40:05PM -0800, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <djwong@kernel.org>
> > > 
> > > If all backreferences are enabled in the filesystem, then enable online
> > > repair by default if the user didn't supply any other autofsck setting.
> > > Users might as well get full self-repair capability if they're paying
> > > for the extra metadata.
> > 
> > Does this cause scrub to run by default or just healer on demand?
> > People might not be happy about the former.
> 
> Ultimately it's up to the distro to decide if (a) they turn on the
> kernel support and (b) enable the systemd services by default.  Setting
> the fsproperty just means that you'll get different levels of
> online repair functionality if the user/sysadmin/crond actually invoke
> the services.
> 
> (That said, I was wondering if it was time to get rid of all the Kconfig
> options...)

And here's where I'll add the note about distro policies:

"mkfs: enable online repair if all backrefs are enabled

"If all backreferences are enabled in the filesystem, then enable online
repair by default if the user didn't supply any other autofsck setting.
Users might as well get full self-repair capability if they're paying
for the extra metadata.

"Note that it's up to each distro to enable the systemd services
according to their own service activation policies.  Debian policy is to
enable all systemd services at package installation but they don't
enable online fsck in their Kconfig so the services won't activate.
RHEL and SUSE policy requires sysadmins to enable them explicitly unless
the OS vendor also ships a systemd preset file enabling the services.
Distros without systemd won't get any of the systemd services,
obviously."

--D

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 23/26] xfs_io: print systemd service names
  2026-03-05 22:00               ` Darrick J. Wong
@ 2026-03-06 14:20                 ` Christoph Hellwig
  2026-03-06 15:58                   ` Darrick J. Wong
  0 siblings, 1 reply; 112+ messages in thread
From: Christoph Hellwig @ 2026-03-06 14:20 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, aalbersh, hch, linux-xfs

On Thu, Mar 05, 2026 at 02:00:51PM -0800, Darrick J. Wong wrote:
> > Still seems totally out of place for something not touching the
> > on-disk structures.  What's the problem with adding a new trivial
> > binary for it?  Or even just publishing the name in a file in
> > /usr/share?
> 
> Eh I'll just put it in xfs_{scrub,healer} as a --svcname argument.

I think I proposed something like this before, and I really like
that!


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 23/26] xfs_io: print systemd service names
  2026-03-06 14:20                 ` Christoph Hellwig
@ 2026-03-06 15:58                   ` Darrick J. Wong
  0 siblings, 0 replies; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-06 15:58 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: aalbersh, hch, linux-xfs

On Fri, Mar 06, 2026 at 06:20:21AM -0800, Christoph Hellwig wrote:
> On Thu, Mar 05, 2026 at 02:00:51PM -0800, Darrick J. Wong wrote:
> > > Still seems totally out of place for something not touching the
> > > on-disk structures.  What's the problem with adding a new trivial
> > > binary for it?  Or even just publishing the name in a file in
> > > /usr/share?
> > 
> > Eh I'll just put it in xfs_{scrub,healer} as a --svcname argument.
> 
> I think I proposed something like this before, and I really like
> that!

Yes, you did propose it before, and I figured you'd like it :)

--D

^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 01/13] xfs: test health monitoring code
  2026-03-03  0:41   ` [PATCH 01/13] xfs: test health monitoring code Darrick J. Wong
@ 2026-03-09 17:21     ` Zorro Lang
  2026-03-09 18:03       ` Darrick J. Wong
  0 siblings, 1 reply; 112+ messages in thread
From: Zorro Lang @ 2026-03-09 17:21 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: hch, fstests, linux-xfs

On Mon, Mar 02, 2026 at 04:41:07PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Add some functionality tests for the new health monitoring code.
> 
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> ---
>  doc/group-names.txt |    1 +
>  tests/xfs/1885      |   53 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  tests/xfs/1885.out  |    5 +++++
>  3 files changed, 59 insertions(+)
>  create mode 100755 tests/xfs/1885
>  create mode 100644 tests/xfs/1885.out
> 
> 
> diff --git a/doc/group-names.txt b/doc/group-names.txt
> index 10b49e50517797..158f84d36d3154 100644
> --- a/doc/group-names.txt
> +++ b/doc/group-names.txt
> @@ -117,6 +117,7 @@ samefs			overlayfs when all layers are on the same fs
>  scrub			filesystem metadata scrubbers
>  seed			btrfs seeded filesystems
>  seek			llseek functionality
> +selfhealing		self healing filesystem code
>  selftest		tests with fixed results, used to validate testing setup
>  send			btrfs send/receive
>  shrinkfs		decreasing the size of a filesystem
> diff --git a/tests/xfs/1885 b/tests/xfs/1885
> new file mode 100755
> index 00000000000000..1d75ef19c7c9d9
> --- /dev/null
> +++ b/tests/xfs/1885
> @@ -0,0 +1,53 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2024-2026 Oracle.  All Rights Reserved.
> +#
> +# FS QA Test 1885
> +#
> +# Make sure that healthmon handles module refcount correctly.
> +#
> +. ./common/preamble
> +_begin_fstest auto selfhealing

I found this test is quick enough, how about add it into "quick" group.

> +
> +. ./common/filter
> +. ./common/module

Which helper is this "module" file being included for?

> +
> +refcount_file="/sys/module/xfs/refcnt"
> +test -e "$refcount_file" || _notrun "cannot find xfs module refcount"

Or did you intend to add this part as a helper into common/module?

> +
> +_require_test
> +_require_xfs_io_command healthmon
> +
> +# Capture mod refcount without the test fs mounted
> +_test_unmount
> +init_refcount="$(cat "$refcount_file")"
> +
> +# Capture mod refcount with the test fs mounted
> +_test_mount
> +nomon_mount_refcount="$(cat "$refcount_file")"
> +
> +# Capture mod refcount with test fs mounted and the healthmon fd open.
> +# Pause the xfs_io process so that it doesn't actually respond to events.
> +$XFS_IO_PROG -c 'healthmon -c -v' $TEST_DIR >> $seqres.full &
> +sleep 0.5
> +kill -STOP %1
> +mon_mount_refcount="$(cat "$refcount_file")"
> +
> +# Capture mod refcount with only the healthmon fd open.
> +_test_unmount
> +mon_nomount_refcount="$(cat "$refcount_file")"
> +
> +# Capture mod refcount after continuing healthmon (which should exit due to the
> +# unmount) and killing it.
> +kill -CONT %1
> +kill %1
> +wait

We typically ensure that background processes are handled within the _cleanup function.

> +nomon_nomount_refcount="$(cat "$refcount_file")"
> +
> +_within_tolerance "mount refcount" "$nomon_mount_refcount" "$((init_refcount + 1))" 0 -v
> +_within_tolerance "mount + healthmon refcount" "$mon_mount_refcount" "$((init_refcount + 2))" 0 -v
> +_within_tolerance "healthmon refcount" "$mon_nomount_refcount" "$((init_refcount + 1))" 0 -v
> +_within_tolerance "end refcount" "$nomon_nomount_refcount" "$init_refcount" 0 -v
> +
> +status=0
> +exit

_exit 0

> diff --git a/tests/xfs/1885.out b/tests/xfs/1885.out
> new file mode 100644
> index 00000000000000..f152cef0525609
> --- /dev/null
> +++ b/tests/xfs/1885.out
> @@ -0,0 +1,5 @@
> +QA output created by 1885
> +mount refcount is in range
> +mount + healthmon refcount is in range
> +healthmon refcount is in range
> +end refcount is in range
> 


^ permalink raw reply	[flat|nested] 112+ messages in thread

* Re: [PATCH 01/13] xfs: test health monitoring code
  2026-03-09 17:21     ` Zorro Lang
@ 2026-03-09 18:03       ` Darrick J. Wong
  0 siblings, 0 replies; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-09 18:03 UTC (permalink / raw)
  To: Zorro Lang; +Cc: hch, fstests, linux-xfs

On Tue, Mar 10, 2026 at 01:21:14AM +0800, Zorro Lang wrote:
> On Mon, Mar 02, 2026 at 04:41:07PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > Add some functionality tests for the new health monitoring code.
> > 
> > Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> > ---
> >  doc/group-names.txt |    1 +
> >  tests/xfs/1885      |   53 +++++++++++++++++++++++++++++++++++++++++++++++++++
> >  tests/xfs/1885.out  |    5 +++++
> >  3 files changed, 59 insertions(+)
> >  create mode 100755 tests/xfs/1885
> >  create mode 100644 tests/xfs/1885.out
> > 
> > 
> > diff --git a/doc/group-names.txt b/doc/group-names.txt
> > index 10b49e50517797..158f84d36d3154 100644
> > --- a/doc/group-names.txt
> > +++ b/doc/group-names.txt
> > @@ -117,6 +117,7 @@ samefs			overlayfs when all layers are on the same fs
> >  scrub			filesystem metadata scrubbers
> >  seed			btrfs seeded filesystems
> >  seek			llseek functionality
> > +selfhealing		self healing filesystem code
> >  selftest		tests with fixed results, used to validate testing setup
> >  send			btrfs send/receive
> >  shrinkfs		decreasing the size of a filesystem
> > diff --git a/tests/xfs/1885 b/tests/xfs/1885
> > new file mode 100755
> > index 00000000000000..1d75ef19c7c9d9
> > --- /dev/null
> > +++ b/tests/xfs/1885
> > @@ -0,0 +1,53 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2024-2026 Oracle.  All Rights Reserved.
> > +#
> > +# FS QA Test 1885
> > +#
> > +# Make sure that healthmon handles module refcount correctly.
> > +#
> > +. ./common/preamble
> > +_begin_fstest auto selfhealing
> 
> I found this test is quick enough, how about add it into "quick" group.

OK.

> > +
> > +. ./common/filter
> > +. ./common/module
> 
> Which helper is this "module" file being included for?

I think at one point I would rmmod/modprobe the module to force the
refcount leak issue, but discovered there's a sysfs knob for that...

> > +
> > +refcount_file="/sys/module/xfs/refcnt"
> > +test -e "$refcount_file" || _notrun "cannot find xfs module refcount"
> 
> Or did you intend to add this part as a helper into common/module?

...so this probably should get refactored into a new helper.

> > +
> > +_require_test
> > +_require_xfs_io_command healthmon
> > +
> > +# Capture mod refcount without the test fs mounted
> > +_test_unmount
> > +init_refcount="$(cat "$refcount_file")"
> > +
> > +# Capture mod refcount with the test fs mounted
> > +_test_mount
> > +nomon_mount_refcount="$(cat "$refcount_file")"
> > +
> > +# Capture mod refcount with test fs mounted and the healthmon fd open.
> > +# Pause the xfs_io process so that it doesn't actually respond to events.
> > +$XFS_IO_PROG -c 'healthmon -c -v' $TEST_DIR >> $seqres.full &
> > +sleep 0.5
> > +kill -STOP %1
> > +mon_mount_refcount="$(cat "$refcount_file")"
> > +
> > +# Capture mod refcount with only the healthmon fd open.
> > +_test_unmount
> > +mon_nomount_refcount="$(cat "$refcount_file")"
> > +
> > +# Capture mod refcount after continuing healthmon (which should exit due to the
> > +# unmount) and killing it.
> > +kill -CONT %1
> > +kill %1
> > +wait
> 
> We typically ensure that background processes are handled within the _cleanup function.

oops, will clean that up.

$XFS_IO_PROG -c 'healthmon -c -v' $TEST_DIR >> $seqres.full &
healer_pid=$!
...
kill $healer_pid

etc.  Thanks for pointing that out.

--D

> > +nomon_nomount_refcount="$(cat "$refcount_file")"
> > +
> > +_within_tolerance "mount refcount" "$nomon_mount_refcount" "$((init_refcount + 1))" 0 -v
> > +_within_tolerance "mount + healthmon refcount" "$mon_mount_refcount" "$((init_refcount + 2))" 0 -v
> > +_within_tolerance "healthmon refcount" "$mon_nomount_refcount" "$((init_refcount + 1))" 0 -v
> > +_within_tolerance "end refcount" "$nomon_nomount_refcount" "$init_refcount" 0 -v
> > +
> > +status=0
> > +exit
> 
> _exit 0
> 
> > diff --git a/tests/xfs/1885.out b/tests/xfs/1885.out
> > new file mode 100644
> > index 00000000000000..f152cef0525609
> > --- /dev/null
> > +++ b/tests/xfs/1885.out
> > @@ -0,0 +1,5 @@
> > +QA output created by 1885
> > +mount refcount is in range
> > +mount + healthmon refcount is in range
> > +healthmon refcount is in range
> > +end refcount is in range
> > 
> 

^ permalink raw reply	[flat|nested] 112+ messages in thread

* [PATCH 04/26] libfrog: hoist a couple of service helper functions
  2026-03-19  4:38 [PATCHSET v10 1/2] xfsprogs: autonomous self healing of filesystems Darrick J. Wong
@ 2026-03-19  4:39 ` Darrick J. Wong
  0 siblings, 0 replies; 112+ messages in thread
From: Darrick J. Wong @ 2026-03-19  4:39 UTC (permalink / raw)
  To: aalbersh, djwong; +Cc: hch, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Hoist a couple of service/daemon-related helper functions to libfrog so
that we can share the code between xfs_scrub and xfs_healer.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 libfrog/systemd.h |   28 ++++++++++++++++++++++++++++
 scrub/xfs_scrub.c |   32 +++++++++-----------------------
 2 files changed, 37 insertions(+), 23 deletions(-)


diff --git a/libfrog/systemd.h b/libfrog/systemd.h
index 4f414bc3c1e9c3..c96df4afa39aa6 100644
--- a/libfrog/systemd.h
+++ b/libfrog/systemd.h
@@ -17,4 +17,32 @@ enum systemd_unit_manage {
 
 int systemd_manage_unit(enum systemd_unit_manage how, const char *unitname);
 
+static inline bool systemd_is_service(void)
+{
+	return getenv("SERVICE_MODE") != NULL;
+}
+
+/* Special processing for a service/daemon program that is exiting. */
+static inline int
+systemd_service_exit(int ret)
+{
+	/*
+	 * We have to sleep 2 seconds here because journald uses the pid to
+	 * connect our log messages to the systemd service.  This is critical
+	 * for capturing all the log messages if the service fails, because
+	 * failure analysis tools use the service name to gather log messages
+	 * for reporting.
+	 */
+	sleep(2);
+
+	/*
+	 * If we're being run as a service, the return code must fit the LSB
+	 * init script action error guidelines, which is to say that we
+	 * compress all errors to 1 ("generic or unspecified error", LSB 5.0
+	 * section 22.2) and hope the admin will scan the log for what actually
+	 * happened.
+	 */
+	return ret != 0 ? EXIT_FAILURE : EXIT_SUCCESS;
+}
+
 #endif /* __LIBFROG_SYSTEMD_H__ */
diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c
index 3dba972a7e8d2a..79937aa8cce4c4 100644
--- a/scrub/xfs_scrub.c
+++ b/scrub/xfs_scrub.c
@@ -19,6 +19,7 @@
 #include "unicrash.h"
 #include "progress.h"
 #include "libfrog/histogram.h"
+#include "libfrog/systemd.h"
 
 /*
  * XFS Online Metadata Scrub (and Repair)
@@ -866,8 +867,7 @@ main(
 	if (stdout_isatty && !progress_fp)
 		progress_fp = fdopen(1, "w+");
 
-	if (getenv("SERVICE_MODE"))
-		is_service = true;
+	is_service = systemd_is_service();
 
 	/* Initialize overall phase stats. */
 	error = phase_start(&all_pi, 0, NULL);
@@ -960,29 +960,15 @@ main(
 	hist_free(&ctx.datadev_hist);
 	hist_free(&ctx.rtdev_hist);
 
-	/*
-	 * If we're being run as a service, the return code must fit the LSB
-	 * init script action error guidelines, which is to say that we
-	 * compress all errors to 1 ("generic or unspecified error", LSB 5.0
-	 * section 22.2) and hope the admin will scan the log for what
-	 * actually happened.
-	 *
-	 * We have to sleep 2 seconds here because journald uses the pid to
-	 * connect our log messages to the systemd service.  This is critical
-	 * for capturing all the log messages if the scrub fails, because the
-	 * fail service uses the service name to gather log messages for the
-	 * error report.
-	 *
-	 * Note: We don't count a lack of kernel support as a service failure
-	 * because we haven't determined that there's anything wrong with the
-	 * filesystem.
-	 */
 	if (is_service) {
-		sleep(2);
+		/*
+		 * Note: We don't count a lack of kernel support as a service
+		 * failure because we haven't determined that there's anything
+		 * wrong with the filesystem.
+		 */
 		if (!ctx.scrub_setup_succeeded)
-			return 0;
-		if (ret != SCRUB_RET_SUCCESS)
-			return 1;
+			ret = 0;
+		return systemd_service_exit(ret);
 	}
 
 	return ret;


^ permalink raw reply related	[flat|nested] 112+ messages in thread

end of thread, other threads:[~2026-03-19  4:39 UTC | newest]

Thread overview: 112+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-03  0:25 [PATCHBOMB v8] xfsprogs: autonomous self healing of filesystems Darrick J. Wong
2026-03-03  0:33 ` [PATCHSET " Darrick J. Wong
2026-03-03  0:34   ` [PATCH 01/26] libfrog: add a function to grab the path from an open fd and a file handle Darrick J. Wong
2026-03-03 15:44     ` Christoph Hellwig
2026-03-03  0:34   ` [PATCH 02/26] libfrog: create healthmon event log library functions Darrick J. Wong
2026-03-03 15:44     ` Christoph Hellwig
2026-03-03  0:34   ` [PATCH 03/26] libfrog: add support code for starting systemd services programmatically Darrick J. Wong
2026-03-03 15:45     ` Christoph Hellwig
2026-03-03 15:59       ` Darrick J. Wong
2026-03-05  2:39         ` Darrick J. Wong
2026-03-05 13:57           ` Christoph Hellwig
2026-03-03  0:34   ` [PATCH 04/26] libfrog: hoist a couple of service helper functions Darrick J. Wong
2026-03-03 15:45     ` Christoph Hellwig
2026-03-03  0:35   ` [PATCH 05/26] man2: document the healthmon ioctl Darrick J. Wong
2026-03-03 15:46     ` Christoph Hellwig
2026-03-03  0:35   ` [PATCH 06/26] man2: document the media verification ioctl Darrick J. Wong
2026-03-03 15:46     ` Christoph Hellwig
2026-03-03  0:35   ` [PATCH 07/26] xfs_io: monitor filesystem health events Darrick J. Wong
2026-03-03 15:46     ` Christoph Hellwig
2026-03-03  0:35   ` [PATCH 08/26] xfs_io: add a media verify command Darrick J. Wong
2026-03-03 15:46     ` Christoph Hellwig
2026-03-03  0:36   ` [PATCH 09/26] xfs_healer: create daemon to listen for health events Darrick J. Wong
2026-03-03 15:47     ` Christoph Hellwig
2026-03-03  0:36   ` [PATCH 10/26] xfs_healer: enable repairing filesystems Darrick J. Wong
2026-03-03 15:47     ` Christoph Hellwig
2026-03-03  0:36   ` [PATCH 11/26] xfs_healer: use getparents to look up file names Darrick J. Wong
2026-03-03 15:48     ` Christoph Hellwig
2026-03-03  0:36   ` [PATCH 12/26] xfs_healer: create a per-mount background monitoring service Darrick J. Wong
2026-03-03 15:48     ` Christoph Hellwig
2026-03-03  0:37   ` [PATCH 13/26] xfs_healer: create a service to start the per-mount healer service Darrick J. Wong
2026-03-03 15:49     ` Christoph Hellwig
2026-03-03 16:52       ` Darrick J. Wong
2026-03-03 16:54         ` Christoph Hellwig
2026-03-03 17:06           ` Darrick J. Wong
2026-03-03  0:37   ` [PATCH 14/26] xfs_healer: don't start service if kernel support unavailable Darrick J. Wong
2026-03-03 15:49     ` Christoph Hellwig
2026-03-03  0:37   ` [PATCH 15/26] xfs_healer: use the autofsck fsproperty to select mode Darrick J. Wong
2026-03-03 15:50     ` Christoph Hellwig
2026-03-03  0:38   ` [PATCH 16/26] xfs_healer: run full scrub after lost corruption events or targeted repair failure Darrick J. Wong
2026-03-03 15:50     ` Christoph Hellwig
2026-03-03  0:38   ` [PATCH 17/26] xfs_healer: use getmntent to find moved filesystems Darrick J. Wong
2026-03-03 15:51     ` Christoph Hellwig
2026-03-03 17:26       ` Darrick J. Wong
2026-03-04 13:03         ` Christoph Hellwig
2026-03-04 16:30           ` Darrick J. Wong
2026-03-05 14:00             ` Christoph Hellwig
2026-03-05 17:55               ` Darrick J. Wong
2026-03-03  0:38   ` [PATCH 18/26] xfs_healer: validate that repair fds point to the monitored fs Darrick J. Wong
2026-03-03 15:52     ` Christoph Hellwig
2026-03-03  0:38   ` [PATCH 19/26] xfs_healer: add a manual page Darrick J. Wong
2026-03-03 15:52     ` Christoph Hellwig
2026-03-03  0:39   ` [PATCH 20/26] xfs_scrub: use the verify media ioctl during phase 6 if possible Darrick J. Wong
2026-03-03 15:53     ` Christoph Hellwig
2026-03-03 16:59       ` Darrick J. Wong
2026-03-03  0:39   ` [PATCH 21/26] xfs_scrub: perform media scanning of the log region Darrick J. Wong
2026-03-03 15:54     ` Christoph Hellwig
2026-03-03  0:39   ` [PATCH 22/26] xfs_io: add listmount command Darrick J. Wong
2026-03-03 15:56     ` Christoph Hellwig
2026-03-03 17:08       ` Darrick J. Wong
2026-03-03  0:39   ` [PATCH 23/26] xfs_io: print systemd service names Darrick J. Wong
2026-03-03 15:57     ` Christoph Hellwig
2026-03-03 17:29       ` Darrick J. Wong
2026-03-04 13:04         ` Christoph Hellwig
2026-03-04 16:35           ` Darrick J. Wong
2026-03-05 13:55             ` Christoph Hellwig
2026-03-05 22:00               ` Darrick J. Wong
2026-03-06 14:20                 ` Christoph Hellwig
2026-03-06 15:58                   ` Darrick J. Wong
2026-03-03  0:40   ` [PATCH 24/26] mkfs: enable online repair if all backrefs are enabled Darrick J. Wong
2026-03-03 15:58     ` Christoph Hellwig
2026-03-03 17:32       ` Darrick J. Wong
2026-03-05 22:22         ` Darrick J. Wong
2026-03-03  0:40   ` [PATCH 25/26] debian: enable xfs_healer on the root filesystem by default Darrick J. Wong
2026-03-03 15:58     ` Christoph Hellwig
2026-03-03 17:14       ` Darrick J. Wong
2026-03-04 13:01         ` Christoph Hellwig
2026-03-05 22:10           ` Darrick J. Wong
2026-03-05 22:18             ` Darrick J. Wong
2026-03-03  0:40   ` [PATCH 26/26] debian/control: listify the build dependencies Darrick J. Wong
2026-03-03 15:58     ` Christoph Hellwig
2026-03-03 17:09       ` Darrick J. Wong
2026-03-03  0:33 ` [PATCHSET v8 1/2] fstests: test generic file IO error reporting Darrick J. Wong
2026-03-03  0:40   ` [PATCH 1/1] generic: test fsnotify filesystem " Darrick J. Wong
2026-03-03  9:21     ` Amir Goldstein
2026-03-03 14:51       ` Christoph Hellwig
2026-03-03 14:56         ` Amir Goldstein
2026-03-04 10:10         ` Jan Kara
2026-03-03 14:54     ` Christoph Hellwig
2026-03-03 16:06       ` Gabriel Krisman Bertazi
2026-03-03 16:12         ` Christoph Hellwig
2026-03-03 16:38           ` Darrick J. Wong
2026-03-03 16:49       ` Darrick J. Wong
2026-03-03 16:53         ` Christoph Hellwig
2026-03-03 17:59           ` Darrick J. Wong
2026-03-03  0:33 ` [PATCHSET v8 2/2] fstests: autonomous self healing of filesystems Darrick J. Wong
2026-03-03  0:41   ` [PATCH 01/13] xfs: test health monitoring code Darrick J. Wong
2026-03-09 17:21     ` Zorro Lang
2026-03-09 18:03       ` Darrick J. Wong
2026-03-03  0:41   ` [PATCH 02/13] xfs: test for metadata corruption error reporting via healthmon Darrick J. Wong
2026-03-03  0:41   ` [PATCH 03/13] xfs: test io " Darrick J. Wong
2026-03-03  0:41   ` [PATCH 04/13] xfs: set up common code for testing xfs_healer Darrick J. Wong
2026-03-03  0:42   ` [PATCH 05/13] xfs: test xfs_healer's event handling Darrick J. Wong
2026-03-03  0:42   ` [PATCH 06/13] xfs: test xfs_healer can fix a filesystem Darrick J. Wong
2026-03-03  0:42   ` [PATCH 07/13] xfs: test xfs_healer can report file I/O errors Darrick J. Wong
2026-03-03  0:42   ` [PATCH 08/13] xfs: test xfs_healer can report file media errors Darrick J. Wong
2026-03-03  0:43   ` [PATCH 09/13] xfs: test xfs_healer can report filesystem shutdowns Darrick J. Wong
2026-03-03  0:43   ` [PATCH 10/13] xfs: test xfs_healer can initiate full filesystem repairs Darrick J. Wong
2026-03-03  0:43   ` [PATCH 11/13] xfs: test xfs_healer can follow mount moves Darrick J. Wong
2026-03-03  0:43   ` [PATCH 12/13] xfs: test xfs_healer wont repair the wrong filesystem Darrick J. Wong
2026-03-03  0:44   ` [PATCH 13/13] xfs: test xfs_healer background service Darrick J. Wong
2026-03-03  0:47   ` [PATCH 14/13] xfs: test xfs_healer startup service Darrick J. Wong
  -- strict thread matches above, loose matches on Subject: below --
2026-03-19  4:38 [PATCHSET v10 1/2] xfsprogs: autonomous self healing of filesystems Darrick J. Wong
2026-03-19  4:39 ` [PATCH 04/26] libfrog: hoist a couple of service helper functions Darrick J. Wong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox