public inbox for linux-xfs@vger.kernel.org
* [PATCHBOMB v10] xfsprogs: autonomous self healing of filesystems
@ 2026-03-19  4:37 Darrick J. Wong
  2026-03-19  4:38 ` [PATCHSET v10 1/2] " Darrick J. Wong
  2026-03-19  4:38 ` [PATCHSET v10 2/2] xfs_scrub: refactor to XFS_IOC_VERIFY_MEDIA Darrick J. Wong
  0 siblings, 2 replies; 71+ messages in thread
From: Darrick J. Wong @ 2026-03-19  4:37 UTC (permalink / raw)
  To: Andrey Albershteyn; +Cc: cem, hch, linux-xfs, Zorro Lang

Hi all,

This patchset contains the userspace and QA changes (xfs_healer) needed
to put to use the new kernel functionality (xfs_healthmon.c) that
delivers live information about filesystem health events to userspace,
plus a lot of cleanups to xfs_scrub's media verification.

In userspace, we create a new daemon program that will read the event
objects and initiate repairs automatically.  This daemon is managed
entirely by systemd and will not block unmounting of the filesystem
unless repairs are ongoing.  Daemon instances are auto-started by a
starter service that uses fanotify.
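
As a rough illustration of the event-to-repair flow described above,
here is a minimal Python model.  It is not the xfs_healer code: the
HealthEvent type, the event kind strings, and the repair callback are
hypothetical names chosen for this sketch.  The fallback to a full scrub
after lost events or a failed targeted repair follows the subject of
patch 17 in this series.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable, List


class Action(Enum):
    IGNORE = auto()
    TARGETED_REPAIR = auto()
    FULL_SCRUB = auto()


@dataclass(frozen=True)
class HealthEvent:
    # Hypothetical event kinds: "corrupt", "lost", or anything else.
    kind: str
    # Hypothetical description of the damaged object.
    target: str


def handle_event(event: HealthEvent,
                 repair: Callable[[str], bool]) -> List[Action]:
    """Return the actions taken in response to one health event.

    repair() stands in for whatever performs a targeted repair and
    returns True on success.
    """
    if event.kind == "lost":
        # Events were dropped, so we no longer know what is damaged.
        return [Action.FULL_SCRUB]
    if event.kind != "corrupt":
        # Nothing to fix for non-corruption events.
        return [Action.IGNORE]
    if repair(event.target):
        return [Action.TARGETED_REPAIR]
    # The targeted repair failed; fall back to scanning everything.
    return [Action.TARGETED_REPAIR, Action.FULL_SCRUB]
```

Passing the repair step in as a callback keeps the decision logic
separate from the mechanism that actually fixes metadata, which makes
the fallback behaviour easy to exercise without a filesystem.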

When the patchsets under this cover letter are merged, online fsck for
XFS will at long last be fully feature complete.  The passive scan parts
have been done since mid-2024; this final part adds proactive repair.

Here's what's left to review.  Thanks to Christoph for doing a bunch of
xfs_healer reviews and sharing some cleanups he wanted to see in
xfs_scrub, and to Zorro for merging the fstests.

[PATCHSET v10 1/2] xfsprogs: autonomous self healing of filesystems
  [PATCH 19/26] xfs_healer: use statmount to find moved filesystems
[PATCHSET v10 2/2] xfs_scrub: refactor to XFS_IOC_VERIFY_MEDIA
  [PATCH 01/22] libfrog: allow bitmap_free to handle a null bitmap
  [PATCH 02/22] mkfs: rename byte unit conversion macros
  [PATCH 03/22] libfrog: lift *BYTES helpers to convert.h
  [PATCH 04/22] xfs_scrub: report truncated devices as media errors
  [PATCH 05/22] xfs_scrub: fix i18n of the decode_special_owner return
  [PATCH 07/22] xfs_scrub: move read verification scheduling to phase6.c
  [PATCH 09/22] xfs_scrub: don't pass the io_end_arg around everywhere
  [PATCH 11/22] xfs_scrub: rename nr_io_threads
  [PATCH 16/22] xfs_scrub: perform media scanning of the log region
  [PATCH 17/22] xfs_scrub: index read-verify pools by xfs_device ids
  [PATCH 18/22] xfs_scrub: move failmap and other outputs into read_verify_pool
  [PATCH 19/22] xfs_scrub: clean up device-related error messages
  [PATCH 20/22] xfs_scrub: drop SCSI_VERIFY code from disk.
  [PATCH 21/22] xfs_scrub: raise media verification IO limits
  [PATCH 22/22] xfs_scrub: allow overrides of the media verification IO

v10: cleanups of the media verification code in xfs_scrub
v9: reorg listmount/statmount, use it to find moved mounts, improve the
    commit messages and documentation
v8: clean up userspace for merging now that the kernel part is upstream
v7: more cleanups of the media verification ioctl, improve comments, and
    reuse the bio
v6: fix pi-breaking bugs, make verify failures trigger health reports
    and filter bio status flags better
v5: add verify-media ioctl, collapse small helper funcs with only
    one caller
v4: drop multiple client support so we can make direct calls into
    healthmon instead of chasing pointers and doing indirect calls
v3: drag out of rfc status

--D

Thread overview: 71+ messages
2026-03-19  4:37 [PATCHBOMB v10] xfsprogs: autonomous self healing of filesystems Darrick J. Wong
2026-03-19  4:38 ` [PATCHSET v10 1/2] " Darrick J. Wong
2026-03-19  4:39   ` [PATCH 01/26] libfrog: add a function to grab the path from an open fd and a file handle Darrick J. Wong
2026-03-19  4:39   ` [PATCH 02/26] libfrog: create healthmon event log library functions Darrick J. Wong
2026-03-19  4:39   ` [PATCH 03/26] libfrog: add support code for starting systemd services programmatically Darrick J. Wong
2026-03-19  4:39   ` [PATCH 04/26] libfrog: hoist a couple of service helper functions Darrick J. Wong
2026-03-19  4:40   ` [PATCH 05/26] libfrog: add wrappers for listmount and statmount Darrick J. Wong
2026-03-19  4:40   ` [PATCH 06/26] man2: document the healthmon ioctl Darrick J. Wong
2026-03-19  4:40   ` [PATCH 07/26] man2: document the media verification ioctl Darrick J. Wong
2026-03-19  4:40   ` [PATCH 08/26] xfs_io: monitor filesystem health events Darrick J. Wong
2026-03-19  4:41   ` [PATCH 09/26] xfs_io: add a media verify command Darrick J. Wong
2026-03-19  4:41   ` [PATCH 10/26] xfs_healer: create daemon to listen for health events Darrick J. Wong
2026-03-19  4:41   ` [PATCH 11/26] xfs_healer: enable repairing filesystems Darrick J. Wong
2026-03-19  4:41   ` [PATCH 12/26] xfs_healer: use getparents to look up file names Darrick J. Wong
2026-03-19  4:42   ` [PATCH 13/26] xfs_healer: create a per-mount background monitoring service Darrick J. Wong
2026-03-19  4:42   ` [PATCH 14/26] xfs_healer: create a service to start the per-mount healer service Darrick J. Wong
2026-03-19  4:42   ` [PATCH 15/26] xfs_healer: don't start service if kernel support unavailable Darrick J. Wong
2026-03-19  4:42   ` [PATCH 16/26] xfs_healer: use the autofsck fsproperty to select mode Darrick J. Wong
2026-03-19  4:43   ` [PATCH 17/26] xfs_healer: run full scrub after lost corruption events or targeted repair failure Darrick J. Wong
2026-03-19  4:43   ` [PATCH 18/26] xfs_healer: use getmntent to find moved filesystems Darrick J. Wong
2026-03-19  4:43   ` [PATCH 19/26] xfs_healer: use statmount to find moved filesystems even faster Darrick J. Wong
2026-03-20  7:11     ` Christoph Hellwig
2026-03-19  4:43   ` [PATCH 20/26] xfs_healer: validate that repair fds point to the monitored fs Darrick J. Wong
2026-03-19  4:44   ` [PATCH 21/26] xfs_healer: add a manual page Darrick J. Wong
2026-03-19  4:44   ` [PATCH 22/26] xfs_scrub: print systemd service names Darrick J. Wong
2026-03-19  4:44   ` [PATCH 23/26] xfs_io: add listmount and statmount commands Darrick J. Wong
2026-03-19  4:45   ` [PATCH 24/26] mkfs: enable online repair if all backrefs are enabled Darrick J. Wong
2026-03-19  4:45   ` [PATCH 25/26] debian/control: listify the build dependencies Darrick J. Wong
2026-03-19  4:45   ` [PATCH 26/26] debian: enable xfs_healer on the root filesystem by default Darrick J. Wong
2026-03-19  4:38 ` [PATCHSET v10 2/2] xfs_scrub: refactor to XFS_IOC_VERIFY_MEDIA Darrick J. Wong
2026-03-19  4:45   ` [PATCH 01/22] libfrog: allow bitmap_free to handle a null bitmap pointer Darrick J. Wong
2026-03-20  7:12     ` Christoph Hellwig
2026-03-19  4:46   ` [PATCH 02/22] mkfs: rename byte unit conversion macros Darrick J. Wong
2026-03-20  7:12     ` Christoph Hellwig
2026-03-19  4:46   ` [PATCH 03/22] libfrog: lift *BYTES helpers to convert.h Darrick J. Wong
2026-03-20  7:12     ` Christoph Hellwig
2026-03-19  4:46   ` [PATCH 04/22] xfs_scrub: report truncated devices as media errors Darrick J. Wong
2026-03-20  7:13     ` Christoph Hellwig
2026-03-19  4:46   ` [PATCH 05/22] xfs_scrub: fix i18n of the decode_special_owner return value Darrick J. Wong
2026-03-20  7:13     ` Christoph Hellwig
2026-03-19  4:47   ` [PATCH 06/22] scrub: remove the unused io_disk field in struct read_verify Darrick J. Wong
2026-03-19  4:47   ` [PATCH 07/22] xfs_scrub: move read verification scheduling to phase6.c Darrick J. Wong
2026-03-20  7:14     ` Christoph Hellwig
2026-03-19  4:47   ` [PATCH 08/22] scrub: simplify the read_verify_pool_alloc interface Darrick J. Wong
2026-03-19  4:47   ` [PATCH 09/22] xfs_scrub: don't pass the io_end_arg around everywhere Darrick J. Wong
2026-03-20  7:14     ` Christoph Hellwig
2026-03-19  4:48   ` [PATCH 10/22] scrub: use enum xfs_device for read verification Darrick J. Wong
2026-03-19  4:48   ` [PATCH 11/22] xfs_scrub: rename nr_io_threads Darrick J. Wong
2026-03-20  7:14     ` Christoph Hellwig
2026-03-19  4:48   ` [PATCH 12/22] scrub: simplify verifier threads calculation Darrick J. Wong
2026-03-19  4:48   ` [PATCH 13/22] xfs_scrub: move disk media verification error injection Darrick J. Wong
2026-03-19  4:49   ` [PATCH 14/22] xfs_scrub: use the verify media ioctl during phase 6 if possible Darrick J. Wong
2026-03-19  4:49   ` [PATCH 15/22] scrub: don't allocate disk for ioctl-based media verify Darrick J. Wong
2026-03-19  4:49   ` [PATCH 16/22] xfs_scrub: perform media scanning of the log region Darrick J. Wong
2026-03-20  7:15     ` Christoph Hellwig
2026-03-19  4:49   ` [PATCH 17/22] xfs_scrub: index read-verify pools by xfs_device ids Darrick J. Wong
2026-03-20  7:15     ` Christoph Hellwig
2026-03-19  4:50   ` [PATCH 18/22] xfs_scrub: move failmap and other outputs into read_verify_pool Darrick J. Wong
2026-03-20  7:15     ` Christoph Hellwig
2026-03-19  4:50   ` [PATCH 19/22] xfs_scrub: clean up device-related error messages Darrick J. Wong
2026-03-20  7:15     ` Christoph Hellwig
2026-03-19  4:50   ` [PATCH 20/22] xfs_scrub: drop SCSI_VERIFY code from disk Darrick J. Wong
2026-03-20  7:16     ` Christoph Hellwig
2026-03-19  4:51   ` [PATCH 21/22] xfs_scrub: raise media verification IO limits Darrick J. Wong
2026-03-20  7:16     ` Christoph Hellwig
2026-03-20 15:46       ` Darrick J. Wong
2026-03-19  4:51   ` [PATCH 22/22] xfs_scrub: allow overrides of the " Darrick J. Wong
2026-03-20  7:17     ` Christoph Hellwig
2026-03-20 15:44       ` Darrick J. Wong
2026-03-23  6:08         ` Christoph Hellwig
2026-03-23 15:18           ` Darrick J. Wong
