public inbox for linux-xfs@vger.kernel.org
* [PATCHBOMB 6.19] xfs: autonomous self healing
@ 2025-10-22 23:56 Darrick J. Wong
  2025-10-22 23:59 ` [PATCHSET V2] xfs: autonomous self healing of filesystems Darrick J. Wong
                   ` (3 more replies)
  0 siblings, 4 replies; 80+ messages in thread
From: Darrick J. Wong @ 2025-10-22 23:56 UTC
  To: Carlos Maiolino, Christoph Hellwig
  Cc: xfs, Chandan Babu R, linux-fsdevel, fstests

Hi everyone,

You might recall that 18 months ago I showed off an early draft of a
patchset implementing autonomous self healing capabilities for XFS.
The premise is quite simple -- add a few hooks to the kernel to capture
significant filesystem metadata and file health events (pretty much all
failures), queue these events to a special anonfd, and let userspace
read the events at its leisure.  That's patchset 1.

The userspace part is more interesting, because there's a new daemon
that opens the anonfd given the root dir of a filesystem, captures a
file handle for the root dir, detaches from the root dir, and waits for
metadata events.  Upon receipt of an adverse health event, it will
reopen the root directory and initiate repairs.  I've left the prototype
Python script in place (patchset 2) but my ultimate goal is for everyone
to use the Rust version (patchset 3) because it's much quicker to
respond to problems.
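
As a rough illustration of the event loop described above (not the
actual xfs_healer code), the sketch below reads newline-delimited JSON
events from a file descriptor and calls a repair callback for each
adverse one.  Everything specific in it is an assumption made for the
example: the event fields ("type", "domain", "ino"), the set of adverse
types, and the use of a pipe as a stand-in for the kernel's anonfd.
Capturing the root directory's file handle and reopening it before a
repair are left out.

```python
import json
import os

# Hypothetical event types that should trigger a repair.  The real
# healthmon schema is defined by the kernel patches, not by this sketch.
ADVERSE_TYPES = {"sick", "corrupt", "media_error", "shutdown"}


def is_adverse(event):
    """Return True if this (hypothetical) event calls for a repair."""
    return isinstance(event, dict) and event.get("type") in ADVERSE_TYPES


def monitor(event_fd, repair):
    """Read newline-delimited JSON events from event_fd until EOF.

    `repair` is called once per adverse event.  Returns the number of
    repairs attempted.
    """
    repairs = 0
    with os.fdopen(event_fd, "r") as stream:
        for line in stream:
            line = line.strip()
            if not line:
                continue
            try:
                event = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip anything that does not parse
            if is_adverse(event):
                repair(event)
                repairs += 1
    return repairs


def demo():
    """Feed two fake events through a pipe standing in for the anonfd."""
    rfd, wfd = os.pipe()
    with os.fdopen(wfd, "w") as writer:
        writer.write(json.dumps({"type": "healthy", "domain": "fs"}) + "\n")
        writer.write(
            json.dumps({"type": "sick", "domain": "inode", "ino": 128}) + "\n"
        )
    seen = []
    count = monitor(rfd, seen.append)
    return count, seen
```

Calling demo() pushes one healthy and one sick event through the loop;
only the sick event reaches the repair callback.  A real daemon would
block on the kernel fd indefinitely rather than stop at EOF.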

New QA tests are patchset 4.  Zorro: No need to merge this right away.

This work was mostly complete by the end of 2024, and I've been running
it on my XFS QA testing fleets ever since.  I am submitting this work
upstream for 6.19.  Once it is merged, the online fsck project will be
complete.

--D

end of thread, other threads:[~2025-12-01 21:55 UTC | newest]

Thread overview: 80+ messages
2025-10-22 23:56 [PATCHBOMB 6.19] xfs: autonomous self healing Darrick J. Wong
2025-10-22 23:59 ` [PATCHSET V2] xfs: autonomous self healing of filesystems Darrick J. Wong
2025-10-23  0:00   ` [PATCH 01/19] docs: remove obsolete links in the xfs online repair documentation Darrick J. Wong
2025-10-24  5:40     ` Christoph Hellwig
2025-10-27 16:15       ` Darrick J. Wong
2025-10-23  0:01   ` [PATCH 02/19] docs: discuss autonomous self healing in the xfs online repair design doc Darrick J. Wong
2025-10-30 16:38     ` Darrick J. Wong
2025-10-23  0:01   ` [PATCH 03/19] xfs: create debugfs uuid aliases Darrick J. Wong
2025-10-23  0:01   ` [PATCH 04/19] xfs: create hooks for monitoring health updates Darrick J. Wong
2025-10-23  0:01   ` [PATCH 05/19] xfs: create a filesystem shutdown hook Darrick J. Wong
2025-10-23  0:02   ` [PATCH 06/19] xfs: create hooks for media errors Darrick J. Wong
2025-10-23  0:02   ` [PATCH 07/19] iomap: report buffered read and write io errors to the filesystem Darrick J. Wong
2025-10-23  0:02   ` [PATCH 08/19] iomap: report directio read and write errors to callers Darrick J. Wong
2025-10-23  0:02   ` [PATCH 09/19] xfs: create file io error hooks Darrick J. Wong
2025-10-23  0:03   ` [PATCH 10/19] xfs: create a special file to pass filesystem health to userspace Darrick J. Wong
2025-10-23  0:03   ` [PATCH 11/19] xfs: create event queuing, formatting, and discovery infrastructure Darrick J. Wong
2025-10-30 16:54     ` Darrick J. Wong
2025-10-23  0:03   ` [PATCH 12/19] xfs: report metadata health events through healthmon Darrick J. Wong
2025-10-23  0:04   ` [PATCH 13/19] xfs: report shutdown " Darrick J. Wong
2025-10-23  0:04   ` [PATCH 14/19] xfs: report media errors " Darrick J. Wong
2025-10-23  0:04   ` [PATCH 15/19] xfs: report file io " Darrick J. Wong
2025-10-23  0:04   ` [PATCH 16/19] xfs: allow reconfiguration of the health monitoring device Darrick J. Wong
2025-10-23  0:05   ` [PATCH 17/19] xfs: validate fds against running healthmon Darrick J. Wong
2025-10-23  0:05   ` [PATCH 18/19] xfs: add media error reporting ioctl Darrick J. Wong
2025-10-23  0:05   ` [PATCH 19/19] xfs: send uevents when major filesystem events happen Darrick J. Wong
2025-10-23  0:00 ` [PATCHSET V2 1/2] xfsprogs: autonomous self healing of filesystems Darrick J. Wong
2025-10-23  0:05   ` [PATCH 01/26] xfs: create hooks for monitoring health updates Darrick J. Wong
2025-10-23  0:06   ` [PATCH 02/26] xfs: create a special file to pass filesystem health to userspace Darrick J. Wong
2025-10-23  0:06   ` [PATCH 03/26] xfs: create event queuing, formatting, and discovery infrastructure Darrick J. Wong
2025-10-23  0:06   ` [PATCH 04/26] xfs: report metadata health events through healthmon Darrick J. Wong
2025-10-23  0:06   ` [PATCH 05/26] xfs: report shutdown " Darrick J. Wong
2025-10-23  0:07   ` [PATCH 06/26] xfs: report media errors " Darrick J. Wong
2025-10-23  0:07   ` [PATCH 07/26] xfs: report file io " Darrick J. Wong
2025-10-23  0:07   ` [PATCH 08/26] xfs: validate fds against running healthmon Darrick J. Wong
2025-10-23  0:07   ` [PATCH 09/26] xfs: add media error reporting ioctl Darrick J. Wong
2025-10-23  0:08   ` [PATCH 10/26] xfs_io: monitor filesystem health events Darrick J. Wong
2025-10-23  0:08   ` [PATCH 11/26] xfs_io: add a media error reporting command Darrick J. Wong
2025-10-23  0:08   ` [PATCH 12/26] xfs_healer: create daemon to listen for health events Darrick J. Wong
2025-10-23  0:08   ` [PATCH 13/26] xfs_healer: check events against schema Darrick J. Wong
2025-10-23  0:09   ` [PATCH 14/26] xfs_healer: enable repairing filesystems Darrick J. Wong
2025-10-23  0:09   ` [PATCH 15/26] xfs_healer: check for fs features needed for effective repairs Darrick J. Wong
2025-10-23  0:09   ` [PATCH 16/26] xfs_healer: use getparents to look up file names Darrick J. Wong
2025-10-23  0:09   ` [PATCH 17/26] builddefs: refactor udev directory specification Darrick J. Wong
2025-10-23  0:10   ` [PATCH 18/26] xfs_healer: create a background monitoring service Darrick J. Wong
2025-10-23  0:10   ` [PATCH 19/26] xfs_healer: don't start service if kernel support unavailable Darrick J. Wong
2025-10-23  0:10   ` [PATCH 20/26] xfs_healer: use the autofsck fsproperty to select mode Darrick J. Wong
2025-10-23  0:11   ` [PATCH 21/26] xfs_healer: run full scrub after lost corruption events or targeted repair failure Darrick J. Wong
2025-10-23  0:11   ` [PATCH 22/26] xfs_healer: use getmntent to find moved filesystems Darrick J. Wong
2025-10-23  0:11   ` [PATCH 23/26] xfs_healer: validate that repair fds point to the monitored fs Darrick J. Wong
2025-10-23  0:11   ` [PATCH 24/26] xfs_healer: add a manual page Darrick J. Wong
2025-10-23  0:12   ` [PATCH 25/26] xfs_scrub: report media scrub failures to the kernel Darrick J. Wong
2025-10-23  0:12   ` [PATCH 26/26] debian: enable xfs_healer on the root filesystem by default Darrick J. Wong
2025-10-23  0:00 ` [PATCHSET V2 2/2] xfsprogs: autonomous self healing of filesystems in Rust Darrick J. Wong
2025-10-23  0:12   ` [PATCH 01/19] xfs_healer: start building a Rust version Darrick J. Wong
2025-10-23  0:12   ` [PATCH 02/19] xfs_healer: enable gettext for localization Darrick J. Wong
2025-10-23  0:13   ` [PATCH 03/19] xfs_healer: bindgen xfs_fs.h Darrick J. Wong
2025-10-23  0:13   ` [PATCH 04/19] xfs_healer: define Rust objects for health events and kernel interface Darrick J. Wong
2025-10-23  0:13   ` [PATCH 05/19] xfs_healer: read binary health events from the kernel Darrick J. Wong
2025-10-23  0:13   ` [PATCH 06/19] xfs_healer: read json " Darrick J. Wong
2025-10-23  0:14   ` [PATCH 07/19] xfs_healer: create a weak file handle so we don't pin the mount Darrick J. Wong
2025-10-23  0:14   ` [PATCH 08/19] xfs_healer: fix broken filesystem metadata Darrick J. Wong
2025-10-23  0:14   ` [PATCH 09/19] xfs_healer: check for fs features needed for effective repairs Darrick J. Wong
2025-10-23  0:14   ` [PATCH 10/19] xfs_healer: use getparents to look up file names Darrick J. Wong
2025-10-23  0:15   ` [PATCH 11/19] xfs_healer: make the rust program check if kernel support available Darrick J. Wong
2025-10-23  0:15   ` [PATCH 12/19] xfs_healer: use the autofsck fsproperty to select mode Darrick J. Wong
2025-10-23  0:15   ` [PATCH 13/19] xfs_healer: use rc on the mountpoint instead of lifetime annotations Darrick J. Wong
2025-10-23  0:15   ` [PATCH 14/19] xfs_healer: use thread pools Darrick J. Wong
2025-10-23  0:16   ` [PATCH 15/19] xfs_healer: run full scrub after lost corruption events or targeted repair failure Darrick J. Wong
2025-10-23  0:16   ` [PATCH 16/19] xfs_healer: use getmntent in Rust to find moved filesystems Darrick J. Wong
2025-10-23  0:16   ` [PATCH 17/19] xfs_healer: validate that repair fds point to the monitored fs in Rust Darrick J. Wong
2025-10-23  0:17   ` [PATCH 18/19] debian/control: listify the build dependencies Darrick J. Wong
2025-10-23  0:17   ` [PATCH 19/19] debian/control: pull in build dependencies for xfs_healer Darrick J. Wong
2025-11-04 22:48   ` [PATCHSET V2 2/2] xfsprogs: autonomous self healing of filesystems in Rust Darrick J. Wong
2025-12-01 17:59     ` Andrey Albershteyn
2025-12-01 21:55       ` Darrick J. Wong
2025-10-23  0:00 ` [PATCHSET V2] fstests: autonomous self healing of filesystems Darrick J. Wong
2025-10-23  0:17   ` [PATCH 1/4] xfs: test health monitoring code Darrick J. Wong
2025-10-23  0:17   ` [PATCH 2/4] xfs: test for metadata corruption error reporting via healthmon Darrick J. Wong
2025-10-23  0:18   ` [PATCH 3/4] xfs: test io " Darrick J. Wong
2025-10-23  0:18   ` [PATCH 4/4] xfs: test new xfs_healer daemon Darrick J. Wong
