From: "Darrick J. Wong" <djwong@kernel.org>
To: Carlos Maiolino <cem@kernel.org>, Christoph Hellwig <hch@infradead.org>
Cc: xfs <linux-xfs@vger.kernel.org>,
	Chandan Babu R <chandanbabu@kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: [PATCHBOMB v2 6.19] xfs: autonomous self healing
Date: Tue, 4 Nov 2025 16:46:49 -0800
Message-ID: <20251105004649.GA196370@frogsfrogsfrogs>

Hi everyone,

You might recall that 18 months ago I showed off an early draft of a
patchset implementing autonomous self healing capabilities for XFS.
The premise is quite simple -- add a few hooks to the kernel to capture
significant filesystem metadata and file health events (pretty much all
failures), queue these events to a special anonfd, and let userspace
read the events at its leisure.  That's patchset 1.
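
To make the "queue events to an fd, let userspace read at its leisure"
model concrete, here is a minimal sketch of the userspace side.  The
record layout, the names demo_health_event, demo_drain_events and
demo_sum_detail, and the fixed-size framing are all assumptions made for
illustration; the real healthmon event format is defined by the patches
and is not reproduced here.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical fixed-size record; NOT the real healthmon format. */
struct demo_health_event {
	uint32_t type;		/* hypothetical event class */
	uint32_t domain;	/* hypothetical: which part of the fs */
	uint64_t detail;	/* hypothetical payload */
};

typedef void (*demo_event_fn)(const struct demo_health_event *ev, void *priv);

/* Example handler: add up the detail fields of every event seen. */
static void demo_sum_detail(const struct demo_health_event *ev, void *priv)
{
	*(uint64_t *)priv += ev->detail;
}

/*
 * Wait up to timeout_ms for the first event, then drain whatever is
 * already queued without blocking again.  Returns the number of events
 * handled, or -1 with errno set.
 */
static long demo_drain_events(int fd, int timeout_ms, demo_event_fn fn,
			      void *priv)
{
	struct demo_health_event buf[16];
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	long handled = 0;

	assert(fn != NULL);
	for (;;) {
		int ready = poll(&pfd, 1, timeout_ms);

		if (ready < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (ready == 0)
			return handled;		/* nothing (more) queued */

		ssize_t n = read(fd, buf, sizeof(buf));

		if (n < 0) {
			if (errno == EINTR || errno == EAGAIN)
				continue;
			return -1;
		}
		if (n == 0)
			return handled;		/* writer went away */
		if ((size_t)n % sizeof(buf[0]) != 0) {
			errno = EIO;		/* torn record */
			return -1;
		}
		for (size_t i = 0; i < (size_t)n / sizeof(buf[0]); i++)
			fn(&buf[i], priv);
		handled += n / (ssize_t)sizeof(buf[0]);
		timeout_ms = 0;			/* only drain from here on */
	}
}
```

Any readable fd works for trying this out; a pipe(2) makes a convenient
stand-in for the kernel-provided anonfd.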

Since the previous release, I've removed all the JSON event generation
code and made media errors use the rmap btree to report file data loss.
I also ported the userspace program to C.  I'm not going to blast
everyone with the full set; just know that the C version is here:

https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=health-monitoring
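
As a rough illustration of why reverse-mapping data lets a media error
be reported as file data loss, the toy model below intersects a bad
physical range with a list of "physical extent -> owning file range"
records.  Everything here (demo_rmap, demo_loss, demo_rmap_query, the
flat array and the linear scan) is hypothetical; the in-kernel rmap
btree is an ordered on-disk structure with a different record format
and query interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy reverse-mapping record: a physical extent and the file range it backs. */
struct demo_rmap {
	uint64_t phys_start;	/* first physical block */
	uint64_t len;		/* length in blocks */
	uint64_t owner_ino;	/* owning inode number */
	uint64_t file_off;	/* file block offset that maps to phys_start */
};

/* One affected file range produced by the query. */
struct demo_loss {
	uint64_t owner_ino;
	uint64_t file_off;
	uint64_t len;
};

/*
 * Report every file range that overlaps the bad physical range
 * [bad_start, bad_start + bad_len).  Returns the number of entries
 * written to out, at most max_out.
 */
static size_t demo_rmap_query(const struct demo_rmap *recs, size_t nr,
			      uint64_t bad_start, uint64_t bad_len,
			      struct demo_loss *out, size_t max_out)
{
	uint64_t bad_end = bad_start + bad_len;
	size_t found = 0;

	assert(out != NULL || max_out == 0);
	for (size_t i = 0; i < nr && found < max_out; i++) {
		uint64_t rec_end = recs[i].phys_start + recs[i].len;
		uint64_t lo = recs[i].phys_start > bad_start ?
				recs[i].phys_start : bad_start;
		uint64_t hi = rec_end < bad_end ? rec_end : bad_end;

		if (lo >= hi)
			continue;	/* no overlap with this extent */

		out[found].owner_ino = recs[i].owner_ino;
		/* Shift the file offset by how far into the extent we start. */
		out[found].file_off = recs[i].file_off +
				      (lo - recs[i].phys_start);
		out[found].len = hi - lo;
		found++;
	}
	return found;
}
```

A bad range that straddles two extents yields two entries, one per
owning file, each with the file offset and length of the damaged part.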

Patchset 2 is now a cleanup of the file IO error hooks in patchset 1 to
use a more generic interface and to call fsnotify with the error
reports.  This means that the fsnotify filesystem error functionality
conveys generic errors to unprivileged userspace programs, but I'm
leaving the privileged healthmon interface so that xfsprogs can figure
out which specific part of the filesystem needs fixing.
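
For readers unfamiliar with the existing fsnotify error reporting, the
sketch below walks a buffer as returned by read(2) on a fanotify fd and
pulls out the first FAN_EVENT_INFO_TYPE_ERROR record of a FAN_FS_ERROR
event.  That the reports added by this series arrive in exactly this
form is an assumption on my part; the helper name demo_find_fs_error is
made up, and the fanotify_init()/fanotify_mark() setup is omitted.  It
needs userspace headers new enough to define FAN_FS_ERROR.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/fanotify.h>

/*
 * Scan a fanotify read buffer for the first error info record that
 * belongs to a FAN_FS_ERROR event.  Copies the record into *out and
 * returns 1 if one is found, 0 otherwise.
 */
static int demo_find_fs_error(const char *buf, size_t len,
			      struct fanotify_event_info_error *out)
{
	size_t pos = 0;

	assert(buf != NULL && out != NULL);
	while (len - pos >= sizeof(struct fanotify_event_metadata)) {
		struct fanotify_event_metadata md;

		memcpy(&md, buf + pos, sizeof(md));
		if (md.event_len < sizeof(md) || md.event_len > len - pos ||
		    md.metadata_len < sizeof(md))
			break;		/* malformed event, stop scanning */

		if (md.mask & FAN_FS_ERROR) {
			size_t off = md.metadata_len;

			/* Info records follow the fixed metadata header. */
			while (off + sizeof(struct fanotify_event_info_header)
					<= md.event_len) {
				struct fanotify_event_info_header hdr;

				memcpy(&hdr, buf + pos + off, sizeof(hdr));
				if (hdr.len < sizeof(hdr) ||
				    hdr.len > md.event_len - off)
					break;	/* malformed record */
				if (hdr.info_type == FAN_EVENT_INFO_TYPE_ERROR &&
				    hdr.len >= sizeof(*out)) {
					memcpy(out, buf + pos + off,
					       sizeof(*out));
					return 1;
				}
				off += hdr.len;
			}
		}
		pos += md.event_len;
	}
	return 0;
}
```

The error field of the returned record carries an errno-style code and
error_count says how many errors were folded into the event, which is
the "generic" level of detail; locating the damaged structure is left
to the healthmon interface.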

This work was mostly complete by the end of 2024, and I've been letting
it run on my XFS QA testing fleets ever since.  I am submitting
this patchset upstream for 6.19.  Once this is merged, the online
fsck project will be complete.

--D

The unreviewed patches in this series are:

[PATCHSET V3 1/2] xfs: autonomous self healing of filesystems
  [PATCH 02/22] docs: discuss autonomous self healing in the xfs online repair design doc
  [PATCH 03/22] xfs: create debugfs uuid aliases
  [PATCH 04/22] xfs: create hooks for monitoring health updates
  [PATCH 05/22] xfs: create a filesystem shutdown hook
  [PATCH 06/22] xfs: create hooks for media errors
  [PATCH 07/22] iomap: report buffered read and write io errors to the filesystem
  [PATCH 08/22] iomap: report directio read and write errors to callers
  [PATCH 09/22] xfs: create file io error hooks
  [PATCH 10/22] xfs: create a special file to pass filesystem health to userspace
  [PATCH 11/22] xfs: create event queuing, formatting, and discovery infrastructure
  [PATCH 12/22] xfs: report metadata health events through healthmon
  [PATCH 13/22] xfs: report shutdown events through healthmon
  [PATCH 14/22] xfs: report media errors through healthmon
  [PATCH 15/22] xfs: report file io errors through healthmon
  [PATCH 16/22] xfs: allow reconfiguration of the health monitoring device
  [PATCH 17/22] xfs: validate fds against running healthmon
  [PATCH 18/22] xfs: add media error reporting ioctl
  [PATCH 19/22] xfs: send uevents when major filesystem events happen
  [PATCH 20/22] xfs: merge health monitoring events when possible
  [PATCH 21/22] xfs: restrict healthmon users further
  [PATCH 22/22] xfs: charge healthmon event objects to the memcg of the listening process
[PATCHSET V3 2/2] iomap: generic file IO error reporting
  [PATCH 1/6] iomap: report file IO errors to fsnotify
  [PATCH 2/6] xfs: switch healthmon to use the iomap I/O error reporting
  [PATCH 3/6] xfs: port notify-failure to use the new vfs io error reporting
  [PATCH 4/6] xfs: remove file I/O error hooks
  [PATCH 5/6] iomap: remove I/O error hooks
  [PATCH 6/6] xfs: report fs metadata errors via fsnotify


Thread overview: 38+ messages
2025-11-05  0:46 Darrick J. Wong [this message]
2025-11-05  0:48 ` [PATCHSET V3 1/2] xfs: autonomous self healing of filesystems Darrick J. Wong
2025-11-05  0:48   ` [PATCH 01/22] docs: remove obsolete links in the xfs online repair documentation Darrick J. Wong
2025-11-05  0:48   ` [PATCH 02/22] docs: discuss autonomous self healing in the xfs online repair design doc Darrick J. Wong
2025-11-05  0:49   ` [PATCH 03/22] xfs: create debugfs uuid aliases Darrick J. Wong
2025-11-05  0:49   ` [PATCH 04/22] xfs: create hooks for monitoring health updates Darrick J. Wong
2025-11-05  0:49   ` [PATCH 05/22] xfs: create a filesystem shutdown hook Darrick J. Wong
2025-11-05  0:49   ` [PATCH 06/22] xfs: create hooks for media errors Darrick J. Wong
2025-11-05  0:50   ` [PATCH 07/22] iomap: report buffered read and write io errors to the filesystem Darrick J. Wong
2025-11-05  0:50   ` [PATCH 08/22] iomap: report directio read and write errors to callers Darrick J. Wong
2025-11-05  0:50   ` [PATCH 09/22] xfs: create file io error hooks Darrick J. Wong
2025-11-05  0:51   ` [PATCH 10/22] xfs: create a special file to pass filesystem health to userspace Darrick J. Wong
2025-11-05  0:51   ` [PATCH 11/22] xfs: create event queuing, formatting, and discovery infrastructure Darrick J. Wong
2025-11-05  0:51   ` [PATCH 12/22] xfs: report metadata health events through healthmon Darrick J. Wong
2025-11-05  0:51   ` [PATCH 13/22] xfs: report shutdown " Darrick J. Wong
2025-11-05  0:52   ` [PATCH 14/22] xfs: report media errors " Darrick J. Wong
2025-11-05  0:52   ` [PATCH 15/22] xfs: report file io " Darrick J. Wong
2025-11-05  0:52   ` [PATCH 16/22] xfs: allow reconfiguration of the health monitoring device Darrick J. Wong
2025-11-05  0:52   ` [PATCH 17/22] xfs: validate fds against running healthmon Darrick J. Wong
2025-11-05  0:53   ` [PATCH 18/22] xfs: add media error reporting ioctl Darrick J. Wong
2025-11-05  0:53   ` [PATCH 19/22] xfs: send uevents when major filesystem events happen Darrick J. Wong
2025-11-05  0:53   ` [PATCH 20/22] xfs: merge health monitoring events when possible Darrick J. Wong
2025-11-05  0:53   ` [PATCH 21/22] xfs: restrict healthmon users further Darrick J. Wong
2025-11-05  0:54   ` [PATCH 22/22] xfs: charge healthmon event objects to the memcg of the listening process Darrick J. Wong
2025-11-05  0:48 ` [PATCHSET V3 2/2] iomap: generic file IO error reporting Darrick J. Wong
2025-11-05  0:54   ` [PATCH 1/6] iomap: report file IO errors to fsnotify Darrick J. Wong
2025-11-05 11:00     ` Jan Kara
2025-11-05 11:14       ` Amir Goldstein
2025-11-05 14:24         ` Jan Kara
2025-11-05 18:28           ` Darrick J. Wong
2025-11-05 19:41             ` Darrick J. Wong
2025-11-06 10:13               ` Jan Kara
2025-11-06 17:06                 ` Darrick J. Wong
2025-11-05  0:54   ` [PATCH 2/6] xfs: switch healthmon to use the iomap I/O error reporting Darrick J. Wong
2025-11-05  0:54   ` [PATCH 3/6] xfs: port notify-failure to use the new vfs io " Darrick J. Wong
2025-11-05  0:55   ` [PATCH 4/6] xfs: remove file I/O error hooks Darrick J. Wong
2025-11-05  0:55   ` [PATCH 5/6] iomap: remove " Darrick J. Wong
2025-11-05  0:55   ` [PATCH 6/6] xfs: report fs metadata errors via fsnotify Darrick J. Wong
