From: "Darrick J. Wong" <djwong@kernel.org>
To: aalbersh@kernel.org
Cc: linux-xfs@vger.kernel.org, Neal Gompa <neal@gompa.dev>
Subject: Re: [PATCHSET V2 2/2] xfsprogs: autonomous self healing of filesystems in Rust
Date: Tue, 4 Nov 2025 14:48:06 -0800
Message-ID: <20251104224806.GN196370@frogsfrogsfrogs>
In-Reply-To: <176117748158.1029045.18328755324893036160.stgit@frogsfrogsfrogs>
On Wed, Oct 22, 2025 at 05:00:20PM -0700, Darrick J. Wong wrote:
> Hi all,
>
> The initial implementation of the self healing daemon is written in
> Python. This was useful for rapid prototyping, but a more performant
> and typechecked codebase is valuable. Write a second implementation in
> Rust to reduce event processing overhead and library dependence. This
> could have been done in C, but I decided to use an environment with
> somewhat fewer footguns.
Having discarded the json output format last week, I decided to rewrite
the Python version of xfs_healer in C partly out of curiosity and partly
because I didn't see much advantage to having a Python script to call
ioctls and interpret C structs. After removing the json support from
the Rust version, the release binary sizes are:
-rwxr-xr-x root/root 1051096 2025-11-04 14:25 ./usr/libexec/xfsprogs/xfs_healer
-rwxr-xr-x root/root 43904 2025-11-04 14:25 ./usr/libexec/xfsprogs/xfs_healer.orig
This is a nearly 24x size increase to have Rust. I'm a n00b Rustacean
and a veteran C stuckee, but between that and the difficulties of
integrating two languages and two build systems together, I don't think
it's worth the trouble to keep the Rust code. I've made a final push
with the Rust code to my dev repo for the sake of posterity:
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=health-monitoring-rust_2025-11-04
But I'm deleting this from my tree after I send this email.
That said, I quite enjoyed using this as an excuse to familiarize myself
with how to write bad Rust code. Using traits and the newtype pattern
for geometric units (e.g. xfs_fsblock_t) was very helpful in keeping
unit conversions understandable; and having to think about object access
and lifetimes helped me produce a stable prototype very quickly. It
also helps that rustc errors are far more helpful than gcc's.
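For readers unfamiliar with the newtype pattern, here is a minimal sketch
of the idea. It is not taken from healer/src/xfs_types.rs; the names
Geometry, FsBlock, Bytes, and ToBytes are hypothetical, and the geometry
is reduced to a single block-size shift for illustration.

```rust
use std::ops::Add;

/// Simplified filesystem geometry (hypothetical; real geometry has more fields).
#[derive(Clone, Copy, Debug)]
struct Geometry {
    /// log2 of the filesystem block size in bytes, e.g. 12 for 4096-byte blocks.
    blocklog: u32,
}

/// A quantity measured in filesystem blocks, loosely modeled on xfs_fsblock_t.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct FsBlock(u64);

/// A quantity measured in bytes.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Bytes(u64);

/// Conversion to bytes for any unit that needs the geometry to do so.
trait ToBytes {
    fn to_bytes(self, geo: &Geometry) -> Bytes;
}

impl ToBytes for FsBlock {
    fn to_bytes(self, geo: &Geometry) -> Bytes {
        Bytes(self.0 << geo.blocklog)
    }
}

impl Bytes {
    /// Convert to filesystem blocks, rounding down to a whole block.
    fn to_fsblock(self, geo: &Geometry) -> FsBlock {
        FsBlock(self.0 >> geo.blocklog)
    }
}

// Only like units can be added; `FsBlock(1) + Bytes(1)` fails to compile.
impl Add for FsBlock {
    type Output = FsBlock;
    fn add(self, rhs: FsBlock) -> FsBlock {
        FsBlock(self.0 + rhs.0)
    }
}

fn main() {
    let geo = Geometry { blocklog: 12 };
    let start = FsBlock(10);
    let len = FsBlock(6);

    // The conversion is explicit and named, so the unit change is visible.
    let end = (start + len).to_bytes(&geo);
    println!("{:?}", end); // prints "Bytes(65536)"
    println!("{:?}", Bytes(65537).to_fsblock(&geo)); // prints "FsBlock(16)"
}
```

The point of the wrapper types is that mixing units becomes a compile
error instead of a silent arithmetic bug, at no runtime cost.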
The only thing I didn't particularly like is the forced coordination for
shared resources that already coordinate threads -- you can't easily
have multiple readers sharing an open fd, even if that magic fd only
emits struct sized objects and takes i_rwsem exclusively to prevent
corruption problems.
Dealing with cargo for a distro package build was nightmarish --
hermetically sealed build systems (you want this) can't access crates.io
which means that I as the author had to be careful only to use crate
packages that are in EPEL or Debian stable, and to tell cargo only to
look on the local filesystem. So I guess I now have experience in that,
should anyone want to know how to do that.
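As an illustration of telling cargo to look only on the local
filesystem, cargo's source replacement feature is one way to do it. The
fragment below is a sketch and not the contents of any file in the
patchset. It assumes the distro installs packaged crates under
/usr/share/cargo/registry (check your distro), and the source name
"system-crates" is made up.

```toml
# .cargo/config.toml (sketch; path and source name are assumptions)

# Redirect every crates.io lookup to a local directory source.
[source.crates-io]
replace-with = "system-crates"

# Directory of unpacked crates provided by distro packages.
[source.system-crates]
directory = "/usr/share/cargo/registry"

# Refuse to touch the network even if something asks for it.
[net]
offline = true
```

With a configuration like this, a dependency that is not packaged by the
distro fails at resolution time rather than being fetched.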
(Also, how do you do i18n in Rust programs? gettext???)
--D
> If you're going to start using this code, I strongly recommend pulling
> from my git trees, which are linked below.
>
> This has been running on the djcloud for months with no problems. Enjoy!
> Comments and questions are, as always, welcome.
>
> --D
>
> xfsprogs git tree:
> https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=health-monitoring-rust
> ---
> Commits in this patchset:
> * xfs_healer: start building a Rust version
> * xfs_healer: enable gettext for localization
> * xfs_healer: bindgen xfs_fs.h
> * xfs_healer: define Rust objects for health events and kernel interface
> * xfs_healer: read binary health events from the kernel
> * xfs_healer: read json health events from the kernel
> * xfs_healer: create a weak file handle so we don't pin the mount
> * xfs_healer: fix broken filesystem metadata
> * xfs_healer: check for fs features needed for effective repairs
> * xfs_healer: use getparents to look up file names
> * xfs_healer: make the rust program check if kernel support available
> * xfs_healer: use the autofsck fsproperty to select mode
> * xfs_healer: use rc on the mountpoint instead of lifetime annotations
> * xfs_healer: use thread pools
> * xfs_healer: run full scrub after lost corruption events or targeted repair failure
> * xfs_healer: use getmntent in Rust to find moved filesystems
> * xfs_healer: validate that repair fds point to the monitored fs in Rust
> * debian/control: listify the build dependencies
> * debian/control: pull in build dependencies for xfs_healer
> ---
> healer/bindgen_xfs_fs.h | 6 +
> configure.ac | 84 ++++++++
> debian/control | 30 +++
> debian/rules | 3
> healer/.cargo/config.toml.system | 6 +
> healer/Cargo.toml.in | 37 +++
> healer/Makefile | 143 +++++++++++++
> healer/rbindgen | 57 +++++
> healer/src/fsgeom.rs | 41 ++++
> healer/src/fsprops.rs | 101 +++++++++
> healer/src/getmntent.rs | 117 +++++++++++
> healer/src/getparents.rs | 210 ++++++++++++++++++++
> healer/src/healthmon/cstruct.rs | 354 +++++++++++++++++++++++++++++++++
> healer/src/healthmon/event.rs | 122 +++++++++++
> healer/src/healthmon/fs.rs | 163 +++++++++++++++
> healer/src/healthmon/groups.rs | 160 +++++++++++++++
> healer/src/healthmon/inodes.rs | 142 +++++++++++++
> healer/src/healthmon/json.rs | 409 ++++++++++++++++++++++++++++++++++++++
> healer/src/healthmon/mod.rs | 47 ++++
> healer/src/healthmon/samefs.rs | 33 +++
> healer/src/lib.rs | 17 ++
> healer/src/main.rs | 390 ++++++++++++++++++++++++++++++++++++
> healer/src/repair.rs | 390 ++++++++++++++++++++++++++++++++++++
> healer/src/util.rs | 81 ++++++++
> healer/src/weakhandle.rs | 209 +++++++++++++++++++
> healer/src/xfs_types.rs | 292 +++++++++++++++++++++++++++
> healer/src/xfsprogs.rs.in | 33 +++
> include/builddefs.in | 13 +
> include/buildrules | 1
> m4/Makefile | 1
> m4/package_rust.m4 | 163 +++++++++++++++
> 31 files changed, 3851 insertions(+), 4 deletions(-)
> create mode 100644 healer/bindgen_xfs_fs.h
> create mode 100644 healer/.cargo/config.toml.system
> create mode 100644 healer/Cargo.toml.in
> create mode 100755 healer/rbindgen
> create mode 100644 healer/src/fsgeom.rs
> create mode 100644 healer/src/fsprops.rs
> create mode 100644 healer/src/getmntent.rs
> create mode 100644 healer/src/getparents.rs
> create mode 100644 healer/src/healthmon/cstruct.rs
> create mode 100644 healer/src/healthmon/event.rs
> create mode 100644 healer/src/healthmon/fs.rs
> create mode 100644 healer/src/healthmon/groups.rs
> create mode 100644 healer/src/healthmon/inodes.rs
> create mode 100644 healer/src/healthmon/json.rs
> create mode 100644 healer/src/healthmon/mod.rs
> create mode 100644 healer/src/healthmon/samefs.rs
> create mode 100644 healer/src/lib.rs
> create mode 100644 healer/src/main.rs
> create mode 100644 healer/src/repair.rs
> create mode 100644 healer/src/util.rs
> create mode 100644 healer/src/weakhandle.rs
> create mode 100644 healer/src/xfs_types.rs
> create mode 100644 healer/src/xfsprogs.rs.in
> create mode 100644 m4/package_rust.m4
>
>