From: "Darrick J. Wong" <djwong@kernel.org>
To: Christoph Hellwig <hch@lst.de>
Cc: cem@kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 02/11] xfs: start creating infrastructure for health monitoring
Date: Wed, 7 Jan 2026 10:50:55 -0800 [thread overview]
Message-ID: <20260107185055.GE15551@frogsfrogsfrogs> (raw)
In-Reply-To: <20260107091713.GB22838@lst.de>
On Wed, Jan 07, 2026 at 10:17:13AM +0100, Christoph Hellwig wrote:
> On Mon, Jan 05, 2026 at 11:11:08PM -0800, Darrick J. Wong wrote:
> > +struct xfs_health_monitor {
> > + __u64 flags; /* flags */
> > + __u8 format; /* output format */
> > + __u8 pad1[7]; /* zeroes */
> > + __u64 pad2[2]; /* zeroes */
> > +};
>
> Why not use a single __u8-based padding field?
Ok.
> > +struct xfs_healthmon {
> > + /*
> > + * Weak reference to the xfs filesystem that is being monitored. It
> > + * will be set to zero when the filesystem detaches from the monitor.
> > + * Do not dereference this pointer.
> > + */
> > + uintptr_t mount_cookie;
> > +
> > + /*
> > + * Device number of the filesystem being monitored. This is for
> > + * consistent tracing even after unmount.
> > + */
> > + dev_t dev;
>
> It isn't really used for tracking, but just in a single print, right?
It's used for tracepoints and fdinfo.
> > + * be parsed easily by userspace. Then we hook various parts of the filesystem
>
> Is the hooking terminology still right?
I still think of the entry points as hooks, but I'll reword it to avoid
confusion with the actual xfs_hooks:
"When those internal events occur, the filesystem will call this health
monitor to convey them to userspace."
> > + * The healthmon abstraction has a weak reference to the host filesystem mount
> > + * so that the queueing and processing of the events do not pin the mount and
> > + * cannot slow down the main filesystem. The healthmon object can exist past
> > + * the end of the filesystem mount.
> > + */
> > +
> > +/* sign of a detached health monitor */
> > +#define DETACHED_MOUNT_COOKIE ((uintptr_t)0)
>
> This almost looks like a not performance optimized version of hazard
> pointers. Not that we care much about performance here.
Yep. AFAIK the kernel doesn't have an actual hazard pointer
implementation that we could latch onto, right?
> > +/*
> > + * Free the health monitor after an RCU grace period to eliminate possibility
> > + * of races with xfs_healthmon_get.
> > + */
> > +static inline void
> > +xfs_healthmon_free(
> > + struct xfs_healthmon *hm)
> > +{
> > + kfree_rcu_mightsleep(hm);
> > +}
>
> Is there much of a point in this wrapper vs just open coding the call to
> kfree_rcu_mightsleep in the only caller?
No, and indeed this could be compressed into _healthmon_put:
if (refcount_dec_and_test(&hm->ref)) {
while (hm->first_event) {
/* free hm->first_event */
}
kfree(hm->buffer);
mutex_destroy(&hm->lock);
kfree_rcu_mightsleep(hm);
}
which would be much easier to think about.
> > +/* Is this health monitor active? */
> > +static inline bool
> > +xfs_healthmon_activated(
> > + struct xfs_healthmon *hm)
> > +{
> > + return hm->mount_cookie != DETACHED_MOUNT_COOKIE;
> > +}
> > +
> > +/* Is this health monitor watching the given filesystem? */
> > +static inline bool
> > +xfs_healthmon_covers_fs(
> > + struct xfs_healthmon *hm,
> > + struct super_block *sb)
> > +{
> > + return hm->mount_cookie == (uintptr_t)sb;
> > +}
>
> Is there much of a point in these helpers vs open coding them in the callers?
> (no caller yet in this patch of the second one anyway). Especially as we
> need to hold a lock for them to be safe.
The only one really worth keeping is _healthmon_activated because it
gets called from various places. And even then, it now only has three
callsites so maybe it'll just go away.
> > +
> > +/* Attach a health monitor to an xfs_mount. Only one allowed at a time. */
> > +STATIC int
> > +xfs_healthmon_attach(
> > + struct xfs_mount *mp,
> > + struct xfs_healthmon *hm)
> > +{
> > + int ret = 0;
> > +
> > + spin_lock(&xfs_healthmon_lock);
> > + if (mp->m_healthmon == NULL) {
> > + mp->m_healthmon = hm;
> > + hm->mount_cookie = (uintptr_t)mp->m_super;
> > + refcount_inc(&hm->ref);
> > + } else {
> > + ret = -EEXIST;
> > + }
> > + spin_unlock(&xfs_healthmon_lock);
> > +
> > + return ret;
>
> Maybe just me, but I'd do away with the ret variable and just handle the
> EEXIST case directly:
>
> spin_lock(&xfs_healthmon_lock);
> if (mp->m_healthmon) {
> spin_unlock(&xfs_healthmon_lock);
> return -EEXIST;
> }
> refcount_inc(&hm->ref);
> mp->m_healthmon = hm;
> hm->mount_cookie = (uintptr_t)mp->m_super;
> spin_unlock(&xfs_healthmon_lock);
> return 0;
>
> > +/* Detach a xfs mount from a specific healthmon instance. */
> > +STATIC void
> > +xfs_healthmon_detach(
> > + struct xfs_healthmon *hm)
> > +{
> > + spin_lock(&xfs_healthmon_lock);
> > + if (xfs_healthmon_activated(hm)) {
> > + struct xfs_mount *mp =
> > + XFS_M((struct super_block *)hm->mount_cookie);
> > +
> > + mp->m_healthmon = NULL;
> > + hm->mount_cookie = DETACHED_MOUNT_COOKIE;
> > + } else {
> > + hm = NULL;
> > + }
> > + spin_unlock(&xfs_healthmon_lock);
> > +
> > + if (hm)
> > + xfs_healthmon_put(hm);
> > +}
>
> Kinda similar here:
>
> struct xfs_mount *mp;
>
> spin_lock(&xfs_healthmon_lock);
> if (hm->mount_cookie == DETACHED_MOUNT_COOKIE) {
> spin_unlock(&xfs_healthmon_lock);
> return;
> }
>
> XFS_M((struct super_block *)hm->mount_cookie)->m_healthmon = NULL;
> hm->mount_cookie = DETACHED_MOUNT_COOKIE;
> spin_unlock(&xfs_healthmon_lock);
>
> xfs_healthmon_put(hm);
Will change. And get rid of some of the mount cookie helpers.
Thanks for reading!
--D
next prev parent reply other threads:[~2026-01-07 18:50 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-01-06 7:10 [PATCHSET V4] xfs: autonomous self healing of filesystems Darrick J. Wong
2026-01-06 7:10 ` [PATCH 01/11] docs: discuss autonomous self healing in the xfs online repair design doc Darrick J. Wong
2026-01-07 9:08 ` Christoph Hellwig
2026-01-07 19:15 ` Darrick J. Wong
2026-01-06 7:11 ` [PATCH 02/11] xfs: start creating infrastructure for health monitoring Darrick J. Wong
2026-01-07 9:17 ` Christoph Hellwig
2026-01-07 18:50 ` Darrick J. Wong [this message]
2026-01-08 10:21 ` Christoph Hellwig
2026-01-06 7:11 ` [PATCH 03/11] xfs: create event queuing, formatting, and discovery infrastructure Darrick J. Wong
2026-01-07 9:32 ` Christoph Hellwig
2026-01-07 19:01 ` Darrick J. Wong
2026-01-06 7:11 ` [PATCH 04/11] xfs: convey filesystem unmount events to the health monitor Darrick J. Wong
2026-01-06 7:11 ` [PATCH 05/11] xfs: convey metadata health " Darrick J. Wong
2026-01-06 7:12 ` [PATCH 06/11] xfs: convey filesystem shutdown " Darrick J. Wong
2026-01-06 7:12 ` [PATCH 07/11] xfs: convey externally discovered fsdax media errors " Darrick J. Wong
2026-01-06 7:12 ` [PATCH 08/11] xfs: convey file I/O " Darrick J. Wong
2026-01-06 7:12 ` [PATCH 09/11] xfs: allow reconfiguration of the health monitoring device Darrick J. Wong
2026-01-06 7:13 ` [PATCH 10/11] xfs: check if an open file is on the health monitored fs Darrick J. Wong
2026-01-06 7:13 ` [PATCH 11/11] xfs: add media error reporting ioctl Darrick J. Wong
2026-01-07 9:36 ` Christoph Hellwig
2026-01-07 16:30 ` Darrick J. Wong
2026-01-08 10:25 ` Christoph Hellwig
2026-01-08 16:09 ` Darrick J. Wong
2026-01-08 16:14 ` Christoph Hellwig
2026-01-08 16:18 ` Darrick J. Wong
2026-01-08 16:20 ` Christoph Hellwig
2026-01-08 16:53 ` Darrick J. Wong
2026-01-12 5:24 ` Darrick J. Wong
-- strict thread matches above, loose matches on Subject: below --
2026-01-13 0:32 [PATCHSET v5] xfs: autonomous self healing of filesystems Darrick J. Wong
2026-01-13 0:33 ` [PATCH 02/11] xfs: start creating infrastructure for health monitoring Darrick J. Wong
2026-01-13 16:03 ` Christoph Hellwig
2026-01-16 5:42 [PATCHSET v6] xfs: autonomous self healing of filesystems Darrick J. Wong
2026-01-16 5:42 ` [PATCH 02/11] xfs: start creating infrastructure for health monitoring Darrick J. Wong
2026-02-06 13:07 ` Pankaj Raghav (Samsung)
2026-02-06 17:47 ` Darrick J. Wong
2026-02-06 18:54 ` Pankaj Raghav
2026-02-06 20:41 ` Darrick J. Wong
2026-02-09 6:34 ` Christoph Hellwig
2026-02-10 4:57 ` Darrick J. Wong
2026-01-21 6:34 [PATCHSET v7 1/3] xfs: autonomous self healing of filesystems Darrick J. Wong
2026-01-21 6:35 ` [PATCH 02/11] xfs: start creating infrastructure for health monitoring Darrick J. Wong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260107185055.GE15551@frogsfrogsfrogs \
--to=djwong@kernel.org \
--cc=cem@kernel.org \
--cc=hch@lst.de \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox