Re: [PATCH 02/11] xfs: start creating infrastructure for health monitoring

public inbox for linux-fsdevel@vger.kernel.org
 help / color / mirror / Atom feed

From: "Darrick J. Wong" <djwong@kernel.org>
To: Christoph Hellwig <hch@lst.de>
Cc: cem@kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 02/11] xfs: start creating infrastructure for health monitoring
Date: Wed, 7 Jan 2026 10:50:55 -0800	[thread overview]
Message-ID: <20260107185055.GE15551@frogsfrogsfrogs> (raw)
In-Reply-To: <20260107091713.GB22838@lst.de>

On Wed, Jan 07, 2026 at 10:17:13AM +0100, Christoph Hellwig wrote:
> On Mon, Jan 05, 2026 at 11:11:08PM -0800, Darrick J. Wong wrote:
> > +struct xfs_health_monitor {
> > +	__u64	flags;		/* flags */
> > +	__u8	format;		/* output format */
> > +	__u8	pad1[7];	/* zeroes */
> > +	__u64	pad2[2];	/* zeroes */
> > +};
> 
> Why not use a single __u8-based padding field?

Ok.

> > +struct xfs_healthmon {
> > +	/*
> > +	 * Weak reference to the xfs filesystem that is being monitored.  It
> > +	 * will be set to zero when the filesystem detaches from the monitor.
> > +	 * Do not dereference this pointer.
> > +	 */
> > +	uintptr_t			mount_cookie;
> > +
> > +	/*
> > +	 * Device number of the filesystem being monitored.  This is for
> > +	 * consistent tracing even after unmount.
> > +	 */
> > +	dev_t				dev;
> 
> It isn't really used for tracking, but just in a single print, right?

It's used for tracepoints and fdinfo.

> > + * be parsed easily by userspace.  Then we hook various parts of the filesystem
> 
> Is the hooking terminology still right?

I still think of the entry points as hooks, but I'll reword it to avoid
confusion with the actual xfs_hooks:

"When those internal events occur, the filesystem will call this health
monitor to convey them to userspace."

> > + * The healthmon abstraction has a weak reference to the host filesystem mount
> > + * so that the queueing and processing of the events do not pin the mount and
> > + * cannot slow down the main filesystem.  The healthmon object can exist past
> > + * the end of the filesystem mount.
> > + */
> > +
> > +/* sign of a detached health monitor */
> > +#define DETACHED_MOUNT_COOKIE		((uintptr_t)0)
> 
> This almost looks like a not performance optimized version of hazard
> pointers.  Not that we care much about performance here.

Yep.  AFAIK the kernel doesn't have an actual hazard pointer
implementation that we could latch onto, right?

> > +/*
> > + * Free the health monitor after an RCU grace period to eliminate possibility
> > + * of races with xfs_healthmon_get.
> > + */
> > +static inline void
> > +xfs_healthmon_free(
> > +	struct xfs_healthmon		*hm)
> > +{
> > +	kfree_rcu_mightsleep(hm);
> > +}
> 
> Is there much of a point in this wrapper vs just open coding the call to
> kfree_rcu_mightsleep in the only caller?

No, and indeed this could be compressed into _healthmon_put:

	if (refcount_dec_and_test(&hm->ref)) {
		while (hm->first_event) {
			/* free hm->first_event */
		}
		kfree(hm->buffer);
		mutex_destroy(&hm->lock);
		kfree_rcu_mightsleep(hm);
	}

which would be much easier to think about.

> > +/* Is this health monitor active? */
> > +static inline bool
> > +xfs_healthmon_activated(
> > +	struct xfs_healthmon	*hm)
> > +{
> > +	return hm->mount_cookie != DETACHED_MOUNT_COOKIE;
> > +}
> > +
> > +/* Is this health monitor watching the given filesystem? */
> > +static inline bool
> > +xfs_healthmon_covers_fs(
> > +	struct xfs_healthmon	*hm,
> > +	struct super_block	*sb)
> > +{
> > +	return hm->mount_cookie == (uintptr_t)sb;
> > +}
> 
> Is there much of a point in these helpers vs open coding them in the callers?
> (no caller yet in this patch of the second one anyway).  Especially as we
> need to hold a lock for them to be safe.

The only one really worth keeping is _healthmon_activated because it
gets called from various places.  And even then, it now only has three
callsites so maybe it'll just go away.

> > +
> > +/* Attach a health monitor to an xfs_mount.  Only one allowed at a time. */
> > +STATIC int
> > +xfs_healthmon_attach(
> > +	struct xfs_mount	*mp,
> > +	struct xfs_healthmon	*hm)
> > +{
> > +	int			ret = 0;
> > +
> > +	spin_lock(&xfs_healthmon_lock);
> > +	if (mp->m_healthmon == NULL) {
> > +		mp->m_healthmon = hm;
> > +		hm->mount_cookie = (uintptr_t)mp->m_super;
> > +		refcount_inc(&hm->ref);
> > +	} else {
> > +		ret = -EEXIST;
> > +	}
> > +	spin_unlock(&xfs_healthmon_lock);
> > +
> > +	return ret;
> 
> Maybe just me, but I'd do away with the ret variable and just handle the
> EEXIST case directly:
> 
> 	spin_lock(&xfs_healthmon_lock);
> 	if (mp->m_healthmon) {
> 		spin_unlock(&xfs_healthmon_lock);
> 		return -EEXIST;
> 	}
> 	refcount_inc(&hm->ref);
> 	mp->m_healthmon = hm;
> 	hm->mount_cookie = (uintptr_t)mp->m_super;
> 	spin_unlock(&xfs_healthmon_lock);
> 	return 0;
> 
> > +/* Detach a xfs mount from a specific healthmon instance. */
> > +STATIC void
> > +xfs_healthmon_detach(
> > +	struct xfs_healthmon	*hm)
> > +{
> > +	spin_lock(&xfs_healthmon_lock);
> > +	if (xfs_healthmon_activated(hm)) {
> > +		struct xfs_mount	*mp =
> > +			XFS_M((struct super_block *)hm->mount_cookie);
> > +
> > +		mp->m_healthmon = NULL;
> > +		hm->mount_cookie = DETACHED_MOUNT_COOKIE;
> > +	} else {
> > +		hm = NULL;
> > +	}
> > +	spin_unlock(&xfs_healthmon_lock);
> > +
> > +	if (hm)
> > +		xfs_healthmon_put(hm);
> > +}
> 
> Kinda similar here:
> 
> 	struct xfs_mount	*mp;
> 
> 	spin_lock(&xfs_healthmon_lock);
> 	if (hm->mount_cookie == DETACHED_MOUNT_COOKIE) {
> 		spin_unlock(&xfs_healthmon_lock);
> 		return;
> 	}
> 
> 	XFS_M((struct super_block *)hm->mount_cookie)->m_healthmon = NULL;
> 	hm->mount_cookie = DETACHED_MOUNT_COOKIE;
> 	spin_unlock(&xfs_healthmon_lock);
> 
> 	xfs_healthmon_put(hm);

Will change.  And get rid of some of the mount cookie helpers.

Thanks for reading!

--D

next prev parent reply	other threads:[~2026-01-07 18:50 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-06  7:10 [PATCHSET V4] xfs: autonomous self healing of filesystems Darrick J. Wong
2026-01-06  7:10 ` [PATCH 01/11] docs: discuss autonomous self healing in the xfs online repair design doc Darrick J. Wong
2026-01-07  9:08   ` Christoph Hellwig
2026-01-07 19:15     ` Darrick J. Wong
2026-01-06  7:11 ` [PATCH 02/11] xfs: start creating infrastructure for health monitoring Darrick J. Wong
2026-01-07  9:17   ` Christoph Hellwig
2026-01-07 18:50     ` Darrick J. Wong [this message]
2026-01-08 10:21       ` Christoph Hellwig
2026-01-06  7:11 ` [PATCH 03/11] xfs: create event queuing, formatting, and discovery infrastructure Darrick J. Wong
2026-01-07  9:32   ` Christoph Hellwig
2026-01-07 19:01     ` Darrick J. Wong
2026-01-06  7:11 ` [PATCH 04/11] xfs: convey filesystem unmount events to the health monitor Darrick J. Wong
2026-01-06  7:11 ` [PATCH 05/11] xfs: convey metadata health " Darrick J. Wong
2026-01-06  7:12 ` [PATCH 06/11] xfs: convey filesystem shutdown " Darrick J. Wong
2026-01-06  7:12 ` [PATCH 07/11] xfs: convey externally discovered fsdax media errors " Darrick J. Wong
2026-01-06  7:12 ` [PATCH 08/11] xfs: convey file I/O " Darrick J. Wong
2026-01-06  7:12 ` [PATCH 09/11] xfs: allow reconfiguration of the health monitoring device Darrick J. Wong
2026-01-06  7:13 ` [PATCH 10/11] xfs: check if an open file is on the health monitored fs Darrick J. Wong
2026-01-06  7:13 ` [PATCH 11/11] xfs: add media error reporting ioctl Darrick J. Wong
2026-01-07  9:36   ` Christoph Hellwig
2026-01-07 16:30     ` Darrick J. Wong
2026-01-08 10:25       ` Christoph Hellwig
2026-01-08 16:09         ` Darrick J. Wong
2026-01-08 16:14           ` Christoph Hellwig
2026-01-08 16:18             ` Darrick J. Wong
2026-01-08 16:20               ` Christoph Hellwig
2026-01-08 16:53                 ` Darrick J. Wong
2026-01-12  5:24                   ` Darrick J. Wong
  -- strict thread matches above, loose matches on Subject: below --
2026-01-13  0:32 [PATCHSET v5] xfs: autonomous self healing of filesystems Darrick J. Wong
2026-01-13  0:33 ` [PATCH 02/11] xfs: start creating infrastructure for health monitoring Darrick J. Wong
2026-01-13 16:03   ` Christoph Hellwig
2026-01-16  5:42 [PATCHSET v6] xfs: autonomous self healing of filesystems Darrick J. Wong
2026-01-16  5:42 ` [PATCH 02/11] xfs: start creating infrastructure for health monitoring Darrick J. Wong
2026-02-06 13:07   ` Pankaj Raghav (Samsung)
2026-02-06 17:47     ` Darrick J. Wong
2026-02-06 18:54       ` Pankaj Raghav
2026-02-06 20:41         ` Darrick J. Wong
2026-02-09  6:34           ` Christoph Hellwig
2026-02-10  4:57             ` Darrick J. Wong
2026-01-21  6:34 [PATCHSET v7 1/3] xfs: autonomous self healing of filesystems Darrick J. Wong
2026-01-21  6:35 ` [PATCH 02/11] xfs: start creating infrastructure for health monitoring Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260107185055.GE15551@frogsfrogsfrogs \
    --to=djwong@kernel.org \
    --cc=cem@kernel.org \
    --cc=hch@lst.de \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox