Linux filesystem development
 help / color / mirror / Atom feed
  • * Re: [PATCH v6.1 11/11] xfs: add media verification ioctl
           [not found]   ` <20260120041226.GJ15551@frogsfrogsfrogs>
           [not found]     ` <20260120071830.GA5686@lst.de>
    @ 2026-02-06  3:01     ` Chris Mason
      2026-02-06  4:53       ` Darrick J. Wong
      1 sibling, 1 reply; 11+ messages in thread
    From: Chris Mason @ 2026-02-06  3:01 UTC (permalink / raw)
      To: Darrick J. Wong; +Cc: cem, hch, linux-xfs, linux-fsdevel
    
    "Darrick J. Wong" <djwong@kernel.org> wrote:
    > From: Darrick J. Wong <djwong@kernel.org>
    > 
    > Add a new privileged ioctl so that xfs_scrub can ask the kernel to
    > verify the media of the devices backing an xfs filesystem, and have any
    > resulting media errors reported to fsnotify and xfs_healer.
    
    Hi everyone,
    
    I'm trying out my AI review prompts on a few more trees, and I ran it
    on the fuse-iomap-cache branch:
    
    https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=fuse-iomap-cache
    
    This was the latest email lore could find, apologies if I grabbed the wrong one.
    
    Hopefully the review below is useful, please let me know if there are places it
    has gone wildly wrong.  I kept the header to make it obvious exactly which
    commit I'd run through.
    
    (This is another XFS commit that ended up in my rev-list range, low confidence
    in the actual review, but I'm curious if AI got this one right)
    
    commit 2e5c40a5b6ca722202bc0e99830b6ecf8755d19c
    Author: Darrick J. Wong <djwong@kernel.org>
    
    xfs: add media verification ioctl
    
    [ ... ]
    
    > diff --git a/fs/xfs/xfs_verify_media.c b/fs/xfs/xfs_verify_media.c
    > --- /dev/null
    > +++ b/fs/xfs/xfs_verify_media.c
    > @@ -0,0 +1,445 @@
    
    [ ... ]
    
    > +/* Walk reverse mappings to look for all file data loss */
    > +static int
    > +xfs_verify_report_losses(
    > +	struct xfs_mount	*mp,
    > +	enum xfs_group_type	type,
    > +	xfs_daddr_t		daddr,
    > +	u64			bblen)
    > +{
    > +	struct xfs_group	*xg = NULL;
    > +	struct xfs_trans	*tp;
    > +	xfs_fsblock_t		start_bno, end_bno;
    > +	uint32_t		start_gno, end_gno;
    > +	int			error;
    > +
    > +	if (type == XG_TYPE_RTG) {
    > +		start_bno = xfs_daddr_to_rtb(mp, daddr);
    > +		end_bno = xfs_daddr_to_rtb(mp, daddr + bblen - 1);
    > +	} else {
    > +		start_bno = XFS_DADDR_TO_FSB(mp, daddr);
    > +		end_bno = XFS_DADDR_TO_FSB(mp, daddr + bblen - 1);
    > +	}
    > +
    > +	tp = xfs_trans_alloc_empty(mp);
    > +	start_gno = xfs_fsb_to_gno(mp, start_bno, type);
    > +	end_gno = xfs_fsb_to_gno(mp, end_bno, type);
    > +	while ((xg = xfs_group_next_range(mp, xg, start_gno, end_gno, type))) {
    > +		struct xfs_buf		*agf_bp = NULL;
    > +		struct xfs_rtgroup	*rtg = NULL;
    > +		struct xfs_btree_cur	*cur;
    > +		struct xfs_rmap_irec	ri_low = { };
    > +		struct xfs_rmap_irec	ri_high;
    > +		struct xfs_group_data_lost lost;
    > +
    > +		if (type == XG_TYPE_AG) {
    > +			struct xfs_perag	*pag = to_perag(xg);
    > +
    > +			error = xfs_alloc_read_agf(pag, tp, 0, &agf_bp);
    > +			if (error) {
    > +				xfs_perag_put(pag);
    > +				break;
    > +			}
    > +
    > +			cur = xfs_rmapbt_init_cursor(mp, tp, agf_bp, pag);
    > +		} else {
    > +			rtg = to_rtg(xg);
    > +			xfs_rtgroup_lock(rtg, XFS_RTGLOCK_RMAP);
    > +			cur = xfs_rtrmapbt_init_cursor(tp, rtg);
    > +		}
    
    Does this code have a reference counting mismatch in the error paths?
    
    The xfs_group returned by xfs_group_next_range() is acquired via
    xfs_group_grab() which increments xg_active_ref.
    
    In the error path at line 120, the code uses xfs_perag_put() to release
    the reference:
    
    	error = xfs_alloc_read_agf(pag, tp, 0, &agf_bp);
    	if (error) {
    		xfs_perag_put(pag);
    		break;
    	}
    
    And later in the loop, line 144 uses xfs_group_put().
    
    Both xfs_perag_put() and xfs_group_put() decrement xg_ref, not
    xg_active_ref. Looking at the reference counting implementation:
    
    xfs_group_grab() calls atomic_inc_not_zero(&xg->xg_active_ref)
    xfs_group_rele() calls atomic_dec(&xg->xg_active_ref)
    xfs_group_put() calls atomic_dec(&xg->xg_ref)
    
    Should these error paths use xfs_group_rele() instead to match the
    xfs_group_grab() in xfs_group_next_range()?
    
    
    ^ permalink raw reply	[flat|nested] 11+ messages in thread
  • [parent not found: <176852588582.2137143.1283636639551788931.stgit@frogsfrogsfrogs>]
  • * [PATCHSET v7 1/3] xfs: autonomous self healing of filesystems
    @ 2026-01-21  6:34 Darrick J. Wong
      2026-01-21  6:35 ` [PATCH 02/11] xfs: start creating infrastructure for health monitoring Darrick J. Wong
      0 siblings, 1 reply; 11+ messages in thread
    From: Darrick J. Wong @ 2026-01-21  6:34 UTC (permalink / raw)
      To: djwong, cem; +Cc: hch, linux-fsdevel, linux-xfs, hch
    
    Hi all,
    
    This patchset builds new functionality to deliver live information about
    filesystem health events to userspace.  This is done by creating an
    anonymous file that can be read() for events by userspace programs.
    Events are captured by hooking various parts of XFS and iomap so that
    metadata health failures, file I/O errors, and major changes in
    filesystem state (unmounts, shutdowns, etc.) can be observed by
    programs.
    
    When an event occurs, the hook functions queue an event object to each
    event anonfd for later processing.  Programs must have CAP_SYS_ADMIN
    to open the anonfd and there's a maximum event lag to prevent resource
    overconsumption.  The events themselves can be read() from the anonfd
    as C structs for the xfs_healer daemon.
    
    In userspace, we create a new daemon program that will read the event
    objects and initiate repairs automatically.  This daemon is managed
    entirely by systemd and will not block unmounting of the filesystem
    unless repairs are ongoing.  They are auto-started by a starter
    service that uses fanotify.
    
    This patchset depends on the new fserror code that Christian Brauner
    has tentatively accepted for Linux 7.0:
    https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git/log/?h=vfs-7.0.fserror
    
    v7: more cleanups of the media verification ioctl, improve comments, and
        reuse the bio
    v6: fix pi-breaking bugs, make verify failures trigger health reports
        and filter bio status flags better
    v5: add verify-media ioctl, collapse small helper funcs with only
        one caller
    v4: drop multiple client support so we can make direct calls into
        healthmon instead of chasing pointers and doing indirect calls
    v3: drag out of rfc status
    
    If you're going to start using this code, I strongly recommend pulling
    from my git trees, which are linked below.
    
    With a bit of luck, this should all go splendidly.
    Comments and questions are, as always, welcome.
    
    --D
    
    kernel git tree:
    https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=health-monitoring
    
    xfsprogs git tree:
    https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=health-monitoring
    
    fstests git tree:
    https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=health-monitoring
    ---
    Commits in this patchset:
     * docs: discuss autonomous self healing in the xfs online repair design doc
     * xfs: start creating infrastructure for health monitoring
     * xfs: create event queuing, formatting, and discovery infrastructure
     * xfs: convey filesystem unmount events to the health monitor
     * xfs: convey metadata health events to the health monitor
     * xfs: convey filesystem shutdown events to the health monitor
     * xfs: convey externally discovered fsdax media errors to the health monitor
     * xfs: convey file I/O errors to the health monitor
     * xfs: allow toggling verbose logging on the health monitoring file
     * xfs: check if an open file is on the health monitored fs
     * xfs: add media verification ioctl
    ---
     fs/xfs/libxfs/xfs_fs.h                             |  189 +++
     fs/xfs/libxfs/xfs_health.h                         |    5 
     fs/xfs/xfs_healthmon.h                             |  184 +++
     fs/xfs/xfs_mount.h                                 |    4 
     fs/xfs/xfs_trace.h                                 |  512 ++++++++
     fs/xfs/xfs_verify_media.h                          |   13 
     .../filesystems/xfs/xfs-online-fsck-design.rst     |  153 ++
     fs/xfs/Makefile                                    |    2 
     fs/xfs/xfs_fsops.c                                 |    2 
     fs/xfs/xfs_health.c                                |  124 ++
     fs/xfs/xfs_healthmon.c                             | 1255 ++++++++++++++++++++
     fs/xfs/xfs_ioctl.c                                 |    7 
     fs/xfs/xfs_mount.c                                 |    2 
     fs/xfs/xfs_notify_failure.c                        |   17 
     fs/xfs/xfs_super.c                                 |   12 
     fs/xfs/xfs_trace.c                                 |    5 
     fs/xfs/xfs_verify_media.c                          |  445 +++++++
     17 files changed, 2924 insertions(+), 7 deletions(-)
     create mode 100644 fs/xfs/xfs_healthmon.h
     create mode 100644 fs/xfs/xfs_verify_media.h
     create mode 100644 fs/xfs/xfs_healthmon.c
     create mode 100644 fs/xfs/xfs_verify_media.c
    
    
    ^ permalink raw reply	[flat|nested] 11+ messages in thread

    end of thread, other threads:[~2026-02-10  4:57 UTC | newest]
    
    Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
    -- links below jump to the message on this page --
         [not found] <176852588473.2137143.1604994842772101197.stgit@frogsfrogsfrogs>
         [not found] ` <176852588776.2137143.7103003682733018282.stgit@frogsfrogsfrogs>
         [not found]   ` <20260120041226.GJ15551@frogsfrogsfrogs>
         [not found]     ` <20260120071830.GA5686@lst.de>
         [not found]       ` <20260120180040.GU15551@frogsfrogsfrogs>
    2026-01-21  7:05         ` [PATCH v6.1 11/11] xfs: add media verification ioctl Christoph Hellwig
    2026-01-21 19:58           ` Darrick J. Wong
    2026-02-06  3:01     ` Chris Mason
    2026-02-06  4:53       ` Darrick J. Wong
         [not found] ` <176852588582.2137143.1283636639551788931.stgit@frogsfrogsfrogs>
    2026-02-06 13:07   ` [PATCH 02/11] xfs: start creating infrastructure for health monitoring Pankaj Raghav (Samsung)
    2026-02-06 17:47     ` Darrick J. Wong
    2026-02-06 18:54       ` Pankaj Raghav
    2026-02-06 20:41         ` Darrick J. Wong
    2026-02-09  6:34           ` Christoph Hellwig
    2026-02-10  4:57             ` Darrick J. Wong
    2026-01-21  6:34 [PATCHSET v7 1/3] xfs: autonomous self healing of filesystems Darrick J. Wong
    2026-01-21  6:35 ` [PATCH 02/11] xfs: start creating infrastructure for health monitoring Darrick J. Wong
    

    This is a public inbox, see mirroring instructions
    for how to clone and mirror all data and code used for this inbox