public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 2/3] xfs: fix an AGI lock acquisition ordering problem in xrep_dinode_findmode
Date: Tue, 9 Apr 2024 15:51:21 -0700	[thread overview]
Message-ID: <20240409225121.GH6390@frogsfrogsfrogs> (raw)
In-Reply-To: <ZhMfXA/1YyRDe869@dread.disaster.area>

On Mon, Apr 08, 2024 at 08:34:04AM +1000, Dave Chinner wrote:
> On Tue, Apr 02, 2024 at 10:18:31PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > While reviewing the next patch which fixes an ABBA deadlock between the
> > AGI and a directory ILOCK, someone asked a question about why we're
> > holding the AGI in the first place.  The reason for that is to quiesce
> > the inode structures for that AG while we do a repair.
> > 
> > I then realized that the xrep_dinode_findmode invokes xchk_iscan_iter,
> > which walks the inobts (and hence the AGIs) to find all the inodes.
> > This itself is also an ABBA vector, since the damaged inode could be in
> > AG 5, which we hold while we scan AG 0 for directories.  5 -> 0 is not
> > allowed.
> > 
> > To address this, modify the iscan to allow trylock of the AGI buffer
> > using the flags argument to xfs_ialloc_read_agi that the previous patch
> > added.
> > 
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > ---
> >  fs/xfs/scrub/inode_repair.c |    1 +
> >  fs/xfs/scrub/iscan.c        |   36 +++++++++++++++++++++++++++++++++++-
> >  fs/xfs/scrub/iscan.h        |   15 +++++++++++++++
> >  fs/xfs/scrub/trace.h        |   10 ++++++++--
> >  4 files changed, 59 insertions(+), 3 deletions(-)
> > 
> > 
> > diff --git a/fs/xfs/scrub/inode_repair.c b/fs/xfs/scrub/inode_repair.c
> > index eab380e95ef40..35da0193c919e 100644
> > --- a/fs/xfs/scrub/inode_repair.c
> > +++ b/fs/xfs/scrub/inode_repair.c
> > @@ -356,6 +356,7 @@ xrep_dinode_find_mode(
> >  	 * so there's a real possibility that _iscan_iter can return EBUSY.
> >  	 */
> >  	xchk_iscan_start(sc, 5000, 100, &ri->ftype_iscan);
> > +	xchk_iscan_set_agi_trylock(&ri->ftype_iscan);
> >  	ri->ftype_iscan.skip_ino = sc->sm->sm_ino;
> >  	ri->alleged_ftype = XFS_DIR3_FT_UNKNOWN;
> >  	while ((error = xchk_iscan_iter(&ri->ftype_iscan, &dp)) == 1) {
> > diff --git a/fs/xfs/scrub/iscan.c b/fs/xfs/scrub/iscan.c
> > index 66ba0fbd059e0..736ce7c9de6a8 100644
> > --- a/fs/xfs/scrub/iscan.c
> > +++ b/fs/xfs/scrub/iscan.c
> > @@ -243,6 +243,40 @@ xchk_iscan_finish(
> >  	mutex_unlock(&iscan->lock);
> >  }
> >  
> > +/*
> > + * Grab the AGI to advance the inode scan.  Returns 0 if *agi_bpp is now set,
> > + * -ECANCELED if the live scan aborted, -EBUSY if the AGI could not be grabbed,
> > + * or the usual negative errno.
> > + */
> > +STATIC int
> > +xchk_iscan_read_agi(
> > +	struct xchk_iscan	*iscan,
> > +	struct xfs_perag	*pag,
> > +	struct xfs_buf		**agi_bpp)
> > +{
> > +	struct xfs_scrub	*sc = iscan->sc;
> > +	unsigned long		relax;
> > +	int			ret;
> > +
> > +	if (!xchk_iscan_agi_trylock(iscan))
> > +		return xfs_ialloc_read_agi(pag, sc->tp, 0, agi_bpp);
> > +
> > +	relax = msecs_to_jiffies(iscan->iget_retry_delay);
> > +	do {
> > +		ret = xfs_ialloc_read_agi(pag, sc->tp, XFS_IALLOC_FLAG_TRYLOCK,
> > +				agi_bpp);
> 
> Why is this using xfs_ialloc_read_agi() and not xfs_read_agi()?
> How do we get here without the perag AGI state not already
> initialised?

!finobt filesystems won't have called xfs_ialloc_read_agi to initialize
the incore per-ag reservation during mount.  That's a bit of a corner
case since there shouldn't be /so/ many filesystems without finobt these
days, but it's theoretically possible.

> i.e. if you just use xfs_read_agi(), all the code that has to plumb
> flags into xfs_ialloc_read_agi() goes away and this change because a
> lot less intrusive....
> 
> > +		if (ret != -EAGAIN)
> > +			return ret;
> > +		if (!iscan->iget_timeout ||
> > +		    time_is_before_jiffies(iscan->__iget_deadline))
> > +			return -EBUSY;
> > +
> > +		trace_xchk_iscan_agi_retry_wait(iscan);
> > +	} while (!schedule_timeout_killable(relax) &&
> > +		 !xchk_iscan_aborted(iscan));
> > +	return -ECANCELED;
> > +}
> > +
> >  /*
> >   * Advance ino to the next inode that the inobt thinks is allocated, being
> >   * careful to jump to the next AG if we've reached the right end of this AG's
> > @@ -281,7 +315,7 @@ xchk_iscan_advance(
> >  		if (!pag)
> >  			return -ECANCELED;
> >  
> > -		ret = xfs_ialloc_read_agi(pag, sc->tp, 0, &agi_bp);
> > +		ret = xchk_iscan_read_agi(iscan, pag, &agi_bp);
> >  		if (ret)
> >  			goto out_pag;
> >  
> > diff --git a/fs/xfs/scrub/iscan.h b/fs/xfs/scrub/iscan.h
> > index 71f657552dfac..c9da8f7721f66 100644
> > --- a/fs/xfs/scrub/iscan.h
> > +++ b/fs/xfs/scrub/iscan.h
> > @@ -59,6 +59,9 @@ struct xchk_iscan {
> >  /* Set if the scan has been aborted due to some event in the fs. */
> >  #define XCHK_ISCAN_OPSTATE_ABORTED	(1)
> >  
> > +/* Use trylock to acquire the AGI */
> > +#define XCHK_ISCAN_OPSTATE_TRYLOCK_AGI	(2)
> > +
> >  static inline bool
> >  xchk_iscan_aborted(const struct xchk_iscan *iscan)
> >  {
> > @@ -71,6 +74,18 @@ xchk_iscan_abort(struct xchk_iscan *iscan)
> >  	set_bit(XCHK_ISCAN_OPSTATE_ABORTED, &iscan->__opstate);
> >  }
> >  
> > +static inline bool
> > +xchk_iscan_agi_trylock(const struct xchk_iscan *iscan)
> > +{
> > +	return test_bit(XCHK_ISCAN_OPSTATE_TRYLOCK_AGI, &iscan->__opstate);
> > +}
> 
> Function does not actually do any locking, but the name implies it
> is actually doing a trylock operation. Perhaps
> xchk_iscan_agi_needs_trylock()?

Ok.  I apologize for the rage of the last few days.  I need a
loooooooong vacation.

--D

> 
> -Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 

  reply	other threads:[~2024-04-09 22:51 UTC|newest]

Thread overview: 144+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-27  1:40 [PATCHBOMB v30] xfs: online fsck patches for 6.10 Darrick J. Wong
2024-03-27  1:46 ` [PATCHSET 01/15] xfs: bug fixes for 6.9 Darrick J. Wong
2024-03-27  1:50   ` [PATCH 1/1] xfs: fix potential AGI <-> ILOCK ABBA deadlock in xrep_dinode_findmode_walk_directory Darrick J. Wong
2024-03-27 16:56     ` Christoph Hellwig
2024-03-29 18:38       ` Darrick J. Wong
2024-04-03  5:18   ` [PATCHSET] xfs: bug fixes for 6.9 Darrick J. Wong
2024-04-03  5:18     ` [PATCH 1/3] xfs: pass xfs_buf lookup flags to xfs_*read_agi Darrick J. Wong
2024-04-05 14:53       ` Christoph Hellwig
2024-04-03  5:18     ` [PATCH 2/3] xfs: fix an AGI lock acquisition ordering problem in xrep_dinode_findmode Darrick J. Wong
2024-04-05 14:54       ` Christoph Hellwig
2024-04-05 16:52         ` Christoph Hellwig
2024-04-07 22:34       ` Dave Chinner
2024-04-09 22:51         ` Darrick J. Wong [this message]
2024-04-03  5:18     ` [PATCH 3/3] xfs: fix potential AGI <-> ILOCK ABBA deadlock in xrep_dinode_findmode_walk_directory Darrick J. Wong
2024-04-05  3:27     ` [PATCH 4/3] xfs: fix error bailout in xrep_abt_build_new_trees Darrick J. Wong
2024-04-05  5:17       ` Christoph Hellwig
2024-03-27  1:46 ` [PATCHSET v30.1 02/15] xfs: improve log incompat feature handling Darrick J. Wong
2024-03-27  1:50   ` [PATCH 1/2] xfs: only clear log incompat flags at clean unmount Darrick J. Wong
2024-04-07 22:48     ` Dave Chinner
2024-03-27  1:51   ` [PATCH 2/2] xfs: only add log incompat features with explicit permission Darrick J. Wong
2024-04-07 23:00     ` Dave Chinner
2024-04-09 22:53       ` Darrick J. Wong
2024-03-27  1:47 ` [PATCHSET v30.1 03/15] xfs: refactorings for atomic file content exchanges Darrick J. Wong
2024-03-27  1:51   ` [PATCH 1/7] xfs: move inode lease breaking functions to xfs_inode.c Darrick J. Wong
2024-03-27  1:51   ` [PATCH 2/7] xfs: move xfs_iops.c declarations out of xfs_inode.h Darrick J. Wong
2024-03-27  1:51   ` [PATCH 3/7] xfs: declare xfs_file.c symbols in xfs_file.h Darrick J. Wong
2024-03-27  1:52   ` [PATCH 4/7] xfs: create a new helper to return a file's allocation unit Darrick J. Wong
2024-03-27  1:52   ` [PATCH 5/7] xfs: hoist multi-fsb allocation unit detection to a helper Darrick J. Wong
2024-03-27 11:05     ` Christoph Hellwig
2024-04-07 23:07     ` Dave Chinner
2024-04-09 21:09       ` Darrick J. Wong
2024-03-27  1:52   ` [PATCH 6/7] xfs: refactor non-power-of-two alignment checks Darrick J. Wong
2024-03-27  1:52   ` [PATCH 7/7] xfs: constify xfs_bmap_is_written_extent Darrick J. Wong
2024-03-27  1:47 ` [PATCHSET v30.1 04/15] xfs: atomic file content exchanges Darrick J. Wong
2024-03-27  1:53   ` [PATCH 01/15] vfs: export remap and write check helpers Darrick J. Wong
2024-03-27 11:07     ` Christoph Hellwig
2024-03-29 19:45       ` Darrick J. Wong
2024-03-27  1:53   ` [PATCH 02/15] xfs: introduce new file range exchange ioctl Darrick J. Wong
2024-03-27 11:12     ` Christoph Hellwig
2024-03-27  1:53   ` [PATCH 03/15] xfs: create a log incompat flag for atomic file mapping exchanges Darrick J. Wong
2024-04-07 23:17     ` Dave Chinner
2024-04-09 21:12       ` Darrick J. Wong
2024-03-27  1:53   ` [PATCH 04/15] xfs: introduce a file mapping exchange log intent item Darrick J. Wong
2024-04-07 23:51     ` Dave Chinner
2024-04-09  1:18       ` Darrick J. Wong
2024-04-09  3:06         ` Darrick J. Wong
2024-03-27  1:54   ` [PATCH 05/15] xfs: create deferred log items for file mapping exchanges Darrick J. Wong
2024-03-27  1:54   ` [PATCH 06/15] xfs: bind together the front and back ends of the file range exchange code Darrick J. Wong
2024-04-08  0:05     ` Dave Chinner
2024-03-27  1:54   ` [PATCH 07/15] xfs: add error injection to test file mapping exchange recovery Darrick J. Wong
2024-03-27  1:54   ` [PATCH 08/15] xfs: condense extended attributes after a mapping exchange operation Darrick J. Wong
2024-03-27  1:55   ` [PATCH 09/15] xfs: condense directories " Darrick J. Wong
2024-03-27  1:55   ` [PATCH 10/15] xfs: condense symbolic links " Darrick J. Wong
2024-03-27  1:55   ` [PATCH 11/15] xfs: make file range exchange support realtime files Darrick J. Wong
2024-03-27  1:55   ` [PATCH 12/15] xfs: support non-power-of-two rtextsize with exchange-range Darrick J. Wong
2024-03-27  1:56   ` [PATCH 13/15] docs: update swapext -> exchmaps language Darrick J. Wong
2024-03-27  1:56   ` [PATCH 14/15] xfs: introduce new file range commit ioctls Darrick J. Wong
2024-03-27 11:06     ` Christoph Hellwig
2024-03-29 19:45       ` Darrick J. Wong
2024-03-27  1:56   ` [PATCH 15/15] xfs: enable logged file mapping exchange feature Darrick J. Wong
2024-03-27  1:47 ` [PATCHSET v30.1 05/15] xfs: create temporary files for online repair Darrick J. Wong
2024-03-27  1:57   ` [PATCH 1/4] xfs: hide private inodes from bulkstat and handle functions Darrick J. Wong
2024-03-27 11:12     ` Christoph Hellwig
2024-03-27  1:57   ` [PATCH 2/4] xfs: create temporary files and directories for online repair Darrick J. Wong
2024-03-27  1:57   ` [PATCH 3/4] xfs: refactor live buffer invalidation for repairs Darrick J. Wong
2024-03-27  1:57   ` [PATCH 4/4] xfs: add the ability to reap entire inode forks Darrick J. Wong
2024-03-27  1:47 ` [PATCHSET v30.1 06/15] xfs: online repair of realtime summaries Darrick J. Wong
2024-03-27  1:58   ` [PATCH 1/3] xfs: support preallocating and copying content into temporary files Darrick J. Wong
2024-03-27  1:58   ` [PATCH 2/3] xfs: teach the tempfile to set up atomic file content exchanges Darrick J. Wong
2024-03-27  1:58   ` [PATCH 3/3] xfs: online repair of realtime summaries Darrick J. Wong
2024-03-27  1:48 ` [PATCHSET v30.1 07/15] xfs: set and validate dir/attr block owners Darrick J. Wong
2024-03-27  1:58   ` [PATCH 01/10] xfs: add an explicit owner field to xfs_da_args Darrick J. Wong
2024-03-27  1:59   ` [PATCH 02/10] xfs: use the xfs_da_args owner field to set new dir/attr block owner Darrick J. Wong
2024-03-27  1:59   ` [PATCH 03/10] xfs: reduce indenting in xfs_attr_node_list Darrick J. Wong
2024-03-27 11:13     ` Christoph Hellwig
2024-03-28 17:39       ` Darrick J. Wong
2024-03-27  1:59   ` [PATCH 04/10] xfs: validate attr leaf buffer owners Darrick J. Wong
2024-03-27  1:59   ` [PATCH 05/10] xfs: validate attr remote value " Darrick J. Wong
2024-03-27  2:00   ` [PATCH 06/10] xfs: validate dabtree node " Darrick J. Wong
2024-03-27  2:00   ` [PATCH 07/10] xfs: validate directory leaf " Darrick J. Wong
2024-03-27  2:00   ` [PATCH 08/10] xfs: validate explicit directory data " Darrick J. Wong
2024-03-27  2:00   ` [PATCH 09/10] xfs: validate explicit directory block " Darrick J. Wong
2024-03-27  2:01   ` [PATCH 10/10] xfs: validate explicit directory free block owners Darrick J. Wong
2024-03-27  1:48 ` [PATCHSET v30.1 08/15] xfs: online repair of extended attributes Darrick J. Wong
2024-03-27  2:01   ` [PATCH 1/7] xfs: enable discarding of folios backing an xfile Darrick J. Wong
2024-03-27  2:01   ` [PATCH 2/7] xfs: create a blob array data structure Darrick J. Wong
2024-03-27  2:01   ` [PATCH 3/7] xfs: use atomic extent swapping to fix user file fork data Darrick J. Wong
2024-03-27  2:02   ` [PATCH 4/7] xfs: repair extended attributes Darrick J. Wong
2024-03-27  2:02   ` [PATCH 5/7] xfs: scrub should set preen if attr leaf has holes Darrick J. Wong
2024-03-27  2:02   ` [PATCH 6/7] xfs: flag empty xattr leaf blocks for optimization Darrick J. Wong
2024-03-27  2:03   ` [PATCH 7/7] xfs: create an xattr iteration function for scrub Darrick J. Wong
2024-03-27 11:15     ` Christoph Hellwig
2024-03-27  1:48 ` [PATCHSET v30.1 09/15] xfs: online repair of inode unlinked state Darrick J. Wong
2024-03-27  2:03   ` [PATCH 1/2] xfs: ensure unlinked list state is consistent with nlink during scrub Darrick J. Wong
2024-03-27  2:03   ` [PATCH 2/2] xfs: update the unlinked list when repairing link counts Darrick J. Wong
2024-03-27  1:48 ` [PATCHSET v30.1 10/15] xfs: online repair of directories Darrick J. Wong
2024-03-27  2:03   ` [PATCH 1/5] xfs: inactivate directory data blocks Darrick J. Wong
2024-03-27  2:04   ` [PATCH 2/5] xfs: online repair of directories Darrick J. Wong
2024-03-27  2:04   ` [PATCH 3/5] xfs: scan the filesystem to repair a directory dotdot entry Darrick J. Wong
2024-03-27  2:04   ` [PATCH 4/5] xfs: online repair of parent pointers Darrick J. Wong
2024-03-27  2:04   ` [PATCH 5/5] xfs: ask the dentry cache if it knows the parent of a directory Darrick J. Wong
2024-03-27 11:16     ` Christoph Hellwig
2024-03-29 19:52       ` Darrick J. Wong
2024-04-03  5:03     ` [PATCH v30.2 " Darrick J. Wong
2024-04-03 11:43       ` Christoph Hellwig
2024-03-27  1:49 ` [PATCHSET v30.1 11/15] xfs: move orphan files to lost and found Darrick J. Wong
2024-03-27  2:05   ` [PATCH 1/3] xfs: move orphan files to the orphanage Darrick J. Wong
2024-03-27  2:05   ` [PATCH 2/3] xfs: move files to orphanage instead of letting nlinks drop to zero Darrick J. Wong
2024-03-27  2:05   ` [PATCH 3/3] xfs: ensure dentry consistency when the orphanage adopts a file Darrick J. Wong
2024-03-27  1:49 ` [PATCHSET v30.1 12/15] xfs: online repair of symbolic links Darrick J. Wong
2024-03-27  2:05   ` [PATCH 1/1] " Darrick J. Wong
2024-03-27 16:53     ` Christoph Hellwig
2024-03-29 20:44       ` Darrick J. Wong
2024-03-29 20:58         ` Darrick J. Wong
2024-04-03  5:12   ` [PATCHSET v30.2] " Darrick J. Wong
2024-04-03  5:12     ` [PATCH 1/3] xfs: expose xfs_bmap_local_to_extents for online repair Darrick J. Wong
2024-04-03 11:43       ` Christoph Hellwig
2024-04-03  5:12     ` [PATCH 2/3] xfs: pass the owner to xfs_symlink_write_target Darrick J. Wong
2024-04-03 11:43       ` Christoph Hellwig
2024-04-03  5:12     ` [PATCH 3/3] xfs: online repair of symbolic links Darrick J. Wong
2024-04-03 11:44       ` Christoph Hellwig
2024-03-27  1:49 ` [PATCHSET v30.1 13/15] xfs: online fsck of iunlink buckets Darrick J. Wong
2024-03-27  2:06   ` [PATCH 1/3] xfs: check AGI unlinked inode buckets Darrick J. Wong
2024-03-27  2:06   ` [PATCH 2/3] xfs: hoist AGI repair context to a heap object Darrick J. Wong
2024-03-27  2:06   ` [PATCH 3/3] xfs: repair AGI unlinked inode bucket lists Darrick J. Wong
2024-03-27  1:49 ` [PATCHSET v30.1 14/15] xfs: inode-related repair fixes Darrick J. Wong
2024-03-27  2:06   ` [PATCH 1/4] xfs: check unused nlink fields in the ondisk inode Darrick J. Wong
2024-03-27  2:07   ` [PATCH 2/4] xfs: try to avoid allocating from sick inode clusters Darrick J. Wong
2024-03-27  2:07   ` [PATCH 3/4] xfs: pin inodes that would otherwise overflow link count Darrick J. Wong
2024-03-27  2:07   ` [PATCH 4/4] xfs: create subordinate scrub contexts for xchk_metadata_inode_subtype Darrick J. Wong
2024-03-27  1:50 ` [PATCHSET v30.1 15/15] xfs: less heavy locks during fstrim Darrick J. Wong
2024-03-27  2:07   ` [PATCH 1/1] xfs: fix severe performance problems when fstrimming a subset of an AG Darrick J. Wong
2024-03-27 11:35     ` Christoph Hellwig
2024-03-29 21:35       ` Darrick J. Wong
2024-03-30  5:38         ` Christoph Hellwig
2024-03-30 21:15         ` Dave Chinner
2024-03-31 22:44           ` Darrick J. Wong
2024-03-27 22:15     ` Dave Chinner
2024-03-29 22:51       ` Darrick J. Wong
2024-03-30 21:51         ` Dave Chinner
2024-03-31 22:44           ` Darrick J. Wong
2024-04-01 22:12             ` Dave Chinner
2024-04-03  5:07     ` [PATCH v30.2 " Darrick J. Wong
2024-04-04 21:46       ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240409225121.GH6390@frogsfrogsfrogs \
    --to=djwong@kernel.org \
    --cc=david@fromorbit.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox