linux-xfs.vger.kernel.org archive mirror
From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Brian Foster <bfoster@redhat.com>, linux-xfs@vger.kernel.org
Subject: Re: [PATCH 5/9] xfs: fix double ijoin in xfs_inactive_symlink_rmt()
Date: Fri, 11 May 2018 12:04:25 +1000
Message-ID: <20180511020425.GW10363@dastard>
In-Reply-To: <20180509150238.GM11261@magnolia>

On Wed, May 09, 2018 at 08:02:38AM -0700, Darrick J. Wong wrote:
> On Wed, May 09, 2018 at 06:10:42AM -0400, Brian Foster wrote:
> > On Wed, May 09, 2018 at 10:24:28AM +1000, Dave Chinner wrote:
> > > On Tue, May 08, 2018 at 10:18:11AM -0400, Brian Foster wrote:
> > > > On Tue, May 08, 2018 at 01:41:58PM +1000, Dave Chinner wrote:
> > > > > From: Dave Chinner <dchinner@redhat.com>
> > > > > 
> > > > > xfs_inactive_symlink_rmt() does something nasty - it joins an inode
> > > > > into a transaction it is already joined to. This means the inode can
> > > > > have multiple log item descriptors attached to the transaction for
> > > > > > it. This breaks the 1:1 mapping that is supposed to exist
> > > > > between the log item and log item descriptor.
> > > > > 
> > > > > This results in the log item being processed twice during
> > > > > transaction commit and CIL formatting, and there are lots of other
> > > > > > potential issues that arise from double processing of log items in
> > > > > the transaction commit state machine.
> > > > > 
> > > > > In this case, the inode is already held by the rolling transaction
> > > > > returned from xfs_defer_finish(), so there's no need to join it
> > > > > again.
> > > > > 
> > > > > Signed-Off-By: Dave Chinner <dchinner@redhat.com>
> > > > > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > > > > ---
> > > > >  fs/xfs/xfs_symlink.c | 9 ++-------
> > > > >  1 file changed, 2 insertions(+), 7 deletions(-)
> > > > > 
> > > > > diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c
> > > > > index 5b66ac12913c..27870e5cd259 100644
> > > > > --- a/fs/xfs/xfs_symlink.c
> > > > > +++ b/fs/xfs/xfs_symlink.c
> > > > > @@ -488,16 +488,11 @@ xfs_inactive_symlink_rmt(
> > > > >  	error = xfs_defer_finish(&tp, &dfops);
> > > > >  	if (error)
> > > > >  		goto error_bmap_cancel;
> > > > > -	/*
> > > > > -	 * The first xact was committed, so add the inode to the new one.
> > > > > -	 * Mark it dirty so it will be logged and moved forward in the log as
> > > > > -	 * part of every commit.
> > > > > -	 */
> > > > > -	xfs_trans_ijoin(tp, ip, 0);
> > > > > -	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
> > > > > +
> > > > >  	/*
> > > > >  	 * Commit the transaction containing extent freeing and EFDs.
> > > > >  	 */
> > > > > +	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
> > > > 
> > > > Seems fine.. but do we even need this call? We're about to commit the
> > > > transaction and unlock the inode...
> > > 
> > > Yes, I think we do. We need it to be committed in each of the
> > > rolling transactions so that the inode doesn't get written/replayed
> > > before any of the other dependent metadata changes in this final
> > > transaction.
> > > 
> > 
> > Hmm, I don't follow what that means. IIUC the act of logging it again
> > simply moves it forward in the log. That makes sense down in the dfops
> > code but seems unnecessary here given that we are about to complete the
> > chain of transactions.
> > 
> > xfs_inactive_symlink_rmt() makes changes to the inode, invals/unmaps the
> > remote bufs, joins the inode to the dfops and finishes the dfops. We
> > return from xfs_defer_finish() having committed the (still locked) inode
> > modifications and have a new/rolled transaction that covers the free of
> > the associated blocks (EFDs). I could certainly be missing something,
> > but from that point what difference does it make whether the final
> > transaction relogs the inode before it commits?

It ensures the inode changes are sanely ordered. i.e. we don't
write the inode to disk before all the other changes made in the
rolling transaction are written to disk. And the same goes for
recovery.

> > The in-core inode is
> > still locked and the previous inode modifications may very well already
> > be in the on-disk log.
> > 
> > That said, it seems harmless despite not understanding what it's for. So
> > with or without the xfs_trans_log_inode() call:
> 
> <shrug> I interpreted this as Dave being fastidious about relogging
> inodes after every transaction roll even if it's not strictly necessary
> (but not otherwise harmful)

That's wrong. It is a requirement of rolling transactions that we
log every object that is held locked across xfs_trans_commit()
regardless of whether it was dirtied in that specific transaction or
not.

That's because we call xfs_trans_reserve() when rolling the
transaction, and that means the locked objects cannot be written to
disk during that rolling transaction sequence.  That means we can
deadlock if the object we hold locked pins the tail of the log and
there isn't space for the re-reservation of log space for the next
transaction in the sequence.

IOWs, the simple rule of thumb is that objects held locked across
xfs_trans_commit() should be logged in that transaction. Follow that
simple rule everywhere, and we don't leave nasty landmines around
to step on when we change code around....
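
To make the tail-pinning argument above concrete, here is a toy user-space
model. It is only a sketch and not XFS code: every name in it (toy_log,
toy_commit, toy_roll_chain, LOG_SIZE, ROLL_COST) is hypothetical, and it
assumes that the one locked item is the only thing pinning the log tail and
that each transaction in the chain reserves a fixed amount of log space.

```c
#include <stdbool.h>

/* Toy model only: none of these names are XFS APIs. */
#define LOG_SIZE  8UL   /* total log capacity, arbitrary units */
#define ROLL_COST 2UL   /* space each transaction in the chain reserves */

struct toy_log {
	unsigned long head;     /* next position to be written in the log */
	unsigned long item_lsn; /* where the locked item was last logged */
};

/*
 * The item stays locked across commits, so it cannot be written back.
 * Assumption: it is therefore the only thing pinning the tail, and the
 * tail sits at item_lsn.
 */
static unsigned long toy_log_free_space(const struct toy_log *log)
{
	return LOG_SIZE - (log->head - log->item_lsn);
}

/*
 * Commit one transaction of the rolling chain. Returns false at the
 * point where a real reservation would block waiting for log space
 * that can never be freed (the deadlock described above).
 */
static bool toy_commit(struct toy_log *log, bool relog_item)
{
	if (toy_log_free_space(log) < ROLL_COST)
		return false;
	if (relog_item)
		log->item_lsn = log->head; /* relogging moves the item, and the tail, forward */
	log->head += ROLL_COST;
	return true;
}

/* Run up to max_rolls commits; return how many completed. */
int toy_roll_chain(int max_rolls, bool relog_item)
{
	struct toy_log log = { .head = 0, .item_lsn = 0 };
	int done = 0;

	while (done < max_rolls && toy_commit(&log, relog_item))
		done++;
	return done;
}
```

In this model toy_roll_chain(100, true) completes all 100 commits because the
item is moved to the head of the log each time, while toy_roll_chain(100, false)
stops after LOG_SIZE / ROLL_COST = 4 commits, when the stale item leaves no
room for the next reservation.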

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
