linux-xfs.vger.kernel.org archive mirror
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-xfs@vger.kernel.org, hch@lst.de
Subject: Re: [PATCH 0.1/13] xfs: release new dquot buffer on defer_finish error
Date: Fri, 4 May 2018 13:05:28 -0700
Message-ID: <20180504200528.GK26569@magnolia>
In-Reply-To: <20180504160356.GD26217@bfoster.bfoster>

On Fri, May 04, 2018 at 12:03:56PM -0400, Brian Foster wrote:
> On Fri, May 04, 2018 at 08:52:36AM -0700, Darrick J. Wong wrote:
> > On Fri, May 04, 2018 at 11:41:21AM -0400, Brian Foster wrote:
> > > On Fri, May 04, 2018 at 08:12:47AM -0700, Darrick J. Wong wrote:
> > > > On Fri, May 04, 2018 at 07:31:58AM -0400, Brian Foster wrote:
> > > > > On Thu, May 03, 2018 at 10:53:43AM -0700, Darrick J. Wong wrote:
> > > > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > > > 
> > > > > > In commit efa092f3d4c6 "[XFS] Fixes a bug in the quota code when
> > > > > > allocating a new dquot record", we allocate a new dquot block, grab a
> > > > > > buffer to initialize it, and return the locked initialized dquot buffer
> > > > > > to the caller for further in-core dquot initialization.  Unfortunately,
> > > > > > if the _bmap_finish errored out, _qm_dqalloc would also error out
> > > > > > without bothering to free the (locked) buffer.  Leaking a locked buffer
> > > > > > caused hangs in generic/388 when quotas are enabled.
> > > > > > 
> > > > > > Furthermore, the _bmap_finish -> _defer_finish conversion in
> > > > > > 310a75a3c6c747 ("xfs: change xfs_bmap_{finish,cancel,init,free} ->
> > > > > > xfs_defer_*") failed to observe that the buffer was held going into
> > > > > > _defer_finish and therefore failed to notice that the buffer lock is
> > > > > > /not/ maintained afterwards.  Now that we can bjoin a buffer to a
> > > > > > defer_ops, use this mechanism to ensure that the buffer stays locked
> > > > > > across the _defer_finish.  Release the holds and locks on the buffer as
> > > > > > appropriate if we have to error out.
> > > > > > 
> > > > > > There is a subtlety here for the caller in that the buffer emerges
> > > > > > locked and held to the transaction, so if the _trans_commit fails we
> > > > > > have to release the buffer explicitly.  This fixes the unmount hang
> > > > > > in generic/388 when quotas are enabled.
> > > > > > 
> > > > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > > > > ---
> > > > > >  fs/xfs/xfs_dquot.c |   48 +++++++++++++++++++++++++++---------------------
> > > > > >  1 file changed, 27 insertions(+), 21 deletions(-)
> > > > > > 
> > > > > > diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
> > > > > > index a7daef9e16bf..4c39d8632230 100644
> > > > > > --- a/fs/xfs/xfs_dquot.c
> > > > > > +++ b/fs/xfs/xfs_dquot.c
> > > > > > @@ -362,33 +362,39 @@ xfs_qm_dqalloc(
> > > > > >  			      dqp->dq_flags & XFS_DQ_ALLTYPES, bp);
> > > > > >  
> > > > > >  	/*
> > > > > > -	 * xfs_defer_finish() may commit the current transaction and
> > > > > > -	 * start a second transaction if the freelist is not empty.
> > > > > > +	 * Hold the buffer and join it to the dfops so that we'll still own
> > > > > > +	 * the buffer when we return to the caller.  The buffer disposal on
> > > > > > +	 * error must be paid attention to very carefully, as it has been
> > > > > > +	 * broken since commit efa092f3d4c6 "[XFS] Fixes a bug in the quota
> > > > > > +	 * code when allocating a new dquot record" in 2005, and the later
> > > > > > +	 * conversion to xfs_defer_ops in commit 310a75a3c6c747 failed to keep
> > > > > > +	 * the buffer locked across the _defer_finish call.  We can now do
> > > > > > +	 * this correctly with xfs_defer_bjoin.
> > > > > >  	 *
> > > > > > -	 * Since we still want to modify this buffer, we need to
> > > > > > -	 * ensure that the buffer is not released on commit of
> > > > > > -	 * the first transaction and ensure the buffer is added to the
> > > > > > -	 * second transaction.
> > > > > > +	 * Above, we allocated a disk block for the dquot information and
> > > > > > +	 * used get_buf to initialize the dquot.  If the _defer_bjoin fails,
> > > > > > +	 * the buffer is still locked to *tpp, so we must _bhold_release and
> > > > > > +	 * then _trans_brelse the buffer.  If the _defer_finish fails, the old
> > > > > > +	 * transaction is gone but the new buffer is not joined or held to any
> > > > > > +	 * transaction, so we must _buf_relse it.
> > > > > >  	 *
> > > > > > -	 * If there is only one transaction then don't stop the buffer
> > > > > > -	 * from being released when it commits later on.
> > > > > > +	 * If everything succeeds, the caller of this function is returned a
> > > > > > +	 * buffer that is locked, held, and joined to the transaction.  If the
> > > > > > +	 * transaction commit fails (in the caller) the caller must unlock the
> > > > > > +	 * buffer manually.
> > > > > 
> > > > > If the buffer is held due to the xfs_defer_bjoin(), doesn't that mean
> > > > > that the caller has to ultimately release it even after successful
> > > > > transaction commit (assuming we don't roll the transaction again
> > > > > somewhere)? I see we have an xfs_trans_brelse() up in xfs_qm_dqread(),
> > > > > but it looks like that only clears the hold if the buffer isn't logged
> > > > > in the tx. Hm?
> > > > 
> > > > Correct.  The buffer is initialized in the same transaction as the dquot
> > > > block allocation and committed in xfs_defer_finish.  After
> > > > initialization (which is to say when we return to xfs_qm_dqtobp), the
> > > > buffer is held, joined, and not logged to the transaction, and nothing
> > > > else is supposed to dirty the buffer.  Both buffer and transaction are
> > > > then returned in this state to _dqread, which reads the in-core dquot
> > > > state out of the dquot buffer and _trans_brelse's the (still clean) buffer,
> > > > which breaks the hold and unlocks the buffer.
> > > > 
> > > 
> > > Ok that makes sense, but doesn't that depend on having a deferred
> > > operation? Is that always guaranteed here?
> > 
> > Assuming you meant the case where we _trans_read_buf'd the dquot buffer
> > in from disk, we return a buffer that's clean, locked, and joined to the
> > transaction.  The only difference is that the buffer isn't held, but
> > _trans_brelse clears the hold unconditionally.
> > 
> 
> I'm referring to the xfs_qm_dqalloc() case. It looks like we
> xfs_bmapi_write(), get the buffer, call xfs_qm_init_dquot_blk() (logs
> the buffer) then go into the defer/exit sequence modified by this
> patch...
> 
> If the defer finish doesn't do anything, are we in the right state or is
> the buffer still dirty+held? If the latter, doesn't that mean the buffer
> remains held after the caller commits the current transaction?

Ah, right, now I see what you're getting at.

If the defer_finish does nothing then yes, we do exit up to _dqread with
the buffer locked, dirty, joined, and held to the transaction, in which
case the xfs_trans_brelse does nothing.  The transaction commit right
after that will log the buffer and release the hold, but the buffer's
still locked.
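
To make that state walk concrete, here is a toy userspace model, not
kernel code.  The toy_* names and the four-flag struct are invented for
illustration and only encode the behaviour described in this thread: a
brelse that does nothing on a dirty buffer, and a commit that drops the
hold on a held buffer without unlocking it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the buffer state discussed above; not a kernel type. */
struct toy_buf {
	bool locked;
	bool dirty;	/* logged in the current transaction */
	bool held;	/* keeps its lock across a commit */
	bool joined;	/* attached to the current transaction */
};

/* State handed back up to _dqread: locked, held and joined. */
static struct toy_buf toy_buf_after_dqalloc(bool dirty)
{
	struct toy_buf b = { .locked = true, .dirty = dirty,
			     .held = true, .joined = true };
	return b;
}

/* Models _trans_brelse as described here: a no-op on a dirty buffer. */
static void toy_trans_brelse(struct toy_buf *b)
{
	assert(b->joined);
	if (b->dirty)
		return;
	b->held = false;
	b->joined = false;
	b->locked = false;
}

/* Models commit: a held buffer loses the hold but keeps its lock. */
static void toy_trans_commit(struct toy_buf *b)
{
	if (!b->joined)
		return;
	if (!b->held)
		b->locked = false;
	b->dirty = false;
	b->held = false;
	b->joined = false;
}
```

Starting from toy_buf_after_dqalloc(true), calling toy_trans_brelse and
then toy_trans_commit leaves the buffer locked with nothing attached to
it, which is the leak in question.  Starting from the clean state, the
brelse alone unlocks it.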

So I guess we need to roll the transaction after the defer_finish to
guarantee that the buffer is no longer dirty.  Potentially we could
modify _defer_finish to always roll at least once if any of the
_defer_[bi]join items are dirty so that we always return with the
_defer_*join'd items clean?

For now it's probably fine to _bhold_release it after the _defer_finish
(as Brian just suggested on IRC) and reconsider "_defer_finish with
defer_joined dirty stuff" separately.
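
A rough userspace sketch of the two options, again not kernel code:
the toy_* helpers are hypothetical and assume that a roll carries a
held buffer into the new transaction clean, and that a commit unlocks
any buffer that is not held.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy buffer state; hypothetical, not a kernel structure. */
struct toy_buf {
	bool locked, dirty, held, joined;
};

/*
 * Option 1: roll the transaction.  The held buffer stays locked, held
 * and joined in the new transaction, but is no longer dirty.
 */
static void toy_trans_roll(struct toy_buf *b)
{
	assert(b->held && b->joined);
	b->dirty = false;
}

/* Option 2: drop the hold so that the caller's commit unlocks it. */
static void toy_bhold_release(struct toy_buf *b)
{
	assert(b->held);
	b->held = false;
}

/* Caller side: a brelse followed by a commit, as modelled here. */
static void toy_caller_brelse_and_commit(struct toy_buf *b)
{
	if (b->joined && !b->dirty) {
		/* clean: brelse breaks the hold and unlocks */
		b->held = b->joined = b->locked = false;
	}
	if (b->joined) {
		/* commit: an unheld buffer is unlocked, a held one is not */
		if (!b->held)
			b->locked = false;
		b->dirty = b->held = b->joined = false;
	}
}
```

In this model a buffer that is locked, dirty, held, and joined stays
locked after toy_caller_brelse_and_commit, while calling either
toy_trans_roll or toy_bhold_release first lets it come back unlocked.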

The refactor fixes all this so that we always commit the transaction
before handing back a bjoin'd buffer, so the state is consistent.

--D

> Brian
> 
> > --D
> > 
> > > Brian
> > > 
> > > > After the refactor we guarantee that the buffer is locked, clean, and
> > > > not attached to a transaction by the time we get to calling
> > > > xfs_dquot_from_disk rather than returning transaction and buffer up the
> > > > call stack and having to reason up the stack about what state they're in.
> > > > 
> > > > --D
> > > > 
> > > > > Brian
> > > > > 
> > > > > >  	 */
> > > > > > -
> > > > > > -	xfs_trans_bhold(tp, bp);
> > > > > > -
> > > > > > +	xfs_trans_bhold(*tpp, bp);
> > > > > > +	error = xfs_defer_bjoin(&dfops, bp);
> > > > > > +	if (error) {
> > > > > > +		xfs_trans_bhold_release(*tpp, bp);
> > > > > > +		xfs_trans_brelse(*tpp, bp);
> > > > > > +		goto error1;
> > > > > > +	}
> > > > > >  	error = xfs_defer_finish(tpp, &dfops);
> > > > > > -	if (error)
> > > > > > +	if (error) {
> > > > > > +		xfs_buf_relse(bp);
> > > > > >  		goto error1;
> > > > > > -
> > > > > > -	/* Transaction was committed? */
> > > > > > -	if (*tpp != tp) {
> > > > > > -		tp = *tpp;
> > > > > > -		xfs_trans_bjoin(tp, bp);
> > > > > > -	} else {
> > > > > > -		xfs_trans_bhold_release(tp, bp);
> > > > > >  	}
> > > > > > -
> > > > > > -	*O_bpp = bp;
> > > > > >  	return 0;
> > > > > >  
> > > > > >  error1:
> > > > > > --
> > > > > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > > > > > the body of a message to majordomo@vger.kernel.org
> > > > > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > > > > --
> > > > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > > > > the body of a message to majordomo@vger.kernel.org
> > > > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > > > --
> > > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > > > the body of a message to majordomo@vger.kernel.org
> > > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
