From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-xfs@vger.kernel.org, hch@lst.de
Subject: Re: [PATCH 0.1/13] xfs: release new dquot buffer on defer_finish error
Date: Fri, 4 May 2018 13:05:28 -0700
Message-ID: <20180504200528.GK26569@magnolia>
In-Reply-To: <20180504160356.GD26217@bfoster.bfoster>
On Fri, May 04, 2018 at 12:03:56PM -0400, Brian Foster wrote:
> On Fri, May 04, 2018 at 08:52:36AM -0700, Darrick J. Wong wrote:
> > On Fri, May 04, 2018 at 11:41:21AM -0400, Brian Foster wrote:
> > > On Fri, May 04, 2018 at 08:12:47AM -0700, Darrick J. Wong wrote:
> > > > On Fri, May 04, 2018 at 07:31:58AM -0400, Brian Foster wrote:
> > > > > On Thu, May 03, 2018 at 10:53:43AM -0700, Darrick J. Wong wrote:
> > > > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > > >
> > > > > > In commit efa092f3d4c6 "[XFS] Fixes a bug in the quota code when
> > > > > > allocating a new dquot record", we allocate a new dquot block, grab a
> > > > > > buffer to initialize it, and return the locked initialized dquot buffer
> > > > > > to the caller for further in-core dquot initialization. Unfortunately,
> > > > > > if the _bmap_finish errored out, _qm_dqalloc would also error out
> > > > > > without bothering to free the (locked) buffer. Leaking a locked buffer
> > > > > > caused hangs in generic/388 when quotas are enabled.
> > > > > >
> > > > > > Furthermore, the _bmap_finish -> _defer_finish conversion in
> > > > > > 310a75a3c6c747 ("xfs: change xfs_bmap_{finish,cancel,init,free} ->
> > > > > > xfs_defer_*") failed to observe that the buffer was held going into
> > > > > > _defer_finish and therefore failed to notice that the buffer lock is
> > > > > > /not/ maintained afterwards. Now that we can bjoin a buffer to a
> > > > > > defer_ops, use this mechanism to ensure that the buffer stays locked
> > > > > > across the _defer_finish. Release the holds and locks on the buffer as
> > > > > > appropriate if we have to error out.
> > > > > >
> > > > > > There is a subtlety here for the caller in that the buffer emerges
> > > > > > locked and held to the transaction, so if the _trans_commit fails we
> > > > > > have to release the buffer explicitly. This fixes the unmount hang
> > > > > > in generic/388 when quotas are enabled.
> > > > > >
> > > > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > > > > ---
> > > > > > fs/xfs/xfs_dquot.c | 48 +++++++++++++++++++++++++++---------------------
> > > > > > 1 file changed, 27 insertions(+), 21 deletions(-)
> > > > > >
> > > > > > diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
> > > > > > index a7daef9e16bf..4c39d8632230 100644
> > > > > > --- a/fs/xfs/xfs_dquot.c
> > > > > > +++ b/fs/xfs/xfs_dquot.c
> > > > > > @@ -362,33 +362,39 @@ xfs_qm_dqalloc(
> > > > > > dqp->dq_flags & XFS_DQ_ALLTYPES, bp);
> > > > > >
> > > > > > /*
> > > > > > - * xfs_defer_finish() may commit the current transaction and
> > > > > > - * start a second transaction if the freelist is not empty.
> > > > > > + * Hold the buffer and join it to the dfops so that we'll still own
> > > > > > + * the buffer when we return to the caller. The buffer disposal on
> > > > > > + * error must be paid attention to very carefully, as it has been
> > > > > > + * broken since commit efa092f3d4c6 "[XFS] Fixes a bug in the quota
> > > > > > + * code when allocating a new dquot record" in 2005, and the later
> > > > > > + * conversion to xfs_defer_ops in commit 310a75a3c6c747 failed to keep
> > > > > > + * the buffer locked across the _defer_finish call. We can now do
> > > > > > + * this correctly with xfs_defer_bjoin.
> > > > > > *
> > > > > > - * Since we still want to modify this buffer, we need to
> > > > > > - * ensure that the buffer is not released on commit of
> > > > > > - * the first transaction and ensure the buffer is added to the
> > > > > > - * second transaction.
> > > > > > + * Above, we allocated a disk block for the dquot information and
> > > > > > + * used get_buf to initialize the dquot. If the _defer_bjoin fails,
> > > > > > + * the buffer is still locked to *tpp, so we must _bhold_release and
> > > > > > + * then _trans_brelse the buffer. If the _defer_finish fails, the old
> > > > > > + * transaction is gone but the new buffer is not joined or held to any
> > > > > > + * transaction, so we must _buf_relse it.
> > > > > > *
> > > > > > - * If there is only one transaction then don't stop the buffer
> > > > > > - * from being released when it commits later on.
> > > > > > + * If everything succeeds, the caller of this function is returned a
> > > > > > + * buffer that is locked, held, and joined to the transaction. If the
> > > > > > + * transaction commit fails (in the caller) the caller must unlock the
> > > > > > + * buffer manually.
> > > > >
> > > > > If the buffer is held due to the xfs_defer_bjoin(), doesn't that mean
> > > > > that the caller has to ultimately release it even after successful
> > > > > transaction commit (assuming we don't roll the transaction again
> > > > > somewhere)? I see we have an xfs_trans_brelse() up in xfs_qm_dqread(),
> > > > > but it looks like that only clears the hold if the buffer isn't logged
> > > > > in the tx. Hm?
> > > >
> > > > Correct. The buffer is initialized in the same transaction as the dquot
> > > > block allocation and committed in xfs_defer_finish. After
> > > > initialization (which is to say when we return to xfs_qm_dqtobp), the
> > > > buffer is held, joined, and not logged to the transaction, and nothing
> > > > else is supposed to dirty the buffer. Both buffer and transaction are
> > > > then returned in this state to _dqread, which reads the in-core dquot state
> > > > out of the dquot buffer and _trans_brelse's the (still clean) buffer,
> > > > which breaks the hold and unlocks the buffer.
> > > >
> > >
> > > Ok that makes sense, but doesn't that depend on having a deferred
> > > operation? Is that always guaranteed here?
> >
> > Assuming you meant the case where we _trans_read_buf'd the dquot buffer
> > in from disk, we return a buffer that's clean, locked, and joined to the
> > transaction. The only difference is that the buffer isn't held, but
> > _trans_brelse clears the hold unconditionally.
> >
>
> I'm referring to the xfs_qm_dqalloc() case. It looks like we
> xfs_bmapi_write(), get the buffer, call xfs_qm_init_dquot_blk() (logs
> the buffer) then go into the defer/exit sequence modified by this
> patch...
>
> If the defer finish doesn't do anything, are we in the right state or is
> the buffer still dirty+held? If the latter, doesn't that mean the buffer
> remains held after the caller commits the current transaction?

Ah, right, now I see what you're getting at.

If the defer_finish does nothing then yes we do exit up to _dqread with
the buffer locked, dirty, joined, and held to the transaction, in which
case the xfs_trans_brelse does nothing. The transaction commit right
after that will log the buffer and release the hold, but the buffer's
still locked.
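
To spell that sequence out, here is a toy userspace model of the four
buffer states in play. It is only a sketch: the toy_* names are made up
for illustration, this is not kernel code, and the real _trans_brelse and
commit paths do considerably more than this. It assumes only the
behaviour described in this thread, namely that brelse leaves a dirty
buffer alone and that a commit leaves a held buffer locked.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the buffer state discussed above; not kernel code. */
struct toy_buf {
	bool locked;	/* buffer lock is taken */
	bool joined;	/* attached to the current transaction */
	bool held;	/* bhold: stay locked across a commit */
	bool dirty;	/* logged in the current transaction */
};

/* Models _trans_brelse as described above: a dirty buffer is left alone. */
static void toy_trans_brelse(struct toy_buf *bp)
{
	assert(bp->joined);
	if (bp->dirty)
		return;
	bp->held = false;
	bp->joined = false;
	bp->locked = false;
}

/* Models commit: logs the buffer, drops the hold, unlocks only if not held. */
static void toy_trans_commit(struct toy_buf *bp)
{
	bool was_held;

	if (!bp->joined)
		return;
	was_held = bp->held;
	bp->dirty = false;
	bp->joined = false;
	bp->held = false;
	if (!was_held)
		bp->locked = false;
}

/* Returns true if the buffer is still locked after brelse + commit. */
static bool toy_leaks_lock(bool defer_finish_rolled)
{
	struct toy_buf bp = {
		.locked = true, .joined = true, .held = true, .dirty = true,
	};

	if (defer_finish_rolled)
		bp.dirty = false;	/* the roll committed the logged changes */
	toy_trans_brelse(&bp);
	toy_trans_commit(&bp);
	return bp.locked;
}
```

In this model toy_leaks_lock(false) is the no-op defer_finish case and
comes back with the lock still taken, while toy_leaks_lock(true) is the
case where defer_finish rolled and the brelse drops everything.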

So I guess we need to roll the transaction after the defer_finish to
guarantee that the buffer is no longer dirty. Potentially we could
modify _defer_finish to always roll at least once if any of the
_defer_[bi]join items are dirty so that we always return with the
_defer_*join'd items clean?
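
Roughly what I have in mind for the "always roll at least once" idea, as
a toy userspace sketch rather than a patch. Everything named toy_* is
hypothetical, the dfops here tracks a single joined item and a count of
pending work, and a "roll" is modelled as nothing more than cleaning the
joined item and bumping a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a defer_ops with one joined buffer or inode. */
struct toy_dfops {
	bool joined_dirty;	/* the joined item was logged in this transaction */
	size_t nr_pending;	/* deferred work items still queued */
	unsigned int rolls;	/* how many times the transaction was rolled */
};

/* Model of a roll: the logged changes are committed, so the item is clean. */
static void toy_roll(struct toy_dfops *dfops)
{
	dfops->joined_dirty = false;
	dfops->rolls++;
}

/* Finish pending work, but roll at least once if a joined item is dirty. */
static void toy_defer_finish(struct toy_dfops *dfops)
{
	if (dfops->nr_pending == 0 && dfops->joined_dirty)
		toy_roll(dfops);

	while (dfops->nr_pending > 0) {
		toy_roll(dfops);
		dfops->nr_pending--;
	}
}

/* Helper: run the toy defer_finish and report how many rolls happened. */
static unsigned int toy_count_rolls(size_t nr_pending, bool joined_dirty)
{
	struct toy_dfops dfops = {
		.joined_dirty = joined_dirty,
		.nr_pending = nr_pending,
		.rolls = 0,
	};

	toy_defer_finish(&dfops);
	assert(!dfops.joined_dirty || nr_pending == 0);
	return dfops.rolls;
}
```

The only interesting case is toy_count_rolls(0, true): with no deferred
work but a dirty joined item, the sketch still rolls once so the caller
gets the item back clean.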

For now it's probably fine to _bhold_release it after the _defer_finish
(as Brian just suggested on irc) and reconsider "_defer_finish with
defer_joined dirty stuff" separately.

The refactor fixes all this so that we always commit the transaction
before handing back a bjoin'd buffer, so the state is consistent.

--D

> Brian
>
> > --D
> >
> > > Brian
> > >
> > > > After the refactor we guarantee that the buffer is locked, clean, and
> > > > not attached to a transaction by the time we get to calling
> > > > xfs_dquot_from_disk rather than returning transaction and buffer up the
> > > > call stack and having to reason up the stack about what state they're in.
> > > >
> > > > --D
> > > >
> > > > > Brian
> > > > >
> > > > > > */
> > > > > > -
> > > > > > - xfs_trans_bhold(tp, bp);
> > > > > > -
> > > > > > + xfs_trans_bhold(*tpp, bp);
> > > > > > + error = xfs_defer_bjoin(&dfops, bp);
> > > > > > + if (error) {
> > > > > > + xfs_trans_bhold_release(*tpp, bp);
> > > > > > + xfs_trans_brelse(*tpp, bp);
> > > > > > + goto error1;
> > > > > > + }
> > > > > > error = xfs_defer_finish(tpp, &dfops);
> > > > > > - if (error)
> > > > > > + if (error) {
> > > > > > + xfs_buf_relse(bp);
> > > > > > goto error1;
> > > > > > -
> > > > > > - /* Transaction was committed? */
> > > > > > - if (*tpp != tp) {
> > > > > > - tp = *tpp;
> > > > > > - xfs_trans_bjoin(tp, bp);
> > > > > > - } else {
> > > > > > - xfs_trans_bhold_release(tp, bp);
> > > > > > }
> > > > > > -
> > > > > > - *O_bpp = bp;
> > > > > > return 0;
> > > > > >
> > > > > > error1:
> > > > > > --
> > > > > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > > > > > the body of a message to majordomo@vger.kernel.org
> > > > > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > > > > --
> > > > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > > > > the body of a message to majordomo@vger.kernel.org
> > > > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > > > --
> > > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > > > the body of a message to majordomo@vger.kernel.org
> > > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html