public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Brian Foster <bfoster@redhat.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH] xfs: don't leak perag metadata reservation on finobt block free
Date: Wed, 10 Jan 2018 15:33:11 -0500	[thread overview]
Message-ID: <20180110203310.GD17232@bfoster.bfoster> (raw)
In-Reply-To: <20180110191704.GS5602@magnolia>

On Wed, Jan 10, 2018 at 11:17:04AM -0800, Darrick J. Wong wrote:
> On Wed, Jan 10, 2018 at 07:06:16AM -0500, Brian Foster wrote:
> > On Tue, Jan 09, 2018 at 04:42:42PM -0500, Brian Foster wrote:
> > > On Tue, Jan 09, 2018 at 12:16:19PM -0800, Darrick J. Wong wrote:
> > > > On Tue, Jan 09, 2018 at 01:35:58PM -0500, Brian Foster wrote:
> > > > > We started using the perag metadata reservation for free inode btree
> > > > > blocks in commit 76d771b4cbe33 ("xfs: use per-AG reservations for
> > > > > the finobt"). While this change consumes metadata res. for finobt
> > > > > block allocations, we still don't replenish the res. pool when
> > > > > finobt blocks are freed. This leads to leaking reservation as finobt
> > > > > blocks are allocated and freed over time, which in turn can lead to
> > > > > overruse of blocks that should be protected by the reservation.
> > > > > 
> > > > > Update the finobt free block path to specify the metadata
> > > > > reservation type as done in the allocation path.
> > > > > 
> > > > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > > > > ---
> > > > >  fs/xfs/libxfs/xfs_ialloc_btree.c | 25 +++++++++++++++++++++----
> > > > >  1 file changed, 21 insertions(+), 4 deletions(-)
> > > > > 
> > > > > diff --git a/fs/xfs/libxfs/xfs_ialloc_btree.c b/fs/xfs/libxfs/xfs_ialloc_btree.c
> > > > > index 47f44d624cb1..18fe6b3a7802 100644
> > > > > --- a/fs/xfs/libxfs/xfs_ialloc_btree.c
> > > > > +++ b/fs/xfs/libxfs/xfs_ialloc_btree.c
> > > > > @@ -146,16 +146,33 @@ xfs_finobt_alloc_block(
> > > > >  }
> > > > >  
> > > > >  STATIC int
> > > > > -xfs_inobt_free_block(
> > > > > +__xfs_inobt_free_block(
> > > > >  	struct xfs_btree_cur	*cur,
> > > > > -	struct xfs_buf		*bp)
> > > > > +	struct xfs_buf		*bp,
> > > > > +	enum xfs_ag_resv_type	resv)
> > > > >  {
> > > > >  	struct xfs_owner_info	oinfo;
> > > > >  
> > > > >  	xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_INOBT);
> > > > >  	return xfs_free_extent(cur->bc_tp,
> > > > >  			XFS_DADDR_TO_FSB(cur->bc_mp, XFS_BUF_ADDR(bp)), 1,
> > > > > -			&oinfo, XFS_AG_RESV_NONE);
> > > > > +			&oinfo, resv);
> > > > > +}
> > > > > +
> > > > > +STATIC int
> > > > > +xfs_inobt_free_block(
> > > > > +	struct xfs_btree_cur	*cur,
> > > > > +	struct xfs_buf		*bp)
> > > > > +{
> > > > > +	return __xfs_inobt_free_block(cur, bp, XFS_AG_RESV_NONE);
> > > > > +}
> > > > > +
> > > > > +STATIC int
> > > > > +xfs_finobt_free_block(
> > > > > +	struct xfs_btree_cur	*cur,
> > > > > +	struct xfs_buf		*bp)
> > > > > +{
> > > > > +	return __xfs_inobt_free_block(cur, bp, XFS_AG_RESV_METADATA);
> > > > 
> > > > cur->bc_mp->m_inotbt_nores ? XFS_AG_RESV_NONE : XFS_AG_RESV_METADATA
> > > > 
> > > > Since we don't use the finobt reservation if there wasn't room.
> > > > 
> > > 
> > > Yep.. will send a v2. Thanks!
> > > 
> > 
> > Wait, I don't think that is right.. ->m_inotbt_nores is set at perag
> > init time based on whether the reservation is fulfilled or not. If not,
> > the xfs_ag_resv fields are initialized to zero to indicate there is no
> > reservation.
> 
> ARGH... finobt allocations always pass RESV_METADATA, even if
> !m_inotbt_nores, which means that we always draw from that perag
> reservation, even if we didn't actually make one for the finobt, but
> someone else did!
> 
> This can happen on a reflink+finobt filesystem where there's enough
> space to grant the refcountbt's reservation request but not enough to
> grant the finobt's request (i.e. the !m_inotbt_nores case).  So the
> alloc case is broken too.
> 

Got it. So rather than both using RESV_METADATA, both of those callers
need to consider ->m_inotbt_nores and pass NONE/METADATA appropriately,
essentially because the metadata pool is shared between the finobt and
refcountbt. I'll give that a shot and post a patch after some testing.

Brian

> > From there, the xfs_ag_resv_[alloc|free]_extent() functions
> > effectively no-op the accounting for such block allocations. Since there
> > is no reservation in this case, xfs_inactive_ifree() reverts to the
> > older behavior of reserving a block in the transaction (which starts to
> > smell like this is a bit of a hack to avoid tx res failures in this
> > particular context, since the inode allocation side of things still
> > always reserves blocks for finobt operations despite the reservation
> > :/).
> 
> Yeah.  Smelly.  Sorry about this whole mess. :/
> 
> > So anyways, shouldn't ->alloc_block()/->free_block() be consistent in
> > unconditionally tagging the allocation as RESV_METADATA? Am I missing
> > something?
> > 
> > Brian
> > 
> > > Brian
> > > 
> > > > --D
> > > > 
> > > > >  }
> > > > >  
> > > > >  STATIC int
> > > > > @@ -380,7 +397,7 @@ static const struct xfs_btree_ops xfs_finobt_ops = {
> > > > >  	.dup_cursor		= xfs_inobt_dup_cursor,
> > > > >  	.set_root		= xfs_finobt_set_root,
> > > > >  	.alloc_block		= xfs_finobt_alloc_block,
> > > > > -	.free_block		= xfs_inobt_free_block,
> > > > > +	.free_block		= xfs_finobt_free_block,
> > > > >  	.get_minrecs		= xfs_inobt_get_minrecs,
> > > > >  	.get_maxrecs		= xfs_inobt_get_maxrecs,
> > > > >  	.init_key_from_rec	= xfs_inobt_init_key_from_rec,
> > > > > -- 
> > > > > 2.13.6
> > > > > 
> > > > > --
> > > > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > > > > the body of a message to majordomo@vger.kernel.org
> > > > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html

      reply	other threads:[~2018-01-10 20:43 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-09 18:35 [PATCH] xfs: don't leak perag metadata reservation on finobt block free Brian Foster
2018-01-09 20:16 ` Darrick J. Wong
2018-01-09 21:42   ` Brian Foster
2018-01-10 12:06     ` Brian Foster
2018-01-10 19:17       ` Darrick J. Wong
2018-01-10 20:33         ` Brian Foster [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180110203310.GD17232@bfoster.bfoster \
    --to=bfoster@redhat.com \
    --cc=darrick.wong@oracle.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox