From: Brian Foster <bfoster@redhat.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH] xfs: don't leak perag metadata reservation on finobt block free
Date: Wed, 10 Jan 2018 15:33:11 -0500
Message-ID: <20180110203310.GD17232@bfoster.bfoster>
In-Reply-To: <20180110191704.GS5602@magnolia>

On Wed, Jan 10, 2018 at 11:17:04AM -0800, Darrick J. Wong wrote:
> On Wed, Jan 10, 2018 at 07:06:16AM -0500, Brian Foster wrote:
> > On Tue, Jan 09, 2018 at 04:42:42PM -0500, Brian Foster wrote:
> > > On Tue, Jan 09, 2018 at 12:16:19PM -0800, Darrick J. Wong wrote:
> > > > On Tue, Jan 09, 2018 at 01:35:58PM -0500, Brian Foster wrote:
> > > > > We started using the perag metadata reservation for free inode btree
> > > > > blocks in commit 76d771b4cbe33 ("xfs: use per-AG reservations for
> > > > > the finobt"). While this change consumes metadata res. for finobt
> > > > > block allocations, we still don't replenish the res. pool when
> > > > > finobt blocks are freed. This leads to leaking reservation as finobt
> > > > > blocks are allocated and freed over time, which in turn can lead to
> > > > > > overuse of blocks that should be protected by the reservation.
> > > > >
> > > > > Update the finobt free block path to specify the metadata
> > > > > reservation type as done in the allocation path.
> > > > >
> > > > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > > > > ---
> > > > > fs/xfs/libxfs/xfs_ialloc_btree.c | 25 +++++++++++++++++++++----
> > > > > 1 file changed, 21 insertions(+), 4 deletions(-)
> > > > >
> > > > > diff --git a/fs/xfs/libxfs/xfs_ialloc_btree.c b/fs/xfs/libxfs/xfs_ialloc_btree.c
> > > > > index 47f44d624cb1..18fe6b3a7802 100644
> > > > > --- a/fs/xfs/libxfs/xfs_ialloc_btree.c
> > > > > +++ b/fs/xfs/libxfs/xfs_ialloc_btree.c
> > > > > @@ -146,16 +146,33 @@ xfs_finobt_alloc_block(
> > > > > }
> > > > >
> > > > > >  STATIC int
> > > > > > -xfs_inobt_free_block(
> > > > > > +__xfs_inobt_free_block(
> > > > > >  	struct xfs_btree_cur *cur,
> > > > > > -	struct xfs_buf *bp)
> > > > > > +	struct xfs_buf *bp,
> > > > > > +	enum xfs_ag_resv_type resv)
> > > > > >  {
> > > > > >  	struct xfs_owner_info oinfo;
> > > > > >
> > > > > >  	xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_INOBT);
> > > > > >  	return xfs_free_extent(cur->bc_tp,
> > > > > >  			XFS_DADDR_TO_FSB(cur->bc_mp, XFS_BUF_ADDR(bp)), 1,
> > > > > > -			&oinfo, XFS_AG_RESV_NONE);
> > > > > > +			&oinfo, resv);
> > > > > > +}
> > > > > > +
> > > > > > +STATIC int
> > > > > > +xfs_inobt_free_block(
> > > > > > +	struct xfs_btree_cur *cur,
> > > > > > +	struct xfs_buf *bp)
> > > > > > +{
> > > > > > +	return __xfs_inobt_free_block(cur, bp, XFS_AG_RESV_NONE);
> > > > > > +}
> > > > > > +
> > > > > > +STATIC int
> > > > > > +xfs_finobt_free_block(
> > > > > > +	struct xfs_btree_cur *cur,
> > > > > > +	struct xfs_buf *bp)
> > > > > > +{
> > > > > > +	return __xfs_inobt_free_block(cur, bp, XFS_AG_RESV_METADATA);
> > > >
> > > > cur->bc_mp->m_inotbt_nores ? XFS_AG_RESV_NONE : XFS_AG_RESV_METADATA
> > > >
> > > > Since we don't use the finobt reservation if there wasn't room.
> > > >
> > >
> > > Yep.. will send a v2. Thanks!
> > >
> >
> > Wait, I don't think that is right.. ->m_inotbt_nores is set at perag
> > init time based on whether the reservation is fulfilled or not. If not,
> > the xfs_ag_resv fields are initialized to zero to indicate there is no
> > reservation.
>
> ARGH... finobt allocations always pass RESV_METADATA, even if
> !m_inotbt_nores, which means that we always draw from that perag
> reservation, even if we didn't actually make one for the finobt, but
> someone else did!
>
> This can happen on a reflink+finobt filesystem where there's enough
> space to grant the refcountbt's reservation request but not enough to
> grant the finobt's request (i.e. the !m_inotbt_nores case). So the
> alloc case is broken too.
>
Got it. So rather than both using RESV_METADATA, both of those callers
need to consider ->m_inotbt_nores and pass NONE/METADATA appropriately,
essentially because the metadata pool is shared between the finobt and
refcountbt. I'll give that a shot and post a patch after some testing.

Brian
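
For illustration only, here is a toy userspace sketch of the accounting
discussed above. It is not the follow-up patch and not kernel code: the
names ag_model, finobt_resv_type(), model_alloc_block() and
model_free_block() are hypothetical stand-ins, and the pool logic is a
deliberate simplification of what the xfs_ag_resv_[alloc|free]_extent()
functions do. It assumes a single shared metadata pool per AG and a
flag that plays the role of ->m_inotbt_nores.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for XFS_AG_RESV_NONE / XFS_AG_RESV_METADATA. */
enum resv_type { RESV_NONE, RESV_METADATA };

/* Toy model of one AG; not the kernel's struct xfs_perag. */
struct ag_model {
	bool     finobt_nores;  /* models ->m_inotbt_nores: true means the
				 * finobt reservation was NOT granted */
	uint32_t meta_asked;    /* size of the shared metadata pool */
	uint32_t meta_reserved; /* blocks still held in that pool */
	uint32_t free_blocks;   /* unreserved free blocks in the AG */
};

/*
 * Pick the reservation type for a finobt block. Both the alloc and the
 * free path would call this, so the two always agree.
 */
static enum resv_type finobt_resv_type(const struct ag_model *ag)
{
	return ag->finobt_nores ? RESV_NONE : RESV_METADATA;
}

/* Allocate one block, drawing from the pool only for RESV_METADATA. */
static bool model_alloc_block(struct ag_model *ag, enum resv_type type)
{
	if (type == RESV_METADATA && ag->meta_reserved > 0) {
		ag->meta_reserved--;
		return true;
	}
	if (ag->free_blocks == 0)
		return false;
	ag->free_blocks--;
	return true;
}

/* Free one block, refilling the pool only for RESV_METADATA. */
static void model_free_block(struct ag_model *ag, enum resv_type type)
{
	if (type == RESV_METADATA && ag->meta_reserved < ag->meta_asked)
		ag->meta_reserved++;
	else
		ag->free_blocks++;
}
```

In this model, allocating with RESV_METADATA and then freeing with
RESV_NONE leaves meta_reserved one block short each time around, which
is the leak the original patch describes. Routing both paths through
finobt_resv_type() keeps them paired, and with finobt_nores set neither
path touches the pool that the refcountbt may have reserved.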
> > From there, the xfs_ag_resv_[alloc|free]_extent() functions
> > effectively no-op the accounting for such block allocations. Since there
> > is no reservation in this case, xfs_inactive_ifree() reverts to the
> > older behavior of reserving a block in the transaction (which starts to
> > smell like this is a bit of a hack to avoid tx res failures in this
> > particular context, since the inode allocation side of things still
> > always reserves blocks for finobt operations despite the reservation
> > :/).
>
> Yeah. Smelly. Sorry about this whole mess. :/
>
> > So anyways, shouldn't ->alloc_block()/->free_block() be consistent in
> > unconditionally tagging the allocation as RESV_METADATA? Am I missing
> > something?
> >
> > Brian
> >
> > > Brian
> > >
> > > > --D
> > > >
> > > > > }
> > > > >
> > > > > STATIC int
> > > > > @@ -380,7 +397,7 @@ static const struct xfs_btree_ops xfs_finobt_ops = {
> > > > > >  	.dup_cursor = xfs_inobt_dup_cursor,
> > > > > >  	.set_root = xfs_finobt_set_root,
> > > > > >  	.alloc_block = xfs_finobt_alloc_block,
> > > > > > -	.free_block = xfs_inobt_free_block,
> > > > > > +	.free_block = xfs_finobt_free_block,
> > > > > >  	.get_minrecs = xfs_inobt_get_minrecs,
> > > > > >  	.get_maxrecs = xfs_inobt_get_maxrecs,
> > > > > >  	.init_key_from_rec = xfs_inobt_init_key_from_rec,
> > > > > --
> > > > > 2.13.6
> > > > >
> > > > > --
> > > > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > > > > the body of a message to majordomo@vger.kernel.org
> > > > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
Thread overview: 6+ messages
2018-01-09 18:35 [PATCH] xfs: don't leak perag metadata reservation on finobt block free Brian Foster
2018-01-09 20:16 ` Darrick J. Wong
2018-01-09 21:42 ` Brian Foster
2018-01-10 12:06 ` Brian Foster
2018-01-10 19:17 ` Darrick J. Wong
2018-01-10 20:33 ` Brian Foster [this message]