From: Brian Foster <bfoster@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH RFC 2/4] xfs: defer agfl block frees when dfops is available
Date: Fri, 8 Dec 2017 09:16:30 -0500
Message-ID: <20171208141629.GB55826@bfoster.bfoster>
In-Reply-To: <20171207224126.GJ4094@dastard>
On Fri, Dec 08, 2017 at 09:41:26AM +1100, Dave Chinner wrote:
> On Thu, Dec 07, 2017 at 01:58:08PM -0500, Brian Foster wrote:
> > The AGFL fixup code executes before every block allocation/free and
> > rectifies the AGFL based on the current, dynamic allocation
> > requirements of the fs. The AGFL must hold a minimum number of
> > blocks to satisfy a worst case split of the free space btrees caused
> > by the impending allocation operation. The AGFL is also updated to
> > maintain the implicit requirement for a minimum number of free slots
> > to satisfy a worst case join of the free space btrees.
> >
> > Since the AGFL caches individual blocks, AGFL reduction typically
> > involves multiple, single block frees. We've had reports of
> > transaction overrun problems during certain workloads that boil down
> > to AGFL reduction freeing multiple blocks and consuming more space
> > in the log than was reserved for the transaction.
> >
> > Since the objective of freeing AGFL blocks is to ensure free AGFL
> > slots are available for the upcoming allocation, one way to
> > address this problem is to release surplus blocks from the AGFL
> > immediately but defer the free of those blocks (similar to how
> > file-mapped blocks are unmapped from the file in one transaction and
> > freed via a deferred operation) until the transaction is rolled.
> > This turns AGFL reduction into an operation with predictable log
> > reservation consumption.
> >
> > Add the capability to defer AGFL block frees when a deferred ops
> > list is handed to the AGFL fixup code. Deferring AGFL frees is a
> > conditional behavior based on whether the caller has populated the
> > new dfops field of the xfs_alloc_arg structure. A bit of
> > customization is required to handle deferred completion processing
> > because AGFL blocks are accounted against a separate reservation
> > pool and AGFL blocks are not inserted into the extent busy list when freed
> > (they are inserted when used and released back to the AGFL). Reuse
> > the majority of the existing deferred extent free infrastructure and
> > customize it appropriately to handle AGFL blocks.
>
> Ok, so it uses the EFI/EFD to make sure that the block freeing is
> logged and replayed. So my question is:
>
> > +/*
> > + * AGFL blocks are accounted differently in the reserve pools and are not
> > + * inserted into the busy extent list.
> > + */
> > +STATIC int
> > +xfs_agfl_free_finish_item(
> > + struct xfs_trans *tp,
> > + struct xfs_defer_ops *dop,
> > + struct list_head *item,
> > + void *done_item,
> > + void **state)
> > +{
>
> How does this function get called by log recovery when processing
> the EFI as there is no flag in the EFI that says this was an AGFL
> block?
>
It doesn't...
> That said, I haven't traced through whether this matters or not,
> but I suspect it does because freelist frees use XFS_AG_RESV_AGFL
> and that avoids accounting the free to the superblock counters
> because the block is already accounted as free space....
>
I don't think it matters. I tested log recovery specifically to answer
this question, to see whether the traditional EFI recovery path would
disrupt accounting, and I didn't reproduce any problems (well, except
for that rmap record cleanup failure thing).

However, I still need to trace through and understand why that is, to
know for sure that there aren't any problems lurking here (and if not, I
should probably document it). I suspect the reason is that the
differences between how AGFL and regular blocks are handled here only
affect the in-core state of the AG reservation pools, which are all
reinitialized from zero on a subsequent mount based on the on-disk state
(... but good point, and I will try to confirm that before posting a
non-RFC variant).
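
To make that suspicion concrete, here is a toy model of it. This is a
sketch only: struct toy_ag and the toy_*() helpers are made up for
illustration and are not kernel code, and the model simply assumes that
the two free paths differ only in in-core accounting and that mount
rebuilds the in-core value from on-disk state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy AG: two "on-disk" counters and one in-core counter. */
struct toy_ag {
	uint32_t ondisk_freeblks;	/* free blocks tracked outside the AGFL */
	uint32_t ondisk_flcount;	/* blocks currently held on the AGFL */
	uint64_t incore_free;		/* in-core only, rebuilt at mount */
};

/* Assumption: mount derives the in-core value purely from on-disk state. */
static void toy_mount(struct toy_ag *ag)
{
	ag->incore_free = (uint64_t)ag->ondisk_freeblks + ag->ondisk_flcount;
}

/* First transaction: pull one surplus block off the AGFL. */
static void toy_agfl_release(struct toy_ag *ag)
{
	assert(ag->ondisk_flcount > 0);
	ag->ondisk_flcount--;
}

/*
 * Deferred step: actually free the block. Both paths make the same
 * on-disk change. Only the non-AGFL path bumps the in-core counter,
 * because an AGFL block was already counted as free space.
 */
static void toy_free_block(struct toy_ag *ag, bool agfl)
{
	ag->ondisk_freeblks++;
	if (!agfl)
		ag->incore_free++;
}
```

In this model, finishing the free through the non-AGFL path (as EFI
recovery would) leaves the in-core counter off by one until the next
toy_mount(), at which point both paths agree again. Whether the real
recovery path behaves this way is exactly what still needs tracing.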
Brian
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com