From: Brian Foster <bfoster@redhat.com>
To: linux-xfs@vger.kernel.org
Cc: david@fromorbit.com
Subject: tr_ifree transaction log reservation calculation
Date: Tue, 21 Nov 2017 10:05:57 -0500
Message-ID: <20171121150557.GA5014@bfoster.bfoster>

Hi all,

I'm looking into a bug that appears to reflect the fairly rare
xfs_inactive_ifree() transaction overruns that have been reported in the
past. To summarize, the cause of the overrun appears to be that the
pre-inode-chunk-free agfl fixup ends up freeing a single agfl block that
leads to multiple [cnt|bno]bt joins that repopulate the agfl and cause
several more iterations in xfs_alloc_fix_freelist(). These extra
iterations combine with a couple of other conditions that ultimately result
in consuming most or all of the anticipated cntbt log reservation before
we actually get to freeing the inode chunk:

- left+right contiguous blocks that require 2 cntbt record removals and
  an insert with a new length key
- the overrun is the first transaction in a CIL ctx and thus consumes
  the CIL ticket reservation
- the transaction spans a log buffer and thus requires additional space
  for split region headers

Note that I don't believe the above are problems in themselves. Rather,
they suggest how we probably get away with the higher level problem in
most cases, where this additional "worst case" reservation goes unused.

I ended up looking at tr_ifree while investigating some options to
resolve this problem and am slightly confused by the reservation
calculation. In particular, we do this for the inobt portion of the
operation (i.e., "the inode btree: max depth * blocksize"):

                xfs_calc_buf_res(2 + mp->m_ialloc_blks +
                                 mp->m_in_maxlevels, 0) +

... where it looks to me that we only incorporate the overhead of the
inobt buffers rather than the buffer contents themselves. Is this
expected/appropriate, or should we be passing something like
XFS_FSB_TO_B(mp, 1) there rather than 0? While that change is not
related to the allocation btrees, it does happen to add enough
reservation to the transaction to avoid the overrun. Then again, it
might not technically be appropriate to add that reservation since the
inobt and free space btree updates are separated by a transaction roll.
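
To put rough numbers on the difference, here is a minimal standalone
sketch. It assumes xfs_calc_buf_res() charges a fixed per-buffer logging
overhead plus the size argument for each buffer. The helper names and the
overhead parameter are hypothetical stand-ins for illustration, not the
kernel code:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simplified model of a buffer log reservation (assumption): each logged
 * buffer costs a fixed per-buffer overhead for headers plus the number of
 * content bytes expected to be logged. The overhead is supplied by the
 * caller because its real value is not stated in this mail.
 */
static uint32_t
calc_buf_res_model(uint32_t nbufs, uint32_t content_bytes, uint32_t overhead)
{
	return nbufs * (content_bytes + overhead);
}

/* Buffer count of the inobt portion: 2 + ialloc_blks + in_maxlevels. */
static uint32_t
inobt_nbufs(uint32_t ialloc_blks, uint32_t in_maxlevels)
{
	return 2 + ialloc_blks + in_maxlevels;
}

/*
 * Extra reservation gained by passing one filesystem block as the size
 * argument instead of 0, i.e. reserving contents as well as headers.
 */
static uint32_t
inobt_res_delta(uint32_t ialloc_blks, uint32_t in_maxlevels,
		uint32_t blocksize, uint32_t overhead)
{
	uint32_t nbufs = inobt_nbufs(ialloc_blks, in_maxlevels);

	assert(blocksize > 0);
	return calc_buf_res_model(nbufs, blocksize, overhead) -
	       calc_buf_res_model(nbufs, 0, overhead);
}
```

With purely illustrative values (4 inode chunk blocks, 3 inobt levels, a
4096 byte block and a 128 byte overhead), the model covers 9 buffers and
inobt_res_delta(4, 3, 4096, 128) comes to 9 * 4096 = 36864 bytes. The
delta does not depend on the overhead value.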

That aside, the next best option for dealing with this situation seems
to be to limit the number of agfl fixups that can occur per-transaction.
Yet another option might be to roll the transaction in certain
situations (i.e., let the deferred op handling give us a new transaction
if an agfl fixup resulted in a split/join), but I suspect that could get
more involved as we may want to keep the agf locked across that
sequence, etc.
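
As a rough illustration of the first option, the toy model below caps the
number of agfl fixups performed in one pass and reports whether work is
left over for a later transaction. Everything in it is hypothetical: the
struct, the function names and the refill array (which stands in for
btree joins handing blocks back to the agfl) are assumptions made for
the sketch and do not reflect how xfs_alloc_fix_freelist() is written:

```c
#include <assert.h>

/* Toy free list state: current length and the length we want to reach. */
struct agfl_model {
	unsigned int count;	/* blocks currently on the free list */
	unsigned int target;	/* desired free list length */
};

/*
 * Free surplus blocks one at a time, stopping once the list is at its
 * target or max_fixups frees have been done. refill[i] models how many
 * blocks the i-th free puts back on the list (e.g. via btree joins).
 * Returns the number of fixups performed in this pass.
 */
static unsigned int
fix_freelist_capped(struct agfl_model *fl, const unsigned int *refill,
		    unsigned int nrefill, unsigned int max_fixups)
{
	unsigned int done = 0;

	while (fl->count > fl->target && done < max_fixups) {
		fl->count--;			/* free one surplus block */
		if (done < nrefill)
			fl->count += refill[done]; /* joins repopulate list */
		done++;
	}
	return done;
}

/* Nonzero if the list is still over target and needs another pass. */
static int
needs_more_fixups(const struct agfl_model *fl)
{
	return fl->count > fl->target;
}
```

In this model, a list that starts one block over target and gets two
blocks back from each of the first two frees needs five fixups when
uncapped. With a cap of two, the pass stops early and
needs_more_fixups() tells the caller the remainder has to be handled
elsewhere, which is where the question of holding the agf across
transactions would come in.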

Thoughts? Can anyone shed some light on this reservation? Thanks!

Brian
