From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 2/9] xfs: AIL doesn't need manual pushing
Date: Fri, 26 Aug 2022 08:46:26 -0700
Message-ID: <Ywjq0sZx5QwJAFai@magnolia>
In-Reply-To: <20220823015156.GM3600936@dread.disaster.area>

On Tue, Aug 23, 2022 at 11:51:56AM +1000, Dave Chinner wrote:
> On Mon, Aug 22, 2022 at 10:08:04AM -0700, Darrick J. Wong wrote:
> > On Wed, Aug 10, 2022 at 09:03:46AM +1000, Dave Chinner wrote:
> > > From: Dave Chinner <dchinner@redhat.com>
> > >
> > > We have a mechanism that checks the amount of log space remaining
> > > available every time we make a transaction reservation. If the
> > > amount of space is below a threshold (25% free) we push on the AIL
> > > to tell it to do more work. To do this, we end up calculating the
> > > LSN that the AIL needs to push to on every reservation and updating
> > > the push target for the AIL with that new target LSN.
> > >
> > > This is silly and expensive. The AIL is perfectly capable of
> > > calculating the push target itself, and it will always be running
> > > when the AIL contains objects.
> > >
> > > Modify the AIL to calculate its 25% push target before it starts a
> > > push using the same reserve grant head based calculation as is
> > > currently used, and remove all the places where we ask the AIL to
> > > push to a new 25% free target.
> .....
> > > @@ -414,6 +395,57 @@ xfsaild_push_item(
> > > 	return lip->li_ops->iop_push(lip, &ailp->ail_buf_list);
> > > }
> > >
> > > +/*
> > > + * Compute the LSN that we'd need to push the log tail towards in order to have
> > > + * at least 25% of the log space free. If the log free space already meets this
> > > + * threshold, this function returns NULLCOMMITLSN.
> > > + */
> > > +xfs_lsn_t
> > > +__xfs_ail_push_target(
> > > +	struct xfs_ail *ailp)
> > > +{
> > > +	struct xlog *log = ailp->ail_log;
> > > +	xfs_lsn_t threshold_lsn = 0;
> > > +	xfs_lsn_t last_sync_lsn;
> > > +	int free_blocks;
> > > +	int free_bytes;
> > > +	int threshold_block;
> > > +	int threshold_cycle;
> > > +	int free_threshold;
> > > +
> > > +	free_bytes = xlog_space_left(log, &log->l_reserve_head.grant);
> > > +	free_blocks = BTOBBT(free_bytes);
> > > +
> > > +	/*
> > > +	 * Set the threshold for the minimum number of free blocks in the
> > > +	 * log to the maximum of what the caller needs, one quarter of the
> > > +	 * log, and 256 blocks.
> > > +	 */
> > > +	free_threshold = log->l_logBBsize >> 2;
> > > +	if (free_blocks >= free_threshold)
> >
> > What happened to the "free_threshold = max(free_threshold, 256);" from
> > the old code? Or is the documented 256 block minimum no longer
> > necessary?
>
> Oh, I must have dropped the comment change when fixing the last
> round of rebase conflicts. The minimum of 256 blocks is largely
> useless because even the smallest logs we create on tiny
> filesystems are around 1000 filesystem blocks in size. So a minimum
> free threshold of 128kB (256 BBs) is always going to be less than
> one quarter the size of the journal....

<nod> And even more pointless now that we've effectively mandated 64M
logs for all new filesystems.

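To put rough numbers on that, here is a small sketch of the arithmetic. It assumes 4 kB filesystem blocks (the thread does not state a block size) and 512-byte basic blocks; the helper names are made up for illustration and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_BBSHIFT 9	/* assumption: one basic block (BB) is 512 bytes */

/* Bytes to basic blocks, truncating, in the spirit of BTOBBT(). */
static int64_t toy_bytes_to_bbs(int64_t bytes)
{
	assert(bytes >= 0);
	return bytes >> TOY_BBSHIFT;
}

/* The threshold used in the patch: one quarter of the log, in BBs. */
static int64_t toy_quarter_log_bbs(int64_t log_bbs)
{
	return log_bbs >> 2;
}

/* The old behaviour being asked about: never go below 256 BBs (128 kB). */
static int64_t toy_old_threshold_bbs(int64_t log_bbs)
{
	int64_t quarter = toy_quarter_log_bbs(log_bbs);

	return quarter > 256 ? quarter : 256;
}
```

With a log of 1000 4 kB blocks (8000 BBs) the quarter-log threshold is 2000 BBs, and with a 64M log it is 32768 BBs, so under these assumptions the 256 BB floor never changes the result.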
>
> > > @@ -454,21 +486,24 @@ xfsaild_push(
> > > 	 * capture updates that occur after the sync push waiter has gone to
> > > 	 * sleep.
> > > 	 */
> > > -	if (waitqueue_active(&ailp->ail_empty)) {
> > > +	if (test_bit(XFS_AIL_OPSTATE_PUSH_ALL, &ailp->ail_opstate) ||
> > > +	    waitqueue_active(&ailp->ail_empty)) {
> > > 		lip = xfs_ail_max(ailp);
> > > 		if (lip)
> > > 			target = lip->li_lsn;
> > > +		else
> > > +			clear_bit(XFS_AIL_OPSTATE_PUSH_ALL, &ailp->ail_opstate);
> > > 	} else {
> > > -		/* barrier matches the ail_target update in xfs_ail_push() */
> > > -		smp_rmb();
> > > -		target = ailp->ail_target;
> > > -		ailp->ail_target_prev = target;
> > > +		target = __xfs_ail_push_target(ailp);
> >
> > Hmm. So now the AIL decides how far it ought to push itself: until 25%
> > of the log is free if nobody's watching, or all the way to the end if
> > there are xfs_ail_push_all_sync waiters or OPSTATE_PUSH_ALL is set
> > because someone needs grant space?
>
> Kind of. What the target does is determine if the AIL needs to do
> any work before it goes back to sleep. If we haven't run out of
> reservation space or memory (or some other push all trigger), it
> will simply go back to sleep for a while if there is more than 25%
> of the journal space free without doing anything.
>
> If there are items in the AIL at a lower LSN than the target, it
> will try to push up to the target or to the point of getting stuck
> before going back to sleep and trying again soon after.
>
> If the OPSTATE_PUSH_ALL flag is set, it will keep updating the
> push target until the log is empty every time it loops. This is
> slightly different behaviour to the existing "push all" code, which
> selects an LSN to push towards and doesn't try to push beyond that
> even if new items are inserted into the AIL after the push_all has
> been triggered.

<nod> Ok, that's what I thought I was seeing -- the target is now a
little more dynamic, which means a "push all" will be more aggressive,
with perhaps fewer latency spikes later.

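A minimal user-space sketch of that target selection, as I read the hunk above. The struct, the field names and TOY_NULL_LSN are hypothetical stand-ins, and returning TOY_NULL_LSN for an empty AIL is an assumption, since the quoted hunk does not show what the target is in that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef int64_t toy_lsn_t;
#define TOY_NULL_LSN ((toy_lsn_t)-1)	/* stand-in for NULLCOMMITLSN */

struct toy_ail_state {
	bool push_all;			/* models XFS_AIL_OPSTATE_PUSH_ALL */
	bool sync_waiters;		/* models waitqueue_active(&ail_empty) */
	toy_lsn_t max_lsn;		/* highest LSN in the AIL, TOY_NULL_LSN if empty */
	toy_lsn_t quarter_target;	/* what the 25% calculation would return */
};

/* Re-evaluated on every pass, so items inserted later raise the target. */
static toy_lsn_t toy_pick_target(struct toy_ail_state *ail)
{
	assert(ail);

	if (ail->push_all || ail->sync_waiters) {
		if (ail->max_lsn != TOY_NULL_LSN)
			return ail->max_lsn;	/* push everything present */
		ail->push_all = false;		/* AIL drained, drop the flag */
		return TOY_NULL_LSN;		/* assumption: nothing to push */
	}
	return ail->quarter_target;		/* background 25% behaviour */
}
```

Because max_lsn is sampled on each call, a "push all" in this model keeps chasing new insertions until the AIL is empty, which is the more dynamic behaviour described above.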
> However, because push_all_sync() effectively waits until the AIL is
> empty (i.e. it keeps looping, updating the push target until the AIL
> is empty), and async pushes never wait for it to complete, there is no
> practical difference between the current implementation and this
> one.
>
> > So the xlog*grant* callers now merely wake up the AIL and let push
> > it push whatever it will, instead of telling the AIL how far to push itself?
>
> Yes.
>
> > Does that mean that those grant callers might have to wait until the AIL
> > empties itself?
>
> No. The moment the log tail moves forward because of a removal from
> the tail of the AIL via xfs_ail_update_finish(), we call
> xlog_assign_tail_lsn_locked() to move the l_tail_lsn forwards and
> make grant space available, then we call xfs_log_space_wake() to
> wake up any grant waiters that are waiting on the space to be made
> available.

Aha! There's the missing piece, thank you.

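For my own notes, a toy model of that wakeup path. Everything here is hypothetical scaffolding: the wakeups counter stands in for xfs_log_space_wake(), and using head_lsn as the tail once the AIL is empty is a simplifying assumption, not something stated in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t toy_lsn_t;

/* Toy AIL: LSNs sorted ascending, so lsns[0] pins the log tail. */
struct toy_tail {
	toy_lsn_t lsns[8];
	size_t count;
	toy_lsn_t tail_lsn;	/* models l_tail_lsn */
	toy_lsn_t head_lsn;	/* assumed tail value once the AIL is empty */
	int wakeups;		/* counts modelled grant-waiter wakeups */
};

/* Remove the item at idx; wake grant waiters only if the tail moved. */
static bool toy_ail_remove(struct toy_tail *ail, size_t idx)
{
	toy_lsn_t new_tail;

	assert(idx < ail->count);
	for (size_t i = idx + 1; i < ail->count; i++)
		ail->lsns[i - 1] = ail->lsns[i];
	ail->count--;

	new_tail = ail->count ? ail->lsns[0] : ail->head_lsn;
	if (new_tail == ail->tail_lsn)
		return false;		/* tail still pinned, nobody woken */

	ail->tail_lsn = new_tail;	/* tail moves forward, space is freed */
	ail->wakeups++;			/* waiters recheck available grant space */
	return true;
}
```

The point of the model is that waiters are woken as soon as the tail item goes away, not when the AIL is finally empty.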
> The reason for using the "push all" when grant space runs out is
> that we can run out of grant space when there is more than 25% of
> the log free. Small logs are notorious for this, and we have a hack
> in the log callback code (xlog_state_set_callback(), where we push
> the AIL because the *head* moved) to ensure that we kick the AIL
> when we consume space in the log, because that can push us over the
> "less than 25% available" threshold that starts tail pushing back up
> again.

...and thank you for the reminder of why that was there, because I was
puzzling over what that (now removed) line of code was doing.

> Hence when we run out of grant space and are going to sleep, we have
> to consider that the grant space may be consuming almost all the log
> space and there is almost nothing in the AIL. In this situation, the
> AIL pins the tail and moving the tail forwards is the only way the
> grant space will become available, so we have to force the AIL to push
> everything to guarantee grant space will eventually be returned.
> Hence triggering a "push all" just before sleeping removes all the
> nasty corner cases we have in other parts of the code that work
> around the "we didn't ask the AIL to push enough to free grant
> space" condition that leads to log space hangs...

<nod> I'll resume reading now.

--D

> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com