From: Brian Foster <bfoster@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 29/30] xfs: factor xfs_iflush_done
Date: Thu, 11 Jun 2020 10:07:09 -0400
Message-ID: <20200611140709.GB56572@bfoster>
In-Reply-To: <20200611001622.GN2040@dread.disaster.area>
On Thu, Jun 11, 2020 at 10:16:22AM +1000, Dave Chinner wrote:
> On Wed, Jun 10, 2020 at 09:08:33AM -0400, Brian Foster wrote:
> > On Wed, Jun 10, 2020 at 08:14:31AM +1000, Dave Chinner wrote:
> > > On Tue, Jun 09, 2020 at 09:12:49AM -0400, Brian Foster wrote:
> > > > On Thu, Jun 04, 2020 at 05:46:05PM +1000, Dave Chinner wrote:
...
> >
> > I'm referring to the fact that we no longer check the lsn of each
> > (flushed) log item attached to the buffer under the ail lock.
>
> That whole loop in xfs_iflush_ail_updates() runs under the AIL
> lock, so it does the right thing for anything that is moved to the
> "ail_updates" list.
>
> If we win the unlocked race (li_lsn does not change) then we move
> the inode to the ail update list and it gets rechecked under the AIL
> lock and does the right thing. If we lose the race (li_lsn changes)
> then the inode has been redirtied and we *don't need to check it
> under the AIL* - all we need to do is leave it attached to the
> buffer.
>
> This is the same as the old code: win the race, need_ail is
> incremented and we recheck under the AIL lock. Lose the race and
> we don't recheck under the AIL because we don't need to. This
> happened less under the old code, because it typically only happened
> with single dirty inodes on a cluster buffer (think directory inode
> under long running large directory modification operations), but
> that race most definitely existed and the code most definitely
> handled it correctly.
>
> Keep in mind that this inode redirtying/AIL repositioning race can
> even occur /after/ we've locked and removed items from the AIL but
> before we've run xfs_iflush_finish(). i.e. we can remove it from the
> AIL but by the time xfs_iflush_finish() runs it's back in the AIL.
>
All of the above would make a nice commit log for an independent patch.
;) Note again that I wasn't suggesting the logic was incorrect...
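(For anyone else following along, here is roughly the recheck being described above - a simplified sketch of the AIL-locked pass in this patch, not the literal xfs_iflush_ail_updates() code; declarations, the XFS_LI_FAILED clearing and the tail lsn update are elided:)

	spin_lock(&ailp->ail_lock);
	list_for_each_entry(lip, &ail_updates, li_bio_list) {
		/*
		 * The unlocked check already filtered this item onto the
		 * ail_updates list; recheck li_lsn now that we hold the
		 * AIL lock.  If it still matches the flush lsn, the item
		 * hasn't been repositioned and can be removed from the
		 * AIL.  If it has changed, the inode was redirtied after
		 * the flush and we leave it where it is.
		 */
		if (INODE_ITEM(lip)->ili_flush_lsn == lip->li_lsn)
			xfs_ail_delete_one(ailp, lip);
	}
	spin_unlock(&ailp->ail_lock);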
> > Note that
> > I am not saying it's necessarily wrong, but rather that IMO it's too
> > subtle a change to silently squash into a refactoring patch.
>
> Except it isn't a change at all. The same subtle issue exists in the
> code before this patch. It's just that this refactoring makes subtle
> race conditions that were previously unknown to reviewers so much
> more obvious they can now see them clearly. That tells me the code
> is much improved by this refactoring, not that there's a problem
> that needs reworking....
>
This patch elevates a bit of code from effectively being an AIL lock
avoidance optimization to essentially per-item filtering logic, without
any explanation beyond facilitating future modifications. Independent of
whether it's correct, this is not purely a refactoring change IMO.
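To spell out the distinction I'm drawing, a rough sketch of both sides
(paraphrased and abbreviated, not the literal hunks on either side):

	/* before: the unlocked lsn check only gates whether we take the AIL lock at all */
	if (iip->ili_flush_lsn == lip->li_lsn ||
	    test_bit(XFS_LI_FAILED, &lip->li_flags))
		need_ail++;
	...
	if (need_ail) {
		/* single pass over the flushed inodes under the AIL lock */
	}

	/* after (this patch): the same check routes every flushed item to a separate list */
	if (iip->ili_flush_lsn == lip->li_lsn ||
	    test_bit(XFS_LI_FAILED, &lip->li_flags))
		list_move_tail(&lip->li_bio_list, &ail_updates);
	else
		list_move_tail(&lip->li_bio_list, &flushed_inodes);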
> > > FWIW, I untangled the function this way because the "track dirty
> > > inodes by ordered buffers" patchset completely removes the AIL stuff
> > > - the ail_updates list and the xfs_iflush_ail_updates() function go
> > > away completely and the rest of the refactoring remains unchanged.
> > > i.e. as the commit messages says, this change makes follow-on
> > > patches much easier to understand...
> > >
> >
> > The general function breakdown seems fine to me. I find the multiple
> > list processing to be a bit overdone, particularly if it doesn't serve a
> > current functional purpose. If the purpose is to support a future patch
> > series, I'd suggest to continue using the existing logic of moving all
> > flushed inodes to a single list and leave the separate list bits to the
> > start of the series where it's useful so it's possible to review with
> > the associated context (or alternatively just defer the entire patch).
>
> That's how I originally did it, and it was a mess. it didn't
> separate cleanly at all, and didn't make future patches much easier
> at all. Hence I don't think reworking the patch just to look
> different gains us anything at this point...
>
I find that hard to believe. This patch splits the buffer list into two
lists, processes the first one, immediately combines it with the second,
and then processes the combined list, which at that point is no different
from the single list the original code constructed. The only reasons I can
see for this kind of churn are either to address some kind of performance
or efficiency issue, or that the separate lists are used by further
changes. The former is not a documented reason, and there's no context for
the latter because it's apparently part of some future series.
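i.e., as far as I can tell the flow in this patch amounts to the following
(paraphrasing the relevant calls, not quoting the hunks exactly):

	/* per item: route to one of two lists based on the unlocked lsn check */
	list_move_tail(&lip->li_bio_list, &ail_updates);	/* or &flushed_inodes */

	/* process list one... */
	xfs_iflush_ail_updates(bp->b_mount->m_ail, &ail_updates);
	/* ...immediately merge it into list two... */
	list_splice_tail(&ail_updates, &flushed_inodes);
	/* ...and process the combined list, same as the old single list */
	xfs_iflush_finish(bp, &flushed_inodes);
	list_splice_tail(&flushed_inodes, &bp->b_li_list);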
TBH, I think this patch should probably be broken down into two or three
independent patches anyway. What's the issue with something like the
appended diff (on top of this patch) in the meantime? If the multiple-list
logic is truly necessary, reintroduce it when it's actually used so it's
reviewable in context.
Brian
--- 8< ---
diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c
index 3894d190ea5b..83580e204560 100644
--- a/fs/xfs/xfs_inode_item.c
+++ b/fs/xfs/xfs_inode_item.c
@@ -718,8 +718,8 @@ xfs_iflush_done(
 	struct xfs_buf		*bp)
 {
 	struct xfs_log_item	*lip, *n;
-	LIST_HEAD(flushed_inodes);
-	LIST_HEAD(ail_updates);
+	int			need_ail = 0;
+	LIST_HEAD(tmp);
 
 	/*
 	 * Pull the attached inodes from the buffer one at a time and take the
@@ -732,25 +732,24 @@ xfs_iflush_done(
 			xfs_iflush_abort(iip->ili_inode);
 			continue;
 		}
+
 		if (!iip->ili_last_fields)
 			continue;
 
-		/* Do an unlocked check for needing the AIL lock. */
+		list_move_tail(&lip->li_bio_list, &tmp);
+
+		/* Do an unlocked check for needing AIL processing */
 		if (iip->ili_flush_lsn == lip->li_lsn ||
 		    test_bit(XFS_LI_FAILED, &lip->li_flags))
-			list_move_tail(&lip->li_bio_list, &ail_updates);
-		else
-			list_move_tail(&lip->li_bio_list, &flushed_inodes);
+			need_ail++;
 	}
 
-	if (!list_empty(&ail_updates)) {
-		xfs_iflush_ail_updates(bp->b_mount->m_ail, &ail_updates);
-		list_splice_tail(&ail_updates, &flushed_inodes);
-	}
+	if (need_ail)
+		xfs_iflush_ail_updates(bp->b_mount->m_ail, &tmp);
 
-	xfs_iflush_finish(bp, &flushed_inodes);
-	if (!list_empty(&flushed_inodes))
-		list_splice_tail(&flushed_inodes, &bp->b_li_list);
+	xfs_iflush_finish(bp, &tmp);
+	if (!list_empty(&tmp))
+		list_splice_tail(&tmp, &bp->b_li_list);
 }
 
 /*