linux-xfs.vger.kernel.org archive mirror
From: Brian Foster <bfoster@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 28/30] xfs: rework xfs_iflush_cluster() dirty inode iteration
Date: Mon, 15 Jun 2020 10:21:55 -0400
Message-ID: <20200615142155.GA12452@bfoster>
In-Reply-To: <20200615010152.GT2040@dread.disaster.area>

On Mon, Jun 15, 2020 at 11:01:52AM +1000, Dave Chinner wrote:
> On Thu, Jun 11, 2020 at 09:56:18AM -0400, Brian Foster wrote:
> > On Thu, Jun 11, 2020 at 09:40:08AM +1000, Dave Chinner wrote:
> > > On Wed, Jun 10, 2020 at 09:06:28AM -0400, Brian Foster wrote:
> > > > On Wed, Jun 10, 2020 at 08:01:39AM +1000, Dave Chinner wrote:
> > > > > On Tue, Jun 09, 2020 at 09:11:55AM -0400, Brian Foster wrote:
> > > > > > On Thu, Jun 04, 2020 at 05:46:04PM +1000, Dave Chinner wrote:
> > > > > > > -		 * check is not sufficient.
> > > > > > > +		 * If we are shut down, unpin and abort the inode now as there
> > > > > > > +		 * is no point in flushing it to the buffer just to get an IO
> > > > > > > +		 * completion to abort the buffer and remove it from the AIL.
> > > > > > >  		 */
> > > > > > > -		if (!cip->i_ino) {
> > > > > > > -			xfs_ifunlock(cip);
> > > > > > > -			xfs_iunlock(cip, XFS_ILOCK_SHARED);
> > > > > > > +		if (XFS_FORCED_SHUTDOWN(mp)) {
> > > > > > > +			xfs_iunpin_wait(ip);
> > > > > > 
> > > > > > Note that we have an unlocked check above that skips pinned inodes.
> > > > > 
> > > > > Right, but we could be racing with a transaction commit that pinned
> > > > > the inode and a shutdown. As the comment says: it's a quick and
> > > > > dirty check to avoid trying to get locks when we know that it is
> > > > > unlikely we can flush the inode without blocking. We still have to
> > > > > recheck that state once we have the ILOCK....
> > > > > 
> > > > 
> > > > Right, but that means we can just as easily skip the shutdown processing
> > > > (which waits for unpin) if a particular inode is pinned. So which is
> > > > supposed to happen in the shutdown case?
> > > >
> > > > ISTM that either could happen. As a result this kind of looks like
> > > > random logic to me.
> > > 
> > > Yes, shutdown is racy, so it could be either. However, I'm not
> > > changing the shutdown logic or handling here. If the shutdown race
> > > could happen before this patchset (and it can), it can still happen
> > > after the patchset, and this patchset does not change the way we
> > > handle the shutdown race at all.
> > > 
> > > IOWs, while this shutdown logic may appear "random", that's not a
> > > result of this patchset - it is a result of design decisions made in
> > > the last major shutdown rework/cleanup that required checks to be
> > > added to places that could hang waiting for an event that would
> > > never occur because shutdown state prevented it from occurring.
> > 
> > It's not so much the shutdown check that I find random as much as how it
> > intends to handle pinned inodes.
> 
> I can change that, but that's how shutdown was handled by similar
> code in the past, so it seemed appropriate here because this code
> was hanging on shutdowns during development.
> 

Ok.

> > > There's already enough complexity in this patchset that adding
> > > shutdown logic changes is just too much to ask for.  If we want to
> > > change how various shutdown logics work, let's do it as a separate
> > > set of changes so all the subtle bugs that result from the changes
> > > bisect to the isolated shutdown logic changes...
> > > 
> > 
> > The fact that shutdown is racy is just background context. My point is
> > that this patch appears to introduce special shutdown handling for a
> > condition where it 1.) didn't previously exist and 2.) doesn't appear to
> > be necessary.
> 
> The random shutdown failures I kept seeing in shutdown tests say
> otherwise.
> 

Then can we please focus on the issue? My initial feedback was around
the pin state logic and that made me question whether the broader
shutdown logic was spurious. The response to that was that there's too
much complexity in this series to change shutdown logic, etc., and that
suggested I should justify the feedback with this being shutdown code
and all. Now we're debating whether I want to make some kind of
architectural shutdown rule (I don't, and the reasoning for my question
doesn't invalidate the question itself), while I still have no idea what
issue this code actually fixes.

Can you elaborate on the problem, please? I realize that specifics may
be lost, but what in general about the failure suggested placing this
shutdown check in the cluster flush path? Does it concern you at all
that since shutdown is racy as such, this is possibly suppressing a new
(or old) bug and making it harder to reproduce and diagnose? It's also
quite possible that this was a latent issue exposed by removing the
buffer mapping from this path. In that case it probably does make more
sense to keep the check in this patch, but if that's the case I also
might be willing to spend some time digging into it to try and address
it sooner rather than have it fall by the wayside if you'd be willing to
provide some useful information...

Brian

> > The current push/flush code only incorporates a shutdown check
> > indirectly via mapping the buffer, which simulates an I/O failure and
> > causes us to abort the flush (and shutdown if the I/O failure happened
> > for some other reason). If the shutdown happened sometime after we
> > acquired the buffer, then there's no real impact on this code path. We
> > flush the inode(s) and return success. The shutdown will be handled
> > appropriately when xfsaild attempts to submit the buffer.
> 
> The long-standing rule of thumb is "shutdown == abort in-progress IO
> immediately", which is what I followed here when it became apparent
> there was some kind of subtle shutdown issue occurring. That made
> the shutdown problem go away.
> 
> It may be that changes I've been making to other parts of this
> writeback code make the shutdown check here unnecessary. My testing
> said otherwise, but maybe that's all been cleared up. Hence if the
> shutdown check is truly unnecessary, let's clean it up in a future
> patchset where that assertion can be bisected down cleanly.  I
> needed this to get fstests to pass, and for this patchset which is
> entirely unrelated to shutdown architecture, that's all the
> justification that should be necessary.
> 
> > The new code no longer maps the buffer because that is done much
> > earlier, but for some reason incorporates a new check to abort the flush
> > if the fs is already shutdown. The problem I have with this is that
> > these checks tend to be brittle, untested and a maintenance burden.
> 
> Shutdown has *always* been brittle. You've seen it go from being
> completely fucking awful to actually being somewhat reliable because
> your experience with XFS roughly coincides with when we first added
> substantial shutdown validation to fstests.  We had a huge mountain
> of technical debt around shutdown, but the importance of addressing
> it has historically been pretty low because -shutdown is extremely
> rare- in production systems.
> 
> > As
> > such, I don't think we should ever add new shutdown checks for cases
> > that aren't required for functional correctness.
> 
> I think the burden of proof is the wrong way around for the current
> shutdown architecture. If you want to make a rule like this, you
> need to define how the shutdown architecture is going to allow
> this sort of rule to be applied without placing unreasonable demands
> on the patch author.
> 
> > That way we hopefully
> > move to a state where we have the minimum number of shutdown checks with
> > broadest coverage to ensure everything unwinds correctly, but don't have
> > to constantly battle with insufficiently tested logic in obscure
> > contexts that silently break as surrounding code changes over time and
> > lead to random fstests hangs and shutdown logic cleanouts every couple
> > of years.
> 
> Yes, I think this is an admirable goal, but I think you've got how
> to get there completely backward.  First work out the architecture
> and logic that allows us to remove/avoid "randomly" placed checks,
> then work towards cleaning up the code. We don't get there by saying
> "no new checks!" and then ignoring the underlying problems those
> checks are trying to avoid.
> 
> If you can come up with a shutdown mechanism that:
> 
> 	a) prevents all IO the moment a shutdown is triggered, and
> 	b) avoids shutdown detection ordering hangs between
> 	   different objects and subsystems,
> 
> then you have grounds for saying "nobody should need to add new
> shutdown checks". But right now, that's not even close to being the
> reality. Lots more cleanup and rework of the shutdown code needs to
> be done before we get anywhere near this...
> 
> > So my question for any newly added shutdown check is: what problem does
> > this check solve?
> 
> Repeated fstests hangs on various shutdown tests while developing
> this code.
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 
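
The pattern debated at the top of this message (a quick unlocked pinned
check, then a recheck of pin and shutdown state once the lock is held) can
be sketched in userspace C. This is only a minimal sketch under stated
assumptions, not the XFS code: struct demo_mount, struct demo_inode,
demo_flush_one() and the DEMO_* results are hypothetical names, an
atomic_flag trylock stands in for the ILOCK, and the unpin wait done by the
patch on shutdown is not modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are not the XFS structures. */
struct demo_mount {
	atomic_bool	shutdown;	/* set once on forced shutdown */
};

struct demo_inode {
	atomic_flag	lock;		/* plays the role of the ILOCK (trylock only) */
	atomic_int	pincount;	/* > 0 while a commit has the inode pinned */
	bool		dirty;		/* protected by lock */
};

enum demo_result { DEMO_SKIPPED, DEMO_ABORTED, DEMO_FLUSHED };

static void demo_mount_init(struct demo_mount *mp, bool shutdown)
{
	atomic_init(&mp->shutdown, shutdown);
}

static void demo_inode_init(struct demo_inode *ip, int pincount, bool dirty)
{
	atomic_flag_clear(&ip->lock);
	atomic_init(&ip->pincount, pincount);
	ip->dirty = dirty;
}

static enum demo_result
demo_flush_one(struct demo_mount *mp, struct demo_inode *ip)
{
	enum demo_result ret;

	assert(mp && ip);

	/* Quick unlocked check: racy by design, it only avoids lock traffic. */
	if (atomic_load(&ip->pincount) > 0)
		return DEMO_SKIPPED;

	/* Trylock: test_and_set returns true if the flag was already held. */
	if (atomic_flag_test_and_set(&ip->lock))
		return DEMO_SKIPPED;

	/* State may have changed since the unlocked check, so recheck here. */
	if (atomic_load(&mp->shutdown)) {
		ip->dirty = false;	/* abort: drop the dirty state, issue no IO */
		ret = DEMO_ABORTED;
	} else if (atomic_load(&ip->pincount) > 0 || !ip->dirty) {
		ret = DEMO_SKIPPED;
	} else {
		ip->dirty = false;	/* "flush" the inode */
		ret = DEMO_FLUSHED;
	}

	atomic_flag_clear(&ip->lock);
	return ret;
}
```

In this sketch an inode that is already pinned when the mount is shut down
is skipped by the unlocked check and never reaches the shutdown branch,
while an unpinned one is aborted. Which of the two happens depends on when
the pin is observed, which is the "either could happen" behaviour
questioned above.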

