public inbox for linux-xfs@vger.kernel.org
From: Dave Chinner <david@fromorbit.com>
To: linux-xfs@vger.kernel.org
Cc: willy@infradead.org
Subject: xfs: log recovery hang fixes
Date: Mon,  7 Mar 2022 16:32:49 +1100
Message-ID: <20220307053252.2534616-1-david@fromorbit.com>

Hi folks,

Willy reported generic/530 had started hanging on his test machines
and I've tried to reproduce the problem he reported. While I haven't
reproduced the exact hang he's been having, I've found a couple of
others while running g/530 in a tight loop on a couple of test
machines.

The first two patches are a result of a hang documented in patch 1.
The change to run the log worker earlier is defensive, but serves to
break generic log space deadlocks during intent and unlinked inode
recovery as it does at normal runtime. This doesn't fix the problem,
it just adds a layer of protection so that anything stuck on pinned
buffers, push hangs, etc. only stays hung up for 30s at most.

The second patch fixes the hang that results from delwri buffer
pushing racing with modifications that pin the buffer (i.e.
transaction commit) and then require access to it again soon after.
The buffer is locked by delwri submission that is waiting for it to
be unpinned, but the processes that might be able to trigger an
unpin are blocked waiting for the buffer lock itself. This happens
during log recovery when processing unlinked inodes that hit the
same inode cluster buffer.
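
As a rough user-space illustration of that lock ordering problem (a
sketch only, not the kernel code): a pthread mutex stands in for the
buffer lock and an atomic counter for the pin count. All the demo_*
names are hypothetical. Skipping a buffer found pinned after locking
is an assumption based on the subject line of patch 2, not a copy of
the patch itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a buffer; not the kernel's struct xfs_buf. */
struct demo_buf {
	pthread_mutex_t	lock;		/* models the buffer lock */
	atomic_int	pin_count;	/* > 0 while a commit has it pinned */
	bool		submitted;	/* set once "I/O" has been issued */
};

static void demo_buf_init(struct demo_buf *bp)
{
	pthread_mutex_init(&bp->lock, NULL);
	atomic_init(&bp->pin_count, 0);
	bp->submitted = false;
}

/* A transaction commit pins the buffer until the log write completes. */
static void demo_buf_pin(struct demo_buf *bp)
{
	atomic_fetch_add(&bp->pin_count, 1);
}

static void demo_buf_unpin(struct demo_buf *bp)
{
	int old = atomic_fetch_sub(&bp->pin_count, 1);

	assert(old > 0);	/* unpin without a matching pin is a bug */
	(void)old;
}

/*
 * Submit one delayed-write buffer. The pin state is checked only after
 * the lock is held. If the buffer is pinned, drop the lock and skip it
 * rather than sleeping on the unpin while holding the lock, because the
 * tasks that could trigger the unpin may themselves need the lock.
 *
 * Returns true if the buffer was submitted, false if it was skipped.
 */
static bool demo_delwri_submit_one(struct demo_buf *bp)
{
	pthread_mutex_lock(&bp->lock);
	if (atomic_load(&bp->pin_count) > 0) {
		pthread_mutex_unlock(&bp->lock);
		return false;	/* caller can retry on a later push */
	}
	bp->submitted = true;	/* stands in for issuing the write */
	pthread_mutex_unlock(&bp->lock);
	return true;
}
```

In this model a pinned buffer is simply left for a later pass, so the
submitter never holds the lock while waiting for an unpin.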

The third patch is for log recovery hangs I've been seeing that
occur after unlinked inode recovery has completed and the AIL is
being pushed out. The trigger may be unique to the highly modified
kernel I was running (and mitigated to a 30s delay to log recovery
completion in g/530 by the first patch in the series), but I have
occasionally seen periodic hangs in xfs_ail_push_all_sync() in the
past where the AIL has not been fully emptied but it is sleeping
without making progress. Hence I think the problem is a real one,
I just don't have a way of reproducing it reliably on an unmodified
kernel.

Willy, can you see if these patches fix the problem you are seeing?
If not, I still think they stand alone as necessary fixes, but I'll
have to keep digging to find out why you are seeing hangs in g/530.

Cheers,

Dave.


Thread overview: 9+ messages
2022-03-07  5:32 Dave Chinner [this message]
2022-03-07  5:32 ` [PATCH 1/3] xfs: log worker needs to start before intent/unlink recovery Dave Chinner
2022-03-07  5:32 ` [PATCH 2/3] xfs: check buffer pin state after locking in delwri_submit Dave Chinner
2022-03-07  5:32 ` [PATCH 3/3] xfs: xfs_ail_push_all_sync() stalls when racing with updates Dave Chinner
2022-03-07 17:43 ` xfs: log recovery hang fixes Matthew Wilcox
2022-03-07 21:18   ` Dave Chinner
2022-03-07 23:18 ` [PATCH 4/3] xfs: async CIL flushes need pending pushes to be made stable Dave Chinner
2022-03-08  6:12   ` [PATCH 4/3 v2] " Dave Chinner
2022-03-08 13:52     ` Matthew Wilcox
