From: Dave Chinner <david@fromorbit.com>
To: Stefan Priebe - Profihost AG <s.priebe@profihost.ag>
Cc: Christoph Hellwig <hch@infradead.org>,
"xfs-masters@oss.sgi.com" <xfs-masters@oss.sgi.com>,
"xfs@oss.sgi.com" <xfs@oss.sgi.com>
Subject: Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4
Date: Thu, 22 Sep 2011 10:53:12 +1000
Message-ID: <20110922005312.GT15688@dastard>
In-Reply-To: <20110921114237.GP15688@dastard>

On Wed, Sep 21, 2011 at 09:42:37PM +1000, Dave Chinner wrote:
> On Wed, Sep 21, 2011 at 09:40:03AM +0200, Stefan Priebe - Profihost AG wrote:
> > Am 21.09.2011 04:11, schrieb Dave Chinner:
> > >Also, what phase do you see it hanging in? The random stat phase is
> > >terribly slow on spinning disks, so if I can avoid that it would be
> > >nice....
> > Creating or deleting files. Never in the stat phase.
>
> Ok, I got a hang in the random delete phase. Not sure what is wrong
> yet, but inode reclaim is trying to reclaim inodes but failing, and
> the AIL is trying to push items but failing. Hence the tail of the
> log is not being moved forward and new transactions are being
> blocked until log space becomes available.
>
> The AIL is particularly interesting. The number of pushes being
> executed is precisely 50/s, and precisely 5000 items/s are being
> scanned. All those items are pinned, so the "stuck" processing is
> what is triggering this pattern.
>
> Thing is, all the items are apparently pinned - I see that stat
> incrementing at 5,000/s. It's here:
>
> 	case XFS_ITEM_PINNED:
> 		XFS_STATS_INC(xs_push_ail_pinned);
> 		stuck++;
> 		flush_log = 1;
> 		break;
>
> so we should have the flush_log variable set. However, this code:
>
> 	if (flush_log) {
> 		/*
> 		 * If something we need to push out was pinned, then
> 		 * push out the log so it will become unpinned and
> 		 * move forward in the AIL.
> 		 */
> 		XFS_STATS_INC(xs_push_ail_flush);
> 		xfs_log_force(mp, 0);
> 	}
>
> never seems to execute. I don't see the xs_push_ail_flush stat
> increase, nor the log force counter increase, either. Hence the
> pinned items are not getting unpinned, and progress is not being
> made. Background inode reclaim is not making progress, either,
> because it skips pinned inodes.
>
> The AIL code is clearly cycling - the push counter is increasing,
> and the run numbers match the stuck code precisely (aborts at 100
> stuck items a cycle). The question now is why the log force isn't
> being triggered.
>
> Given this, just triggering a log force should get everything
> moving again. Running "echo 2 > /proc/sys/vm/drop_caches" gets inode
> reclaim running in sync mode, which causes pinned inodes to trigger
> a log force. And once I've done this, everything starts running
> again.
>
> So, the log force not triggering in the AIL code looks to be the
> problem. That, I simply cannot explain right now - it makes no sense
> but that is what all the stats and trace events point to. I need to
> do more investigation.

Ok, it makes sense now. The kernel I was running (from before I went
on holidays) had this patch in it:

http://oss.sgi.com/archives/xfs/2011-08/msg00472.html

I found this out by disassembling the kernel code. That code has a
bug in it when the stuck case is hit - it fails to issue the log
force in that case, and that's why I've been seeing this kernel get
stuck. False alarm - will now try to reproduce without any dev
patches in the kernel.
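
For readers following the thread, the failure mode described here can be
sketched as a small userspace model. This is not the kernel code and not
the patch linked above: struct ail_model, model_log_force(),
model_push_cycle() and model_run() are hypothetical names, and the shape
of the bug (skipping the force only when the cycle aborts on the stuck
limit) is an assumption based on the description in this message. The
limit of 100 stuck items per cycle is the figure quoted earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Abort a push cycle after this many stuck items (figure from the email). */
#define STUCK_LIMIT 100

/* Hypothetical toy state; not a kernel structure. */
struct ail_model {
	size_t pinned;      /* items still pinned in the AIL */
	size_t scanned;     /* total items examined across all cycles */
	size_t log_forces;  /* how many times the log was forced */
};

/* Stand-in for xfs_log_force(): assume a force unpins every item. */
static void model_log_force(struct ail_model *m)
{
	m->log_forces++;
	m->pinned = 0;
}

/*
 * One push cycle. With force_when_stuck == false, the log force is
 * skipped whenever the cycle aborts on the stuck limit (assumed bug shape).
 */
static void model_push_cycle(struct ail_model *m, bool force_when_stuck)
{
	size_t stuck = 0;
	bool flush_log = false;
	bool hit_limit = false;

	assert(m != NULL);

	for (size_t i = 0; i < m->pinned; i++) {
		/* Every item is pinned, as in the XFS_ITEM_PINNED case. */
		m->scanned++;
		stuck++;
		flush_log = true;
		if (stuck >= STUCK_LIMIT) {
			hit_limit = true;
			break;
		}
	}

	if (flush_log && (force_when_stuck || !hit_limit))
		model_log_force(m);
}

/* Run a number of push cycles over an AIL that starts fully pinned. */
static struct ail_model model_run(size_t pinned, size_t cycles,
				  bool force_when_stuck)
{
	struct ail_model m = { .pinned = pinned };

	for (size_t c = 0; c < cycles; c++)
		model_push_cycle(&m, force_when_stuck);
	return m;
}
```

In this model, model_run(1000, 50, false) scans 5000 items over 50
cycles and never forces the log, which is the same ratio as the 50
pushes/s and 5,000 items/s seen in the stats. With model_run(1000, 50,
true), the first cycle forces the log and the remaining cycles find
nothing left to scan.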
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs