public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Markus Trippelsdorf <markus@trippelsdorf.de>
To: Christoph Hellwig <hch@infradead.org>
Cc: linux-kernel@vger.kernel.org, Simon Kirby <sim@hostway.ca>,
	xfs@oss.sgi.com
Subject: Re: XFS read hangs in 3.1-rc10
Date: Fri, 21 Oct 2011 22:29:11 +0200	[thread overview]
Message-ID: <20111021202911.GA1633@x4.trippels.de> (raw)
In-Reply-To: <20111021132240.GA24136@infradead.org>

On 2011.10.21 at 09:22 -0400, Christoph Hellwig wrote:
> On Thu, Oct 20, 2011 at 03:42:14PM -0700, Simon Kirby wrote:
> > 
> > [<ffffffff8126f205>] xfs_reclaim_inode+0x85/0x2b0
> > [<ffffffff8126f5b0>] xfs_reclaim_inodes_ag+0x180/0x2f0
> > [<ffffffff8126f74e>] xfs_reclaim_inodes_nr+0x2e/0x40
> > [<ffffffff8126ccf0>] xfs_fs_free_cached_objects+0x10/0x20
> > [<ffffffff81119a70>] prune_super+0x110/0x1b0
> > [<ffffffff810e4fa5>] shrink_slab+0x1e5/0x2a0
> > [<ffffffff810e5821>] kswapd+0x7c1/0xba0
> > [<ffffffff8107ada6>] kthread+0x96/0xb0
> > [<ffffffff816c0474>] kernel_thread_helper+0x4/0x10
> > [<ffffffffffffffff>] 0xffffffffffffffff
> 
> We're stuck in synchronous inode reclaim.
> 
> > All of the other processes that get stuck have this stack:
> > 
> > [<ffffffff81080587>] down+0x47/0x50
> > [<ffffffff8125e816>] xfs_buf_lock+0x66/0xd0
> > [<ffffffff812603ad>] _xfs_buf_find+0x16d/0x270
> > [<ffffffff81260517>] xfs_buf_get+0x67/0x1a0
> > [<ffffffff8126067a>] xfs_buf_read+0x2a/0x120
> > [<ffffffff812b876f>] xfs_trans_read_buf+0x28f/0x3f0
> > [<ffffffff8129e161>] xfs_read_agi+0x71/0x100
> 
> They are waiting for the AGI buffer to become unlocked.  The only reason
> it is held locked for longer time is when it is under I/O.
> 
> > 
> > By the way, xfs_reclaim_inode+0x85 (133) disassembles as:
> 
> > 
> > ...So the next function is wait_for_completion(), which is marked
> > __sched and thus doesn't show up in the trace.
> 
> So we're waiting for the inode to be flushed, aka I/O again.
> 
> What is interesting here is that we're always blocking on the AGI
> buffer - which is used during unlinks of inodes, and thus gets hit
> fairly heavily for a workload that does a lot of unlinks.
> 
> Given that you are doing a lot of unlinks I wonder if it is related
> to the recent ail pushing issues in that area.  While your symptoms
> looks completely different we could be blocking on the flush completion
> for an inode that gets stuck in the AIL.
> 
> Can you run with latest 3.0-stable plus the patches at:

Please not that he saw this in 3.1-rc10, too. And this version already
contains the fixes:
 % git describe --contains 0030807c66f0582
v3.1-rc10~5^2

I just saw similar symptoms while running a weekly rsync backup job. The
machine was stuck for a few seconds several times during that time (no
response to mouse or keyboard input). It always recovered by itself
after a short while.

This is an example output of latencytop during the rsync run:

Cause                                                Maximum     Percentage
[xfs_reclaim_inodes_ag]                           7847.0 msec         16.6 %
Fork() system call                                7777.1 msec         25.5 %
Creating block layer request                      5352.5 msec         18.1 %
[xfs_buf_iowait]                                  2000.4 msec         14.7 %
[down]                                            959.7 msec          2.0 %
Page fault                                        637.3 msec          1.4 %
[xfs_iunpin_wait]                                 557.4 msec          0.7 %
Unlinking file                                     66.8 msec          0.1 %

I also took some "perf timechart" recordings. If there is interest I
could post the svg images somewhere.

-- 
Markus

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

      parent reply	other threads:[~2011-10-21 20:29 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-10-20 22:42 XFS read hangs in 3.1-rc10 Simon Kirby
2011-10-21 13:22 ` Christoph Hellwig
2011-10-21 20:28   ` Simon Kirby
2011-10-24  8:22     ` Christoph Hellwig
2011-10-25 20:07       ` Simon Kirby
2011-10-26 11:25         ` Christoph Hellwig
2011-11-04 21:03           ` Christoph Hellwig
2011-11-16 19:56             ` Simon Kirby
2011-11-20 15:32               ` Christoph Hellwig
2011-11-28 19:05                 ` Simon Kirby
2011-11-28 19:55                   ` Christoph Hellwig
2011-11-28 21:01                     ` Ben Myers
2011-10-21 20:29   ` Markus Trippelsdorf [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20111021202911.GA1633@x4.trippels.de \
    --to=markus@trippelsdorf.de \
    --cc=hch@infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=sim@hostway.ca \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox