Re: Still seeing hangs in xlog_grant_log_space

From: Ben Myers <bpm@sgi.com>
To: Juerg Haefliger <juergh@gmail.com>
Cc: xfs@oss.sgi.com
Subject: Re: Still seeing hangs in xlog_grant_log_space
Date: Mon, 21 May 2012 12:11:37 -0500
Message-ID: <20120521171136.GR16099@sgi.com>
In-Reply-To: <CADLDEKssiOCVRknW3hYtxDxYHSyGr6qfepfai+UymsD6zMGopw@mail.gmail.com>

Hey Juerg,

On Sat, May 19, 2012 at 09:28:55AM +0200, Juerg Haefliger wrote:
> > On Wed, May 09, 2012 at 09:54:08AM +0200, Juerg Haefliger wrote:
> >> > On Sat, May 05, 2012 at 09:44:35AM +0200, Juerg Haefliger wrote:
> >> >> Did anybody have a chance to look at the data?
> >> >
> >> > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/979498
> >> >
> >> > Here you indicate that you have created a reproducer.  Can you post it to the list?
> >>
> >> Canonical attached them to the bug report that they filed yesterday:
> >> http://oss.sgi.com/bugzilla/show_bug.cgi?id=922
> >
> > I'm interested in understanding to what extent the hang you see in production
> > on 2.6.38 is similar to the hang of the reproducer.  Mark is seeing a situation
> > where there is nothing on the AIL and is clogged up in the CIL, others are
> > seeing items on the AIL that don't seem to be making progress.  Could you
> > provide a dump or traces from a hang on a filesystem with a normal sized log?
> > Can the reproducer hit the hang eventually without resorting to the tiny log?
> 
> I'm not certain that the reproducer hang is identical to the
> production hang. One difference that I've noticed is that a reproducer
> hang can be cleared with an emergency sync while a production hang
> can't. I'm working on trying to get a trace from a production machine.

We hit this on a filesystem with a regular-sized log over the weekend.  If you
see this again in production, could you gather up task states?

echo t > /proc/sysrq-trigger

Mark and I have been looking at the dump.  There are a few interesting items to point out.

1) xfs_sync_worker is blocked trying to get log reservation:

PID: 25374  TASK: ffff88013481c6c0  CPU: 3   COMMAND: "kworker/3:83"
 #0 [ffff88013481fb50] __schedule at ffffffff813aacac
 #1 [ffff88013481fc98] schedule at ffffffff813ab0c4
 #2 [ffff88013481fca8] xlog_grant_head_wait at ffffffffa0347b78 [xfs]
 #3 [ffff88013481fcf8] xlog_grant_head_check at ffffffffa03483e6 [xfs]
 #4 [ffff88013481fd38] xfs_log_reserve at ffffffffa034852c [xfs]
 #5 [ffff88013481fd88] xfs_trans_reserve at ffffffffa0344e64 [xfs]
 #6 [ffff88013481fdd8] xfs_fs_log_dummy at ffffffffa02ec138 [xfs]
 #7 [ffff88013481fdf8] xfs_sync_worker at ffffffffa02f7be4 [xfs]
 #8 [ffff88013481fe18] process_one_work at ffffffff8104c53b
 #9 [ffff88013481fe68] worker_thread at ffffffff8104f0e3
#10 [ffff88013481fee8] kthread at ffffffff8105395e
#11 [ffff88013481ff48] kernel_thread_helper at ffffffff813b3ae4

This means that it is not in a position to push the AIL.  It is clear that the
AIL has plenty of entries which can be pushed.

crash> xfs_ail 0xffff88022112b7c0,
struct xfs_ail {
...
  xa_ail = {
    next = 0xffff880144d1c318,
    prev = 0xffff880170a02078
  },
  xa_target = 0x1f00003063,

Here's the first item on the AIL:

ffff880144d1c318
struct xfs_log_item_t {
  li_ail = {
    next = 0xffff880196ea0858,
    prev = 0xffff88022112b7d0
  },
  li_lsn = 0x1f00001c63,		<--- less than xa_target
  li_desc = 0x0,
  li_mountp = 0xffff88016adee000,
  li_ailp = 0xffff88022112b7c0,
  li_type = 0x123b,
  li_flags = 0x1,
  li_bio_list = 0xffff88016afa5cb8,
  li_cb = 0xffffffffa034de00 <xfs_istale_done>,
  li_ops = 0xffffffffa035f620,
  li_cil = {
    next = 0xffff880144d1c368,
    prev = 0xffff880144d1c368
  },
  li_lv = 0x0,
  li_seq = 0x3b
}

So if xfs_sync_worker were not blocked on log reservation it would push these
items.
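
The LSN comparison above can be checked by hand.  A minimal sketch, assuming
the usual LSN layout of cycle number in the upper 32 bits and block number in
the lower 32 bits; split_lsn is a made-up helper name, and the two values are
copied from the dump:

```python
def split_lsn(lsn):
    """Split a 64-bit LSN into (cycle, block), assuming cycle in the high 32 bits."""
    return lsn >> 32, lsn & 0xFFFFFFFF

li_lsn = 0x1f00001c63     # first item on the AIL, from the dump
xa_target = 0x1f00003063  # AIL push target, from the dump

item_cycle, item_block = split_lsn(li_lsn)
target_cycle, target_block = split_lsn(xa_target)

print(f"item:   cycle {item_cycle:#x}, block {item_block:#x}")
print(f"target: cycle {target_cycle:#x}, block {target_block:#x}")

# With the cycle in the high bits, a plain integer comparison orders LSNs.
behind_target = li_lsn < xa_target
print("item is behind the push target:", behind_target)
```

Both values are in the same cycle, and the item's block is lower than the
target's, which is why it is eligible to be pushed.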

2) The CIL is waiting around too:

crash> xfs_cil_ctx 0xffff880144d1a9c0,
struct xfs_cil_ctx {
...
  space_used = 0x135f68, 

struct log {
...
  l_logsize = 0xa00000,

0xA00000 / 8 = 0x140000				<--- XLOG_CIL_SPACE_LIMIT

0x140000 - 0x135F68 = 0xA098			<--- space left before the limit

Looks like xlog_cil_push_background will not push the CIL while space used is
less than XLOG_CIL_SPACE_LIMIT, so that's not going anywhere either.
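
The same arithmetic as a short script, in case anyone wants to plug in values
from their own dump.  This is only a sketch: it assumes the limit is one eighth
of l_logsize, as in the calculation above, and the variable names are ours:

```python
l_logsize = 0xa00000   # log size in bytes, from struct log in the dump
space_used = 0x135f68  # space_used from struct xfs_cil_ctx in the dump

# Assumption: XLOG_CIL_SPACE_LIMIT is the log size divided by 8.
cil_space_limit = l_logsize // 8
headroom = cil_space_limit - space_used

# A background push is only expected once space_used reaches the limit.
would_push = space_used >= cil_space_limit

print(f"limit    = {cil_space_limit:#x}")
print(f"headroom = {headroom:#x} ({headroom} bytes)")
print("background push expected:", would_push)
```

With the values from this dump the CIL is still about 40 KiB short of the
limit.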

3) It may be unrelated to this bug, but we do have an unresolved race in the
log reservation code: there is a window between when log_space_left samples
the grant heads and when the space is actually granted a bit later.  That may
allow us to grant more space than intended.
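
To illustrate the kind of window we mean, here is a toy model of a
sample-then-grant race.  This is not the XFS code: GrantHead, space_left, and
grant are invented names, the sizes are arbitrary, and the interleaving is
forced by hand rather than produced by real concurrency:

```python
class GrantHead:
    """Toy log: a fixed size and a running total of granted bytes."""

    def __init__(self, log_size):
        self.log_size = log_size
        self.granted = 0

    def space_left(self):
        return self.log_size - self.granted

    def grant(self, nbytes):
        self.granted += nbytes


head = GrantHead(log_size=100)
need = 60

# Both tasks sample the free space before either grant happens.
a_sees = head.space_left()
b_sees = head.space_left()

# Each task then decides from its own (now stale) sample.
if a_sees >= need:
    head.grant(need)
if b_sees >= need:
    head.grant(need)

overcommit = head.granted - head.log_size
print("granted:", head.granted, "of", head.log_size)
print("overcommitted by:", overcommit)
```

Each task saw enough room for its own request, but together they were granted
more than the log holds.  Whether the real code can actually get there is
exactly what we have not established.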

If you can provide output of 'echo t > /proc/sysrq-trigger' it may be enough
information to determine if you're seeing the same problem we hit on Saturday.

Thanks,

Ben & Mark
