From: David Chinner <dgc@sgi.com>
To: Kris Kersey <kkersey@steelbox.com>
Cc: xfs@oss.sgi.com, Bill Vaughan <billv@steelbox.com>
Subject: Re: pdflush hang on xlog_grant_log_space()
Date: Sat, 8 Mar 2008 09:35:10 +1100
Message-ID: <20080307223510.GM155407@sgi.com>
In-Reply-To: <47D062AF.80501@steelbox.com>

On Thu, Mar 06, 2008 at 04:31:27PM -0500, Kris Kersey wrote:
> Hello,
> 
> I'm working on a NAS product and we're currently having lock-ups that
> seem to be hanging in XFS code.  We're running a NAS that has 1024 NFSD
> threads accessing three RAID mounts.  All three mounts are running XFS
> file systems.  Lately we've had random lockups on these boxes and I am
> now running a kernel with KDB built-in.
> 
> The lock-up takes the form of all NFSD threads in D state with one out
> of three pdflush threads in D state.  The assumption can be made that
> all NFSD threads are waiting on the one pdflush thread to complete.  So
> two times now when an NAS has gotten in this state I have accessed KDB
> and ran a stack trace on the pdflush thread.  Both times the thread was
> stuck on xlog_grant_log_space+0xdb.

Try bumping XFS_TRANS_PUSH_AIL_RESTARTS to a much larger number and
seeing if the problem goes away....

Alternatively, that restart hack is backed by a "watchdog" timeout
in 2.6.25-rc1, so if that is the cause of the problem perhaps the
latest -rcX kernel will prevent the hang?

BTW, you can get all the traces of D state threads through the sysrq
interface, so you don't need to drop into kdb to get this.....
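As a rough illustration of the sysrq suggestion above, here is a minimal
sketch, not something from this thread. It assumes the relevant key is 'w'
(dump blocked tasks), that /proc/sysrq-trigger is available, and that it
runs as root; the function names are made up for the example. It first
lists D state tasks from /proc, then optionally asks the kernel to dump
their stacks to the kernel log:

```python
#!/usr/bin/env python3
"""List D state tasks and optionally trigger a sysrq dump of blocked tasks.

Sketch only: the helper names are hypothetical, and using the 'w' key is an
assumption about which sysrq command is meant.
"""
import glob
import sys


def parse_stat(line):
    """Return (tid, comm, state) from the contents of a /proc stat file."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the first '(' and the last ')'.
    left = line.index("(")
    right = line.rindex(")")
    tid = int(line[:left])
    comm = line[left + 1:right]
    state = line[right + 1:].split()[0]
    return tid, comm, state


def d_state_tasks(proc="/proc"):
    """Return a sorted list of (tid, comm) for tasks in uninterruptible sleep."""
    tasks = []
    for path in glob.glob(f"{proc}/[0-9]*/task/[0-9]*/stat"):
        try:
            with open(path) as f:
                tid, comm, state = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # task exited while we were scanning, or odd contents
        if state == "D":
            tasks.append((tid, comm))
    return sorted(tasks)


def dump_blocked_stacks(trigger="/proc/sysrq-trigger"):
    """Same effect as `echo w > /proc/sysrq-trigger`; needs root."""
    with open(trigger, "w") as f:
        f.write("w")


if __name__ == "__main__":
    for tid, comm in d_state_tasks():
        print(tid, comm)
    if "--dump" in sys.argv[1:]:
        dump_blocked_stacks()
```

With --dump, the stack traces land in the kernel log (dmesg or the console),
not on stdout. Writing 't' instead of 'w' would dump every task rather than
only the blocked ones.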

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

Thread overview: 4+ messages
2008-03-06 21:31 pdflush hang on xlog_grant_log_space() Kris Kersey
2008-03-07  1:20 ` Mark Goodwin
2008-03-07 22:35 ` David Chinner [this message]
2008-03-10 11:48   ` Kris Kersey
