linux-raid.vger.kernel.org archive mirror
From: Roger Heflin <rogerheflin@gmail.com>
To: Faidon Liambotis <paravoid@debian.org>
Cc: Justin Piszcz <jpiszcz@lucidpixels.com>,
	557262@bugs.debian.org, Dave Chinner <david@fromorbit.com>,
	submit@bugs.debian.org, linux-kernel@vger.kernel.org,
	xfs@oss.sgi.com, linux-raid@vger.kernel.org,
	asterisk-users@lists.digium.com, Alan Piszcz <ap@solarrain.com>
Subject: Re: Bug#557262: 2.6.31+2.6.31.4: XFS - All I/O locks up to D-state after 24-48 hours (sysrq-t+w available) - root cause found = asterisk
Date: Sat, 21 Nov 2009 08:29:18 -0600
Message-ID: <4B07F93E.10502@gmail.com>
In-Reply-To: <4B0729D8.3000105@debian.org>

Faidon Liambotis wrote:
> Justin Piszcz wrote:
>> Found root cause-- root cause is asterisk PBX software.  I use an
>> SPA3102.
>> When someone called me, they accidentally dropped the connection, I called
>> them back in a short period.  It is during this time (and the last time)
>> this happened that the box froze under multiple(!) kernels, always when
>> someone was calling.
> <snip>
>> I don't know what asterisk is doing but top did run before the crash
>> and asterisk was using 100% CPU and as I noted before all other processes
>> were in D-state.
>>
>> When this bug occurs, it freezes I/O to all devices and the only way to
>> recover is to reboot the system.
> That's obviously *not* the root cause.
> 
> It's not normal for an application that isn't even privileged to hang
> all I/O and, subsequently everything on a system.
> 
> This is almost probably a kernel issue and asterisk just does something
> that triggers this bug.
> 
> Regards,
> Faidon


I had an application on 2.6.5 (SLES9) that would hang XFS.

The underlying application was multi-threaded, and both threads were
doing full disk syncs every so often. Sometimes during a full disk
sync the XFS subsystem would deadlock. It appeared to me that one sync
held a lock and was waiting for a second one, while the other held the
second lock and was waiting for the first. We were able to disable the
full disk sync in the application, and the deadlock went away. All
non-XFS filesystems still worked and could still be accessed. I did
report the bug with some traces, but I don't believe anyone ever
determined what the underlying issue was.
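
For illustration only, the lock pattern suspected above (each side holding
one lock while waiting for the other's) can be sketched in user space. This
is not XFS or kernel code, and it is not known to be what actually happened
in that bug: the names lock_a, lock_b, sync-1 and sync-2 are hypothetical,
and the timeouts exist only so the demonstration reports the stall instead
of hanging the way a real deadlock would.

```python
import threading


def opposite_order_demo(timeout=0.2):
    """Two threads take two locks in opposite order (lock-order inversion)."""
    lock_a, lock_b = threading.Lock(), threading.Lock()
    holding_first = threading.Barrier(2)  # both threads hold their first lock
    attempt_done = threading.Barrier(2)   # both threads finished their attempt
    got_second = {}

    def worker(name, first, second):
        with first:
            holding_first.wait()
            # A real deadlock would block here forever; the timeout lets the
            # demo record the stall and move on.
            acquired = second.acquire(timeout=timeout)
            if acquired:
                second.release()
            got_second[name] = acquired
            # Keep holding `first` until the other thread's attempt has ended.
            attempt_done.wait()

    threads = [
        threading.Thread(target=worker, args=("sync-1", lock_a, lock_b)),
        threading.Thread(target=worker, args=("sync-2", lock_b, lock_a)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return got_second


def same_order_demo(timeout=5.0):
    """Both threads take lock_a before lock_b, so no wait cycle can form."""
    lock_a, lock_b = threading.Lock(), threading.Lock()
    got_both = {}

    def worker(name):
        if not lock_a.acquire(timeout=timeout):
            got_both[name] = False
            return
        try:
            acquired = lock_b.acquire(timeout=timeout)
            if acquired:
                lock_b.release()
            got_both[name] = acquired
        finally:
            lock_a.release()

    threads = [threading.Thread(target=worker, args=(n,))
               for n in ("sync-1", "sync-2")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return got_both


if __name__ == "__main__":
    print("opposite order:", opposite_order_demo())
    print("same order:    ", same_order_demo())
```

In the opposite-order case neither thread can obtain its second lock while
the other still holds it; with a single agreed lock order both threads
complete. Disabling the full disk sync in the application avoided the
problem in a different way, by never entering the contended path at all.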



