From: Dave Chinner <david@fromorbit.com>
To: Lachlan McIlroy <lachlan@sgi.com>
Cc: Christoph Hellwig <hch@lst.de>, Mark Goodwin <markgw@sgi.com>,
xfs@oss.sgi.com
Subject: Re: fw: [PATCH] fix instant oops with tracing enabled
Date: Wed, 15 Oct 2008 11:54:41 +1100
Message-ID: <20081015005441.GR10716@disturbed>
In-Reply-To: <48F546ED.6050702@sgi.com>
On Wed, Oct 15, 2008 at 11:27:09AM +1000, Lachlan McIlroy wrote:
> Christoph Hellwig wrote:
>> On Tue, Oct 14, 2008 at 10:40:15AM +1000, Mark Goodwin wrote:
>>> Lachlan also saw some regressions after merging these patchsets :
>>> . replace the mount inode list with radix tree traversals
>>> . clean up sync code
>>
>> What exactly? I saw some soft lockup in 042, but when applying Dave's
>> xfs_sync_inodes_ag fix (or the half of it applying without the del inodes
>> tracking in the radix tree) it goes away.
>
> I saw this panic but I don't think it's related to the above patches:
>
> [252921.307588] BUG: unable to handle kernel <3>BUG: scheduling while atomic: dd/16976/0xf101da90
Isn't there another line with this output that looks like:

    atomic = 1 in_interrupt = 0

to indicate the reason for the "atomic" state?
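For reference, the last field of the "scheduling while atomic" line (0xf101da90 here) is the raw preempt count. A rough userspace sketch of how it splits up, assuming 8 preempt bits, 8 softirq bits and 12 hardirq bits. The field widths and the sk_* names are assumptions for illustration only, not copied from the 2.6.27 headers, so check hardirq.h for the kernel in question:

```c
#include <stdint.h>

/* ASSUMED field widths; verify against the kernel's hardirq.h. */
#define SK_PREEMPT_BITS   8
#define SK_SOFTIRQ_BITS   8
#define SK_HARDIRQ_BITS   12

#define SK_SOFTIRQ_SHIFT  SK_PREEMPT_BITS
#define SK_HARDIRQ_SHIFT  (SK_SOFTIRQ_SHIFT + SK_SOFTIRQ_BITS)

/* Hypothetical container for the three nesting counters. */
struct sk_preempt_fields {
    unsigned preempt;   /* preempt_disable()/spinlock nesting */
    unsigned softirq;   /* softirq nesting */
    unsigned hardirq;   /* hard interrupt nesting */
};

/* Split a raw count into its fields under the assumed layout. */
struct sk_preempt_fields sk_decode(uint32_t count)
{
    struct sk_preempt_fields f;

    f.preempt = count & ((1u << SK_PREEMPT_BITS) - 1);
    f.softirq = (count >> SK_SOFTIRQ_SHIFT) & ((1u << SK_SOFTIRQ_BITS) - 1);
    f.hardirq = (count >> SK_HARDIRQ_SHIFT) & ((1u << SK_HARDIRQ_BITS) - 1);
    return f;
}
```

Under that assumed layout, sk_decode(0xf101da90) comes out nonzero in all three fields, which looks more like a scribbled-on value than a real nesting count. That is only as good as the layout assumption, though.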
> [252921.307908] Modules linked in:
> [252921.307911] Pid: 16976, comm: dd Not tainted 2.6.27-rc8 #183
> [252921.307913] [252921.307913] Call Trace:
[ snip exceedingly deep stack that'll blow a 4k ia32 stack
completely ]
In summary, the stack is:

    write
    balance_dirty_pages
    xfs_iomap_write_allocate
    <enter memory reclaim>
    try_to_free_pages
    xfs_iomap_write_allocate
    _xfs_trans_commit
    xlog_write
    xlog_state_get_iclog_space
    <sleep>
The question is: why are we running in atomic context at all?
The only place I can see a sleep happening in
xlog_state_get_iclog_space() is the call to sv_wait(), which means
the atomic state must have come from higher up. Seems very strange.
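To spell out what "from higher up" means, here is a toy userspace model. Everything in it is hypothetical (the sk_* names, the single global counter); it is not the kernel's implementation, just the shape of the problem: the check only fires at the point of the sleep, while the count was raised by some caller further up the stack.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-task preempt count. */
static int sk_atomic_depth;

/* A caller entering/leaving an atomic section, e.g. around a spinlock. */
void sk_enter_atomic(void) { sk_atomic_depth++; }
void sk_leave_atomic(void) { sk_atomic_depth--; }

/* Stand-in for the check made when a task tries to sleep. */
bool sk_may_sleep(void)
{
    return sk_atomic_depth == 0;
}

/*
 * Leaf that wants to block, playing the role of the sv_wait() call.
 * Returns false where the real thing would report a bug.
 */
bool sk_leaf_wait(void)
{
    return sk_may_sleep();
}

/* Caller that (wrongly) reaches the leaf while still atomic. */
bool sk_caller_holding_lock(void)
{
    bool ok;

    sk_enter_atomic();
    ok = sk_leaf_wait();    /* the leaf gets blamed, the caller is at fault */
    sk_leave_atomic();
    return ok;
}
```

The leaf is fine when called on its own and only fails when reached through sk_caller_holding_lock(), so the interesting frame is whichever one raised the count, not the one that slept.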
> I saw sync get stuck in an infinite loop running test 042 - maybe the same
> problem you saw.
Yes, that's the lockup that the later patch I posted fixes.
> I saw the panic in _xfs_itrace_exit() which has now been fixed.
>
> And I also saw this assertion:
>
> <4>[34770.626472] Assertion failed: (index >= 0) && (index < ktp->kt_nentries), file: fs/xfs/support/ktrace.c, line: 173
> <0>[34770.626511] ------------[ cut here ]------------
> <2>[34770.627419] kernel BUG at fs/xfs/support/debug.c:81!
I can't see how that is related to the changes - it's a trace
buffer index overrun. That kind of implies that the ktrace_t
has been corrupted. Memory corruption of some kind?
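A minimal sketch of why I read it that way, using a cut-down, hypothetical trace buffer (the struct below is not the real ktrace_t, and the real code may well compute the slot differently, e.g. with a mask). The predicate is the one from the failed assertion:

```c
#include <stdbool.h>

/* Hypothetical, simplified trace buffer; NOT the real ktrace_t layout. */
struct sk_ktrace {
    int kt_nentries;    /* number of slots in the buffer */
    int kt_index;       /* running count of entries written */
};

/* Same condition as the assertion that fired. */
bool sk_index_valid(const struct sk_ktrace *ktp, int index)
{
    return (index >= 0) && (index < ktp->kt_nentries);
}

/* Wrap the running counter into a slot number (assumes kt_nentries > 0). */
int sk_next_slot(struct sk_ktrace *ktp)
{
    return ktp->kt_index++ % ktp->kt_nentries;
}
```

With sane fields, every slot sk_next_slot() hands back passes sk_index_valid() by construction. In this model it only fails once something has scribbled on the structure, for example a kt_index that has gone negative.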
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com