From: Oleg Drokin <green@linuxhacker.ru>
To: Andrea Arcangeli <andrea@suse.de>
Cc: Marcelo Tosatti <marcelo.tosatti@cyclades.com>,
linux-kernel@vger.kernel.org, akpm@osdl.org
Subject: Re: [2.4] NMI WD detected lockup during page alloc
Date: Tue, 6 Apr 2004 10:02:45 +0300
Message-ID: <20040406070245.GB1819@linuxhacker.ru>
In-Reply-To: <20040405221255.GM2234@dualathlon.random>
Hello!
On Tue, Apr 06, 2004 at 12:12:55AM +0200, Andrea Arcangeli wrote:
> > In addition to what I have compiled in:
> > # lsmod
> > Module Size Used by Not tainted
> > ppp_deflate 4568 1 (autoclean)
> you may want to disable compression, this sounds like mm corruption and
> compression isn't trivial to handle in kernel skbs (though I doubt this
> is the problem but it's easy to disable).
Ok.
> > ipt_state 1016 4 (autoclean)
> the hang while unloading this module may also be a sign of a bug in the
> module so it would be nice if you could reproduce also w/o the above
> ips_state.
Unfortunately this is not that easy to do, though I believe there is just some
sort of race on unload that is not hit until the module is unloaded, and
therefore it is completely unrelated.
> If this still doesn't help then you can try to go UP again, SMP is
> harder at stressing the memory bus and see if it stabilizes. Other thing
> you can do is to remove half of the ram and see if it stabilizes to try
> to identify buggy ram slots.
That box has ECC RAM and passed 14 days of memtest (yes, I know memtest uses
only 1 CPU), so I do not think I have memory problems, though of course this is
no absolute guarantee against that.
Also, running in UP mode for weeks is not much fun and still proves nothing,
as I have no clear way to reproduce the problem within a given time.
> Overall it's unlikely the oops is useful unfortunately since that piece
> of the kernel is the most stressed ever, and it just signals random mm
> corruption. I assume this is the first time you've got the nmi watchdog
> oops, if you could get it again it would be more interesting, I'd expect
> next time you would get it in another place.
Well, I had a hang before this oops, and that was the main reason I enabled the
NMI watchdog. During that first hang nothing got to the serial console, so I
guessed it was a similar spinlock deadlock.
We'll see what I get when the NMI watchdog fires again. I am running
with spinlock debugging this time, so hopefully, if the spinlock really is just
corrupted, its magic will be corrupted as well and I will get a clear warning
about that.
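To illustrate the idea of a lock "magic" check, here is a minimal user-space
sketch. It is not the kernel's implementation: the names demo_lock,
DEMO_LOCK_MAGIC, demo_lock_init, demo_lock_check and demo_lock_acquire are
hypothetical, the magic value is arbitrary, and the "lock" is single-threaded
with no atomics. It only shows how a sentinel word stored next to the lock
lets an acquire path notice that the structure was overwritten.

```c
#include <stdio.h>

/* Hypothetical sentinel value; any unlikely bit pattern would do. */
#define DEMO_LOCK_MAGIC 0xdead4eadu

/* Hypothetical lock type: the lock word plus a sentinel stored next to it. */
struct demo_lock {
    volatile unsigned int locked;
    unsigned int magic;
};

static void demo_lock_init(struct demo_lock *l)
{
    l->locked = 0;
    l->magic = DEMO_LOCK_MAGIC;
}

/* Returns 0 if the sentinel is intact, -1 (with a warning) if it is not. */
static int demo_lock_check(const struct demo_lock *l, const char *where)
{
    if (l->magic != DEMO_LOCK_MAGIC) {
        fprintf(stderr, "%s: bad lock magic %08x at %p\n",
                where, l->magic, (const void *)l);
        return -1;
    }
    return 0;
}

/* Single-threaded stand-in for taking the lock: validate first, then set. */
static int demo_lock_acquire(struct demo_lock *l)
{
    if (demo_lock_check(l, "acquire") != 0)
        return -1;
    l->locked = 1;
    return 0;
}
```

A stray write that clobbers the lock word is likely to clobber the adjacent
sentinel too, so the next acquire reports the corruption instead of spinning
forever on a lock that nobody holds.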
Thank you.
Bye,
Oleg
Thread overview: 5+ messages
2004-04-04 12:17 [2.4] NMI WD detected lockup during page alloc Oleg Drokin
2004-04-05 20:43 ` Marcelo Tosatti
2004-04-05 21:27 ` Oleg Drokin
2004-04-05 22:12 ` Andrea Arcangeli
2004-04-06 7:02 ` Oleg Drokin [this message]