From: Willy Tarreau <w@1wt.eu>
To: Ondrej Zary <linux@rainbow-software.org>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	kaber@trash.net
Subject: Re: Oops after 30 days of uptime
Date: Sun, 10 Sep 2006 15:16:16 +0200
Message-ID: <20060910131616.GA574@1wt.eu>
In-Reply-To: <200609101243.25772.linux@rainbow-software.org>

On Sun, Sep 10, 2006 at 12:43:25PM +0200, Ondrej Zary wrote:
> On Sunday 10 September 2006 10:26, Willy Tarreau wrote:
> > Hi Ondrej,
> >
> > OK, I've analysed your oops with your kernel. My conclusions are that you
> > have a hardware problem (most probably the CPU), because you've hit an
> > impossible case :
> >
> > ip_nat_cheat_check() pushed the size of the data (8) on the stack, followed
> > by the pointer to the data, then called csum_partial() :
> >
> > c01e657f:       6a 08                   push   $0x8
> > c01e6581:       52                      push   %edx
> > c01e6582:       e8 a5 85 00 00          call   c01eeb2c <csum_partial>
> >
> > In csum_partial(), ECX is filled with the size (8) and ESI with the data
> > pointer (0xc0227ce8) :
> >
> > c01eeb32:       8b 4c 24 10             mov    0x10(%esp),%ecx
> > c01eeb36:       8b 74 24 0c             mov    0xc(%esp),%esi
> >
> > Then, the size is divided by 32 to count how many 32-byte blocks can be
> > read at a time. If the size is lower than 32, the code branches to a
> > special location which reads 1 word at a time :
> >
> > c01eeb78:       89 ca                   mov    %ecx,%edx
> > c01eeb7a:       c1 e9 05                shr    $0x5,%ecx
> > c01eeb7d:       74 32                   je     c01eebb1 <csum_partial+0x85>
> >
> > Your oops comes from a few instructions below. The branch was not taken
> > although it should have been, because (8 >> 5) == 0. You can also see from
> > EDX in the oops that it really was 0x8 when copied from ECX. The rest is
> > pretty obvious. The data are read 32 bytes at a time starting at ESI, and
> > ECX is decreased by 1 every 32 bytes. When ESI+0x18 reaches an unmapped
> > area (0xc2000000), you get the oops, and ECX = 0xfff113e8 as in your oops.
> >
> > Given that the failing instruction is the most common conditional jump, it
> > is very fortunate that your system could run for 30 days before crashing. I
> > think that your CPU might be running too hot and might get wrong results
> > during branch prediction. It's also possible that you have a poor power
> > supply. However, I'm pretty sure that this is not a RAM problem.
> 
> Thank you very much for the analysis. Good that it's not a kernel bug.
> The CPU is 33MHz UMC GreenCPU which does not run hot even without a heatsink. 
> It's powered directly from 5V so it might be the power supply.

CPUs of that generation did not draw much power, so I would find it strange
for a glitch in the PSU to cause trouble. Maybe you have dead capacitors on the
motherboard close to the CPU (they would look bulged on top).

Regards,
Willy
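
The arithmetic in the quoted analysis can be cross-checked with a few lines
of Python. This is only a sketch, not kernel code: the function names are
made up for illustration, and it assumes what the analysis states, namely
that ECX starts at (8 >> 5), is decremented once per 32-byte block with
32-bit wraparound, and that the fault occurs exactly when ESI+0x18 equals
0xc2000000.

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit register wraparound


def block_count(length):
    """Mirror of 'shr $0x5,%ecx': number of whole 32-byte blocks in length."""
    return (length & MASK32) >> 5


def ecx_at_fault(esi_start, fault_addr, offset=0x18, block=32, ecx_start=0):
    """ECX value when the read at ESI+offset first hits fault_addr.

    Assumes ESI advances by 'block' bytes and ECX is decremented by 1
    on every iteration, as described in the quoted analysis.
    """
    esi_at_fault = (fault_addr - offset) & MASK32
    iterations = ((esi_at_fault - esi_start) & MASK32) // block
    return (ecx_start - iterations) & MASK32


if __name__ == "__main__":
    # 8 >> 5 is 0, so ZF is set and the 'je' should have been taken.
    print(block_count(8))                              # 0
    # Data pointer from the analysis, first unmapped address from the oops.
    print(hex(ecx_at_fault(0xc0227ce8, 0xc2000000)))   # 0xfff113e8
```

The second value matches the ECX reported in the oops, which is consistent
with the loop having been entered with a block count of 0.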

