From: "Jason T. Collins" <jcollins@valinux.com>
To: Corin Hartland-Swann <cdhs@commerce.uk.net>,
	linux-kernel@vger.kernel.org
Subject: Re: Memory Problems - CTCS/memtst
Date: Thu, 02 Aug 2001 11:10:14 -0700
Message-ID: <3B699786.C3178BE1@valinux.com>
In-Reply-To: <Pine.LNX.4.21.0108021659300.23264-100000@willow.commerce.uk.net>

Corin Hartland-Swann wrote:
> 
> Alan,
> 
> On Thu, 2 Aug 2001, Alan Cox wrote:
> > > The BIOS has an ECC logging feature, and if I understand it correctly,
> > > then there /cannot/ have been any main memory errors or they would have
> > > shown up in the logs. At least not any single or double-bit errors (ECC
> > > corrects single-bit and detects double-bit, doesn't it?)

Remember, the memory itself is only one place where things can go wrong.
Several other memory-related areas are not covered by ECC memory, including:

North bridge (memory controller)
L1/L2/L3 cache levels (some processors have ECC checking in the cache)
Register corruption

In addition, data transferred between the CPU and memory can be corrupted in
transit before the ECC checksum is calculated (I've actually seen this happen
on a poorly designed motherboard).  In other words, a lot of things could be
wrong.  See the FAQ in CTCS for more of my ramblings on the subject.

One way to tell whether or not your memory is the problem is to examine your
files/coredumps for corruption.  If entire page-sized chunks have been
replaced with chunks from other files, other pages in RAM, etc., you're likely
dealing with a kernel MM bug rather than memory corruption.  (I suppose an MMU
bug is possible too.. sigh...)  A few bits flipped here and there point to
hardware/faulty memory.  That's one reason why my memory checker displays that
nice context information: so those sorts of determinations can be made.
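
To make that distinction concrete, here is a rough sketch of the kind of
check described above.  It is not code from CTCS or memtst; the function name,
the 4 KiB page size, and the "more than a quarter of the bytes differ"
threshold are all assumptions chosen for illustration.  It compares a
known-good copy of some data against a suspect copy, page by page, and labels
each damaged page as either scattered bit flips or a wholesale substitution.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on x86


def classify_corruption(expected: bytes, actual: bytes, page_size: int = PAGE_SIZE):
    """Return (page_index, verdict, flipped_bits) for every page that differs.

    Hypothetical helper for illustration only; not part of CTCS/memtst.
    """
    if len(expected) != len(actual):
        raise ValueError("buffers must be the same length")

    findings = []
    for offset in range(0, len(expected), page_size):
        exp = expected[offset:offset + page_size]
        act = actual[offset:offset + page_size]
        if exp == act:
            continue

        # Count differing bytes and the total number of differing bits.
        differing_bytes = sum(1 for a, b in zip(exp, act) if a != b)
        flipped_bits = sum(bin(a ^ b).count("1") for a, b in zip(exp, act))

        # Heuristic threshold (an assumption): if a large share of the page
        # differs, treat it as a substituted page rather than stray bit flips.
        if differing_bytes > len(exp) // 4:
            verdict = "page-substituted"
        else:
            verdict = "bit-flips"
        findings.append((offset // page_size, verdict, flipped_bits))
    return findings
```

Feeding it a known-good file and the corrupted copy gives one entry per
damaged page.  Mostly "bit-flips" entries would point toward hardware, while
"page-substituted" entries would point toward the kernel MM (or MMU) case.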

> I've just tried test 2 on another machine (with good memory) and it looks
> like it's a bug in memtst rather than the detection of an error.

This doesn't surprise me too much; the software is pretty new.  The fact that
the expected and resulting memory contents in the log are the same would seem
to confirm that, as does the fact that the 'error' happened on the first byte
in the test array, along with other strange things.  :)  A quick check
confirms it breaks for me too, so I'll find this bug and whack it in a new
release.  Expect something this weekend.

-- 
Jason T. Collins  "Noone has lived to see even three of my techniques.  It
Software Engineer  is almost sunset.  How many will you see before you die?"
VA Linux Systems   'Twilight' Suzuka, "Creeping Evil"
