From: Andi Kleen <andi@firstfloor.org>
To: Brian Gordon <legerde@gmail.com>
Cc: linux-kernel@vger.kernel.org
Subject: Re: Aerospace and linux
Date: Thu, 10 Jun 2010 20:23:15 +0200
Message-ID: <877hm64ui4.fsf@basil.nowhere.org>
In-Reply-To: <AANLkTimrZvKyh6zYnUg2xzDVBVr5NxqtJhxZ42AVhkVI@mail.gmail.com> (Brian Gordon's message of "Thu\, 10 Jun 2010 11\:29\:46 -0600")

Brian Gordon <legerde@gmail.com> writes:
>     I work in the aerospace industry and one of the considerations
> that occurs in aerospace is a phenomenon called Single Event Upsets
> (SEU).   I'm not an expert on the physics behind this phenomenon, but
> the end result is that bits in RAM change state due to high energy
> particles passing through the device.   This phenomenon happens more
> often at higher altitudes (aircraft) and is a very serious
> consideration for space vehicles.

It's also a serious consideration for standard servers.

>     When these SEU can be detected some action may be taken to improve
> the behaviour of the system  (log a fault and reset in order to
> refresh things from scratch?).   So the first question becomes how to
> detect an SEU.   Flash is considered somewhat safer than RAM.   When
> executables run in linux, do the .text and .ro sections get copied
> into RAM?  If so, can a background task monitor the RAM copy of .text
> and .ro for corruption?   

On server-class systems with ECC memory, the hardware does that.

The hardware stores the RAM contents using an error-correcting
code that can normally correct single-bit errors and detect multi-bit
errors.
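
To make the "correct one bit, detect two bits" behaviour concrete, here
is a minimal sketch of a SECDED code: Hamming(7,4) plus one overall
parity bit. This is a toy for illustration only. Real memory
controllers use wider codes implemented in hardware, and the function
names encode() and decode() are made up for this example.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit SECDED codeword.

    The codeword is a list of bits. Index 0 is the overall parity bit,
    indices 1, 2 and 4 are Hamming parity bits, the rest carry data.
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]   # covers positions 1,3,5,7
    code[2] = code[3] ^ code[6] ^ code[7]   # covers positions 2,3,6,7
    code[4] = code[5] ^ code[6] ^ code[7]   # covers positions 4,5,6,7
    code[0] = sum(code[1:]) % 2             # overall parity
    return code


def decode(code):
    """Return (status, nibble).

    status is 'ok', 'corrected' (one flipped bit was repaired) or
    'uncorrectable' (two flipped bits were detected; nibble is None).
    """
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    parity_bad = sum(code) % 2 == 1

    if syndrome == 0 and not parity_bad:
        status = "ok"
    elif parity_bad:
        # Odd number of flips: assume one. The syndrome is the position
        # of the flipped bit (0 means the overall parity bit itself).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome but overall parity is consistent: two flips.
        return "uncorrectable", None

    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble
```

Flipping any one bit of encode(n) decodes back to n with status
'corrected'. Flipping any two bits yields 'uncorrectable', which is
the case where real hardware would raise an error instead of handing
out bad data.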

There are various more or less sophisticated variations of this
around, from simple ECC, through chipkill, which can tolerate the
failure of an entire DRAM chip, up to various forms of full memory
mirroring.

>   Thank you to anyone for any pointers on where I can look to learn
> more about detecting SEU in linux.

Normally server-class hardware handles this, and the kernel then
reports memory errors (e.g. through mcelog or through EDAC).
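
As a rough sketch of what "reports through EDAC" looks like from user
space: with an EDAC driver loaded, each memory controller normally
appears under /sys/devices/system/edac/mc/mcN with ce_count (corrected
errors) and ue_count (uncorrected errors) files. The helper name
read_edac_counts() is made up for this example, and the directory will
be empty or absent on systems without ECC or without a matching
driver.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce_count": int, "ue_count": int}}.

    A counter that cannot be read or parsed is reported as None.
    An empty dict means no EDAC memory controllers were found.
    """
    counts = {}
    for mc in sorted(Path(root).glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            try:
                entry[name] = int((mc / name).read_text().strip())
            except (OSError, ValueError):
                entry[name] = None
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    for mc, entry in read_edac_counts().items():
        print(mc, entry)
```

A monitoring job could poll this periodically and log a fault when
either counter increases.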

On an uncorrectable error, the hardware also stops the system before
it would consume the corrupted data.

Newer Linux kernels also have special code that can recover from
this in some circumstances, or use predictive failure analysis with
page offlining to prevent future problems. This requires suitable
hardware support.
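
The predictive part boils down to bookkeeping: count corrected errors
per physical page, and once a page exceeds a threshold within a time
window, stop using it before it produces an uncorrectable error. The
sketch below only shows that decision logic. The class name, the 4 KiB
page size, and the default threshold and window are assumptions for
illustration, and the actual offlining step (done by the kernel on
real systems) is left out.

```python
from collections import defaultdict, deque

PAGE_SIZE = 4096  # assumed page size for this sketch


class PageErrorTracker:
    """Track corrected errors per page and flag pages to offline."""

    def __init__(self, threshold=10, window_seconds=24 * 3600):
        self.threshold = threshold      # hypothetical default
        self.window = window_seconds    # hypothetical default
        self.events = defaultdict(deque)
        self.offlined = set()

    def record(self, phys_addr, timestamp):
        """Record one corrected error at phys_addr.

        Returns True exactly once per page, at the moment the page
        crosses the threshold and should be taken out of use.
        """
        page = phys_addr // PAGE_SIZE
        if page in self.offlined:
            return False
        q = self.events[page]
        q.append(timestamp)
        # Forget errors that fell out of the time window.
        while q and timestamp - q[0] > self.window:
            q.popleft()
        if len(q) >= self.threshold:
            self.offlined.add(page)
            del self.events[page]
            return True
        return False
```

Isolated errors spread over a long time never trigger, while a page
that keeps producing corrected errors in a short period does.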

Lower-end systems, which are optimized for cost, generally ignore the
problem though, and any flipped bit in memory will result in a crash
(if you're lucky) or silent data corruption (if you're unlucky).

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.
