From: Russ Anderson <rja@sgi.com>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: Borislav Petkov <bp@alien8.de>,
	Prarit Bhargava <prarit@redhat.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"dzickus@redhat.com" <dzickus@redhat.com>,
	"mstowe@redhat.com" <mstowe@redhat.com>,
	"dnelson@redhat.com" <dnelson@redhat.com>,
	rja@americas.sgi.com
Subject: Re: [PATCH]: mce: don't print "human readable" message for corrected errors
Date: Tue, 12 Apr 2011 21:24:03 -0500
Message-ID: <20110413022402.GA31652@sgi.com>
In-Reply-To: <987664A83D2D224EAE907B061CE93D5301A9629BD5@orsmsx505.amr.corp.intel.com>

On Tue, Apr 12, 2011 at 01:02:21PM -0700, Luck, Tony wrote:
> > Why not? This way you turn reporting of _ALL_ correctable MCEs
> > completely off and some users would actually like to run them through
> > mcelog on Intel.
> 
> pr_emerg() is rather overkill for a corrected error - on large systems
> corrected errors are going to be a routine occurrence (my personal estimation
> is "one soft error per gigabyte per month" ... which is pretty much the
> same as "one per terabyte per hour" for the people with the really cool
> toys).

Good point.
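
To put rough numbers on that estimate, here is a minimal sketch of the arithmetic. It assumes a 730-hour month (365 days * 24 h / 12) and 1024 GB per TB; the function and constant names are illustrative only, and the "one per gigabyte per month" rate is Tony's personal estimate, not a measured figure.

```python
# Back-of-the-envelope check of "one soft error per gigabyte per month"
# versus "one per terabyte per hour".
HOURS_PER_MONTH = 730   # assumption: 365 days * 24 h / 12 months
GB_PER_TB = 1024        # assumption: binary terabyte


def errors_per_hour(memory_gb, errors_per_gb_per_month=1.0):
    """Expected corrected errors per hour for a machine with memory_gb of RAM."""
    return memory_gb * errors_per_gb_per_month / HOURS_PER_MONTH


if __name__ == "__main__":
    rate = errors_per_hour(GB_PER_TB)
    print(f"1 TB machine: about {rate:.1f} corrected errors per hour")
```

Under those assumptions a 1 TB machine sees a bit more than one corrected error per hour, so the two phrasings agree to within rounding.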

> We are also setting TAINT_MACHINE_CHECK for corrected errors - perhaps
> this made sense when systems were small and machine checks were rare and
> scary.  But I think we need to start working with the reality that
> corrected errors are normal events.

I agree.  Corrected errors - by definition - have hardware corrected data.
There is no corruption so there is no reason for kernel taint.  It would
be like setting taint when one hard drive of a RAID file system goes bad.
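
As a rough illustration of the policy being argued for (not the kernel's actual code, which is C), the decision could key off the UC ("uncorrected error") bit, bit 61 of an x86 MCi_STATUS value. The function name and sample values below are hypothetical.

```python
# Sketch: taint only when the hardware reports an uncorrected error.
MCI_STATUS_VAL = 1 << 63  # status register contains valid data
MCI_STATUS_UC = 1 << 61   # error was NOT corrected by hardware


def should_taint(status):
    """Hypothetical policy: corrected errors never taint the kernel."""
    return bool(status & MCI_STATUS_UC)


# Illustrative status words, not taken from real hardware logs.
corrected_error = MCI_STATUS_VAL
uncorrected_error = MCI_STATUS_VAL | MCI_STATUS_UC

if __name__ == "__main__":
    print("corrected   ->", should_taint(corrected_error))
    print("uncorrected ->", should_taint(uncorrected_error))
```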

It's worth noting that Linux does not set taint when it recovers from
_uncorrected_ memory errors on IA64 (by killing the application
that consumed the bad data and discarding the bad page).  Modern hardware
has enough error detection/correction code to avoid undetected data 
corruption from memory errors.


-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc          rja@sgi.com
