From: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
To: Andi Kleen <ak@linux.intel.com>
Cc: linux-kernel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>
Subject: Re: [PATCH -tip 1/3] x86, mce: Add mce_threshold option for intel cmci
Date: Fri, 27 Mar 2009 18:44:12 +0900
Message-ID: <49CC9FEC.6090300@jp.fujitsu.com>
In-Reply-To: <49CB4677.9010403@linux.intel.com>

Andi Kleen wrote:
> Hidetoshi Seto wrote:
>> This patch adds a kernel parameter "mce_threshold=n" to enable us
>> to change the default threshold for CMCI(Corrected Machine Check
>> Interrupt) that recent Intel processor supports.
> 
> I intentionally didn't implement this because it seemed not needed.

I know that was your intention, since you mentioned it in the
description of the earlier patch that implemented CMCI support.

> Any threshold in the actual error reporting should be implemented
> in the user space processing backend, but not in the CPU, because
> they typically need to be more fine grained than just per bank,
> and the CPU cannot do that.

I believe one of the reasons there is thresholding in the CPU is
that it can help user space.  Not every backend in user space
requires such fine granularity.  Coarser granularity should also be
supported.
e.g. It would be useful if the backend could account 5 errors as 1 grain.
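
To illustrate the "5 errors as 1 grain" accounting, here is a rough
user-space sketch.  It is not the kernel code and not part of the patch.
The struct and function names are hypothetical and only meant to show
the idea of reporting once per `threshold` corrected errors.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-bank counter; not a kernel structure. */
struct ce_counter {
	unsigned int threshold;	/* corrected errors per reported "grain" */
	unsigned int pending;	/* errors seen since the last report */
	unsigned long grains;	/* number of reports issued so far */
};

static void ce_counter_init(struct ce_counter *c, unsigned int threshold)
{
	assert(threshold > 0);	/* a zero threshold would never make sense here */
	c->threshold = threshold;
	c->pending = 0;
	c->grains = 0;
}

/* Record one corrected error; return true when a full grain is reached. */
static bool ce_counter_record(struct ce_counter *c)
{
	if (++c->pending < c->threshold)
		return false;
	c->pending = 0;
	c->grains++;
	return true;
}
```

With a threshold of 5, twelve corrected errors would produce two
reports and leave two errors pending.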

> The only potential reason for implementing this threshold at the
> CPU level is if someone is concerned about CPU consumption during error storms.
> But then the threshold should be dynamically adjusted based on the
> current rate, otherwise it doesn't help.

So sysfs is required for such usage, right?
I already have another patch that adds a sysfs interface.
I'll post it next time if it helps.
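
For reference, a rate-based adjustment of the kind mentioned above
could look roughly like the following user-space sketch.  This is only
an assumption about one possible policy (double the threshold during a
storm, halve it when things are quiet), not anything that exists in the
patch.  All names and fields are hypothetical.

```c
#include <assert.h>

/* Hypothetical adaptive-threshold state; not a kernel structure. */
struct ce_adaptive {
	unsigned int threshold;		/* current threshold */
	unsigned int min;		/* lower bound, at least 1 */
	unsigned int max;		/* upper bound */
	unsigned int storm_rate;	/* interrupts per interval treated as a storm */
	unsigned int quiet_rate;	/* interrupts per interval treated as quiet */
};

/*
 * Called once per sampling interval with the number of interrupts seen
 * in that interval.  Returns the threshold to use for the next interval.
 */
static unsigned int ce_adaptive_update(struct ce_adaptive *a, unsigned int irqs)
{
	assert(a->min >= 1 && a->min <= a->max);

	if (irqs > a->storm_rate) {
		/* Storm: double the threshold, clamped to max without overflow. */
		a->threshold = (a->threshold > a->max / 2) ? a->max
							   : a->threshold * 2;
	} else if (irqs < a->quiet_rate) {
		/* Quiet: halve the threshold, clamped to min. */
		unsigned int halved = a->threshold / 2;

		a->threshold = (halved < a->min) ? a->min : halved;
	}
	/* Between the two rates the threshold is left unchanged. */
	return a->threshold;
}
```

Whatever the policy is, the new value has to be written at runtime,
which is why a runtime interface such as sysfs comes up here.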

> But I didn't do this so far because I didn't want to overengineer
> and in general if you have a error storm you're likely soon dead
> anyways.

It is always said that corrected errors (and CE storms) will soon
lead to an uncorrected error.  But AFAIK there are no statistics on
how long that "soon" is.

Assume a component starts to assert CEs.  You will not stop the
system, but just schedule the next maintenance for the weekend, the
end of the month, or so.  Nothing is wrong with that.
I suppose we can have something to support the few days until the
maintenance.

> Also even if this was implemented a boot option would seem
> like the wrong interface compared to sysfs.

CMCI is enabled before sysfs creation, isn't it?
If someone would like to disable CMCI altogether, it seems sysfs is
not enough.

> Can you please describe your rationale for this more clearly?

At first I was asked about the default threshold of CMCI, and
noticed there is no way to know the default value, some kind of
"factory default."  So my concern is whether "1", the default value in
the current implementation, is really an appropriate value or not.

I told this to the person who asked and got the following responses:
1) We have heard that some customers are already complaining about
  error reporting for "every" CE.  So thresholding is a nice solution
  for such cases.  Is it adjustable?
2) Usually reporting corrected errors never has high priority, so a
  high threshold, even one higher than the reference value, would be
  preferred over a low one.
3) The reference value might vary for every bank.  So it would be best
  if we could have per-bank adjusters, but it will be simple and still
  acceptable if we only have a global adjuster for all banks, because of
  the logic in 2).

And additionally:
4) We have also heard that some have no interest in correctable errors
  at all!  In such a case, the kernel message "Machine check events
  logged" for CEs (it is at KERN_INFO level and already rate-limited)
  can be "noise" in syslog.  Can we disable the CE-related stuff
  entirely?
5) Our BIOS provides a log good enough to identify the faulty component,
  so the OS log is rarely used in the maintenance phase.  Comparing
  these logs will cause confusion if they use different thresholds and
  one reports an error while the other does not.  Which log is better
  depends on the platform, but I suppose disabling the OS feature
  might be a good option for platforms where the BIOS wins.
6) In the past, the EDAC driver troubled us by conflicting with the
  BIOS, since it clears error information in the memory controller.
  That would not happen on recent platforms whose processors have an
  integrated memory controller.  However, it would be a nice workaround
  to have a switch to disable error monitoring by the OS in advance,
  just in case there is some nasty conflict with the BIOS or hardware.
  An update or quirk for such an issue takes time and will rarely be
  in time.

So in summary, the conclusion is that it is better to have a threshold
adjuster as an option (at least a global one), and also to add some
switch to disable CE features just in case of trouble.


Thanks,
H.Seto


