From: Kyle Meyer <kyle.meyer@hpe.com>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: "bp@alien8.de" <bp@alien8.de>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [RFC] RAS/CEC: Should cec_notifier() set MCE_HANDLED_CEC after a soft-offline?
Date: Tue, 1 Oct 2024 13:45:27 -0500
Message-ID: <ZvxDRzW6_5dn2_X6@hpe.com>
In-Reply-To: <SJ1PR11MB60833BFA53B5E617C526828EFC772@SJ1PR11MB6083.namprd11.prod.outlook.com>

On Tue, Oct 01, 2024 at 06:24:19PM +0000, Luck, Tony wrote:
> > I noticed CEC should indicate whether it took action to log or handle an error
> > by setting MCE_HANDLED_CEC (commit 1de08dc) and that EDAC and dev-mcelog should
> > skip errors that have been processed by CEC (commit 23ba710).
> >
> > cec_notifier() does not set MCE_HANDLED_CEC when the offlining threshold
> > is reached in cec_add_elem() because the return code is not zero. Is that
> > intentional?
> 
> Kyle,
> 
> It seems a bit murky. You are right that cec_add_elem() appears to expect three
> different actions from its caller based on the return value being <0, 0, >0. But
> cec_notifier() only has two actions (0 and !0).
> 
> But I think this may be OK. The main purpose of CEC is to avoid over-reacting
> to simple corrected memory errors. Many (most?) are due to particle bit flips and
> no action is needed. So setting MCE_HANDLED_CEC for the case where CEC
> counted the error, but took no action feels like the right thing to do.
> 
> Conversely, if action was taken (because this was an error that repeated
> enough to hit the threshold) then we do want mcelog/EDAC to give additional
> reporting.

That makes sense. Thank you for the explanation.
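
To make sure I have the flow right, here is a rough user-space sketch of the
decision as described above. It is not the kernel code: the sketch_* names,
the placeholder flag value, and the way the return value is passed in are all
assumptions made for illustration. It only models "flag set when the return
value is 0, later consumers skip flagged errors".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit for this sketch only; not taken from the kernel headers. */
#define SKETCH_MCE_HANDLED_CEC (UINT64_C(1) << 0)

/*
 * Models the two-way decision described above: the flag is set only when
 * the (hypothetical) return value from the add-element step is exactly 0,
 * i.e. CEC counted the error and took no further action.
 */
static uint64_t sketch_cec_kflags(int add_elem_ret)
{
	return add_elem_ret == 0 ? SKETCH_MCE_HANDLED_CEC : 0;
}

/* Later consumers (EDAC, dev-mcelog) skip errors that carry the flag. */
static bool sketch_later_consumers_report(uint64_t kflags)
{
	return (kflags & SKETCH_MCE_HANDLED_CEC) == 0;
}

/* Walks the three return-value classes: < 0, 0, > 0. */
static void sketch_check_cases(void)
{
	/* 0: counted only, so later consumers stay quiet. */
	assert(!sketch_later_consumers_report(sketch_cec_kflags(0)));
	/* > 0: threshold reached and action taken, so they still report. */
	assert(sketch_later_consumers_report(sketch_cec_kflags(1)));
	/* < 0: also non-zero, so it falls into the same "report" branch. */
	assert(sketch_later_consumers_report(sketch_cec_kflags(-1)));
}
```

In this model the < 0 and > 0 cases are indistinguishable to the caller, which
is the "two actions (0 and !0)" point, and the soft-offline case still reaches
mcelog/EDAC.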

Thanks,
Kyle Meyer
