From: Borislav Petkov <bp@alien8.de>
To: "William Roche" <william.roche@oracle.com>
Cc: linux-kernel@vger.kernel.org, Tony Luck <tony.luck@intel.com>,
linux-edac@vger.kernel.org
Subject: Re: [PATCH v1] RAS/CEC: Memory Corrected Errors consistent event filtering
Date: Fri, 26 Mar 2021 20:02:42 +0100
Message-ID: <20210326190242.GI25229@zn.tnic>
In-Reply-To: <1616783429-6793-1-git-send-email-william.roche@oracle.com>
On Fri, Mar 26, 2021 at 02:30:29PM -0400, William Roche wrote:
> From: William Roche <william.roche@oracle.com>
>
> The Corrected Error events collected by the cec_add_elem() have to be
> consistently filtered out.
> We fix the case where the value of find_elem() to find the slot of a pfn
> was mistakenly used as the return value of the function.
> Now the MCE notifiers chain relying on MCE_HANDLED_CEC would only report
> filtered corrected errors that reached the action threshold.
>
> Signed-off-by: William Roche <william.roche@oracle.com>
> ---
>
> Notes:
> Some machines are reporting Corrected Errors events without any
> information about a PFN Soft-offlining or Invalid pfn (report given by
> the EDAC module or the mcelog daemon).
>
> A research showed that it reflected the first occurrence of a CE error
> on the system which should have been filtered by the RAS_CEC component.
> We could also notice that if 2 PFNs are impacted by CE errors, the PFN
> on the non-zero slot gets its CE errors reported every-time instead of
> being filtered out.
>
> This problem has appeared with the introduction of commit
> de0e0624d86ff9fc512dedb297f8978698abf21a where the filtering logic has
> been modified.
>
> Could you please review this small suggested fix ?
>
> Thanks in advance for any feedback you could have.
> William.
>
> drivers/ras/cec.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
AFAIU, I think you want something like the below untested hunk:
You set ret to 0 when the element cannot be found and a new one gets
added. The "ret = 1" we can remove because callers don't care about the
offlining threshold - the only caller that looks at the retval wants to
know whether the VA was added successfully, so that it can note that it
handled the error.
Makes sense?
---
diff --git a/drivers/ras/cec.c b/drivers/ras/cec.c
index ddecf25b5dd4..a29994d726d8 100644
--- a/drivers/ras/cec.c
+++ b/drivers/ras/cec.c
@@ -341,6 +341,8 @@ static int cec_add_elem(u64 pfn)
 
 		ca->array[to] = pfn << PAGE_SHIFT;
 		ca->n++;
+
+		ret = 0;
 	}
 
 	/* Add/refresh element generation and increment count */
@@ -363,12 +365,6 @@ static int cec_add_elem(u64 pfn)
 
 		del_elem(ca, to);
 
-		/*
-		 * Return a >0 value to callers, to denote that we've reached
-		 * the offlining threshold.
-		 */
-		ret = 1;
-
 		goto unlock;
 	}
 
---
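To make the intended retval contract concrete, here is a rough userspace
model of it. This is only a sketch and not the code in drivers/ras/cec.c:
every model_* name, the slot count, the threshold, the 12-bit shift and the
-ENOENT/-ENOSPC values are made up for illustration, and model_find() uses a
simplified "0 when found" convention which is an assumption of the sketch.
The point is only that a fresh insert and a threshold hit both return 0, and
the caller merely checks for zero.

```c
/* Userspace model only; NOT the code in drivers/ras/cec.c. */
#include <errno.h>
#include <stdint.h>

#define MODEL_SLOTS	4	/* hypothetical capacity */
#define MODEL_THRESHOLD	3	/* hypothetical action threshold */
#define MODEL_HANDLED	0x1	/* stands in for MCE_HANDLED_CEC */

struct model_array {
	uint64_t pfn[MODEL_SLOTS];
	unsigned int count[MODEL_SLOTS];
	int n;
	int disabled;
};

/* Hypothetical lookup: 0 and *to = slot when found, -ENOENT otherwise. */
static int model_find(const struct model_array *ca, uint64_t pfn, int *to)
{
	for (int i = 0; i < ca->n; i++) {
		if (ca->pfn[i] == pfn) {
			*to = i;
			return 0;
		}
	}
	return -ENOENT;
}

/* Remove slot idx by shifting the following slots down by one. */
static void model_del(struct model_array *ca, int idx)
{
	for (int i = idx; i < ca->n - 1; i++) {
		ca->pfn[i] = ca->pfn[i + 1];
		ca->count[i] = ca->count[i + 1];
	}
	ca->n--;
}

/* 0: the error was recorded (or acted upon); <0: it could not be recorded. */
static int model_add_elem(struct model_array *ca, uint64_t pfn)
{
	int to, ret;

	if (ca->disabled)
		return -ENODEV;

	ret = model_find(ca, pfn, &to);
	if (ret < 0) {
		if (ca->n == MODEL_SLOTS)
			return -ENOSPC;	/* full-array handling left out of this sketch */

		to = ca->n++;
		ca->pfn[to] = pfn;
		ca->count[to] = 0;

		ret = 0;	/* a fresh insert is a success, not a failed lookup */
	}

	/* Threshold reached: drop the element, but still report success. */
	if (++ca->count[to] >= MODEL_THRESHOLD)
		model_del(ca, to);

	return ret;
}

/* Caller side: only "recorded or not" matters, not the threshold. */
static unsigned int model_notify(struct model_array *ca, uint64_t addr)
{
	return model_add_elem(ca, addr >> 12) ? 0 : MODEL_HANDLED;
}
```

With a zero-initialized struct model_array, the first error on a pfn, errors
on a second pfn and the error that trips the threshold all make
model_notify() return MODEL_HANDLED in this model; only a disabled or full
array does not.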
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette