* [PATCH] x86/mce: Look in genpool instead of mcelog.entry[] for pending error records
@ 2016-03-25 18:35 Tony Luck
2016-03-31 8:46 ` Borislav Petkov
0 siblings, 1 reply; 4+ messages in thread
From: Tony Luck @ 2016-03-25 18:35 UTC (permalink / raw)
To: Borislav Petkov; +Cc: linux-kernel, Borislav Petkov, Ashok Raj
Couple of issues here:
1) MCE_LOG_LEN is only 32 - so we may have more pending records than will
fit in the buffer on high core count cpus
2) During a panic we may have a lot of duplicate records because multiple
logical cpus may have seen and logged the same error because some
banks are shared.
Switch to using the genpool to look for the pending records. Squeeze
out duplicated records.
Signed-off-by: Tony Luck <tony.luck@intel.com>
---
New solution for an old problem. See
https://lkml.kernel.org/r/1443073720-3940-1-git-send-email-ashok.raj@intel.com
for the previous take where Ashok said "bump MCE_LOG_LEN from 32 to 128" and
Boris replied "switch to the gen_pool".
I think this clears the way to pull all of the /dev/mcelog driver
code out of mce.c and make it just another driver that registers
a notifier on x86_mce_decoder_chain.
arch/x86/kernel/cpu/mcheck/mce-genpool.c | 42 +++++++++++++++++++++++++++++++
arch/x86/kernel/cpu/mcheck/mce-internal.h | 10 ++++++++
arch/x86/kernel/cpu/mcheck/mce.c | 21 +++++++---------
3 files changed, 61 insertions(+), 12 deletions(-)
diff --git a/arch/x86/kernel/cpu/mcheck/mce-genpool.c b/arch/x86/kernel/cpu/mcheck/mce-genpool.c
index 0a850100c594..2dc276260fea 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-genpool.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-genpool.c
@@ -26,6 +26,48 @@ static struct gen_pool *mce_evt_pool;
static LLIST_HEAD(mce_event_llist);
static char gen_pool_buf[MCE_POOLSZ];
+static bool duplicate_mce_record(struct mce_evt_llist *t, struct mce_evt_llist *l)
+{
+ struct mce_evt_llist *node;
+ struct mce *m1, *m2;
+
+ m1 = &t->mce;
+
+ llist_for_each_entry(node, &l->llnode, llnode) {
+ m2 = &node->mce;
+
+ if (dup_mce_record(m1, m2))
+ return true;
+ }
+ return false;
+}
+
+/*
+ * The system has paniced - we'd like to peruse the list of mce records
+ * that have been queued, but not seen by anyone yet. The list is in
+ * reverse time order, so we need to reverse it. While doing that we can
+ * also drop duplicate records (these were logged because some banks are
+ * shared between cores or by all threads on a socket).
+ */
+struct llist_node *mce_gen_pool_panic(void)
+{
+ struct llist_node *head;
+ LLIST_HEAD(new_head);
+ struct mce_evt_llist *node, *t;
+
+ head = llist_del_all(&mce_event_llist);
+ if (!head)
+ return NULL;
+
+ /* squeeze out duplicates while reversing order */
+ llist_for_each_entry_safe(node, t, head, llnode) {
+ if (!duplicate_mce_record(node, t))
+ llist_add(&node->llnode, &new_head);
+ }
+
+ return new_head.first;
+}
+
void mce_gen_pool_process(void)
{
struct llist_node *head;
diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
index 547720efd923..492e38e4d02e 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -35,6 +35,7 @@ void mce_gen_pool_process(void);
bool mce_gen_pool_empty(void);
int mce_gen_pool_add(struct mce *mce);
int mce_gen_pool_init(void);
+struct llist_node *mce_gen_pool_panic(void);
extern int (*mce_severity)(struct mce *a, int tolerant, char **msg, bool is_excp);
struct dentry *mce_get_debugfs_dir(void);
@@ -81,3 +82,12 @@ static inline int apei_clear_mce(u64 record_id)
#endif
void mce_inject_log(struct mce *m);
+
+/*
+ * We consider records to be duplicate if bank+status+addr+misc all match
+ */
+static inline bool dup_mce_record(struct mce *m1, struct mce *m2)
+{
+ return m1->bank == m2->bank && m1->status == m2->status &&
+ m1->addr == m2->addr && m1->misc == m2->misc;
+}
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index a006f4cd792b..e23f8b96cb10 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -161,7 +161,6 @@ void mce_log(struct mce *mce)
if (!mce_gen_pool_add(mce))
irq_work_queue(&mce_irq_work);
- mce->finished = 0;
wmb();
for (;;) {
entry = mce_log_get_idx_check(mcelog.next);
@@ -194,7 +193,6 @@ void mce_log(struct mce *mce)
mcelog.entry[entry].finished = 1;
wmb();
- mce->finished = 1;
set_bit(0, &mce_need_notify);
}
@@ -290,7 +288,9 @@ static void wait_for_panic(void)
static void mce_panic(const char *msg, struct mce *final, char *exp)
{
- int i, apei_err = 0;
+ int apei_err = 0;
+ struct llist_node *pending;
+ struct mce_evt_llist *l;
if (!fake_panic) {
/*
@@ -307,11 +307,10 @@ static void mce_panic(const char *msg, struct mce *final, char *exp)
if (atomic_inc_return(&mce_fake_panicked) > 1)
return;
}
+ pending = mce_gen_pool_panic();
/* First print corrected ones that are still unlogged */
- for (i = 0; i < MCE_LOG_LEN; i++) {
- struct mce *m = &mcelog.entry[i];
- if (!(m->status & MCI_STATUS_VAL))
- continue;
+ llist_for_each_entry(l, pending, llnode) {
+ struct mce *m = &l->mce;
if (!(m->status & MCI_STATUS_UC)) {
print_mce(m);
if (!apei_err)
@@ -319,13 +318,11 @@ static void mce_panic(const char *msg, struct mce *final, char *exp)
}
}
/* Now print uncorrected but with the final one last */
- for (i = 0; i < MCE_LOG_LEN; i++) {
- struct mce *m = &mcelog.entry[i];
- if (!(m->status & MCI_STATUS_VAL))
- continue;
+ llist_for_each_entry(l, pending, llnode) {
+ struct mce *m = &l->mce;
if (!(m->status & MCI_STATUS_UC))
continue;
- if (!final || memcmp(m, final, sizeof(struct mce))) {
+ if (!final || !dup_mce_record(m, final)) {
print_mce(m);
if (!apei_err)
apei_err = apei_write_mce(m);
--
2.5.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] x86/mce: Look in genpool instead of mcelog.entry[] for pending error records
2016-03-25 18:35 [PATCH] x86/mce: Look in genpool instead of mcelog.entry[] for pending error records Tony Luck
@ 2016-03-31 8:46 ` Borislav Petkov
[not found] ` <b5198c79fc527cc63d858087bd20fd81b1c4011a.1460046517.git.tony.luck@intel.com>
0 siblings, 1 reply; 4+ messages in thread
From: Borislav Petkov @ 2016-03-31 8:46 UTC (permalink / raw)
To: Tony Luck; +Cc: linux-kernel, Ashok Raj
On Fri, Mar 25, 2016 at 11:35:17AM -0700, Tony Luck wrote:
> Couple of issues here:
> 1) MCE_LOG_LEN is only 32 - so we may have more pending records than will
> fit in the buffer on high core count cpus
> 2) During a panic we may have a lot of duplicate records because multiple
> logical cpus may have seen and logged the same error because some
> banks are shared.
>
> Switch to using the genpool to look for the pending records. Squeeze
> out duplicated records.
>
> Signed-off-by: Tony Luck <tony.luck@intel.com>
> ---
>
> New solution for an old problem. See
> https://lkml.kernel.org/r/1443073720-3940-1-git-send-email-ashok.raj@intel.com
> for the previous take where Ashok said "bump MCE_LOG_LEN from 32 to 128" and
> Boris replied "switch to the gen_pool".
>
> I think this clears the way to pull all of the /dev/mcelog driver
> code out of mce.c and make it just another driver that registers
> a notifier on x86_mce_decoder_chain.
Good, finally!
Once we have all the code around RAS daemon and things, we could finally
mark it deprecated and make it a default off config option for distros
which still need mcelog.
> arch/x86/kernel/cpu/mcheck/mce-genpool.c | 42 +++++++++++++++++++++++++++++++
> arch/x86/kernel/cpu/mcheck/mce-internal.h | 10 ++++++++
> arch/x86/kernel/cpu/mcheck/mce.c | 21 +++++++---------
> 3 files changed, 61 insertions(+), 12 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mcheck/mce-genpool.c b/arch/x86/kernel/cpu/mcheck/mce-genpool.c
> index 0a850100c594..2dc276260fea 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce-genpool.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce-genpool.c
> @@ -26,6 +26,48 @@ static struct gen_pool *mce_evt_pool;
> static LLIST_HEAD(mce_event_llist);
> static char gen_pool_buf[MCE_POOLSZ];
>
> +static bool duplicate_mce_record(struct mce_evt_llist *t, struct mce_evt_llist *l)
Hmm, having the verb at the beginning of the name makes me think this
function duplicates MCE records :)
Maybe
is_mce_record_duplicate()
or
mce_record_seen()
...
Bah, all ideas I have right now are meh.
> +{
> + struct mce_evt_llist *node;
> + struct mce *m1, *m2;
> +
> + m1 = &t->mce;
> +
> + llist_for_each_entry(node, &l->llnode, llnode) {
> + m2 = &node->mce;
> +
> + if (dup_mce_record(m1, m2))
> + return true;
> + }
> + return false;
> +}
> +
> +/*
> + * The system has paniced - we'd like to peruse the list of mce records
panicked
> + * that have been queued, but not seen by anyone yet. The list is in
> + * reverse time order, so we need to reverse it. While doing that we can
> + * also drop duplicate records (these were logged because some banks are
> + * shared between cores or by all threads on a socket).
> + */
> +struct llist_node *mce_gen_pool_panic(void)
I'd call it mce_gen_pool_prepare_records() or so.
> +{
> + struct llist_node *head;
> + LLIST_HEAD(new_head);
> + struct mce_evt_llist *node, *t;
> +
> + head = llist_del_all(&mce_event_llist);
> + if (!head)
> + return NULL;
> +
> + /* squeeze out duplicates while reversing order */
> + llist_for_each_entry_safe(node, t, head, llnode) {
> + if (!duplicate_mce_record(node, t))
> + llist_add(&node->llnode, &new_head);
> + }
> +
> + return new_head.first;
> +}
> +
> void mce_gen_pool_process(void)
> {
> struct llist_node *head;
> diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
> index 547720efd923..492e38e4d02e 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
> +++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
> @@ -35,6 +35,7 @@ void mce_gen_pool_process(void);
> bool mce_gen_pool_empty(void);
> int mce_gen_pool_add(struct mce *mce);
> int mce_gen_pool_init(void);
> +struct llist_node *mce_gen_pool_panic(void);
>
> extern int (*mce_severity)(struct mce *a, int tolerant, char **msg, bool is_excp);
> struct dentry *mce_get_debugfs_dir(void);
> @@ -81,3 +82,12 @@ static inline int apei_clear_mce(u64 record_id)
> #endif
>
> void mce_inject_log(struct mce *m);
> +
> +/*
> + * We consider records to be duplicate if bank+status+addr+misc all match
> + */
> +static inline bool dup_mce_record(struct mce *m1, struct mce *m2)
Since this is a "comparison" of sorts, I'd prefer something like:
mce_records_equal(m1, m2);
or even better:
mce_cmp(m1, m2);
:-)
> +{
> + return m1->bank == m2->bank && m1->status == m2->status &&
> + m1->addr == m2->addr && m1->misc == m2->misc;
Should be more readable like this:
return m1->bank == m2->bank &&
m1->status == m2->status &&
m1->addr == m2->addr &&
m1->misc == m2->misc;
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCHv2] x86/mce: Look in genpool instead of mcelog.entry[] for pending error records
[not found] ` <b5198c79fc527cc63d858087bd20fd81b1c4011a.1460046517.git.tony.luck@intel.com>
@ 2016-04-07 17:03 ` Borislav Petkov
[not found] ` <0a9f1fc6b1626ad2b9cb6f1a7c621249b39d1a63.1460058327.git.tony.luck@intel.com>
0 siblings, 1 reply; 4+ messages in thread
From: Borislav Petkov @ 2016-04-07 17:03 UTC (permalink / raw)
To: Tony Luck; +Cc: linux-kernel, Ashok Raj
On Thu, Apr 07, 2016 at 09:34:06AM -0700, Tony Luck wrote:
> Couple of issues here:
> 1) MCE_LOG_LEN is only 32 - so we may have more pending records than will
> fit in the buffer on high core count cpus
> 2) During a panic we may have a lot of duplicate records because multiple
> logical cpus may have seen and logged the same error because some
> banks are shared.
>
> Switch to using the genpool to look for the pending records. Squeeze
> out duplicated records.
>
> Signed-off-by: Tony Luck <tony.luck@intel.com>
> ---
> v2: Better names and code layout (Boris)
> Revised commments on mce record comparisons (Ashok)
>
> arch/x86/kernel/cpu/mcheck/mce-genpool.c | 46 +++++++++++++++++++++++++++++++
> arch/x86/kernel/cpu/mcheck/mce-internal.h | 15 ++++++++++
> arch/x86/kernel/cpu/mcheck/mce.c | 21 ++++++--------
> 3 files changed, 70 insertions(+), 12 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mcheck/mce-genpool.c b/arch/x86/kernel/cpu/mcheck/mce-genpool.c
> index 0a850100c594..c43050b91d6d 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce-genpool.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce-genpool.c
> @@ -26,6 +26,52 @@ static struct gen_pool *mce_evt_pool;
> static LLIST_HEAD(mce_event_llist);
> static char gen_pool_buf[MCE_POOLSZ];
>
> +/*
> + * Compare the record "t" with each of the records on list "l" to see if
> + * a functionally equivalent one is present in the list.
functionally?
> + */
> +static bool is_duplicate_mce_record(struct mce_evt_llist *t, struct mce_evt_llist *l)
> +{
> + struct mce_evt_llist *node;
> + struct mce *m1, *m2;
> +
> + m1 = &t->mce;
> +
> + llist_for_each_entry(node, &l->llnode, llnode) {
> + m2 = &node->mce;
> +
> + if (mce_cmp(m1, m2))
Sorry for nitpicking but isn't it usually the case that a
_cmp()-something function should return 0 when both things are equal?
I.e., you have:
if (!strcmp(s1, s2))
...
I think if we do it this way here too, it'll be very natural. mce_cmp()
would then have to do:
return !(m1->bank == m2->bank &&
m1->status == m2->status &&
m1->addr == m2->addr &&
m1->misc == m2->misc);
simply.
Hmmm?
Rest looks ok.
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCHv3] x86/mce: Look in genpool instead of mcelog.entry[] for pending error records
[not found] ` <0a9f1fc6b1626ad2b9cb6f1a7c621249b39d1a63.1460058327.git.tony.luck@intel.com>
@ 2016-04-26 18:11 ` Borislav Petkov
0 siblings, 0 replies; 4+ messages in thread
From: Borislav Petkov @ 2016-04-26 18:11 UTC (permalink / raw)
To: Tony Luck; +Cc: linux-kernel, Ashok Raj
On Fri, Apr 08, 2016 at 10:00:50AM -0700, Tony Luck wrote:
> Couple of issues here:
> 1) MCE_LOG_LEN is only 32 - so we may have more pending records than will
> fit in the buffer on high core count cpus
> 2) During a panic we may have a lot of duplicate records because multiple
> logical cpus may have seen and logged the same error because some
> banks are shared.
>
> Switch to using the genpool to look for the pending records. Squeeze
> out duplicated records.
>
> Signed-off-by: Tony Luck <tony.luck@intel.com>
> ---
> v3: Make mce_cmp() behave like other *cmp() functions: return 0 for equality (Boris)
>
> v2: Better names and code layout (Boris)
> Revised commments on mce record comparisons (Ashok)
>
> arch/x86/kernel/cpu/mcheck/mce-genpool.c | 46 +++++++++++++++++++++++++++++++
> arch/x86/kernel/cpu/mcheck/mce-internal.h | 15 ++++++++++
> arch/x86/kernel/cpu/mcheck/mce.c | 21 ++++++--------
> 3 files changed, 70 insertions(+), 12 deletions(-)
Applied, thanks.
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2016-04-26 18:11 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-03-25 18:35 [PATCH] x86/mce: Look in genpool instead of mcelog.entry[] for pending error records Tony Luck
2016-03-31 8:46 ` Borislav Petkov
[not found] ` <b5198c79fc527cc63d858087bd20fd81b1c4011a.1460046517.git.tony.luck@intel.com>
2016-04-07 17:03 ` [PATCHv2] " Borislav Petkov
[not found] ` <0a9f1fc6b1626ad2b9cb6f1a7c621249b39d1a63.1460058327.git.tony.luck@intel.com>
2016-04-26 18:11 ` [PATCHv3] " Borislav Petkov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox