From: Borislav Petkov <bp@alien8.de>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: "Raj, Ashok" <ashok.raj@intel.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
Subject: Re: [Patch V1 1/3] x86, mce: MCE log size not enough for high core parts
Date: Thu, 24 Sep 2015 21:22:24 +0200
Message-ID: <20150924192224.GL3774@pd.tnic>
In-Reply-To: <3908561D78D1C84285E8C5FCA982C28F32B014D3@ORSMSX114.amr.corp.intel.com>
On Thu, Sep 24, 2015 at 07:00:46PM +0000, Luck, Tony wrote:
> > If we get new ones logged in the meantime and userspace hasn't managed
> > to consume and delete the present ones yet, we overwrite the oldest ones
> > and set MCE_OVERFLOW like mce_log does now for mcelog. And that's no
> > difference in functionality than what we have now.
>
> Ummmm. No.
>
> 	for (;;) {
>
> 		/*
> 		 * When the buffer fills up discard new entries.
> 		 * Assume that the earlier errors are the more
> 		 * interesting ones:
> 		 */
> 		if (entry >= MCE_LOG_LEN) {
> 			set_bit(MCE_OVERFLOW,
> 				(unsigned long *)&mcelog.flags);
> 			return;
> 		}
Ah, we return. But we shouldn't return - we should overwrite. I believe
we've talked about the policy of overwriting old errors with new ones.

TBH, I don't think there's a 100%-correct policy to follow when our
error logging buffers are full:
- we can overwrite old errors with new ones, but then we might lose
  the one important error record with which it all started.
- if we don't overwrite, we might fill up with "unimportant" correctable
  error records and miss other, more important ones which happen now.
- ...
We could try to implement some cheap heuristics which decide what to
overwrite and when, but I'm sceptical they'll always be correct...
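
To make the two policies concrete, here is a minimal single-threaded
sketch of a fixed-size log that either drops new records or overwrites
the oldest one when full. It is not the mce_log() code: LOG_LEN,
struct rec, struct errlog, log_add() and log_get() are all hypothetical
names made up for illustration, and the locking/cmpxchg handling that a
real MCE logging path needs is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LOG_LEN 4	/* hypothetical; stands in for MCE_LOG_LEN */

/* What to do with a new record when the log is already full. */
enum full_policy { DROP_NEW, OVERWRITE_OLDEST };

/* Hypothetical stand-in for an error record. */
struct rec {
	unsigned long long status;
};

struct errlog {
	struct rec entry[LOG_LEN];
	size_t head;	/* index of the oldest valid record */
	size_t count;	/* number of valid records */
	bool overflow;	/* set once anything was lost, like MCE_OVERFLOW */
};

/* Store r according to policy p. Returns true if r ended up in the log. */
static bool log_add(struct errlog *l, struct rec r, enum full_policy p)
{
	assert(l != NULL);

	if (l->count == LOG_LEN) {
		/* Either way a record is lost, so flag the overflow. */
		l->overflow = true;

		if (p == DROP_NEW)
			return false;	/* keep the earliest errors */

		/* OVERWRITE_OLDEST: give up the record at head. */
		l->head = (l->head + 1) % LOG_LEN;
		l->count--;
	}

	l->entry[(l->head + l->count) % LOG_LEN] = r;
	l->count++;
	return true;
}

/* Return the i-th oldest record, 0 <= i < count. */
static struct rec log_get(const struct errlog *l, size_t i)
{
	assert(l != NULL && i < l->count);
	return l->entry[(l->head + i) % LOG_LEN];
}
```

Feeding six records into a four-entry log, DROP_NEW keeps records 1-4
and loses the latest two, while OVERWRITE_OLDEST keeps records 3-6 and
loses the ones with which it all started. That is exactly the trade-off
above; neither choice is right in all cases.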
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.