From: "Raj, Ashok" <ashok.raj@intel.com>
To: Borislav Petkov <bp@alien8.de>
Cc: "Luck, Tony" <tony.luck@intel.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>,
Ashok Raj <ashok.raj@intel.com>
Subject: Re: [Patch V1 1/3] x86, mce: MCE log size not enough for high core parts
Date: Thu, 24 Sep 2015 13:22:12 -0700
Message-ID: <20150924202212.GA13075@linux.intel.com>
In-Reply-To: <20150924192224.GL3774@pd.tnic>
Hi Boris
On Thu, Sep 24, 2015 at 09:22:24PM +0200, Borislav Petkov wrote:
>
> Ah, we return. But we shouldn't return - we should overwrite. I believe
> we've talked about the policy of overwriting old errors with new ones.
>
Another reason I had a separate buffer in my earlier patch was to avoid
calling RCU functions from the offline CPU. I had an offline discussion
with Paul McKenney, and he said don't do that...
mce_gen_pool_add()->gen_pool_alloc() calls rcu_read_lock() and such,
so it didn't seem appropriate.
Also, the function doesn't seem safe to call in NMI context. Although
MCE is different, for all intents and purposes we should treat both as the
same priority. The old-style log is simple and tested in those cases.
I like everything you say below... it's something we could do as the next
phase of improving logging, and it might need more careful work to build it
right. Just like how MC banks have overwrite rules, we could possibly do
something like that if the buffer fills up.
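To make the "overwrite rules when the buffer fills up" idea concrete, here is
a minimal user-space sketch. It is not from the patch series: the names
(LOG_LEN, enum sev, struct rec, struct err_log, log_add) and the policy itself
are assumptions chosen for illustration. The assumed policy is that a full log
only replaces a record with a strictly more severe one, and among the least
severe records it evicts the newest, so the record that started it all
survives. It makes no attempt at locking or at being safe in NMI/MCE context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LOG_LEN 4	/* hypothetical size, kept tiny for illustration */

/* Hypothetical severity ranking: a higher value may overwrite a lower one. */
enum sev { SEV_CORRECTED = 0, SEV_UNCORRECTED = 1, SEV_FATAL = 2 };

struct rec {
	enum sev sev;
	unsigned long long status;	/* stand-in for the real record payload */
};

struct err_log {
	struct rec slot[LOG_LEN];
	size_t used;
};

/*
 * Add a record. Returns true if it was stored, false if it was dropped.
 * While there is room, records are appended. Once the log is full, the
 * new record replaces the newest of the least severe entries, but only
 * if the new record is strictly more severe than that entry.
 */
static bool log_add(struct err_log *log, struct rec r)
{
	size_t i, victim = 0;

	assert(log != NULL);

	if (log->used < LOG_LEN) {
		log->slot[log->used++] = r;
		return true;
	}

	/* "<=" moves the victim towards later slots, so older entries win ties. */
	for (i = 1; i < LOG_LEN; i++)
		if (log->slot[i].sev <= log->slot[victim].sev)
			victim = i;

	if (r.sev <= log->slot[victim].sev)
		return false;	/* nothing less severe to evict: drop it */

	log->slot[victim] = r;
	return true;
}
```

With this policy a flood of corrected errors can no longer push out an
uncorrected one, but once the log holds only records of the highest severity,
further records of that severity are dropped, which is the same trade-off
discussed in the quoted text below.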
> TBH, I don't think there's a 100%-correct policy to act according to
> when our error logging buffers are full:
>
> - we can overwrite old errors with new but then this way we might lose
> the one important error record with which it all started.
>
> - if we don't overwrite, we might fill up with "unimportant" correctable
> error records and miss other, more important ones which happen now
>
> - ...
>
> We could try to implement some cheap heuristics which decide what and
> when to overwrite but I'm sceptical it'll be always correct...
>
> --
> Regards/Gruss,
> Boris.
>
> ECO tip #101: Trim your mails when you reply.