From: Borislav Petkov <bp@amd64.org>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	Mauro Carvalho Chehab <mchehab@redhat.com>,
	Ingo Molnar <mingo@elte.hu>,
	edac-devel <linux-edac@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: RAS trace event proto
Date: Wed, 22 Feb 2012 16:59:48 +0100
Message-ID: <20120222155948.GF26845@aftab>
In-Reply-To: <20120222104324.GA26845@aftab>

On Wed, Feb 22, 2012 at 11:43:24AM +0100, Borislav Petkov wrote:
> This will keep the bloat level to a minimum, keep the TPs apart and
> hopefully make all of us happy :).

Btw, here's what the rough MCE TP trace_mce_record() output looks like:

       mcegen.py-2715  [001] .N..  1049.818840: mce_record: [Hardware Error]: CPU:0     MC4_STATUS[Over|UE|-|PCC|AddrV|UECC]: 0xf604a00006080a41
[Hardware Error]:       MC4_ADDR: 0xbabedeaddeadbeef
[Hardware Error]: Northbridge Error (node 0): DRAM ECC error detected
(CPU: 0, MCGc/s: 0/0, MC4: f604a00006080a41, ADDR/MISC: babedeaddeadbeef/dead57ac1ba0babe, RIP: 00:<0000000000000000>, TSC: 0, PROCESSOR: 0:0, TIME: 0, SOCKET: 0, APIC: 0)
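
For readers who want to cross-check the bracketed flags against the raw
MC4_STATUS value above, here is a minimal sketch. It assumes only the
architectural x86 MCi_STATUS flag bits (63 down to 57); the names
STATUS_BITS and decode_status are made up for this illustration and are
not part of the tracepoint. Note that the trace prints the UC bit as "UE".

```python
# Architectural MCi_STATUS flag bits on x86 (bit position, name).
# Vendor-specific bits such as the ECC indicators are left out on purpose.
STATUS_BITS = [
    (63, "Val"),    # register contents are valid
    (62, "Over"),   # an earlier error was overwritten
    (61, "UC"),     # uncorrected error (shown as "UE" in the trace)
    (60, "En"),     # error reporting enabled
    (59, "MiscV"),  # MCi_MISC holds valid data
    (58, "AddrV"),  # MCi_ADDR holds valid data
    (57, "PCC"),    # processor context corrupt
]


def decode_status(status):
    """Return the names of the architectural flag bits set in status."""
    return [name for bit, name in STATUS_BITS if status & (1 << bit)]


if __name__ == "__main__":
    # Status value taken from the trace line above.
    print("|".join(decode_status(0xF604A00006080A41)))
```

The "-" in the bracketed list of the trace lines up with MiscV being clear
in this value.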

Basically, the userspace daemon will consume the error string (after
it's been massaged into looking prettier and smaller :-)) (1st arg)
and dump it to some logs, and use some of the MCE fields to do error
collection and thresholding/ratelimiting/whatever.
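
As a rough idea of what the consuming side could look like, here is a
hedged sketch of a userspace parser plus a trivial threshold counter. It
works on the parenthesized field line from the trace above and assumes
that layout stays as shown, which is not settled. parse_fields, Threshold
and the choice of (CPU, address) as the counting key are all assumptions
for illustration, not an existing daemon's API.

```python
from collections import Counter

# Field line copied from the trace output above.
SAMPLE = (
    "(CPU: 0, MCGc/s: 0/0, MC4: f604a00006080a41, "
    "ADDR/MISC: babedeaddeadbeef/dead57ac1ba0babe, "
    "RIP: 00:<0000000000000000>, TSC: 0, PROCESSOR: 0:0, TIME: 0, "
    "SOCKET: 0, APIC: 0)"
)


def parse_fields(line):
    """Split '(KEY: value, KEY: value, ...)' into a dict of strings."""
    body = line.strip()
    if body.startswith("(") and body.endswith(")"):
        body = body[1:-1]
    fields = {}
    for item in body.split(", "):
        # Split on the first ': ' only, so values like '00:<...>' survive.
        key, _, value = item.partition(": ")
        fields[key] = value
    return fields


class Threshold:
    """Count errors per (CPU, address) and flag when a limit is reached."""

    def __init__(self, limit):
        self.limit = limit
        self.counts = Counter()

    def record(self, fields):
        addr = fields["ADDR/MISC"].split("/")[0]
        key = (fields["CPU"], addr)
        self.counts[key] += 1
        return self.counts[key] >= self.limit


if __name__ == "__main__":
    threshold = Threshold(limit=3)
    event = parse_fields(SAMPLE)
    for _ in range(3):
        if threshold.record(event):
            print("threshold reached for CPU", event["CPU"])
```

A real daemon would read the events from the tracing ring buffer rather
than parse text, and would likely age out old counts; both are omitted
here to keep the sketch short.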

While at it, I'm also looking very critically at the fields SOCKET,
APIC and TSC (we already have walltime), because I'd like to drop them.
Also, MC4 should be MC4_STATUS btw.

To be continued...

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551
