Re: [PATCH 3/3] EDAC: Convert AMD EDAC pieces to use RAS printk buffer


From: Borislav Petkov <bp@amd64.org>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: Borislav Petkov <bp@amd64.org>,
	Mauro Carvalho Chehab <mchehab@redhat.com>,
	Ingo Molnar <mingo@elte.hu>,
	EDAC devel <linux-edac@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 3/3] EDAC: Convert AMD EDAC pieces to use RAS printk buffer
Date: Tue, 27 Mar 2012 21:11:12 +0200
Message-ID: <20120327191112.GA11587@aftab>
In-Reply-To: <3908561D78D1C84285E8C5FCA982C28F15B72395@ORSMSX104.amr.corp.intel.com>

On Tue, Mar 27, 2012 at 06:35:37PM +0000, Luck, Tony wrote:
> > In any case, if during the safe period of time we haven't received
> > confirmation from userspace that the item has been consumed, we switch
> > irreversibly back to the kernel log buffer and reissue the decoded info
> > through printk.
> 
> I'm not sure I like irreversible things.
> 
> Here's the life cycle:
> 
> 1) System boots ... we have a window during this time where there is
>    no daemon (or any user space at all).
> 
> 2) Daemon gets started from /etc/init.d or systemd script
> 
> 3) (optional) New version of daemon installed in update (old daemon is terminated, new one starts).
> 
> 4) System is shutdown - all daemons terminated
> 
> 5) System actually halts.
> 
> 
> So we clearly have some gaps where there isn't a daemon.  Most of them should
> be pretty short ... but I worry about the gap from #1 to #2 - which can be pretty
> long if we need to fsck some disks (or we're on some crazy big system that takes
> many minutes just to find and spin-up all the disks).

Well, currently we queue MCEs for later consumption before the decoder
chains have been registered etc: 0937195715713. We could probably delay
the draining of the buffer until we have userspace and the daemon running.

The problem with this is that the buffer size is limited: 32 struct mce's,
and it can overflow pretty fast on a b0rked system which spews a lot of MCEs
during boot.
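
To make the overflow concern concrete, here is a minimal userspace sketch of
a fixed 32-slot queue that counts what it has to drop once it is full. It is
not the kernel's actual early MCE queue: struct early_mce, early_mce_queue()
and early_mce_drain() are hypothetical names made up for illustration, and
only the 32-entry limit comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EARLY_MCE_SLOTS 32	/* the 32-entry limit mentioned above */

/* Hypothetical stand-in for struct mce; two fields are enough here. */
struct early_mce {
	uint64_t status;
	uint64_t addr;
};

struct early_mce_buf {
	struct early_mce slot[EARLY_MCE_SLOTS];
	size_t head;		/* index of the oldest queued record */
	size_t count;		/* number of records currently queued */
	unsigned long dropped;	/* records lost because the buffer was full */
};

/* Queue one record; when the buffer is full, count a drop and return false. */
static bool early_mce_queue(struct early_mce_buf *b, const struct early_mce *m)
{
	if (b->count == EARLY_MCE_SLOTS) {
		b->dropped++;
		return false;
	}
	b->slot[(b->head + b->count) % EARLY_MCE_SLOTS] = *m;
	b->count++;
	return true;
}

/*
 * Drain everything in FIFO order, handing each record to consume() if one
 * is given. Returns the number of records drained.
 */
static size_t early_mce_drain(struct early_mce_buf *b,
			      void (*consume)(const struct early_mce *))
{
	size_t n = 0;

	assert(b->count <= EARLY_MCE_SLOTS);
	while (b->count) {
		if (consume)
			consume(&b->slot[b->head]);
		b->head = (b->head + 1) % EARLY_MCE_SLOTS;
		b->count--;
		n++;
	}
	return n;
}
```

The dropped counter is the interesting part: if draining is delayed until a
daemon shows up, whoever drains the buffer can at least report how many
records were lost in the meantime.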

We probably could provide for enlarging that when needed as a Kconfig or
a boot option using early memblock allocations or whatever...

Then, after maybe a configurable period of uptime (it should be chosen
to be safe for most systems out there and the others could configure in
a higher timeout if they need to) we start spewing out decoded MCEs into
dmesg unless a daemon has drained the buffers before that.

Or something to that effect...

Concerning the irreversibility, we could probably teach the code to stop
printk'ing MCEs if the daemon has been restarted in the meantime...
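
As a rough sketch of how the timeout and the reversible fallback could fit
together, the routing decision might be reduced to a small function like the
one below. Everything here is an assumption for illustration only: the names
(struct ras_route, ras_pick_sink(), the SINK_* values), the timeout in
seconds and the way daemon liveness is tracked are not taken from the
patchset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Where a decoded MCE should go right now (hypothetical names). */
enum mce_sink {
	SINK_QUEUE,	/* keep buffering, the grace period is still running */
	SINK_DAEMON,	/* a daemon is consuming records */
	SINK_PRINTK,	/* fall back to the kernel log */
};

struct ras_route {
	unsigned long timeout_s;	/* configurable grace period (assumed unit: seconds) */
	bool daemon_alive;		/* true while a daemon is known to be draining */
};

/* Pick the sink for a record arriving at the given uptime. */
static enum mce_sink ras_pick_sink(const struct ras_route *r,
				   unsigned long uptime_s)
{
	assert(r != NULL);
	if (r->daemon_alive)
		return SINK_DAEMON;
	if (uptime_s < r->timeout_s)
		return SINK_QUEUE;
	return SINK_PRINTK;
}

/* A daemon (re)attached: stop printk'ing, which makes the fallback reversible. */
static void ras_daemon_attached(struct ras_route *r)
{
	r->daemon_alive = true;
}

/* The daemon went away (update, shutdown): records go to printk again. */
static void ras_daemon_gone(struct ras_route *r)
{
	r->daemon_alive = false;
}
```

Since the sink is recomputed for every record instead of being latched once,
a restarted daemon takes over again without any extra state to reset.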

Thanks.

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551
