From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Jan Beulich" <jbeulich@novell.com>
Subject: Re: [PATCH] 3/3: MCA/MCE correctable error
	handling
Date: Wed, 22 Aug 2007 11:09:41 +0100
Message-ID: <46CC2785.76E4.0078.0@novell.com>
References: <200708211531.44997.Christoph.Egger@amd.com>
	<46CB28CE.76E4.0078.0@novell.com>
	<200708221100.34795.Christoph.Egger@amd.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable
Return-path: <xen-devel-bounces@lists.xensource.com>
In-Reply-To: <200708221100.34795.Christoph.Egger@amd.com>
Content-Disposition: inline
List-Unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xensource.com>
List-Help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-Subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Christoph Egger <Christoph.Egger@amd.com>, xen-devel@lists.xensource.com
Cc: Gavin.Maltby@sun.com, Keir Fraser <keir@xensource.com>
List-Id: xen-devel@lists.xenproject.org

>>> "Christoph Egger" <Christoph.Egger@amd.com> 22.08.07 11:00 >>>
>On Tuesday 21 August 2007 18:02:54 Jan Beulich wrote:
>> >+		if (mc_global->mc_flags & MC_FLAG_UNCORRECTABLE)
>> >+			printk(KERN_EMERG);
>> >+		else
>> >+			printk(KERN_INFO);
>>
>> KERN_INFO seems gross understatement here - generally, correctable MCs are
>> considered indicators that within not too distant future uncorrectable MCs
>> might result, so this generally is a call for action (and hence shouldn't
>> be hidden with default log level settings).
>
>Well, that is what the "old" code did. It used KERN_EMERG for fatal errors
>and KERN_INFO in the polling service routine. What do you want me to suggest?

This should be at least KERN_WARNING, probably even KERN_ERR (note
though that KERN_ERR and KERN_EMERG both resolve to XENLOG_ERR).
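
A minimal sketch of the severity mapping suggested above, written as
stand-alone C rather than against the actual patch. The flag value and the
mc_loglevel names are hypothetical stand-ins; only MC_FLAG_UNCORRECTABLE as
a name comes from the quoted hunk.

```c
/* Hypothetical stand-in: the real flag value is defined by the patch
 * and is not reproduced here. */
#define MC_FLAG_UNCORRECTABLE 0x1u

/* Hypothetical levels standing in for the real log-level macros. */
enum mc_loglevel { MC_LOG_WARNING, MC_LOG_ERR };

/* Uncorrectable errors get the error level; correctable ones still get
 * a warning, so they are not hidden by default log level settings. */
static enum mc_loglevel mc_loglevel_for(unsigned int mc_flags)
{
    return (mc_flags & MC_FLAG_UNCORRECTABLE) ? MC_LOG_ERR : MC_LOG_WARNING;
}

/* Human-readable name, e.g. for a message prefix. */
static const char *mc_loglevel_name(enum mc_loglevel lvl)
{
    return lvl == MC_LOG_ERR ? "ERR" : "WARNING";
}
```

The point of the sketch is only that the lower of the two levels is a
warning, not an informational message.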

>> Also, I'm not sure adjusting the polling frequency makes much sense - 30s
>> seems an awful lot of time to me.
>
>It's not clear to me what you are trying to tell me. Please explain/elaborate.

What I'm trying to say is that I'd think this should be polled at a much higher
frequency (I'd suggest 1Hz), without adjustments. Typically, a healthy system
will not encounter problems soon after boot, but only after running for perhaps
a very long time (and a system in bad condition is likely to encounter problems
right away, so wouldn't be affected by changing the polling rate). Thus, in the
general case, you'd have a comparatively long latency, during which some kind
of (automated) action could already have been taken to preserve data
consistency.
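
To put rough numbers on the latency argument, here is a small sketch. It
assumes an idealized poller that fires at exact multiples of a fixed period;
detect_latency is a hypothetical helper, not something from the patch.

```c
/* Seconds between an error occurring at time t_err and the next poll,
 * for a poller firing at fixed multiples of `period` (both in seconds).
 * Precondition: period > 0. */
static unsigned long detect_latency(unsigned long t_err, unsigned long period)
{
    unsigned long rem = t_err % period;

    /* An error that lands exactly on a poll tick is seen immediately. */
    return rem ? period - rem : 0;
}
```

With whole-second timestamps, a 1s period always yields zero here, while a
30s period can leave an error unnoticed for up to 29 seconds.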

Jan