* machine check report on HVM startup
@ 2008-08-13 11:48 Christoph Egger
2008-08-13 12:27 ` Keir Fraser
0 siblings, 1 reply; 8+ messages in thread
From: Christoph Egger @ 2008-08-13 11:48 UTC
To: Keir Fraser; +Cc: xen-devel
Hi,
When I launch memtest as an HVM guest, Xen sends tons of VIRQ_MCA events
to Dom0, although NO correctable machine check errors have occurred.
When Dom0 then tries to fetch the error telemetry, the
BUG_ON(mc_data.fetch_idx > mc_data.error_idx); in x86_mcinfo_getfetchptr()
in xen/arch/x86/cpu/mcheck/mce.c is hit. (x86_mcinfo_getfetchptr() only works
if a real error has actually occurred, which is not the case here.)
It looks to me as if there is a non-public event channel using the same number
as VIRQ_MCA, which fires when launching memtest as an HVM guest.
Christoph
--
AMD Saxony, Dresden, Germany
Operating System Research Center
Legal Information:
AMD Saxony Limited Liability Company & Co. KG
Sitz (Geschäftsanschrift):
Wilschdorfer Landstr. 101, 01109 Dresden, Deutschland
Registergericht Dresden: HRA 4896
vertretungsberechtigter Komplementär:
AMD Saxony LLC (Sitz Wilmington, Delaware, USA)
Geschäftsführer der AMD Saxony LLC:
Dr. Hans-R. Deppe, Thomas McCoy
* Re: machine check report on HVM startup
2008-08-13 11:48 machine check report on HVM startup Christoph Egger
@ 2008-08-13 12:27 ` Keir Fraser
2008-08-13 12:36 ` Keir Fraser
0 siblings, 1 reply; 8+ messages in thread
From: Keir Fraser @ 2008-08-13 12:27 UTC
To: Christoph Egger; +Cc: xen-devel
On 13/8/08 12:48, "Christoph Egger" <Christoph.Egger@amd.com> wrote:
> When I launch memtest as an HVM guest, Xen sends tons of VIRQ_MCA events
> to Dom0, although NO correctable machine check errors have occurred.
> When Dom0 then tries to fetch the error telemetry, the
>
> BUG_ON(mc_data.fetch_idx > mc_data.error_idx); in x86_mcinfo_getfetchptr()
> in xen/arch/x86/cpu/mcheck/mce.c is hit. (x86_mcinfo_getfetchptr() only works
> if a real error has actually occurred, which is not the case here.)
Perhaps you should be more wary of hypercall inputs? Failing the hypercall,
perhaps with a warning printk, would be better than BUG_ON() I think.
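A minimal sketch of that suggestion, written as a user-space C illustration and not as the actual Xen code. Only the index names fetch_idx and error_idx come from the BUG_ON() quoted above; the struct layout, the helper name mcinfo_fetch_checked(), the error codes, and the "equal indices mean nothing is pending" rule are assumptions made for the example.

```c
/* Illustrative sketch only: not the actual Xen implementation. */
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for the telemetry bookkeeping. Only the two
 * index names are taken from the BUG_ON() discussed in this thread. */
struct mc_data_sketch {
    unsigned int fetch_idx;  /* next entry the consumer would fetch */
    unsigned int error_idx;  /* next entry the producer would fill */
};

/*
 * Hypothetical fetch helper: validate the indices and fail with an
 * error code instead of bringing the hypervisor down.
 * Returns 0 and stores the index to fetch in *out on success,
 * -EINVAL if the indices are inconsistent, and -ENOENT if nothing is
 * pending (assumed here to be the case when both indices are equal).
 */
static int mcinfo_fetch_checked(const struct mc_data_sketch *d,
                                unsigned int *out)
{
    assert(d != NULL && out != NULL);  /* caller contract in this sketch */

    if (d->fetch_idx > d->error_idx) {
        /* Stand-in for a warning printk in the hypervisor. */
        fprintf(stderr, "mcinfo: fetch_idx %u > error_idx %u, failing\n",
                d->fetch_idx, d->error_idx);
        return -EINVAL;
    }
    if (d->fetch_idx == d->error_idx)
        return -ENOENT;  /* no telemetry recorded */

    *out = d->fetch_idx;
    return 0;
}
```

With a check of this shape, a spurious fetch request from Dom0 would get an error back from the hypercall, and the inconsistent state would still be logged for debugging.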
> It looks to me as if there is a non-public event channel using the same number
> as VIRQ_MCA, which fires when launching memtest as an HVM guest.
I don't think this is the case. Sounds easy to repro this issue though. I'll
give it a go.
-- Keir
* Re: Re: machine check report on HVM startup
2008-08-13 12:27 ` Keir Fraser
@ 2008-08-13 12:36 ` Keir Fraser
2008-08-13 12:40 ` Christoph Egger
0 siblings, 1 reply; 8+ messages in thread
From: Keir Fraser @ 2008-08-13 12:36 UTC
To: Christoph Egger; +Cc: xen-devel
On 13/8/08 13:27, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:
>> It looks to me as if there is a non-public event channel using the same number
>> as VIRQ_MCA, which fires when launching memtest as an HVM guest.
>
> I don't think this is the case. Sounds easy to repro this issue though. I'll
> give it a go.
I can boot a memtest-3.4 ISO in an HVM guest on PAE hypervisor just fine.
-- Keir
* Re: Re: machine check report on HVM startup
2008-08-13 12:36 ` Keir Fraser
@ 2008-08-13 12:40 ` Christoph Egger
2008-08-13 12:48 ` Keir Fraser
0 siblings, 1 reply; 8+ messages in thread
From: Christoph Egger @ 2008-08-13 12:40 UTC
To: xen-devel; +Cc: Keir Fraser
On Wednesday 13 August 2008 14:36:21 Keir Fraser wrote:
> On 13/8/08 13:27, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:
> >> It looks to me as if there is a non-public event channel using the same
> >> number as VIRQ_MCA, which fires when launching memtest as an HVM guest.
> >
> > I don't think this is the case. Sounds easy to repro this issue though.
> > I'll give it a go.
>
> I can boot a memtest-3.4 ISO in an HVM guest on PAE hypervisor just fine.
Does your Dom0 kernel register the machine check event handler?
If not, then things go fine. If yes, then you should see the flood of
VIRQ_MCA events in Dom0.
Christoph
* Re: Re: machine check report on HVM startup
2008-08-13 12:40 ` Christoph Egger
@ 2008-08-13 12:48 ` Keir Fraser
2008-08-13 13:17 ` Christoph Egger
2008-08-13 13:17 ` Keir Fraser
0 siblings, 2 replies; 8+ messages in thread
From: Keir Fraser @ 2008-08-13 12:48 UTC
To: Christoph Egger, xen-devel
On 13/8/08 13:40, "Christoph Egger" <Christoph.Egger@amd.com> wrote:
>>>> It looks to me as if there is a non-public event channel using the same
>>>> number as VIRQ_MCA, which fires when launching memtest as an HVM guest.
>>>
>>> I don't think this is the case. Sounds easy to repro this issue though.
>>> I'll give it a go.
>>
>> I can boot a memtest-3.4 ISO in an HVM guest on PAE hypervisor just fine.
>
> Does your Dom0 kernel register the machine check event handler?
> If not, then things go fine. If yes, then you should see the flood of
> VIRQ_MCA events in Dom0.
How do I make it do that?
-- Keir
* Re: Re: machine check report on HVM startup
2008-08-13 12:48 ` Keir Fraser
@ 2008-08-13 13:17 ` Christoph Egger
2008-08-13 13:22 ` Keir Fraser
2008-08-13 13:17 ` Keir Fraser
1 sibling, 1 reply; 8+ messages in thread
From: Christoph Egger @ 2008-08-13 13:17 UTC
To: xen-devel; +Cc: Keir Fraser
[-- Attachment #1: Type: text/plain, Size: 1426 bytes --]
On Wednesday 13 August 2008 14:48:04 Keir Fraser wrote:
> On 13/8/08 13:40, "Christoph Egger" <Christoph.Egger@amd.com> wrote:
> >>>> It looks to me as if there is a non-public event channel using the same
> >>>> number as VIRQ_MCA, which fires when launching memtest as an HVM guest.
> >>>
> >>> I don't think this is the case. Sounds easy to repro this issue though.
> >>> I'll give it a go.
> >>
> >> I can boot a memtest-3.4 ISO in an HVM guest on PAE hypervisor just
> >> fine.
> >
> > Does your Dom0 kernel register the machine check event handler?
> > If not, then things go fine. If yes, then you should see the flood of
> > VIRQ_MCA events in Dom0.
>
> How do I make it do that?
Assuming you use Linux as Dom0, apply the attached patch to your local tree.
With it, you should see a flood of "xen_mca: HW reported correctable error(s)"
Dom0 kernel messages.
Note, the patch is not intended to go upstream. There will be something better
in the future.
Christoph
[-- Attachment #2: linux_xenmca.diff --]
[-- Type: text/x-diff, Size: 1127 bytes --]
diff -r c110692c140f arch/i386/kernel/cpu/mcheck/non-fatal.c
--- a/arch/i386/kernel/cpu/mcheck/non-fatal.c	Wed Aug 13 10:00:09 2008 +0100
+++ b/arch/i386/kernel/cpu/mcheck/non-fatal.c	Wed Aug 13 15:10:47 2008 +0200
@@ -60,9 +60,31 @@ static void mce_work_fn(void *data)
 	schedule_delayed_work(&mce_work, MCE_RATE);
 }
 
+/* Privileged receive callback and transmit kicker. */
+static irqreturn_t xenmca_event(int irq, void *dev_id,
+				struct pt_regs *regs)
+{
+	printk("xen_mca: HW reported correctable error(s)\n");
+
+	return IRQ_HANDLED;
+}
+
+static int mca_event_irq;
+
 static int __init init_nonfatal_mce_checker(void)
 {
 	struct cpuinfo_x86 *c = &boot_cpu_data;
+
+	if (is_initial_xendomain()) {
+		mca_event_irq = bind_virq_to_irqhandler(
+			VIRQ_MCA,
+			0,
+			xenmca_event,
+			0,
+			"mca0",
+			NULL);
+		BUG_ON(mca_event_irq < 0);
+	}
 
 	/* Check for MCE support */
 	if (!cpu_has(c, X86_FEATURE_MCE))
* Re: Re: machine check report on HVM startup
2008-08-13 12:48 ` Keir Fraser
2008-08-13 13:17 ` Christoph Egger
@ 2008-08-13 13:17 ` Keir Fraser
1 sibling, 0 replies; 8+ messages in thread
From: Keir Fraser @ 2008-08-13 13:17 UTC
To: Christoph Egger, xen-devel
On 13/8/08 13:48, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:
>> Does your Dom0 kernel register the machine check event handler?
>> If not, then things go fine. If yes, then you should see the flood of
>> VIRQ_MCA events in Dom0.
>
> How do I make it do that?
I modified the netback VIRQ_DEBUG handler to register on VIRQ_MCA instead. I
didn't get any output from it when running a memtest HVM guest.
-- Keir
* Re: Re: machine check report on HVM startup
2008-08-13 13:17 ` Christoph Egger
@ 2008-08-13 13:22 ` Keir Fraser
0 siblings, 0 replies; 8+ messages in thread
From: Keir Fraser @ 2008-08-13 13:22 UTC
To: Christoph Egger, xen-devel
On 13/8/08 14:17, "Christoph Egger" <Christoph.Egger@amd.com> wrote:
> Assuming you use Linux as Dom0, apply the attached patch to your local tree.
> With it, you should see a flood of "xen_mca: HW reported correctable error(s)"
> Dom0 kernel messages.
>
> Note, the patch is not intended to go upstream. There will be something better
> in the future.
The patch won't do much since CONFIG_X86_MCE depends on !XEN.
Anyhow, I tried registering some other handler as VIRQ_MCA and it never
fired for me.
-- Keir
Thread overview: 8+ messages
2008-08-13 11:48 machine check report on HVM startup Christoph Egger
2008-08-13 12:27 ` Keir Fraser
2008-08-13 12:36 ` Keir Fraser
2008-08-13 12:40 ` Christoph Egger
2008-08-13 12:48 ` Keir Fraser
2008-08-13 13:17 ` Christoph Egger
2008-08-13 13:22 ` Keir Fraser
2008-08-13 13:17 ` Keir Fraser