* machine check report on HVM startup
From: Christoph Egger @ 2008-08-13 11:48 UTC
To: Keir Fraser; +Cc: xen-devel

Hi,

When I launch memtest as an HVM guest, Xen sends tons of VIRQ_MCA events
to the Dom0, although NO correctable machine check errors occurred.
When the Dom0 tries to fetch the error telemetry, the

    BUG_ON(mc_data.fetch_idx > mc_data.error_idx);

in x86_mcinfo_getfetchptr() in xen/arch/x86/cpu/mcheck/mce.c is hit.
(x86_mcinfo_getfetchptr() only works if a real error actually occurred,
which is not the case here.)

It looks to me as if there is a non-public event channel using the same
number as VIRQ_MCA, which fires when launching memtest as an HVM guest.

Christoph

-- 
AMD Saxony, Dresden, Germany
Operating System Research Center

Legal Information:
AMD Saxony Limited Liability Company & Co. KG
Registered office (business address):
   Wilschdorfer Landstr. 101, 01109 Dresden, Germany
Register court Dresden: HRA 4896
General partner authorized to represent:
   AMD Saxony LLC (registered office Wilmington, Delaware, USA)
Managing directors of AMD Saxony LLC:
   Dr. Hans-R. Deppe, Thomas McCoy
* Re: machine check report on HVM startup
From: Keir Fraser @ 2008-08-13 12:27 UTC
To: Christoph Egger; +Cc: xen-devel

On 13/8/08 12:48, "Christoph Egger" <Christoph.Egger@amd.com> wrote:

> When I launch memtest as an HVM guest, Xen sends tons of VIRQ_MCA events
> to the Dom0, although NO correctable machine check errors occurred.
> When the Dom0 tries to fetch the error telemetry, the
>
> BUG_ON(mc_data.fetch_idx > mc_data.error_idx); in x86_mcinfo_getfetchptr()
> in xen/arch/x86/cpu/mcheck/mce.c is hit. (x86_mcinfo_getfetchptr() only
> works if a real error actually occurred, which is not the case here.)

Perhaps you should be more wary of hypercall inputs? Failing the hypercall,
perhaps with a warning printk, would be better than BUG_ON(), I think.

> It looks to me as if there is a non-public event channel using the same
> number as VIRQ_MCA, which fires when launching memtest as an HVM guest.

I don't think this is the case. This issue sounds easy to reproduce, though.
I'll give it a go.

 -- Keir
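The suggestion above (fail the request with a warning rather than hit BUG_ON()) might look roughly like the following. This is only a standalone userspace sketch, not the code in xen/arch/x86/cpu/mcheck/mce.c: the struct, the slot array, and the function name mc_getfetchptr() are hypothetical, and only the field names fetch_idx and error_idx are taken from the thread. Treating "fetch_idx == error_idx" as "nothing pending" is likewise an assumption.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for the hypervisor's telemetry state; only the
 * field names fetch_idx and error_idx come from the thread. */
#define MC_SLOTS 4

struct mc_state {
    unsigned int fetch_idx;   /* next entry Dom0 wants to read */
    unsigned int error_idx;   /* next entry the hypervisor will write */
    int slots[MC_SLOTS];      /* placeholder for real error records */
};

/* Return a pointer to the entry to fetch, or NULL if the request is
 * inconsistent or nothing is pending. A caller would turn NULL into an
 * error return for the hypercall instead of crashing the hypervisor. */
static int *mc_getfetchptr(struct mc_state *s)
{
    if (s->fetch_idx > s->error_idx) {
        /* Same condition as the BUG_ON(), but handled as bad input. */
        fprintf(stderr,
                "mc: fetch_idx %u beyond error_idx %u, failing request\n",
                s->fetch_idx, s->error_idx);
        return NULL;
    }
    if (s->fetch_idx == s->error_idx)
        return NULL;          /* assumption: no telemetry pending */

    return &s->slots[s->fetch_idx % MC_SLOTS];
}
```

With this shape, a spurious VIRQ_MCA that makes Dom0 ask for telemetry which does not exist would produce a failed request and a log line, not a hypervisor crash.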
* Re: Re: machine check report on HVM startup
From: Keir Fraser @ 2008-08-13 12:36 UTC
To: Christoph Egger; +Cc: xen-devel

On 13/8/08 13:27, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:

>> It looks to me as if there is a non-public event channel using the same
>> number as VIRQ_MCA, which fires when launching memtest as an HVM guest.
>
> I don't think this is the case. This issue sounds easy to reproduce,
> though. I'll give it a go.

I can boot a memtest-3.4 ISO in an HVM guest on a PAE hypervisor just fine.

 -- Keir
* Re: Re: machine check report on HVM startup
From: Christoph Egger @ 2008-08-13 12:40 UTC
To: xen-devel; +Cc: Keir Fraser

On Wednesday 13 August 2008 14:36:21 Keir Fraser wrote:
> On 13/8/08 13:27, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:
> >> It looks to me as if there is a non-public event channel using the
> >> same number as VIRQ_MCA, which fires when launching memtest as an
> >> HVM guest.
> >
> > I don't think this is the case. This issue sounds easy to reproduce,
> > though. I'll give it a go.
>
> I can boot a memtest-3.4 ISO in an HVM guest on a PAE hypervisor just fine.

Does your Dom0 kernel register the machine check event handler?
If not, then things go fine. If yes, then you should see the flood of
VIRQ_MCA events in the Dom0.

Christoph
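The point about registration can be illustrated with a toy dispatch table: events raised on a VIRQ with no bound handler go unnoticed, and only a bound handler makes them visible. This is a userspace model under stated assumptions, not Xen or Linux code. Every demo_* name is hypothetical, and DEMO_VIRQ_MCA is an arbitrary number, not the real value of VIRQ_MCA.

```c
#include <stdio.h>
#include <stddef.h>

/* Demo constants only: these are NOT Xen's real VIRQ numbers. */
#define DEMO_NR_VIRQS 8
#define DEMO_VIRQ_MCA 3

typedef void (*demo_virq_handler_t)(int virq);

static demo_virq_handler_t demo_handlers[DEMO_NR_VIRQS];
static unsigned int demo_seen[DEMO_NR_VIRQS];   /* events a handler saw */

/* Register a handler; returns 0 on success, -1 on a bad or busy VIRQ. */
static int demo_bind_virq(int virq, demo_virq_handler_t fn)
{
    if (virq < 0 || virq >= DEMO_NR_VIRQS || fn == NULL ||
        demo_handlers[virq] != NULL)
        return -1;
    demo_handlers[virq] = fn;
    return 0;
}

/* Deliver one event. Without a bound handler it is dropped silently. */
static void demo_raise_virq(int virq)
{
    if (virq < 0 || virq >= DEMO_NR_VIRQS || demo_handlers[virq] == NULL)
        return;
    demo_seen[virq]++;
    demo_handlers[virq](virq);
}

/* Minimal handler: just log that something arrived. */
static void demo_mca_handler(int virq)
{
    printf("demo: VIRQ %d fired\n", virq);
}
```

In this model, raising DEMO_VIRQ_MCA before demo_bind_virq() leaves demo_seen untouched, while every raise after binding is counted and logged. That matches the claim that a Dom0 kernel without a registered handler would not notice the events.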
* Re: Re: machine check report on HVM startup
From: Keir Fraser @ 2008-08-13 12:48 UTC
To: Christoph Egger, xen-devel

On 13/8/08 13:40, "Christoph Egger" <Christoph.Egger@amd.com> wrote:

>>>> It looks to me as if there is a non-public event channel using the
>>>> same number as VIRQ_MCA, which fires when launching memtest as an
>>>> HVM guest.
>>>
>>> I don't think this is the case. This issue sounds easy to reproduce,
>>> though. I'll give it a go.
>>
>> I can boot a memtest-3.4 ISO in an HVM guest on a PAE hypervisor just
>> fine.
>
> Does your Dom0 kernel register the machine check event handler?
> If not, then things go fine. If yes, then you should see the flood of
> VIRQ_MCA events in the Dom0.

How do I make it do that?

 -- Keir
* Re: Re: machine check report on HVM startup
From: Christoph Egger @ 2008-08-13 13:17 UTC
To: xen-devel; +Cc: Keir Fraser

On Wednesday 13 August 2008 14:48:04 Keir Fraser wrote:
> On 13/8/08 13:40, "Christoph Egger" <Christoph.Egger@amd.com> wrote:
> >>>> It looks to me as if there is a non-public event channel using the
> >>>> same number as VIRQ_MCA, which fires when launching memtest as an
> >>>> HVM guest.
> >>>
> >>> I don't think this is the case. This issue sounds easy to reproduce,
> >>> though. I'll give it a go.
> >>
> >> I can boot a memtest-3.4 ISO in an HVM guest on a PAE hypervisor just
> >> fine.
> >
> > Does your Dom0 kernel register the machine check event handler?
> > If not, then things go fine. If yes, then you should see the flood of
> > VIRQ_MCA events in the Dom0.
>
> How do I make it do that?

Assuming you use Linux as Dom0, apply the attached patch to your local tree.
With it, you should see a flood of
"xen_mca: HW reported correctable error(s)" Dom0 kernel messages.

Note that the patch is not intended to go upstream. There will be something
better in the future.

Christoph

[-- Attachment #2: linux_xenmca.diff --]
[-- Type: text/x-diff, Size: 1127 bytes --]

diff -r c110692c140f arch/i386/kernel/cpu/mcheck/non-fatal.c
--- a/arch/i386/kernel/cpu/mcheck/non-fatal.c	Wed Aug 13 10:00:09 2008 +0100
+++ b/arch/i386/kernel/cpu/mcheck/non-fatal.c	Wed Aug 13 15:10:47 2008 +0200
@@ -60,9 +60,31 @@ static void mce_work_fn(void *data)
 	schedule_delayed_work(&mce_work, MCE_RATE);
 }
 
+/* Privileged receive callback and transmit kicker. */
+static irqreturn_t xenmca_event(int irq, void *dev_id,
+				struct pt_regs *regs)
+{
+	printk("xen_mca: HW reported correctable error(s)\n");
+
+	return IRQ_HANDLED;
+}
+
+static int mca_event_irq;
+
 static int __init init_nonfatal_mce_checker(void)
 {
 	struct cpuinfo_x86 *c = &boot_cpu_data;
+
+	if (is_initial_xendomain()) {
+		mca_event_irq = bind_virq_to_irqhandler(
+				VIRQ_MCA,
+				0,
+				xenmca_event,
+				0,
+				"mca0",
+				NULL);
+		BUG_ON(mca_event_irq < 0);
+	}
 
 	/* Check for MCE support */
 	if (!cpu_has(c, X86_FEATURE_MCE))
* Re: Re: machine check report on HVM startup
From: Keir Fraser @ 2008-08-13 13:22 UTC
To: Christoph Egger, xen-devel

On 13/8/08 14:17, "Christoph Egger" <Christoph.Egger@amd.com> wrote:

> Assuming you use Linux as Dom0, apply the attached patch to your local tree.
> With it, you should see a flood of
> "xen_mca: HW reported correctable error(s)" Dom0 kernel messages.
>
> Note that the patch is not intended to go upstream. There will be something
> better in the future.

The patch won't do much since CONFIG_X86_MCE depends on !XEN. Anyhow, I
tried registering some other handler as VIRQ_MCA and it never fired for me.

 -- Keir
* Re: Re: machine check report on HVM startup
From: Keir Fraser @ 2008-08-13 13:17 UTC
To: Christoph Egger, xen-devel

On 13/8/08 13:48, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:

>> Does your Dom0 kernel register the machine check event handler?
>> If not, then things go fine. If yes, then you should see the flood of
>> VIRQ_MCA events in the Dom0.
>
> How do I make it do that?

I modified the netback VIRQ_DEBUG handler to register on VIRQ_MCA instead. I
didn't get any output from it when running a memtest HVM guest.

 -- Keir