xen-devel.lists.xenproject.org archive mirror
* ACPI suspend/resume not failing with (dom0) kernel panic
@ 2016-12-23 17:16 Dario Faggioli
  2016-12-23 17:34 ` Boris Ostrovsky
  2016-12-27 17:12 ` Jan Beulich
  0 siblings, 2 replies; 4+ messages in thread
From: Dario Faggioli @ 2016-12-23 17:16 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Boris Ostrovsky


Hey,

I was trying ACPI suspend/resume for testing some patches to Xen, on a
box on which I'm 100% sure I've seen it working a while back.

Right now, suspending seems ok, but upon resuming, this is what I see
(and after that, everything is just locked):

[  132.790494] smpboot: CPU 15 is now offline
[  132.797383] ACPI: Low-level resume complete
[  132.801635] PM: Restoring platform NVS memory
[  142.805036] Kernel panic - not syncing: DMAR hardware is malfunctioning
[  142.805036] 
[  142.813109] CPU: 0 PID: 1386 Comm: pm-suspend Not tainted 4.8.0-2-amd64 #1 Debian 4.8.11-1
[  142.821345] Hardware name: Dell Inc. Precision WorkStation T5500  /0CRH6C, BIOS A09 04/20/2011
[  142.829928]  0000000000000086 00000000ca1fa4d3 ffffffff8cb269f5 ffff89b99d076200
[  142.837340]  ffff89b99b763d48 ffffffff8c97a6b2 0000000000000008 ffff89b99b763d58
[  142.844751]  ffff89b99b763cf0 00000000ca1fa4d3 0000000000000046 0000000000000002
[  142.852159] Call Trace:
[  142.854602]  [<ffffffff8cb269f5>] ? dump_stack+0x5c/0x77
[  142.859980]  [<ffffffff8c97a6b2>] ? panic+0xe4/0x226
[  142.865004]  [<ffffffff8cc4b7c8>] ? dmar_disable_qi+0x108/0x110
[  142.870978]  [<ffffffff8cc4bb1e>] ? dmar_reenable_qi+0x1e/0x30
[  142.876863]  [<ffffffff8cc5429f>] ? reenable_irq_remapping+0x2f/0x110
[  142.883358]  [<ffffffff8c850ccd>] ? lapic_resume+0x1ed/0x290
[  142.889073]  [<ffffffff8cc5fd14>] ? syscore_resume+0x44/0x180
[  142.894873]  [<ffffffff8c8c86f4>] ? suspend_devices_and_enter+0x654/0x6f0
[  142.901711]  [<ffffffff8c8c8ab1>] ? pm_suspend+0x321/0x3a0
[  142.907253]  [<ffffffff8c8c731f>] ? state_store+0x6f/0xd0
[  142.912707]  [<ffffffff8ca7fa98>] ? kernfs_fop_write+0x118/0x1a0
[  142.918766]  [<ffffffff8ca02303>] ? vfs_write+0xb3/0x1a0
[  142.924131]  [<ffffffff8ca036e2>] ? SyS_write+0x52/0xc0
[  142.929413]  [<ffffffff8cdefa76>] ? system_call_fast_compare_end+0xc/0x96
[  142.936264] Kernel Offset: 0xb800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[  142.947007] ---[ end Kernel panic - not syncing: DMAR hardware is malfunctioning

Does this ring any bells?
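
For reading a trace like the one above, a minimal Python sketch (not part of the original report) can pull out the chain of functions and the largest pause between log lines. The regular expressions are assumptions about this particular dmesg format, and the helper names `parse`, `frames` and `largest_gap` are hypothetical. Only an excerpt of the log is embedded.

```python
import re

# Excerpt of the log quoted above; only some lines are reproduced.
LOG = """\
[  132.797383] ACPI: Low-level resume complete
[  132.801635] PM: Restoring platform NVS memory
[  142.805036] Kernel panic - not syncing: DMAR hardware is malfunctioning
[  142.852159] Call Trace:
[  142.854602]  [<ffffffff8cb269f5>] ? dump_stack+0x5c/0x77
[  142.859980]  [<ffffffff8c97a6b2>] ? panic+0xe4/0x226
[  142.865004]  [<ffffffff8cc4b7c8>] ? dmar_disable_qi+0x108/0x110
[  142.870978]  [<ffffffff8cc4bb1e>] ? dmar_reenable_qi+0x1e/0x30
[  142.876863]  [<ffffffff8cc5429f>] ? reenable_irq_remapping+0x2f/0x110
[  142.883358]  [<ffffffff8c850ccd>] ? lapic_resume+0x1ed/0x290
[  142.889073]  [<ffffffff8cc5fd14>] ? syscore_resume+0x44/0x180
"""

# Assumed line shape: "[  seconds.micros] message"
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")
# Assumed frame shape: "[<address>] ? symbol+0xoffset/0xsize"
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(?:\?\s+)?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def parse(log):
    """Return (timestamp, message) pairs for every timestamped line."""
    entries = []
    for line in log.splitlines():
        match = LINE_RE.match(line)
        if match:
            entries.append((float(match.group(1)), match.group(2)))
    return entries


def frames(entries):
    """Return the symbol names of the call-trace frames, innermost first."""
    names = []
    for _, message in entries:
        match = FRAME_RE.search(message)
        if match:
            names.append(match.group(1))
    return names


def largest_gap(entries):
    """Return (seconds, message_before, message_after) for the longest pause."""
    best = (0.0, None, None)
    for (t0, m0), (t1, m1) in zip(entries, entries[1:]):
        if t1 - t0 > best[0]:
            best = (t1 - t0, m0, m1)
    return best


if __name__ == "__main__":
    entries = parse(LOG)
    print(" <- ".join(frames(entries)))
    gap, before, after = largest_gap(entries)
    print(f"{gap:.3f}s between {before!r} and {after!r}")
```

On this excerpt the longest pause is the roughly 10 seconds between "PM: Restoring platform NVS memory" and the panic message. That may hint at a wait expiring rather than an immediate fault, though the log alone does not confirm it.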

Thanks and Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

* Re: ACPI suspend/resume not failing with (dom0) kernel panic
  2016-12-23 17:16 ACPI suspend/resume not failing with (dom0) kernel panic Dario Faggioli
@ 2016-12-23 17:34 ` Boris Ostrovsky
  2016-12-23 17:44   ` Dario Faggioli
  2016-12-27 17:12 ` Jan Beulich
  1 sibling, 1 reply; 4+ messages in thread
From: Boris Ostrovsky @ 2016-12-23 17:34 UTC (permalink / raw)
  To: Dario Faggioli, xen-devel; +Cc: Juergen Gross


On 12/23/2016 12:16 PM, Dario Faggioli wrote:
> Hey,
>
> I was trying ACPI suspend/resume for testing some patches to Xen, on a
> box on which I'm 100% sure I've seen it working a while back.
>
> Right now, suspending seems ok, but upon resuming, this is what I see
> (and after that, everything is just locked):
>
> [  132.790494] smpboot: CPU 15 is now offline
> [  132.797383] ACPI: Low-level resume complete
> [  132.801635] PM: Restoring platform NVS memory
> [  142.805036] Kernel panic - not syncing: DMAR hardware is malfunctioning
> [  142.805036] 
> [  142.813109] CPU: 0 PID: 1386 Comm: pm-suspend Not tainted 4.8.0-2-amd64 #1 Debian 4.8.11-1
> [  142.821345] Hardware name: Dell Inc. Precision WorkStation T5500  /0CRH6C, BIOS A09 04/20/2011
> [  142.829928]  0000000000000086 00000000ca1fa4d3 ffffffff8cb269f5 ffff89b99d076200
> [  142.837340]  ffff89b99b763d48 ffffffff8c97a6b2 0000000000000008 ffff89b99b763d58
> [  142.844751]  ffff89b99b763cf0 00000000ca1fa4d3 0000000000000046 0000000000000002
> [  142.852159] Call Trace:
> [  142.854602]  [<ffffffff8cb269f5>] ? dump_stack+0x5c/0x77
> [  142.859980]  [<ffffffff8c97a6b2>] ? panic+0xe4/0x226
> [  142.865004]  [<ffffffff8cc4b7c8>] ? dmar_disable_qi+0x108/0x110
> [  142.870978]  [<ffffffff8cc4bb1e>] ? dmar_reenable_qi+0x1e/0x30
> [  142.876863]  [<ffffffff8cc5429f>] ? reenable_irq_remapping+0x2f/0x110
> [  142.883358]  [<ffffffff8c850ccd>] ? lapic_resume+0x1ed/0x290
> [  142.889073]  [<ffffffff8cc5fd14>] ? syscore_resume+0x44/0x180
> [  142.894873]  [<ffffffff8c8c86f4>] ? suspend_devices_and_enter+0x654/0x6f0
> [  142.901711]  [<ffffffff8c8c8ab1>] ? pm_suspend+0x321/0x3a0
> [  142.907253]  [<ffffffff8c8c731f>] ? state_store+0x6f/0xd0
> [  142.912707]  [<ffffffff8ca7fa98>] ? kernfs_fop_write+0x118/0x1a0
> [  142.918766]  [<ffffffff8ca02303>] ? vfs_write+0xb3/0x1a0
> [  142.924131]  [<ffffffff8ca036e2>] ? SyS_write+0x52/0xc0
> [  142.929413]  [<ffffffff8cdefa76>] ? system_call_fast_compare_end+0xc/0x96
> [  142.936264] Kernel Offset: 0xb800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> [  142.947007] ---[ end Kernel panic - not syncing: DMAR hardware is malfunctioning
>
> Does this ring any bells?

Not really.

Does this happen without your patches too? What about other Xen versions
and/or baremetal?

Try also booting Xen with "iommu=debug" and see if something shows up in
the log.
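
As a rough aid for that last step, the sketch below filters a saved log for IOMMU-related lines. It assumes the log has already been captured as text (for example the output of `xl dmesg` piped in on stdin); the keyword list and the function name `iommu_lines` are illustrative guesses, not an official set of message prefixes.

```python
import re
import sys

# Hypothetical keyword list; adjust to whatever the log actually prints.
KEYWORDS = re.compile(r"iommu|dmar|vt-d|intremap", re.IGNORECASE)


def iommu_lines(log_text):
    """Return the lines of log_text that mention any IOMMU-related keyword."""
    return [line for line in log_text.splitlines() if KEYWORDS.search(line)]


if __name__ == "__main__":
    # Read the whole log from stdin and print only the matching lines.
    for line in iommu_lines(sys.stdin.read()):
        print(line)
```

Matching is case-insensitive, so both "DMAR" in the panic message and a lower-case "iommu" would be picked up.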

-boris


* Re: ACPI suspend/resume not failing with (dom0) kernel panic
  2016-12-23 17:34 ` Boris Ostrovsky
@ 2016-12-23 17:44   ` Dario Faggioli
  0 siblings, 0 replies; 4+ messages in thread
From: Dario Faggioli @ 2016-12-23 17:44 UTC (permalink / raw)
  To: Boris Ostrovsky, xen-devel; +Cc: Juergen Gross


On Fri, 2016-12-23 at 12:34 -0500, Boris Ostrovsky wrote:
> On 12/23/2016 12:16 PM, Dario Faggioli wrote:
> > [  132.790494] smpboot: CPU 15 is now offline
> > [  132.797383] ACPI: Low-level resume complete
> > [  132.801635] PM: Restoring platform NVS memory
> > [  142.805036] Kernel panic - not syncing: DMAR hardware is malfunctioning
> > [  142.805036] 
> > [  142.813109] CPU: 0 PID: 1386 Comm: pm-suspend Not tainted 4.8.0-2-amd64 #1 Debian 4.8.11-1
> > [  142.821345] Hardware name: Dell Inc. Precision WorkStation T5500  /0CRH6C, BIOS A09 04/20/2011
> > [  142.829928]  0000000000000086 00000000ca1fa4d3 ffffffff8cb269f5 ffff89b99d076200
> > [  142.837340]  ffff89b99b763d48 ffffffff8c97a6b2 0000000000000008 ffff89b99b763d58
> > [  142.844751]  ffff89b99b763cf0 00000000ca1fa4d3 0000000000000046 0000000000000002
> > [  142.852159] Call Trace:
> > [  142.854602]  [<ffffffff8cb269f5>] ? dump_stack+0x5c/0x77
> > [  142.859980]  [<ffffffff8c97a6b2>] ? panic+0xe4/0x226
> > [  142.865004]  [<ffffffff8cc4b7c8>] ? dmar_disable_qi+0x108/0x110
> > [  142.870978]  [<ffffffff8cc4bb1e>] ? dmar_reenable_qi+0x1e/0x30
> > [  142.876863]  [<ffffffff8cc5429f>] ? reenable_irq_remapping+0x2f/0x110
> > [  142.883358]  [<ffffffff8c850ccd>] ? lapic_resume+0x1ed/0x290
> > [  142.889073]  [<ffffffff8cc5fd14>] ? syscore_resume+0x44/0x180
> > [  142.894873]  [<ffffffff8c8c86f4>] ? suspend_devices_and_enter+0x654/0x6f0
> > [  142.901711]  [<ffffffff8c8c8ab1>] ? pm_suspend+0x321/0x3a0
> > [  142.907253]  [<ffffffff8c8c731f>] ? state_store+0x6f/0xd0
> > [  142.912707]  [<ffffffff8ca7fa98>] ? kernfs_fop_write+0x118/0x1a0
> > [  142.918766]  [<ffffffff8ca02303>] ? vfs_write+0xb3/0x1a0
> > [  142.924131]  [<ffffffff8ca036e2>] ? SyS_write+0x52/0xc0
> > [  142.929413]  [<ffffffff8cdefa76>] ? system_call_fast_compare_end+0xc/0x96
> > [  142.936264] Kernel Offset: 0xb800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> > [  142.947007] ---[ end Kernel panic - not syncing: DMAR hardware is malfunctioning
> > 
> > Does this ring any bells?
> 
> Not really.
> 
> Does this happen without your patches too? What about other Xen versions
> and/or baremetal?
> 
Yes, it happens without my patches. Xen version is current staging.

I hadn't thought about testing on bare metal. I've just done it, and yes, it
_crashes_ in the same way.

I'd say this is not a Xen issue then, and I should report it to the
appropriate Linux people (probably after trying 4.9).

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


* Re: ACPI suspend/resume not failing with (dom0) kernel panic
  2016-12-23 17:16 ACPI suspend/resume not failing with (dom0) kernel panic Dario Faggioli
  2016-12-23 17:34 ` Boris Ostrovsky
@ 2016-12-27 17:12 ` Jan Beulich
  1 sibling, 0 replies; 4+ messages in thread
From: Jan Beulich @ 2016-12-27 17:12 UTC (permalink / raw)
  To: dario.faggioli, boris.ostrovsky; +Cc: Juergen Gross, xen-devel

>>> Dario Faggioli <dario.faggioli@citrix.com> 12/23/16 6:18 PM >>>
>[  142.852159] Call Trace:
>[  142.854602]  [<ffffffff8cb269f5>] ? dump_stack+0x5c/0x77
>[  142.859980]  [<ffffffff8c97a6b2>] ? panic+0xe4/0x226
>[  142.865004]  [<ffffffff8cc4b7c8>] ? dmar_disable_qi+0x108/0x110
>[  142.870978]  [<ffffffff8cc4bb1e>] ? dmar_reenable_qi+0x1e/0x30
>[  142.876863]  [<ffffffff8cc5429f>] ? reenable_irq_remapping+0x2f/0x110

All three of the above should be unreachable when running under Xen, regardless
of whether the same issue appears on bare metal.

Jan