* d0v0 Unhandled general protection fault with 4.9.x on brand new hardware
@ 2017-11-02 7:56 Francesco De Francesco
2017-11-02 9:31 ` Juergen Gross
0 siblings, 1 reply; 3+ messages in thread
From: Francesco De Francesco @ 2017-11-02 7:56 UTC
To: xen-devel
Hi everyone,
I need to address an issue that prevents me from running Xen on new
hardware: an HPE DL360 Gen10 with dual Xeon Silver 4108 CPUs and
256GB of ECC DDR4 RDIMM. It happens with both CentOS6 and CentOS7.
When I try to boot with Xen kernel 4.9.58-29.el6, I get the following error
at boot time (I can read it from the serial console):
(XEN) Brought up 32 CPUs
(XEN) ACPI sleep modes: S3
(XEN) VPMU: disabled
(XEN) mcheck_poll: Machine check polling timer started.
(XEN) Dom0 has maximum 1240 PIRQs
(XEN) NX (Execute Disable) protection active
(XEN) *** LOADING DOMAIN 0 ***
(XEN) Xen kernel: 64-bit, lsb, compat32
(XEN) Dom0 kernel: 64-bit, PAE, lsb, paddr 0x1000000 -> 0x26a0000
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN) Dom0 alloc.: 0000003fdc000000->0000003fe0000000 (1012299 pages to be allocated)
(XEN) Init. ramdisk: 000000403b04b000->000000403fdff200
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN) Loaded kernel: ffffffff81000000->ffffffff826a0000
(XEN) Init. ramdisk: 0000000000000000->0000000000000000
(XEN) Phys-Mach map: 0000008000000000->0000008000800000
(XEN) Start info: ffffffff826a0000->ffffffff826a04b4
(XEN) Page tables: ffffffff826a1000->ffffffff826b8000
(XEN) Boot stack: ffffffff826b8000->ffffffff826b9000
(XEN) TOTAL: ffffffff80000000->ffffffff82800000
(XEN) ENTRY ADDRESS: ffffffff821a9180
(XEN) Dom0 has maximum 32 VCPUs
(XEN) Scrubbing Free RAM on 2 nodes using 16 CPUs
(XEN) .................................................................................................................................done.
(XEN) Initial low memory virq threshold set at 0x4000 pages.
(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen)
(XEN) Freed 300kB init memory.
mapping kernel into physical memory
about to get started...
(XEN) d0v0 Unhandled general protection fault fault/trap [#13, ec=0000]
(XEN) domain_crash_sync called from entry.S: fault at ffff82d08022f983 create_bounce_frame+0x12b/0x13a
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-4.6.6-3.el6 x86_64 debug=n Not tainted ]----
(XEN) CPU: 0
(XEN) RIP: e033:[<ffffffff8103cf38>]
(XEN) RFLAGS: 0000000000000246 EM: 1 CONTEXT: pv guest (d0v0)
(XEN) rax: 00000000000002ff rbx: ffffffff8217a1a0 rcx: 0000000000000000
(XEN) rdx: 0000000000000000 rsi: 00000000000002ff rdi: 0000000000042660
(XEN) rbp: ffffffff82003dc8 rsp: ffffffff82003d80 r8: ffffffff82003e0c
(XEN) r9: ffffffff82003e08 r10: 00000000ffffffff r11: 00000000ffffffff
(XEN) r12: ffffffff82003e04 r13: ffffffff82003e00 r14: ffffffff82003dfc
(XEN) r15: ffffffff82003df8 cr0: 0000000080050033 cr4: 00000000003526e0
(XEN) cr3: 0000003fde007000 cr2: 0000000000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033
(XEN) Guest stack trace from rsp=ffffffff82003d80:
(XEN) 0000000000000000 00000000ffffffff 0000000000000000 ffffffff8103cf38
(XEN) 000000010000e030 0000000000010046 ffffffff82003dc8 000000000000e02b
(XEN) ffffffff8103cf28 ffffffff82003e48 ffffffff821bbd3e ffffffff82199890
(XEN) ffffffff8219a090 ffffffff82199890 ffffffff8219a090 ffffffff82003e38
(XEN) 0000000000100800 00000a8800000000 000002ff00000240 ffffffff8217a1a0
(XEN) ffffffff8217a1a0 ffffffff82673000 ffffffff82003f20 0000000000000000
(XEN) 0000000000000000 ffffffff82003e78 ffffffff821bb684 0000000001000000
(XEN) 0000037f82673000 ffffffff82003f20 0000000001000000 ffffffff82003e88
(XEN) ffffffff821bc371 ffffffff82003e98 ffffffff821bc3a7 ffffffff82003ef8
(XEN) ffffffff821b73e7 ffffffff00000010 ffffffff82003f08 ffffffff82003ec8
(XEN) ffffffff82003e88 ffffffff8114b100 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 ffffffff82003f28
(XEN) ffffffff821aa0c6 0000000000000000 0000000000000000 b013b3f5b0133a3e
(XEN) ffffffff821a97ac ffffffff82003f38 ffffffff821a9386 ffffffff82003ff8
(XEN) ffffffff821b0dc6 0300000100000032 0000000000000005 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 ffd83a831fc9cbf5
(XEN) 0005065400100800 0000000000000001 0000000000000000 0000000000000000
(XEN) Hardware Dom0 crashed: rebooting machine in 5 seconds.
So far I have tried the following, to no avail:
- installing older Xen kernel versions (4.9.39-29.el6 and 4.9.25-27.el6)
- installing an older Xen version (4.6.3-15.el6) instead of the current one
(4.6.6-3.el6)
- disabling Hyper-Threading
- disabling NUMA
- trying several dom0_mem=,max: configurations (from 512M up to 20G)
- changing several BIOS options, including power profiles
I read that someone else with a similar problem solved it by setting some
parameters on the kernel command line, but I don't know which ones.
Can you help me?
Thanks in advance!
-- Francesco
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
* Re: d0v0 Unhandled general protection fault with 4.9.x on brand new hardware
2017-11-02 7:56 d0v0 Unhandled general protection fault with 4.9.x on brand new hardware Francesco De Francesco
@ 2017-11-02 9:31 ` Juergen Gross
2017-11-02 9:49 ` Francesco De Francesco
0 siblings, 1 reply; 3+ messages in thread
From: Juergen Gross @ 2017-11-02 9:31 UTC
To: Francesco De Francesco, xen-devel
On 02/11/17 08:56, Francesco De Francesco wrote:
> (XEN) Hardware Dom0 crashed: rebooting machine in 5 seconds.
So your dom0 crashed very early, before its trap handlers had been set up.
Can you please try to find the matching source lines for the suspected
kernel addresses in the stack trace above and for the guest's RIP? For all
addresses starting with "ffffffff81" you should try the addr2line tool
to obtain that information. This will help us get an idea of what
happened.
Juergen
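The triage suggested above can be sketched in a few lines of Python. This
is only an illustration, not part of the original advice: the helper names
(kernel_text_candidates, addr2line_command) and the "vmlinux" path are
hypothetical, and the sample text is a handful of lines copied from the
dump in this thread. It assumes an uncompressed vmlinux with debug
information that matches the crashing kernel build; without debug
information addr2line can only print "??".

```python
import re
from typing import List


def kernel_text_candidates(dump: str, prefix: str = "ffffffff81") -> List[str]:
    """Return unique 64-bit hex words in `dump` that start with `prefix`.

    Order of first appearance is preserved, so the RIP comes first when
    the RIP line precedes the stack trace.
    """
    tail = 16 - len(prefix)  # remaining hex digits of a 64-bit address
    pattern = re.compile(r"\b%s[0-9a-f]{%d}\b" % (re.escape(prefix), tail))
    seen: List[str] = []
    for addr in pattern.findall(dump.lower()):
        if addr not in seen:
            seen.append(addr)
    return seen


def addr2line_command(vmlinux: str, addrs: List[str]) -> List[str]:
    """Build an addr2line argv: -e selects the image, -f prints function
    names, -i also lists inlined callers."""
    return ["addr2line", "-e", vmlinux, "-f", "-i"] + ["0x" + a for a in addrs]


# A few lines taken from the crash dump earlier in this thread.
SAMPLE = """\
(XEN) RIP: e033:[<ffffffff8103cf38>]
(XEN) 0000000000000000 00000000ffffffff 0000000000000000 ffffffff8103cf38
(XEN) ffffffff8103cf28 ffffffff82003e48 ffffffff821bbd3e ffffffff82199890
(XEN) ffffffff82003e88 ffffffff8114b100 0000000000000000 0000000000000000
"""

if __name__ == "__main__":
    # "vmlinux" is a placeholder path; point it at the real debug image.
    print(" ".join(addr2line_command("vmlinux", kernel_text_candidates(SAMPLE))))
```

The printed command can then be run on a machine that has the matching
debug image. Passing a different prefix (for example "ffffffff82") would
widen the search to other address ranges seen in the dump.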
* Re: d0v0 Unhandled general protection fault with 4.9.x on brand new hardware
2017-11-02 9:31 ` Juergen Gross
@ 2017-11-02 9:49 ` Francesco De Francesco
0 siblings, 0 replies; 3+ messages in thread
From: Francesco De Francesco @ 2017-11-02 9:49 UTC
Cc: xen-devel
Do I have to run addr2line on my vmlinux image?