From: Andrew Cooper <andrew.cooper3@citrix.com>
To: "Li, Liang Z" <liang.z.li@intel.com>,
David Vrabel <david.vrabel@citrix.com>,
"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Cc: Daniel Kiper <daniel.kiper@oracle.com>, Tim Deegan <tim@xen.org>,
Jan Beulich <JBeulich@suse.com>
Subject: Re: dom0 show call trace and failed to boot on HSW-EX platform
Date: Tue, 2 Feb 2016 10:11:19 +0000
Message-ID: <56B080C7.9070704@citrix.com>
In-Reply-To: <F2CBF3009FA73547804AE4C663CAB28E0373E0AC@SHSMSX101.ccr.corp.intel.com>
On 02/02/16 07:40, Li, Liang Z wrote:
> Hi David,
>
> We found that dom0 crashes when booting on an HSW-EX server; the dom0 kernel version is v4.4. By debugging, I found that your patch
> 'x86/xen: discard RAM regions above the maximum reservation', commit ID f5775e0b6116b7e2425ccf535243b21,
> caused the regression. The debug messages are listed below:
> ===============================================================
> (XEN) mm.c:884:d0v14 pg_owner 0 l1e_owner 0, but real_pg_owner -1
> (XEN) mm.c:955:d0v14 Error getting mfn 1080000 (pfn ffffffffffffffff) from L1
> (XEN) mm.c:1269:d0v14 Failure in alloc_l1_table: entry 0
> (XEN) mm.c:2175:d0v14 Error while validating mfn 188d903 (pfn 17a7cc) for type
> (XEN) mm.c:3101:d0v14 Error -16 while pinning mfn 188d903
> [ 33.768792] ------------[ cut here ]------------
> WARNING: CPU: 14 PID: 1 at arch/x86/xen/multicalls.c:129 xen_mc_
> [ 33.783809] Modules linked in:
> [ 33.787304] CPU: 14 PID: 1 Comm: swapper/0 Not tainted 4.4.0 #1
> [ 33.793991] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS
> [ 33.805624] 0000000000000081 ffff88017d2537c8 ffffffff812ff954 000000000000
> [ 33.813961] 0000000000000000 0000000000000081 0000000000000000 ffff88017d25
> [ 33.822300] ffffffff810ca120 ffffffff81cb7f00 ffff8801879ca280 000000000000
> [ 33.830639] Call Trace:
> [ 33.833457] [<ffffffff812ff954>] dump_stack+0x48/0x64
> [ 33.839277] [<ffffffff810ca120>] warn_slowpath_common+0x90/0xd0
> [ 33.846058] [<ffffffff810ca175>] warn_slowpath_null+0x15/0x20
> [ 33.852659] [<ffffffff81060133>] xen_mc_flush+0x1c3/0x1d0
> [ 33.858858] [<ffffffff8106449f>] xen_alloc_pte+0x20f/0x300
> [ 33.865158] [<ffffffff810beef5>] ? update_page_count+0x45/0x60
> [ 33.871855] [<ffffffff817a1194>] ? phys_pte_init+0x170/0x183
> [ 33.878345] [<ffffffff817a148d>] phys_pmd_init+0x2e6/0x389
> [ 33.884649] [<ffffffff817a17dd>] phys_pud_init+0x2ad/0x3dc
> [ 33.890954] [<ffffffff817a290d>] kernel_physical_mapping_init+0xec/0x211
> [ 33.898613] [<ffffffff8179df8d>] init_memory_mapping+0x17d/0x2f0
> [ 33.905496] [<ffffffff81104f11>] ? __raw_callee_save___pv_queued_spin_unloc
> [ 33.914516] [<ffffffff813643f7>] ? acpi_os_signal_semaphore+0x2e/0x32
> [ 33.921889] [<ffffffff810ba7b8>] arch_add_memory+0x48/0xf0
> [ 33.928186] [<ffffffff8179eb80>] add_memory_resource+0x80/0x110
> [ 33.934967] [<ffffffff8179ec8d>] add_memory+0x7d/0xc0
> [ 33.940787] [<ffffffff81399538>] acpi_memory_device_add+0x14f/0x237
> [ 33.947963] [<ffffffff81369a6d>] acpi_bus_attach+0xcb/0x166
> [ 33.954359] [<ffffffff81369acd>] acpi_bus_attach+0x12b/0x166
> [ 33.960854] [<ffffffff81369acd>] acpi_bus_attach+0x12b/0x166
> [ 33.967350] [<ffffffff81369acd>] acpi_bus_attach+0x12b/0x166
> [ 33.973848] [<ffffffff8136aff1>] acpi_bus_scan+0x5b/0x66
> [ 33.979962] [<ffffffff81d31e04>] ? acpi_early_init+0xeb/0xeb
> [ 33.986450] [<ffffffff81d32187>] acpi_scan_init+0x7d/0x1c4
> [ 33.992755] [<ffffffff81d31e04>] ? acpi_early_init+0xeb/0xeb
> [ 33.999248] [<ffffffff81d31e04>] ? acpi_early_init+0xeb/0xeb
> [ 34.005747] [<ffffffff81d3204a>] acpi_init+0x246/0x282
> [ 34.011659] [<ffffffff81d31e04>] ? acpi_early_init+0xeb/0xeb
> [ 34.018156] [<ffffffff810020b1>] do_one_initcall+0x81/0x1e0
> [ 34.024557] [<ffffffff81cf5c06>] kernel_init_freeable+0x19d/0x238
> [ 34.031542] [<ffffffff81cf5ca1>] ? kernel_init_freeable+0x238/0x238
> [ 34.038711] [<ffffffff8179d490>] ? rest_init+0x80/0x80
> [ 34.044626] [<ffffffff8179d499>] kernel_init+0x9/0xe0
> [ 34.050450] [<ffffffff817aa3cf>] ret_from_fork+0x3f/0x70
> [ 34.056552] [<ffffffff8179d490>] ? rest_init+0x80/0x80
> [ 34.062475] ---[ end trace 854dae1bef359299 ]---
> ============================================================================================
>
> You can get more information in 'error_log.txt'.
>
> Any idea?
> I don't know the original intention of this patch, so simply sending a revert to fix the issue does not seem like a good choice.
> Maybe you have a better solution.
>
> Liang
>
>
> error_log.txt
>
>
> (XEN) Bad console= option '8n1'
8n1 should be part of com1= or com2=, rather than console=
> Xen 4.7-unstable
> (XEN) Xen version 4.7-unstable (build@) (gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16)) debug=y Thu Jan 21 23:21:32 EST 2016
> (XEN) Latest ChangeSet: Tue Jan 19 17:47:19 2016 +0000 git:1949868-dirty
> (XEN) Console output is synchronous.
> (XEN) Bootloader: GNU GRUB 0.97
> (XEN) Command line: dom0_mem=4096M loglvl=all guest_loglvl=all unrestricted_guest=1 msi=1 console=com1,115200,8n1 sync_console hap_1gb=1 conring_size=128M iommu=on,intpost psr=cmt psr=cat psr=cdp
This is very hard to read with the VT escape characters still present.
However, you probably meant dom0_mem=4096M:max=4096M, or dom0 gets all
the remaining RAM.
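For anyone cleaning up a console capture like this one before posting, here is a minimal sketch of how the cursor-positioning sequences could be stripped. It assumes the sequences appear either as a real ESC byte or in the caret notation "^[" seen in this log; the helper name strip_vt is made up for illustration.

```python
import re

# Matches a CSI sequence such as "^[[20;80H". The introducer may be a real
# ESC byte (\x1b) or the two-character caret notation "^[" (assumption:
# the capture was pasted through a tool that prints ESC as "^[").
CSI = re.compile(r"(?:\x1b|\^\[)\[[0-9;]*[A-Za-z]")

def strip_vt(line: str) -> str:
    """Return the line with all matched cursor-control sequences removed."""
    return CSI.sub("", line)

# Fragment taken from the command line in the quoted log.
print(strip_vt("unrestricted_gues^[[20;80Ht^[[20;80H=^[[20;80H1"))
```

Running the log through something like this before sending makes the command line and the hypervisor messages readable again. It only removes the control sequences; text that the terminal already truncated stays truncated.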
Having said that, giving dom0 all the RAM should work, and...
> [ 33.656695] ACPI: NR_CPUS/possible_cpus limit of 64 reached. Processor 99/0
> [ 33.665648] ACPI: Unable to map lapic to logical cpu number
> (XEN) mm.c:884:d0v14 pg_owner 0 l1e_owner 0, but real_pg_owner -1
> (XEN) mm.c:955:d0v14 Error getting mfn 1080000 (pfn ffffffffffffffff) from L1 e
> (XEN) mm.c:1269:d0v14 Failure in alloc_l1_table: entry 0
> (XEN) mm.c:2175:d0v14 Error while validating mfn 188d903 (pfn 17a7cc) for type
> (XEN) mm.c:3101:d0v14 Error -16 while pinning mfn 188d903
This is a -EBUSY. Is there anything magic about mfn 188d903? It just
looks like plain RAM in the E820 table.
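As a quick way to check the "Error -16" reading above, the following sketch maps a negative return code to its symbolic name using the host's errno table. It assumes the host numbering matches the one the hypervisor uses (true for EBUSY on Linux); decode_errno is a hypothetical helper, not anything from Xen or the kernel.

```python
import errno

def decode_errno(rc: int) -> str:
    """Map a return code such as -16 to its symbolic errno name."""
    # errno.errorcode maps positive numbers to names on the host platform.
    return errno.errorcode.get(abs(rc), "UNKNOWN")

print(decode_errno(-16))
```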
Have you got dom0 configured to use linear p2m mode? Without it, dom0
can only have a maximum of 512GB of RAM.
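The 512GB figure can be reproduced with a little arithmetic. This is only a sketch under an assumption not stated in the message: that the non-linear p2m is a three-level tree of 4 KiB pages, each holding 8-byte entries.

```python
PAGE_SIZE = 4096          # bytes per page (assumed 4 KiB pages)
ENTRY_SIZE = 8            # bytes per p2m entry (assumed 64-bit entries)
LEVELS = 3                # assumed depth of the non-linear p2m tree

entries_per_page = PAGE_SIZE // ENTRY_SIZE   # 512 entries fit in one page
max_pfns = entries_per_page ** LEVELS        # frames addressable by the tree
max_bytes = max_pfns * PAGE_SIZE             # total RAM covered

print(max_bytes // 2**30, "GiB")
```

Under those assumptions the tree covers 512 GiB, which matches the limit mentioned above.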
~Andrew