From: <bercarug@amazon.com>
To: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>,
David Woodhouse <dwmw2@infradead.org>,
Jan Beulich <JBeulich@suse.com>,
abelgun@amazon.com
Subject: Re: PVH dom0 creation fails - the system freezes
Date: Wed, 25 Jul 2018 16:57:23 +0300
Message-ID: <40a982ee-06c4-e45a-006e-f75df79eb14b@amazon.com>
In-Reply-To: <20180725133530.235csakkjrz6y5yr@mac.bytemobile.com>
On 07/25/2018 04:35 PM, Roger Pau Monné wrote:
> On Wed, Jul 25, 2018 at 01:06:43PM +0300, bercarug@amazon.com wrote:
>> On 07/24/2018 12:54 PM, Jan Beulich wrote:
>>>>>> On 23.07.18 at 13:50, <bercarug@amazon.com> wrote:
>>>> For the last few days, I have been trying to get a PVH dom0 running,
>>>> however I encountered the following problem: the system seems to
>>>> freeze after the hypervisor boots, the screen goes black. I have tried to
>>>> debug it via a serial console (using Minicom) and managed to get some
>>>> more Xen output, after the screen turns black.
>>>>
>>>> I mention that I have tried to boot the PVH dom0 using different kernel
>>>> images (from 4.9.0 to 4.18-rc3), different Xen versions (4.10, 4.11, 4.12).
>>>>
>>>> Below I attached my system / hypervisor configuration, as well as the
>>>> output captured through the serial console, corresponding to the latest
>>>> versions for Xen and the Linux Kernel (Xen staging and Kernel from the
>>>> xen/tip tree).
>>>> [...]
>>>> (XEN) [VT-D]iommu.c:919: iommu_fault_status: Fault Overflow
>>>> (XEN) [VT-D]iommu.c:921: iommu_fault_status: Primary Pending Fault
>>>> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:00:14.0] fault addr 8deb3000, iommu reg = ffff82c00021b000
> Can you figure out which PCI device is 00:14.0?
This is the output of lspci -vvv for device 00:14.0:
00:14.0 USB controller: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller (rev 31) (prog-if 30 [XHCI])
    Subsystem: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
    Latency: 0
    Interrupt: pin A routed to IRQ 178
    Region 0: Memory at a2e00000 (64-bit, non-prefetchable) [size=64K]
    Capabilities: [70] Power Management version 2
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0-,D1-,D2-,D3hot+,D3cold+)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
        Address: 00000000fee0e000 Data: 4021
    Kernel driver in use: xhci_hcd
    Kernel modules: xhci_pci
>>>> (XEN) [VT-D]DMAR: reason 05 - PTE Write access is not set
>>>> (XEN) print_vtd_entries: iommu #0 dev 0000:00:14.0 gmfn 8deb3
>>>> (XEN) root_entry[00] = 1021c60001
>>>> (XEN) context[a0] = 2_1021d6d001
>>>> (XEN) l4[000] = 9c00001021d6c107
>>>> (XEN) l3[002] = 9c00001021d3e107
>>>> (XEN) l2[06f] = 9c000010218c0107
>>>> (XEN) l1[0b3] = 8000000000000000
>>>> (XEN) l1[0b3] not present
>>>> (XEN) Dom0 callback via changed to Direct Vector 0xf3
>>> This might be a hint at a missing RMRR entry in the ACPI tables, as
>>> we've seen to be the case for a number of systems (I dare to guess
>>> that 0000:00:14.0 is a USB controller, perhaps one with a keyboard
>>> and/or mouse connected). You may want to play with the respective
>>> command line option ("rmrr="). Note that "iommu_inclusive_mapping"
>>> as you're using it does not have any meaning for PVH (see
>>> intel_iommu_hwdom_init()).
>>>
>>> Jan
>>>
>>>
>>>
>> Hello,
>>
>> Following Roger's advice, I rebuilt Xen (4.12) using the staging branch and
>> I managed to get a PVH dom0 starting. However, some other problems appeared:
>>
>> 1) The USB devices are not usable anymore (keyboard and mouse), so the
>> system is only accessible through the serial port.
> Can you boot with iommu=debug and see if you get any extra IOMMU
> information on the serial console?
The debug flag was already set, so the log I attached to my first
message already contains the IOMMU info.
On Xen's command line I used iommu=debug,verbose,workaround_bios_bug.
>
>> 2) I can run any usual command in dom0, but the ones involving xl (except
>> for xl info) will make the system run out of memory very fast. Eventually,
>> when there is no more free memory available, the OOM killer begins removing
>> processes until the system auto reboots.
>>
>> I attached a file containing the output of a lsusb, as well as the output of
>> xl info and xl list -l.
>> After xl list -l, the “free -m” commands show the available memory
>> decreasing.
>> Each command has a timestamp appended, so it can be seen how fast the
>> available memory decreases.
>>
>> I removed much of the process killing logs and kept the last one, since they
>> were following the same pattern.
>>
>> Dom0 still appears to be of type PV (output of xl list -l), however during
>> boot, the following messages were displayed: “Building a PVH Dom0” and
>> “Booting paravirtualized kernel on Xen PVH”.
>>
>> I mention that I had to add “workaround_bios_bug” in GRUB_CMDLINE_XEN for
>> iommu to get dom0 running.
> It seems to me like your ACPI DMAR table contains errors, and I
> wouldn't be surprised if those also cause the USB devices to
> malfunction.
>
>> What could be causing the available memory loss problem?
> That seems to be Linux aggressively ballooning out memory, you go from
> 7129M total memory to 246M. Are you creating a lot of domains?
>
> Roger.
>
I did not create any guests before running "xl list -l". Creating a
PVH domU does work, though: "xl create <cfg_file>" does not trigger this
behavior.
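
For reference, the indices in the print_vtd_entries output quoted above
can be reproduced from the faulting address. The following is only a
rough sketch, assuming 4 KiB pages, 9 index bits per level and a 4-level
table, and assuming that bits 0 and 1 of a second-level entry are the
read and write permissions. The helper names (vtd_indices,
context_index, pte_perms) are made up for illustration and are not Xen
functions.

```python
# Sketch only: split a DMA fault address into page-table indices.
# Assumptions: 4 KiB pages, 9 bits per level, 4 levels.

PAGE_SHIFT = 12
LEVEL_BITS = 9
LEVEL_MASK = (1 << LEVEL_BITS) - 1


def vtd_indices(fault_addr, levels=4):
    """Return (gfn, {"l4": idx, ..., "l1": idx}) for a fault address."""
    gfn = fault_addr >> PAGE_SHIFT
    idx = {}
    for level in range(levels, 0, -1):
        idx[f"l{level}"] = (gfn >> (LEVEL_BITS * (level - 1))) & LEVEL_MASK
    return gfn, idx


def context_index(device, function):
    """Context-table index (devfn) for a PCI device/function pair."""
    return (device << 3) | function


def pte_perms(entry):
    """Return (readable, writable), assuming bit 0 = read, bit 1 = write."""
    return bool(entry & 0x1), bool(entry & 0x2)


if __name__ == "__main__":
    gfn, idx = vtd_indices(0x8DEB3000)
    print(f"gmfn {gfn:x}")
    print(f"context[{context_index(0x14, 0):02x}]")
    for name, value in idx.items():
        print(f"{name}[{value:03x}]")
    print("l1 perms (read, write):", pte_perms(0x8000000000000000))
```

For fault addr 8deb3000 and device 00:14.0 this gives gmfn 8deb3,
context[a0], l4[000], l3[002], l2[06f] and l1[0b3], matching the log.
The l1 entry 8000000000000000 has neither permission bit set, which is
consistent with the "PTE Write access is not set" fault reason.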
Gabriel