From: "Bercaru, Gabriel" <bercarug@amazon.com>
To: "Roger Pau Monné" <roger.pau@citrix.com>,
"Paul Durrant" <Paul.Durrant@citrix.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>,
David Woodhouse <dwmw2@infradead.org>,
Jan Beulich <JBeulich@suse.com>,
"Belgun, Adrian" <abelgun@amazon.com>
Subject: Re: PVH dom0 creation fails - the system freezes
Date: Fri, 27 Jul 2018 08:48:32 +0000
Message-ID: <1532681311565.24220@amazon.com>
In-Reply-To: <20180726164611.gneruvoi5vmvbkd5@mac.bytemobile.com>
[-- Attachment #1: Type: text/plain, Size: 14539 bytes --]
I tried the patch and it fixes the problem with the unusable USB devices.
However, I captured the boot messages, and the "IOMMU mapping failed" printk
seems to have been executed on every iteration of the loop.
I have attached a small section of the boot log. The warning was printed many
more times than shown; I removed most occurrences to keep the attached file short.
Gabriel
________________________________________
From: Roger Pau Monné <roger.pau@citrix.com>
Sent: Thursday, July 26, 2018 7:46 PM
To: Paul Durrant
Cc: Bercaru, Gabriel; xen-devel; David Woodhouse; Jan Beulich; Belgun, Adrian
Subject: Re: [Xen-devel] PVH dom0 creation fails - the system freezes
On Wed, Jul 25, 2018 at 05:19:03PM +0100, Paul Durrant wrote:
> > -----Original Message-----
> > From: Xen-devel [mailto:xen-devel-bounces@lists.xenproject.org] On Behalf
> > Of Roger Pau Monné
> > Sent: 25 July 2018 15:12
> > To: bercarug@amazon.com
> > Cc: xen-devel <xen-devel@lists.xenproject.org>; David Woodhouse
> > <dwmw2@infradead.org>; Jan Beulich <JBeulich@suse.com>;
> > abelgun@amazon.com
> > Subject: Re: [Xen-devel] PVH dom0 creation fails - the system freezes
> >
> > On Wed, Jul 25, 2018 at 04:57:23PM +0300, bercarug@amazon.com wrote:
> > > On 07/25/2018 04:35 PM, Roger Pau Monné wrote:
> > > > On Wed, Jul 25, 2018 at 01:06:43PM +0300, bercarug@amazon.com
> > wrote:
> > > > > On 07/24/2018 12:54 PM, Jan Beulich wrote:
> > > > > > > > > On 23.07.18 at 13:50, <bercarug@amazon.com> wrote:
> > > > > > > For the last few days, I have been trying to get a PVH dom0 running,
> > > > > > > however I encountered the following problem: the system seems
> > to
> > > > > > > freeze after the hypervisor boots, the screen goes black. I have
> > tried to
> > > > > > > debug it via a serial console (using Minicom) and managed to get
> > some
> > > > > > > more Xen output, after the screen turns black.
> > > > > > >
> > > > > > > I mention that I have tried to boot the PVH dom0 using different
> > kernel
> > > > > > > images (from 4.9.0 to 4.18-rc3), different Xen versions (4.10, 4.11,
> > 4.12).
> > > > > > >
> > > > > > > Below I attached my system / hypervisor configuration, as well as
> > the
> > > > > > > output captured through the serial console, corresponding to the
> > latest
> > > > > > > versions for Xen and the Linux Kernel (Xen staging and Kernel from
> > the
> > > > > > > xen/tip tree).
> > > > > > > [...]
> > > > > > > (XEN) [VT-D]iommu.c:919: iommu_fault_status: Fault Overflow
> > > > > > > (XEN) [VT-D]iommu.c:921: iommu_fault_status: Primary Pending
> > Fault
> > > > > > > (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:00:14.0] fault
> > addr 8deb3000, iommu reg = ffff82c00021b000
> > > > Can you figure out which PCI device is 00:14.0?
> > > This is the output of lspci -vvv for device 00:14.0:
> > >
> > > 00:14.0 USB controller: Intel Corporation Sunrise Point-H USB 3.0 xHCI
> > > Controller (rev 31) (prog-if 30 [XHCI])
> > > Subsystem: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller
> > > Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> > ParErr+
> > > Stepping- SERR+ FastB2B- DisINTx+
> > > Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
> > > <TAbort- <MAbort+ >SERR- <PERR- INTx-
> > > Latency: 0
> > > Interrupt: pin A routed to IRQ 178
> > > Region 0: Memory at a2e00000 (64-bit, non-prefetchable) [size=64K]
> > > Capabilities: [70] Power Management version 2
> > > Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA
> > > PME(D0-,D1-,D2-,D3hot+,D3cold+)
> > > Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> > > Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
> > > Address: 00000000fee0e000 Data: 4021
> > > Kernel driver in use: xhci_hcd
> > > Kernel modules: xhci_pci
> >
> > I'm afraid your USB controller is missing RMRR entries in the DMAR
> > ACPI tables, thus causing the IOMMU faults and not working properly.
> >
> > You could try to manually add some extra rmrr regions by appending:
> >
> > rmrr=0x8deb3=0:0:14.0
> >
> > To the Xen command line, and keep adding any address that pops up in
> > the iommu faults. This is of course quite cumbersome, but there's no
> > way to get the required memory addresses if the data in RMRR is
> > wrong/incomplete.
> >
>
> You could just add all E820 reserved regions in there. That will almost certainly cover it.
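For reference, the rmrr= value suggested above is the faulting address from the log (8deb3000) shifted right by the 4K page shift, followed by the device's segment:bus:device.function. The sketch below only illustrates that arithmetic; format_rmrr_option() is a hypothetical helper and not part of Xen.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT_4K 12

/*
 * Hypothetical helper (not a Xen function): build an "rmrr=" command line
 * fragment covering the single 4K page that contains fault_addr, for the
 * device identified by seg:bus:dev.fn.
 * Returns the snprintf() result (length that was or would be written).
 */
static int format_rmrr_option(char *buf, size_t len, uint64_t fault_addr,
                              unsigned int seg, unsigned int bus,
                              unsigned int dev, unsigned int fn)
{
    /* The option takes a page frame number, not a byte address. */
    uint64_t pfn = fault_addr >> PAGE_SHIFT_4K;

    return snprintf(buf, len, "rmrr=0x%" PRIx64 "=%x:%x:%x.%x",
                    pfn, seg, bus, dev, fn);
}
```

Called with the fault address 0x8deb3000 and device 0000:00:14.0, this yields the string quoted above; each further address reported in an IOMMU fault would need its own entry.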
I have a prototype patch for this that attempts to identity map all
reserved regions below 4GB in the p2m. It's still a WIP, but if you
could give it a try, that would help me figure out whether it fixes
your issues and whether it is indeed something worth having.
I don't really like the patch as-is because it doesn't check whether
the reserved regions added to the p2m overlap with, for example, the
LAPIC page or the PCIe MCFG regions. I will continue to work on a safer
version.
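A minimal sketch of the kind of overlap check that is missing, assuming the ranges to avoid (the LAPIC page, MCFG windows) have already been collected into an array of inclusive pfn ranges. The struct and function names are hypothetical and are not taken from the patch or from Xen.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: an inclusive range of page frame numbers. */
struct pfn_range {
    uint64_t start;
    uint64_t end;
};

/* Two inclusive ranges overlap iff each one starts before the other ends. */
static bool ranges_overlap(struct pfn_range a, struct pfn_range b)
{
    return a.start <= b.end && b.start <= a.end;
}

/*
 * Return true only if the candidate range touches none of the excluded
 * ranges, i.e. it could be identity mapped without clobbering them.
 */
static bool safe_to_identity_map(struct pfn_range candidate,
                                 const struct pfn_range *excluded,
                                 size_t nr_excluded)
{
    size_t i;

    for ( i = 0; i < nr_excluded; i++ )
        if ( ranges_overlap(candidate, excluded[i]) )
            return false;

    return true;
}
```

With the LAPIC page at its default pfn 0xfee00 in the excluded list, a reserved region spanning that pfn would be rejected as a whole. A real implementation would more likely split the region around the excluded page rather than skip it entirely.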
If you can give this a shot, please remove any rmrr options from the
command line and use iommu=debug in order to catch any issues.
Thanks, Roger.
---8<---
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 2c44fabf99..76a1fd6681 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -21,6 +21,8 @@
#include <xen/keyhandler.h>
#include <xsm/xsm.h>
+#include <asm/setup.h>
+
static int parse_iommu_param(const char *s);
static void iommu_dump_p2m_table(unsigned char key);
@@ -47,6 +49,8 @@ integer_param("iommu_dev_iotlb_timeout", iommu_dev_iotlb_timeout);
* no-igfx Disable VT-d for IGD devices (insecure)
* no-amd-iommu-perdev-intremap Don't use per-device interrupt remapping
* tables (insecure)
+ * inclusive Include any memory ranges below 4GB not used
+ * by Xen or unusable to the iommu page tables.
*/
custom_param("iommu", parse_iommu_param);
bool_t __initdata iommu_enable = 1;
@@ -60,6 +64,7 @@ bool_t __read_mostly iommu_passthrough;
bool_t __read_mostly iommu_snoop = 1;
bool_t __read_mostly iommu_qinval = 1;
bool_t __read_mostly iommu_intremap = 1;
+bool __read_mostly iommu_inclusive = true;
/*
* In the current implementation of VT-d posted interrupts, in some extreme
@@ -126,6 +131,8 @@ static int __init parse_iommu_param(const char *s)
iommu_dom0_strict = val;
else if ( !strncmp(s, "sharept", ss - s) )
iommu_hap_pt_share = val;
+ else if ( !strncmp(s, "inclusive", ss - s) )
+ iommu_inclusive = val;
else
rc = -EINVAL;
@@ -165,6 +172,85 @@ static void __hwdom_init check_hwdom_reqs(struct domain *d)
iommu_dom0_strict = 1;
}
+static void __hwdom_init setup_inclusive_mappings(struct domain *d)
+{
+ unsigned long i, j, tmp, top, max_pfn;
+
+ BUG_ON(!is_hardware_domain(d));
+
+ max_pfn = (GB(4) >> PAGE_SHIFT) - 1;
+ top = max(max_pdx, pfn_to_pdx(max_pfn) + 1);
+
+ for ( i = 0; i < top; i++ )
+ {
+ unsigned long pfn = pdx_to_pfn(i);
+ bool map;
+ int rc = 0;
+
+ /*
+ * Set up 1:1 mapping for dom0. Default to include only
+ * conventional RAM areas and let RMRRs include needed reserved
+ * regions. When set, the inclusive mapping additionally maps in
+ * every pfn up to 4GB except those that fall in unusable ranges.
+ */
+ if ( pfn > max_pfn && !mfn_valid(_mfn(pfn)) )
+ continue;
+
+ if ( is_pv_domain(d) && iommu_inclusive && pfn <= max_pfn )
+ map = !page_is_ram_type(pfn, RAM_TYPE_UNUSABLE);
+ else if ( is_hvm_domain(d) && iommu_inclusive )
+ map = page_is_ram_type(pfn, RAM_TYPE_RESERVED);
+ else
+ map = page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL);
+
+ if ( !map )
+ continue;
+
+ /* Exclude Xen bits */
+ if ( xen_in_range(pfn) )
+ continue;
+
+ /*
+ * If dom0-strict mode is enabled or guest type is HVM/PVH then exclude
+ * conventional RAM and let the common code map dom0's pages.
+ */
+ if ( (iommu_dom0_strict || is_hvm_domain(d)) &&
+ page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) )
+ continue;
+
+ /* For HVM avoid memory below 1MB because that's already mapped. */
+ if ( is_hvm_domain(d) && pfn < PFN_DOWN(MB(1)) )
+ continue;
+
+ tmp = 1 << (PAGE_SHIFT - PAGE_SHIFT_4K);
+ for ( j = 0; j < tmp; j++ )
+ {
+ int ret;
+
+ if ( iommu_use_hap_pt(d) )
+ {
+ ASSERT(is_hvm_domain(d));
+ ret = set_identity_p2m_entry(d, pfn * tmp + j, p2m_access_rw,
+ 0);
+ }
+ else
+ ret = iommu_map_page(d, pfn * tmp + j, pfn * tmp + j,
+ IOMMUF_readable|IOMMUF_writable);
+
+ if ( !rc )
+ rc = ret;
+ }
+
+ if ( rc )
+ printk(XENLOG_WARNING " d%d: IOMMU mapping failed: %d\n",
+ d->domain_id, rc);
+
+ if (!(i & (0xfffff >> (PAGE_SHIFT - PAGE_SHIFT_4K))))
+ process_pending_softirqs();
+ }
+
+}
+
void __hwdom_init iommu_hwdom_init(struct domain *d)
{
const struct domain_iommu *hd = dom_iommu(d);
@@ -207,7 +293,10 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
d->domain_id, rc);
}
- return hd->platform_ops->hwdom_init(d);
+ hd->platform_ops->hwdom_init(d);
+
+ if ( !iommu_passthrough )
+ setup_inclusive_mappings(d);
}
void iommu_teardown(struct domain *d)
diff --git a/xen/drivers/passthrough/vtd/extern.h b/xen/drivers/passthrough/vtd/extern.h
index fb7edfaef9..91cadc602e 100644
--- a/xen/drivers/passthrough/vtd/extern.h
+++ b/xen/drivers/passthrough/vtd/extern.h
@@ -99,6 +99,4 @@ void pci_vtd_quirk(const struct pci_dev *);
bool_t platform_supports_intremap(void);
bool_t platform_supports_x2apic(void);
-void vtd_set_hwdom_mapping(struct domain *d);
-
#endif // _VTD_EXTERN_H_
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 1710256823..569ec4aec2 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1304,12 +1304,6 @@ static void __hwdom_init intel_iommu_hwdom_init(struct domain *d)
{
struct acpi_drhd_unit *drhd;
- if ( !iommu_passthrough && is_pv_domain(d) )
- {
- /* Set up 1:1 page table for hardware domain. */
- vtd_set_hwdom_mapping(d);
- }
-
setup_hwdom_pci_devices(d, setup_hwdom_device);
setup_hwdom_rmrr(d);
diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c b/xen/drivers/passthrough/vtd/x86/vtd.c
index cc2bfea162..9971915349 100644
--- a/xen/drivers/passthrough/vtd/x86/vtd.c
+++ b/xen/drivers/passthrough/vtd/x86/vtd.c
@@ -32,11 +32,9 @@
#include "../extern.h"
/*
- * iommu_inclusive_mapping: when set, all memory below 4GB is included in dom0
- * 1:1 iommu mappings except xen and unusable regions.
+ * iommu_inclusive_mapping: superseded by iommu=inclusive.
*/
-static bool_t __hwdom_initdata iommu_inclusive_mapping = 1;
-boolean_param("iommu_inclusive_mapping", iommu_inclusive_mapping);
+boolean_param("iommu_inclusive_mapping", iommu_inclusive);
void *map_vtd_domain_page(u64 maddr)
{
@@ -107,67 +105,3 @@ void hvm_dpci_isairq_eoi(struct domain *d, unsigned int isairq)
}
spin_unlock(&d->event_lock);
}
-
-void __hwdom_init vtd_set_hwdom_mapping(struct domain *d)
-{
- unsigned long i, j, tmp, top, max_pfn;
-
- BUG_ON(!is_hardware_domain(d));
-
- max_pfn = (GB(4) >> PAGE_SHIFT) - 1;
- top = max(max_pdx, pfn_to_pdx(max_pfn) + 1);
-
- for ( i = 0; i < top; i++ )
- {
- unsigned long pfn = pdx_to_pfn(i);
- bool map;
- int rc = 0;
-
- /*
- * Set up 1:1 mapping for dom0. Default to include only
- * conventional RAM areas and let RMRRs include needed reserved
- * regions. When set, the inclusive mapping additionally maps in
- * every pfn up to 4GB except those that fall in unusable ranges.
- */
- if ( pfn > max_pfn && !mfn_valid(_mfn(pfn)) )
- continue;
-
- if ( iommu_inclusive_mapping && pfn <= max_pfn )
- map = !page_is_ram_type(pfn, RAM_TYPE_UNUSABLE);
- else
- map = page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL);
-
- if ( !map )
- continue;
-
- /* Exclude Xen bits */
- if ( xen_in_range(pfn) )
- continue;
-
- /*
- * If dom0-strict mode is enabled then exclude conventional RAM
- * and let the common code map dom0's pages.
- */
- if ( iommu_dom0_strict &&
- page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) )
- continue;
-
- tmp = 1 << (PAGE_SHIFT - PAGE_SHIFT_4K);
- for ( j = 0; j < tmp; j++ )
- {
- int ret = iommu_map_page(d, pfn * tmp + j, pfn * tmp + j,
- IOMMUF_readable|IOMMUF_writable);
-
- if ( !rc )
- rc = ret;
- }
-
- if ( rc )
- printk(XENLOG_WARNING VTDPREFIX " d%d: IOMMU mapping failed: %d\n",
- d->domain_id, rc);
-
- if (!(i & (0xfffff >> (PAGE_SHIFT - PAGE_SHIFT_4K))))
- process_pending_softirqs();
- }
-}
-
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 6b42e3b876..15d6584837 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -35,6 +35,7 @@ extern bool_t iommu_snoop, iommu_qinval, iommu_intremap, iommu_intpost;
extern bool_t iommu_hap_pt_share;
extern bool_t iommu_debug;
extern bool_t amd_iommu_perdev_intremap;
+extern bool iommu_inclusive;
extern unsigned int iommu_dev_iotlb_timeout;
Amazon Development Center (Romania) S.R.L. registered office: 27A Sf. Lazar Street, UBC5, floor 2, Iasi, Iasi County, 700045, Romania. Registered in Romania. Registration number J22/2621/2005.
[-- Attachment #2: iommu.txt --]
[-- Type: text/plain, Size: 1732 bytes --]
(XEN) *** Building a PVH Dom0 ***
(XEN) [VT-D]d0:Hostbridge: skip 0000:00:00.0 map
(XEN) [VT-D]d0:PCI: map 0000:00:14.0
(XEN) [VT-D]d0:PCI: map 0000:00:14.2
(XEN) [VT-D]d0:PCI: map 0000:00:16.0
(XEN) [VT-D]d0:PCI: map 0000:00:16.1
(XEN) [VT-D]d0:PCI: map 0000:00:17.0
(XEN) [VT-D]d0:PCI: map 0000:00:1f.0
(XEN) [VT-D]d0:PCI: map 0000:00:1f.2
(XEN) [VT-D]d0:PCI: map 0000:00:1f.4
(XEN) [VT-D]d0:PCIe: map 0000:01:00.0
(XEN) [VT-D]d0:PCIe: map 0000:02:00.0
(XEN) [VT-D]d0:PCIe: map 0000:03:00.0
(XEN) [VT-D]d0:PCIe: map 0000:04:00.0
(XEN) [VT-D]iommu_enable_translation: iommu->reg = ffff82c00021b000
(XEN) hap.c:289: d0 failed to allocate from HAP pool
(XEN) d0: IOMMU mapping failed: -2
(XEN) d0: IOMMU mapping failed: -2
(XEN) d0: IOMMU mapping failed: -2
(XEN) d0: IOMMU mapping failed: -2
(XEN) d0: IOMMU mapping failed: -2
(XEN) d0: IOMMU mapping failed: -2
(XEN) d0: IOMMU mapping failed: -2
(XEN) d0: IOMMU mapping failed: -2
(XEN) d0: IOMMU mapping failed: -2
(XEN) d0: IOMMU mapping failed: -2
(XEN) Cannot setup identity map d0:fee00, gfn already mapped to 1021c63.
(XEN) d0: IOMMU mapping failed: -16
(XEN) d0: IOMMU mapping failed: -2
(XEN) d0: IOMMU mapping failed: -2
(XEN) d0: IOMMU mapping failed: -2
(XEN) d0: IOMMU mapping failed: -2
(XEN) d0: IOMMU mapping failed: -2
(XEN) d0: IOMMU mapping failed: -2
(XEN) d0: IOMMU mapping failed: -2
(XEN) d0: IOMMU mapping failed: -2
(XEN) d0: IOMMU mapping failed: -2
(XEN) d0: IOMMU mapping failed: -2
(XEN) d0: IOMMU mapping failed: -2
(XEN) d0: IOMMU mapping failed: -2
(XEN) d0: IOMMU mapping failed: -2
(XEN) d0: IOMMU mapping failed: -2
(XEN) d0: IOMMU mapping failed: -2
(XEN) WARNING: PVH is an experimental mode with limited functionality
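The numeric codes in the log above can be decoded with a small helper. This sketch assumes the hypervisor's error numbers match the host's Linux-style errno values (2 = ENOENT, 12 = ENOMEM, 16 = EBUSY); both function names are hypothetical.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Map a negative return code to its symbolic name (small subset only). */
static const char *errno_name(int rc)
{
    switch ( -rc )
    {
    case ENOENT: return "ENOENT";
    case ENOMEM: return "ENOMEM";
    case EBUSY:  return "EBUSY";
    default:     return "unknown";
    }
}

/*
 * Extract the return code from a line such as
 * "(XEN) d0: IOMMU mapping failed: -2".
 * Returns 0 if the line is not a mapping failure message.
 */
static int parse_mapping_failure(const char *line)
{
    const char *tag = "IOMMU mapping failed: ";
    const char *p = strstr(line, tag);
    int rc;

    if ( !p || sscanf(p + strlen(tag), "%d", &rc) != 1 )
        return 0;

    return rc;
}
```

Under that assumption the repeated -2 corresponds to ENOENT, and the single -16 corresponds to EBUSY, which lines up with the preceding "gfn already mapped" message for fee00.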
[-- Attachment #3: Type: text/plain, Size: 157 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel