From: "Bercaru, Gabriel" <bercarug@amazon.com>
To: "Roger Pau Monné" <roger.pau@citrix.com>,
	"Paul Durrant" <Paul.Durrant@citrix.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>,
	David Woodhouse <dwmw2@infradead.org>,
	Jan Beulich <JBeulich@suse.com>,
	"Belgun, Adrian" <abelgun@amazon.com>
Subject: Re: PVH dom0 creation fails - the system freezes
Date: Fri, 27 Jul 2018 08:48:32 +0000
Message-ID: <1532681311565.24220@amazon.com>
In-Reply-To: <20180726164611.gneruvoi5vmvbkd5@mac.bytemobile.com>

[-- Attachment #1: Type: text/plain, Size: 14539 bytes --]

I tried the patch and it fixes the problem with the unusable USB devices.
However, I captured the boot messages, and the "IOMMU mapping failed" printk
appears to fire on every iteration of the loop.

I have attached a short excerpt of the boot log. The warning was printed many
more times than shown; I removed most of the repeats to keep the attachment short.
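
In case it helps when reading the attached log: the two return codes that
show up are -2 and -16. Below is a minimal sketch for putting names on them,
assuming the hypervisor's errno numbering matches the standard values
(ENOENT = 2, EBUSY = 16). The helper name rc_name is made up for this example
and is not part of Xen.

```c
#include <errno.h>

/*
 * Hypothetical helper: translate a negative return code, as printed by the
 * "IOMMU mapping failed: %d" warning, into a symbolic errno name.
 * Only the two codes seen in the log are handled.
 */
static const char *rc_name(int rc)
{
    switch (-rc) {
    case ENOENT:
        return "ENOENT";
    case EBUSY:
        return "EBUSY";
    default:
        return "unknown";
    }
}
```

With that mapping, rc_name(-2) gives "ENOENT" and rc_name(-16) gives "EBUSY".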

Gabriel
________________________________________
From: Roger Pau Monné <roger.pau@citrix.com>
Sent: Thursday, July 26, 2018 7:46 PM
To: Paul Durrant
Cc: Bercaru, Gabriel; xen-devel; David Woodhouse; Jan Beulich; Belgun, Adrian
Subject: Re: [Xen-devel] PVH dom0 creation fails - the system freezes

On Wed, Jul 25, 2018 at 05:19:03PM +0100, Paul Durrant wrote:
> > -----Original Message-----
> > From: Xen-devel [mailto:xen-devel-bounces@lists.xenproject.org] On Behalf
> > Of Roger Pau Monné
> > Sent: 25 July 2018 15:12
> > To: bercarug@amazon.com
> > Cc: xen-devel <xen-devel@lists.xenproject.org>; David Woodhouse
> > <dwmw2@infradead.org>; Jan Beulich <JBeulich@suse.com>;
> > abelgun@amazon.com
> > Subject: Re: [Xen-devel] PVH dom0 creation fails - the system freezes
> >
> > On Wed, Jul 25, 2018 at 04:57:23PM +0300, bercarug@amazon.com wrote:
> > > On 07/25/2018 04:35 PM, Roger Pau Monné wrote:
> > > > > On Wed, Jul 25, 2018 at 01:06:43PM +0300, bercarug@amazon.com wrote:
> > > > > On 07/24/2018 12:54 PM, Jan Beulich wrote:
> > > > > > > > > On 23.07.18 at 13:50, <bercarug@amazon.com> wrote:
> > > > > > > > For the last few days, I have been trying to get a PVH dom0 running,
> > > > > > > > however I encountered the following problem: the system seems to
> > > > > > > > freeze after the hypervisor boots, the screen goes black. I have tried to
> > > > > > > > debug it via a serial console (using Minicom) and managed to get some
> > > > > > > > more Xen output, after the screen turns black.
> > > > > > > >
> > > > > > > > I mention that I have tried to boot the PVH dom0 using different kernel
> > > > > > > > images (from 4.9.0 to 4.18-rc3), different Xen versions (4.10, 4.11, 4.12).
> > > > > > > >
> > > > > > > > Below I attached my system / hypervisor configuration, as well as the
> > > > > > > > output captured through the serial console, corresponding to the latest
> > > > > > > > versions for Xen and the Linux Kernel (Xen staging and Kernel from the
> > > > > > > > xen/tip tree).
> > > > > > > > [...]
> > > > > > > > (XEN) [VT-D]iommu.c:919: iommu_fault_status: Fault Overflow
> > > > > > > > (XEN) [VT-D]iommu.c:921: iommu_fault_status: Primary Pending Fault
> > > > > > > > (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:00:14.0] fault addr 8deb3000, iommu reg = ffff82c00021b000
> > > > Can you figure out which PCI device is 00:14.0?
> > > This is the output of lspci -vvv for device 00:14.0:
> > >
> > > 00:14.0 USB controller: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller (rev 31) (prog-if 30 [XHCI])
> > >         Subsystem: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller
> > >         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
> > >         Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
> > >         Latency: 0
> > >         Interrupt: pin A routed to IRQ 178
> > >         Region 0: Memory at a2e00000 (64-bit, non-prefetchable) [size=64K]
> > >         Capabilities: [70] Power Management version 2
> > >                 Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0-,D1-,D2-,D3hot+,D3cold+)
> > >                 Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> > >         Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
> > >                 Address: 00000000fee0e000  Data: 4021
> > >         Kernel driver in use: xhci_hcd
> > >         Kernel modules: xhci_pci
> >
> > I'm afraid your USB controller is missing RMRR entries in the DMAR
> > ACPI tables, thus causing the IOMMU faults and not working properly.
> >
> > You could try to manually add some extra rmrr regions by appending:
> >
> > rmrr=0x8deb3=0:0:14.0
> >
> > To the Xen command line, and keep adding any address that pops up in
> > the iommu faults. This is of course quite cumbersome, but there's no
> > way to get the required memory addresses if the data in RMRR is
> > wrong/incomplete.
> >
>
> You could just add all E820 reserved regions in there. That will almost certainly cover it.
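
For reference, the value in the rmrr= suggestion quoted above looks like the
faulting address shifted right by the 4K page shift (0x8deb3000 becomes
0x8deb3). A minimal sketch of that arithmetic follows, assuming the option
takes a hexadecimal page frame number followed by seg:bus:dev.fn as in the
quoted example. The helper names fault_addr_to_pfn and format_rmrr are
hypothetical and exist only for illustration.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT_4K 12

/* Convert a faulting DMA address into a 4K page frame number. */
static uint64_t fault_addr_to_pfn(uint64_t addr)
{
    return addr >> PAGE_SHIFT_4K;
}

/*
 * Hypothetical helper: build a single-page rmrr= option for one device,
 * mirroring the "rmrr=0x8deb3=0:0:14.0" form used in the thread.
 * Returns what snprintf returns.
 */
static int format_rmrr(char *buf, size_t len, uint64_t fault_addr,
                       unsigned int seg, unsigned int bus,
                       unsigned int dev, unsigned int fn)
{
    return snprintf(buf, len, "rmrr=0x%" PRIx64 "=%x:%x:%x.%x",
                    fault_addr_to_pfn(fault_addr), seg, bus, dev, fn);
}
```

Each new fault address reported for the device would need the same
conversion, which is why this approach gets tedious quickly.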

I have a prototype patch for this that attempts to identity map all
reserved regions below 4GB to the p2m. It's still a WIP, but if you
could give it a try that would help me figure out whether this fixes
your issues and is indeed something that would be good to have.

I don't really like the patch as-is because it doesn't check whether
the reserved regions added to the p2m overlap with the LAPIC page or
the PCIe MCFG regions for example, I will continue to work on a safer
version.
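
To illustrate the kind of check that is missing, here is a rough standalone
sketch, not taken from the patch. It assumes the LAPIC sits at its
architectural default base of 0xfee00000 and that the caller supplies any
other ranges to skip (the MCFG windows, for example). The names pfn_range,
LAPIC_DEFAULT_PFN and pfn_is_excluded are made up for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT_4K     12
/* Assumption: LAPIC at its default physical base, not relocated. */
#define LAPIC_DEFAULT_PFN (0xfee00000ULL >> PAGE_SHIFT_4K)

/* Inclusive range of page frame numbers that must not be identity mapped. */
struct pfn_range {
    uint64_t start;
    uint64_t end;
};

/* Return true if pfn falls inside any of the n excluded ranges. */
static bool pfn_is_excluded(uint64_t pfn, const struct pfn_range *excl,
                            size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (pfn >= excl[i].start && pfn <= excl[i].end)
            return true;

    return false;
}
```

The mapping loop would then skip a pfn whenever pfn_is_excluded() returns
true for a table containing { LAPIC_DEFAULT_PFN, LAPIC_DEFAULT_PFN } plus
whatever MCFG ranges the platform reports.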

If you can give this a shot, please remove any rmrr options from the
command line and use iommu=debug in order to catch any issues.

Thanks, Roger.
---8<---
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 2c44fabf99..76a1fd6681 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -21,6 +21,8 @@
 #include <xen/keyhandler.h>
 #include <xsm/xsm.h>

+#include <asm/setup.h>
+
 static int parse_iommu_param(const char *s);
 static void iommu_dump_p2m_table(unsigned char key);

@@ -47,6 +49,8 @@ integer_param("iommu_dev_iotlb_timeout", iommu_dev_iotlb_timeout);
  *   no-igfx                    Disable VT-d for IGD devices (insecure)
  *   no-amd-iommu-perdev-intremap Don't use per-device interrupt remapping
  *                              tables (insecure)
+ *   inclusive                  Include any memory ranges below 4GB not used
+ *                              by Xen or unusable to the iommu page tables.
  */
 custom_param("iommu", parse_iommu_param);
 bool_t __initdata iommu_enable = 1;
@@ -60,6 +64,7 @@ bool_t __read_mostly iommu_passthrough;
 bool_t __read_mostly iommu_snoop = 1;
 bool_t __read_mostly iommu_qinval = 1;
 bool_t __read_mostly iommu_intremap = 1;
+bool __read_mostly iommu_inclusive = true;

 /*
  * In the current implementation of VT-d posted interrupts, in some extreme
@@ -126,6 +131,8 @@ static int __init parse_iommu_param(const char *s)
             iommu_dom0_strict = val;
         else if ( !strncmp(s, "sharept", ss - s) )
             iommu_hap_pt_share = val;
+        else if ( !strncmp(s, "inclusive", ss - s) )
+            iommu_inclusive = val;
         else
             rc = -EINVAL;

@@ -165,6 +172,85 @@ static void __hwdom_init check_hwdom_reqs(struct domain *d)
     iommu_dom0_strict = 1;
 }

+static void __hwdom_init setup_inclusive_mappings(struct domain *d)
+{
+    unsigned long i, j, tmp, top, max_pfn;
+
+    BUG_ON(!is_hardware_domain(d));
+
+    max_pfn = (GB(4) >> PAGE_SHIFT) - 1;
+    top = max(max_pdx, pfn_to_pdx(max_pfn) + 1);
+
+    for ( i = 0; i < top; i++ )
+    {
+        unsigned long pfn = pdx_to_pfn(i);
+        bool map;
+        int rc = 0;
+
+        /*
+         * Set up 1:1 mapping for dom0. Default to include only
+         * conventional RAM areas and let RMRRs include needed reserved
+         * regions. When set, the inclusive mapping additionally maps in
+         * every pfn up to 4GB except those that fall in unusable ranges.
+         */
+        if ( pfn > max_pfn && !mfn_valid(_mfn(pfn)) )
+            continue;
+
+        if ( is_pv_domain(d) && iommu_inclusive && pfn <= max_pfn )
+            map = !page_is_ram_type(pfn, RAM_TYPE_UNUSABLE);
+        else if ( is_hvm_domain(d) && iommu_inclusive )
+            map = page_is_ram_type(pfn, RAM_TYPE_RESERVED);
+        else
+            map = page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL);
+
+        if ( !map )
+            continue;
+
+        /* Exclude Xen bits */
+        if ( xen_in_range(pfn) )
+            continue;
+
+        /*
+         * If dom0-strict mode is enabled or guest type is HVM/PVH then exclude
+         * conventional RAM and let the common code map dom0's pages.
+         */
+        if ( (iommu_dom0_strict || is_hvm_domain(d)) &&
+             page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) )
+            continue;
+
+        /* For HVM avoid memory below 1MB because that's already mapped. */
+        if ( is_hvm_domain(d) && pfn < PFN_DOWN(MB(1)) )
+            continue;
+
+        tmp = 1 << (PAGE_SHIFT - PAGE_SHIFT_4K);
+        for ( j = 0; j < tmp; j++ )
+        {
+            int ret;
+
+            if ( iommu_use_hap_pt(d) )
+            {
+                ASSERT(is_hvm_domain(d));
+                ret = set_identity_p2m_entry(d, pfn * tmp + j, p2m_access_rw,
+                                             0);
+            }
+            else
+                ret = iommu_map_page(d, pfn * tmp + j, pfn * tmp + j,
+                                     IOMMUF_readable|IOMMUF_writable);
+
+            if ( !rc )
+               rc = ret;
+        }
+
+        if ( rc )
+            printk(XENLOG_WARNING " d%d: IOMMU mapping failed: %d\n",
+                   d->domain_id, rc);
+
+        if (!(i & (0xfffff >> (PAGE_SHIFT - PAGE_SHIFT_4K))))
+            process_pending_softirqs();
+    }
+
+}
+
 void __hwdom_init iommu_hwdom_init(struct domain *d)
 {
     const struct domain_iommu *hd = dom_iommu(d);
@@ -207,7 +293,10 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
                    d->domain_id, rc);
     }

-    return hd->platform_ops->hwdom_init(d);
+    hd->platform_ops->hwdom_init(d);
+
+    if ( !iommu_passthrough )
+        setup_inclusive_mappings(d);
 }

 void iommu_teardown(struct domain *d)
diff --git a/xen/drivers/passthrough/vtd/extern.h b/xen/drivers/passthrough/vtd/extern.h
index fb7edfaef9..91cadc602e 100644
--- a/xen/drivers/passthrough/vtd/extern.h
+++ b/xen/drivers/passthrough/vtd/extern.h
@@ -99,6 +99,4 @@ void pci_vtd_quirk(const struct pci_dev *);
 bool_t platform_supports_intremap(void);
 bool_t platform_supports_x2apic(void);

-void vtd_set_hwdom_mapping(struct domain *d);
-
 #endif // _VTD_EXTERN_H_
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 1710256823..569ec4aec2 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1304,12 +1304,6 @@ static void __hwdom_init intel_iommu_hwdom_init(struct domain *d)
 {
     struct acpi_drhd_unit *drhd;

-    if ( !iommu_passthrough && is_pv_domain(d) )
-    {
-        /* Set up 1:1 page table for hardware domain. */
-        vtd_set_hwdom_mapping(d);
-    }
-
     setup_hwdom_pci_devices(d, setup_hwdom_device);
     setup_hwdom_rmrr(d);

diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c b/xen/drivers/passthrough/vtd/x86/vtd.c
index cc2bfea162..9971915349 100644
--- a/xen/drivers/passthrough/vtd/x86/vtd.c
+++ b/xen/drivers/passthrough/vtd/x86/vtd.c
@@ -32,11 +32,9 @@
 #include "../extern.h"

 /*
- * iommu_inclusive_mapping: when set, all memory below 4GB is included in dom0
- * 1:1 iommu mappings except xen and unusable regions.
+ * iommu_inclusive_mapping: superseded by iommu=inclusive.
  */
-static bool_t __hwdom_initdata iommu_inclusive_mapping = 1;
-boolean_param("iommu_inclusive_mapping", iommu_inclusive_mapping);
+boolean_param("iommu_inclusive_mapping", iommu_inclusive);

 void *map_vtd_domain_page(u64 maddr)
 {
@@ -107,67 +105,3 @@ void hvm_dpci_isairq_eoi(struct domain *d, unsigned int isairq)
     }
     spin_unlock(&d->event_lock);
 }
-
-void __hwdom_init vtd_set_hwdom_mapping(struct domain *d)
-{
-    unsigned long i, j, tmp, top, max_pfn;
-
-    BUG_ON(!is_hardware_domain(d));
-
-    max_pfn = (GB(4) >> PAGE_SHIFT) - 1;
-    top = max(max_pdx, pfn_to_pdx(max_pfn) + 1);
-
-    for ( i = 0; i < top; i++ )
-    {
-        unsigned long pfn = pdx_to_pfn(i);
-        bool map;
-        int rc = 0;
-
-        /*
-         * Set up 1:1 mapping for dom0. Default to include only
-         * conventional RAM areas and let RMRRs include needed reserved
-         * regions. When set, the inclusive mapping additionally maps in
-         * every pfn up to 4GB except those that fall in unusable ranges.
-         */
-        if ( pfn > max_pfn && !mfn_valid(_mfn(pfn)) )
-            continue;
-
-        if ( iommu_inclusive_mapping && pfn <= max_pfn )
-            map = !page_is_ram_type(pfn, RAM_TYPE_UNUSABLE);
-        else
-            map = page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL);
-
-        if ( !map )
-            continue;
-
-        /* Exclude Xen bits */
-        if ( xen_in_range(pfn) )
-            continue;
-
-        /*
-         * If dom0-strict mode is enabled then exclude conventional RAM
-         * and let the common code map dom0's pages.
-         */
-        if ( iommu_dom0_strict &&
-             page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) )
-            continue;
-
-        tmp = 1 << (PAGE_SHIFT - PAGE_SHIFT_4K);
-        for ( j = 0; j < tmp; j++ )
-        {
-            int ret = iommu_map_page(d, pfn * tmp + j, pfn * tmp + j,
-                                     IOMMUF_readable|IOMMUF_writable);
-
-            if ( !rc )
-               rc = ret;
-        }
-
-        if ( rc )
-            printk(XENLOG_WARNING VTDPREFIX " d%d: IOMMU mapping failed: %d\n",
-                   d->domain_id, rc);
-
-        if (!(i & (0xfffff >> (PAGE_SHIFT - PAGE_SHIFT_4K))))
-            process_pending_softirqs();
-    }
-}
-
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 6b42e3b876..15d6584837 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -35,6 +35,7 @@ extern bool_t iommu_snoop, iommu_qinval, iommu_intremap, iommu_intpost;
 extern bool_t iommu_hap_pt_share;
 extern bool_t iommu_debug;
 extern bool_t amd_iommu_perdev_intremap;
+extern bool iommu_inclusive;

 extern unsigned int iommu_dev_iotlb_timeout;







[-- Attachment #2: iommu.txt --]
[-- Type: text/plain, Size: 1732 bytes --]

(XEN) *** Building a PVH Dom0 ***
(XEN) [VT-D]d0:Hostbridge: skip 0000:00:00.0 map
(XEN) [VT-D]d0:PCI: map 0000:00:14.0
(XEN) [VT-D]d0:PCI: map 0000:00:14.2
(XEN) [VT-D]d0:PCI: map 0000:00:16.0
(XEN) [VT-D]d0:PCI: map 0000:00:16.1
(XEN) [VT-D]d0:PCI: map 0000:00:17.0
(XEN) [VT-D]d0:PCI: map 0000:00:1f.0
(XEN) [VT-D]d0:PCI: map 0000:00:1f.2
(XEN) [VT-D]d0:PCI: map 0000:00:1f.4
(XEN) [VT-D]d0:PCIe: map 0000:01:00.0
(XEN) [VT-D]d0:PCIe: map 0000:02:00.0
(XEN) [VT-D]d0:PCIe: map 0000:03:00.0
(XEN) [VT-D]d0:PCIe: map 0000:04:00.0
(XEN) [VT-D]iommu_enable_translation: iommu->reg = ffff82c00021b000
(XEN) hap.c:289: d0 failed to allocate from HAP pool
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN) Cannot setup identity map d0:fee00, gfn already mapped to 1021c63.
(XEN)  d0: IOMMU mapping failed: -16
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN)  d0: IOMMU mapping failed: -2
(XEN) WARNING: PVH is an experimental mode with limited functionality

