From: Chao Gao <chao.gao@intel.com>
To: Roger Pau Monne <roger.pau@citrix.com>,
"Tian, Kevin" <kevin.tian@intel.com>,
Jan Beulich <JBeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>
Subject: Re: [PATCH v2 1/4] x86/dom0: prevent access to MMCFG areas for PVH Dom0
Date: Mon, 4 Sep 2017 14:25:10 +0800
Message-ID: <20170904062507.GA5394@op-computing>
In-Reply-To: <20170831100948.bbeck4c5kbkryuw7@MacBook-Pro-de-Roger.local>
On Thu, Aug 31, 2017 at 11:09:48AM +0100, Roger Pau Monne wrote:
>On Thu, Aug 31, 2017 at 04:45:23PM +0800, Chao Gao wrote:
>> On Thu, Aug 31, 2017 at 10:03:19AM +0100, Roger Pau Monne wrote:
>> >On Thu, Aug 31, 2017 at 03:32:42PM +0800, Chao Gao wrote:
>> >> On Tue, Aug 29, 2017 at 08:33:25AM +0100, Roger Pau Monne wrote:
>> >> >On Mon, Aug 28, 2017 at 06:18:13AM +0000, Tian, Kevin wrote:
>> >> >> > From: Roger Pau Monne [mailto:roger.pau@citrix.com]
>> >> >> > Sent: Friday, August 25, 2017 9:59 PM
>> >> >> >
>> >> >> > On Fri, Aug 25, 2017 at 06:25:36AM -0600, Jan Beulich wrote:
>> >> >> > > >>> On 25.08.17 at 14:15, <roger.pau@citrix.com> wrote:
>> >> >> > > > On Wed, Aug 23, 2017 at 02:16:38AM -0600, Jan Beulich wrote:
>> >> >> > > >> >>> On 22.08.17 at 15:54, <roger.pau@citrix.com> wrote:
>> >> >> > > >> > On Tue, Aug 22, 2017 at 06:26:23AM -0600, Jan Beulich wrote:
>> >> >> > > >> >> >>> On 11.08.17 at 18:43, <roger.pau@citrix.com> wrote:
>> >> >> > > >> >> > --- a/xen/arch/x86/dom0_build.c
>> >> >> > > >> >> > +++ b/xen/arch/x86/dom0_build.c
>> >> >> > > >> >> > @@ -440,6 +440,10 @@ int __init
>> >> >> > dom0_setup_permissions(struct domain *d)
>> >> >> > > >> >> > rc |= rangeset_add_singleton(mmio_ro_ranges, mfn);
>> >> >> > > >> >> > }
>> >> >> > > >> >> >
>> >> >> > > >> >> > + /* For PVH prevent access to the MMCFG areas. */
>> >> >> > > >> >> > + if ( dom0_pvh )
>> >> >> > > >> >> > + rc |= pci_mmcfg_set_domain_permissions(d);
>> >> >> > > >> >>
>> >> >> > > >> >> What about ones reported by Dom0 later on? Which then raises the
>> >> >> > > >> >> question whether ...
>> >> >> > > >> >
>> >> >> > > >> > This should be dealt with in the PHYSDEVOP_pci_mmcfg_reserved
>> >> >> > handler.
>> >> >> > > >> > But since you propose to do white listing, I guess it doesn't matter
>> >> >> > > >> > that much anymore.
>> >> >> > > >>
>> >> >> > > >> Well, a fundamental question is whether white listing would work in
>> >> >> > > >> the first place. I could see room for severe problems e.g. with ACPI
>> >> >> > > >> methods wanting to access MMIO that's not described by any PCI
>> >> >> > > >> devices' BARs. Typically that would be regions in the chipset which
>> >> >> > > >> firmware is responsible for configuring/managing, the addresses of
>> >> >> > > >> which can be found/set in custom config space registers.
>> >> >> > > >
>> >> >> > > > The question would also be what would Xen allow in such white-listing.
>> >> >> > > > Obviously you can get to map the same using both white-list and
>> >> >> > > > black-listing (see below).
>> >> >> > >
>> >> >> > > Not really - what you've said there regarding MMCFG regions is
>> >> >> > > a clear indication that we should _not_ map reserved regions, i.e.
>> >> >> > > it would need to be full white listing with perhaps just the PCI
>> >> >> > > device BARs being handled automatically.
>> >> >> >
>> >> >> > I've tried just mapping the BARs and that sadly doesn't work, the box
>> >> >> > hangs after the IOMMU is enabled:
>> >> >> >
>> >> >> > [...]
>> >> >> > (XEN) [VT-D]d0:PCI: map 0000:3f:13.5
>> >> >> > (XEN) [VT-D]d0:PCI: map 0000:3f:13.6
>> >> >> > (XEN) [VT-D]iommu_enable_translation: iommu->reg = ffff82c00021b000
>> >> >> >
>> >> >> > I will park this ATM and leave it for the Intel guys to diagnose.
>> >> >> >
>> >> >> > For the reference, the specific box I'm testing ATM has a Xeon(R) CPU
>> >> >> > E5-1607 0 @ 3.00GHz and a C600/X79 chipset.
>> >> >> >
>> >> >>
>> >> >> +Chao who can help check whether we have such a box at hand.
>> >> >>
>> >> >> btw please also give your BIOS version.
>> >> >
>> >> >It's a Precision T3600 BIOS A14.
>> >>
>> >> Hi, Roger.
>> >>
>> >> I found a Ivy bridge box with E5-2697 v2 and tested with "dom0=pvh", and
>> >
>> >The ones I've seen issues with are Sandy Bridge or Nehalem, can you
>> >find some of this hardware?
>>
>> As I expected, I was removed from recipents :(, which made me
>> hard to notice your replies in time.
>
>Sorry, I have no idea why my MUA does that, it seems to be able to
>deal fine with other recipients.
>
>> Yes. I will. But may take some time (for even Ivy Bridge is rare).
>>
>> >
>> >I haven't tested Ivy Bridge, but all Haswell boxes I've tested seem to
>> >work just fine.
>>
>> The reason why I chose Ivy Bridge partly is you said you found this bug on
>> almost pre-haswell box.
>
>I tested Nehalem, Sandy Bridge and Haswell, but sadly not Ivy Bridge
>(in fact I didn't even know about Ivy Bridge, that's why I said all
>pre-Haswell).
>
>In fact I'm now trying with a Nehalem processor that seem to work, so
>whatever this issue is it certainly doesn't affect all models or
>chipsets.
Hi, Roger.

Last week I borrowed a Sandy Bridge box with an Intel(R) Xeon(R) E5-2690
2.7GHz and tested it with 'dom0=pvh', but I didn't see the machine hang.

I also tested on a Haswell box and found that the RMRRs in its DMAR
table are incorrect. The e820 map on that machine is:
(XEN) [ 0.000000] Xen-e820 RAM map:
(XEN) [ 0.000000] 0000000000000000 - 000000000009a400 (usable)
(XEN) [ 0.000000] 000000000009a400 - 00000000000a0000 (reserved)
(XEN) [ 0.000000] 00000000000e0000 - 0000000000100000 (reserved)
(XEN) [ 0.000000] 0000000000100000 - 000000006ff84000 (usable)
(XEN) [ 0.000000] 000000006ff84000 - 000000007ac51000 (reserved)
(XEN) [ 0.000000] 000000007ac51000 - 000000007b681000 (ACPI NVS)
(XEN) [ 0.000000] 000000007b681000 - 000000007b7cf000 (ACPI data)
(XEN) [ 0.000000] 000000007b7cf000 - 000000007b800000 (usable)
(XEN) [ 0.000000] 000000007b800000 - 0000000090000000 (reserved)
(XEN) [ 0.000000] 00000000fed1c000 - 00000000fed20000 (reserved)
(XEN) [ 0.000000] 00000000ff400000 - 0000000100000000 (reserved)
(XEN) [ 0.000000] 0000000100000000 - 0000002080000000 (usable)
And the RMRRs in DMAR are:
(XEN) [ 0.000000] [VT-D]found ACPI_DMAR_RMRR:
(XEN) [ 0.000000] [VT-D] endpoint: 0000:05:00.0
(XEN) [ 0.000000] [VT-D]dmar.c:638: RMRR region: base_addr 723b4000 end_addr 7a3f3fff
(XEN) [ 0.000000] [VT-D]found ACPI_DMAR_RMRR:
(XEN) [ 0.000000] [VT-D] endpoint: 0000:00:1d.0
(XEN) [ 0.000000] [VT-D] endpoint: 0000:00:1a.0
(XEN) [ 0.000000] [VT-D]dmar.c:638: RMRR region: base_addr 723ac000 end_addr 723aefff
(Endpoint 05:00.0 is a RAID bus controller. Endpoints 00:1d.0 and 00:1a.0
are USB controllers.)
After DMA remapping is enabled, two DMA translation faults are reported
by VT-d:
(XEN) [ 9.547924] [VT-D]iommu_enable_translation: iommu->reg = ffff82c00021b000
(XEN) [ 9.550620] [VT-D]iommu_enable_translation: iommu->reg = ffff82c00021d000
(XEN) [ 9.553327] [VT-D]iommu.c:921: iommu_fault_status: Primary Pending Fault
(XEN) [ 9.555906] [VT-D]DMAR:[DMA Read] Request device [0000:00:1a.0] fault addr 7a3f5000, iommu reg = ffff82c00021d000
(XEN) [ 9.558537] [VT-D]DMAR: reason 06 - PTE Read access is not set
(XEN) [ 9.559860] print_vtd_entries: iommu #1 dev 0000:00:1a.0 gmfn 7a3f5
(XEN) [ 9.561179] root_entry[00] = 107277c001
(XEN) [ 9.562447] context[d0] = 2_1072c06001
(XEN) [ 9.563776] l4[000] = 9c0000202f171107
(XEN) [ 9.565125] l3[001] = 9c0000202f152107
(XEN) [ 9.566483] l2[1d1] = 9c000010727ce107
(XEN) [ 9.567821] l1[1f5] = 8000000000000000
(XEN) [ 9.569168] l1[1f5] not present
(XEN) [ 9.570502] [VT-D]DMAR:[DMA Read] Request device [0000:00:1d.0] fault addr 7a3f4000, iommu reg = ffff82c00021d000
(XEN) [ 9.573147] [VT-D]DMAR: reason 06 - PTE Read access is not set
(XEN) [ 9.574488] print_vtd_entries: iommu #1 dev 0000:00:1d.0 gmfn 7a3f4
(XEN) [ 9.575819] root_entry[00] = 107277c001
(XEN) [ 9.577129] context[e8] = 2_1072c06001
(XEN) [ 9.578439] l4[000] = 9c0000202f171107
(XEN) [ 9.579778] l3[001] = 9c0000202f152107
(XEN) [ 9.581111] l2[1d1] = 9c000010727ce107
(XEN) [ 9.582482] l1[1f4] = 8000000000000000
(XEN) [ 9.583812] l1[1f4] not present
(XEN) [ 10.520172] Unable to find XEN_ELFNOTE_PHYS32_ENTRY address
(XEN) [ 10.521499] Failed to load Dom0 kernel
(XEN) [ 10.532171]
(XEN) [ 10.535464] ****************************************
(XEN) [ 10.542636] Panic on CPU 0:
(XEN) [ 10.547394] Could not set up DOM0 guest OS
(XEN) [ 10.553605] ****************************************
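As a cross-check of the print_vtd_entries output above, here is a minimal
sketch (not Xen code) of how a fault address and a device split into the
indices shown in the dump. It assumes 4 KiB pages with 9 index bits per
level, a context-table index built from the device and function numbers,
and read/write permission in bits 0 and 1 of a second-level entry. The
macro and function names are hypothetical, chosen for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: 4 KiB pages, 9 index bits per page-table level. */
#define PAGE_SHIFT   12
#define LEVEL_STRIDE 9
#define LEVEL_MASK   ((1u << LEVEL_STRIDE) - 1)

/* Assumed permission bits of a second-level entry: bit 0 read, bit 1 write. */
#define PTE_READ  (1ull << 0)
#define PTE_WRITE (1ull << 1)

/* Index into the table at the given level (1 = leaf table, 4 = top table). */
static unsigned int table_index(uint64_t addr, unsigned int level)
{
    return (addr >> (PAGE_SHIFT + (level - 1) * LEVEL_STRIDE)) & LEVEL_MASK;
}

/* Context-table index of a device: 5 bits of device, 3 bits of function. */
static unsigned int devfn(unsigned int dev, unsigned int fn)
{
    return ((dev & 0x1f) << 3) | (fn & 0x7);
}

/* An entry with neither read nor write permission counts as not present. */
static bool pte_present(uint64_t pte)
{
    return (pte & (PTE_READ | PTE_WRITE)) != 0;
}
```

With these assumptions, fault address 7a3f5000 decodes to l4[000],
l3[001], l2[1d1] and l1[1f5], and device 00:1a.0 maps to context[d0],
matching the dump. The leaf value 8000000000000000 has both permission
bits clear, which is consistent with "PTE Read access is not set".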
The addresses the devices failed to access (7a3f4000 and 7a3f5000) are
marked as reserved in the e820 map, but they are not covered by the
RMRRs that DMAR lists for those devices. So I think we can conclude that
some existing BIOSes don't expose correct RMRRs to the OS through DMAR,
and that we need a workaround such as iommu_inclusive_mapping to deal
with such BIOSes for both PV Dom0 and PVH Dom0.
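
To make the comparison concrete, here is a minimal sketch (not Xen code)
that checks the two fault addresses against the RMRRs and the e820
reserved entry quoted earlier in this mail. The struct and the helper
names, including needs_inclusive_mapping(), are hypothetical and only
illustrate the reasoning. The ranges are copied from the logs above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Address range with an inclusive end, as printed in the RMRR dump. */
struct range {
    uint64_t start;
    uint64_t end;
};

/* RMRRs reported by DMAR on the Haswell box. */
static const struct range rmrrs[] = {
    { 0x723b4000, 0x7a3f3fff },   /* 0000:05:00.0 */
    { 0x723ac000, 0x723aefff },   /* 0000:00:1d.0 and 0000:00:1a.0 */
};

/* e820 reserved entry around the faults; e820 ends are exclusive, hence -1. */
static const struct range e820_reserved = { 0x6ff84000, 0x7ac51000 - 1 };

static bool in_range(const struct range *r, uint64_t addr)
{
    return addr >= r->start && addr <= r->end;
}

/* True if any of the listed RMRRs covers addr. */
static bool covered_by_rmrr(uint64_t addr)
{
    for (size_t i = 0; i < sizeof(rmrrs) / sizeof(rmrrs[0]); i++)
        if (in_range(&rmrrs[i], addr))
            return true;
    return false;
}

/*
 * Reserved in e820 but in no RMRR: the device can only reach such an
 * address if reserved regions are mapped by some other means.
 */
static bool needs_inclusive_mapping(uint64_t addr)
{
    return in_range(&e820_reserved, addr) && !covered_by_rmrr(addr);
}
```

Both 7a3f4000 and 7a3f5000 lie just past the end of the first RMRR
(7a3f3fff) and inside the reserved e820 entry, so
needs_inclusive_mapping() returns true for both.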

As for the machine hang Roger observed, I have no idea what causes it.
Roger, have you ever seen VT-d on that machine report a DMA translation
fault? If not, can you trigger one when running natively (without Xen)?
I think that would tell us whether the hardware's fault reporting works
properly or whether there is a bug in the Xen code. What do you think
about trying this?

Thanks
chao
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel