From: Gordan Bobic <gordan@bobich.net>
To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: "Zhang, Yang Z" <yang.z.zhang@intel.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	Jan Beulich <JBeulich@suse.com>
Subject: Re: Multi-bridged PCIe devices (Was: Re: iommuu/vt-d issues with LSI MegaSAS (PERC5i))
Date: Wed, 11 Dec 2013 21:15:17 +0000
Message-ID: <52A8D5E5.2030902@bobich.net>
In-Reply-To: <20131211183233.GA2760@phenom.dumpdata.com>

On 12/11/2013 06:32 PM, Konrad Rzeszutek Wilk wrote:
> On Thu, Sep 12, 2013 at 06:20:18AM +0000, Zhang, Yang Z wrote:
>> Jan Beulich wrote on 2013-09-11:
>>>>>> On 11.09.13 at 15:26, Gordan Bobic <gordan@bobich.net> wrote:
>>>> On Wed, 11 Sep 2013 14:22:51 +0100, "Jan Beulich"
>>>> <JBeulich@suse.com>
>>>>   wrote:
>>>>>>>> On 11.09.13 at 15:10, Gordan Bobic <gordan@bobich.net> wrote:
>>>>>> On Wed, 11 Sep 2013 14:03:14 +0100, "Jan Beulich"
>>>>>> <JBeulich@suse.com>
>>>>>>   wrote:
>>>>>>>>>> On 11.09.13 at 14:45, Gordan Bobic <gordan@bobich.net> wrote:
>>>>>>>>   dmesg, xl dmesg, lspci -vvvnn and lspci -tvnn output is attached.
>>>>>>>>
>>>>>>>>   I'll try adding one of my LSI cards and see the comparative
>>>>>>>> behaviour. Right now I don't even know if the phantom device is
>>>>>>>> on the SAS card or the motherboard.
>>>>>>>
>>>>>>> The Adaptec card being the only thing on bus 0f makes it pretty
>>>>>>> likely that this other device also is on that card.
>>>>>>>
>>>>>>> I guess the issue is mainly because the device itself is a PCI
>>>>>>> one, while the immediately upstream bridge (where I mean only the
>>>>>>> visible one) is PCIe. There _must_ be a PCIe-PCI bridge between
>>>>>>> them. And as long as firmware doesn't know about that bridge and
>>>>>>> the bridge doesn't properly handle config space accesses to it,
>>>>>>> such a device just can't be used with an IOMMU (without some yet
>>>>>>> to be invented workaround).
>>>>>>>
>>>>>>   I'm actually thinking about Konrad's proposed hack in that
>>>>>> thread from 3 years ago. If the device IDs are parameterized out
>>>>>> rather than hard-coded, then this could work in nearly the same
>>>>>> way as xen-pciback in terms of usage. Pass the phantom device IDs
>>>>>> as parameters to the module. Done that way it might even be
>>>>>> considered clean enough to be fit for public consumption.
>>>>>
>>>>> Except that, short of being able to determine it via config space
>>>>> reads, we also need the resulting command line option to tell us
>>>>> what kind of device that is.
>>>>>
>>>>   Not sure I follow. Why do we need to know the device type?
>>>
>>> Just look at set_msi_source_id() as well as
>>> domain_context_{mapping,unmap}() (just the most prominent
>>> examples): Behavior here heavily depends on the type of the device
>>> itself _and_ that of the upstream bridge(s).
>> It looks like many devices fail to work. I wonder whether the
>> PCI/PCIe specification says how to detect the hidden device behind
>> those devices (like detection of a phantom device). If not, I think
>> those devices are buggy, or we can say those devices are not really
>> PCI/PCIe compatible. Since VT-d only covers PCI/PCIe devices, it's
>> reasonable that a non-PCI/PCIe device fails to work under VT-d.
>>
>> As Jan suggested, we need the user to tell us whether there is a
>> hidden device or BDF behind another device that the OS is unaware of.
>> We need to pass that info to Xen before passing the device through.
>>
>
> Interestingly enough I just hit this with my brand-new Haswell CPU and
> new motherboard when passing in a capture card. It shows:
>
>      +-1c.5-[07-09]----00.0-[08-09]--+-01.0-[09]--+-08.0  Brooktree Corporation Bt878 Video Capture
>             |                               |            +-08.1  Brooktree Corporation Bt878 Audio Capture
>             |                               |            +-09.0  Brooktree Corporation Bt878 Video Capture
>             |                               |            +-09.1  Brooktree Corporation Bt878 Audio Capture
>             |                               |            +-0a.0  Brooktree Corporation Bt878 Video Capture
>             |                               |            +-0a.1  Brooktree Corporation Bt878 Audio Capture
>             |                               |            +-0b.0  Brooktree Corporation Bt878 Video Capture
>             |                               |            \-0b.1  Brooktree Corporation Bt878 Audio Capture
>             |                               \-03.0  Texas Instruments TSB43AB22A IEEE-1394a-2000 Controller (PHY/Link) [iOHCI-Lynx]
>
> And Xen says:
> (XEN) [VT-D]iommu.c:885: iommu_fault_status: Fault Overflow
> (XEN) [VT-D]iommu.c:887: iommu_fault_status: Primary Pending Fault
> (XEN) [VT-D]iommu.c:865: DMAR:[DMA Read] Request device [0000:08:00.0] fault addr 36aa3000, iommu reg = ffff82c3ffd53000
> (XEN) DMAR:[fault reason 02h] Present bit in context entry is clear
> (XEN) print_vtd_entries: iommu ffff83083d4939b0 dev 0000:08:00.0 gmfn 36aa3
> (XEN)     root_entry = ffff83083d47e000
> (XEN)     root_entry[8] = 72569a001
> (XEN)     context = ffff83072569a000
> (XEN)     context[0] = 0_0
> (XEN)     ctxt_entry[0] not present
> (XEN) [VT-D]iommu.c:885: iommu_fault_status: Fault Overflow
> (XEN) [VT-D]iommu.c:887: iommu_fault_status: Primary Pending Fault
> (XEN) [VT-D]iommu.c:865: DMAR:[DMA Read] Request device [0000:08:00.0] fault addr 36aa3000, iommu reg = ffff82c3ffd53000
>
>
> Oddly enough it was working fine in a box with an AMD IOMMU. But
> to be fair - that machine was running with Xen 4.1.
>
> The hack I developed: http://lists.xen.org/archives/html/xen-devel/2010-06/msg00093.html
> ends up with this:
>
> (XEN) alloc_pdev: unknown type: 0000:08:00.0
> (XEN) [VT-D]iommu.c:1484: d0:unknown(0): 0000:08:00.0
> (XEN) [VT-D]iommu.c:1888: d0: context mapping failed
>
> (FYI, this Xen 4.3.1)
>
> Let me retry on the AMD box with the same version of Xen.

I may be wrong, but this doesn't look like the same problem (phantom PCI 
device on the bus). Or am I missing something?

As far as I can tell, the original problem arose on cards that are
PCIe but based on a PCI-X chipset, i.e. with a PCIe-to-PCI-X bridge. Xen
wasn't the only thing affected in my case: a bare-metal Linux kernel
also had problems with intel_iommu=on in the kernel boot parameters.
It might be worth trying that with your card to see what happens. If
bare-metal Linux with intel_iommu=on works for your card, it's probably
not the same problem (of course, it could be similar or related).
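
For reference when comparing fault logs, here is a minimal sketch (my
own illustration, not Xen code; the function names are made up) of how
the requester ID in the fault you quoted maps to the table indices that
print_vtd_entries shows. The bus number selects the root entry and the
combined device/function number selects the context entry:

```python
def parse_bdf(bdf):
    """Split a 'SSSS:BB:DD.F' string (all fields hex) into integers."""
    segment, bus, devfn = bdf.split(":")
    dev, func = devfn.split(".")
    return int(segment, 16), int(bus, 16), int(dev, 16), int(func, 16)


def vtd_indices(bdf):
    """Return (root_index, context_index) for a DMA requester."""
    _segment, bus, dev, func = parse_bdf(bdf)
    devfn = (dev << 3) | func  # 5-bit device number, 3-bit function number
    return bus, devfn


# The faulting requester from the quoted log:
print(vtd_indices("0000:08:00.0"))  # (8, 0): root_entry[8], context[0]
```

That matches the root_entry[8] / context[0] lines in your output, where
the context entry is reported as not present.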

Out of interest, I noticed recently that there is a Xen parameter
"pci-phantom", but I haven't been able to find documentation for it. Can
you point me in the right direction? Does it, perchance, allow
specifying the PCI slot ID of a phantom device so that the IOMMU doesn't
freak out when a seemingly non-existent device starts trying to do DMA?
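
To illustrate the behaviour I am hoping for, here is a rough sketch. It
is purely hypothetical: the function name, the mapping tables and the
BDF values are invented for the example, and I am not claiming this is
how "pci-phantom" or any part of Xen works. The idea is simply that a
user-declared phantom ID resolves to the same owner as the real device:

```python
def owner_of(requester, assigned, phantom_map):
    """Return the domain that owns a DMA requester, or None.

    assigned    maps real device BDFs to the domain they are assigned to.
    phantom_map maps user-declared phantom BDFs to the real device BDF.
    None stands for "no context entry", i.e. the IOMMU would fault.
    """
    if requester in assigned:
        return assigned[requester]
    real = phantom_map.get(requester)
    if real is None:
        return None
    return assigned.get(real)


# Hypothetical setup: one passed-through card plus one declared phantom.
assigned = {"0000:0f:00.0": "domU"}
phantom_map = {"0000:0f:01.0": "0000:0f:00.0"}

print(owner_of("0000:0f:01.0", assigned, phantom_map))  # domU
print(owner_of("0000:0f:02.0", assigned, phantom_map))  # None
```

Anything not declared would still fault, which seems like the right
default.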

Gordan
