* The same IOMMU group for igb and its igbvf siblings
@ 2016-07-09 19:16 Sebastian Andrzej Siewior
2016-07-09 19:44 ` Alex Williamson
From: Sebastian Andrzej Siewior @ 2016-07-09 19:16 UTC
To: linux-pci, vfio-users; +Cc: Alex Williamson, tglx
Hi,
I am trying to use SR-IOV on an IGB card with PCI ID 8086:1521. After
| echo 7 > /sys/devices/pci0000:00/0000:00:01.1/0000:02:00.0/sriov_numvfs
the PFs and all VFs end up in the same IOMMU group:
|# find /sys/kernel/iommu_groups/ -type l|grep /1/
|/sys/kernel/iommu_groups/1/devices/0000:00:01.0
|/sys/kernel/iommu_groups/1/devices/0000:00:01.1
|/sys/kernel/iommu_groups/1/devices/0000:02:00.0
|/sys/kernel/iommu_groups/1/devices/0000:02:00.1
|/sys/kernel/iommu_groups/1/devices/0000:03:10.0
|/sys/kernel/iommu_groups/1/devices/0000:03:10.4
|/sys/kernel/iommu_groups/1/devices/0000:03:11.0
|/sys/kernel/iommu_groups/1/devices/0000:03:11.4
|/sys/kernel/iommu_groups/1/devices/0000:03:12.0
|/sys/kernel/iommu_groups/1/devices/0000:03:12.4
|/sys/kernel/iommu_groups/1/devices/0000:03:13.0
lspci -t
|-[0000:00]-+-00.0
| +-01.0-[01]--
| +-01.1-[02-03]--+-[0000:03]-+-10.0
| | | +-10.4
| | | +-11.0
| | | +-11.4
| | | +-12.0
| | | +-12.4
| | | \-13.0
| | \-[0000:02]-+-00.0
| | \-00.1
lspci for those devices:
|00:00.0 Host bridge: Intel Corporation Device 1918 (rev 07)
|00:01.0 PCI bridge: Intel Corporation Device 1901 (rev 07)
|00:01.1 PCI bridge: Intel Corporation Device 1905 (rev 07)
|02:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
|02:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
|03:10.0 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
|03:10.4 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
|03:11.0 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
|03:11.4 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
|03:12.0 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
|03:12.4 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
|03:13.0 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
and qemu won't pass the virtual-function NICs to a guest. Shouldn't each
VF device be in its own IOMMU group?
From the ACS capabilities I see:
|00:00.0 Host bridge: Intel Corporation Device 1918 (rev 07)
| Subsystem: Super Micro Computer Inc Device 0909
| Flags: bus master, fast devsel, latency 0
| Capabilities: [e0] Vendor Specific Information: Len=10 <?>
|
|00:01.0 PCI bridge: Intel Corporation Device 1901 (rev 07) (prog-if 00 [Normal decode])
| Flags: bus master, fast devsel, latency 0
| Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
| Capabilities: [88] Subsystem: Super Micro Computer Inc Device 0909
| Capabilities: [80] Power Management version 3
| Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
| Capabilities: [a0] Express Root Port (Slot+), MSI 00
| Capabilities: [100] Virtual Channel
| Capabilities: [140] Root Complex Link
| Kernel driver in use: pcieport
|
|00:01.1 PCI bridge: Intel Corporation Device 1905 (rev 07) (prog-if 00 [Normal decode])
| Flags: bus master, fast devsel, latency 0
| Bus: primary=00, secondary=02, subordinate=03, sec-latency=0
| I/O behind bridge: 0000e000-0000efff
| Memory behind bridge: df100000-df3fffff
| Capabilities: [88] Subsystem: Super Micro Computer Inc Device 0909
| Capabilities: [80] Power Management version 3
| Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
| Capabilities: [a0] Express Root Port (Slot+), MSI 00
| Capabilities: [100] Virtual Channel
| Capabilities: [140] Root Complex Link
| Capabilities: [d94] #19
| Kernel driver in use: pcieport
|02:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
| Subsystem: Super Micro Computer Inc Device 0652
|…
| Capabilities: [1d0 v1] Access Control Services
| ACSCap: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
| ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
| Kernel driver in use: igb
|
|03:10.0 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
| Subsystem: Super Micro Computer Inc Device 0652
| Flags: fast devsel
|…
| Capabilities: [1d0 v1] Access Control Services
| ACSCap: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
| ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
I *think* the problem is that the root port lacks ACS caps. Could this
be the problem? If so, do I need to wait for a BIOS update or is there
another option?
I tried v4.7-rc6. I noticed that the IGB device is part of the quirk
table in pci_dev_acs_enabled but somehow it is not used.
Any suggestions?
Sebastian
* Re: The same IOMMU group for igb and its igbvf siblings
2016-07-09 19:16 The same IOMMU group for igb and its igbvf siblings Sebastian Andrzej Siewior
@ 2016-07-09 19:44 ` Alex Williamson
2016-07-09 20:01 ` Sebastian Andrzej Siewior
From: Alex Williamson @ 2016-07-09 19:44 UTC
To: Sebastian Andrzej Siewior; +Cc: linux-pci, vfio-users, tglx
On Sat, 9 Jul 2016 21:16:00 +0200
Sebastian Andrzej Siewior <sebastian@breakpoint.cc> wrote:
> Hi,
>
> I am trying to use SR-IOV on an IGB card with PCI ID 8086:1521. After
> | echo 7 > /sys/devices/pci0000:00/0000:00:01.1/0000:02:00.0/sriov_numvfs
> I have them all on one iommu group:
>
> |# find /sys/kernel/iommu_groups/ -type l|grep /1/
> |/sys/kernel/iommu_groups/1/devices/0000:00:01.0
> |/sys/kernel/iommu_groups/1/devices/0000:00:01.1
> |/sys/kernel/iommu_groups/1/devices/0000:02:00.0
> |/sys/kernel/iommu_groups/1/devices/0000:02:00.1
> |/sys/kernel/iommu_groups/1/devices/0000:03:10.0
> |/sys/kernel/iommu_groups/1/devices/0000:03:10.4
> |/sys/kernel/iommu_groups/1/devices/0000:03:11.0
> |/sys/kernel/iommu_groups/1/devices/0000:03:11.4
> |/sys/kernel/iommu_groups/1/devices/0000:03:12.0
> |/sys/kernel/iommu_groups/1/devices/0000:03:12.4
> |/sys/kernel/iommu_groups/1/devices/0000:03:13.0
>
> lspci -t
> |-[0000:00]-+-00.0
> | +-01.0-[01]--
> | +-01.1-[02-03]--+-[0000:03]-+-10.0
> | | | +-10.4
> | | | +-11.0
> | | | +-11.4
> | | | +-12.0
> | | | +-12.4
> | | | \-13.0
> | | \-[0000:02]-+-00.0
> | | \-00.1
>
> lspci for those devices:
> |00:00.0 Host bridge: Intel Corporation Device 1918 (rev 07)
> |00:01.0 PCI bridge: Intel Corporation Device 1901 (rev 07)
> |00:01.1 PCI bridge: Intel Corporation Device 1905 (rev 07)
> |02:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
> |02:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
> |03:10.0 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
> |03:10.4 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
> |03:11.0 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
> |03:11.4 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
> |03:12.0 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
> |03:12.4 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
> |03:13.0 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
>
> and qemu won't pass the virtual-function NICs to a guest. Shouldn't each
> VF device be in its own IOMMU group?
>
> From the ACS capabilities I see:
> |00:00.0 Host bridge: Intel Corporation Device 1918 (rev 07)
> | Subsystem: Super Micro Computer Inc Device 0909
> | Flags: bus master, fast devsel, latency 0
> | Capabilities: [e0] Vendor Specific Information: Len=10 <?>
> |
> |00:01.0 PCI bridge: Intel Corporation Device 1901 (rev 07) (prog-if 00 [Normal decode])
> | Flags: bus master, fast devsel, latency 0
> | Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
> | Capabilities: [88] Subsystem: Super Micro Computer Inc Device 0909
> | Capabilities: [80] Power Management version 3
> | Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
> | Capabilities: [a0] Express Root Port (Slot+), MSI 00
> | Capabilities: [100] Virtual Channel
> | Capabilities: [140] Root Complex Link
> | Kernel driver in use: pcieport
> |
> |00:01.1 PCI bridge: Intel Corporation Device 1905 (rev 07) (prog-if 00 [Normal decode])
> | Flags: bus master, fast devsel, latency 0
> | Bus: primary=00, secondary=02, subordinate=03, sec-latency=0
> | I/O behind bridge: 0000e000-0000efff
> | Memory behind bridge: df100000-df3fffff
> | Capabilities: [88] Subsystem: Super Micro Computer Inc Device 0909
> | Capabilities: [80] Power Management version 3
> | Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
> | Capabilities: [a0] Express Root Port (Slot+), MSI 00
> | Capabilities: [100] Virtual Channel
> | Capabilities: [140] Root Complex Link
> | Capabilities: [d94] #19
> | Kernel driver in use: pcieport
> |02:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
> | Subsystem: Super Micro Computer Inc Device 0652
> |…
> | Capabilities: [1d0 v1] Access Control Services
> | ACSCap: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
> | ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
> | Kernel driver in use: igb
> |
> |03:10.0 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
> | Subsystem: Super Micro Computer Inc Device 0652
> | Flags: fast devsel
> |…
> | Capabilities: [1d0 v1] Access Control Services
> | ACSCap: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
> | ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
>
> I *think* the problem is that the root port lacks ACS caps. Could this
> be the problem? If so, do I need to wait for a BIOS update or is there
> another option?
> I tried v4.7-rc6. I noticed that the IGB device is part of the quirk
> table in pci_dev_acs_enabled but somehow it is not used.
> Any suggestions?
The root port device IDs translate to a Skylake platform, which is a
"client" processor. Core-i3/5/7 and even Xeon E3 fit into this
category and they do not support ACS on the processor root ports. This
groups everything downstream of those root ports together and even
binds together separate sub-hierarchies when the root ports are joined
in a multifunction slot. Without ACS we cannot guarantee that
peer-to-peer DMA does not occur through redirection prior to IOMMU
translation.
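
For readers who want to see how devices were grouped on their own machine, here is a minimal sketch that prints each IOMMU group with its member devices. It only assumes the standard /sys/kernel/iommu_groups layout shown in the original report; the function name list_iommu_groups and its optional root-directory argument are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Print "group N: <PCI address>" for every device in every IOMMU group.
# The optional first argument (hypothetical, for experiments) overrides
# the sysfs directory that is scanned.
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        # Skip the unexpanded glob when no groups exist.
        [ -e "$dev" ] || [ -L "$dev" ] || continue
        group="${dev%/devices/*}"   # strip "/devices/<address>"
        group="${group##*/}"        # keep only the group number
        printf 'group %s: %s\n' "$group" "${dev##*/}"
    done
}

list_iommu_groups
```

If a VF shares its group number with the PFs and the root ports, as in the report above, it cannot be assigned to a guest on its own.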
The easiest solution is to move the card to one of the PCH sourced root
ports (i.e. downstream of root ports at 00:1c.*). As of kernel v4.7-rc1
we have quirks for the Sunrise Point PCH to work around the botched
implementation of ACS found in this chipset. Pretty much all Intel
client processors have the same story, no ACS in the processor root
ports, quirks to enable ACS in the PCH root ports. Xeon E5 and higher
as well as "High End Desktop Processors" (based on E5) support ACS
correctly (though the PCH root ports need and already have quirks for
ACS).
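
To find out which root port a card currently sits behind, and whether that port advertises an ACS capability, something like the following sketch may help. It assumes that the resolved sysfs path of a device lists its upstream bridges in order (as in the sriov_numvfs path in the original report) and that lspci is run as root so extended capabilities are visible. The function names root_port_of and check_acs are hypothetical. Note that a missing capability in the lspci output does not account for kernel quirks such as the PCH ones mentioned above, so the resulting IOMMU groups remain the authoritative answer.

```shell
#!/bin/sh
# Given a resolved sysfs device path such as
#   /sys/devices/pci0000:00/0000:00:01.1/0000:02:00.0
# print the first device below the host bridge (the root port).
root_port_of() {
    rest="${1#*/pci*/}"          # drop everything up to "pciDDDD:BB/"
    printf '%s\n' "${rest%%/*}"  # keep the first remaining component
}

# Report whether the root port above PCI device $1 (e.g. 0000:02:00.0)
# shows an "Access Control Services" capability in lspci output.
check_acs() {
    path=$(readlink -f "/sys/bus/pci/devices/$1") || return 2
    port=$(root_port_of "$path")
    if lspci -vvv -s "$port" | grep -q 'Access Control Services'; then
        echo "$port: ACS capability present"
    else
        echo "$port: no ACS capability visible"
    fi
}
```

Running `check_acs 0000:02:00.0` as root on the system from the report would be expected to inspect 00:01.1, while a card in a PCH slot would resolve to a port at 00:1c.*.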
There exists a non-upstream patch to override ACS, which does nothing
to solve the isolation problem; it just allows you to gamble with data
integrity, which is why it really has no place upstream. The IGB
devices you note in pci_dev_acs_enabled are quirks for the IGB PFs.
Intel has confirmed that there is isolation between the PFs, so when
installed into a topology that does have ACS support, this allows the PFs
to be put into separate groups. Since the point at which your system
lacks isolation is upstream of the PFs, this doesn't help you. Thanks,
Alex
* Re: The same IOMMU group for igb and its igbvf siblings
2016-07-09 19:44 ` Alex Williamson
@ 2016-07-09 20:01 ` Sebastian Andrzej Siewior
From: Sebastian Andrzej Siewior @ 2016-07-09 20:01 UTC
To: Alex Williamson; +Cc: linux-pci, vfio-users, tglx
On 2016-07-09 13:44:03 [-0600], Alex Williamson wrote:
> The root port device IDs translate to a Skylake platform, which is a
> "client" processor. Core-i3/5/7 and even Xeon E3 fit into this
It is an E3-1230 v5
> category and they do not support ACS on the processor root ports. This
> groups everything downstream of those root ports together and even
> binds together separate sub-hierarchies when the root ports are joined
> in a multifunction slot. Without ACS we cannot guarantee that
> peer-to-peer DMA does not occur through redirection prior to IOMMU
> translation.
So it is not a missing BIOS knob but a missing CPU feature.
> The easiest solution is to move the card to one of the PCH sourced root
> ports (ie. downstream of root ports at 00:1c.*). As of kernel v4.7-rc1
> we have quirks for the Sunrise Point PCH to work around the botched
> implementation of ACS found in this chipset. Pretty much all Intel
> client processors have the same story, no ACS in the processor root
> ports, quirks to enable ACS in the PCH root ports. Xeon E5 and higher
> as well as "High End Desktop Processors" (based on E5) support ACS
> correctly (though the PCH root ports need and already have quirks for
> ACS).
Bah. Not sure if another slot is possible / available, but thanks for the
hint.
> There exists a non-upstream patch to override ACS, which does nothing
> to solve the isolation problem, it just allows you to gamble with data
> integrity, which is why it really has no place upstream. The IGB
> devices you note in pci_dev_acs_enabled are quirks for the IGB PFs.
> Intel has confirmed that there is isolation between the PFs, so when
> installed into topology that does have ACS support, this allows the PFs
> to be put into separate groups. Since the point at which your system
> lacks isolation is upstream of the PFs, this doesn't help you. Thanks,
Thank you for the explanation.
> Alex
Sebastian