From: "Zhang Haoyu" <zhanghy@sangfor.com>
To: "Zhang Haoyu" <zhanghy@sangfor.com>,
"Alex Williamson" <alex.williamson@redhat.com>
Cc: bhelgaas <bhelgaas@google.com>,
"xudong.hao" <xudong.hao@intel.com>,
"donald.d.dugger" <donald.d.dugger@intel.com>,
qemu-devel <qemu-devel@nongnu.org>, kvm <kvm@vger.kernel.org>
Subject: Re: [questions] about using vfio to assign sr-iov vf to vm
Date: Mon, 18 Aug 2014 17:49:01 +0800
Message-ID: <201408181748585961259@sangfor.com>
In-Reply-To: 201408181646457989269@sangfor.com
>>> >> >> Hi, all
>>> >> >> I'm using VFIO to assign Intel 82599 VFs to VMs, and I've run into a problem:
>>> >> >> the 82599 PF and its VFs belong to the same iommu_group, but I want to assign some VFs to one VM and other VFs to another VM.
>>> >> >> How can I unbind only (some of) the VFs, but not the PF?
>>> >> >> I read the kernel doc vfio.txt, and I'm not sure whether I should unbind all of the devices that belong to one iommu_group.
>>> >> >> If so, because the PF and its VFs belong to the same iommu_group, if I unbind the PF, its VFs also disappear.
>>> >> >> I think I misunderstand something.
>>> >> >> Any advice?
>>> >> >
>>> >> >This occurs when the PF is installed behind components in the system
>>> >> >that do not support PCIe Access Control Services (ACS). The IOMMU group
>>> >> >contains both the PF and the VF because upstream transactions can be
>>> >> >re-routed downstream by these non-ACS components before being translated
>>> >> >by the IOMMU. Please provide 'sudo lspci -vvv', 'lspci -n', and kernel
>>> >> >version and we might be able to give you some advice on how to work
>>> >> >around the problem. Thanks,
>>> >> >
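To make the grouping described above concrete, here is a minimal sketch that lists which PCI devices share each IOMMU group by reading /sys/kernel/iommu_groups, the sysfs layout described in the kernel's vfio documentation. The function names (iommu_groups, group_of) and the root parameter are illustrative assumptions, not part of any existing tool.

```python
import os


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted device addresses it contains.

    `root` is a parameter (an assumption of this sketch) so the function can
    be pointed at a copy of the tree; it returns {} if the directory is absent.
    """
    groups = {}
    if not os.path.isdir(root):
        return groups
    for name in os.listdir(root):
        devices_dir = os.path.join(root, name, "devices")
        # Group directories are named by number; skip anything else.
        if not name.isdigit() or not os.path.isdir(devices_dir):
            continue
        groups[int(name)] = sorted(os.listdir(devices_dir))
    return groups


def group_of(address, groups):
    """Return the group number that holds `address`, or None if not found."""
    for number, members in groups.items():
        if address in members:
            return number
    return None
```

If group_of() returns the same number for the PF and for every VF, they can only be handed to VFIO together, which matches the situation described in the original question.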
>>> >> The Intel 82599 (02:00.0 and 02:00.1) is behind the PCI bridge at 00:01.1.
>>> >> Does the 00:01.1 PCI bridge support ACS?
>>> >
>>> >It does not and that's exactly the problem. We must assume that the
>>> >root port can redirect a transaction from a subordinate device back to
>>> >another subordinate device without IOMMU translation when ACS support is
>>> >not present. If you had a device plugged in below 00:01.0, we'd also
>>> >need to assume that non-IOMMU translated peer-to-peer between devices
>>> >behind either function, 00:01.0 or 00:01.1, is possible.
>>> >
>>> >Intel has indicated that processor root ports for all Xeon class
>>> >processors should support ACS and have verified isolation for PCH based
>>> >root ports allowing us to support quirks in place of ACS support. I'm
>>> >not aware of any efforts at Intel to verify isolation capabilities of
>>> >root ports on client processors. They are however aware that lack of
>>> >ACS is a limiting factor for usability of VT-d, and I hope that we'll
>>> >see future products with ACS support.
>>> >
>>> >Chances are good that the PCH root port at 00:1c.0 is supported by an
>>> >ACS quirk, but it seems that your system has a PCIe switch below the
>>> >root port. If the PCIe switch downstream ports support ACS, then you
>>> >may be able to move the 82599 to the empty slot at bus 07 to separate
>>> >the VFs into different IOMMU groups. Thanks,
>>> >
>>> Thanks, Alex,
>>> How can I tell whether a PCI bridge/device supports the ACS capability?
>>>
>>> I ran "lspci -vvv -s 00:1c.0 | grep -i ACS", and nothing matched.
>>> # lspci -vvv -s 00:1c.0
>>> 00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 (rev b5) (prog-if 00 [Normal decode])
>>
>>
>>Ideally there would be capabilities for it, something like:
>>
>>Capabilities [xxx] Access Control Services...
>>
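As a rough illustration of the check described above, the sketch below scans saved 'lspci -vvv' text for a capability line naming Access Control Services. The function name has_acs_capability is hypothetical, and the matching is deliberately loose because the exact capability offset and version shown by lspci vary by device. Extended capabilities are generally only readable by root, so the text should come from 'sudo lspci -vvv'.

```python
def has_acs_capability(lspci_text):
    """Return True if any capability line in `lspci -vvv` output names ACS.

    Only lines beginning with "Capabilities" are considered, so unrelated
    text that happens to contain the letters "ACS" is ignored.
    """
    for line in lspci_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Capabilities") and "Access Control Services" in stripped:
            return True
    return False
```

A False result only means the device does not advertise a standard ACS capability; it says nothing about whether a kernel quirk provides equivalent isolation.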
>>But, Intel failed to provide this, so we enable "effective" ACS
>>capabilities via a quirk:
>>
>>drivers/pci/quirks.c:
>>/*
>> * Many Intel PCH root ports do provide ACS-like features to disable peer
>> * transactions and validate bus numbers in requests, but do not provide an
>> * actual PCIe ACS capability. This is the list of device IDs known to fall
>> * into that category as provided by Intel in Red Hat bugzilla 1037684.
>> */
>>static const u16 pci_quirk_intel_pch_acs_ids[] = {
>> /* Ibexpeak PCH */
>> 0x3b42, 0x3b43, 0x3b44, 0x3b45, 0x3b46, 0x3b47, 0x3b48, 0x3b49,
>> 0x3b4a, 0x3b4b, 0x3b4c, 0x3b4d, 0x3b4e, 0x3b4f, 0x3b50, 0x3b51,
>> /* Cougarpoint PCH */
>> 0x1c10, 0x1c11, 0x1c12, 0x1c13, 0x1c14, 0x1c15, 0x1c16, 0x1c17,
>> 0x1c18, 0x1c19, 0x1c1a, 0x1c1b, 0x1c1c, 0x1c1d, 0x1c1e, 0x1c1f,
>> /* Pantherpoint PCH */
>> 0x1e10, 0x1e11, 0x1e12, 0x1e13, 0x1e14, 0x1e15, 0x1e16, 0x1e17,
>> 0x1e18, 0x1e19, 0x1e1a, 0x1e1b, 0x1e1c, 0x1e1d, 0x1e1e, 0x1e1f,
>> /* Lynxpoint-H PCH */
>> 0x8c10, 0x8c11, 0x8c12, 0x8c13, 0x8c14, 0x8c15, 0x8c16, 0x8c17,
>> 0x8c18, 0x8c19, 0x8c1a, 0x8c1b, 0x8c1c, 0x8c1d, 0x8c1e, 0x8c1f,
>> /* Lynxpoint-LP PCH */
>> 0x9c10, 0x9c11, 0x9c12, 0x9c13, 0x9c14, 0x9c15, 0x9c16, 0x9c17,
>> 0x9c18, 0x9c19, 0x9c1a, 0x9c1b,
>> /* Wildcat PCH */
>> 0x9c90, 0x9c91, 0x9c92, 0x9c93, 0x9c94, 0x9c95, 0x9c96, 0x9c97,
>> 0x9c98, 0x9c99, 0x9c9a, 0x9c9b,
>> /* Patsburg (X79) PCH */
>> 0x1d10, 0x1d12, 0x1d14, 0x1d16, 0x1d18, 0x1d1a, 0x1d1c, 0x1d1e,
>>};
>>
>>Hopefully if you run 'lspci -n', you'll see your device ID listed among
>>these. We don't currently have any quirks for PCIe switches, so if your
>>IOMMU group is still bigger than it should be, that may be the reason.
>>Thanks,
>>
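The lookup suggested above can be sketched as follows: parse 'lspci -n' output and report which Intel devices carry an ID from the quoted pci_quirk_intel_pch_acs_ids[] list. The names PCH_ACS_QUIRK_IDS and quirked_devices are illustrative, and the parser assumes the usual 'slot class: vendor:device' layout of 'lspci -n'. This checks IDs only; it does not reproduce any other condition the kernel quirk may apply.

```python
import re

INTEL_VENDOR_ID = 0x8086

# Device IDs transcribed from the pci_quirk_intel_pch_acs_ids[] list above.
PCH_ACS_QUIRK_IDS = (
    set(range(0x3b42, 0x3b52))        # Ibexpeak PCH
    | set(range(0x1c10, 0x1c20))      # Cougarpoint PCH
    | set(range(0x1e10, 0x1e20))      # Pantherpoint PCH
    | set(range(0x8c10, 0x8c20))      # Lynxpoint-H PCH
    | set(range(0x9c10, 0x9c1c))      # Lynxpoint-LP PCH
    | set(range(0x9c90, 0x9c9c))      # Wildcat PCH
    | set(range(0x1d10, 0x1d20, 2))   # Patsburg (X79) PCH, even IDs only
)

# Matches lines such as "00:1c.0 0604: 8086:1c10 (rev b5)".
_LSPCI_N_LINE = re.compile(
    r"^(\S+)\s+[0-9a-f]{4}:\s+([0-9a-f]{4}):([0-9a-f]{4})", re.IGNORECASE
)


def quirked_devices(lspci_n_text):
    """Return the slots whose vendor:device ID appears in the quirk list."""
    slots = []
    for line in lspci_n_text.splitlines():
        match = _LSPCI_N_LINE.match(line.strip())
        if not match:
            continue
        slot, vendor, device = match.group(1), int(match.group(2), 16), int(match.group(3), 16)
        if vendor == INTEL_VENDOR_ID and device in PCH_ACS_QUIRK_IDS:
            slots.append(slot)
    return slots
```

Feeding it the saved output of 'lspci -n' gives the root ports that this particular list would cover; anything else in the path, such as a PCIe switch, still needs its own ACS support.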
>Using device-specific mechanisms to enable and verify ACS-like capabilities is okay,
>but what should we do about devices that do not support ACS-like capabilities at all?
>How about applying "[PATCH] pci: Enable overrides for missing ACS capabilities"?
>And how can we reduce the risk of data corruption and information leakage between VMs?
>
Is there any update since http://thread.gmane.org/gmane.comp.emulators.kvm.devel/110726/focus=111515 ?
>Thanks,
>Zhang Haoyu
>>Alex
Thread overview: 9+ messages
2014-08-14 8:22 [questions] about using vfio to assign sr-iov vf to vm Zhang Haoyu
2014-08-14 12:44 ` Alex Williamson
2014-08-16 6:48 ` Zhang Haoyu
2014-08-16 13:29 ` Alex Williamson
2014-08-18 1:00 ` Zhang Haoyu
2014-08-18 1:14 ` Alex Williamson
2014-08-18 8:46 ` Zhang Haoyu
2014-08-18 9:49 ` Zhang Haoyu [this message]
2014-08-18 12:53 ` Alex Williamson