From: Don Dutile <ddutile@redhat.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
Joerg Roedel <joro@8bytes.org>,
"open list:INTEL IOMMU (VT-d)" <iommu@lists.linux-foundation.org>,
Andrew Cooks <acooks@gmail.com>,
David Woodhouse <dwmw2@infradead.org>
Subject: Re: [RFC PATCH v2 1/2] pci: Create PCIe requester ID interface
Date: Mon, 29 Jul 2013 17:03:55 -0400 [thread overview]
Message-ID: <51F6D8BB.4040901@redhat.com> (raw)
In-Reply-To: <1375113975.2642.208.camel@ul30vt.home>
On 07/29/2013 12:06 PM, Alex Williamson wrote:
> On Fri, 2013-07-26 at 15:54 -0600, Bjorn Helgaas wrote:
>> On Thu, Jul 25, 2013 at 11:56:56AM -0600, Alex Williamson wrote:
>>> On Wed, 2013-07-24 at 17:24 -0600, Bjorn Helgaas wrote:
>>>> On Wed, Jul 24, 2013 at 12:12:28PM -0600, Alex Williamson wrote:
>>>>> On Wed, 2013-07-24 at 10:47 -0600, Bjorn Helgaas wrote:
>>>>>> On Tue, Jul 23, 2013 at 5:21 PM, Alex Williamson
>>>>>> <alex.williamson@redhat.com> wrote:
>>>>>>> On Tue, 2013-07-23 at 16:35 -0600, Bjorn Helgaas wrote:
>>>>>>>> On Thu, Jul 11, 2013 at 03:03:27PM -0600, Alex Williamson wrote:
>>>>>> As the DMA
>>>>>> transaction propagates through the fabric, it may be tagged by bridges
>>>>>> with different requester IDs.
>>>>>>
>>>>>> The requester IDs are needed outside PCI (by IOMMU drivers), but I'm
>>>>>> not sure the intermediate pci_devs are.
>>>>>
>>>>> A u16 requester ID doesn't mean much on it's own though, it's not
>>>>> necessarily even unique. A requester ID associated with the context of
>>>>> a pci_dev is unique and gives us a reference point if we need to perform
>>>>> another operation on that requester ID.
>>>>
>>>> A u16 requester ID better mean something to an IOMMU -- it's all the
>>>> IOMMU can use to look up the correct mapping. That's why we have to
>>>> give the iterator something to define the scope to iterate over. The
>>>> same requester ID could mean something totally unrelated in a
>>>> different scope, e.g., below a different IOMMU.
>>>
>>> The point I'm trying to make is that a requester ID depends on it's
>>> context (minimally, the PCI segment). The caller can assume the context
>>> based on the calling parameters or we can provide context in the form of
>>> an associated pci_dev. I chose the latter path because I prefer
>>> explicit interfaces and it has some usefulness in the intel-iommu
>>> implementation.
>>>
>>> For instance, get_domain_for_dev() first looks to see if a pci_dev
>>> already has a domain. If it doesn't, we want to look to see if there's
>>> an upstream device that would use the same requester ID that already has
>>> a domain. If our get-requester-ID-for-device function returns only the
>>> requester ID, we don't know if that requester ID is the device we're
>>> searching from or some upstream device. Therefore we potentially do an
>>> unnecessary search for the domain.
>>>
>>> The other user is intel_iommu_add_device() where we're trying to build
>>> IOMMU groups. Visibility is the first requirement of an IOMMU group.
>>> If the IOMMU cannot distinguish between devices, they must be part of
>>> the same IOMMU group. Here we want to find the pci_dev that hosts the
>>> requester ID. I don't even know how we'd implement this with a function
>>> that only returned the requester ID. Perhaps we'd have to walk upstream
>>> from the device calling the get-requester-ID-for-device function at each
>>> step and noticing when it changed. That moves significant logic back
>>> into the caller code.
>>> ...
>>
>>>> I don't see the point of passing a "device closest to the requester
>>>> ID." What would the IOMMU do with that? As far as the IOMMU is
>>>> concerned, the requester ID could be an arbitrary number completely
>>>> unrelated to a pci_dev.
>>>
>>> Do you have an example of a case where a requester ID doesn't have some
>>> association to a pci_dev?
>>
>> I think our confusion here is the same as what Don& I have been
>> hashing out -- I'm saying a requester ID fabricated by a bridge
>> need not correspond to a specific pci_dev, and you probably mean
>> that every requester ID is by definition the result of *some* PCI
>> device making a DMA request.
>
> Yes
>
>>> ...
>>> Furthermore, if we have:
>>>
>>> -- A
>>> /
>>> X--Y
>>> \
>>> -- B
>>> ...
>>
>>> Let me go back to the X-Y-A|B example above to see if I can explain why
>>> pcie_for_each_requester_id() doesn't make sense to me. Generally a
>>> for_each_foo function should iterate across all things with the same
>>> foo. So a for_each_requester_id should iterate across all things with
>>> the same requester ID.
>>
>> Hm, that's not the way I think of for_each_FOO() interfaces. I
>> think of it as "execute the body (or callback) for every possible
>> FOO", where FOO is different each time. for_each_pci_dev(),
>> pci_bus_for_each_resource(), for_each_zone(), for_each_cpu(), etc.,
>> work like that.
>
> Most of these aren't relevant because they iterate over everything. I
> think they only show that if we had a pci_for_each_requester_id() that
> it should iterate over every possible PCI requester ID in the system and
> the same could be said for pci_for_each_requester().
>
Lost me here.... I thought the functions took in a pdev, so it'll only
iterate on the segment that pdev is associated with.
> pci_bus_for_each_resource() is the only one we can build from; for all
> resources on a bus. We want all requester IDs for a pci_dev. Does that
> perhaps mean it should be called pci_dev_for_each_requester_id()? I'd
> be ok with that name, but I think it even more implies that a pci_dev is
> associated with a requester ID.
>
and the fcn call name changes have equally lost me here.
AIUI, IOMMUs want to call a PCI core function with a pdev, asking for
the req-id's that the pdev may generate to the IOMMU when performing DMA.
that doesn't strike me as a for-each-requester-id() but a 'get-requester-id()'
operation.
>> But the more important question is what arguments we give to the
>> callback. My proposal was to map
>>
>> {pci_dev -> {requester-ID-A, requester-ID-B, ...}}
>>
>> Yours is to map
>>
>> {pci_dev -> {{pci_dev-A, requester-ID-A}, {pci_dev-B, requester-ID-B}, ...}}
>>
>> i.e., your callback gets both a pci_dev and a requester-ID. I'm
>> trying to figure out how you handle cases where requester-id-A
>> doesn't have a corresponding pci_dev-A. I guess you pass the device
>> most closely associated with requester-id-A
>
> Yes
>
>> The "most closely associated device" idea seems confusing to
>> describe and think about. I think you want to use it as part of
>> grouping devices into domains. But I think we should attack that
>> problem separately. For grouping or isolation questions, we have
>> to pay attention to things like ACS, which are not relevant for
>> mapping.
>
> We are only touch isolation insofar as providing an interface for a
> driver to determine the point in the PCI topology where the requester ID
> is rooted. Yes, grouping can make use of that, but I object to the idea
> that we're approaching some slippery slope of rolling multiple concepts
> into this patch. What I'm proposing here is strictly a requester ID
> interface. I believe that providing a way for a caller to determine
> that two devices have a requester ID rooted at the same point in the PCI
> topology is directly relevant to that interface.
>
I strongly agree here. providing the pdev's associated with each req-id rtn'd
cannot hurt, and in fact, I believe it is an improvement, avoiding a potential
set of more interfaces that may be needed to do (segment, req-id)->pdev mapping/matching/get-ing.
> pci_find_upstream_pcie_bridge() attempts to do this same thing.
> Unfortunately, the interface is convoluted, it doesn't handle quirks,
> and it requires a great deal of work by the caller to then walk the
> topology and create requester IDs at each step. This also indicates
> that at each step, the requester ID is associated with some pci_dev.
> Providing a pci_dev is simply showing our work and providing context to
> the requester ID (ie. here's the requester ID and the step along the
> path from which it was derived. Here's your top level requester ID and
> the point in the topology where it's based).
>
IMO, getting rid of pci_find_upstream_pcie_bridge() gets non-PCI code
out of the biz of knowing too much about PCI topology and the idiosyncracies
around req-id's, and the proposed interface cleans up the IOMMU (PCI) support.
Having the IOMMU api get the pdev rtn'd with a req-id, and then passing it
back to the core (for release/free) seems like the proper get/put logic
btwn the PCI core and another kernel subsystem.
>>> If we look at an endpoint device like A, only A
>>> has A's requester ID. Therefore, why would for_each_requester_id(A)
>>> traverse up to Y?
>>
>> Even if A is a PCIe device, we have to traverse upwards to find any
>> bridges that might drop A's requester ID or take ownership, e.g., if
>> we have this:
>>
>> 00:1c.0 PCIe-to-PCI bridge to [bus 01-02]
>> 01:00.0 PCI-to-PCIe bridge to [bus 02]
>> 02:00.0 PCIe endpoint A
>>
>> the IOMMU has to look for requester-ID 0100.
>
> And I believe this patch handles this case correctly; I mentioned this
> exact example in the v2 RFC cover letter. This is another example where
> pci_find_upstream_pcie_bridge() will currently fail.
>
+1
>>> Y can take ownership and become the requester for A,
>>> but if we were to call for_each_requester_id(Y), wouldn't you expect the
>>> callback to happen on {Y, A, B} since all of those can use that
>>> requester ID?
>>
>> No. If I call for_each_requester_id(Y), I'm expecting the callback
>> to happen for each requester ID that could be used for transactions
>> originated by *Y*. I'm trying to make an IOMMU mapping for use by
>> Y, so any devices downstream of Y, e.g., A and B, are irrelevant.
>
> Ok, you think of for_each_requester_id() the same as I think of
> for_each_requester(). Can we split the difference and call it
> pci_dev_for_each_requester_id()?
>
I can only ditto my sentiment above.
Can I toss in 'pci_get_requester_id()'?!? ... ;-)
>> I think a lot of the confusion here is because we're trying to solve
>> both two questions at once: (1) what requester-IDs need to be mapped
>> to handle DMA from a device, and (2) what devices can't be isolated
>> from each other and must be in the same domain. I think things will
>> be simpler if we can deal with those separately. Even if it ends up
>> being more code, at least each piece will be easier to understand by
>> itself.
>
> Don't we already have this split in the code?
>
> (1) pcie_for_each_requester
> (or pci_dev_for_each_requester_id)
>
> (2) pci_get_visible_pcie_requester
> (or pci_get_visible_pcie_requester_id)
>
again, unless you understand all of the mapped/coerced/modified request-id
of PCIe, PCIe-to-PCI(x) bridges, and the potential (RC, other) quirks,
these interface names are confusing. ....
... that could be me, and I need to go back to look at patches and
the description of the functions, to see if they help to clarify their uses.
> Note however that (2) does not impose anything about domains or
> isolation, it is strictly based on PCI topology. It's left to the
> caller to determine how that translates to IOMMU domains, but the
> typical case is trivial. Thanks,
>
+1, again.
> Alex
>
next prev parent reply other threads:[~2013-07-29 21:04 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-07-11 21:03 [RFC PATCH v2 0/2] pci/iommu: PCIe requester ID interface Alex Williamson
2013-07-11 21:03 ` [RFC PATCH v2 1/2] pci: Create " Alex Williamson
2013-07-23 22:35 ` Bjorn Helgaas
2013-07-23 23:21 ` Alex Williamson
2013-07-24 15:03 ` Andrew Cooks
2013-07-24 15:50 ` Alex Williamson
2013-07-24 16:47 ` Bjorn Helgaas
2013-07-24 18:12 ` Alex Williamson
2013-07-24 23:24 ` Bjorn Helgaas
2013-07-25 17:56 ` Alex Williamson
2013-07-26 21:54 ` Bjorn Helgaas
2013-07-29 16:06 ` Alex Williamson
2013-07-29 21:02 ` Bjorn Helgaas
2013-07-29 22:32 ` Alex Williamson
2013-08-01 22:08 ` Bjorn Helgaas
2013-08-02 16:59 ` Alex Williamson
2014-04-03 21:48 ` Bjorn Helgaas
2014-04-04 2:51 ` Alex Williamson
2014-04-04 15:00 ` Bjorn Helgaas
2013-07-29 21:03 ` Don Dutile [this message]
2013-07-29 22:55 ` Alex Williamson
2013-07-24 20:42 ` Don Dutile
2013-07-24 21:22 ` Alex Williamson
2013-07-25 18:38 ` Don Dutile
2013-07-25 17:19 ` Bjorn Helgaas
2013-07-25 18:25 ` Don Dutile
2013-07-26 19:48 ` Bjorn Helgaas
2013-07-26 20:04 ` Don Dutile
2013-07-11 21:03 ` [RFC PATCH v2 2/2] iommu/intel: Make use of " Alex Williamson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=51F6D8BB.4040901@redhat.com \
--to=ddutile@redhat.com \
--cc=acooks@gmail.com \
--cc=alex.williamson@redhat.com \
--cc=bhelgaas@google.com \
--cc=dwmw2@infradead.org \
--cc=iommu@lists.linux-foundation.org \
--cc=joro@8bytes.org \
--cc=linux-pci@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).