linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alex Williamson <alex.williamson@redhat.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: David Woodhouse <dwmw2@infradead.org>,
	Bjorn Helgaas <bhelgaas@google.com>,
	"Lawrynowicz, Jacek" <jacek.lawrynowicz@intel.com>,
	Joerg Roedel <jroedel@suse.de>,
	linux-pci@vger.kernel.org, iommu@lists.linux-foundation.org
Subject: Re: [PATCH v4 3/6] PCI: Add support for multiple DMA aliases
Date: Fri, 8 Apr 2016 11:31:49 -0600	[thread overview]
Message-ID: <20160408113149.1386ec8c@t450s.home> (raw)
In-Reply-To: <20160408160632.GF8780@localhost>

On Fri, 8 Apr 2016 11:06:32 -0500
Bjorn Helgaas <helgaas@kernel.org> wrote:

> On Tue, Mar 15, 2016 at 07:48:17PM -0500, Bjorn Helgaas wrote:
> > On Mon, Mar 14, 2016 at 10:43:40PM +0000, David Woodhouse wrote:  
> > > On Thu, 2016-02-25 at 08:38 -0600, Bjorn Helgaas wrote:  
> > > >   
> > > > >  /*
> > > > > - * Look for aliases to or from the given device for exisiting groups.  The
> > > > > - * dma_alias_devfn only supports aliases on the same bus, therefore the search
> > > > > + * Look for aliases to or from the given device for existing groups. DMA
> > > > > + * aliases are only supported on the same bus, therefore the search  
> > > > 
> > > > I'm trying to reconcile this statement that "DMA aliases are only
> > > > supported on the same bus" (which was there even before this patch)
> > > > with the fact that pci_for_each_dma_alias() does not have that
> > > > limitation.  
> > > 
> > > Doesn't it? You can still only set a DMA alias on the same bus with
> > > pci_add_dma_alias(), can't you?  
> > 
> > I guess it's true that PCI_DEV_FLAGS_DMA_ALIAS_DEVFN and the proposed
> > pci_add_dma_alias() only add aliases on the same bus.  I was thinking
> > about a scenario like this:
> > 
> >   00:00.0 PCIe-to-PCI bridge to [bus 01]
> >   01:01.0 conventional PCI device
> > 
> > where I think 01:00.0 is a DMA alias for 01:01.0 because the bridge
> > takes ownership of DMA transactions from 01:01.0 and assigns a
> > Requester ID of 01:00.0 (secondary bus number, device 0, function 0).

This is true, but this is a topology alias, not a quirk.
pci_for_each_dma_alias() already handles this case.  It would trigger
the callback function for any direct alias of the conventional device
as well as the bridge itself.  Obviously we don't end up with any
quirks for conventional devices because the aliases are masked behind
the bridge anyway.
 
> > > > >   * space is quite small (especially since we're really only looking at pcie
> > > > >   * device, and therefore only expect multiple slots on the root complex or
> > > > >   * downstream switch ports).  It's conceivable though that a pair of
> > > > > @@ -686,11 +692,8 @@ static struct iommu_group *get_pci_alias_group(struct pci_dev *pdev,
> > > > >                       continue;
> > > > >  
> > > > >               /* We alias them or they alias us */
> > > > > -             if (((pdev->dev_flags & PCI_DEV_FLAGS_DMA_ALIAS_DEVFN) &&
> > > > > -                  pdev->dma_alias_devfn == tmp->devfn) ||
> > > > > -                 ((tmp->dev_flags & PCI_DEV_FLAGS_DMA_ALIAS_DEVFN) &&
> > > > > -                  tmp->dma_alias_devfn == pdev->devfn)) {
> > > > > -
> > > > > +             if (dma_alias_is_enabled(pdev, tmp->devfn) ||
> > > > > +                 dma_alias_is_enabled(tmp, pdev->devfn)) {
> > > > >                       group = get_pci_alias_group(tmp, devfns);  
> > > > 
> > > > We basically have this:
> > > > 
> > > >   for_each_pci_dev(tmp) {
> > > >     if ()
> > > >       group = get_pci_alias_group();
> > > >       ...
> > > >   }  
> > > 
> > > Strictly, that's:
> > > 
> > >  for_each_pci_dev(tmp) {
> > >    if (pdev is an alias of tmp || tmp is an alias of pdev)
> > >      group = get_pci_alias_group();
> > >    ...
> > >  }  
> > 
> > OK.  
> >   
> > > > I'm trying to figure out why we don't do something like the following
> > > > instead:
> > > > 
> > > >   callback(struct pci_dev *pdev, u16 alias, void *opaque)
> > > >   {
> > > >     struct iommu_group *group;
> > > > 
> > > >     group = get_pci_alias_group();
> > > >     if (group)
> > > >       return group;
> > > > 
> > > >     return 0;
> > > >   }
> > > > 
> > > >   pci_for_each_dma_alias(pdev, callback, ...);  
> > > 
> > > And this would be equivalent to
> > > 
> > >  for_each_pci_dev(tmp) {
> > >    if (tmp is an alias of pdev)
> > >      group = get_pci_alias_group();
> > >    ...
> > >  }
> > > 
> > > The "is an alias of" property is not commutative. Perhaps it should be.
> > > But that's hard because in some cases the alias doesn't even *exist* as
> > > a real PCI device. It's just that you appear to get DMA transactions
> > > from a given source-id.  
> > 
> > Right.  In my example above, 01:00.0 is not a PCI device; it's only a
> > Requester ID that is fabricated by the bridge when it forwards DMA
> > transactions upstream.
> > 
> > I think I'm confused because I don't really understand IOMMU groups.
> > 
> > Let me explain what I think they are and you can correct me when I go
> > wrong.  The iommu_group_alloc() comment says "The IOMMU group
> > represents the minimum granularity of the IOMMU."  So I suppose the
> > IOMMU cannot distinguish between devices in a group.  All the devices
> > in the group use the same set of DMA mappings.  Granting device A DMA
> > access to a buffer grants the same access to all other members of A's
> > IOMMU group.
> > 
> > That would mean my question was fundamentally backwards.  In
> > get_pci_alias_group(A), we're not trying to figure out what all the
> > aliases of A are, which is what pci_for_each_dma_alias() does.
> > 
> > Instead, we're trying to figure out which IOMMU group A belongs to.
> > But I still don't quite understand how aliases fit into this.  Let's
> > go back to my example and assume we've already put 00:00.0 and 01:01.0
> > in IOMMU groups:
> > 
> >   00:00.0 PCIe-to-PCI bridge to [bus 01]     # in IOMMU group G0
> >   01:01.0 conventional PCI device            # in IOMMU group G1
> > 
> > I assume these devices are in different IOMMU groups because if the
> > bridge generated its own DMA, it would use Requester ID 00:00.0, which
> > is distinct from the 01:00.0 it would use when forwarding DMAs from
> > its secondary side.

The actual requester ID of the bridge depends on quirks as well, see
PCI_DEV_FLAG_PCIE_BRIDGE_ALIAS, so it might either be the bridge itself
or the subordinate bus address.  However, trace how these groups would
be created, the bridge device is discovered first, so we call
pci_device_group() for it.  since it's on the root bus and it's the 0th
function, there will be no existing groups for it to alias, so it gets
a new group.  Then we discover 01:01.0, pci_device_group() calls
pci_for_each_dma_alias(), which hits the bridge, regardless of which
alias is used.  So 01:01.0 ends up in the same iommu group as the
bridge.

>From the purist standpoint of iommu groups, your expectations are
probably correct, but we also include within IOMMU groups DMA isolation
between devices.  So if two devices can DMA between each other without
it necessarily passing through the iommu, the devices are grouped
together.  The more important cases of this are ACS isolation on PCI-e
devices, but it still sort of applies to this conventional PCI example
here.
 
> > What happens when we add 01:02.0?  I think 01:01.0 and 01:02.0
> > should both end up in IOMMU group G1 because the IOMMU will see only
> > Requester ID 01:00.0, so it can't distinguish them.
> > 
> > When we add 01:02.0, the ops->add_device() ... ops->device_group()
> > path calls pci_device_group(01:02.0):
> > 
> >   pci_device_group(01:02.0)
> >     pci_for_each_dma_alias(01:02.0, get_pci_alias_or_group)
> >       get_pci_alias_or_group(01:02.0, 01:02.0)   # callback
> >         return 0           # 01:02.0 group not set yet
> >       get_pci_alias_or_group(00:00.0, 01:00.0)   # callback
> >         return 1           # 00:00.0 is in G0
> > 
> > It seems like we'll assign 01:02.0 to group G0, when I think it
> > should be in G1.  Where did I go wrong?  

In fact they're all part of G0, so you went wrong assuming the initial
grouping rather than applying the same rules and tracing how 01:01.0 is
added.  It's not an ideal representation, it tends to be more
conservative than necessary in some cases, but keeping the group
attached to devices has advantages in being able to search and find
points like this where we don't necessarily have access to the group
behind the bridge without tying it to the bridge itself.  Thanks,

Alex

  parent reply	other threads:[~2016-04-08 17:31 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-02-24 19:43 [PATCH v4 0/6] PCI: Support multiple DMA aliases Bjorn Helgaas
2016-02-24 19:43 ` [PATCH v4 1/6] PCI: Add pci_add_dma_alias() to abstract implementation Bjorn Helgaas
2016-04-08 20:18   ` Alex Williamson
2016-02-24 19:43 ` [PATCH v4 2/6] PCI: Move informational printk to pci_add_dma_alias() Bjorn Helgaas
2016-04-08 20:19   ` Alex Williamson
2016-02-24 19:44 ` [PATCH v4 3/6] PCI: Add support for multiple DMA aliases Bjorn Helgaas
2016-02-25 14:38   ` Bjorn Helgaas
2016-02-25 15:41     ` Lawrynowicz, Jacek
2016-02-29 22:44       ` Bjorn Helgaas
2016-03-01 16:57         ` Jacek Lawrynowicz
2016-03-03 14:22         ` [PATCH] " Jacek Lawrynowicz
2016-03-03 14:38         ` [PATCH v5 3/6] " Jacek Lawrynowicz
2016-04-08 20:19           ` Alex Williamson
2016-03-14 22:43     ` [PATCH v4 " David Woodhouse
2016-03-16  0:48       ` Bjorn Helgaas
2016-04-08 16:06         ` Bjorn Helgaas
2016-04-08 16:09           ` David Woodhouse
2016-04-08 17:31           ` Alex Williamson [this message]
2016-02-24 19:44 ` [PATCH v4 4/6] PCI: Rename dma_alias_is_enabled() to pci_devs_are_dma_aliases() Bjorn Helgaas
2016-04-08 20:19   ` Alex Williamson
2016-02-24 19:44 ` [PATCH v4 5/6] pci: Add DMA alias quirk for mic_x200_dma Bjorn Helgaas
2016-03-03 14:53   ` [PATCH v5 5/6] PCI: " Jacek Lawrynowicz
2016-04-08 20:19     ` Alex Williamson
2016-02-24 19:44 ` [PATCH v4 6/6] PCI: Squash pci_dev_flags to remove holes Bjorn Helgaas
2016-04-08 20:19   ` Alex Williamson
2016-04-12  4:38 ` [PATCH v4 0/6] PCI: Support multiple DMA aliases Bjorn Helgaas
2016-04-12 16:20   ` Lawrynowicz, Jacek
2016-04-12 18:10   ` Bjorn Helgaas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160408113149.1386ec8c@t450s.home \
    --to=alex.williamson@redhat.com \
    --cc=bhelgaas@google.com \
    --cc=dwmw2@infradead.org \
    --cc=helgaas@kernel.org \
    --cc=iommu@lists.linux-foundation.org \
    --cc=jacek.lawrynowicz@intel.com \
    --cc=jroedel@suse.de \
    --cc=linux-pci@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).