public inbox for iommu@lists.linux-foundation.org
From: Jason Gunthorpe <jgg@nvidia.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
	iommu@lists.linux.dev, Joerg Roedel <joro@8bytes.org>,
	linux-pci@vger.kernel.org, Robin Murphy <robin.murphy@arm.com>,
	Will Deacon <will@kernel.org>,
	Alex Williamson <alex.williamson@redhat.com>,
	Lu Baolu <baolu.lu@linux.intel.com>,
	Donald Dutile <ddutile@redhat.com>,
	galshalom@nvidia.com, Joerg Roedel <jroedel@suse.de>,
	Kevin Tian <kevin.tian@intel.com>,
	kvm@vger.kernel.org, maorg@nvidia.com, patches@lists.linux.dev,
	tdave@nvidia.com, Tony Zhu <tony.zhu@intel.com>
Subject: Re: [PATCH v3 02/11] PCI: Add pci_bus_isolated()
Date: Tue, 9 Sep 2025 18:21:19 -0300	[thread overview]
Message-ID: <20250909212119.GA895786@nvidia.com> (raw)
In-Reply-To: <20250909195409.GA1494873@bhelgaas>

On Tue, Sep 09, 2025 at 02:54:09PM -0500, Bjorn Helgaas wrote:
> On Fri, Sep 05, 2025 at 03:06:17PM -0300, Jason Gunthorpe wrote:
> > Prepare to move the calculation of the bus P2P isolation out of the iommu
> > code and into the PCI core. This allows using the faster list iteration
> > under the pci_bus_sem, and the code has a kinship with the logic in
> > pci_for_each_dma_alias().
> > 
> > Bus isolation is the concept that drives the iommu_groups for the purposes
> > of VFIO. Stated simply, if device A can send traffic to device B then they
> > must be in the same group.
> > 
> > Only PCIe provides isolation. The multi-drop electrical topology in
> > classical PCI allows any bus member to claim the transaction.
> > 
> > In PCIe isolation comes out of ACS. If a PCIe Switch and Root Complex has
> > ACS flags that prevent peer to peer traffic and funnel all operations to
> > the IOMMU then devices can be isolated.
> 
> I guess a device being isolated means that peer-to-peer Requests from
> a different bus can't reach it?

peer-to-peer requests from a different *device*

> Did you mean "Root Port" instead of "Root Complex"?  Or are you
> assuming an ACS Capability in an RCRB?  (I don't think Linux supports
> RCRBs, except maybe for CXL)

Can't really say; the interaction of ACS within the Root Port, Root
Complex, and so on is not fully specified. Something within the Root
Complex routes to the TA/IOMMU.

Linux has assumed, and with this series continues to assume, that the
Root Port is routing to the TA, because we don't accumulate other Root
Complex devices into shared groups.

> > Multi-function devices also have an isolation concern with self loopback
> > between the functions, though pci_bus_isolated() does not deal with
> > devices.
> 
> It looks like multi-function devices *can* implement ACS and can
> isolate functions from each other (PCIe r7.0, sec 6.12.1.2).  But it
> sounds like we're ignoring peer-to-peer on the same bus for now and
> assuming devices on the same bus can't be isolated from each other?

Yes, MFDs have ACS, and no, it is not ignored. The iommu grouping
code has two parts, one for busses/switches and another for MFDs.

This pair of patches switches the busses/switches over to the new
mechanism and leaves the MFD part alone. Later patches correct the
issues in MFDs as well. The comment above is trying to explain this
split in the patch series.

> If we ignore ACS on non-bridge multi-function devices, I think the
> only way to isolate things is bridge ACS that controls forwarding
> between buses.  

Yes

> If everything on the bus must be in the same group, it
> makes sense for pci_bus_isolated() to take a pci_bus pointer and not
> deal with individual devices.

> Below, it seems like sometimes we refer to *buses* being isolated and
> other times *devices* (Root Port, Switch Port, Switch, etc), so I'm a
> little confused.

They are different things, and have different treatment. I've tried to
keep them separated by having code that works on busses and different
code that works on devices. 

A bus is isolating if upstream-travelling transactions reaching the
bus only go upstream.

A device is isolated if its bus and all upstream busses are isolating,
and the device itself has no internal loopback.

In this code it sometimes talks about the device in terms of a
bridge, USP, DSP, or Root Port. All of these are a bit special because
an upstream-travelling transaction is permitted to loop back
internally to the bridge device without touching a bus.

So while the bridge device may be on an isolating bus, the bridge
device itself is not isolated from its downstream bus.
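To make the two definitions above concrete, here is a minimal sketch.
The struct layout and names are invented for illustration; they are
not the kernel's actual struct pci_bus / struct pci_dev:

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-ins for the real PCI bus/device structures */
struct bus {
	struct bus *parent;	/* NULL at the root */
	bool isolating;		/* upstream transactions only go upstream */
};

struct dev {
	struct bus *bus;
	bool internal_loopback;	/* e.g. MFD self-loopback, bridge loopback */
};

/*
 * A device is isolated if its bus and all upstream busses are
 * isolating, and the device itself has no internal loopback.
 */
static bool dev_is_isolated(const struct dev *dev)
{
	const struct bus *bus;

	if (dev->internal_loopback)
		return false;
	for (bus = dev->bus; bus; bus = bus->parent)
		if (!bus->isolating)
			return false;
	return true;
}
```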

> > As a property of a bus, there are several positive cases:
> > 
> >  - The point to point "bus" on a physical PCIe link is isolated if the
> >    bridge/root device has something preventing self-access to its own
> >    MMIO.
> >
> >  - A Root Port is usually isolated
> >
> >  - A PCIe switch can be isolated if all it's Down Stream Ports have good
> >    ACS flags
> 
> I guess this is saying that a switch's internal bus is isolated if all
> the DSPs have the ACS flags we need?

Yes
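For reference, "good ACS flags" here commonly means the standard
isolation set (SV, RR, CR, UF); the bit values below match
include/uapi/linux/pci_regs.h, but the check itself is a simplified
sketch, not the actual code in this series:

```c
#include <stdbool.h>
#include <stdint.h>

/* ACS control bits, values as in include/uapi/linux/pci_regs.h */
#define PCI_ACS_SV 0x0001	/* Source Validation */
#define PCI_ACS_RR 0x0004	/* P2P Request Redirect */
#define PCI_ACS_CR 0x0008	/* P2P Completion Redirect */
#define PCI_ACS_UF 0x0010	/* Upstream Forwarding */

/*
 * With all of these set, requests and completions are validated and
 * redirected upstream, never reflected back out another DSP.
 */
#define ACS_ISOLATING (PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF)

static bool dsp_acs_isolating(uint16_t acs_ctrl)
{
	return (acs_ctrl & ACS_ISOLATING) == ACS_ISOLATING;
}

/* The switch's internal bus is isolating only if every DSP passes */
static bool switch_bus_isolating(const uint16_t *dsp_ctrl, int ndsp)
{
	for (int i = 0; i < ndsp; i++)
		if (!dsp_acs_isolating(dsp_ctrl[i]))
			return false;
	return true;
}
```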

> > pci_bus_isolated() implements these rules and returns an enum indicating
> > the level of isolation the bus has, with five possibilies:
> > 
> >  PCIE_ISOLATED: Traffic on this PCIE bus can not do any P2P.
> 
> Is this saying that peer-to-peer Requests can't reach devices on this
> bus?  Or Requests *from* this bus can only go to the IOMMU?

Transactions mastered on this bus travelling in the upstream direction
are only received by the upstream bridge and are never received by any
other device on the bus.

Or Transactions reaching this bus travelling in the upstream direction
continue upstream and never go back downstream.

I would not use the words 'from' or 'to' when talking about busses;
they don't originate or terminate transactions, they are just
transports.

> > + * pci_bus_isolated() does not consider loopback internal to devices, like
> > + * multi-function devices performing a self-loopback. The caller must check
> > + * this separately. It does not considering alasing within the bus.
> 
> s/alasing/aliasing/ (I guess this refers to the
> PCI_DEV_FLAG_PCIE_BRIDGE_ALIAS thing where a bridge takes ownership?)

Yes

> > +	/*
> > +	 * bus->self is only NULL for SRIOV VFs, it represents a "virtual" bus
> > +	 * within Linux to hold any bus numbers consumed by VF RIDs. Caller must
> > +	 * use pci_physfn() to get the bus for calling this function.
> 
> s/VF RIDs/VFs/  I think?  I think we allocate these virtual bus
> numbers when enabling the VFs.

Maybe BDF instead of RID

> > +	/*
> > +	 * bus is the interior bus of a PCI-E switch where ACS rules apply.
> 
> s/interior/internal/ to match use above
> s/PCI-E/PCIe/
> 
> I'm not sure what this is saying.  A USP can't have an ACS Capability
> unless it's part of a multi-function device.

So it can have ACS :)

Not sure what is unclear?

I will fix up all the notes

Thanks,
Jason
