From: Robin Murphy <robin.murphy@arm.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: iommu@lists.linux.dev, Joerg Roedel <joro@8bytes.org>,
Will Deacon <will@kernel.org>
Subject: Re: [PATCH 1/2] iommu/fsl: Do not use iommu_group_remove_device() under ops->device_group()
Date: Tue, 16 May 2023 19:24:20 +0100 [thread overview]
Message-ID: <e8157615-abd3-9072-6245-50add0eeb284@arm.com> (raw)
In-Reply-To: <ZGOqnWBspvDynyJ4@nvidia.com>
On 2023-05-16 17:09, Jason Gunthorpe wrote:
> On Tue, May 16, 2023 at 04:00:47PM +0100, Robin Murphy wrote:
>> On 2023-05-16 01:27, Jason Gunthorpe wrote:
>>> This API is expected to be used only by POWER and VFIO no-iommu that
>>> manually manage the group lifecycle. It should not be called under
>>> ops->device_group().
>>>
>>> This is already buggy as-is, since the core code does not expect a probed
>>> driver to lose its iommu_group without also releasing the device.
>>>
>>> FSL seems to be trying to block the platform_device that represents the
>>> pci_controller, eg the thing passed to fsl_add_bridge(), from having an
>>> iommu_group.
>>>
>>> Instead of creating an iommu_group that we don't want and then later
>>> removing it, just don't create it at all in the first place.
>>>
>>> For the 'pci_endpt_partitioning' case every PCI device already gets its
>>> own iommu_group through the standard code, so it is unclear why having a
>>> dedicated group for the controller could be problematic.
>>>
>>> For the other case, the controller group was being used to bizarrely
>>> de-duplicate the group in its hose. Instead just directly create a group
>>> for the hose the first time we encounter it. The code already searches the
>>> entire hose to find any iommu_group. Again, it is unclear why having the
>>> pci_controller inside the same iommu_group as the PCI devices would be
>>> harmful.
>>
>> It's harmful in that case because it prevents VFIO from working at all -
>> even now that VFIO no longer rejects cross-bus groups on principle, the
>> fsl-pci driver being bound to the platform device would then deny VFIO from
>> taking ownership.
>
> I think that was true before..
>
> Today we block out VFIO based on iommu_device_use_default_domain()
> which is called prior to probe. That function is a NOP if the iommu
> group hasn't already been setup.
>
> Thus the platform_device probed by fsl_pci_probe() does not block VFIO
> any more, even if it is in the same group.
My reading of the current code is that it must absolutely be relying on
fsl_pamu_init() having run before fsl_pci_init() (they're both at
arch_initcall level, implying an icky fragile link-order dependency),
such that the platform device group is already assigned by the time the
PCI host driver binds and starts adding PCI devices, so for *their*
iommu_probe, get_pci_device_group() can then hoick that group over into
PCI-land to start propagating it around.
Thus if all we did was remove the dodgy iommu_group_remove_device()
calls, I believe use_default_domain *would* start getting in the way...
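The link-order dependency described above can be modelled in isolation. This is a toy simulation, not kernel code: the two functions and the global flags below are stand-ins for fsl_pamu_init()/fsl_pci_init(), and the array stands in for same-level initcalls running in link order. It only illustrates that the PCI host driver observes a group on the platform device if and only if the PAMU init happened to be linked first.

```c
#include <stddef.h>

static int platform_dev_group;      /* 0 = platform device has no iommu_group yet */
static int group_seen_at_pci_probe; /* what the PCI host "driver" observed at probe */

/* Stand-in for fsl_pamu_init(): assigns the platform device's group. */
static void toy_pamu_init(void) { platform_dev_group = 1; }

/* Stand-in for fsl_pci_init()/fsl_pci_probe(): records whether the
 * group was already there when the PCI host bound. */
static void toy_pci_init(void)  { group_seen_at_pci_probe = platform_dev_group; }

/* Same-level initcalls run purely in link (array) order. */
static void run_initcalls(void (*calls[])(void), int n)
{
	for (int i = 0; i < n; i++)
		calls[i]();
}
```

Running the two "initcalls" in the order the current code relies on leaves the group visible at PCI probe time; flipping the link order does not, which is exactly the fragility being pointed out.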
> But, more broadly, we exclude things like PCI bridges from the VFIO
> check by using driver_managed_dma:
>
> index b7232c46b24481..6daf620b63a4d5 100644
> --- a/arch/powerpc/sysdev/fsl_pci.c
> +++ b/arch/powerpc/sysdev/fsl_pci.c
> @@ -1353,6 +1353,7 @@ static struct platform_driver fsl_pci_driver = {
> .of_match_table = pci_ids,
> },
> .probe = fsl_pci_probe,
> + .driver_managed_dma = true,
> };
>
> static int __init fsl_pci_init(void)
>
> Even though it is a NOP in this case because of ordering, it still
> would be good for documentation purposes.
...however I think you're on to something here - if with this we can
safely guarantee to negate any ownership problem as well, then flipping
it all around to *always* rely on the platform device's group might be
neatest of all. It's redundant in the pci_endpt_partitioning case, but I
don't see how that would really hurt other than slightly changing group
numbering to the annoyance of users with dodgy hard-coded scripts who
mistakenly assumed group IDs were stable. Crucially, all of the other
crap can go away and leave us with simply:
	if (pci_endpt_partitioning)
		return pci_device_group(&pdev->dev);
	else
		return iommu_group_get(pci_ctl->parent);
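For illustration only, the two branches above can be exercised with stub types. This is a toy model, not the real kernel structs or API: iommu_group, pci_ctl, and pci_dev here are hypothetical stand-ins, and get_group() mirrors the proposed logic of a per-device group under endpoint partitioning versus inheriting the controller's (platform device's) group otherwise.

```c
#include <stddef.h>

struct iommu_group { int id; };

struct pci_ctl {
	struct iommu_group *group;	/* group already attached to the platform device */
};

struct pci_dev {
	struct pci_ctl *ctl;
	struct iommu_group own_group;	/* per-device group, endpoint-partitioning case */
};

/* Mirrors the proposed device_group callback: per-device group when
 * endpoint partitioning is enabled, otherwise the controller's group. */
static struct iommu_group *get_group(struct pci_dev *pdev,
				     int pci_endpt_partitioning)
{
	if (pci_endpt_partitioning)
		return &pdev->own_group;
	return pdev->ctl->group;
}
```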
Thanks,
Robin.
Thread overview: 7+ messages
2023-05-16 0:27 [PATCH 0/2] Remove iommu_group_remove_device() from fsl Jason Gunthorpe
2023-05-16 0:27 ` [PATCH 1/2] iommu/fsl: Do not use iommu_group_remove_device() under ops->device_group() Jason Gunthorpe
2023-05-16 15:00 ` Robin Murphy
2023-05-16 16:09 ` Jason Gunthorpe
2023-05-16 18:24 ` Robin Murphy [this message]
2023-05-16 19:52 ` Jason Gunthorpe
2023-05-16 0:27 ` [PATCH 2/2] iommu/fsl: Always allocate a group for non-pci devices Jason Gunthorpe