From: Robin Murphy <robin.murphy@arm.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: iommu@lists.linux.dev, Joerg Roedel <joro@8bytes.org>,
Will Deacon <will@kernel.org>
Subject: Re: [PATCH 1/2] iommu/fsl: Do not use iommu_group_remove_device() under ops->device_group()
Date: Tue, 16 May 2023 19:24:20 +0100 [thread overview]
Message-ID: <e8157615-abd3-9072-6245-50add0eeb284@arm.com> (raw)
In-Reply-To: <ZGOqnWBspvDynyJ4@nvidia.com>
On 2023-05-16 17:09, Jason Gunthorpe wrote:
> On Tue, May 16, 2023 at 04:00:47PM +0100, Robin Murphy wrote:
>> On 2023-05-16 01:27, Jason Gunthorpe wrote:
>>> This API is expected to be used only by POWER and VFIO no-iommu that
>>> manually manage the group lifecycle. It should not be called under
>>> ops->device_group().
>>>
>>> This is already buggy as-is, since the core code does not expect a probed
>>> driver to lose its iommu_group without also releasing the device.
>>>
>>> FSL seems to be trying to block the platform_device that represents the
>>> pci_controller, eg the thing passed to fsl_add_bridge(), from having an
>>> iommu_group.
>>>
>>> Instead of creating an iommu_group that we don't want and then later
>>> removing it, just don't create it at all in the first place.
>>>
>>> For the 'pci_endpt_partitioning' case every PCI device already gets its
>>> own iommu_group through the standard code, so it is unclear why having a
>>> dedicated group for the controller could be problematic.
>>>
>>> For the other case, the controller group was being used to bizarrely
>>> de-duplicate the group in its hose. Instead just directly create a group
>>> for the hose the first time we encounter it. The code already searches the
>>> entire hose to find any iommu_group. Again, it is unclear why having the
>>> pci_controller inside the same iommu_group as the PCI devices would be
>>> harmful.
>>
>> It's harmful in that case because it prevents VFIO from working at all -
>> even now that VFIO no longer rejects cross-bus groups on principle, the
>> fsl-pci driver being bound to the platform device would then deny VFIO from
>> taking ownership.
>
> I think that was true before..
>
> Today we block out VFIO based on iommu_device_use_default_domain()
> which is called prior to probe. That function is a NOP if the iommu
> group hasn't already been setup.
>
> Thus the platform_device probed by fsl_pci_probe() does not block VFIO
> any more, even if it is in the same group.
My reading of the current code is that it must absolutely be relying on
fsl_pamu_init() having run before fsl_pci_init() (they're both at
arch_initcall level, implying an icky fragile link-order dependency),
such that the platform device group is already assigned by the time the
PCI host driver binds and starts adding PCI devices, so for *their*
iommu_probe, get_pci_device_group() can then hoick that group over into
PCI-land to start propagating it around.
Thus if all we did was remove the dodgy iommu_group_remove_device()
calls, I believe use_default_domain *would* start getting in the way...
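The link-order dependency described above can be modelled in isolation. This is a toy simulation, not kernel code: the two functions and the global flags below are stand-ins for fsl_pamu_init()/fsl_pci_init(), and the array stands in for same-level initcalls running in link order. It only illustrates that the PCI host driver observes a group on the platform device if and only if the PAMU init happened to be linked first.

```c
#include <stddef.h>

static int platform_dev_group;      /* 0 = platform device has no iommu_group yet */
static int group_seen_at_pci_probe; /* what the PCI host "driver" observed at probe */

/* Stand-in for fsl_pamu_init(): assigns the platform device's group. */
static void toy_pamu_init(void) { platform_dev_group = 1; }

/* Stand-in for fsl_pci_init()/fsl_pci_probe(): records whether the
 * group was already there when the PCI host bound. */
static void toy_pci_init(void)  { group_seen_at_pci_probe = platform_dev_group; }

/* Same-level initcalls run purely in link (array) order. */
static void run_initcalls(void (*calls[])(void), int n)
{
	for (int i = 0; i < n; i++)
		calls[i]();
}
```

Running the two "initcalls" in the order the current code relies on leaves the group visible at PCI probe time; flipping the link order does not, which is exactly the fragility being pointed out.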
> But, more broadly, we exclude things like PCI bridges from the VFIO
> check by using driver_managed_dma:
>
> index b7232c46b24481..6daf620b63a4d5 100644
> --- a/arch/powerpc/sysdev/fsl_pci.c
> +++ b/arch/powerpc/sysdev/fsl_pci.c
> @@ -1353,6 +1353,7 @@ static struct platform_driver fsl_pci_driver = {
> .of_match_table = pci_ids,
> },
> .probe = fsl_pci_probe,
> + .driver_managed_dma = true,
> };
>
> static int __init fsl_pci_init(void)
>
> Even though it is a NOP in this case because of ordering, it still
> would be good for documentation purposes.
...however I think you're on to something here - if with this we can
safely guarantee to negate any ownership problem as well, then flipping
it all around to *always* rely on the platform device's group might be
neatest of all. It's redundant in the pci_endpt_partitioning case, but I
don't see how that would really hurt other than slightly changing group
numbering to the annoyance of users with dodgy hard-coded scripts who
mistakenly assumed group IDs were stable. Crucially, all of the other
crap can go away and leave us with simply:
	if (pci_endpt_partitioning)
		return pci_device_group(&pdev->dev);
	else
		return iommu_group_get(pci_ctl->parent);
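For illustration only, the two branches above can be exercised with stub types. This is a toy model, not the real kernel structs or API: iommu_group, pci_ctl, and pci_dev here are hypothetical stand-ins, and get_group() mirrors the proposed logic of a per-device group under endpoint partitioning versus inheriting the controller's (platform device's) group otherwise.

```c
#include <stddef.h>

struct iommu_group { int id; };

struct pci_ctl {
	struct iommu_group *group;	/* group already attached to the platform device */
};

struct pci_dev {
	struct pci_ctl *ctl;
	struct iommu_group own_group;	/* per-device group, endpoint-partitioning case */
};

/* Mirrors the proposed device_group callback: per-device group when
 * endpoint partitioning is enabled, otherwise the controller's group. */
static struct iommu_group *get_group(struct pci_dev *pdev,
				     int pci_endpt_partitioning)
{
	if (pci_endpt_partitioning)
		return &pdev->own_group;
	return pdev->ctl->group;
}
```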
Thanks,
Robin.
Thread overview: 7+ messages
2023-05-16 0:27 [PATCH 0/2] Remove iommu_group_remove_device() from fsl Jason Gunthorpe
2023-05-16 0:27 ` [PATCH 1/2] iommu/fsl: Do not use iommu_group_remove_device() under ops->device_group() Jason Gunthorpe
2023-05-16 15:00 ` Robin Murphy
2023-05-16 16:09 ` Jason Gunthorpe
2023-05-16 18:24 ` Robin Murphy [this message]
2023-05-16 19:52 ` Jason Gunthorpe
2023-05-16 0:27 ` [PATCH 2/2] iommu/fsl: Always allocate a group for non-pci devices Jason Gunthorpe