* [PATCH 0/2] Remove iommu_group_remove_device() from fsl
@ 2023-05-16 0:27 Jason Gunthorpe
2023-05-16 0:27 ` [PATCH 1/2] iommu/fsl: Do not use iommu_group_remove_device() under ops->device_group() Jason Gunthorpe
2023-05-16 0:27 ` [PATCH 2/2] iommu/fsl: Always allocate a group for non-pci devices Jason Gunthorpe
0 siblings, 2 replies; 7+ messages in thread
From: Jason Gunthorpe @ 2023-05-16 0:27 UTC (permalink / raw)
To: iommu, Joerg Roedel, Robin Murphy, Will Deacon
With POWER SPAPR now having a real iommu driver and using the normal group
lifecycle stuff fixing FSL will leave only VFIO's no-iommu support as a
user for the iommu_group_add/remove_device() calls. This will help
simplify the understanding of what the core code should be doing for these
functions.
Fix FSL to not need to call iommu_group_remove_device() at all.
Jason Gunthorpe (2):
iommu/fsl: Do not use iommu_group_remove_device() under
ops->device_group()
iommu/fsl: Always allocate a group for non-pci devices
drivers/iommu/fsl_pamu_domain.c | 55 +++++++++++++--------------------
1 file changed, 21 insertions(+), 34 deletions(-)
base-commit: 1421774b874bfd5fd1b2b05b59b67c0c5e0d513e
--
2.40.1
^ permalink raw reply [flat|nested] 7+ messages in thread* [PATCH 1/2] iommu/fsl: Do not use iommu_group_remove_device() under ops->device_group() 2023-05-16 0:27 [PATCH 0/2] Remove iommu_group_remove_device() from fsl Jason Gunthorpe @ 2023-05-16 0:27 ` Jason Gunthorpe 2023-05-16 15:00 ` Robin Murphy 2023-05-16 0:27 ` [PATCH 2/2] iommu/fsl: Always allocate a group for non-pci devices Jason Gunthorpe 1 sibling, 1 reply; 7+ messages in thread From: Jason Gunthorpe @ 2023-05-16 0:27 UTC (permalink / raw) To: iommu, Joerg Roedel, Robin Murphy, Will Deacon This API is expected to be used only by POWER and VFIO no-iommu that manually manage the group lifecycle. It should not be called under ops->device_group(). This is already buggy as is since the core code does not expect a probed driver to loose it's iommu_group without also releasing the device. FSL seems to be trying to block the platform_device that represents the pci_controller, eg the thing passed to fsl_add_bridge(), from having an iommu_group. Instead of creating an iommu_group that we don't want and then later removing it, just don't create it at all in the first place. For the 'pci_endpt_partitioning' case every PCI device already gets its own iommu_group through the standard code, so it is unclear why having a dedicated group for the controller could be problematic. For the other case, the controller group was being used to bizarrely de-duplicate the group in it's hose. Instead just directly create a group for the hose the first time we encounter it. The code already searches the entire hose to find any iommu_group. Again, it is unclear why having the pci_controller inside the same iommu_group as the PCI devices would be harmful. In any case, this is a cleaner way to not enable iommu support for the platform_device. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> --- drivers/iommu/fsl_pamu_domain.c | 42 ++++++++++++++++----------------- 1 file changed, 20 insertions(+), 22 deletions(-) diff --git a/drivers/iommu/fsl_pamu_domain.c b/drivers/iommu/fsl_pamu_domain.c index bce37229709965..bf045f58cd50ae 100644 --- a/drivers/iommu/fsl_pamu_domain.c +++ b/drivers/iommu/fsl_pamu_domain.c @@ -11,6 +11,7 @@ #include <linux/platform_device.h> #include <sysdev/fsl_pci.h> +#include <linux/pci.h> /* * Global spinlock that needs to be held while @@ -379,7 +380,21 @@ static struct iommu_group *get_shared_pci_device_group(struct pci_dev *pdev) bus = bus->parent; } - return NULL; + return iommu_group_alloc(); +} + +static int __is_pci_controller_parent(struct device *dev, void *data) +{ + struct pci_dev *pdev = to_pci_dev(dev); + struct pci_controller *pci_ctl = pci_bus_to_host(pdev->bus); + + return dev == pci_ctl->parent; +} + +static bool is_pci_controller_parent(struct device *dev) +{ + return bus_for_each_dev(&pci_bus_type, NULL, NULL, + __is_pci_controller_parent); } static struct iommu_group *get_pci_device_group(struct pci_dev *pdev) @@ -393,30 +408,12 @@ static struct iommu_group *get_pci_device_group(struct pci_dev *pdev) /* We can partition PCIe devices so assign device group to the device */ if (pci_endpt_partitioning) { group = pci_device_group(&pdev->dev); - - /* - * PCIe controller is not a paritionable entity - * free the controller device iommu_group. - */ - if (pci_ctl->parent->iommu_group) - iommu_group_remove_device(pci_ctl->parent); } else { /* * All devices connected to the controller will share the - * PCI controllers device group. If this is the first - * device to be probed for the pci controller, copy the - * device group information from the PCI controller device - * node and remove the PCI controller iommu group. - * For subsequent devices, the iommu group information can - * be obtained from sibling devices (i.e. from the bus_devices - * link list). + * same device group. */ - if (pci_ctl->parent->iommu_group) { - group = get_device_iommu_group(pci_ctl->parent); - iommu_group_remove_device(pci_ctl->parent); - } else { - group = get_shared_pci_device_group(pdev); - } + group = get_shared_pci_device_group(pdev); } if (!group) @@ -436,7 +433,8 @@ static struct iommu_group *fsl_pamu_device_group(struct device *dev) */ if (dev_is_pci(dev)) group = get_pci_device_group(to_pci_dev(dev)); - else if (of_get_property(dev->of_node, "fsl,liodn", &len)) + else if (of_get_property(dev->of_node, "fsl,liodn", &len) && + !is_pci_controller_parent(dev)) group = get_device_iommu_group(dev); return group; -- 2.40.1 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] iommu/fsl: Do not use iommu_group_remove_device() under ops->device_group() 2023-05-16 0:27 ` [PATCH 1/2] iommu/fsl: Do not use iommu_group_remove_device() under ops->device_group() Jason Gunthorpe @ 2023-05-16 15:00 ` Robin Murphy 2023-05-16 16:09 ` Jason Gunthorpe 0 siblings, 1 reply; 7+ messages in thread From: Robin Murphy @ 2023-05-16 15:00 UTC (permalink / raw) To: Jason Gunthorpe, iommu, Joerg Roedel, Will Deacon On 2023-05-16 01:27, Jason Gunthorpe wrote: > This API is expected to be used only by POWER and VFIO no-iommu that > manually manage the group lifecycle. It should not be called under > ops->device_group(). > > This is already buggy as is since the core code does not expect a probed > driver to loose it's iommu_group without also releasing the device. > > FSL seems to be trying to block the platform_device that represents the > pci_controller, eg the thing passed to fsl_add_bridge(), from having an > iommu_group. > > Instead of creating an iommu_group that we don't want and then later > removing it, just don't create it at all in the first place. > > For the 'pci_endpt_partitioning' case every PCI device already gets its > own iommu_group through the standard code, so it is unclear why having a > dedicated group for the controller could be problematic. > > For the other case, the controller group was being used to bizarrely > de-duplicate the group in it's hose. Instead just directly create a group > for the hose the first time we encounter it. The code already searches the > entire hose to find any iommu_group. Again, it is unclear why having the > pci_controller inside the same iommu_group as the PCI devices would be > harmful. It's harmful in that case because it prevents VFIO from working at all - even now that VFIO no longer rejects cross-bus groups on principle, the fsl-pci driver being bound to the platform device would then deny VFIO from taking ownership. > In any case, this is a cleaner way to not enable iommu support for the > platform_device. > > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> > --- > drivers/iommu/fsl_pamu_domain.c | 42 ++++++++++++++++----------------- > 1 file changed, 20 insertions(+), 22 deletions(-) > > diff --git a/drivers/iommu/fsl_pamu_domain.c b/drivers/iommu/fsl_pamu_domain.c > index bce37229709965..bf045f58cd50ae 100644 > --- a/drivers/iommu/fsl_pamu_domain.c > +++ b/drivers/iommu/fsl_pamu_domain.c > @@ -11,6 +11,7 @@ > > #include <linux/platform_device.h> > #include <sysdev/fsl_pci.h> > +#include <linux/pci.h> > > /* > * Global spinlock that needs to be held while > @@ -379,7 +380,21 @@ static struct iommu_group *get_shared_pci_device_group(struct pci_dev *pdev) > bus = bus->parent; > } > > - return NULL; > + return iommu_group_alloc(); > +} > + > +static int __is_pci_controller_parent(struct device *dev, void *data) > +{ > + struct pci_dev *pdev = to_pci_dev(dev); > + struct pci_controller *pci_ctl = pci_bus_to_host(pdev->bus); > + > + return dev == pci_ctl->parent; If I'm not mistaken, this is testing every *PCI* device to see any of them are their own host bridge's platform parent... > +} > + > +static bool is_pci_controller_parent(struct device *dev) > +{ > + return bus_for_each_dev(&pci_bus_type, NULL, NULL, > + __is_pci_controller_parent); for_each_pci_dev() might be easier to handle, however for what I think this is ultimately trying to do, device_find_child() would seem even better. > } > > static struct iommu_group *get_pci_device_group(struct pci_dev *pdev) > @@ -393,30 +408,12 @@ static struct iommu_group *get_pci_device_group(struct pci_dev *pdev) > /* We can partition PCIe devices so assign device group to the device */ > if (pci_endpt_partitioning) { > group = pci_device_group(&pdev->dev); > - > - /* > - * PCIe controller is not a paritionable entity > - * free the controller device iommu_group. > - */ > - if (pci_ctl->parent->iommu_group) > - iommu_group_remove_device(pci_ctl->parent); > } else { > /* > * All devices connected to the controller will share the > - * PCI controllers device group. If this is the first > - * device to be probed for the pci controller, copy the > - * device group information from the PCI controller device > - * node and remove the PCI controller iommu group. > - * For subsequent devices, the iommu group information can > - * be obtained from sibling devices (i.e. from the bus_devices > - * link list). > + * same device group. > */ > - if (pci_ctl->parent->iommu_group) { > - group = get_device_iommu_group(pci_ctl->parent); > - iommu_group_remove_device(pci_ctl->parent); > - } else { > - group = get_shared_pci_device_group(pdev); > - } > + group = get_shared_pci_device_group(pdev); It would be nice if we could still associate the "PCI group" directly with the pci_controller some other way, and avoid all the slightly-confusing bus walks altogether, but I don't see a sufficiently clean way to achieve that :( Thanks, Robin. > } > > if (!group) > @@ -436,7 +433,8 @@ static struct iommu_group *fsl_pamu_device_group(struct device *dev) > */ > if (dev_is_pci(dev)) > group = get_pci_device_group(to_pci_dev(dev)); > - else if (of_get_property(dev->of_node, "fsl,liodn", &len)) > + else if (of_get_property(dev->of_node, "fsl,liodn", &len) && > + !is_pci_controller_parent(dev)) > group = get_device_iommu_group(dev); > > return group; ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] iommu/fsl: Do not use iommu_group_remove_device() under ops->device_group() 2023-05-16 15:00 ` Robin Murphy @ 2023-05-16 16:09 ` Jason Gunthorpe 2023-05-16 18:24 ` Robin Murphy 0 siblings, 1 reply; 7+ messages in thread From: Jason Gunthorpe @ 2023-05-16 16:09 UTC (permalink / raw) To: Robin Murphy; +Cc: iommu, Joerg Roedel, Will Deacon On Tue, May 16, 2023 at 04:00:47PM +0100, Robin Murphy wrote: > On 2023-05-16 01:27, Jason Gunthorpe wrote: > > This API is expected to be used only by POWER and VFIO no-iommu that > > manually manage the group lifecycle. It should not be called under > > ops->device_group(). > > > > This is already buggy as is since the core code does not expect a probed > > driver to loose it's iommu_group without also releasing the device. > > > > FSL seems to be trying to block the platform_device that represents the > > pci_controller, eg the thing passed to fsl_add_bridge(), from having an > > iommu_group. > > > > Instead of creating an iommu_group that we don't want and then later > > removing it, just don't create it at all in the first place. > > > > For the 'pci_endpt_partitioning' case every PCI device already gets its > > own iommu_group through the standard code, so it is unclear why having a > > dedicated group for the controller could be problematic. > > > > For the other case, the controller group was being used to bizarrely > > de-duplicate the group in it's hose. Instead just directly create a group > > for the hose the first time we encounter it. The code already searches the > > entire hose to find any iommu_group. Again, it is unclear why having the > > pci_controller inside the same iommu_group as the PCI devices would be > > harmful. > > It's harmful in that case because it prevents VFIO from working at all - > even now that VFIO no longer rejects cross-bus groups on principle, the > fsl-pci driver being bound to the platform device would then deny VFIO from > taking ownership. I think that was true before.. Today we block out VFIO based on iommu_device_use_default_domain() which is called prior to probe. That function is a NOP if the iommu group hasn't already been setup. Thus the platform_device probed by fsl_pci_probe() does not block VFIO any more, even if it is in the same group. But, more broadly, we exclude things like PCI bridges from the VFIO check by using driver_managed_dma: index b7232c46b24481..6daf620b63a4d5 100644 --- a/arch/powerpc/sysdev/fsl_pci.c +++ b/arch/powerpc/sysdev/fsl_pci.c @@ -1353,6 +1353,7 @@ static struct platform_driver fsl_pci_driver = { .of_match_table = pci_ids, }, .probe = fsl_pci_probe, + .driver_managed_dma = true, }; static int __init fsl_pci_init(void) Even though it is a NOP in this case because of ordering, it still would be good for documentation purposes. > > +static int __is_pci_controller_parent(struct device *dev, void *data) > > +{ > > + struct pci_dev *pdev = to_pci_dev(dev); > > + struct pci_controller *pci_ctl = pci_bus_to_host(pdev->bus); > > + > > + return dev == pci_ctl->parent; > > If I'm not mistaken, this is testing every *PCI* device to see any of them > are their own host bridge's platform parent... er.. Hum, yes... How did I miss that :\ > It would be nice if we could still associate the "PCI group" directly with > the pci_controller some other way, and avoid all the slightly-confusing bus > walks altogether, but I don't see a sufficiently clean way to achieve that > :( Are you Ok with the above? Jason ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] iommu/fsl: Do not use iommu_group_remove_device() under ops->device_group() 2023-05-16 16:09 ` Jason Gunthorpe @ 2023-05-16 18:24 ` Robin Murphy 2023-05-16 19:52 ` Jason Gunthorpe 0 siblings, 1 reply; 7+ messages in thread From: Robin Murphy @ 2023-05-16 18:24 UTC (permalink / raw) To: Jason Gunthorpe; +Cc: iommu, Joerg Roedel, Will Deacon On 2023-05-16 17:09, Jason Gunthorpe wrote: > On Tue, May 16, 2023 at 04:00:47PM +0100, Robin Murphy wrote: >> On 2023-05-16 01:27, Jason Gunthorpe wrote: >>> This API is expected to be used only by POWER and VFIO no-iommu that >>> manually manage the group lifecycle. It should not be called under >>> ops->device_group(). >>> >>> This is already buggy as is since the core code does not expect a probed >>> driver to loose it's iommu_group without also releasing the device. >>> >>> FSL seems to be trying to block the platform_device that represents the >>> pci_controller, eg the thing passed to fsl_add_bridge(), from having an >>> iommu_group. >>> >>> Instead of creating an iommu_group that we don't want and then later >>> removing it, just don't create it at all in the first place. >>> >>> For the 'pci_endpt_partitioning' case every PCI device already gets its >>> own iommu_group through the standard code, so it is unclear why having a >>> dedicated group for the controller could be problematic. >>> >>> For the other case, the controller group was being used to bizarrely >>> de-duplicate the group in it's hose. Instead just directly create a group >>> for the hose the first time we encounter it. The code already searches the >>> entire hose to find any iommu_group. Again, it is unclear why having the >>> pci_controller inside the same iommu_group as the PCI devices would be >>> harmful. >> >> It's harmful in that case because it prevents VFIO from working at all - >> even now that VFIO no longer rejects cross-bus groups on principle, the >> fsl-pci driver being bound to the platform device would then deny VFIO from >> taking ownership. > > I think that was true before.. > > Today we block out VFIO based on iommu_device_use_default_domain() > which is called prior to probe. That function is a NOP if the iommu > group hasn't already been setup. > > Thus the platform_device probed by fsl_pci_probe() does not block VFIO > any more, even if it is in the same group. My reading of the current code is that it must absolutely be relying on fsl_pamu_init() having run before fsl_pci_init() (they're both at arch_initcall level, implying an icky fragile link-order dependency), such that the platform device group is already assigned by the time the PCI host driver binds and starts adding PCI devices, so for *their* iommu_probe, get_pci_device_group() can then hoick that group over into PCI-land to start propagating it around. Thus if all we did was remove the dodgy iommu_group_remove_device() calls, I believe use_default_domain *would* start getting in the way... > But, more broadly, we exclude things like PCI bridges from the VFIO > check by using driver_managed_dma: > > index b7232c46b24481..6daf620b63a4d5 100644 > --- a/arch/powerpc/sysdev/fsl_pci.c > +++ b/arch/powerpc/sysdev/fsl_pci.c > @@ -1353,6 +1353,7 @@ static struct platform_driver fsl_pci_driver = { > .of_match_table = pci_ids, > }, > .probe = fsl_pci_probe, > + .driver_managed_dma = true, > }; > > static int __init fsl_pci_init(void) > > Even though it is a NOP in this case because of ordering, it still > would be good for documentation purposes. ...however I think you're on to something here - if with this we can safely guarantee to negate any ownership problem as well, then flipping it all around to *always* rely on the platform device's group might be neatest of all. It's redundant in the pci_endpt_partitioning case, but I don't see how that would really hurt other than slightly changing group numbering to the annoyance of users with dodgy hard-coded scripts who mistakenly assumed group IDs were stable. Crucially, all of the other crap can go away and leave us with simply: if (pci_endpt_partitioning) return pci_device_group(&pdev->dev); else return iommu_group_get(pci_ctl->parent); Thanks, Robin. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] iommu/fsl: Do not use iommu_group_remove_device() under ops->device_group() 2023-05-16 18:24 ` Robin Murphy @ 2023-05-16 19:52 ` Jason Gunthorpe 0 siblings, 0 replies; 7+ messages in thread From: Jason Gunthorpe @ 2023-05-16 19:52 UTC (permalink / raw) To: Robin Murphy; +Cc: iommu, Joerg Roedel, Will Deacon On Tue, May 16, 2023 at 07:24:20PM +0100, Robin Murphy wrote: > My reading of the current code is that it must absolutely be relying on > fsl_pamu_init() having run before fsl_pci_init() (they're both at > arch_initcall level, implying an icky fragile link-order > dependency), Oh, I see, so this platform device would have a group assigned. > > index b7232c46b24481..6daf620b63a4d5 100644 > > --- a/arch/powerpc/sysdev/fsl_pci.c > > +++ b/arch/powerpc/sysdev/fsl_pci.c > > @@ -1353,6 +1353,7 @@ static struct platform_driver fsl_pci_driver = { > > .of_match_table = pci_ids, > > }, > > .probe = fsl_pci_probe, > > + .driver_managed_dma = true, > > }; > > static int __init fsl_pci_init(void) > > > > Even though it is a NOP in this case because of ordering, it still > > would be good for documentation purposes. > > ...however I think you're on to something here - if with this we can safely > guarantee to negate any ownership problem as well, then flipping it all > around to *always* rely on the platform device's group might be neatest of > all. Okay, it makes sense. I don't see anything wrong with this, the controller has to have a liodn and thus a group created for PCI to work. Though looking at this driver I don't really understand how it works.. The email discussion from 2013 for merging the driver made it clear that .. We can isolate the DMA access to the host based on the to the pci bus,device,function number. But I can't see where in this driver the RID gets relayed into any kind of HW programming?? When fsl_pamu_attach_device() sees a PCI device it jumps to the controller and discards the pci device entirely for the functions it calls? It seems to do the same thing regardless of what PCI device is attaching? Huh? Thanks, Jason ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 2/2] iommu/fsl: Always allocate a group for non-pci devices 2023-05-16 0:27 [PATCH 0/2] Remove iommu_group_remove_device() from fsl Jason Gunthorpe 2023-05-16 0:27 ` [PATCH 1/2] iommu/fsl: Do not use iommu_group_remove_device() under ops->device_group() Jason Gunthorpe @ 2023-05-16 0:27 ` Jason Gunthorpe 1 sibling, 0 replies; 7+ messages in thread From: Jason Gunthorpe @ 2023-05-16 0:27 UTC (permalink / raw) To: iommu, Joerg Roedel, Robin Murphy, Will Deacon fsl_pamu_device_group() is only called if dev->iommu_group is NULL, so iommu_group_get() always returns NULL. Remove this test and just allocate a group. Call generic_device_group() for this like the other drivers. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> --- drivers/iommu/fsl_pamu_domain.c | 13 +------------ 1 file changed, 1 insertion(+), 12 deletions(-) diff --git a/drivers/iommu/fsl_pamu_domain.c b/drivers/iommu/fsl_pamu_domain.c index bf045f58cd50ae..f93306848a497f 100644 --- a/drivers/iommu/fsl_pamu_domain.c +++ b/drivers/iommu/fsl_pamu_domain.c @@ -335,17 +335,6 @@ int fsl_pamu_configure_l1_stash(struct iommu_domain *domain, u32 cpu) return ret; } -static struct iommu_group *get_device_iommu_group(struct device *dev) -{ - struct iommu_group *group; - - group = iommu_group_get(dev); - if (!group) - group = iommu_group_alloc(); - - return group; -} - static bool check_pci_ctl_endpt_part(struct pci_controller *pci_ctl) { u32 version; @@ -435,7 +424,7 @@ static struct iommu_group *fsl_pamu_device_group(struct device *dev) group = get_pci_device_group(to_pci_dev(dev)); else if (of_get_property(dev->of_node, "fsl,liodn", &len) && !is_pci_controller_parent(dev)) - group = get_device_iommu_group(dev); + group = generic_device_group(dev); return group; } -- 2.40.1 ^ permalink raw reply related [flat|nested] 7+ messages in thread
end of thread, other threads:[~2023-05-16 19:52 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-05-16 0:27 [PATCH 0/2] Remove iommu_group_remove_device() from fsl Jason Gunthorpe 2023-05-16 0:27 ` [PATCH 1/2] iommu/fsl: Do not use iommu_group_remove_device() under ops->device_group() Jason Gunthorpe 2023-05-16 15:00 ` Robin Murphy 2023-05-16 16:09 ` Jason Gunthorpe 2023-05-16 18:24 ` Robin Murphy 2023-05-16 19:52 ` Jason Gunthorpe 2023-05-16 0:27 ` [PATCH 2/2] iommu/fsl: Always allocate a group for non-pci devices Jason Gunthorpe
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox