* Re: [RFC PATCH] driver core: platform: Call iommu_release_device in dma_cleanup
2024-08-29 12:05 [RFC PATCH] driver core: platform: Call iommu_release_device in dma_cleanup Sachin Parekh
@ 2024-08-29 13:52 ` Greg KH
2024-09-02 6:44 ` Sachin Parekh
2024-08-29 13:58 ` Robin Murphy
` (2 subsequent siblings)
3 siblings, 1 reply; 7+ messages in thread
From: Greg KH @ 2024-08-29 13:52 UTC (permalink / raw)
To: Sachin Parekh
Cc: rafael, will, robin.murphy, lokeshvutla, linux-kernel, iommu
On Thu, Aug 29, 2024 at 12:05:04PM +0000, Sachin Parekh wrote:
> Installing a kernel module that has an iommu node in its
> device tree increments corresponding iommu kernel module
> reference count during driver_register.
> Removing the same kernel module doesn't decrement the
> iommu reference count resulting in error while
> removing the iommu kernel module
Please wrap kernel changelog text at 72 columns like your editor asked
you to :)
> $ modprobe arm-smmu-v3
> $ modprobe test_module
> $ modprobe -r test_module
> $ modprobe -r arm-smmu-v3
> modprobe: can't unload module arm_smmu_v3: Resource temporarily unavailable
Does this happen for any in-kernel driver?
Why has this not been noticed before?
>
> Cause:
> platform_driver_register
> ...
> -> platform_dma_configure
> ...
> -> iommu_probe_device (Increments reference count)
>
> platform_driver_unregister
> ...
> -> platform_dma_cleanup
> ...
> -> No corresponding iommu_release_device call
>
> Fix:
> Call iommu_release_device in platform_dma_cleanup to remove the
> iommu from the corresponding kernel module
>
> Signed-off-by: Sachin Parekh <sachinparekh@google.com>
What commit id does this fix?
> ---
> drivers/base/platform.c | 3 +++
> drivers/iommu/iommu.c | 3 +--
> include/linux/iommu.h | 1 +
> 3 files changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> index 4c3ee6521ba5..c8125325a5e9 100644
> --- a/drivers/base/platform.c
> +++ b/drivers/base/platform.c
> @@ -1467,6 +1467,9 @@ static void platform_dma_cleanup(struct device *dev)
>
> if (!drv->driver_managed_dma)
> iommu_device_unuse_default_domain(dev);
> +
> + if (dev_of_node(dev))
> + iommu_release_device(dev);
Are you sure that all devices that pass this should have this call made?
How well was this tested on different systems?
thanks,
greg k-h
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [RFC PATCH] driver core: platform: Call iommu_release_device in dma_cleanup
2024-08-29 13:52 ` Greg KH
@ 2024-09-02 6:44 ` Sachin Parekh
0 siblings, 0 replies; 7+ messages in thread
From: Sachin Parekh @ 2024-09-02 6:44 UTC (permalink / raw)
To: Greg KH; +Cc: rafael, will, robin.murphy, lokeshvutla, linux-kernel, iommu
On Thu, Aug 29, 2024 at 7:22 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Thu, Aug 29, 2024 at 12:05:04PM +0000, Sachin Parekh wrote:
> > Installing a kernel module that has an iommu node in its
> > device tree increments corresponding iommu kernel module
> > reference count during driver_register.
> > Removing the same kernel module doesn't decrement the
> > iommu reference count resulting in error while
> > removing the iommu kernel module
>
> Please wrap kernel changelog text at 72 columns like your editor asked
> you to :)
I had wrapped while creating the patch but I suppose it got messed up
somewhere. Sorry about that, I will fix it next time.
> > $ modprobe arm-smmu-v3
> > $ modprobe test_module
> > $ modprobe -r test_module
> > $ modprobe -r arm-smmu-v3
> > modprobe: can't unload module arm_smmu_v3: Resource temporarily unavailable
>
> Does this happen for any in-kernel driver?
>
> Why has this not been noticed before?
>
As per Robin's explanation, IOMMU drivers are not intended to be removed so
that could be the reason why this hasn't been noticed.
> >
> > Cause:
> > platform_driver_register
> > ...
> > -> platform_dma_configure
> > ...
> > -> iommu_probe_device (Increments reference count)
> >
> > platform_driver_unregister
> > ...
> > -> platform_dma_cleanup
> > ...
> > -> No corresponding iommu_release_device call
> >
> > Fix:
> > Call iommu_release_device in platform_dma_cleanup to remove the
> > iommu from the corresponding kernel module
> >
> > Signed-off-by: Sachin Parekh <sachinparekh@google.com>
>
> What commit id does this fix?
>
I think these two commit IDs are relevant in this context:
512881eacfa72c2136b27b9934b7b27504a9efc2
07397df29e57cde5799af16e8f148ae10ed75285
> > ---
> > drivers/base/platform.c | 3 +++
> > drivers/iommu/iommu.c | 3 +--
> > include/linux/iommu.h | 1 +
> > 3 files changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> > index 4c3ee6521ba5..c8125325a5e9 100644
> > --- a/drivers/base/platform.c
> > +++ b/drivers/base/platform.c
> > @@ -1467,6 +1467,9 @@ static void platform_dma_cleanup(struct device *dev)
> >
> > if (!drv->driver_managed_dma)
> > iommu_device_unuse_default_domain(dev);
> > +
> > + if (dev_of_node(dev))
> > + iommu_release_device(dev);
>
> Are you sure that all devices that pass this should have this call made?
> How well was this tested on different systems?
>
I am not sure hence I created this patch to understand if this is the
correct approach.
Thanks,
Sachin
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC PATCH] driver core: platform: Call iommu_release_device in dma_cleanup
2024-08-29 12:05 [RFC PATCH] driver core: platform: Call iommu_release_device in dma_cleanup Sachin Parekh
2024-08-29 13:52 ` Greg KH
@ 2024-08-29 13:58 ` Robin Murphy
2024-09-02 6:17 ` Sachin Parekh
2024-09-01 13:28 ` kernel test robot
2024-09-01 13:49 ` kernel test robot
3 siblings, 1 reply; 7+ messages in thread
From: Robin Murphy @ 2024-08-29 13:58 UTC (permalink / raw)
To: Sachin Parekh, gregkh, rafael, will; +Cc: lokeshvutla, linux-kernel, iommu
On 2024-08-29 1:05 pm, Sachin Parekh wrote:
> Installing a kernel module that has an iommu node in its
> device tree increments corresponding iommu kernel module
> reference count during driver_register.
> Removing the same kernel module doesn't decrement the
> iommu reference count resulting in error while
> removing the iommu kernel module
>
> $ modprobe arm-smmu-v3
> $ modprobe test_module
> $ modprobe -r test_module
> $ modprobe -r arm-smmu-v3
> modprobe: can't unload module arm_smmu_v3: Resource temporarily unavailable
>
> Cause:
> platform_driver_register
> ...
> -> platform_dma_configure
> ...
> -> iommu_probe_device (Increments reference count)
>
> platform_driver_unregister
> ...
> -> platform_dma_cleanup
> ...
> -> No corresponding iommu_release_device call
>
> Fix:
> Call iommu_release_device in platform_dma_cleanup to remove the
> iommu from the corresponding kernel module
There is nothing to fix here, the current behaviour is correct and
expected. It appears the comment in of_iommu_configure() got lost, but
the equivalent in acpi_iommu_configure_id() is still there - the dodgy
iommu_probe_device() calls at those points still logically represent the
same operation which should have happened via iommu_bus_notifier() or
iommu_device_register(), it's just being replayed later for hacky reasons.
IOMMU groups are supposed to be tied to the lifetime of the struct
device, and unbinding a driver does not make the device itself go away -
if the devicetree says it exists, then the IOMMU layer knows that.
Modular IOMMU drivers are still not really intended to be removable,
unless all their client devices can actually be removed form the system
(and/or the IOMMU driver is being tested without any clients present).
Thanks,
Robin.
> Signed-off-by: Sachin Parekh <sachinparekh@google.com>
> ---
> drivers/base/platform.c | 3 +++
> drivers/iommu/iommu.c | 3 +--
> include/linux/iommu.h | 1 +
> 3 files changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> index 4c3ee6521ba5..c8125325a5e9 100644
> --- a/drivers/base/platform.c
> +++ b/drivers/base/platform.c
> @@ -1467,6 +1467,9 @@ static void platform_dma_cleanup(struct device *dev)
>
> if (!drv->driver_managed_dma)
> iommu_device_unuse_default_domain(dev);
> +
> + if (dev_of_node(dev))
> + iommu_release_device(dev);
> }
>
> static const struct dev_pm_ops platform_dev_pm_ops = {
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index ed6c5cb60c5a..495f548fd4b9 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -92,7 +92,6 @@ static const char * const iommu_group_resv_type_string[] = {
>
> static int iommu_bus_notifier(struct notifier_block *nb,
> unsigned long action, void *data);
> -static void iommu_release_device(struct device *dev);
> static struct iommu_domain *
> __iommu_group_domain_alloc(struct iommu_group *group, unsigned int type);
> static int __iommu_attach_device(struct iommu_domain *domain,
> @@ -663,7 +662,7 @@ static void __iommu_group_remove_device(struct device *dev)
> iommu_group_put(group);
> }
>
> -static void iommu_release_device(struct device *dev)
> +void iommu_release_device(struct device *dev)
> {
> struct iommu_group *group = dev->iommu_group;
>
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 04cbdae0052e..2f9cb7d9dadf 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -1049,6 +1049,7 @@ void dev_iommu_priv_set(struct device *dev, void *priv);
>
> extern struct mutex iommu_probe_device_lock;
> int iommu_probe_device(struct device *dev);
> +void iommu_release_device(struct device *dev);
>
> int iommu_dev_enable_feature(struct device *dev, enum iommu_dev_features f);
> int iommu_dev_disable_feature(struct device *dev, enum iommu_dev_features f);
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [RFC PATCH] driver core: platform: Call iommu_release_device in dma_cleanup
2024-08-29 13:58 ` Robin Murphy
@ 2024-09-02 6:17 ` Sachin Parekh
0 siblings, 0 replies; 7+ messages in thread
From: Sachin Parekh @ 2024-09-02 6:17 UTC (permalink / raw)
To: Robin Murphy; +Cc: gregkh, rafael, will, lokeshvutla, linux-kernel, iommu
On Thu, Aug 29, 2024 at 7:28 PM Robin Murphy <robin.murphy@arm.com> wrote:
>
> On 2024-08-29 1:05 pm, Sachin Parekh wrote:
> > Installing a kernel module that has an iommu node in its
> > device tree increments corresponding iommu kernel module
> > reference count during driver_register.
> > Removing the same kernel module doesn't decrement the
> > iommu reference count resulting in error while
> > removing the iommu kernel module
> >
> > $ modprobe arm-smmu-v3
> > $ modprobe test_module
> > $ modprobe -r test_module
> > $ modprobe -r arm-smmu-v3
> > modprobe: can't unload module arm_smmu_v3: Resource temporarily unavailable
> >
> > Cause:
> > platform_driver_register
> > ...
> > -> platform_dma_configure
> > ...
> > -> iommu_probe_device (Increments reference count)
> >
> > platform_driver_unregister
> > ...
> > -> platform_dma_cleanup
> > ...
> > -> No corresponding iommu_release_device call
> >
> > Fix:
> > Call iommu_release_device in platform_dma_cleanup to remove the
> > iommu from the corresponding kernel module
>
> There is nothing to fix here, the current behaviour is correct and
> expected. It appears the comment in of_iommu_configure() got lost, but
> the equivalent in acpi_iommu_configure_id() is still there - the dodgy
> iommu_probe_device() calls at those points still logically represent the
> same operation which should have happened via iommu_bus_notifier() or
> iommu_device_register(), it's just being replayed later for hacky reasons.
>
> IOMMU groups are supposed to be tied to the lifetime of the struct
> device, and unbinding a driver does not make the device itself go away -
> if the devicetree says it exists, then the IOMMU layer knows that.
> Modular IOMMU drivers are still not really intended to be removable,
> unless all their client devices can actually be removed form the system
> (and/or the IOMMU driver is being tested without any clients present).
>
Thanks Robin for explaining the intended behaviour and the context.
Does this mean that the iommu reference count should not be decremented
when we call platform_driver_unregister ?
In my scenario, platform_driver_register is incrementing the reference
count so I expected platform_driver_unregister to decrement the reference count
Even if the IOMMU drivers are not intended to be removed, the reference count
should be in accordance with the drivers that register the iommu group ?
Thanks,
Sachin
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC PATCH] driver core: platform: Call iommu_release_device in dma_cleanup
2024-08-29 12:05 [RFC PATCH] driver core: platform: Call iommu_release_device in dma_cleanup Sachin Parekh
2024-08-29 13:52 ` Greg KH
2024-08-29 13:58 ` Robin Murphy
@ 2024-09-01 13:28 ` kernel test robot
2024-09-01 13:49 ` kernel test robot
3 siblings, 0 replies; 7+ messages in thread
From: kernel test robot @ 2024-09-01 13:28 UTC (permalink / raw)
To: Sachin Parekh; +Cc: oe-kbuild-all
Hi Sachin,
[This is a private test report for your RFC patch.]
kernel test robot noticed the following build errors:
[auto build test ERROR on driver-core/driver-core-testing]
[also build test ERROR on driver-core/driver-core-next driver-core/driver-core-linus linus/master v6.11-rc6 next-20240830]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Sachin-Parekh/driver-core-platform-Call-iommu_release_device-in-dma_cleanup/20240829-200729
base: driver-core/driver-core-testing
patch link: https://lore.kernel.org/r/20240829120504.2976612-1-sachinparekh%40google.com
patch subject: [RFC PATCH] driver core: platform: Call iommu_release_device in dma_cleanup
config: openrisc-allnoconfig (https://download.01.org/0day-ci/archive/20240901/202409012112.6WITFrsU-lkp@intel.com/config)
compiler: or1k-linux-gcc (GCC) 14.1.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240901/202409012112.6WITFrsU-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202409012112.6WITFrsU-lkp@intel.com/
All errors (new ones prefixed by >>):
drivers/base/platform.c: In function 'platform_dma_cleanup':
>> drivers/base/platform.c:1472:17: error: implicit declaration of function 'iommu_release_device'; did you mean 'iommu_detach_device'? [-Wimplicit-function-declaration]
1472 | iommu_release_device(dev);
| ^~~~~~~~~~~~~~~~~~~~
| iommu_detach_device
vim +1472 drivers/base/platform.c
1463
1464 static void platform_dma_cleanup(struct device *dev)
1465 {
1466 struct platform_driver *drv = to_platform_driver(dev->driver);
1467
1468 if (!drv->driver_managed_dma)
1469 iommu_device_unuse_default_domain(dev);
1470
1471 if (dev_of_node(dev))
> 1472 iommu_release_device(dev);
1473 }
1474
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [RFC PATCH] driver core: platform: Call iommu_release_device in dma_cleanup
2024-08-29 12:05 [RFC PATCH] driver core: platform: Call iommu_release_device in dma_cleanup Sachin Parekh
` (2 preceding siblings ...)
2024-09-01 13:28 ` kernel test robot
@ 2024-09-01 13:49 ` kernel test robot
3 siblings, 0 replies; 7+ messages in thread
From: kernel test robot @ 2024-09-01 13:49 UTC (permalink / raw)
To: Sachin Parekh; +Cc: llvm, oe-kbuild-all
Hi Sachin,
[This is a private test report for your RFC patch.]
kernel test robot noticed the following build errors:
[auto build test ERROR on driver-core/driver-core-testing]
[also build test ERROR on driver-core/driver-core-next driver-core/driver-core-linus linus/master v6.11-rc6 next-20240830]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Sachin-Parekh/driver-core-platform-Call-iommu_release_device-in-dma_cleanup/20240829-200729
base: driver-core/driver-core-testing
patch link: https://lore.kernel.org/r/20240829120504.2976612-1-sachinparekh%40google.com
patch subject: [RFC PATCH] driver core: platform: Call iommu_release_device in dma_cleanup
config: um-allnoconfig (https://download.01.org/0day-ci/archive/20240901/202409012129.wNTAUtIZ-lkp@intel.com/config)
compiler: clang version 17.0.6 (https://github.com/llvm/llvm-project 6009708b4367171ccdbf4b5905cb6a803753fe18)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240901/202409012129.wNTAUtIZ-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202409012129.wNTAUtIZ-lkp@intel.com/
All errors (new ones prefixed by >>):
In file included from drivers/base/platform.c:15:
In file included from include/linux/of_irq.h:7:
In file included from include/linux/irq.h:20:
In file included from include/linux/io.h:14:
In file included from arch/um/include/asm/io.h:24:
include/asm-generic/io.h:548:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
548 | val = __raw_readb(PCI_IOBASE + addr);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:561:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
561 | val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
| ~~~~~~~~~~ ^
include/uapi/linux/byteorder/little_endian.h:37:51: note: expanded from macro '__le16_to_cpu'
37 | #define __le16_to_cpu(x) ((__force __u16)(__le16)(x))
| ^
In file included from drivers/base/platform.c:15:
In file included from include/linux/of_irq.h:7:
In file included from include/linux/irq.h:20:
In file included from include/linux/io.h:14:
In file included from arch/um/include/asm/io.h:24:
include/asm-generic/io.h:574:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
574 | val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
| ~~~~~~~~~~ ^
include/uapi/linux/byteorder/little_endian.h:35:51: note: expanded from macro '__le32_to_cpu'
35 | #define __le32_to_cpu(x) ((__force __u32)(__le32)(x))
| ^
In file included from drivers/base/platform.c:15:
In file included from include/linux/of_irq.h:7:
In file included from include/linux/irq.h:20:
In file included from include/linux/io.h:14:
In file included from arch/um/include/asm/io.h:24:
include/asm-generic/io.h:585:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
585 | __raw_writeb(value, PCI_IOBASE + addr);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:595:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
595 | __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:605:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
605 | __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:693:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
693 | readsb(PCI_IOBASE + addr, buffer, count);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:701:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
701 | readsw(PCI_IOBASE + addr, buffer, count);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:709:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
709 | readsl(PCI_IOBASE + addr, buffer, count);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:718:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
718 | writesb(PCI_IOBASE + addr, buffer, count);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:727:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
727 | writesw(PCI_IOBASE + addr, buffer, count);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:736:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
736 | writesl(PCI_IOBASE + addr, buffer, count);
| ~~~~~~~~~~ ^
>> drivers/base/platform.c:1472:3: error: call to undeclared function 'iommu_release_device'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
1472 | iommu_release_device(dev);
| ^
drivers/base/platform.c:1472:3: note: did you mean 'iommu_detach_device'?
include/linux/iommu.h:1115:20: note: 'iommu_detach_device' declared here
1115 | static inline void iommu_detach_device(struct iommu_domain *domain,
| ^
12 warnings and 1 error generated.
vim +/iommu_release_device +1472 drivers/base/platform.c
1463
1464 static void platform_dma_cleanup(struct device *dev)
1465 {
1466 struct platform_driver *drv = to_platform_driver(dev->driver);
1467
1468 if (!drv->driver_managed_dma)
1469 iommu_device_unuse_default_domain(dev);
1470
1471 if (dev_of_node(dev))
> 1472 iommu_release_device(dev);
1473 }
1474
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 7+ messages in thread