* [PATCH v2] PCI: mvebu: Dispose INTx IRQs before to removing INTx domain
@ 2024-07-11 13:25 Marek Behún
2024-07-11 14:03 ` Andrew Lunn
2024-07-12 20:41 ` Bjorn Helgaas
0 siblings, 2 replies; 6+ messages in thread
From: Marek Behún @ 2024-07-11 13:25 UTC (permalink / raw)
To: Lorenzo Pieralisi, Krzysztof Wilczyński, Bjorn Helgaas,
Andrew Lunn, Gregory Clement
Cc: Thomas Petazzoni, Rob Herring, linux-pci, linux-arm-kernel,
Manivannan Sadhasivam, Marek Behún
From: Pali Rohár <pali@kernel.org>
The documentation for the irq_domain_remove() function says that all
mappings within the IRQ domain must be disposed before the domain is
removed.
Currently, the INTx IRQs are not disposed in pci-mvebu driver .remove()
method, which causes the kernel to crash when unloading the driver and
then reading /sys/kernel/debug/irq/irqs/<num> or /proc/interrupts.
Unmapping of the IRQs at this point of the .remove() method is safe,
since the PCIe bus is already unregistered, and all its devices are
unbound from their drivers and removed. If there was indeed any
remaining use of PCIe resources, then it would mean that PCIe hotplug
code is broken, and we have bigger problems.
Fixes: ec075262648f ("PCI: mvebu: Implement support for legacy INTx interrupts")
Reported-by: Hajo Noerenberg <hajo-linux-bugzilla@noerenberg.de>
Signed-off-by: Pali Rohár <pali@kernel.org>
Reviewed-by: Marek Behún <kabel@kernel.org>
[ Marek: refactored a little, added more explanation to commit message ]
Signed-off-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
---
Changes since v1:
- added explanation into commit message about why this is safe to do,
as suggested by Andy. The explanation originally comes from Pali:
https://lore.kernel.org/linux-arm-kernel/20220809133911.hqi7eyskcq2sojia@pali/
---
drivers/pci/controller/pci-mvebu.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/controller/pci-mvebu.c b/drivers/pci/controller/pci-mvebu.c
index 29fe09c99e7d..91a02b23aeb1 100644
--- a/drivers/pci/controller/pci-mvebu.c
+++ b/drivers/pci/controller/pci-mvebu.c
@@ -1683,8 +1683,15 @@ static void mvebu_pcie_remove(struct platform_device *pdev)
irq_set_chained_handler_and_data(irq, NULL, NULL);
/* Remove IRQ domains. */
- if (port->intx_irq_domain)
+ if (port->intx_irq_domain) {
+ for (int j = 0; j < PCI_NUM_INTX; j++) {
+ int virq = irq_find_mapping(port->intx_irq_domain, j);
+
+ if (virq > 0)
+ irq_dispose_mapping(virq);
+ }
irq_domain_remove(port->intx_irq_domain);
+ }
/* Free config space for emulated root bridge. */
pci_bridge_emul_cleanup(&port->bridge);
--
2.44.2
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH v2] PCI: mvebu: Dispose INTx IRQs before to removing INTx domain
2024-07-11 13:25 [PATCH v2] PCI: mvebu: Dispose INTx IRQs before to removing INTx domain Marek Behún
@ 2024-07-11 14:03 ` Andrew Lunn
2024-07-12 20:41 ` Bjorn Helgaas
1 sibling, 0 replies; 6+ messages in thread
From: Andrew Lunn @ 2024-07-11 14:03 UTC (permalink / raw)
To: Marek Behún
Cc: Lorenzo Pieralisi, Krzysztof Wilczyński, Bjorn Helgaas,
Gregory Clement, Thomas Petazzoni, Rob Herring, linux-pci,
linux-arm-kernel, Manivannan Sadhasivam
On Thu, Jul 11, 2024 at 03:25:44PM +0200, Marek Behún wrote:
> From: Pali Rohár <pali@kernel.org>
>
> The documentation for the irq_domain_remove() function says that all
> mappings within the IRQ domain must be disposed before the domain is
> removed.
>
> Currently, the INTx IRQs are not disposed in pci-mvebu driver .remove()
> method, which causes the kernel to crash when unloading the driver and
> then reading /sys/kernel/debug/irq/irqs/<num> or /proc/interrupts.
>
> Unmapping of the IRQs at this point of the .remove() method is safe,
> since the PCIe bus is already unregistered, and all its devices are
> unbound from their drivers and removed. If there was indeed any
> remaining use of PCIe resources, then it would mean that PCIe hotplug
> code is broken, and we have bigger problems.
>
> Fixes: ec075262648f ("PCI: mvebu: Implement support for legacy INTx interrupts")
> Reported-by: Hajo Noerenberg <hajo-linux-bugzilla@noerenberg.de>
> Signed-off-by: Pali Rohár <pali@kernel.org>
> Reviewed-by: Marek Behún <kabel@kernel.org>
> [ Marek: refactored a little, added more explanation to commit message ]
> Signed-off-by: Marek Behún <kabel@kernel.org>
> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Andrew
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] PCI: mvebu: Dispose INTx IRQs before to removing INTx domain
2024-07-11 13:25 [PATCH v2] PCI: mvebu: Dispose INTx IRQs before to removing INTx domain Marek Behún
2024-07-11 14:03 ` Andrew Lunn
@ 2024-07-12 20:41 ` Bjorn Helgaas
2024-07-13 10:33 ` Thomas Gleixner
1 sibling, 1 reply; 6+ messages in thread
From: Bjorn Helgaas @ 2024-07-12 20:41 UTC (permalink / raw)
To: Marek Behún
Cc: Lorenzo Pieralisi, Krzysztof Wilczyński, Bjorn Helgaas,
Andrew Lunn, Gregory Clement, Thomas Petazzoni, Rob Herring,
linux-pci, linux-arm-kernel, Manivannan Sadhasivam,
Pali Rohár, Thomas Gleixner, Marc Zyngier
[+cc Pali, seems like the author should be included,
Thomas, Marc since they actually know about IRQs, unlike me]
On Thu, Jul 11, 2024 at 03:25:44PM +0200, Marek Behún wrote:
> From: Pali Rohár <pali@kernel.org>
>
> The documentation for the irq_domain_remove() function says that all
> mappings within the IRQ domain must be disposed before the domain is
> removed.
>
> Currently, the INTx IRQs are not disposed in pci-mvebu driver .remove()
> method, which causes the kernel to crash when unloading the driver and
> then reading /sys/kernel/debug/irq/irqs/<num> or /proc/interrupts.
>
> Unmapping of the IRQs at this point of the .remove() method is safe,
> since the PCIe bus is already unregistered, and all its devices are
> unbound from their drivers and removed. If there was indeed any
> remaining use of PCIe resources, then it would mean that PCIe hotplug
> code is broken, and we have bigger problems.
>
> Fixes: ec075262648f ("PCI: mvebu: Implement support for legacy INTx interrupts")
> Reported-by: Hajo Noerenberg <hajo-linux-bugzilla@noerenberg.de>
Is there a URL for this report?
> Signed-off-by: Pali Rohár <pali@kernel.org>
> Reviewed-by: Marek Behún <kabel@kernel.org>
> [ Marek: refactored a little, added more explanation to commit message ]
> Signed-off-by: Marek Behún <kabel@kernel.org>
> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
> ---
> Changes since v1:
> - added explanation into commit message about why this is safe to do,
> as suggested by Andy. The explanation originally comes from Pali:
> https://lore.kernel.org/linux-arm-kernel/20220809133911.hqi7eyskcq2sojia@pali/
> ---
> drivers/pci/controller/pci-mvebu.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/controller/pci-mvebu.c b/drivers/pci/controller/pci-mvebu.c
> index 29fe09c99e7d..91a02b23aeb1 100644
> --- a/drivers/pci/controller/pci-mvebu.c
> +++ b/drivers/pci/controller/pci-mvebu.c
> @@ -1683,8 +1683,15 @@ static void mvebu_pcie_remove(struct platform_device *pdev)
> irq_set_chained_handler_and_data(irq, NULL, NULL);
>
> /* Remove IRQ domains. */
> - if (port->intx_irq_domain)
> + if (port->intx_irq_domain) {
> + for (int j = 0; j < PCI_NUM_INTX; j++) {
> + int virq = irq_find_mapping(port->intx_irq_domain, j);
> +
> + if (virq > 0)
> + irq_dispose_mapping(virq);
I am not an IRQ expert, so all I can really do is compare this to
usage in other drivers.
There are 20+ drivers in drivers/pci/controller, and I don't see
irq_dispose_mapping() usage similar to this elsewhere. Does that mean
most or all of the other drivers have a similar defect?
> + }
> irq_domain_remove(port->intx_irq_domain);
> + }
>
> /* Free config space for emulated root bridge. */
> pci_bridge_emul_cleanup(&port->bridge);
> --
> 2.44.2
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] PCI: mvebu: Dispose INTx IRQs before to removing INTx domain
2024-07-12 20:41 ` Bjorn Helgaas
@ 2024-07-13 10:33 ` Thomas Gleixner
2024-07-13 19:32 ` Bjorn Helgaas
0 siblings, 1 reply; 6+ messages in thread
From: Thomas Gleixner @ 2024-07-13 10:33 UTC (permalink / raw)
To: Bjorn Helgaas, Marek Behún
Cc: Lorenzo Pieralisi, Krzysztof Wilczyński, Bjorn Helgaas,
Andrew Lunn, Gregory Clement, Thomas Petazzoni, Rob Herring,
linux-pci, linux-arm-kernel, Manivannan Sadhasivam,
Pali Rohár, Marc Zyngier
On Fri, Jul 12 2024 at 15:41, Bjorn Helgaas wrote:
> On Thu, Jul 11, 2024 at 03:25:44PM +0200, Marek Behún wrote:
>> /* Remove IRQ domains. */
>> - if (port->intx_irq_domain)
>> + if (port->intx_irq_domain) {
>> + for (int j = 0; j < PCI_NUM_INTX; j++) {
>> + int virq = irq_find_mapping(port->intx_irq_domain, j);
>> +
>> + if (virq > 0)
>> + irq_dispose_mapping(virq);
>
> I am not an IRQ expert, so all I can really do is compare this to
> usage in other drivers.
>
> There are 20+ drivers in drivers/pci/controller, and I don't see
> irq_dispose_mapping() usage similar to this elsewhere. Does that mean
> most or all of the other drivers have a similar defect?
Right.
But the real question is why is such a mapping not torn down by the
entity (device, bridge, whatever) which set it up in the first place?
Thanks,
tglx
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] PCI: mvebu: Dispose INTx IRQs before to removing INTx domain
2024-07-13 10:33 ` Thomas Gleixner
@ 2024-07-13 19:32 ` Bjorn Helgaas
2024-07-13 19:56 ` Thomas Gleixner
0 siblings, 1 reply; 6+ messages in thread
From: Bjorn Helgaas @ 2024-07-13 19:32 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Marek Behún, Lorenzo Pieralisi, Krzysztof Wilczyński,
Bjorn Helgaas, Andrew Lunn, Gregory Clement, Thomas Petazzoni,
Rob Herring, linux-pci, linux-arm-kernel, Manivannan Sadhasivam,
Pali Rohár, Marc Zyngier
On Sat, Jul 13, 2024 at 12:33:29PM +0200, Thomas Gleixner wrote:
> On Fri, Jul 12 2024 at 15:41, Bjorn Helgaas wrote:
> > On Thu, Jul 11, 2024 at 03:25:44PM +0200, Marek Behún wrote:
> >> /* Remove IRQ domains. */
> >> - if (port->intx_irq_domain)
> >> + if (port->intx_irq_domain) {
> >> + for (int j = 0; j < PCI_NUM_INTX; j++) {
> >> + int virq = irq_find_mapping(port->intx_irq_domain, j);
> >> +
> >> + if (virq > 0)
> >> + irq_dispose_mapping(virq);
> >
> > I am not an IRQ expert, so all I can really do is compare this to
> > usage in other drivers.
> >
> > There are 20+ drivers in drivers/pci/controller, and I don't see
> > irq_dispose_mapping() usage similar to this elsewhere. Does that mean
> > most or all of the other drivers have a similar defect?
>
> Right.
>
> But the real question is why is such a mapping not torn down by the
> entity (device, bridge, whatever) which set it up in the first place?
Marek/Pali, the commit log mentions a crash when unloading. Do you
have a pointer to any details? Maybe there's a driver there that we
can fix?
Bjorn
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] PCI: mvebu: Dispose INTx IRQs before to removing INTx domain
2024-07-13 19:32 ` Bjorn Helgaas
@ 2024-07-13 19:56 ` Thomas Gleixner
0 siblings, 0 replies; 6+ messages in thread
From: Thomas Gleixner @ 2024-07-13 19:56 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Marek Behún, Lorenzo Pieralisi, Krzysztof Wilczyński,
Bjorn Helgaas, Andrew Lunn, Gregory Clement, Thomas Petazzoni,
Rob Herring, linux-pci, linux-arm-kernel, Manivannan Sadhasivam,
Pali Rohár, Marc Zyngier
On Sat, Jul 13 2024 at 14:32, Bjorn Helgaas wrote:
> On Sat, Jul 13, 2024 at 12:33:29PM +0200, Thomas Gleixner wrote:
>> > There are 20+ drivers in drivers/pci/controller, and I don't see
>> > irq_dispose_mapping() usage similar to this elsewhere. Does that mean
>> > most or all of the other drivers have a similar defect?
>>
>> Right.
>>
>> But the real question is why is such a mapping not torn down by the
>> entity (device, bridge, whatever) which set it up in the first place?
>
> Marek/Pali, the commit log mentions a crash when unloading. Do you
> have a pointer to any details? Maybe there's a driver there that we
> can fix?
Pali already explained it in his reply to me, but for some stupid reason
(my fault hitting reply instead of reply-all) this did not make the list.
Let me replay the discussion:
>>> Pali:
>>> I'm not expert neither. But IIRC it is because endpoint drivers do not
>>> call irq_dispose_mapping at their own for shared IRQs. And this is how
>>> shared PCI INTA-D interrupts are implemented in the kernel. First
>>> endpoint driver (e.g. wifi card) request for shared interrupt and kernel
>>> automatically creates irq mapping if it does not exist. Second endpoint
>>> driver (e.g. second wifi card) request for shared interrupt and kernel
>>> just returns existing mapping. And when the first endpoint driver is
>>> releasing its own irq handler it do not dispose irq mapping because it
>>> may be shared with other endpoint drivers (in this case with second wifi
>>> card). Seems that the owner of these shared mappings is the PCI
>>> controller driver and it has to do dispose them on its own removal (when
>>> releasing the domain for shared PCI IRQs).
>> tglx:
>> Working around this in a particular PCI controller driver is just wrong
>> then. This wants a common cleanup function so all affected drivers which
>> remove the INTX irqdomain can be fixed up without copy & pasta of the
>> very same code. Something like the below and then change the
>> irq_domain_remove() call for the INTX domain to use that function.
There was some additional back and forth but the actual patch which can
be used for both INTX and other, e.g. error reporting, domains is below.
Sorry for taking this offlist unintenitonally.
The background of all this is that initially PCI[e] controllers were not
removable/modular. Later on the whole modularization effort created this
problem.
Thanks,
tglx
---
--- a/drivers/pci/irq.c
+++ b/drivers/pci/irq.c
@@ -278,3 +278,19 @@ int __weak pcibios_alloc_irq(struct pci_
void __weak pcibios_free_irq(struct pci_dev *dev)
{
}
+
+#ifdef CONFIG_IRQDOMAIN
+void pci_remove_irqdomain(struct irqdomain *domain, unsigned int nr_irqs)
+{
+ /*
+ * Comment explaining the oddity of this.
+ */
+ for (unsigned int i = 0; j < nr_irqs; i++) {
+ int virq = irq_find_mapping(domain, i);
+
+ if (virq > 0)
+ irq_dispose_mapping(virq);
+ }
+ irq_domain_remove(domain);
+}
+#endif
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-07-13 19:56 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-11 13:25 [PATCH v2] PCI: mvebu: Dispose INTx IRQs before to removing INTx domain Marek Behún
2024-07-11 14:03 ` Andrew Lunn
2024-07-12 20:41 ` Bjorn Helgaas
2024-07-13 10:33 ` Thomas Gleixner
2024-07-13 19:32 ` Bjorn Helgaas
2024-07-13 19:56 ` Thomas Gleixner
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).