* Re: [BUG] irq_dispose_mapping after irq request failure
@ 2013-02-11 6:19 ` Michael Ellerman
0 siblings, 0 replies; 13+ messages in thread
From: Michael Ellerman @ 2013-02-11 6:19 UTC (permalink / raw)
To: Baruch Siach; +Cc: linux-kernel, linuxppc-dev, grant.likely
On Mon, Feb 11, 2013 at 07:31:00AM +0200, Baruch Siach wrote:
> Hi lkml,
Hi Baruch,
> The drivers/edac/mpc85xx_edac.c driver contains the following (abbreviated)
> code snippet it its .probe:
You dropped an important detail which is the preceeding line:
pdata->irq = irq_of_parse_and_map(op->dev.of_node, 0);
> res = devm_request_irq(&op->dev, pdata->irq,
> mpc85xx_pci_isr, IRQF_DISABLED,
> "[EDAC] PCI err", pci);
> if (res < 0) {
> irq_dispose_mapping(pdata->irq);
> goto err2;
> }
>
> Now, since the requested irq is already in use, and IRQF_SHARED is not set,
> devm_request_irq errors() out, which is OK. Less OK is the
> irq_dispose_mapping() call, which gives me this:
>
> EDAC PCI1: Giving out device to module 'MPC85xx_edac' controller 'mpc85xx_pci_err': DEV 'ffe0a000.pcie' (INTERRUPT)
> genirq: Flags mismatch irq 16. 00000020 ([EDAC] PCI err) vs. 00000020 ([EDAC] PCI err)
The hint here is to notice which other irq you're clashing with ^^
ie. yourself. Which is odd, that is the root of the problem.
The badness you're getting from irq_dispose_mapping() is caused because you're
disposing of that mapping which is currently still in use, by the same interrupt.
That is caused by a "feature" in the irq mapping code, where if you ask to map an
already mapped hwirq, it will give you back the same virq. So in your case when
you called irq_of_parse_and_map() it noticed that someone had already mapped
that hwirq, and gave you back an existing (in use) virq.
> mpc85xx_pci_err_probe: Unable to requiest irq 16 for MPC85xx PCI err
^
While you're there, can you fix the typo :)
> So, is irq_dispose_mapping() the right thing to do when irq request fails?
It's the right thing to do to undo the effect of irq_create_mapping(), or in your case irq_of_parse_and_map().
It just falls down in this case, because you're inadvertently disposing of something that's still in use.
> A simple grep shows that irq_dispose_mapping() calls are mostly limited to
> powerpc code. Is there a reason for that?
That's because the irq domain code began life as powerpc specific code. It's now become generic and will start to appear in more places.
cheers
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: [BUG] irq_dispose_mapping after irq request failure
2013-02-11 6:19 ` Michael Ellerman
@ 2013-02-11 6:44 ` Baruch Siach
-1 siblings, 0 replies; 13+ messages in thread
From: Baruch Siach @ 2013-02-11 6:44 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev, linux-kernel
Hi Michael,
On Mon, Feb 11, 2013 at 05:19:49PM +1100, Michael Ellerman wrote:
> On Mon, Feb 11, 2013 at 07:31:00AM +0200, Baruch Siach wrote:
[...]
> > mpc85xx_pci_err_probe: Unable to requiest irq 16 for MPC85xx PCI err
> While you're there, can you fix the typo :)
The patch fixing it is already queued at
http://git.kernel.org/?p=linux/kernel/git/bp/bp.git;a=commitdiff;h=e7d2c215e56dc9fa0a01e26f2acfc3d73c889ba3.
Thanks for your details explanation. I'll now try to figure out what's wrong
with my device tree.
baruch
--
http://baruch.siach.name/blog/ ~. .~ Tk Open Systems
=}------------------------------------------------ooO--U--Ooo------------{=
- baruch@tkos.co.il - tel: +972.2.679.5364, http://www.tkos.co.il -
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: [BUG] irq_dispose_mapping after irq request failure
@ 2013-02-11 6:44 ` Baruch Siach
0 siblings, 0 replies; 13+ messages in thread
From: Baruch Siach @ 2013-02-11 6:44 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linux-kernel, linuxppc-dev, grant.likely
Hi Michael,
On Mon, Feb 11, 2013 at 05:19:49PM +1100, Michael Ellerman wrote:
> On Mon, Feb 11, 2013 at 07:31:00AM +0200, Baruch Siach wrote:
[...]
> > mpc85xx_pci_err_probe: Unable to requiest irq 16 for MPC85xx PCI err
> While you're there, can you fix the typo :)
The patch fixing it is already queued at
http://git.kernel.org/?p=linux/kernel/git/bp/bp.git;a=commitdiff;h=e7d2c215e56dc9fa0a01e26f2acfc3d73c889ba3.
Thanks for your details explanation. I'll now try to figure out what's wrong
with my device tree.
baruch
--
http://baruch.siach.name/blog/ ~. .~ Tk Open Systems
=}------------------------------------------------ooO--U--Ooo------------{=
- baruch@tkos.co.il - tel: +972.2.679.5364, http://www.tkos.co.il -
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [BUG] irq_dispose_mapping after irq request failure
2013-02-11 6:19 ` Michael Ellerman
@ 2013-02-11 20:52 ` Grant Likely
-1 siblings, 0 replies; 13+ messages in thread
From: Grant Likely @ 2013-02-11 20:52 UTC (permalink / raw)
To: Michael Ellerman, Baruch Siach; +Cc: linuxppc-dev, linux-kernel
On Mon, 11 Feb 2013 17:19:49 +1100, Michael Ellerman <michael@ellerman.id.au> wrote:
> On Mon, Feb 11, 2013 at 07:31:00AM +0200, Baruch Siach wrote:
> > Hi lkml,
>
> Hi Baruch,
>
> > The drivers/edac/mpc85xx_edac.c driver contains the following (abbreviated)
> > code snippet it its .probe:
>
> You dropped an important detail which is the preceeding line:
>
> pdata->irq = irq_of_parse_and_map(op->dev.of_node, 0);
>
> > res = devm_request_irq(&op->dev, pdata->irq,
> > mpc85xx_pci_isr, IRQF_DISABLED,
> > "[EDAC] PCI err", pci);
> > if (res < 0) {
> > irq_dispose_mapping(pdata->irq);
> > goto err2;
> > }
> >
> > Now, since the requested irq is already in use, and IRQF_SHARED is not set,
> > devm_request_irq errors() out, which is OK. Less OK is the
> > irq_dispose_mapping() call, which gives me this:
> >
> > EDAC PCI1: Giving out device to module 'MPC85xx_edac' controller 'mpc85xx_pci_err': DEV 'ffe0a000.pcie' (INTERRUPT)
> > genirq: Flags mismatch irq 16. 00000020 ([EDAC] PCI err) vs. 00000020 ([EDAC] PCI err)
>
> The hint here is to notice which other irq you're clashing with ^^
> ie. yourself. Which is odd, that is the root of the problem.
>
> The badness you're getting from irq_dispose_mapping() is caused because you're
> disposing of that mapping which is currently still in use, by the same interrupt.
>
> That is caused by a "feature" in the irq mapping code, where if you ask to map an
> already mapped hwirq, it will give you back the same virq. So in your case when
> you called irq_of_parse_and_map() it noticed that someone had already mapped
> that hwirq, and gave you back an existing (in use) virq.
Really the irq mappings should be using reference counting. The existing
code is naive on this count and just releases the irq on the first call
to irq_dispose_mapping(). I've not gotten around to fixing that. Anyone
want to take that task on?
g.
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: [BUG] irq_dispose_mapping after irq request failure
@ 2013-02-11 20:52 ` Grant Likely
0 siblings, 0 replies; 13+ messages in thread
From: Grant Likely @ 2013-02-11 20:52 UTC (permalink / raw)
To: Michael Ellerman, Baruch Siach; +Cc: linux-kernel, linuxppc-dev
On Mon, 11 Feb 2013 17:19:49 +1100, Michael Ellerman <michael@ellerman.id.au> wrote:
> On Mon, Feb 11, 2013 at 07:31:00AM +0200, Baruch Siach wrote:
> > Hi lkml,
>
> Hi Baruch,
>
> > The drivers/edac/mpc85xx_edac.c driver contains the following (abbreviated)
> > code snippet it its .probe:
>
> You dropped an important detail which is the preceeding line:
>
> pdata->irq = irq_of_parse_and_map(op->dev.of_node, 0);
>
> > res = devm_request_irq(&op->dev, pdata->irq,
> > mpc85xx_pci_isr, IRQF_DISABLED,
> > "[EDAC] PCI err", pci);
> > if (res < 0) {
> > irq_dispose_mapping(pdata->irq);
> > goto err2;
> > }
> >
> > Now, since the requested irq is already in use, and IRQF_SHARED is not set,
> > devm_request_irq errors() out, which is OK. Less OK is the
> > irq_dispose_mapping() call, which gives me this:
> >
> > EDAC PCI1: Giving out device to module 'MPC85xx_edac' controller 'mpc85xx_pci_err': DEV 'ffe0a000.pcie' (INTERRUPT)
> > genirq: Flags mismatch irq 16. 00000020 ([EDAC] PCI err) vs. 00000020 ([EDAC] PCI err)
>
> The hint here is to notice which other irq you're clashing with ^^
> ie. yourself. Which is odd, that is the root of the problem.
>
> The badness you're getting from irq_dispose_mapping() is caused because you're
> disposing of that mapping which is currently still in use, by the same interrupt.
>
> That is caused by a "feature" in the irq mapping code, where if you ask to map an
> already mapped hwirq, it will give you back the same virq. So in your case when
> you called irq_of_parse_and_map() it noticed that someone had already mapped
> that hwirq, and gave you back an existing (in use) virq.
Really the irq mappings should be using reference counting. The existing
code is naive on this count and just releases the irq on the first call
to irq_dispose_mapping(). I've not gotten around to fixing that. Anyone
want to take that task on?
g.
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: [BUG] irq_dispose_mapping after irq request failure
2013-02-11 20:52 ` Grant Likely
@ 2013-02-12 0:51 ` Benjamin Herrenschmidt
-1 siblings, 0 replies; 13+ messages in thread
From: Benjamin Herrenschmidt @ 2013-02-12 0:51 UTC (permalink / raw)
To: Grant Likely; +Cc: Baruch Siach, linuxppc-dev, linux-kernel
On Mon, 2013-02-11 at 20:52 +0000, Grant Likely wrote:
> Really the irq mappings should be using reference counting. The existing
> code is naive on this count and just releases the irq on the first call
> to irq_dispose_mapping(). I've not gotten around to fixing that. Anyone
> want to take that task on?
Is this the best approach ?
The original idea was that there was no point disposing of mappings in most
cases and keeping the mapping around would provide a bit of stability of
interrupt numbers which might come in handy for debugging etc...
The few cases where disposing of a mapping might be useful is if the underlying
physical interrupts completely disappear, as in a cascaded controller gets
removed or that sort of thing, which is a very rare case... And even then...
Could you just make irq_dispose_mapping() check if the irq desc is
active and fail/WARN/BUG if it is ? I don't see the point of adding a refcount,
that feels overkill.
Cheers,
Ben.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [BUG] irq_dispose_mapping after irq request failure
@ 2013-02-12 0:51 ` Benjamin Herrenschmidt
0 siblings, 0 replies; 13+ messages in thread
From: Benjamin Herrenschmidt @ 2013-02-12 0:51 UTC (permalink / raw)
To: Grant Likely; +Cc: Michael Ellerman, Baruch Siach, linuxppc-dev, linux-kernel
On Mon, 2013-02-11 at 20:52 +0000, Grant Likely wrote:
> Really the irq mappings should be using reference counting. The existing
> code is naive on this count and just releases the irq on the first call
> to irq_dispose_mapping(). I've not gotten around to fixing that. Anyone
> want to take that task on?
Is this the best approach ?
The original idea was that there was no point disposing of mappings in most
cases and keeping the mapping around would provide a bit of stability of
interrupt numbers which might come in handy for debugging etc...
The few cases where disposing of a mapping might be useful is if the underlying
physical interrupts completely disappear, as in a cascaded controller gets
removed or that sort of thing, which is a very rare case... And even then...
Could you just make irq_dispose_mapping() check if the irq desc is
active and fail/WARN/BUG if it is ? I don't see the point of adding a refcount,
that feels overkill.
Cheers,
Ben.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [BUG] irq_dispose_mapping after irq request failure
2013-02-12 0:51 ` Benjamin Herrenschmidt
@ 2013-02-12 6:18 ` Michael Ellerman
-1 siblings, 0 replies; 13+ messages in thread
From: Michael Ellerman @ 2013-02-12 6:18 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: Baruch Siach, linuxppc-dev, linux-kernel
On Tue, Feb 12, 2013 at 11:51:13AM +1100, Benjamin Herrenschmidt wrote:
> On Mon, 2013-02-11 at 20:52 +0000, Grant Likely wrote:
> > Really the irq mappings should be using reference counting. The existing
> > code is naive on this count and just releases the irq on the first call
> > to irq_dispose_mapping(). I've not gotten around to fixing that. Anyone
> > want to take that task on?
>
> Is this the best approach ?
>
> The original idea was that there was no point disposing of mappings in most
> cases and keeping the mapping around would provide a bit of stability of
> interrupt numbers which might come in handy for debugging etc...
>
> The few cases where disposing of a mapping might be useful is if the underlying
> physical interrupts completely disappear, as in a cascaded controller gets
> removed or that sort of thing, which is a very rare case... And even then...
That may have been the intent, but we forgot to tell driver writers, ourselves
included.
> Could you just make irq_dispose_mapping() check if the irq desc is
> active and fail/WARN/BUG if it is ? I don't see the point of adding a refcount,
> that feels overkill.
I don't think you can, "active" is not well defined. Other code may have
done nothing other than create the mapping and remembered the virq,
which will break if you destroy the mapping. Or?
I agree refcounting is not fun. It'll end up with the same mess as
of_node_get/put() where practically every 2nd piece of code leaks
references.
I guess we can't go the other way, and say that mapping the same hwirq
twice is an error.
cheers
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [BUG] irq_dispose_mapping after irq request failure
@ 2013-02-12 6:18 ` Michael Ellerman
0 siblings, 0 replies; 13+ messages in thread
From: Michael Ellerman @ 2013-02-12 6:18 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Grant Likely, Baruch Siach, linuxppc-dev, linux-kernel
On Tue, Feb 12, 2013 at 11:51:13AM +1100, Benjamin Herrenschmidt wrote:
> On Mon, 2013-02-11 at 20:52 +0000, Grant Likely wrote:
> > Really the irq mappings should be using reference counting. The existing
> > code is naive on this count and just releases the irq on the first call
> > to irq_dispose_mapping(). I've not gotten around to fixing that. Anyone
> > want to take that task on?
>
> Is this the best approach ?
>
> The original idea was that there was no point disposing of mappings in most
> cases and keeping the mapping around would provide a bit of stability of
> interrupt numbers which might come in handy for debugging etc...
>
> The few cases where disposing of a mapping might be useful is if the underlying
> physical interrupts completely disappear, as in a cascaded controller gets
> removed or that sort of thing, which is a very rare case... And even then...
That may have been the intent, but we forgot to tell driver writers, ourselves
included.
> Could you just make irq_dispose_mapping() check if the irq desc is
> active and fail/WARN/BUG if it is ? I don't see the point of adding a refcount,
> that feels overkill.
I don't think you can, "active" is not well defined. Other code may have
done nothing other than create the mapping and remembered the virq,
which will break if you destroy the mapping. Or?
I agree refcounting is not fun. It'll end up with the same mess as
of_node_get/put() where practically every 2nd piece of code leaks
references.
I guess we can't go the other way, and say that mapping the same hwirq
twice is an error.
cheers
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [BUG] irq_dispose_mapping after irq request failure
2013-02-12 6:18 ` Michael Ellerman
@ 2013-02-12 8:53 ` Benjamin Herrenschmidt
-1 siblings, 0 replies; 13+ messages in thread
From: Benjamin Herrenschmidt @ 2013-02-12 8:53 UTC (permalink / raw)
To: Michael Ellerman; +Cc: Baruch Siach, linuxppc-dev, linux-kernel
On Tue, 2013-02-12 at 17:18 +1100, Michael Ellerman wrote:
>
> I don't think you can, "active" is not well defined. Other code may have
> done nothing other than create the mapping and remembered the virq,
> which will break if you destroy the mapping. Or?
Active as in "requested". Yes there's a potential problems with multiple
requests for mappings & shared interrupts. This is not a problem for PCI
on powerpc because we don't free those mappings afaik.
> I agree refcounting is not fun. It'll end up with the same mess as
> of_node_get/put() where practically every 2nd piece of code leaks
> references.
>
> I guess we can't go the other way, and say that mapping the same hwirq
> twice is an error.
Might be worth it, and force the sharing case to be handled at some kind
of upper level (bus or platform).
Ben.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [BUG] irq_dispose_mapping after irq request failure
@ 2013-02-12 8:53 ` Benjamin Herrenschmidt
0 siblings, 0 replies; 13+ messages in thread
From: Benjamin Herrenschmidt @ 2013-02-12 8:53 UTC (permalink / raw)
To: Michael Ellerman; +Cc: Grant Likely, Baruch Siach, linuxppc-dev, linux-kernel
On Tue, 2013-02-12 at 17:18 +1100, Michael Ellerman wrote:
>
> I don't think you can, "active" is not well defined. Other code may have
> done nothing other than create the mapping and remembered the virq,
> which will break if you destroy the mapping. Or?
Active as in "requested". Yes there's a potential problems with multiple
requests for mappings & shared interrupts. This is not a problem for PCI
on powerpc because we don't free those mappings afaik.
> I agree refcounting is not fun. It'll end up with the same mess as
> of_node_get/put() where practically every 2nd piece of code leaks
> references.
>
> I guess we can't go the other way, and say that mapping the same hwirq
> twice is an error.
Might be worth it, and force the sharing case to be handled at some kind
of upper level (bus or platform).
Ben.
^ permalink raw reply [flat|nested] 13+ messages in thread