* Potential issue with (or misunderstanding of) of_irq_get()
@ 2023-05-19 11:02 Conor Dooley
2023-05-21 12:38 ` Marc Zyngier
0 siblings, 1 reply; 3+ messages in thread
From: Conor Dooley @ 2023-05-19 11:02 UTC (permalink / raw)
To: Thomas Gleixner, Rob Herring, Frank Rowand, Marc Zyngier
Cc: devicetree, linux-kernel, conor, daire.mcnamara
[-- Attachment #1: Type: text/plain, Size: 6298 bytes --]
Hey!
I've run into an issue with of_irq_get() while writing an irqchip driver
and I was hoping that by posting about it I might get some guidance as
to whether I just doing something fundamentally wrong in my code, or
if the specific case was just an oversight.
I've been trying to solve the issue that I pointed out here:
https://lore.kernel.org/linux-gpio/23a69be6-96d3-1c28-f1aa-555e38ff991e@microchip.com/
To spare reading that, the TL;DR is that the SoC has 3 GPIO controllers,
with 14, 24 and 32 GPIOs each. All 68 can be used for interrupts.
The PLIC only has 41 interrupts for GPIOs, so there's a bit of extra RTL
sitting between the GPIO controllers and the PLIC, that is runtime
configurable, deciding whether an GPIO gets a PLIC interrupt of its
own or shares an interrupt with other GPIOs from the same GPIO controller.
Since the interrupt router/mux is not part of the GPIO controller blocks,
I have written a driver for the it & changed the representation in the DT
to the below. For each of the 41 interrupts "consumed" by the driver
bound to the irqmux node, I have created a domain.
irqmux: interrupt-controller@20002054 {
interrupt-parent = <&plic>;
interrupts = <13>, <14>, <15>, <16>,
<17>, <18>, <19>, <20>,
<21>, <22>, <23>, <24>,
<25>, <26>, <27>, <28>,
<29>, <30>, <31>, <32>,
<33>, <34>, <35>, <36>,
<37>, <38>, <39>, <40>,
<41>, <42>, <43>, <44>,
<45>, <46>, <47>, <48>,
<49>, <50>, <51>, <52>,
<53>;
};
gpio0: gpio@20120000 {
interrupt-parent = <&irqmux>;
npios = <14>;
interrupts = <0>, <1>, <2>, <3>,
<4>, <5>, <6>, <7>,
<8>, <9>, <10>, <11>,
<12>, <13>;
};
gpio1: gpio@20121000 {
interrupt-parent = <&irqmux>;
npios = <24>;
interrupts = <32>, <33>, <34>, <35>,
<36>, <37>, <38>, <39>,
<40>, <41>, <42>, <43>,
<44>, <45>, <46>, <47>,
<48>, <49>, <50>, <51>,
<52>, <53>, <54>, <55>;
};
gpio2: gpio@20122000 {
interrupt-parent = <&irqmux>;
npios = <32>;
interrupts = <64>, <65>, <66>, <67>,
<68>, <69>, <70>, <71>,
<72>, <73>, <74>, <75>,
<76>, <77>, <78>, <79>,
<80>, <81>, <82>, <83>,
<84>, <85>, <86>, <87>,
<88>, <89>, <90>, <91>,
<92>, <93>, <94>, <95>;
};
This approach in DT allows the GPIO controller driver to not care about
the router/mux configuration, which makes sense to me as it is not part
of those IP blocks.
My irqchip driver was adding domains like so:
for (; i < MPFS_MUX_NUM_IRQS; i++) {
priv->irqchip_data[i].output_hwirq = i;
priv->irqchip_data[i].irq = irq_of_parse_and_map(node, i);
domain = irq_domain_add_linear(node, MPFS_MAX_IRQS_PER_GPIO,
&mpfs_irq_mux_nondirect_domain_ops,
&priv->irqchip_data[i]);
irq_set_chained_handler_and_data(priv->irqchip_data[i].irq,
mpfs_irq_mux_nondirect_handler,
&priv->irqchip_data[i]);
}
In my irqchip's select callback I check the struct irq_fwspec's param[0]
to determine which domain is actually responsible for it.
That's all working nicely & I was doing some cleanup before submitting,
when I noticed that debugfs complained about the fact that I had several
domains hanging off the same of device_node:
debugfs: File ':soc:interrupt-controller@20002054' in directory 'domains' already present!
debugfs: File ':soc:interrupt-controller@20002054' in directory 'domains' already present!
To get around that, I tried to switch to creating fwnodes instead,
one for each domain:
for (; i < MPFS_MUX_NUM_IRQS; i++) {
priv->irqchip_data[i].output_hwirq = i;
priv->irqchip_data[i].irq = irq_of_parse_and_map(node, i);
fwnode = irq_domain_alloc_named_id_fwnode("mpfs-irq-mux", i);
domain = irq_domain_create_linear(fwnode, MPFS_MAX_IRQS_PER_GPIO,
&mpfs_irq_mux_nondirect_domain_ops,
&priv->irqchip_data[i]);
irq_set_chained_handler_and_data(priv->irqchip_data[i].irq,
mpfs_irq_mux_nondirect_handler,
&priv->irqchip_data[i]);
}
That's grand for debugfs, but I then ran into a problem that made me feel
I had designed myself into an incorrect corner.
The GPIO driver requests interrupts using:
nirqs = of_irq_count(node);
for (i = 0; i < nirqs; i++)
ret = platform_get_irq(pdev, i);
producing the following call trace:
mpfs_gpio_probe()
platform_get_irq()
platform_get_irq_optional()
of_irq_get()
irq_find_host()
irq_find_matching_host()
irq_find_matching_fwnode()
irq_find_matching_fwspec()
of_irq_find_host() is problematic for me, because it calls into
irq_find_matching_fwspec() having created a struct irq_fwspec where it
only populated the fwnode.
In irq_find_matching_fwspec(), the check for fwspec->param_count will
always be false & my select callback is never invoked.
Since irq_find_matching_fwspec() falls back to the match callback if
param_count is zero, I have added a match callback:
static int mpfs_irq_mux_nondirect_match(struct irq_domain *d, struct device_node *node,
enum irq_domain_bus_token bus_token)
{
if(!of_device_is_compatible(node, "microchip,mpfs-gpio-irq-mux"))
return false;
return true;
}
But this is a hack, since I have 41 domains using the same match callback,
and will return true as long as just one of them has been created.
It is sufficient though to get me as far along irq_create_of_mapping()
which is called using a struct of_phandle_args, containing the params
array that I rely on.
I appear to have run into a fundamental assumption in of_irq_get(), which
makes me think that I have gone pretty badly wrong here somewhere.
Is this a bug/oversight/omission, or am I just doing something broken in
my code?
Given how long that code seems to be there, I am leading towards the fault
being mine (it usually is!).
The code for this (likely aberration of an) irqchip driver is here [1]
but hopefully the snippets/description is sufficient.
Thanks,
Conor.
1 - https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux.git/commit/?h=gpio-irq&id=d1d654bd3de925a6dc48d6e2222adb3d310b8f9d
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 3+ messages in thread* Re: Potential issue with (or misunderstanding of) of_irq_get()
2023-05-19 11:02 Potential issue with (or misunderstanding of) of_irq_get() Conor Dooley
@ 2023-05-21 12:38 ` Marc Zyngier
2023-05-22 11:56 ` Conor Dooley
0 siblings, 1 reply; 3+ messages in thread
From: Marc Zyngier @ 2023-05-21 12:38 UTC (permalink / raw)
To: Conor Dooley
Cc: Thomas Gleixner, Rob Herring, Frank Rowand, devicetree,
linux-kernel, conor, daire.mcnamara
On Fri, 19 May 2023 12:02:47 +0100,
Conor Dooley <conor.dooley@microchip.com> wrote:
>
> [1 <text/plain; us-ascii (quoted-printable)>]
> Hey!
>
> I've run into an issue with of_irq_get() while writing an irqchip driver
> and I was hoping that by posting about it I might get some guidance as
> to whether I just doing something fundamentally wrong in my code, or
> if the specific case was just an oversight.
>
> I've been trying to solve the issue that I pointed out here:
> https://lore.kernel.org/linux-gpio/23a69be6-96d3-1c28-f1aa-555e38ff991e@microchip.com/
>
> To spare reading that, the TL;DR is that the SoC has 3 GPIO controllers,
> with 14, 24 and 32 GPIOs each. All 68 can be used for interrupts.
> The PLIC only has 41 interrupts for GPIOs, so there's a bit of extra RTL
> sitting between the GPIO controllers and the PLIC, that is runtime
> configurable, deciding whether an GPIO gets a PLIC interrupt of its
> own or shares an interrupt with other GPIOs from the same GPIO controller.
>
> Since the interrupt router/mux is not part of the GPIO controller blocks,
> I have written a driver for the it & changed the representation in the DT
> to the below. For each of the 41 interrupts "consumed" by the driver
> bound to the irqmux node, I have created a domain.
In general, this feels a wee bit wrong.
From what I understand of the HW, it is expected that most of the GPIO
interrupt will be directly associated with a PLIC interrupt in an 1:1
fashion (only 68 - 41 + 1 = 28 interrupts will be muxed). So 40 GPIOs
could have a chance of being directly assigned to a PLIC input without
any muxing.
If you start allocating a domain per interrupt, you end-up actively
preventing the use of hierarchical domains, and you don't really
benefit from what the mux HW can do for you.
[...]
> This approach in DT allows the GPIO controller driver to not care about
> the router/mux configuration, which makes sense to me as it is not part
> of those IP blocks.
>
> My irqchip driver was adding domains like so:
>
> for (; i < MPFS_MUX_NUM_IRQS; i++) {
> priv->irqchip_data[i].output_hwirq = i;
>
> priv->irqchip_data[i].irq = irq_of_parse_and_map(node, i);
>
> domain = irq_domain_add_linear(node, MPFS_MAX_IRQS_PER_GPIO,
> &mpfs_irq_mux_nondirect_domain_ops,
> &priv->irqchip_data[i]);
>
> irq_set_chained_handler_and_data(priv->irqchip_data[i].irq,
> mpfs_irq_mux_nondirect_handler,
> &priv->irqchip_data[i]);
> }
>
> In my irqchip's select callback I check the struct irq_fwspec's param[0]
> to determine which domain is actually responsible for it.
Huh. In general, if you want to resort to 'select', you're doing
something that is a bit iffy.
>
> That's all working nicely & I was doing some cleanup before submitting,
> when I noticed that debugfs complained about the fact that I had several
> domains hanging off the same of device_node:
> debugfs: File ':soc:interrupt-controller@20002054' in directory 'domains' already present!
> debugfs: File ':soc:interrupt-controller@20002054' in directory 'domains' already present!
Of course. You get 41 domains with all the same node...
You really should only have one hierarchical domain that represents
all inputs. How you deal with the difference in handling probably
shouldn't be directly reflected at that level of the hierarchy, but
below the mux.
> To get around that, I tried to switch to creating fwnodes instead,
> one for each domain:
>
> for (; i < MPFS_MUX_NUM_IRQS; i++) {
> priv->irqchip_data[i].output_hwirq = i;
>
> priv->irqchip_data[i].irq = irq_of_parse_and_map(node, i);
>
> fwnode = irq_domain_alloc_named_id_fwnode("mpfs-irq-mux", i);
>
> domain = irq_domain_create_linear(fwnode, MPFS_MAX_IRQS_PER_GPIO,
> &mpfs_irq_mux_nondirect_domain_ops,
> &priv->irqchip_data[i]);
>
> irq_set_chained_handler_and_data(priv->irqchip_data[i].irq,
> mpfs_irq_mux_nondirect_handler,
> &priv->irqchip_data[i]);
> }
>
> That's grand for debugfs, but I then ran into a problem that made me feel
> I had designed myself into an incorrect corner.
Yup. Now that you have disassociated yourself from the firmware-based
naming, you cannot use it to drive the mapping and sh*t happens. The
thing is, named fwnode are only there as a band-aid to be able to
designate objects that have no fwnode representation.
And it goes downhill from there. My gut felling for this is that you
should try and build something that looks like this:
- the mux exposes a single hierarchical domain that is directly
connected to the PLIC.
- the first 40 interrupt allocations are serviced by simply allocating
a corresponding PLIC interrupt and configuring the mux to do its
job.
- all the 28 other interrupts must be muxed onto a single PLIC. For
these interrupts, you must make sure that the domain hierarchy gets
truncated at the MUX level (see irq_domain_disconnect_hierarchy()
for the gory details). They all get to be placed behind a chained
interrupt handler, with their own irqchip ops.
That way, no repainting of fwnodes, no select/match complexity, and
must of the interrupts get to benefit from the hierarchical setup
(such as being able to set their affinity).
Of course, all of this is assuming that the HW is able to deal with a
large number of interrupts muxed to a single one. If not, you may have
to use more that one of these, but the idea is the same.
Thoughts?
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 3+ messages in thread* Re: Potential issue with (or misunderstanding of) of_irq_get()
2023-05-21 12:38 ` Marc Zyngier
@ 2023-05-22 11:56 ` Conor Dooley
0 siblings, 0 replies; 3+ messages in thread
From: Conor Dooley @ 2023-05-22 11:56 UTC (permalink / raw)
To: Marc Zyngier
Cc: Thomas Gleixner, Rob Herring, Frank Rowand, devicetree,
linux-kernel, conor, daire.mcnamara
[-- Attachment #1: Type: text/plain, Size: 7189 bytes --]
Hey Marc,
Thanks for taking a look.
On Sun, May 21, 2023 at 01:38:11PM +0100, Marc Zyngier wrote:
> On Fri, 19 May 2023 12:02:47 +0100, Conor Dooley <conor.dooley@microchip.com> wrote:
> > I've run into an issue with of_irq_get() while writing an irqchip driver
> > and I was hoping that by posting about it I might get some guidance as
> > to whether I just doing something fundamentally wrong in my code, or
> > if the specific case was just an oversight.
> >
> > I've been trying to solve the issue that I pointed out here:
> > https://lore.kernel.org/linux-gpio/23a69be6-96d3-1c28-f1aa-555e38ff991e@microchip.com/
> >
> > To spare reading that, the TL;DR is that the SoC has 3 GPIO controllers,
> > with 14, 24 and 32 GPIOs each. All 68 can be used for interrupts.
> > The PLIC only has 41 interrupts for GPIOs, so there's a bit of extra RTL
> > sitting between the GPIO controllers and the PLIC, that is runtime
> > configurable, deciding whether an GPIO gets a PLIC interrupt of its
> > own or shares an interrupt with other GPIOs from the same GPIO controller.
> >
> > Since the interrupt router/mux is not part of the GPIO controller blocks,
> > I have written a driver for the it & changed the representation in the DT
> > to the below. For each of the 41 interrupts "consumed" by the driver
> > bound to the irqmux node, I have created a domain.
>
> In general, this feels a wee bit wrong.
>
> From what I understand of the HW, it is expected that most of the GPIO
> interrupt will be directly associated with a PLIC interrupt in an 1:1
> fashion (only 68 - 41 + 1 = 28 interrupts will be muxed). So 40 GPIOs
> could have a chance of being directly assigned to a PLIC input without
> any muxing.
Not quite.
Firstly, I made a silly maths mistake, and 14 + 24 + 32 is not 68.
There are 70 GPIOs on this SoC.
Probably more important though is that there are 3 mux interrupts
though, not only 1, so the "40 GPIOs" becomes 38. It's also the source
of some of the complexity in what I wrote, since each of the GPIO
controller's has a different interrupt that the non direct connections
go to.
There's a bit that "scared" me further on down, that I think came out of
this assumption.
> If you start allocating a domain per interrupt, you end-up actively
> preventing the use of hierarchical domains, and you don't really
> benefit from what the mux HW can do for you.
Honestly, I'm happy with being able to use more than 32 of the GPIOs for
interrupts. Setting affinity etc is a nice bonus on top of working in
the first place.
> > In my irqchip's select callback I check the struct irq_fwspec's param[0]
> > to determine which domain is actually responsible for it.
>
> Huh. In general, if you want to resort to 'select', you're doing
> something that is a bit iffy.
Aye, given the lack of other select callbacks, I figured I was going
down a potentially dodgy path.
> > That's all working nicely & I was doing some cleanup before submitting,
> > when I noticed that debugfs complained about the fact that I had several
> > domains hanging off the same of device_node:
> > debugfs: File ':soc:interrupt-controller@20002054' in directory 'domains' already present!
> > debugfs: File ':soc:interrupt-controller@20002054' in directory 'domains' already present!
>
> Of course. You get 41 domains with all the same node...
Aye, it is obvious in hindsight ;)
> > To get around that, I tried to switch to creating fwnodes instead,
> > one for each domain:
> >
> > for (; i < MPFS_MUX_NUM_IRQS; i++) {
> > priv->irqchip_data[i].output_hwirq = i;
> >
> > priv->irqchip_data[i].irq = irq_of_parse_and_map(node, i);
> >
> > fwnode = irq_domain_alloc_named_id_fwnode("mpfs-irq-mux", i);
> >
> > domain = irq_domain_create_linear(fwnode, MPFS_MAX_IRQS_PER_GPIO,
> > &mpfs_irq_mux_nondirect_domain_ops,
> > &priv->irqchip_data[i]);
> >
> > irq_set_chained_handler_and_data(priv->irqchip_data[i].irq,
> > mpfs_irq_mux_nondirect_handler,
> > &priv->irqchip_data[i]);
> > }
> >
> > That's grand for debugfs, but I then ran into a problem that made me feel
> > I had designed myself into an incorrect corner.
>
> Yup. Now that you have disassociated yourself from the firmware-based
> naming, you cannot use it to drive the mapping and sh*t happens. The
> thing is, named fwnode are only there as a band-aid to be able to
> designate objects that have no fwnode representation.
>
> And it goes downhill from there. My gut felling for this is that you
> should try and build something that looks like this:
>
> - the mux exposes a single hierarchical domain that is directly
> connected to the PLIC.
>
> - the first 40 interrupt allocations are serviced by simply allocating
> a corresponding PLIC interrupt and configuring the mux to do its
> job.
So you say "first" here and "configuring the mux to do its job" - this
scares me a little as the mux cannot be used to map arbitrary GPIOs to
arbitrary PLIC interrupts.
All it can do is pick which one of either GPIO2 or GPIO0/1 gets to use
the corresponding direct connection. For example, if bit 0 in the
control register is cleared, GPIO0's 0th GPIO gets the direct connection
& GPIO2's 0th GPIO gets muxed into the GPIO2 non-direct interrupt.
Ditto for bit 1, 2 etc. From bit 14 onwards, s/GPIO0/GPIO1/.
I'm not sure if your "first" here was temporal, IOW "for the first 40
interrupts you allocate, use a direct connection". If it was, then
that's not a possibility.
I'd also rather not edit the mux based on what Linux wants, since there
could well be some other program running in AMP on other cores that
would not appreciate it changing underneath its feet. I don't think that
makes things any more/less easy though, since temporal is already not
possible.
> - all the 28 other interrupts must be muxed onto a single PLIC. For
> these interrupts, you must make sure that the domain hierarchy gets
> truncated at the MUX level (see irq_domain_disconnect_hierarchy()
> for the gory details). They all get to be placed behind a chained
> interrupt handler, with their own irqchip ops.
Yeah, so this I cannot do per se since there is one of these mux
interrupts per GPIO controller. But it doesn't sound too difficult to
extend that idea to 3 different interrupts. Famous last words.
> That way, no repainting of fwnodes, no select/match complexity, and
> must of the interrupts get to benefit from the hierarchical setup
> (such as being able to set their affinity).
>
> Of course, all of this is assuming that the HW is able to deal with a
> large number of interrupts muxed to a single one. If not, you may have
> to use more that one of these, but the idea is the same.
>
> Thoughts?
Well, I'll have to go poking at this hierarchy truncation function that
you have pointed out & see how it works - but the idea all sounds doable
and less cumbersome than what I have right now.
Thanks for the pointers, I'll resurface with either a patchset or having
designed myself into another corner.
Conor.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2023-05-22 11:57 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-05-19 11:02 Potential issue with (or misunderstanding of) of_irq_get() Conor Dooley
2023-05-21 12:38 ` Marc Zyngier
2023-05-22 11:56 ` Conor Dooley
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).