* Re: [PATCH] irqchip/armada-370-xp: Implement SoC Error interrupts [not found] <20240814124537.29847-1-kabel@kernel.org> @ 2024-08-14 15:51 ` Thomas Gleixner 2024-08-14 16:43 ` Conor Dooley 0 siblings, 1 reply; 3+ messages in thread From: Thomas Gleixner @ 2024-08-14 15:51 UTC (permalink / raw) To: Marek Behún, Andrew Lunn, Gregory Clement, Sebastian Hesselbarth, linux-arm-kernel, arm, Andy Shevchenko, Hans de Goede, Ilpo Järvinen, Rob Herring, devicetree Cc: Marek Behún On Wed, Aug 14 2024 at 14:45, Marek Behún wrote: Cc+ device tree people. > + The code binds this new interrupt domain to the same device-tree node as > + the main interrupt domain. The main interrupt controller has its > + interrupts described by one argument in device-tree > + (#interrupt-cells = <1>), i.e.: > + > + interrupts-extended = <&mpic 8>; > + > + Because of backwards compatibility we cannot change this number of > + arguments, and so the SoC Error interrupts must also be described by > + this one number. > + > + Thus, to describe a SoC Error interrupt, one has to add the an offset > + to the SoC Error interrupt number. Offset 0x400 was chosen because the > + main controller supports at most 1024 interrupts (in theory; in practice > + it seems to be 116 interrupts on all supported platforms). An example of > + describing a SoC Error interrupt is > + > + interrupts-extended = <&mpic 0x404>; This looks like a horrible hack and I don't understand why this can't be a separate interrupt controller, which it is in the hardware. That controller utilizes interrupt 4 from the MPIC. But then my DT foo is limited, so I let the DT folks comment on that. > +static int mpic_soc_err_irq_set_affinity(struct irq_data *d, const struct cpumask *mask, bool force) > +{ > + unsigned int cpu; > + > + /* > + * TODO: The mpic->per_cpu region accesses CPU Local IC registers for CPU n when accessed > + * from CPU n. Thus if we want to access this registers from another CPU, we need to request > + * a function to be executed on CPU n. This is what we do here by calling smp_call_on_cpu(). > + * > + * Instead, we could access CPU Local IC registers by having CPU Local region of each CPU > + * mapped in the MPIC private data structure. We could do this either by extending the > + * register resource in the device-tree, or by computing the physical base address of those > + * regions relative to the main MPIC base address. That requires locking for those registers obviously. > + */ > + > + cpus_read_lock(); This code was clearly never tested with any debug enabled. set_affinity() is invoked with interrupts disabled and irq_desc::lock held. cpus_read_lock() can sleep... The mandatory debug options would have told you loud and clearly. > + /* First, disable the ERR IRQ on all cores */ > + for_each_online_cpu(cpu) > + smp_call_on_cpu(cpu, mpic_soc_err_irq_mask_on_cpu, d, true); Again. smp_call_on_cpu() invokes wait_for_completion(), which obviously can sleep. Also why do you want to do that on _ALL_ CPUs if there is only one you pick from the effective affinity mask? > + /* Then enable on one online core from the affinity mask */ > + cpu = cpumask_any_and(mask, cpu_online_mask); > + smp_call_on_cpu(cpu, mpic_soc_err_irq_unmask_on_cpu, d, true); Ditto. So you really want to map the registers so they are accessible cross CPU including locking. Alternatively pin the error interrupts to CPU0 which cannot be unplugged and be done with it. > +static int mpic_soc_err_irq_map(struct irq_domain *domain, unsigned int virq, irq_hw_number_t hwirq) > +{ > + struct mpic *mpic = domain->host_data; > + > + irq_set_chip_data(virq, mpic); > + > + mpic_soc_err_irq_mask(irq_get_irq_data(virq)); What for? It should be masked if it's not mapped, no? > + irq_set_status_flags(virq, IRQ_LEVEL); > + irq_set_chip_and_handler(virq, &mpic_soc_err_irq_chip, handle_level_irq); > + irq_set_probe(virq); > + > + return 0; > +} > +static int mpic_soc_err_xlate(struct irq_domain *domain, struct device_node *np, > + const u32 *spec, unsigned int count, > + unsigned long *hwirq, unsigned int *type) > +{ > + int err = irq_domain_xlate_onecell(domain, np, spec, count, hwirq, type); > + > + if (err) > + return err; > + > + *hwirq -= MPIC_SOC_ERR_IRQS_OFFSET; > + return 0; > +} > +static int __init mpic_soc_err_init(struct mpic *mpic, struct device_node *np) > +{ > + unsigned int nr_irqs; > + > + if (of_machine_is_compatible("marvell,armada-370-xp")) > + nr_irqs = 32; > + else > + nr_irqs = 64; > + > + mpic->soc_err_domain = irq_domain_add_hierarchy(mpic->domain, 0, nr_irqs, np, > + &mpic_soc_err_irq_ops, mpic); Why is this a hierarchical domain? That does not make any sense at all. The MPIC domain provides only the demultiplexing interrupt and is not involved in a domain hierarchy. Hierarchical domains are required when the top level domain depends on resources from the parent domain which need to be allocated when mapping an interrupt. e.g. on x86: vector - remap - PCI/MSI So the top level PCI/MSI domain requires resources from the remap domain and the remap domain requires a vector from the vector domain. These resources are per interrupt. During runtime there are also irq_*() callbacks which utilize callbacks from the parent domains. I.e. mask() invokes mask(parent) ... But that has nothing to do with demultiplexed interrupts because the underlying demultiplex interrupt is already there and the same for all interrupts in the demultiplexed domain. There is no callback dependency and no resource dependency at all. Thanks, tglx ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] irqchip/armada-370-xp: Implement SoC Error interrupts 2024-08-14 15:51 ` [PATCH] irqchip/armada-370-xp: Implement SoC Error interrupts Thomas Gleixner @ 2024-08-14 16:43 ` Conor Dooley 2024-08-15 10:15 ` Marek Behún 0 siblings, 1 reply; 3+ messages in thread From: Conor Dooley @ 2024-08-14 16:43 UTC (permalink / raw) To: Thomas Gleixner Cc: Marek Behún, Andrew Lunn, Gregory Clement, Sebastian Hesselbarth, linux-arm-kernel, arm, Andy Shevchenko, Hans de Goede, Ilpo Järvinen, Rob Herring, devicetree [-- Attachment #1: Type: text/plain, Size: 2083 bytes --] On Wed, Aug 14, 2024 at 05:51:33PM +0200, Thomas Gleixner wrote: > On Wed, Aug 14 2024 at 14:45, Marek Behún wrote: > > Cc+ device tree people. > > > + The code binds this new interrupt domain to the same device-tree node as > > + the main interrupt domain. The main interrupt controller has its > > + interrupts described by one argument in device-tree > > + (#interrupt-cells = <1>), i.e.: > > + > > + interrupts-extended = <&mpic 8>; > > + > > + Because of backwards compatibility we cannot change this number of > > + arguments, and so the SoC Error interrupts must also be described by > > + this one number. > > + > > + Thus, to describe a SoC Error interrupt, one has to add the an offset > > + to the SoC Error interrupt number. Offset 0x400 was chosen because the > > + main controller supports at most 1024 interrupts (in theory; in practice > > + it seems to be 116 interrupts on all supported platforms). An example of > > + describing a SoC Error interrupt is > > + > > + interrupts-extended = <&mpic 0x404>; > > This looks like a horrible hack and I don't understand why this can't be > a separate interrupt controller, which it is in the hardware. That > controller utilizes interrupt 4 from the MPIC. > > But then my DT foo is limited, so I let the DT folks comment on that. I'm guessing, not being author and all that, that since the registers for this SOC_ERR business are intermingled with those of the existing interrupt controller it is not possible to create a standalone node for the new controller with it's own reg entry, without having to redo the binding etc for the existing controller, ending up with some sort of syscon. Alternatively, I suppose a child node could work, but it wouldn't be much less of a hack than deciding that numbers > 400 are the SOC_ERR ones. I see arguments for doing it either way and Marek must have an opinion on doing it without a new node, given the comment suggesting that a previous revision did have a dedicated node and that approach was dropped. Cheers, Conor. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 228 bytes --] ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] irqchip/armada-370-xp: Implement SoC Error interrupts 2024-08-14 16:43 ` Conor Dooley @ 2024-08-15 10:15 ` Marek Behún 0 siblings, 0 replies; 3+ messages in thread From: Marek Behún @ 2024-08-15 10:15 UTC (permalink / raw) To: Conor Dooley, Thomas Gleixner Cc: Marek Behún, Andrew Lunn, Gregory Clement, Sebastian Hesselbarth, linux-arm-kernel, arm, Andy Shevchenko, Hans de Goede, Ilpo Järvinen, Rob Herring, devicetree On Wed, Aug 14, 2024 at 05:43:19PM +0100, Conor Dooley wrote: > On Wed, Aug 14, 2024 at 05:51:33PM +0200, Thomas Gleixner wrote: > > On Wed, Aug 14 2024 at 14:45, Marek Behún wrote: > > > > Cc+ device tree people. > > > > > + The code binds this new interrupt domain to the same device-tree node as > > > + the main interrupt domain. The main interrupt controller has its > > > + interrupts described by one argument in device-tree > > > + (#interrupt-cells = <1>), i.e.: > > > + > > > + interrupts-extended = <&mpic 8>; > > > + > > > + Because of backwards compatibility we cannot change this number of > > > + arguments, and so the SoC Error interrupts must also be described by > > > + this one number. > > > + > > > + Thus, to describe a SoC Error interrupt, one has to add the an offset > > > + to the SoC Error interrupt number. Offset 0x400 was chosen because the > > > + main controller supports at most 1024 interrupts (in theory; in practice > > > + it seems to be 116 interrupts on all supported platforms). An example of > > > + describing a SoC Error interrupt is > > > + > > > + interrupts-extended = <&mpic 0x404>; > > > > This looks like a horrible hack and I don't understand why this can't be > > a separate interrupt controller, which it is in the hardware. That > > controller utilizes interrupt 4 from the MPIC. > > > > But then my DT foo is limited, so I let the DT folks comment on that. > > I'm guessing, not being author and all that, that since the registers > for this SOC_ERR business are intermingled with those of the existing > interrupt controller it is not possible to create a standalone node for > the new controller with it's own reg entry, without having to redo > the binding etc for the existing controller, ending up with some sort of > syscon. > Alternatively, I suppose a child node could work, but it wouldn't be much > less of a hack than deciding that numbers > 400 are the SOC_ERR ones. > I see arguments for doing it either way and Marek must have an opinion > on doing it without a new node, given the comment suggesting that a > previous revision did have a dedicated node and that approach was > dropped. This is exactly the reason. Pali's original code required creating another interrupt-controller node (soc_err) within the mpic node: mpic: interrupt-controller@20a00 { compatible = "marvell,mpic"; reg = <0x20a00 0x2d0>, <0x21070 0x58>; #interrupt-cells = <1>; interrupt-controller; msi-controller; interrupts = <GIC_PPI 15 IRQ_TYPE_LEVEL_HIGH>; soc_err: interrupt-controller@20 { interrupt-controller; #interrupt-cells = <1>; }; }; I decided against it, because to do this correctly in device-tree bindings, it gets unnecessarily complicated: - it requires #address-cells and #size-cells within the mpic node - the `interrupt-controller` property of mpic node can in some interpretations clash with the `interrupt-controller@20` name of the child node - the registers of soc_err and mpic overlap - the child note should have a compatible string if we want to be proper I also did consider syscon, but IMO it just adds unnecessary burden to backwards compatibility of device-trees. The solution with the 0x400 offset solves the backwards compatiblity issue. Marek ^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2024-08-15 10:15 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20240814124537.29847-1-kabel@kernel.org>
2024-08-14 15:51 ` [PATCH] irqchip/armada-370-xp: Implement SoC Error interrupts Thomas Gleixner
2024-08-14 16:43 ` Conor Dooley
2024-08-15 10:15 ` Marek Behún
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox