From: Herve Codina <herve.codina@bootlin.com>
To: Rob Herring <robh@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
"Rafael J. Wysocki" <rafael@kernel.org>,
Bjorn Helgaas <bhelgaas@google.com>,
Lizhi Hou <lizhi.hou@amd.com>, Max Zhen <max.zhen@amd.com>,
Sonal Santan <sonal.santan@amd.com>,
Stefano Stabellini <stefano.stabellini@xilinx.com>,
Jonathan Cameron <Jonathan.Cameron@huawei.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
PCI <linux-pci@vger.kernel.org>,
Allan Nielsen <allan.nielsen@microchip.com>,
Horatiu Vultur <horatiu.vultur@microchip.com>,
Steen Hegelund <steen.hegelund@microchip.com>,
Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Subject: Re: [PATCH v2 0/2] Attach DT nodes to existing PCI devices
Date: Tue, 5 Dec 2023 09:04:52 +0100 [thread overview]
Message-ID: <20231205090452.7c601eb5@bootlin.com> (raw)
In-Reply-To: <CAL_JsqJJ64513pyQggU71agTzawNWPpm6ZpWMB6e0zu-tWL8yw@mail.gmail.com>
On Mon, 4 Dec 2023 17:03:21 -0600
Rob Herring <robh@kernel.org> wrote:
> On Mon, Dec 4, 2023 at 9:30 AM Herve Codina <herve.codina@bootlin.com> wrote:
> >
> > Hi Rob,
> >
> > On Mon, 4 Dec 2023 07:59:09 -0600
> > Rob Herring <robh@kernel.org> wrote:
> >
> > [...]
> >
> > > > > diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
> > > > > index 9c2137dae429..46b252bbe500 100644
> > > > > --- a/drivers/pci/bus.c
> > > > > +++ b/drivers/pci/bus.c
> > > > > @@ -342,8 +342,6 @@ void pci_bus_add_device(struct pci_dev *dev)
> > > > > */
> > > > > pcibios_bus_add_device(dev);
> > > > > pci_fixup_device(pci_fixup_final, dev);
> > > > > - if (pci_is_bridge(dev))
> > > > > - of_pci_make_dev_node(dev);
> > > > > pci_create_sysfs_dev_files(dev);
> > > > > pci_proc_attach_device(dev);
> > > > > pci_bridge_d3_update(dev);
> > > > > diff --git a/drivers/pci/of.c b/drivers/pci/of.c
> > > > > index 51e3dd0ea5ab..e15eaf0127fc 100644
> > > > > --- a/drivers/pci/of.c
> > > > > +++ b/drivers/pci/of.c
> > > > > @@ -31,6 +31,8 @@ int pci_set_of_node(struct pci_dev *dev)
> > > > > return 0;
> > > > >
> > > > > node = of_pci_find_child_device(dev->bus->dev.of_node, dev->devfn);
> > > > > + if (!node && pci_is_bridge(dev))
> > > > > + of_pci_make_dev_node(dev);
> > > > > if (!node)
> > > > > return 0;
> > > >
> > > > Maybe it is too early.
> > > > of_pci_make_dev_node() creates a node and fills some properties based on
> > > > some already set values available in the PCI device such as its struct resource
> > > > values.
> > > > We need to have some values set by the PCI infra in order to create our DT node
> > > > with correct values.
> > >
> > > Indeed, that's probably the issue I'm having. In that case,
> > > DECLARE_PCI_FIXUP_HEADER should work. That's later, but still before
> > > device_add().
> > >
> > > I think modifying sysfs after device_add() is going to race with
> > > userspace. Userspace is notified of a new device, and then the of_node
> > > link may or may not be there when it reads sysfs. Also, not sure if
> > > we'll need DT modaliases with PCI devices, but they won't work if the
> > > DT node is not set before device_add().
> >
> > Ok, we can try using DECLARE_PCI_FIXUP_HEADER.
> > On your side, is moving from DECLARE_PCI_FIXUP_EARLY to DECLARE_PCI_FIXUP_HEADER
> > fix your QEMU unittest ?
>
> No...
>
> And testing the bridge part crashes. That's because there's a
> dependency on the bridge->subordinate to write out bus-range and
> interrupt-map. I think the fix there is we should just not write those
> properties. The bus range isn't needed because the kernel does its own
> assignments. For interrupt-map, it is only needed if "interrupts" is
> present in the child devices. If not present, then the standard PCI
> swizzling is used. Alternatively, I think the interrupt mapping could
> be simplified to just implement the standard swizzling at each level
> which isn't dependent on any of the devices on the bus. I gave that a
> go where each interrupt-map just points to the parent bridge, but ran
> into an issue that the bridge nodes don't have a phandle. That should
> be fixable, but I'd rather go with the first option. I suppose that
> depends on how the interrupts downstream of the PCI device need to get
> resolved. It could be that the PCI device serves as the interrupt
> controller and can resolve the parent interrupt on its own (which may
> be needed for ACPI host anyways).
About interrupt, I am a bit stuck on my side.
My dtso (applied at PCI device level) contains the following:
fragment@0 {
target-path="";
__overlay__ {
pci-ep-bus@0 {
compatible = "simple-bus";
#address-cells = <1>;
#size-cells = <1>;
/*
* map @0xe2000000 (32MB) to BAR0 (CPU)
* map @0xe0000000 (16MB) to BAR1 (AMBA)
*/
ranges = <0xe2000000 0x00 0x00 0x00 0x2000000
0xe0000000 0x01 0x00 0x00 0x1000000>;
itc: itc {
compatible = "microchip,lan966x-itc";
#interrupt-cells = <1>;
interrupt-controller;
reg = <0xe00c0120 0x190>;
};
...
};
};
};
I have a 'simple-bus' with a 'ranges' property to translate the BAR addresses
then several devices. Among them a interrupt controller (itc). Its parent
interrupt is the one used by the PCI device (INTA).
I cannot describe this parent interrupt in the dtso because to that I need the
parent interrupt phandle which will be know only at runtime.
Of course, I can modified the overlay applied to tweak the 'interrupt' and
'interrupt-parent' in the itc node from my PCI device driver at runtime but I
would like to avoid this kind of tweak in the PCI device driver.
This kind of tweak is overlay dependent and needs to be done by each PCI
device driver that need to work with interrupts.
For BAR addresses translation, we use the 'ranges' property at the PCI device
node to translate 0 0 0 to BAR0, 1 0 0 to BAR1, ...
What do you think about a new 'irq-ranges' property to translate the irq number
and get the irq parent controller base.
irq-ranges = <child_irq_spec parent_irq_spec length>;
The idea would be to translate the child irq specifier (irq number +
possible flags) parent DT node.
if 'irq-ranges' present in parent translate irq spec then:
1) 'interrupt' present in parent.
In that case, if the translation match, we use the translated irq spec
and resolve the parent interrupt controller using existing ways from this
node. If it does not match: error.
2) 'interrupt-map' present in parent.
We use existing ways from this node with the translated irq spec to get the
parent interrupt controller.
3) 'interrupt-controller' present in parent.
We use this node as parent interrupt controller and the translated irq spec
4) no specific property present, update parent taking the parent's parent and
go again.
With this the overlay becomes:
pci-ep-bus@0 {
compatible = "simple-bus";
#address-cells = <1>;
#size-cells = <1>;
#interrupt-cells = <1>;
/*
* map @0xe2000000 (32MB) to BAR0 (CPU)
* map @0xe0000000 (16MB) to BAR1 (AMBA)
*/
ranges = <0xe2000000 0x00 0x00 0x00 0x2000000
0xe0000000 0x01 0x00 0x00 0x1000000>;
/* Translate PCI IRQ*/
irq-ranges = <1 1 1>;
itc: itc {
compatible = "microchip,lan966x-itc";
#interrupt-cells = <1>;
interrupts = <1>;
interrupt-controller;
reg = <0xe00c0120 0x190>;
};
...
};
And the PCI device node built at runtime:
dev@0,0 {
#address-cells = <0x03>;
interrupts = <0x01>; <-- use parent interrupt-map
#size-cells = <0x02>;
compatible = "pci1055,9660\0pciclass,020000\0pciclass,0200";
ranges = ...;
irq-ranges = <1 0x01 1>
reg = <0x10000 0x00 0x00 0x00 0x00>;
};
or
dev@0,0 {
#address-cells = <0x03>;
interrupts = <NN>;
interrupts-parent = <phandle'
#size-cells = <0x02>;
compatible = "pci1055,9660\0pciclass,020000\0pciclass,0200";
ranges = ...;
irq-ranges = <1 NN 1>
reg = <0x10000 0x00 0x00 0x00 0x00>;
};
In the second case, interrup-map in bridges nodes is still needed to build the
'interrupts' / 'interrupts-parent' in devices nodes.
and irq-ranges (if it makes sense on your side) allow to get and interrupt from
the overlay without the need to know the irq parent phandle.
Regards,
Hervé
>
> > We have to note that between the pci_fixup_device(pci_fixup_header, dev) call
> > and the device_add() call, the call to pci_set_msi_domain() is present.
> > MSIs are not supported currently but in the future ...
>
> MSI's aren't ever described in PCI nodes. Only the host bridge. So I
> don't think we should have problems there.
>
> > Related to DT modaliases, I don't think they are needed.
> > All drivers related to PCI device should be declared as pci_driver.
> > Correct me if I am wrong but I think that the core PCI will load the correct
> > module without any DT modalias.
>
> Yes, you are probably right.
>
> Rob
--
Hervé Codina, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
next prev parent reply other threads:[~2023-12-05 8:04 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-30 16:56 [PATCH v2 0/2] Attach DT nodes to existing PCI devices Herve Codina
2023-11-30 16:56 ` [PATCH v2 1/2] driver core: Introduce device_{add,remove}_of_node() Herve Codina
2023-11-30 16:56 ` [PATCH v2 2/2] PCI: of: Attach created of_node to existing device Herve Codina
2023-12-01 22:49 ` Bjorn Helgaas
2023-12-01 22:26 ` [PATCH v2 0/2] Attach DT nodes to existing PCI devices Rob Herring
2023-12-01 22:45 ` Bjorn Helgaas
2023-12-04 16:48 ` Lizhi Hou
2023-12-04 12:43 ` Herve Codina
2023-12-04 13:59 ` Rob Herring
2023-12-04 15:30 ` Herve Codina
2023-12-04 23:03 ` Rob Herring
2023-12-05 8:04 ` Herve Codina [this message]
2023-12-07 22:51 ` Rob Herring
2023-12-08 8:48 ` Herve Codina
2023-12-14 14:31 ` Herve Codina
2024-03-19 14:41 ` Herve Codina
2023-12-05 18:53 ` Lizhi Hou
2023-12-15 13:52 ` Herve Codina
2024-03-19 15:25 ` Bjorn Helgaas
2024-03-19 16:34 ` Herve Codina
2024-03-19 16:54 ` Bjorn Helgaas
2024-04-10 21:41 ` Rob Herring
2024-04-11 14:05 ` Herve Codina
2024-04-11 20:57 ` Bjorn Helgaas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20231205090452.7c601eb5@bootlin.com \
--to=herve.codina@bootlin.com \
--cc=Jonathan.Cameron@huawei.com \
--cc=allan.nielsen@microchip.com \
--cc=bhelgaas@google.com \
--cc=gregkh@linuxfoundation.org \
--cc=horatiu.vultur@microchip.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=lizhi.hou@amd.com \
--cc=max.zhen@amd.com \
--cc=rafael@kernel.org \
--cc=robh@kernel.org \
--cc=sonal.santan@amd.com \
--cc=steen.hegelund@microchip.com \
--cc=stefano.stabellini@xilinx.com \
--cc=thomas.petazzoni@bootlin.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).