* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
[not found] ` <20130307200235.GB20695-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2013-03-07 20:47 ` Thierry Reding
2013-03-08 0:05 ` Rob Herring
0 siblings, 1 reply; 59+ messages in thread
From: Thierry Reding @ 2013-03-07 20:47 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Lior Amsalem, Russell King - ARM Linux, Jason Cooper, Andrew Lunn,
linux-pci-u79uwXL29TY76Z2rM5mHXA,
devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ, Eran Ben-Avi,
Nadav Haklai, Maen Suleiman, Shadi Ammouri, Rob Herring,
Bjorn Helgaas, Tawfik Bayouk,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
[-- Attachment #1.1: Type: text/plain, Size: 4650 bytes --]
On Thu, Mar 07, 2013 at 01:02:35PM -0700, Jason Gunthorpe wrote:
> On Thu, Mar 07, 2013 at 08:48:30PM +0100, Thierry Reding wrote:
> > On Thu, Mar 07, 2013 at 10:49:55AM -0700, Jason Gunthorpe wrote:
> > > On Thu, Mar 07, 2013 at 09:08:32AM +0100, Thierry Reding wrote:
> > [...]
> > > > > Both have various problems, but I think I prefer the first one as it
> > > > > doesn't conflate the contoller registers and host apertures in a
> > > > > single ranges..
> > > >
> > > > I think a better alternative would be (and this matches what Thomas has
> > > > said elsewhere) to use something like the first alternative but move the
> > > > regs property into the pcie@0,X nodes. That would save us from having to
> > > > index a property in the parent. At least from a DT point of view I find
> > > > that to be a more consistent representation.
> > >
> > > You are thinking a new property 'host-controller-regs' or the like?
> >
> > Well, something shorter would be nice, but that's the general idea, yes.
> > As I mentioned before, for Tegra these registers aren't actually any
> > controller specific registers but rather a window to access the PCI
> > configuration space for the root ports.
>
> Yes, I understand - but in this DT model configuration space access is
> a host controller function, not a PCI-device function. Anyhow I was
> also thinking that by the choice of the name it could do translation
> from the host-controller scope, not from the bridge scope - so the
> extra elements in ranges could be avoided as well. Hence the name..
>
> > I don't think assigned-addresses is a good fit either. The PCI binding
> > document is equally specific about it as it is about the reg property.
> > So in my opinion a separate property would be a better choice. The only
> > big obstacle is that it needs to be somehow hooked up with the OF core
> > so that proper address translation can be performed.
>
> Yes, agreed.
>
> My suggestion is to get the OF experts like GKL/Rob H/etc to weigh in
> on a preferred approach to this problem with the goal of standardizing
> across all PCI host drivers. Seems like there are 2 main options
> (outside regs + regnames/etc or new 'regs' in the bridge) and 1 hacky
> one (assigned addresses)
Arnd is already Cc'ed on this thread, adding Grant, Rob and the
devicetree-discuss ML.
In a nutshell (since some of the context isn't quoted anymore) the
problem that we're trying to solve is that some of the embedded SoCs
require per-root-port registers for configuration. The PCI DT
specification doesn't make any provisions for this. A few alternatives
have been discussed so far:
1) Use a "regs" property outside of the root port nodes with
some mechanism to index into them from within the root port
nodes. Conceptually somewhat like this:
pcie-controller {
...
regs = <0x80000000 0x00001000
0x80001000 0x00001000>;
pci@0,1 {
...
port-index = <0>;
};
pci@0,2 {
...
port-index = <1>;
};
};
2) Use a "regs" property inside of the root part nodes, along
the following lines:
pcie-controller {
...
pci@0,1 {
...
reg = <0x00000800 0 0 0 0>;
regs = <0x80000000 0x00001000>;
};
pci@0,2 {
...
reg = <0x00001000 0 0 0 0>;
regs = <0x80001000 0x00001000>;
};
};
3) Repurpose the "assigned-addresses" property to achieve the
same. This should work out-of-the-box but isn't a good fit
because it conflicts with the OF PCI specification which
defines this property to contain the addresses assigned to
the base address registers.
Options 1 and 2 above require changes to the OF core to allow proper
address translation, but the changes shouldn't be very big.
> > One possible solution that wouldn't be too hard to implement is to
> > provide a new function (say of_get_named_address()) similar to
> > of_get_address() which doesn't get the name of the register property
> > from the struct of_bus but from a parameter and call that function from
> > another new function similar to of_address_to_resource() that also gets
> > the property name from a parameter. I can't think of a better name for
> > the latter than of_named_address_to_resource(), which is rather long.
>
> Seems like a reasonable API, maybe pass in a be32*/length pointer
> instead of a name to be more flexible?
There's already __of_address_to_resource() which takes a be32 * and size
but I thought it might be easier to wrap that to make it easier on the
drivers to use the API.
Thierry
[-- Attachment #1.2: Type: application/pgp-signature, Size: 836 bytes --]
[-- Attachment #2: Type: text/plain, Size: 192 bytes --]
_______________________________________________
devicetree-discuss mailing list
devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ@public.gmane.org
https://lists.ozlabs.org/listinfo/devicetree-discuss
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-07 20:47 ` [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems Thierry Reding
@ 2013-03-08 0:05 ` Rob Herring
2013-03-08 7:14 ` Thierry Reding
0 siblings, 1 reply; 59+ messages in thread
From: Rob Herring @ 2013-03-08 0:05 UTC (permalink / raw)
To: Thierry Reding
Cc: Jason Gunthorpe, Thomas Petazzoni, Lior Amsalem, Andrew Lunn,
Russell King - ARM Linux, Jason Cooper, Arnd Bergmann,
Stephen Warren, linux-pci, Eran Ben-Avi, Nadav Haklai,
Maen Suleiman, Shadi Ammouri, Bjorn Helgaas, Gregory Clement,
Tawfik Bayouk, Grant Likely, linux-arm-kernel, devicetree-discuss
On 03/07/2013 02:47 PM, Thierry Reding wrote:
> On Thu, Mar 07, 2013 at 01:02:35PM -0700, Jason Gunthorpe wrote:
>> On Thu, Mar 07, 2013 at 08:48:30PM +0100, Thierry Reding wrote:
>>> On Thu, Mar 07, 2013 at 10:49:55AM -0700, Jason Gunthorpe wrote:
>>>> On Thu, Mar 07, 2013 at 09:08:32AM +0100, Thierry Reding wrote:
>>> [...]
>>>>>> Both have various problems, but I think I prefer the first one as it
>>>>>> doesn't conflate the contoller registers and host apertures in a
>>>>>> single ranges..
>>>>>
>>>>> I think a better alternative would be (and this matches what Thomas has
>>>>> said elsewhere) to use something like the first alternative but move the
>>>>> regs property into the pcie@0,X nodes. That would save us from having to
>>>>> index a property in the parent. At least from a DT point of view I find
>>>>> that to be a more consistent representation.
>>>>
>>>> You are thinking a new property 'host-controller-regs' or the like?
>>>
>>> Well, something shorter would be nice, but that's the general idea, yes.
>>> As I mentioned before, for Tegra these registers aren't actually any
>>> controller specific registers but rather a window to access the PCI
>>> configuration space for the root ports.
>>
>> Yes, I understand - but in this DT model configuration space access is
>> a host controller function, not a PCI-device function. Anyhow I was
>> also thinking that by the choice of the name it could do translation
>> from the host-controller scope, not from the bridge scope - so the
>> extra elements in ranges could be avoided as well. Hence the name..
>>
>>> I don't think assigned-addresses is a good fit either. The PCI binding
>>> document is equally specific about it as it is about the reg property.
>>> So in my opinion a separate property would be a better choice. The only
>>> big obstacle is that it needs to be somehow hooked up with the OF core
>>> so that proper address translation can be performed.
>>
>> Yes, agreed.
>>
>> My suggestion is to get the OF experts like GKL/Rob H/etc to weigh in
>> on a preferred approach to this problem with the goal of standardizing
>> across all PCI host drivers. Seems like there are 2 main options
>> (outside regs + regnames/etc or new 'regs' in the bridge) and 1 hacky
>> one (assigned addresses)
>
> Arnd is already Cc'ed on this thread, adding Grant, Rob and the
> devicetree-discuss ML.
>
> In a nutshell (since some of the context isn't quoted anymore) the
> problem that we're trying to solve is that some of the embedded SoCs
> require per-root-port registers for configuration. The PCI DT
> specification doesn't make any provisions for this. A few alternatives
> have been discussed so far:
I'm not sure I follow. This is different than the host controller
registers? Why would this not just be multiple entries in the reg property?
Rob
>
> 1) Use a "regs" property outside of the root port nodes with
> some mechanism to index into them from within the root port
> nodes. Conceptually somewhat like this:
>
> pcie-controller {
> ...
> regs = <0x80000000 0x00001000
> 0x80001000 0x00001000>;
>
> pci@0,1 {
> ...
> port-index = <0>;
> };
>
> pci@0,2 {
> ...
> port-index = <1>;
> };
> };
>
> 2) Use a "regs" property inside of the root part nodes, along
> the following lines:
>
> pcie-controller {
> ...
> pci@0,1 {
> ...
> reg = <0x00000800 0 0 0 0>;
>
> regs = <0x80000000 0x00001000>;
> };
>
> pci@0,2 {
> ...
> reg = <0x00001000 0 0 0 0>;
>
> regs = <0x80001000 0x00001000>;
> };
> };
>
> 3) Repurpose the "assigned-addresses" property to achieve the
> same. This should work out-of-the-box but isn't a good fit
> because it conflicts with the OF PCI specification which
> defines this property to contain the addresses assigned to
> the base address registers.
>
> Options 1 and 2 above require changes to the OF core to allow proper
> address translation, but the changes shouldn't be very big.
>
>>> One possible solution that wouldn't be too hard to implement is to
>>> provide a new function (say of_get_named_address()) similar to
>>> of_get_address() which doesn't get the name of the register property
>>> from the struct of_bus but from a parameter and call that function from
>>> another new function similar to of_address_to_resource() that also gets
>>> the property name from a parameter. I can't think of a better name for
>>> the latter than of_named_address_to_resource(), which is rather long.
>>
>> Seems like a reasonable API, maybe pass in a be32*/length pointer
>> instead of a name to be more flexible?
>
> There's already __of_address_to_resource() which takes a be32 * and size
> but I thought it might be easier to wrap that to make it easier on the
> drivers to use the API.
>
> Thierry
>
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-08 0:05 ` Rob Herring
@ 2013-03-08 7:14 ` Thierry Reding
2013-03-08 16:52 ` Jason Gunthorpe
2013-03-08 23:12 ` Rob Herring
0 siblings, 2 replies; 59+ messages in thread
From: Thierry Reding @ 2013-03-08 7:14 UTC (permalink / raw)
To: Rob Herring
Cc: Jason Gunthorpe, Thomas Petazzoni, Lior Amsalem, Andrew Lunn,
Russell King - ARM Linux, Jason Cooper, Arnd Bergmann,
Stephen Warren, linux-pci, Eran Ben-Avi, Nadav Haklai,
Maen Suleiman, Shadi Ammouri, Bjorn Helgaas, Gregory Clement,
Tawfik Bayouk, Grant Likely, linux-arm-kernel, devicetree-discuss
[-- Attachment #1: Type: text/plain, Size: 2525 bytes --]
On Thu, Mar 07, 2013 at 06:05:33PM -0600, Rob Herring wrote:
> On 03/07/2013 02:47 PM, Thierry Reding wrote:
[...]
> > In a nutshell (since some of the context isn't quoted anymore) the
> > problem that we're trying to solve is that some of the embedded SoCs
> > require per-root-port registers for configuration. The PCI DT
> > specification doesn't make any provisions for this. A few alternatives
> > have been discussed so far:
>
> I'm not sure I follow. This is different than the host controller
> registers? Why would this not just be multiple entries in the reg property?
Well the register regions are per root-port. On Tegra20 there's 2 of
them, Tegra30 has 3 and if I understand correctly Marvell can have up to
10 (!). Adding all of them to the reg property of the host controller
could work but it needs some way to match the reg entry to the root port
similar to option 1 below.
Adding a property in the root port nodes seems like a cleaner and more
accurate representation of the hardware to me, but if that's not
acceptable perhaps we need to bite the bullet and add the code to look
the registers up from the parent's reg property.
Thierry
> > 1) Use a "regs" property outside of the root port nodes with
> > some mechanism to index into them from within the root port
> > nodes. Conceptually somewhat like this:
> >
> > pcie-controller {
> > ...
> > regs = <0x80000000 0x00001000
> > 0x80001000 0x00001000>;
> >
> > pci@0,1 {
> > ...
> > port-index = <0>;
> > };
> >
> > pci@0,2 {
> > ...
> > port-index = <1>;
> > };
> > };
> >
> > 2) Use a "regs" property inside of the root part nodes, along
> > the following lines:
> >
> > pcie-controller {
> > ...
> > pci@0,1 {
> > ...
> > reg = <0x00000800 0 0 0 0>;
> >
> > regs = <0x80000000 0x00001000>;
> > };
> >
> > pci@0,2 {
> > ...
> > reg = <0x00001000 0 0 0 0>;
> >
> > regs = <0x80001000 0x00001000>;
> > };
> > };
> >
> > 3) Repurpose the "assigned-addresses" property to achieve the
> > same. This should work out-of-the-box but isn't a good fit
> > because it conflicts with the OF PCI specification which
> > defines this property to contain the addresses assigned to
> > the base address registers.
> >
> > Options 1 and 2 above require changes to the OF core to allow proper
> > address translation, but the changes shouldn't be very big.
[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-08 7:14 ` Thierry Reding
@ 2013-03-08 16:52 ` Jason Gunthorpe
2013-03-08 19:12 ` Thierry Reding
2013-03-08 23:12 ` Rob Herring
1 sibling, 1 reply; 59+ messages in thread
From: Jason Gunthorpe @ 2013-03-08 16:52 UTC (permalink / raw)
To: Thierry Reding
Cc: Rob Herring, Thomas Petazzoni, Lior Amsalem, Andrew Lunn,
Russell King - ARM Linux, Jason Cooper, Arnd Bergmann,
Stephen Warren, linux-pci, Eran Ben-Avi, Nadav Haklai,
Maen Suleiman, Shadi Ammouri, Bjorn Helgaas, Gregory Clement,
Tawfik Bayouk, Grant Likely, linux-arm-kernel, devicetree-discuss
On Fri, Mar 08, 2013 at 08:14:44AM +0100, Thierry Reding wrote:
> On Thu, Mar 07, 2013 at 06:05:33PM -0600, Rob Herring wrote:
> > On 03/07/2013 02:47 PM, Thierry Reding wrote:
> [...]
> > > In a nutshell (since some of the context isn't quoted anymore) the
> > > problem that we're trying to solve is that some of the embedded SoCs
> > > require per-root-port registers for configuration. The PCI DT
> > > specification doesn't make any provisions for this. A few alternatives
> > > have been discussed so far:
> >
> > I'm not sure I follow. This is different than the host controller
> > registers? Why would this not just be multiple entries in the reg property?
>
> Well the register regions are per root-port. On Tegra20 there's 2 of
> them, Tegra30 has 3 and if I understand correctly Marvell can have up to
> 10 (!). Adding all of them to the reg property of the host controller
> could work but it needs some way to match the reg entry to the root port
> similar to option 1 below.
Thomas just posted an implementation like this, please see:
http://www.spinics.net/lists/arm-kernel/msg228749.html
Jason
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-08 16:52 ` Jason Gunthorpe
@ 2013-03-08 19:12 ` Thierry Reding
[not found] ` <20130308191227.GA6551-RM9K5IK7kjIQXX3q8xo1gnVAuStQJXxyR5q1nwbD4aMs9pC9oP6+/A@public.gmane.org>
2013-03-09 8:58 ` Thomas Petazzoni
0 siblings, 2 replies; 59+ messages in thread
From: Thierry Reding @ 2013-03-08 19:12 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Rob Herring, Thomas Petazzoni, Lior Amsalem, Andrew Lunn,
Russell King - ARM Linux, Jason Cooper, Arnd Bergmann,
Stephen Warren, linux-pci, Eran Ben-Avi, Nadav Haklai,
Maen Suleiman, Shadi Ammouri, Bjorn Helgaas, Gregory Clement,
Tawfik Bayouk, Grant Likely, linux-arm-kernel, devicetree-discuss
[-- Attachment #1: Type: text/plain, Size: 1670 bytes --]
On Fri, Mar 08, 2013 at 09:52:28AM -0700, Jason Gunthorpe wrote:
> On Fri, Mar 08, 2013 at 08:14:44AM +0100, Thierry Reding wrote:
> > On Thu, Mar 07, 2013 at 06:05:33PM -0600, Rob Herring wrote:
> > > On 03/07/2013 02:47 PM, Thierry Reding wrote:
> > [...]
> > > > In a nutshell (since some of the context isn't quoted anymore) the
> > > > problem that we're trying to solve is that some of the embedded SoCs
> > > > require per-root-port registers for configuration. The PCI DT
> > > > specification doesn't make any provisions for this. A few alternatives
> > > > have been discussed so far:
> > >
> > > I'm not sure I follow. This is different than the host controller
> > > registers? Why would this not just be multiple entries in the reg property?
> >
> > Well the register regions are per root-port. On Tegra20 there's 2 of
> > them, Tegra30 has 3 and if I understand correctly Marvell can have up to
> > 10 (!). Adding all of them to the reg property of the host controller
> > could work but it needs some way to match the reg entry to the root port
> > similar to option 1 below.
>
> Thomas just posted an implementation like this, please see:
>
> http://www.spinics.net/lists/arm-kernel/msg228749.html
Oh well. I don't like it, but if that's the way it has to be, then so be
it. Any reason why the reg-names can't match the root port's node name
so that it can be used directly instead of going through hoops to
construct a string from extra parameters? Like below:
pcie-controller {
regs = <...>;
reg-names = ..., "pcie@1,0", "pcie@2,0";
pcie@1,0 {
...
};
pcie@2,0 {
...
};
};
Thierry
[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
[not found] ` <20130308191227.GA6551-RM9K5IK7kjIQXX3q8xo1gnVAuStQJXxyR5q1nwbD4aMs9pC9oP6+/A@public.gmane.org>
@ 2013-03-08 19:43 ` Mitch Bradley
2013-03-08 20:02 ` Jason Gunthorpe
0 siblings, 1 reply; 59+ messages in thread
From: Mitch Bradley @ 2013-03-08 19:43 UTC (permalink / raw)
To: Thierry Reding
Cc: Lior Amsalem, Andrew Lunn, Russell King - ARM Linux, Jason Cooper,
Nadav Haklai, linux-pci-u79uwXL29TY76Z2rM5mHXA,
devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ, Eran Ben-Avi,
Jason Gunthorpe, Maen Suleiman, Shadi Ammouri, Bjorn Helgaas,
Tawfik Bayouk, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
On 3/8/2013 9:12 AM, Thierry Reding wrote:
> On Fri, Mar 08, 2013 at 09:52:28AM -0700, Jason Gunthorpe wrote:
>> On Fri, Mar 08, 2013 at 08:14:44AM +0100, Thierry Reding wrote:
>>> On Thu, Mar 07, 2013 at 06:05:33PM -0600, Rob Herring wrote:
>>>> On 03/07/2013 02:47 PM, Thierry Reding wrote:
>>> [...]
>>>>> In a nutshell (since some of the context isn't quoted anymore) the
>>>>> problem that we're trying to solve is that some of the embedded SoCs
>>>>> require per-root-port registers for configuration. The PCI DT
>>>>> specification doesn't make any provisions for this. A few alternatives
>>>>> have been discussed so far:
>>>>
>>>> I'm not sure I follow. This is different than the host controller
>>>> registers? Why would this not just be multiple entries in the reg property?
>>>
>>> Well the register regions are per root-port. On Tegra20 there's 2 of
>>> them, Tegra30 has 3 and if I understand correctly Marvell can have up to
>>> 10 (!). Adding all of them to the reg property of the host controller
>>> could work but it needs some way to match the reg entry to the root port
>>> similar to option 1 below.
>>
>> Thomas just posted an implementation like this, please see:
>>
>> http://www.spinics.net/lists/arm-kernel/msg228749.html
The example in that posting looks messed up to me.
1) It has "reg = <0x0800 0 0 0 0>", but 0x0800 0 0 is not a valid
address in the address space defined by its parent - because the form of
the parent's ranges property indicates that it's a PCI-style address
form. 0x0800 0 0 lacks the top bits that indicate non-relocatable and
the type (I/O, memory, etc).
2) The "@0,0" and "@1,0" suffixes do not correspond to the reg values
<0x0800 0 0 0 0> and <0x1000 0 0 0 0> using any rule that I know.
What is address space represented by the pcie-controller node? reg
properties in child nodes must contain proper addresses in some
well-defined address space. It looks like this example is trying to put
half of the PCI bus abstraction in the pcie-controller node and the rest
in the pcie sub-nodes, with some handwaving address-space splicing
between the two. That's not kosher. Each level of the tree needs a
coherent address space definition, otherwise the address space
"calculus" breaks.
>
> Oh well. I don't like it, but if that's the way it has to be, then so be
> it. Any reason why the reg-names can't match the root port's node name
> so that it can be used directly instead of going through hoops to
> construct a string from extra parameters? Like below:
>
> pcie-controller {
> regs = <...>;
> reg-names = ..., "pcie@1,0", "pcie@2,0";
>
> pcie@1,0 {
> ...
> };
>
> pcie@2,0 {
> ...
> };
> };
>
> Thierry
>
>
>
> _______________________________________________
> devicetree-discuss mailing list
> devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ@public.gmane.org
> https://lists.ozlabs.org/listinfo/devicetree-discuss
>
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-08 19:43 ` Mitch Bradley
@ 2013-03-08 20:02 ` Jason Gunthorpe
[not found] ` <20130308200245.GC29435-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
0 siblings, 1 reply; 59+ messages in thread
From: Jason Gunthorpe @ 2013-03-08 20:02 UTC (permalink / raw)
To: Mitch Bradley
Cc: Thierry Reding, Lior Amsalem, Russell King - ARM Linux,
Jason Cooper, Andrew Lunn, linux-pci, devicetree-discuss,
Eran Ben-Avi, Nadav Haklai, Maen Suleiman, Bjorn Helgaas,
Shadi Ammouri, Tawfik Bayouk, linux-arm-kernel
On Fri, Mar 08, 2013 at 09:43:11AM -1000, Mitch Bradley wrote:
> >> http://www.spinics.net/lists/arm-kernel/msg228749.html
>
> The example in that posting looks messed up to me.
>
> 1) It has "reg = <0x0800 0 0 0 0>", but 0x0800 0 0 is not a valid
> address in the address space defined by its parent - because the form of
> the parent's ranges property indicates that it's a PCI-style address
> form. 0x0800 0 0 lacks the top bits that indicate non-relocatable and
> the type (I/O, memory, etc).
You need to review the OF PCI bindings to make sense of this. The
subnodes are PCI devices. Those PCI devices show up in the
configuration space (ie lspci). They are the PCI root port bridges.
The reg value follows the OF PCI spec and has the configuration
address of the bridge. For the first port's bridge this address in
lspci format is 00:01.0 which encodes to <0x800 0 0>
The Linux OF PCI core uses the reg value in this format to match the
OF node in the DT to the PCI device node as it does PCI discovery.
> 2) The "@0,0" and "@1,0" suffixes do not correspond to the reg values
> <0x0800 0 0 0 0> and <0x1000 0 0 0 0> using any rule that I know.
@0,1,0 (bus,device,fn) could be more appropriate, but that is
cosmetic?
Jason
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
[not found] ` <20130308200245.GC29435-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2013-03-08 20:13 ` Thierry Reding
2013-03-10 15:09 ` Thomas Petazzoni
2013-03-08 23:46 ` Mitch Bradley
1 sibling, 1 reply; 59+ messages in thread
From: Thierry Reding @ 2013-03-08 20:13 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Lior Amsalem, Andrew Lunn, Russell King - ARM Linux, Jason Cooper,
Tawfik Bayouk, linux-pci-u79uwXL29TY76Z2rM5mHXA,
devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ, Eran Ben-Avi,
Nadav Haklai, Maen Suleiman, Shadi Ammouri, Bjorn Helgaas,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
[-- Attachment #1.1: Type: text/plain, Size: 535 bytes --]
On Fri, Mar 08, 2013 at 01:02:46PM -0700, Jason Gunthorpe wrote:
> On Fri, Mar 08, 2013 at 09:43:11AM -1000, Mitch Bradley wrote:
[...]
> > 2) The "@0,0" and "@1,0" suffixes do not correspond to the reg values
> > <0x0800 0 0 0 0> and <0x1000 0 0 0 0> using any rule that I know.
>
> @0,1,0 (bus,device,fn) could be more appropriate, but that is
> cosmetic?
The OF PCI specification is pretty strict about this as well. It says in
section 2.2.1.3. that only the DD and DD,FF forms can appear in a device
path.
Thierry
[-- Attachment #1.2: Type: application/pgp-signature, Size: 836 bytes --]
[-- Attachment #2: Type: text/plain, Size: 192 bytes --]
_______________________________________________
devicetree-discuss mailing list
devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ@public.gmane.org
https://lists.ozlabs.org/listinfo/devicetree-discuss
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-08 7:14 ` Thierry Reding
2013-03-08 16:52 ` Jason Gunthorpe
@ 2013-03-08 23:12 ` Rob Herring
[not found] ` <513A7044.1020700-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
1 sibling, 1 reply; 59+ messages in thread
From: Rob Herring @ 2013-03-08 23:12 UTC (permalink / raw)
To: Thierry Reding
Cc: Jason Gunthorpe, Thomas Petazzoni, Lior Amsalem, Andrew Lunn,
Russell King - ARM Linux, Jason Cooper, Arnd Bergmann,
Stephen Warren, linux-pci, Eran Ben-Avi, Nadav Haklai,
Maen Suleiman, Shadi Ammouri, Bjorn Helgaas, Gregory Clement,
Tawfik Bayouk, Grant Likely, linux-arm-kernel, devicetree-discuss
On 03/08/2013 01:14 AM, Thierry Reding wrote:
> On Thu, Mar 07, 2013 at 06:05:33PM -0600, Rob Herring wrote:
>> On 03/07/2013 02:47 PM, Thierry Reding wrote:
> [...]
>>> In a nutshell (since some of the context isn't quoted anymore) the
>>> problem that we're trying to solve is that some of the embedded SoCs
>>> require per-root-port registers for configuration. The PCI DT
>>> specification doesn't make any provisions for this. A few alternatives
>>> have been discussed so far:
>>
>> I'm not sure I follow. This is different than the host controller
>> registers? Why would this not just be multiple entries in the reg property?
>
> Well the register regions are per root-port. On Tegra20 there's 2 of
> them, Tegra30 has 3 and if I understand correctly Marvell can have up to
> 10 (!). Adding all of them to the reg property of the host controller
> could work but it needs some way to match the reg entry to the root port
> similar to option 1 below.
The compatible property of the PCI host controller can imply what each
index of the reg property entries is for.
>
> Adding a property in the root port nodes seems like a cleaner and more
> accurate representation of the hardware to me, but if that's not
> acceptable perhaps we need to bite the bullet and add the code to look
> the registers up from the parent's reg property.
What I don't like is a new property defined to describe mmio addresses.
We already have a property for that and it is "reg". But I think I'm
still missing something:
>>> pci@0,1 {
>>> ...
>>> reg = <0x00000800 0 0 0 0>;
Is this a PCI bus address?
>>> regs = <0x80000000 0x00001000>;
>>> };
Rob
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
[not found] ` <20130308200245.GC29435-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2013-03-08 20:13 ` Thierry Reding
@ 2013-03-08 23:46 ` Mitch Bradley
2013-03-09 1:31 ` Jason Gunthorpe
1 sibling, 1 reply; 59+ messages in thread
From: Mitch Bradley @ 2013-03-08 23:46 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Lior Amsalem, Andrew Lunn, Russell King - ARM Linux, Jason Cooper,
Tawfik Bayouk, linux-pci-u79uwXL29TY76Z2rM5mHXA,
devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ, Eran Ben-Avi,
Nadav Haklai, Maen Suleiman, Shadi Ammouri, Bjorn Helgaas,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
On 3/8/2013 10:02 AM, Jason Gunthorpe wrote:
> On Fri, Mar 08, 2013 at 09:43:11AM -1000, Mitch Bradley wrote:
>
>>>> http://www.spinics.net/lists/arm-kernel/msg228749.html
>>
>> The example in that posting looks messed up to me.
>>
>> 1) It has "reg = <0x0800 0 0 0 0>", but 0x0800 0 0 is not a valid
>> address in the address space defined by its parent - because the form of
>> the parent's ranges property indicates that it's a PCI-style address
>> form. 0x0800 0 0 lacks the top bits that indicate non-relocatable and
>> the type (I/O, memory, etc).
>
> You need to review the OF PCI bindings to make sense of this. The
> subnodes are PCI devices. Those PCI devices show up in the
> configuration space (ie lspci). They are the PCI root port bridges.
As it turns out, I wrote those bindings, almost 20 years ago.
Having dug through old versions of that patch, I think I see the source
of the confusion.
In a very early version of the patch -
http://www.spinics.net/lists/arm-kernel/msg211839.html
the pcie-controller address space had 1 address cell, essentially
propagating the CPU address down to the subordinate nodes, and the
subordinate pcie nodes had no child address specification. Somebody
correctly observed that PCI buses need to export 3/2 address/size cells
to their children.
But that stipulation applies only to the subordinate pcie nodes, not
necessarily to the enclosing pcie-controller node.
A later version -
http://permalink.gmane.org/gmane.linux.kernel.pci/20358
applied 3/2 PCI addressing at both levels. That is somewhat sensible if
you treat the top level as a "PCI root complex" and if the subordinate
port control registers are really PCI config header blocks.
But look at the ranges entries therein. PCI config address <0x800 ..>
maps to CPU address 0xd004000 size 0x2000 and <0x1000 ..> maps to
0xd0044000 size 0x2000. 0x800 size 0x2000 encroaches into the <0x1000
..> range. That can't be right.
That, plus the first version of the patch, makes me think that these
root-port control registers might not really be PCI config registers, in
which case the use of PCI addressing at the top level is a fiction.
Furthermore, we have the fact that pcie@0,0 corresponds to
marvell,pcie-port = <0> and marvell,pcie-lane = <0>, and similarly for
pcie@0,1 and pcie@0,2. That correspondence still appears in the recent
patch.
So it looks like the "@0,0" addressing attempts to represent lane,port
instead of device,function PCI config space addressing. 0,0 is not the
right PCI device/function number for reg=<0x800 ...> - 0x800 is device
1, not device 0.
Another problem is the device_type values. If the top level is to be
interpreted as a PCI addressing domain, it should have a device_type=pci
property.
One solution is for the top node to use 1-cell addresses, passing the
CPU address to the subordinates. The subordinate nodes use the standard
PCI address representation. The subordinate reg properties just list
the CPU address of the control registers. Each subordinate has a ranges
properties to translate PCI to CPU addresses and a bus-range property
for PCI bus numbers (unless those are determined dynamically).
What you lose with that is the ability to refer to the nodes by
port,lane. If that is important, then the top address domain needs an
additional cell to say whether a given address is a CPU address or
port,lane. The top node would need a ranges to map the various forms
into CPU addresses. The child reg properties could use the port,lane
form, and marvell,pcie-{port,lane} would disappear.
In neither case would you need the controversial "reg-names" thing.
The tradeoff:
Existing proposed patch: Violates addressing rules, requires funny new
"reg-names" mechanism, pretends to create a PCI bus level that does not
exist.
One-cell top address: Follows addressing rules, may be able to use
existing "simple-bus" address translation code, does not permit the
pretty @lane,port human-readable form.
Two-cell top address: Follows addressing rules, possibly requires new
code to translate this particular 2-address-cell format, permits
@lane,port representation.
>
> The reg value follows the OF PCI spec and has the configuration
> address of the bridge. For the first port's bridge this address in
> lspci format is 00:01.0 which encodes to <0x800 0 0>
>
> The Linux OF PCI core uses the reg value in this format to match the
> OF node in the DT to the PCI device node as it does PCI discovery.
>
>> 2) The "@0,0" and "@1,0" suffixes do not correspond to the reg values
>> <0x0800 0 0 0 0> and <0x1000 0 0 0 0> using any rule that I know.
>
> @0,1,0 (bus,device,fn) could be more appropriate, but that is
> cosmetic?
Except that the @0,0 in this case seems to represent port,lane and not
device,function , as argued above.
>
> Jason
>
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-08 23:46 ` Mitch Bradley
@ 2013-03-09 1:31 ` Jason Gunthorpe
[not found] ` <20130309013152.GA3883-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
0 siblings, 1 reply; 59+ messages in thread
From: Jason Gunthorpe @ 2013-03-09 1:31 UTC (permalink / raw)
To: Mitch Bradley
Cc: Thierry Reding, Lior Amsalem, Russell King - ARM Linux,
Jason Cooper, Andrew Lunn, linux-pci, devicetree-discuss,
Eran Ben-Avi, Nadav Haklai, Maen Suleiman, Bjorn Helgaas,
Shadi Ammouri, Tawfik Bayouk, linux-arm-kernel
On Fri, Mar 08, 2013 at 01:46:13PM -1000, Mitch Bradley wrote:
> On 3/8/2013 10:02 AM, Jason Gunthorpe wrote:
> > On Fri, Mar 08, 2013 at 09:43:11AM -1000, Mitch Bradley wrote:
> >
> >>>> http://www.spinics.net/lists/arm-kernel/msg228749.html
> >>
> >> The example in that posting looks messed up to me.
> >>
> >> 1) It has "reg = <0x0800 0 0 0 0>", but 0x0800 0 0 is not a valid
> >> address in the address space defined by its parent - because the form of
> >> the parent's ranges property indicates that it's a PCI-style address
> >> form. 0x0800 0 0 lacks the top bits that indicate non-relocatable and
> >> the type (I/O, memory, etc).
> >
> > You need to review the OF PCI bindings to make sense of this. The
> > subnodes are PCI devices. Those PCI devices show up in the
> > configuration space (ie lspci). They are the PCI root port bridges.
>
> As it turns out, I wrote those bindings, almost 20 years ago.
>
> Having dug through old versions of that patch, I think I see the source
> of the confusion.
Not sure, I think this has gone into a weird tangent, your conclusions
don't match how the PCI-E root complex is presented to the kernel.
Lets just go back to the start?
The pci-controller node is a root complex.
The children of that node are elements in the root complex - they are
the root port bridges. They are all on PCI bus 0.
Each root port bridge responds to configuration cycles, and has a PCI
configuration space. They show up in lspci.
The host controller device driver knows how to issue configuration
cycles, and it knows how to deliver configuration cycles on bus 0 to
the bridges.
This one:
+ pcie@0,0 {
+ device_type = "pci";
+ reg = <0x0800 0 0 0 0>;
Responds to bus 0, device 1, function 0.
Each root port bridge has an associated non-PCI MMIO address
space. The child above is associated with 0xd0040000. This MMIO
address space is needed by the host controller driver to operate the
port.
The text after the @ is a mistake, Linux doesn't enforce anything
about this text so nobody noticed till now, it should be corrected to
@1,0, similarly for the second one.
Also, from your comments the top level should gain a 'device_type =
"pci"'.
So, from that point, does the DT start to make sense? Are there other
problems?
The question at hand is how do you associate the non-PCI MMIO space
0xd0040000 with the root port bridge 'pcie@1,0' ? The patch proposes
to use a reg array in the controller plus reg-names to do this.
I have trouble making sense of your suggestions to switch away from
5dw addressing. It is critical that the DT child stanza be associated
with the PCI discovered root port bridge. By my understanding Linux
does this based on the configuration space format 'reg' value. How is
that association possible if you switch things away from 5dw
addressing?
Also, this is for firmware-less embedded. Linux is required to scan
the bus and do full address assignment of all PCI devies, including
the root port bridges. The ranges property on the top node specifies
the addressing apertures that are available for this process. This is
common practice, there are many examples of this in kernel dts for PCI
controllers.
It is impossible to specify any detail for the bridges (be it address
ranges or bus number ranges) because we have no idea what discovery
will find, or what values Linux will choose to assign.
> In neither case would you need the controversial "reg-names" thing.
reg-names has been a standard feature in the Linux kernel for a bit
already..
Thanks a lot for looking at this,
Jason
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-08 19:12 ` Thierry Reding
[not found] ` <20130308191227.GA6551-RM9K5IK7kjIQXX3q8xo1gnVAuStQJXxyR5q1nwbD4aMs9pC9oP6+/A@public.gmane.org>
@ 2013-03-09 8:58 ` Thomas Petazzoni
1 sibling, 0 replies; 59+ messages in thread
From: Thomas Petazzoni @ 2013-03-09 8:58 UTC (permalink / raw)
To: Thierry Reding
Cc: Jason Gunthorpe, Rob Herring, Lior Amsalem, Andrew Lunn,
Russell King - ARM Linux, Jason Cooper, Arnd Bergmann,
Stephen Warren, linux-pci, Eran Ben-Avi, Nadav Haklai,
Maen Suleiman, Shadi Ammouri, Bjorn Helgaas, Gregory Clement,
Tawfik Bayouk, Grant Likely, linux-arm-kernel, devicetree-discuss
Dear Thierry Reding,
On Fri, 8 Mar 2013 20:12:27 +0100, Thierry Reding wrote:
> Oh well. I don't like it, but if that's the way it has to be, then so be
> it.
I don't like it too much either, but it's not a big problem either. It
works, and satisfies the DT requirements reasonably.
> Any reason why the reg-names can't match the root port's node name
> so that it can be used directly instead of going through hoops to
> construct a string from extra parameters? Like below:
>
> pcie-controller {
> regs = <...>;
> reg-names = ..., "pcie@1,0", "pcie@2,0";
>
> pcie@1,0 {
> ...
> };
>
> pcie@2,0 {
> ...
> };
> };
Yes, I thought about doing this, but in my driver, I also use the
"pcie0.0" string to request the address decoding window from the
mvebu-mbus driver. So I would anyway have to construct this "pciX.Y"
string. But if you feel like using "pcie@X,Y" for the reg-names is
better even if I still need to construct the "pcieX.Y" string, then I
will be perfectly ok with making this change.
Best regards,
Thomas
--
Thomas Petazzoni, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
[not found] ` <513A7044.1020700-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2013-03-09 11:10 ` Thierry Reding
2013-03-10 5:04 ` Mitch Bradley
1 sibling, 0 replies; 59+ messages in thread
From: Thierry Reding @ 2013-03-09 11:10 UTC (permalink / raw)
To: Rob Herring
Cc: Lior Amsalem, Russell King - ARM Linux, Jason Cooper, Andrew Lunn,
linux-pci-u79uwXL29TY76Z2rM5mHXA,
devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ, Eran Ben-Avi,
Jason Gunthorpe, Maen Suleiman, Nadav Haklai, Shadi Ammouri,
Bjorn Helgaas, Tawfik Bayouk,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
[-- Attachment #1.1: Type: text/plain, Size: 2654 bytes --]
On Fri, Mar 08, 2013 at 05:12:04PM -0600, Rob Herring wrote:
> On 03/08/2013 01:14 AM, Thierry Reding wrote:
> > On Thu, Mar 07, 2013 at 06:05:33PM -0600, Rob Herring wrote:
> >> On 03/07/2013 02:47 PM, Thierry Reding wrote:
> > [...]
> >>> In a nutshell (since some of the context isn't quoted anymore) the
> >>> problem that we're trying to solve is that some of the embedded SoCs
> >>> require per-root-port registers for configuration. The PCI DT
> >>> specification doesn't make any provisions for this. A few alternatives
> >>> have been discussed so far:
> >>
> >> I'm not sure I follow. This is different than the host controller
> >> registers? Why would this not just be multiple entries in the reg property?
> >
> > Well the register regions are per root-port. On Tegra20 there's 2 of
> > them, Tegra30 has 3 and if I understand correctly Marvell can have up to
> > 10 (!). Adding all of them to the reg property of the host controller
> > could work but it needs some way to match the reg entry to the root port
> > similar to option 1 below.
>
> The compatible property of the PCI host controller can imply what each
> index of the reg property entries is for.
>
> >
> > Adding a property in the root port nodes seems like a cleaner and more
> > accurate representation of the hardware to me, but if that's not
> > acceptable perhaps we need to bite the bullet and add the code to look
> > the registers up from the parent's reg property.
>
> What I don't like is a new property defined to describe mmio addresses.
> We already have a property for that and it is "reg".
Okay, understood.
> But I think I'm still missing something:
>
> >>> pci@0,1 {
> >>> ...
> >>> reg = <0x00000800 0 0 0 0>;
>
> Is this a PCI bus address?
Yes. The example is slightly wrong. pci@0,1 should actually be pci@1,0.
That means it is device 1, function 0 on the bus (implicitly 0 in this
case). That matches the values in the reg property, whose first cell is
defined as follows, quoting the PCI OF specification:
npt000ss bbbbbbbb dddddfff rrrrrrrr
Where:
n is 0 if the address is relocatable, 1 otherwise
p is 1 if the addressable region is "prefetchable", 0
otherwise
t is 1 if the address is aliased (for non-relocatable
I/O), below 1 MB (for Memory), or below 64 KB (for
relocatable I/O).
ss is the space code, denoting the address space
bbbbbbbb is the 8-bit Bus Number
ddddd is the 5-bit Device Number
fff is the 3-bit Function Number
rrrrrrrr is the 8-bit Register Number
Thierry
[-- Attachment #1.2: Type: application/pgp-signature, Size: 836 bytes --]
[-- Attachment #2: Type: text/plain, Size: 192 bytes --]
_______________________________________________
devicetree-discuss mailing list
devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ@public.gmane.org
https://lists.ozlabs.org/listinfo/devicetree-discuss
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
[not found] ` <20130309013152.GA3883-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2013-03-10 4:52 ` Mitch Bradley
2013-03-10 6:55 ` Jason Gunthorpe
0 siblings, 1 reply; 59+ messages in thread
From: Mitch Bradley @ 2013-03-10 4:52 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Lior Amsalem, Andrew Lunn, Russell King - ARM Linux, Jason Cooper,
Tawfik Bayouk, linux-pci-u79uwXL29TY76Z2rM5mHXA,
devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ, Eran Ben-Avi,
Nadav Haklai, Maen Suleiman, Shadi Ammouri, Bjorn Helgaas,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
On 3/8/2013 3:31 PM, Jason Gunthorpe wrote:
> On Fri, Mar 08, 2013 at 01:46:13PM -1000, Mitch Bradley wrote:
>> On 3/8/2013 10:02 AM, Jason Gunthorpe wrote:
>>> On Fri, Mar 08, 2013 at 09:43:11AM -1000, Mitch Bradley wrote:
>>>
>>>>>> http://www.spinics.net/lists/arm-kernel/msg228749.html
>>>>
>>>> The example in that posting looks messed up to me.
>>>>
>>>> 1) It has "reg = <0x0800 0 0 0 0>", but 0x0800 0 0 is not a valid
>>>> address in the address space defined by its parent - because the
form of
>>>> the parent's ranges property indicates that it's a PCI-style address
>>>> form. 0x0800 0 0 lacks the top bits that indicate non-relocatable and
>>>> the type (I/O, memory, etc).
>>>
>>> You need to review the OF PCI bindings to make sense of this. The
>>> subnodes are PCI devices. Those PCI devices show up in the
>>> configuration space (ie lspci). They are the PCI root port bridges.
>>
>> As it turns out, I wrote those bindings, almost 20 years ago.
>>
>> Having dug through old versions of that patch, I think I see the source
>> of the confusion.
>
> Not sure, I think this has gone into a weird tangent, your conclusions
> don't match how the PCI-E root complex is presented to the kernel.
>
> Lets just go back to the start?
>
> The pci-controller node is a root complex.
>
> The children of that node are elements in the root complex - they are
> the root port bridges. They are all on PCI bus 0.
>
> Each root port bridge responds to configuration cycles, and has a PCI
> configuration space. They show up in lspci.
>
> The host controller device driver knows how to issue configuration
> cycles, and it knows how to deliver configuration cycles on bus 0 to
> the bridges.
>
> This one:
> + pcie@0,0 {
> + device_type = "pci";
> + reg = <0x0800 0 0 0 0>;
>
> Responds to bus 0, device 1, function 0.
>
> Each root port bridge has an associated non-PCI MMIO address
> space. The child above is associated with 0xd0040000. This MMIO
> address space is needed by the host controller driver to operate the
> port.
>
> The text after the @ is a mistake, Linux doesn't enforce anything
> about this text so nobody noticed till now, it should be corrected to
> @1,0, similarly for the second one.
>
> Also, from your comments the top level should gain a 'device_type =
> "pci"'.
>
> So, from that point, does the DT start to make sense? Are there other
> problems?
>
> The question at hand is how do you associate the non-PCI MMIO space
> 0xd0040000 with the root port bridge 'pcie@1,0' ? The patch proposes
> to use a reg array in the controller plus reg-names to do this.
>
> I have trouble making sense of your suggestions to switch away from
> 5dw addressing. It is critical that the DT child stanza be associated
> with the PCI discovered root port bridge. By my understanding Linux
> does this based on the configuration space format 'reg' value. How is
> that association possible if you switch things away from 5dw
> addressing?
>
> Also, this is for firmware-less embedded. Linux is required to scan
> the bus and do full address assignment of all PCI devies, including
> the root port bridges. The ranges property on the top node specifies
> the addressing apertures that are available for this process. This is
> common practice, there are many examples of this in kernel dts for PCI
> controllers.
Okay, I think I finally get it. The Marvell root port bridge setup
registers look like standard config headers, even though they aren't
really in config space (because you need a different access method for
them, compared to real config accesses to downstream devices).
So, in order for the common code that enumerates the PCI bus to work,
you need to "spoof" the config accesses so that when you try to access
those particular "pseudo config headers", it uses the MMIO method
instead of the real config access method.
If that is indeed the case, them I would vote for a slight modification
of the intermediate patch that I cited earlier - the one in which the
ranges property includes translations from those special config
addresses into CPU addresses. The modification is to fix the sizes,
changing 0x2000 to 0x800, so those ranges entries do not overlap in the
child address domain.
With that change, plus fixing the "@port,lane" strings to be "@dev,fun",
the DT would not raise all the red flags that led me down this rathole.
The controller driver would still have to implement the magic spoofing
of those funky config headers, but at least it would be doing it based
on well-established representation principles, specifically ranges
entries and well-formed child addresses, without having to resort to the
new sideband "port and lane" address representation.
Here's what it would look like:
pcie-controller {
compatible = "marvell,armada-370-pcie";
status = "disabled";
#address-cells = <3>;
#size-cells = <2>;
bus-range = <0x00 0xff>;
ranges = <0x00000800 0 0 0xd4004000 0 0x00000800 /* Root
port config header */
0x00001000 0 0 0xd4008000 0 0x00000800 /* Root
port config header */
0x82000000 0 0xe0000000 0xe0000000 0 0x08000000 /*
non-prefetchable memory */
0x81000000 0 0 0xe8000000 0 0x00100000>; /*
downstream I/O */
pcie@1,0 {
device_type = "pci";
reg = <0x0800 0 0 0 0>;
#address-cells = <3>;
#size-cells = <2>;
#interrupt-cells = <1>;
interrupt-map-mask = <0 0 0 0>;
interrupt-map = <0 0 0 0 &mpic 58>;
marvell,pcie-port = <0>;
marvell,pcie-lane = <0>;
clocks = <&gateclk 5>;
status = "disabled";
};
pcie@2,0 {
device_type = "pci";
reg = <0x1000 0 0 0 0>;
#address-cells = <3>;
#size-cells = <2>;
#interrupt-cells = <1>;
interrupt-map-mask = <0 0 0 0>;
interrupt-map = <0 0 0 0 &mpic 62>;
marvell,pcie-port = <1>;
marvell,pcie-lane = <0>;
clocks = <&gateclk 9>;
status = "disabled";
};
};
>
> It is impossible to specify any detail for the bridges (be it address
> ranges or bus number ranges) because we have no idea what discovery
> will find, or what values Linux will choose to assign.
>
>> In neither case would you need the controversial "reg-names" thing.
>
> reg-names has been a standard feature in the Linux kernel for a bit
> already..
>
> Thanks a lot for looking at this,
> Jason
>
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
[not found] ` <513A7044.1020700-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2013-03-09 11:10 ` Thierry Reding
@ 2013-03-10 5:04 ` Mitch Bradley
2013-03-10 15:06 ` Thomas Petazzoni
1 sibling, 1 reply; 59+ messages in thread
From: Mitch Bradley @ 2013-03-10 5:04 UTC (permalink / raw)
To: Rob Herring
Cc: Lior Amsalem, Andrew Lunn, Russell King - ARM Linux, Jason Cooper,
Nadav Haklai, linux-pci-u79uwXL29TY76Z2rM5mHXA,
devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ, Eran Ben-Avi,
Jason Gunthorpe, Maen Suleiman, Shadi Ammouri, Bjorn Helgaas,
Tawfik Bayouk, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
On 3/8/2013 1:12 PM, Rob Herring wrote:
> On 03/08/2013 01:14 AM, Thierry Reding wrote:
>> On Thu, Mar 07, 2013 at 06:05:33PM -0600, Rob Herring wrote:
>>> On 03/07/2013 02:47 PM, Thierry Reding wrote:
>> [...]
>>>> In a nutshell (since some of the context isn't quoted anymore) the
>>>> problem that we're trying to solve is that some of the embedded SoCs
>>>> require per-root-port registers for configuration. The PCI DT
>>>> specification doesn't make any provisions for this. A few alternatives
>>>> have been discussed so far:
>>>
>>> I'm not sure I follow. This is different than the host controller
>>> registers? Why would this not just be multiple entries in the reg property?
>>
>> Well the register regions are per root-port. On Tegra20 there's 2 of
>> them, Tegra30 has 3 and if I understand correctly Marvell can have up to
>> 10 (!). Adding all of them to the reg property of the host controller
>> could work but it needs some way to match the reg entry to the root port
>> similar to option 1 below.
>
> The compatible property of the PCI host controller can imply what each
> index of the reg property entries is for.
>
>>
>> Adding a property in the root port nodes seems like a cleaner and more
>> accurate representation of the hardware to me, but if that's not
>> acceptable perhaps we need to bite the bullet and add the code to look
>> the registers up from the parent's reg property.
>
> What I don't like is a new property defined to describe mmio addresses.
> We already have a property for that and it is "reg". But I think I'm
> still missing something:
As stated in my recent reply to Jason, I thing the correct property is
"ranges". "Ranges" translates mappable child address space addresses
into parent addresses, and that is exactly what is going on. A specific
subset of config addresses is mappable into parent MMIO space.
When done that way, there is no need for the multi-entry reg property at
the top level, and the correspondence between specific root port DT
nodes and their MMIO addresses is easily determined by matching child
reg properties to parent ranges entries.
>
>>>> pci@0,1 {
>>>> ...
>>>> reg = <0x00000800 0 0 0 0>;
>
> Is this a PCI bus address?
>
>>>> regs = <0x80000000 0x00001000>;
>>>> };
>
> Rob
>
> _______________________________________________
> devicetree-discuss mailing list
> devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ@public.gmane.org
> https://lists.ozlabs.org/listinfo/devicetree-discuss
>
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-10 4:52 ` Mitch Bradley
@ 2013-03-10 6:55 ` Jason Gunthorpe
[not found] ` <20130310065539.GA14704-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
0 siblings, 1 reply; 59+ messages in thread
From: Jason Gunthorpe @ 2013-03-10 6:55 UTC (permalink / raw)
To: Mitch Bradley
Cc: Thierry Reding, Lior Amsalem, Russell King - ARM Linux,
Jason Cooper, Andrew Lunn, linux-pci, devicetree-discuss,
Eran Ben-Avi, Nadav Haklai, Maen Suleiman, Bjorn Helgaas,
Shadi Ammouri, Tawfik Bayouk, linux-arm-kernel
On Sat, Mar 09, 2013 at 06:52:13PM -1000, Mitch Bradley wrote:
> Okay, I think I finally get it. The Marvell root port bridge setup
> registers look like standard config headers, even though they aren't
> really in config space (because you need a different access method for
> them, compared to real config accesses to downstream devices).
Well, thats pretty close. On Marvell the MMIO space has a lot of
stuff, some of it is config related, lots of it isn't. In fact most of
it isn't. Tegra is different, the MMIO region is exactly the config
space of the root port bridge.
Tegra could probably happily and sanely do as you've described (well,
see below), but it is wonky for Marvell. It looks like there are more
drivers coming that have this need - so I had hoped for a general
answer to the 'per-port MMIO space' problem that is unrelated to the
MMIO spaces role in config access.
> So, in order for the common code that enumerates the PCI bus to work,
> you need to "spoof" the config accesses so that when you try to access
> those particular "pseudo config headers", it uses the MMIO method
> instead of the real config access method.
There are rules for config access in both Tegra and Marvell that are
not trivial. Again, the driver takes are of this and all Linux sees is
a nice compliant config space.
> If that is indeed the case, them I would vote for a slight modification
> of the intermediate patch that I cited earlier - the one in which the
> ranges property includes translations from those special config
> addresses into CPU addresses. The modification is to fix the sizes,
> changing 0x2000 to 0x800, so those ranges entries do not overlap in the
> child address domain.
On Marvell the size of the MMIO space is 0x2000, there are imporant
registers there beyond index 0x800 (todays driver may not touch them
but HW has them). Reducing the size doesn't seem appropriate.
> pcie-controller {
> compatible = "marvell,armada-370-pcie";
> status = "disabled";
device_type = "pci";
Here as well, as you noted before?
> #address-cells = <3>;
> #size-cells = <2>;
>
> bus-range = <0x00 0xff>;
>
> ranges = <0x00000800 0 0 0xd4004000 0 0x00000800 /* Root
> port config header */
> 0x00001000 0 0 0xd4008000 0 0x00000800 /* Root
> port config header */
> 0x82000000 0 0xe0000000 0xe0000000 0 0x08000000 /*
> non-prefetchable memory */
> 0x81000000 0 0 0xe8000000 0 0x00100000>; /*
> downstream I/O */
>
> pcie@1,0 {
> device_type = "pci";
> reg = <0x0800 0 0 0 0>;
Okay, this was Thierry's original idea, and it was my note that this
didn't seem like a good choice. I'm not wedded to that, but I'll
explain my reasoning a bit.
Two things bothered me
- Describing a CPU MMIO mapping with a config address space seems
wonky
- Linux's OF core doesn't parse a 'reg' property under
'device_type = "pci"' as something CPU mappable, it only looks
to assigned-addresses for that. So the OF core would need to learn
how to handle this case, and it would have to be well defined..
Thierry's original patch avoided this problem by not using
device_type = "pci"..
Also, did you mean to have a 0 size for the 'reg'? How will things
know how big the MMIO region is? Can't just assume 0x800..
Again, thanks!
Jason
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-10 5:04 ` Mitch Bradley
@ 2013-03-10 15:06 ` Thomas Petazzoni
2013-03-10 18:33 ` Mitch Bradley
0 siblings, 1 reply; 59+ messages in thread
From: Thomas Petazzoni @ 2013-03-10 15:06 UTC (permalink / raw)
To: Mitch Bradley
Cc: Rob Herring, Lior Amsalem, Andrew Lunn, Russell King - ARM Linux,
Jason Cooper, Nadav Haklai, linux-pci, devicetree-discuss,
Eran Ben-Avi, Jason Gunthorpe, Maen Suleiman, Shadi Ammouri,
Bjorn Helgaas, Tawfik Bayouk, linux-arm-kernel
Dear Mitch Bradley,
On Sat, 09 Mar 2013 19:04:51 -1000, Mitch Bradley wrote:
> As stated in my recent reply to Jason, I thing the correct property is
> "ranges". "Ranges" translates mappable child address space addresses
> into parent addresses, and that is exactly what is going on. A specific
> subset of config addresses is mappable into parent MMIO space.
The PCI configuration space is *not* mapped in the MMIO space on
Marvell hardware. In the MMIO space of each PCIe interface, there are
many registers, only *two* of which are dedicated to accessing the PCI
configuration space:
* One register to set the offset in the PCI configuration space.
* One register to read or write a value in the PCI configuration, at
the offset specified in the first register.
See the implementation of mvebu_pcie_hw_rd_conf() and
mvebu_pcie_hw_wr_conf() in the driver.
So really, the values specified in the reg = <...> property are *not*
the PCI configuration spaces mapped in the MMIO space. They represent a
bunch of per PCIe interface registers used to configure them, get the
status of the link... and access, through an indirect mechanism, the
PCI configuration space.
Does this helps?
Thanks,
Thomas
--
Thomas Petazzoni, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-08 20:13 ` Thierry Reding
@ 2013-03-10 15:09 ` Thomas Petazzoni
2013-03-11 8:08 ` Thierry Reding
0 siblings, 1 reply; 59+ messages in thread
From: Thomas Petazzoni @ 2013-03-10 15:09 UTC (permalink / raw)
To: Thierry Reding
Cc: Jason Gunthorpe, Lior Amsalem, Andrew Lunn,
Russell King - ARM Linux, Jason Cooper, Tawfik Bayouk, linux-pci,
devicetree-discuss, Eran Ben-Avi, Nadav Haklai, Maen Suleiman,
Shadi Ammouri, Bjorn Helgaas, linux-arm-kernel
Dear Thierry Reding,
On Fri, 8 Mar 2013 21:13:40 +0100, Thierry Reding wrote:
> On Fri, Mar 08, 2013 at 01:02:46PM -0700, Jason Gunthorpe wrote:
> > On Fri, Mar 08, 2013 at 09:43:11AM -1000, Mitch Bradley wrote:
> [...]
> > > 2) The "@0,0" and "@1,0" suffixes do not correspond to the reg values
> > > <0x0800 0 0 0 0> and <0x1000 0 0 0 0> using any rule that I know.
> >
> > @0,1,0 (bus,device,fn) could be more appropriate, but that is
> > cosmetic?
>
> The OF PCI specification is pretty strict about this as well. It says in
> section 2.2.1.3. that only the DD and DD,FF forms can appear in a device
> path.
Note that in the case of my driver, the @X,Y represent the port and
lane of the PCIe interface. Not sure if it is correct, I can change the
names to whatever is appropriate, those names aren't used anywhere in
the driver.
Best regards,
Thomas
--
Thomas Petazzoni, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-10 15:06 ` Thomas Petazzoni
@ 2013-03-10 18:33 ` Mitch Bradley
0 siblings, 0 replies; 59+ messages in thread
From: Mitch Bradley @ 2013-03-10 18:33 UTC (permalink / raw)
To: Thomas Petazzoni
Cc: Lior Amsalem, Andrew Lunn, Russell King - ARM Linux, Jason Cooper,
Tawfik Bayouk, linux-pci-u79uwXL29TY76Z2rM5mHXA,
devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ, Eran Ben-Avi,
Nadav Haklai, Maen Suleiman, Bjorn Helgaas, Shadi Ammouri,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
Jason Gunthorpe
On 3/10/2013 5:06 AM, Thomas Petazzoni wrote:
> Dear Mitch Bradley,
>
> On Sat, 09 Mar 2013 19:04:51 -1000, Mitch Bradley wrote:
>
>> As stated in my recent reply to Jason, I thing the correct property is
>> "ranges". "Ranges" translates mappable child address space addresses
>> into parent addresses, and that is exactly what is going on. A specific
>> subset of config addresses is mappable into parent MMIO space.
>
> The PCI configuration space is *not* mapped in the MMIO space on
> Marvell hardware. In the MMIO space of each PCIe interface, there are
> many registers, only *two* of which are dedicated to accessing the PCI
> configuration space:
>
> * One register to set the offset in the PCI configuration space.
>
> * One register to read or write a value in the PCI configuration, at
> the offset specified in the first register.
>
> See the implementation of mvebu_pcie_hw_rd_conf() and
> mvebu_pcie_hw_wr_conf() in the driver.
>
> So really, the values specified in the reg = <...> property are *not*
> the PCI configuration spaces mapped in the MMIO space. They represent a
> bunch of per PCIe interface registers used to configure them, get the
> status of the link... and access, through an indirect mechanism, the
> PCI configuration space.
>
> Does this helps?
I agree that PCI config space accesses to *downstream* devices is via an
indirect-access register pair.
The question is, does that indirect-access mechanism apply also to the
internal config headers for the root port bridges?
According to section 20.15 of the MV78230 functional spec that I am
looking at, the configuration header registers are mapped to the
internal memory space. That section is unclear about whether those
registers are CPU-accessible via indirect-access config transactions.
When the PCIe hardware is configured for endpoint mode, the internal
headers can be accessed via PCIe config transactions from an external
port, but in root complex mode, the possibility of indirect access from
the CPU is not mentioned.
The manual is a little vague in some respects, but it does say quite
clearly that MMIO access to the root port bridge config header is
possible, and it lists the MMIO addresses thereof (section A.10).
So we all agree that access to external (downstream) config registers is
via the indirect register pair. The unclear thing is whether the
internal config registers for the root port bridges can be indirectly
accessed.
Do we have empirical evidence that indirect-access works to the internal
root port bridge config headers?
>
> Thanks,
>
> Thomas
>
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
[not found] ` <20130310065539.GA14704-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2013-03-11 5:46 ` Mitch Bradley
[not found] ` <513D6F9C.9000100-D5eQfiDGL7eakBO8gow8eQ@public.gmane.org>
2013-03-11 18:15 ` Jason Gunthorpe
0 siblings, 2 replies; 59+ messages in thread
From: Mitch Bradley @ 2013-03-11 5:46 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Lior Amsalem, Andrew Lunn, Russell King - ARM Linux, Jason Cooper,
Tawfik Bayouk, linux-pci-u79uwXL29TY76Z2rM5mHXA,
devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ, Eran Ben-Avi,
Nadav Haklai, Maen Suleiman, Shadi Ammouri, Bjorn Helgaas,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
On 3/9/2013 8:55 PM, Jason Gunthorpe wrote:
> On Sat, Mar 09, 2013 at 06:52:13PM -1000, Mitch Bradley wrote:
>
>> Okay, I think I finally get it. The Marvell root port bridge setup
>> registers look like standard config headers, even though they aren't
>> really in config space (because you need a different access method for
>> them, compared to real config accesses to downstream devices).
>
> Well, thats pretty close. On Marvell the MMIO space has a lot of
> stuff, some of it is config related, lots of it isn't. In fact most of
> it isn't. Tegra is different, the MMIO region is exactly the config
> space of the root port bridge.
>
> Tegra could probably happily and sanely do as you've described (well,
> see below), but it is wonky for Marvell. It looks like there are more
> drivers coming that have this need - so I had hoped for a general
> answer to the 'per-port MMIO space' problem that is unrelated to the
> MMIO spaces role in config access.
>
>> So, in order for the common code that enumerates the PCI bus to work,
>> you need to "spoof" the config accesses so that when you try to access
>> those particular "pseudo config headers", it uses the MMIO method
>> instead of the real config access method.
>
> There are rules for config access in both Tegra and Marvell that are
> not trivial. Again, the driver takes are of this and all Linux sees is
> a nice compliant config space.
Thanks for explaining that.
So it seems that we are faced with two requirements that are somewhat at
odds with one another.
1) Some of the root port bridge registers have to be accessed via config
space access functions so common PCI enumeration code will work. To
represent this with the usual DT structure, the top root-complex needs
to define a 3/2 address space so its children
can have standard PCI reg properties. Presumably, if those registers
are being programmed by common code, Marvell-specific code would
restrict its role to setting up config-space access functions, leaving
the actual touching of the registers to the common code.
2) Marvell chips have additional non-standard per-root-port registers
that generic PCI code would not understand. These registers would be
touched only by Marvell-specific code.
The two kinds of registers are adjacent in MMIO space. However, unless
I am misunderstanding this MV78230 manual, the highest "config header"
register index is 0x134, well below the 0x1000 size limit of a PCIe
config header. Some of the extra registers are at 0x8xx, and others are
above 0x1800.
That suggests the following:
For the "config header" registers that use generic PCI code, use a
ranges entry to associate (pass through) MMIO addresses to the config
header portion of the register block. This parameterizes the code that
sets up the special-case PCI config access. That specification
technique is general and could be used not only for the Marvell case,
but also for any other chip that has some number of direct-mapped config
headers.
For requirement (2), the top node has a reg property listing the
portions of the address space that are consumed by the driver at the top
level instead of being passed through to the PCI addressing domain. E.g.
reg = <0xd0040800 0x1800>, <0xd0080800 0x1800>;
Thus we have accurately described the two aspects of the true situation.
The PCI-compliant "config header" registers are passed through to the
child nodes where they can be dealt with in the normal PCI fashion. The
non-PCI-compliant register footprint lives within the Marvell-specific
root-complex driver.
The root-complex driver presumably needs associate non-compliant
register blocks with specific child nodes. That can be done by
requiring that the reg entries are in the same order as the config-space
ranges entries.
Anticipating the possible objection that ARM's 0x1000-byte page size
does not permit virtual-to-physical mapping at 0x800-byte granularity:
The device tree does not guarantee that reg entries are page-aligned; it
simply tries to describe the reality, even though it might be messy.
>
>> If that is indeed the case, them I would vote for a slight modification
>> of the intermediate patch that I cited earlier - the one in which the
>> ranges property includes translations from those special config
>> addresses into CPU addresses. The modification is to fix the sizes,
>> changing 0x2000 to 0x800, so those ranges entries do not overlap in the
>> child address domain.
BTW I have completely changed my mind about the overlap thing.
I said that it was bogus to use size=0x2000 for a config space header.
That was based on an interpretation - which I now dislike - of the
meaning of 3-cell config addresses. By that old interpretation,
size=0x800 would also be bogus, because bits 10-8 of the config address
are for the function number.
Consider the following question, which I have never previously
considered, at least not explicitly:
Q: What would be the 3-cell representation of the Command/Status
register address (offset 4) in device 1, function 1?
One obvious - but weak - answer is <0x00000904 0 0> . It's obvious
because the value 0x80000904 is what you put in the cf8 indirect access
address register. But it is weak because it unnecessarily restricts the
config header size, and because it works differently than offsets
applied to memory and I/O space addresses. Furthermore, coupling the
offset to the cf8 register is dodgy because of the funniness where you
have to but the byte-selector bits 1..0 not in the address register cf8,
but rather in the data register cfc.
The better answer is <0x00000900 0 4>, using the third cell for the
offset, as with IO and memory space offsets.
Under this better interpretation, config space sizes larger than 0x100
are no problem. The first cell is just a selector. The size is "added"
to the third cell, so it does not encroach into the first cell's
device,function bits.
This better interpretation easily handles PCIe's 0x1000-byte extended
config headers, and trivially accomodates even larger sizes should a
future PCI variant further increase the header size. as described way
down below.
>
> On Marvell the size of the MMIO space is 0x2000, there are imporant
> registers there beyond index 0x800 (todays driver may not touch them
> but HW has them). Reducing the size doesn't seem appropriate.
Per the above, the current idea is not to reduce the size, but rather to
split the 0x2000 into two pieces, accurately reflecting the fact that
part of it is PCI-compliant and can be handled by generic PCI drivers,
while the rest must be handled differently, outside of standard PCI
land. Commingling the two seems dubious.
>
>> pcie-controller {
>> compatible = "marvell,armada-370-pcie";
>> status = "disabled";
>
> device_type = "pci";
>
> Here as well, as you noted before?
>
>> #address-cells = <3>;
>> #size-cells = <2>;
>>
>> bus-range = <0x00 0xff>;
>>
>> ranges = <0x00000800 0 0 0xd4004000 0 0x00000800 /* Root
>> port config header */
>> 0x00001000 0 0 0xd4008000 0 0x00000800 /* Root
>> port config header */
>> 0x82000000 0 0xe0000000 0xe0000000 0 0x08000000 /*
>> non-prefetchable memory */
>> 0x81000000 0 0 0xe8000000 0 0x00100000>; /*
>> downstream I/O */
>>
>> pcie@1,0 {
>> device_type = "pci";
>> reg = <0x0800 0 0 0 0>;
>
> Okay, this was Thierry's original idea, and it was my note that this
> didn't seem like a good choice. I'm not wedded to that, but I'll
> explain my reasoning a bit.
>
> Two things bothered me
> - Describing a CPU MMIO mapping with a config address space seems
> wonky
I agree that mapping config space is sort of a jarring concept, but I
think that's because PCs have polluted the mindspace, not because there
is anything inherently bad about it. The original PCI spec
fundamentally treated config space as just another (rather segmented)
linear address space. It specified the cf8/cfc indirect access
mechanism not as a deep semantic feature, but rather as an
implementation hack to work around addressing limitations of PC chipsets
and the addressing-deficient popular OS of the time (Windows 3.1 which
was a veneer over DOS). The PCI spec defined configuration mechanism #2
as a direct map into PC I/O space. RISC chipsets of that era (e.g.
Alpha, Power PC) often direct-mapped config space into some form of
load/store space.
But PCs soon encroached on the RISC market and RISC system vendors
started to "leverage" PC I/O chipsets. The indirect access "config
mechanism #1" became part of the mindset. Indirect access was hardcoded
in enough software that chipset designers would provide indirect access
even if direct-mapping was possible.
Fundamentally, a config header is just a block of registers, addressable
as an index relative to some base. It might require some specific
access path - but that is true with any block of registers. Even
direct-mapped MMIO registers often require specific semantics such as
non-cached, non-write-buffered, or special address space ids.
The PCI SIG expected PCIe extended config space to be direct-mapped -
see slide 25 of
http://www.pcisig.com/specifications/pciexpress/technical_library/dev_con_09_02/specifications/pciexpress/PCIExpress_SoftwareandConfigurationModel.pdf
So mappability of config space was intended from the get-go. The real
wonky thing is the "indirect access zombie that cannot die".
> - Linux's OF core doesn't parse a 'reg' property under
> 'device_type = "pci"' as something CPU mappable, it only looks
> to assigned-addresses for that. So the OF core would need to learn
> how to handle this case, and it would have to be well defined..
The way I envisioned it working is that the root-complex node defines
its own config space access functions (the "ops" argument to
pci_scan_root_bus). For config addresses listed in ranges, the
functions do MMIO. For others, they use the chipset's indirect-access
registers. The OF core is not involved.
>
> Thierry's original patch avoided this problem by not using
> device_type = "pci"..
>
> Also, did you mean to have a 0 size for the 'reg'? How will things
> know how big the MMIO region is? Can't just assume 0x800..
Sorry, that was a typo.
>
> Again, thanks!
> Jason
>
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
[not found] ` <513D6F9C.9000100-D5eQfiDGL7eakBO8gow8eQ@public.gmane.org>
@ 2013-03-11 7:46 ` Thierry Reding
[not found] ` <20130311074615.GA6365-RM9K5IK7kjIQXX3q8xo1gnVAuStQJXxyR5q1nwbD4aMs9pC9oP6+/A@public.gmane.org>
0 siblings, 1 reply; 59+ messages in thread
From: Thierry Reding @ 2013-03-11 7:46 UTC (permalink / raw)
To: Mitch Bradley
Cc: Lior Amsalem, Andrew Lunn, Russell King - ARM Linux, Jason Cooper,
Nadav Haklai, linux-pci-u79uwXL29TY76Z2rM5mHXA,
devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ, Eran Ben-Avi,
Jason Gunthorpe, Maen Suleiman, Shadi Ammouri, Bjorn Helgaas,
Tawfik Bayouk, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
[-- Attachment #1.1: Type: text/plain, Size: 11049 bytes --]
On Sun, Mar 10, 2013 at 07:46:04PM -1000, Mitch Bradley wrote:
> On 3/9/2013 8:55 PM, Jason Gunthorpe wrote:
> > On Sat, Mar 09, 2013 at 06:52:13PM -1000, Mitch Bradley wrote:
> >
> >> Okay, I think I finally get it. The Marvell root port bridge setup
> >> registers look like standard config headers, even though they aren't
> >> really in config space (because you need a different access method for
> >> them, compared to real config accesses to downstream devices).
> >
> > Well, thats pretty close. On Marvell the MMIO space has a lot of
> > stuff, some of it is config related, lots of it isn't. In fact most of
> > it isn't. Tegra is different, the MMIO region is exactly the config
> > space of the root port bridge.
> >
> > Tegra could probably happily and sanely do as you've described (well,
> > see below), but it is wonky for Marvell. It looks like there are more
> > drivers coming that have this need - so I had hoped for a general
> > answer to the 'per-port MMIO space' problem that is unrelated to the
> > MMIO spaces role in config access.
> >
> >> So, in order for the common code that enumerates the PCI bus to work,
> >> you need to "spoof" the config accesses so that when you try to access
> >> those particular "pseudo config headers", it uses the MMIO method
> >> instead of the real config access method.
> >
> > There are rules for config access in both Tegra and Marvell that are
> > not trivial. Again, the driver takes are of this and all Linux sees is
> > a nice compliant config space.
>
> Thanks for explaining that.
>
> So it seems that we are faced with two requirements that are somewhat at
> odds with one another.
>
> 1) Some of the root port bridge registers have to be accessed via config
> space access functions so common PCI enumeration code will work. To
> represent this with the usual DT structure, the top root-complex needs
> to define a 3/2 address space so its children
> can have standard PCI reg properties. Presumably, if those registers
> are being programmed by common code, Marvell-specific code would
> restrict its role to setting up config-space access functions, leaving
> the actual touching of the registers to the common code.
>
> 2) Marvell chips have additional non-standard per-root-port registers
> that generic PCI code would not understand. These registers would be
> touched only by Marvell-specific code.
>
> The two kinds of registers are adjacent in MMIO space. However, unless
> I am misunderstanding this MV78230 manual, the highest "config header"
> register index is 0x134, well below the 0x1000 size limit of a PCIe
> config header. Some of the extra registers are at 0x8xx, and others are
> above 0x1800.
>
> That suggests the following:
>
> For the "config header" registers that use generic PCI code, use a
> ranges entry to associate (pass through) MMIO addresses to the config
> header portion of the register block. This parameterizes the code that
> sets up the special-case PCI config access. That specification
> technique is general and could be used not only for the Marvell case,
> but also for any other chip that has some number of direct-mapped config
> headers.
>
> For requirement (2), the top node has a reg property listing the
> portions of the address space that are consumed by the driver at the top
> level instead of being passed through to the PCI addressing domain. E.g.
>
> reg = <0xd0040800 0x1800>, <0xd0080800 0x1800>;
>
> Thus we have accurately described the two aspects of the true situation.
> The PCI-compliant "config header" registers are passed through to the
> child nodes where they can be dealt with in the normal PCI fashion. The
> non-PCI-compliant register footprint lives within the Marvell-specific
> root-complex driver.
>
> The root-complex driver presumably needs associate non-compliant
> register blocks with specific child nodes. That can be done by
> requiring that the reg entries are in the same order as the config-space
> ranges entries.
>
> Anticipating the possible objection that ARM's 0x1000-byte page size
> does not permit virtual-to-physical mapping at 0x800-byte granularity:
> The device tree does not guarantee that reg entries are page-aligned; it
> simply tries to describe the reality, even though it might be messy.
>
> >
> >> If that is indeed the case, them I would vote for a slight modification
> >> of the intermediate patch that I cited earlier - the one in which the
> >> ranges property includes translations from those special config
> >> addresses into CPU addresses. The modification is to fix the sizes,
> >> changing 0x2000 to 0x800, so those ranges entries do not overlap in the
> >> child address domain.
>
> BTW I have completely changed my mind about the overlap thing.
>
> I said that it was bogus to use size=0x2000 for a config space header.
> That was based on an interpretation - which I now dislike - of the
> meaning of 3-cell config addresses. By that old interpretation,
> size=0x800 would also be bogus, because bits 10-8 of the config address
> are for the function number.
>
> Consider the following question, which I have never previously
> considered, at least not explicitly:
>
> Q: What would be the 3-cell representation of the Command/Status
> register address (offset 4) in device 1, function 1?
>
> One obvious - but weak - answer is <0x00000904 0 0> . It's obvious
> because the value 0x80000904 is what you put in the cf8 indirect access
> address register. But it is weak because it unnecessarily restricts the
> config header size, and because it works differently than offsets
> applied to memory and I/O space addresses. Furthermore, coupling the
> offset to the cf8 register is dodgy because of the funniness where you
> have to but the byte-selector bits 1..0 not in the address register cf8,
> but rather in the data register cfc.
>
> The better answer is <0x00000900 0 4>, using the third cell for the
> offset, as with IO and memory space offsets.
>
> Under this better interpretation, config space sizes larger than 0x100
> are no problem. The first cell is just a selector. The size is "added"
> to the third cell, so it does not encroach into the first cell's
> device,function bits.
>
> This better interpretation easily handles PCIe's 0x1000-byte extended
> config headers, and trivially accomodates even larger sizes should a
> future PCI variant further increase the header size. as described way
> down below.
That makes a lot of sense. It also mirrors parts of a discussion we
previously had when we first discussed a DT binding for Tegra PCIe.
Thanks for taking the time to go over this in so much detail. I very
much appreciate it.
> >> #address-cells = <3>;
> >> #size-cells = <2>;
> >>
> >> bus-range = <0x00 0xff>;
> >>
> >> ranges = <0x00000800 0 0 0xd4004000 0 0x00000800 /* Root
> >> port config header */
> >> 0x00001000 0 0 0xd4008000 0 0x00000800 /* Root
> >> port config header */
> >> 0x82000000 0 0xe0000000 0xe0000000 0 0x08000000 /*
> >> non-prefetchable memory */
> >> 0x81000000 0 0 0xe8000000 0 0x00100000>; /*
> >> downstream I/O */
> >>
> >> pcie@1,0 {
> >> device_type = "pci";
> >> reg = <0x0800 0 0 0 0>;
> >
> > Okay, this was Thierry's original idea, and it was my note that this
> > didn't seem like a good choice. I'm not wedded to that, but I'll
> > explain my reasoning a bit.
> >
> > Two things bothered me
> > - Describing a CPU MMIO mapping with a config address space seems
> > wonky
>
> I agree that mapping config space is sort of a jarring concept, but I
> think that's because PCs have polluted the mindspace, not because there
> is anything inherently bad about it. The original PCI spec
> fundamentally treated config space as just another (rather segmented)
> linear address space. It specified the cf8/cfc indirect access
> mechanism not as a deep semantic feature, but rather as an
> implementation hack to work around addressing limitations of PC chipsets
> and the addressing-deficient popular OS of the time (Windows 3.1 which
> was a veneer over DOS). The PCI spec defined configuration mechanism #2
> as a direct map into PC I/O space. RISC chipsets of that era (e.g.
> Alpha, Power PC) often direct-mapped config space into some form of
> load/store space.
>
> But PCs soon encroached on the RISC market and RISC system vendors
> started to "leverage" PC I/O chipsets. The indirect access "config
> mechanism #1" became part of the mindset. Indirect access was hardcoded
> in enough software that chipset designers would provide indirect access
> even if direct-mapping was possible.
>
> Fundamentally, a config header is just a block of registers, addressable
> as an index relative to some base. It might require some specific
> access path - but that is true with any block of registers. Even
> direct-mapped MMIO registers often require specific semantics such as
> non-cached, non-write-buffered, or special address space ids.
>
> The PCI SIG expected PCIe extended config space to be direct-mapped -
> see slide 25 of
>
> http://www.pcisig.com/specifications/pciexpress/technical_library/dev_con_09_02/specifications/pciexpress/PCIExpress_SoftwareandConfigurationModel.pdf
>
> So mappability of config space was intended from the get-go. The real
> wonky thing is the "indirect access zombie that cannot die".
>
> > - Linux's OF core doesn't parse a 'reg' property under
> > 'device_type = "pci"' as something CPU mappable, it only looks
> > to assigned-addresses for that. So the OF core would need to learn
> > how to handle this case, and it would have to be well defined..
>
> The way I envisioned it working is that the root-complex node defines
> its own config space access functions (the "ops" argument to
> pci_scan_root_bus). For config addresses listed in ranges, the
> functions do MMIO. For others, they use the chipset's indirect-access
> registers. The OF core is not involved.
>
> >
> > Thierry's original patch avoided this problem by not using
> > device_type = "pci"..
I think the OF core still needs to be involved in order to translate the
reg = <0x0800 0 0 0 0>; entry into a resource that can be ioremap()'ed.
As Jason already said, the OF core only does that for assigned-addresses
when device_type = "pci".
Otherwise the pci_ops implementation still couldn't access the MMIO
registers. Aiming for a driver-specific solution doesn't seem like a
good idea but if the functionality is common enough to be required by
two or more drivers perhaps a new helper could be created for exactly
this purpose.
Perhaps another alternative would be to extend the OF core to translate
the entries in the reg property as well. Or maybe I misunderstood what
you said.
Thierry
[-- Attachment #1.2: Type: application/pgp-signature, Size: 836 bytes --]
[-- Attachment #2: Type: text/plain, Size: 192 bytes --]
_______________________________________________
devicetree-discuss mailing list
devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ@public.gmane.org
https://lists.ozlabs.org/listinfo/devicetree-discuss
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-10 15:09 ` Thomas Petazzoni
@ 2013-03-11 8:08 ` Thierry Reding
0 siblings, 0 replies; 59+ messages in thread
From: Thierry Reding @ 2013-03-11 8:08 UTC (permalink / raw)
To: Thomas Petazzoni
Cc: Jason Gunthorpe, Lior Amsalem, Andrew Lunn,
Russell King - ARM Linux, Jason Cooper, Tawfik Bayouk, linux-pci,
devicetree-discuss, Eran Ben-Avi, Nadav Haklai, Maen Suleiman,
Shadi Ammouri, Bjorn Helgaas, linux-arm-kernel
[-- Attachment #1: Type: text/plain, Size: 1069 bytes --]
On Sun, Mar 10, 2013 at 04:09:37PM +0100, Thomas Petazzoni wrote:
> Dear Thierry Reding,
>
> On Fri, 8 Mar 2013 21:13:40 +0100, Thierry Reding wrote:
> > On Fri, Mar 08, 2013 at 01:02:46PM -0700, Jason Gunthorpe wrote:
> > > On Fri, Mar 08, 2013 at 09:43:11AM -1000, Mitch Bradley wrote:
> > [...]
> > > > 2) The "@0,0" and "@1,0" suffixes do not correspond to the reg values
> > > > <0x0800 0 0 0 0> and <0x1000 0 0 0 0> using any rule that I know.
> > >
> > > @0,1,0 (bus,device,fn) could be more appropriate, but that is
> > > cosmetic?
> >
> > The OF PCI specification is pretty strict about this as well. It says in
> > section 2.2.1.3. that only the DD and DD,FF forms can appear in a device
> > path.
>
> Note that in the case of my driver, the @X,Y represent the port and
> lane of the PCIe interface. Not sure if it is correct, I can change the
> names to whatever is appropriate, those names aren't used anywhere in
> the driver.
I think they need to be changed to be @dev,fn in order to conform to the
OF PCI binding.
Thierry
[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
[not found] ` <20130311074615.GA6365-RM9K5IK7kjIQXX3q8xo1gnVAuStQJXxyR5q1nwbD4aMs9pC9oP6+/A@public.gmane.org>
@ 2013-03-11 18:04 ` Mitch Bradley
2013-03-11 18:23 ` Jason Gunthorpe
0 siblings, 1 reply; 59+ messages in thread
From: Mitch Bradley @ 2013-03-11 18:04 UTC (permalink / raw)
To: Thierry Reding
Cc: Lior Amsalem, Andrew Lunn, Russell King - ARM Linux, Jason Cooper,
Nadav Haklai, linux-pci-u79uwXL29TY76Z2rM5mHXA,
devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ, Eran Ben-Avi,
Jason Gunthorpe, Maen Suleiman, Shadi Ammouri, Bjorn Helgaas,
Tawfik Bayouk, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
On 3/10/2013 9:46 PM, Thierry Reding wrote:
> On Sun, Mar 10, 2013 at 07:46:04PM -1000, Mitch Bradley wrote:
>> On 3/9/2013 8:55 PM, Jason Gunthorpe wrote:
>>> On Sat, Mar 09, 2013 at 06:52:13PM -1000, Mitch Bradley wrote:
>>>
>>>> Okay, I think I finally get it. The Marvell root port bridge setup
>>>> registers look like standard config headers, even though they aren't
>>>> really in config space (because you need a different access method for
>>>> them, compared to real config accesses to downstream devices).
>>>
>>> Well, thats pretty close. On Marvell the MMIO space has a lot of
>>> stuff, some of it is config related, lots of it isn't. In fact most of
>>> it isn't. Tegra is different, the MMIO region is exactly the config
>>> space of the root port bridge.
>>>
>>> Tegra could probably happily and sanely do as you've described (well,
>>> see below), but it is wonky for Marvell. It looks like there are more
>>> drivers coming that have this need - so I had hoped for a general
>>> answer to the 'per-port MMIO space' problem that is unrelated to the
>>> MMIO spaces role in config access.
>>>
>>>> So, in order for the common code that enumerates the PCI bus to work,
>>>> you need to "spoof" the config accesses so that when you try to access
>>>> those particular "pseudo config headers", it uses the MMIO method
>>>> instead of the real config access method.
>>>
>>> There are rules for config access in both Tegra and Marvell that are
>>> not trivial. Again, the driver takes are of this and all Linux sees is
>>> a nice compliant config space.
>>
>> Thanks for explaining that.
>>
>> So it seems that we are faced with two requirements that are somewhat at
>> odds with one another.
>>
>> 1) Some of the root port bridge registers have to be accessed via config
>> space access functions so common PCI enumeration code will work. To
>> represent this with the usual DT structure, the top root-complex needs
>> to define a 3/2 address space so its children
>> can have standard PCI reg properties. Presumably, if those registers
>> are being programmed by common code, Marvell-specific code would
>> restrict its role to setting up config-space access functions, leaving
>> the actual touching of the registers to the common code.
>>
>> 2) Marvell chips have additional non-standard per-root-port registers
>> that generic PCI code would not understand. These registers would be
>> touched only by Marvell-specific code.
>>
>> The two kinds of registers are adjacent in MMIO space. However, unless
>> I am misunderstanding this MV78230 manual, the highest "config header"
>> register index is 0x134, well below the 0x1000 size limit of a PCIe
>> config header. Some of the extra registers are at 0x8xx, and others are
>> above 0x1800.
>>
>> That suggests the following:
>>
>> For the "config header" registers that use generic PCI code, use a
>> ranges entry to associate (pass through) MMIO addresses to the config
>> header portion of the register block. This parameterizes the code that
>> sets up the special-case PCI config access. That specification
>> technique is general and could be used not only for the Marvell case,
>> but also for any other chip that has some number of direct-mapped config
>> headers.
>>
>> For requirement (2), the top node has a reg property listing the
>> portions of the address space that are consumed by the driver at the top
>> level instead of being passed through to the PCI addressing domain. E.g.
>>
>> reg = <0xd0040800 0x1800>, <0xd0080800 0x1800>;
>>
>> Thus we have accurately described the two aspects of the true situation.
>> The PCI-compliant "config header" registers are passed through to the
>> child nodes where they can be dealt with in the normal PCI fashion. The
>> non-PCI-compliant register footprint lives within the Marvell-specific
>> root-complex driver.
>>
>> The root-complex driver presumably needs associate non-compliant
>> register blocks with specific child nodes. That can be done by
>> requiring that the reg entries are in the same order as the config-space
>> ranges entries.
>>
>> Anticipating the possible objection that ARM's 0x1000-byte page size
>> does not permit virtual-to-physical mapping at 0x800-byte granularity:
>> The device tree does not guarantee that reg entries are page-aligned; it
>> simply tries to describe the reality, even though it might be messy.
>>
>>>
>>>> If that is indeed the case, them I would vote for a slight modification
>>>> of the intermediate patch that I cited earlier - the one in which the
>>>> ranges property includes translations from those special config
>>>> addresses into CPU addresses. The modification is to fix the sizes,
>>>> changing 0x2000 to 0x800, so those ranges entries do not overlap in the
>>>> child address domain.
>>
>> BTW I have completely changed my mind about the overlap thing.
>>
>> I said that it was bogus to use size=0x2000 for a config space header.
>> That was based on an interpretation - which I now dislike - of the
>> meaning of 3-cell config addresses. By that old interpretation,
>> size=0x800 would also be bogus, because bits 10-8 of the config address
>> are for the function number.
>>
>> Consider the following question, which I have never previously
>> considered, at least not explicitly:
>>
>> Q: What would be the 3-cell representation of the Command/Status
>> register address (offset 4) in device 1, function 1?
>>
>> One obvious - but weak - answer is <0x00000904 0 0> . It's obvious
>> because the value 0x80000904 is what you put in the cf8 indirect access
>> address register. But it is weak because it unnecessarily restricts the
>> config header size, and because it works differently than offsets
>> applied to memory and I/O space addresses. Furthermore, coupling the
>> offset to the cf8 register is dodgy because of the funniness where you
>> have to but the byte-selector bits 1..0 not in the address register cf8,
>> but rather in the data register cfc.
>>
>> The better answer is <0x00000900 0 4>, using the third cell for the
>> offset, as with IO and memory space offsets.
>>
>> Under this better interpretation, config space sizes larger than 0x100
>> are no problem. The first cell is just a selector. The size is "added"
>> to the third cell, so it does not encroach into the first cell's
>> device,function bits.
>>
>> This better interpretation easily handles PCIe's 0x1000-byte extended
>> config headers, and trivially accomodates even larger sizes should a
>> future PCI variant further increase the header size. as described way
>> down below.
>
> That makes a lot of sense. It also mirrors parts of a discussion we
> previously had when we first discussed a DT binding for Tegra PCIe.
>
> Thanks for taking the time to go over this in so much detail. I very
> much appreciate it.
>
>>>> #address-cells = <3>;
>>>> #size-cells = <2>;
>>>>
>>>> bus-range = <0x00 0xff>;
>>>>
>>>> ranges = <0x00000800 0 0 0xd4004000 0 0x00000800 /* Root
>>>> port config header */
>>>> 0x00001000 0 0 0xd4008000 0 0x00000800 /* Root
>>>> port config header */
>>>> 0x82000000 0 0xe0000000 0xe0000000 0 0x08000000 /*
>>>> non-prefetchable memory */
>>>> 0x81000000 0 0 0xe8000000 0 0x00100000>; /*
>>>> downstream I/O */
>>>>
>>>> pcie@1,0 {
>>>> device_type = "pci";
>>>> reg = <0x0800 0 0 0 0>;
>>>
>>> Okay, this was Thierry's original idea, and it was my note that this
>>> didn't seem like a good choice. I'm not wedded to that, but I'll
>>> explain my reasoning a bit.
>>>
>>> Two things bothered me
>>> - Describing a CPU MMIO mapping with a config address space seems
>>> wonky
>>
>> I agree that mapping config space is sort of a jarring concept, but I
>> think that's because PCs have polluted the mindspace, not because there
>> is anything inherently bad about it. The original PCI spec
>> fundamentally treated config space as just another (rather segmented)
>> linear address space. It specified the cf8/cfc indirect access
>> mechanism not as a deep semantic feature, but rather as an
>> implementation hack to work around addressing limitations of PC chipsets
>> and the addressing-deficient popular OS of the time (Windows 3.1 which
>> was a veneer over DOS). The PCI spec defined configuration mechanism #2
>> as a direct map into PC I/O space. RISC chipsets of that era (e.g.
>> Alpha, Power PC) often direct-mapped config space into some form of
>> load/store space.
>>
>> But PCs soon encroached on the RISC market and RISC system vendors
>> started to "leverage" PC I/O chipsets. The indirect access "config
>> mechanism #1" became part of the mindset. Indirect access was hardcoded
>> in enough software that chipset designers would provide indirect access
>> even if direct-mapping was possible.
>>
>> Fundamentally, a config header is just a block of registers, addressable
>> as an index relative to some base. It might require some specific
>> access path - but that is true with any block of registers. Even
>> direct-mapped MMIO registers often require specific semantics such as
>> non-cached, non-write-buffered, or special address space ids.
>>
>> The PCI SIG expected PCIe extended config space to be direct-mapped -
>> see slide 25 of
>>
>> http://www.pcisig.com/specifications/pciexpress/technical_library/dev_con_09_02/specifications/pciexpress/PCIExpress_SoftwareandConfigurationModel.pdf
>>
>> So mappability of config space was intended from the get-go. The real
>> wonky thing is the "indirect access zombie that cannot die".
>>
>>> - Linux's OF core doesn't parse a 'reg' property under
>>> 'device_type = "pci"' as something CPU mappable, it only looks
>>> to assigned-addresses for that. So the OF core would need to learn
>>> how to handle this case, and it would have to be well defined..
>>
>> The way I envisioned it working is that the root-complex node defines
>> its own config space access functions (the "ops" argument to
>> pci_scan_root_bus). For config addresses listed in ranges, the
>> functions do MMIO. For others, they use the chipset's indirect-access
>> registers. The OF core is not involved.
>>
>>>
>>> Thierry's original patch avoided this problem by not using
>>> device_type = "pci"..
>
> I think the OF core still needs to be involved in order to translate the
> reg = <0x0800 0 0 0 0>; entry into a resource that can be ioremap()'ed.
> As Jason already said, the OF core only does that for assigned-addresses
> when device_type = "pci".
>
> Otherwise the pci_ops implementation still couldn't access the MMIO
> registers.
I'm not sure I understand your point that <0x0800 ...> needs to be
translated into a resource.
(a) Existing code that needs to match a child nodes to a discovered PCI
hardware device uses the existing of_pci_find_child_device(parent,
devfn) from of_pci.c . That routine "just works", because the reg =
<0x0800 ..> entry is standard.
(b) The discovery/enumeration code needs to access config space via
pci_ops. The root complex driver implements pci_ops based on a trivial
parsing of ranges (omitting irrelevant details):
pci_op_read(devfn, pos) {
loop_over_ranges_entries {
if (to_devfn(ranges_entry.child) == devfn) {
return mmio_read(ranges_entry.parent + pos);
}
return indirect_read(devfn, pos);
}
The actual implementation might setup different data structures, but the
pseudocode captures the essential points: The input argument is devfn,
and a simple scan of ranges controls whether to use MMIO or indirect
access for a given devfn value. Every aspect of the scanning loop is
trivial. The loop itself is just a stride-5 walk over an array of integers.
"ranges_entry.parent" is be32_to_cpu(ranges_entry[3])
and
"to_devfn(ranges_entry.child)" is "be32_to_cpup(ranges_entry[0]) >> 8"
(you don't want to mask the result in this case because not masking
prevents non-config-space ranges entries from matching devfn.)
> Aiming for a driver-specific solution doesn't seem like a
> good idea but if the functionality is common enough to be required by
> two or more drivers perhaps a new helper could be created for exactly
> this purpose.
Indeed - but it is often best to wait and see rather than guessing about
what sort of common helper is needed. I usually guess wrong in such cases.
>
> Perhaps another alternative would be to extend the OF core to translate
> the entries in the reg property as well. Or maybe I misunderstood what
> you said.
>
> Thierry
>
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-11 5:46 ` Mitch Bradley
[not found] ` <513D6F9C.9000100-D5eQfiDGL7eakBO8gow8eQ@public.gmane.org>
@ 2013-03-11 18:15 ` Jason Gunthorpe
[not found] ` <20130311181554.GA10992-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
1 sibling, 1 reply; 59+ messages in thread
From: Jason Gunthorpe @ 2013-03-11 18:15 UTC (permalink / raw)
To: Mitch Bradley
Cc: Thierry Reding, Lior Amsalem, Russell King - ARM Linux,
Jason Cooper, Andrew Lunn, linux-pci, devicetree-discuss,
Eran Ben-Avi, Nadav Haklai, Maen Suleiman, Bjorn Helgaas,
Shadi Ammouri, Tawfik Bayouk, linux-arm-kernel
On Sun, Mar 10, 2013 at 07:46:04PM -1000, Mitch Bradley wrote:
> So it seems that we are faced with two requirements that are somewhat at
> odds with one another.
>
> 1) Some of the root port bridge registers have to be accessed via config
> space access functions so common PCI enumeration code will work. To
> represent this with the usual DT structure, the top root-complex needs
> to define a 3/2 address space so its children
> can have standard PCI reg properties. Presumably, if those registers
> are being programmed by common code, Marvell-specific code would
> restrict its role to setting up config-space access functions, leaving
> the actual touching of the registers to the common code.
>
> 2) Marvell chips have additional non-standard per-root-port registers
> that generic PCI code would not understand. These registers would be
> touched only by Marvell-specific code.
>
> The two kinds of registers are adjacent in MMIO space. However, unless
> I am misunderstanding this MV78230 manual, the highest "config header"
> register index is 0x134, well below the 0x1000 size limit of a PCIe
> config header. Some of the extra registers are at 0x8xx, and others are
> above 0x1800.
The register block you are looking at is for the cores 'target end
port mode', and the MMIO version is for internal use only, to allow
the CPU to configure RO bits, and other tasks. The target core will
allow access to that space via config cycles, either internally
(through the indirection register) or externally (from the off-chip
root complex) generated.
However - the driver runs the core in a 'root port bridge mode' where
the config header register block you are looking at is inhibited. The
Marvell IP block requires software support to run in bridge mode. So
Marvell really has only (2), while Tegra has only (1).
> For requirement (2), the top node has a reg property listing the
> portions of the address space that are consumed by the driver at the top
> level instead of being passed through to the PCI addressing domain. E.g.
>
> reg = <0xd0040800 0x1800>, <0xd0080800 0x1800>;
Okay, so this is agreeing with what Thomas already has:
http://www.spinics.net/lists/arm-kernel/msg228749.html
With your two notes:
- Correct the pcie@x,y to match the OF spec
- Add a 'device_type = "pci"' to the pcie-controller node
Is that right?
> I said that it was bogus to use size=0x2000 for a config space header.
> That was based on an interpretation - which I now dislike - of the
> meaning of 3-cell config addresses. By that old interpretation,
> size=0x800 would also be bogus, because bits 10-8 of the config address
> are for the function number.
Hum, I thought the spec was pretty clear on this point:
00 denotes Configuration Space, in which case:
[..]
bbbbbbbb,ddddd,fff,rrrrrrrrr is the Configuration Space address
hh...hh,ll...ll must be zero
And also the text at 2.1.4.4..
So you get an 8 bit register offset, placed in the highest DW..
Can you read things another way?
Is there an updated spec that supports PCI-E extended config space?
> Consider the following question, which I have never previously
> considered, at least not explicitly:
>
> Q: What would be the 3-cell representation of the Command/Status
> register address (offset 4) in device 1, function 1?
Well, see section 11.1.2, where it provides an example
for the 'Expansion ROM base address register (0x30)' as being:
02xxxx30 00000000 00000000
Also the f-code bindings say:
The data type 'config-addr' refers to the 'phys.hi' cell of the
numerical representation of a Configuration Space address.
So there is an strong convention to use the 'r' bits as the
offset..
Further, review section 12 about how ranges should be treated - it
specifically says that the b,d,f bits in ranges should be 0, and the
child address should have those bits masked prior to searching the
ranges.
Section 12 would seem to forbid this:
ranges = <0x00000800 0 0 0xd4004000 0 0x00000800 /* Root port config header */
0x00001000 0 0 0xd4008000 0 0x00000800 /* Root port config header */
Are you reading that differently?
> > Two things bothered me
> > - Describing a CPU MMIO mapping with a config address space seems
> > wonky
>
> I agree that mapping config space is sort of a jarring concept, but I
> think that's because PCs have polluted the mindspace, not because
> there
I'm well aware of MMIO config space acces, that isn't so jarring. What
is so jarring is the idea that the OF translation would work like:
<0x00000900 0 0> -> 0xd4004000
<0x00000901 0 0> -> 0xd4004001
Which is completely unlike any ranges translation I've ever seen.
Basically, I look at how the register offset is encoded in the higest
dword plus the statement in section 12, and and conclude the there
wasn't an intention to model a memory map'd config space through OF.
What am I missing here?
Regards,
Jason
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-11 18:04 ` Mitch Bradley
@ 2013-03-11 18:23 ` Jason Gunthorpe
[not found] ` <20130311182339.GB10992-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
0 siblings, 1 reply; 59+ messages in thread
From: Jason Gunthorpe @ 2013-03-11 18:23 UTC (permalink / raw)
To: Mitch Bradley
Cc: Thierry Reding, Lior Amsalem, Russell King - ARM Linux,
Jason Cooper, Andrew Lunn, linux-pci, devicetree-discuss,
Eran Ben-Avi, Nadav Haklai, Maen Suleiman, Bjorn Helgaas,
Shadi Ammouri, Tawfik Bayouk, linux-arm-kernel
> (b) The discovery/enumeration code needs to access config space via
> pci_ops. The root complex driver implements pci_ops based on a trivial
> parsing of ranges (omitting irrelevant details):
>
> pci_op_read(devfn, pos) {
> loop_over_ranges_entries {
> if (to_devfn(ranges_entry.child) == devfn) {
> return mmio_read(ranges_entry.parent + pos);
^^^^^^^^^^^^^^^^^^^
This has to be converted through all enclosing node's ranges prior to
being used as a CPU address - the OF core has all the code to do this,
but a new entry point would be needed for this specific application...
Cheers,
Jason
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
[not found] ` <20130311182339.GB10992-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2013-03-11 19:49 ` Mitch Bradley
0 siblings, 0 replies; 59+ messages in thread
From: Mitch Bradley @ 2013-03-11 19:49 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Lior Amsalem, Andrew Lunn, Russell King - ARM Linux, Jason Cooper,
Tawfik Bayouk, linux-pci-u79uwXL29TY76Z2rM5mHXA,
devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ, Eran Ben-Avi,
Nadav Haklai, Maen Suleiman, Shadi Ammouri, Bjorn Helgaas,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
On 3/11/2013 8:23 AM, Jason Gunthorpe wrote:
>> (b) The discovery/enumeration code needs to access config space via
>> pci_ops. The root complex driver implements pci_ops based on a trivial
>> parsing of ranges (omitting irrelevant details):
>>
>> pci_op_read(devfn, pos) {
>> loop_over_ranges_entries {
>> if (to_devfn(ranges_entry.child) == devfn) {
>> return mmio_read(ranges_entry.parent + pos);
> ^^^^^^^^^^^^^^^^^^^
>
> This has to be converted through all enclosing node's ranges prior to
> being used as a CPU address - the OF core has all the code to do this,
> but a new entry point would be needed for this specific application...
Agreed. But I was considering that detail to be outside the scope of
this discussion because the same thing must happen for any physical
address at this level of the tree.
>
> Cheers,
> Jason
>
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
[not found] ` <20130311181554.GA10992-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2013-03-11 21:50 ` Mitch Bradley
2013-03-11 23:25 ` Jason Gunthorpe
0 siblings, 1 reply; 59+ messages in thread
From: Mitch Bradley @ 2013-03-11 21:50 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Lior Amsalem, Andrew Lunn, Russell King - ARM Linux, Jason Cooper,
Tawfik Bayouk, linux-pci-u79uwXL29TY76Z2rM5mHXA,
devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ, Eran Ben-Avi,
Nadav Haklai, Maen Suleiman, Shadi Ammouri, Bjorn Helgaas,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
On 3/11/2013 8:15 AM, Jason Gunthorpe wrote:
> On Sun, Mar 10, 2013 at 07:46:04PM -1000, Mitch Bradley wrote:
>
>> So it seems that we are faced with two requirements that are somewhat at
>> odds with one another.
>>
>> 1) Some of the root port bridge registers have to be accessed via config
>> space access functions so common PCI enumeration code will work. To
>> represent this with the usual DT structure, the top root-complex needs
>> to define a 3/2 address space so its children
>> can have standard PCI reg properties. Presumably, if those registers
>> are being programmed by common code, Marvell-specific code would
>> restrict its role to setting up config-space access functions, leaving
>> the actual touching of the registers to the common code.
>>
>> 2) Marvell chips have additional non-standard per-root-port registers
>> that generic PCI code would not understand. These registers would be
>> touched only by Marvell-specific code.
>>
>> The two kinds of registers are adjacent in MMIO space. However, unless
>> I am misunderstanding this MV78230 manual, the highest "config header"
>> register index is 0x134, well below the 0x1000 size limit of a PCIe
>> config header. Some of the extra registers are at 0x8xx, and others are
>> above 0x1800.
>
> The register block you are looking at is for the cores 'target end
> port mode', and the MMIO version is for internal use only, to allow
> the CPU to configure RO bits, and other tasks. The target core will
> allow access to that space via config cycles, either internally
> (through the indirection register) or externally (from the off-chip
> root complex) generated.
>
> However - the driver runs the core in a 'root port bridge mode' where
> the config header register block you are looking at is inhibited. The
> Marvell IP block requires software support to run in bridge mode. So
> Marvell really has only (2), while Tegra has only (1).
Interesting.
For Marvell, if Marvell has only (2), does that imply that standard PCI
discovery and enumeration code cannot find the root port bridges, and
also there is no way to auto-configure the bridges with common pci-pci
bridge code?
>
>> For requirement (2), the top node has a reg property listing the
>> portions of the address space that are consumed by the driver at the top
>> level instead of being passed through to the PCI addressing domain. E.g.
>>
>> reg = <0xd0040800 0x1800>, <0xd0080800 0x1800>;
>
> Okay, so this is agreeing with what Thomas already has:
>
> http://www.spinics.net/lists/arm-kernel/msg228749.html
>
> With your two notes:
> - Correct the pcie@x,y to match the OF spec
> - Add a 'device_type = "pci"' to the pcie-controller node
>
> Is that right?
At this point I am confused again. There was talk of the need to use
standard PCI enumeration code to deal with the bridges, thus the need
for 3/2 to describe the interface between root-complex and the child
root-port nodes. But now I think I'm hearing that these particular root
ports bear no resemblance to standard pcie bridges. So I am back to
"not sure why we are using 3/2 PCI addresses across an interface
boundary that has no relation to PCI addressing". Sorry if I'm being
obtuse.
For Tegra, I think it all makes sense, and I still like the ranges
representation to control the config-space spoofing. But for Marvell,
it seems like there is neither the need for nor the possibility of
spoofing, nor for PCI-style addressing at the complex/port boundary.
>
>> I said that it was bogus to use size=0x2000 for a config space header.
>> That was based on an interpretation - which I now dislike - of the
>> meaning of 3-cell config addresses. By that old interpretation,
>> size=0x800 would also be bogus, because bits 10-8 of the config address
>> are for the function number.
>
> Hum, I thought the spec was pretty clear on this point:
>
> 00 denotes Configuration Space, in which case:
> [..]
> bbbbbbbb,ddddd,fff,rrrrrrrrr is the Configuration Space address
> hh...hh,ll...ll must be zero
>
> And also the text at 2.1.4.4..
>
> So you get an 8 bit register offset, placed in the highest DW..
>
> Can you read things another way?
The reading is clear, but there is a crucial point that is missing.
I don't recall ever having seen anything but 0 in rrrrrrrr. PCI reg
properties always list the base address of config header. There is
never a need for a reg entry to point into the middle of a config
header. The granularity is always at the dev,function level. So the
fact that the spec puts the offset in those bits is moot, because the
offset is never used. (I will address your counterargument below.)
I didn't anticipate that fact when I developed the representation, nor
did I anticipate that the rather-rigidly structured config space would
be extended years later.
With my "new interpretation", you still never actually set any of the
"offset bits". The only thing you get from it is permission to use
larger size values without creating algebraic nonsense. You never
actually do arithmetic on that representation.
>
> Is there an updated spec that supports PCI-E extended config space?
http://www.openfirmware.org/ofwg/bindings/pci/pci-express.txt
It puts the extended config address bits in 27:24 of the high word - but
that said, I doubt that those bits have ever been set to nonzero. Again
because you never need to point to an individual register using this
representation. I will call David Kahn, who lives down the street from
me, later today and see what he thinks.
>
>> Consider the following question, which I have never previously
>> considered, at least not explicitly:
>>
>> Q: What would be the 3-cell representation of the Command/Status
>> register address (offset 4) in device 1, function 1?
>
> Well, see section 11.1.2, where it provides an example
> for the 'Expansion ROM base address register (0x30)' as being:
>
> 02xxxx30 00000000 00000000
>
> Also the f-code bindings say:
>
> The data type 'config-addr' refers to the 'phys.hi' cell of the
> numerical representation of a Configuration Space address.
>
> So there is an strong convention to use the 'r' bits as the
> offset..
You're right, the spec does indeed say that. But in practice, it never
actually happened that way. Expansion ROMs, in practice, had to be
copied into memory because the tended to disappear under certain
settings of various enable bits. And even that copying had to be done
very carefully, because expansion ROM access had an annoying tendency to
hang some bus interfaces.
So, yeah, the spec implies the convention, but I don't think it was ever
used.
>
> Further, review section 12 about how ranges should be treated - it
> specifically says that the b,d,f bits in ranges should be 0, and the
> child address should have those bits masked prior to searching the
> ranges.
>
> Section 12 would seem to forbid this:
>
> ranges = <0x00000800 0 0 0xd4004000 0 0x00000800 /* Root port config header */
> 0x00001000 0 0 0xd4008000 0 0x00000800 /* Root port config header */
>
> Are you reading that differently?
I wasn't reading it at all, until you pointed it out to me :-) I was
just reasoning based on my mental model.
Again, I agree with your reading of the spec, but I can't think of any
reason why relaxing that restriction to permit config space mappings
would be a problem.
>
>>> Two things bothered me
>>> - Describing a CPU MMIO mapping with a config address space seems
>>> wonky
>>
>> I agree that mapping config space is sort of a jarring concept, but I
>> think that's because PCs have polluted the mindspace, not because
>> there
>
> I'm well aware of MMIO config space acces, that isn't so jarring. What
> is so jarring is the idea that the OF translation would work like:
>
> <0x00000900 0 0> -> 0xd4004000
> <0x00000901 0 0> -> 0xd4004001
>
> Which is completely unlike any ranges translation I've ever seen.
You never actually say <0x901 0 0> in the DT. What you say "there is a
config header based at <0x900 0 0>". The offset is maintained
separately, not added directly to the DT representation of the config
header base address.
The most useful interpretation of a ranges entry is that the tuples
(child_base,offset) and (parent_base,offset) correspond for 0 <= offset
< size. In some cases, you might be able to use arithmetic to merge
base and offset; in other cases you can't.
>
> Basically, I look at how the register offset is encoded in the higest
> dword plus the statement in section 12, and and conclude the there
> wasn't an intention to model a memory map'd config space through OF.
>
> What am I missing here?
The real point of section 12 is that you have to attend to the fact that
PCI a structured address space, not just a single linear space.
As you say, the text does not convey the intention of mapping config
space, but I can say categorically that I certainly did not intent "thou
shalt not map config space".
>
> Regards,
> Jason
>
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-11 21:50 ` Mitch Bradley
@ 2013-03-11 23:25 ` Jason Gunthorpe
[not found] ` <20130311232516.GA13873-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
0 siblings, 1 reply; 59+ messages in thread
From: Jason Gunthorpe @ 2013-03-11 23:25 UTC (permalink / raw)
To: Mitch Bradley
Cc: Thierry Reding, Lior Amsalem, Russell King - ARM Linux,
Jason Cooper, Andrew Lunn, linux-pci, devicetree-discuss,
Eran Ben-Avi, Nadav Haklai, Maen Suleiman, Bjorn Helgaas,
Shadi Ammouri, Tawfik Bayouk, linux-arm-kernel
On Mon, Mar 11, 2013 at 11:50:19AM -1000, Mitch Bradley wrote:
> > However - the driver runs the core in a 'root port bridge mode' where
> > the config header register block you are looking at is inhibited. The
> > Marvell IP block requires software support to run in bridge mode. So
> > Marvell really has only (2), while Tegra has only (1).
>
> Interesting.
>
> For Marvell, if Marvell has only (2), does that imply that standard PCI
> discovery and enumeration code cannot find the root port bridges, and
> also there is no way to auto-configure the bridges with common pci-pci
> bridge code?
The driver is required to take care of all of this. From the common
code's perspective there is a bridge config header, and writing to it
controls the device, as the PCI spec intends. But the driver doesn't
copy those read/writes 1:1 to any HW register block, it distributes
them around the device's register space as necessary to create the
PCI bridge config header.
The DT is still modeling the HW, there really are root port bridge(s),
that are completely conformant at the TLP level, but the IP designers
choose to leave the detail of the bridge config space up to a SW
implementation.
> At this point I am confused again. There was talk of the need to
> use standard PCI enumeration code to deal with the bridges, thus the
> need for 3/2 to describe the interface between root-complex and the
> child root-port nodes.
I think your confusion is coming from your expectation that the PCI-E
IP block would be an entirely self contained root port, rather than
'tools to build a root port, with the right driver SW'.
> > Further, review section 12 about how ranges should be treated - it
> > specifically says that the b,d,f bits in ranges should be 0, and the
> > child address should have those bits masked prior to searching the
> > ranges.
> >
> > Section 12 would seem to forbid this:
> >
> > ranges = <0x00000800 0 0 0xd4004000 0 0x00000800 /* Root port config header */
> > 0x00001000 0 0 0xd4008000 0 0x00000800 /* Root port config header */
> >
> > Are you reading that differently?
>
> I wasn't reading it at all, until you pointed it out to me :-) I was
> just reasoning based on my mental model.
>
> Again, I agree with your reading of the spec, but I can't think of any
> reason why relaxing that restriction to permit config space mappings
> would be a problem.
Okay, just wanted to confirm that, I think I follow you now. :)
So you are basically proposing a small update to Section 12 that would
read something like:
'When matching a config space address, the b,d,f bits are valid in the
ranges and left unmasked in the child. The r bits in the ranges
should be 0, and the r bits from the child are copied in the last DW
and masked off in the first DW.'
Which, as you note, would not conflict with any existing usage..
Though, I can already see that the above language is not good enough,
it doesn't elegantly represent standard ECAM, for instance.. How would
you even represent ECAM with ranges??
Another wrinkle is that tegra not only has the per-bridge memory
mapped config space, bit it also has 4 'all port' ECAM-like memory
mapped config spaces. The rules around them are sufficiently
specialized that I'm not sure a ranges mapping would ever be
practical.. So right out of the gate we'd fail to capture memory
mapped config access via ranges on at least one SOC.
So, to me, this seems like overkill. Only ECAM has a hope of having a
common implementation, everything else will be host driver specific,
so why go to the trouble to revise the OF PCI specification for config
space?
Particularly since there are patches waiting on this question being
resolved :)
Regards,
Jason
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
[not found] ` <20130311232516.GA13873-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2013-03-11 23:38 ` Mitch Bradley
[not found] ` <513E6AFE.3090304-D5eQfiDGL7eakBO8gow8eQ@public.gmane.org>
0 siblings, 1 reply; 59+ messages in thread
From: Mitch Bradley @ 2013-03-11 23:38 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Lior Amsalem, Andrew Lunn, Russell King - ARM Linux, Jason Cooper,
Tawfik Bayouk, linux-pci-u79uwXL29TY76Z2rM5mHXA,
devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ, Eran Ben-Avi,
Nadav Haklai, Maen Suleiman, Shadi Ammouri, Bjorn Helgaas,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
On 3/11/2013 1:25 PM, Jason Gunthorpe wrote:
> On Mon, Mar 11, 2013 at 11:50:19AM -1000, Mitch Bradley wrote:
>
>>> However - the driver runs the core in a 'root port bridge mode' where
>>> the config header register block you are looking at is inhibited. The
>>> Marvell IP block requires software support to run in bridge mode. So
>>> Marvell really has only (2), while Tegra has only (1).
>>
>> Interesting.
>>
>> For Marvell, if Marvell has only (2), does that imply that standard PCI
>> discovery and enumeration code cannot find the root port bridges, and
>> also there is no way to auto-configure the bridges with common pci-pci
>> bridge code?
>
> The driver is required to take care of all of this. From the common
> code's perspective there is a bridge config header, and writing to it
> controls the device, as the PCI spec intends. But the driver doesn't
> copy those read/writes 1:1 to any HW register block, it distributes
> them around the device's register space as necessary to create the
> PCI bridge config header.
>
> The DT is still modeling the HW, there really are root port bridge(s),
> that are completely conformant at the TLP level, but the IP designers
> choose to leave the detail of the bridge config space up to a SW
> implementation.
Oh, ouch, deep spoofing. I had to do that in system management mode
firmware for a Geode chip. What a pain that was.
>
>> At this point I am confused again. There was talk of the need to
>> use standard PCI enumeration code to deal with the bridges, thus the
>> need for 3/2 to describe the interface between root-complex and the
>> child root-port nodes.
>
> I think your confusion is coming from your expectation that the PCI-E
> IP block would be an entirely self contained root port, rather than
> 'tools to build a root port, with the right driver SW'.
>
>>> Further, review section 12 about how ranges should be treated - it
>>> specifically says that the b,d,f bits in ranges should be 0, and the
>>> child address should have those bits masked prior to searching the
>>> ranges.
>>>
>>> Section 12 would seem to forbid this:
>>>
>>> ranges = <0x00000800 0 0 0xd4004000 0 0x00000800 /* Root port config header */
>>> 0x00001000 0 0 0xd4008000 0 0x00000800 /* Root port config header */
>>>
>>> Are you reading that differently?
>>
>> I wasn't reading it at all, until you pointed it out to me :-) I was
>> just reasoning based on my mental model.
>>
>> Again, I agree with your reading of the spec, but I can't think of any
>> reason why relaxing that restriction to permit config space mappings
>> would be a problem.
>
> Okay, just wanted to confirm that, I think I follow you now. :)
>
> So you are basically proposing a small update to Section 12 that would
> read something like:
>
> 'When matching a config space address, the b,d,f bits are valid in the
> ranges and left unmasked in the child. The r bits in the ranges
> should be 0, and the r bits from the child are copied in the last DW
> and masked off in the first DW.'
>
> Which, as you note, would not conflict with any existing usage..
>
> Though, I can already see that the above language is not good enough,
> it doesn't elegantly represent standard ECAM, for instance.. How would
> you even represent ECAM with ranges??
>
> Another wrinkle is that tegra not only has the per-bridge memory
> mapped config space, bit it also has 4 'all port' ECAM-like memory
> mapped config spaces. The rules around them are sufficiently
> specialized that I'm not sure a ranges mapping would ever be
> practical.. So right out of the gate we'd fail to capture memory
> mapped config access via ranges on at least one SOC.
>
> So, to me, this seems like overkill. Only ECAM has a hope of having a
> common implementation, everything else will be host driver specific,
> so why go to the trouble to revise the OF PCI specification for config
> space?
>
> Particularly since there are patches waiting on this question being
> resolved :)
Okay, so I'll just bow out. Do what you have to do.
>
> Regards,
> Jason
>
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
[not found] ` <513E6AFE.3090304-D5eQfiDGL7eakBO8gow8eQ@public.gmane.org>
@ 2013-03-12 7:08 ` Thierry Reding
2013-03-12 15:57 ` Jason Gunthorpe
0 siblings, 1 reply; 59+ messages in thread
From: Thierry Reding @ 2013-03-12 7:08 UTC (permalink / raw)
To: Mitch Bradley
Cc: Lior Amsalem, Andrew Lunn, Russell King - ARM Linux, Jason Cooper,
Nadav Haklai, linux-pci-u79uwXL29TY76Z2rM5mHXA,
devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ, Eran Ben-Avi,
Jason Gunthorpe, Maen Suleiman, Shadi Ammouri, Bjorn Helgaas,
Tawfik Bayouk, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
[-- Attachment #1.1: Type: text/plain, Size: 4828 bytes --]
On Mon, Mar 11, 2013 at 01:38:38PM -1000, Mitch Bradley wrote:
> On 3/11/2013 1:25 PM, Jason Gunthorpe wrote:
> > On Mon, Mar 11, 2013 at 11:50:19AM -1000, Mitch Bradley wrote:
> >
> >>> However - the driver runs the core in a 'root port bridge mode' where
> >>> the config header register block you are looking at is inhibited. The
> >>> Marvell IP block requires software support to run in bridge mode. So
> >>> Marvell really has only (2), while Tegra has only (1).
> >>
> >> Interesting.
> >>
> >> For Marvell, if Marvell has only (2), does that imply that standard PCI
> >> discovery and enumeration code cannot find the root port bridges, and
> >> also there is no way to auto-configure the bridges with common pci-pci
> >> bridge code?
> >
> > The driver is required to take care of all of this. From the common
> > code's perspective there is a bridge config header, and writing to it
> > controls the device, as the PCI spec intends. But the driver doesn't
> > copy those read/writes 1:1 to any HW register block, it distributes
> > them around the device's register space as necessary to create the
> > PCI bridge config header.
> >
> > The DT is still modeling the HW, there really are root port bridge(s),
> > that are completely conformant at the TLP level, but the IP designers
> > choose to leave the detail of the bridge config space up to a SW
> > implementation.
>
> Oh, ouch, deep spoofing. I had to do that in system management mode
> firmware for a Geode chip. What a pain that was.
>
> >
> >> At this point I am confused again. There was talk of the need to
> >> use standard PCI enumeration code to deal with the bridges, thus the
> >> need for 3/2 to describe the interface between root-complex and the
> >> child root-port nodes.
> >
> > I think your confusion is coming from your expectation that the PCI-E
> > IP block would be an entirely self contained root port, rather than
> > 'tools to build a root port, with the right driver SW'.
> >
> >>> Further, review section 12 about how ranges should be treated - it
> >>> specifically says that the b,d,f bits in ranges should be 0, and the
> >>> child address should have those bits masked prior to searching the
> >>> ranges.
> >>>
> >>> Section 12 would seem to forbid this:
> >>>
> >>> ranges = <0x00000800 0 0 0xd4004000 0 0x00000800 /* Root port config header */
> >>> 0x00001000 0 0 0xd4008000 0 0x00000800 /* Root port config header */
> >>>
> >>> Are you reading that differently?
> >>
> >> I wasn't reading it at all, until you pointed it out to me :-) I was
> >> just reasoning based on my mental model.
> >>
> >> Again, I agree with your reading of the spec, but I can't think of any
> >> reason why relaxing that restriction to permit config space mappings
> >> would be a problem.
> >
> > Okay, just wanted to confirm that, I think I follow you now. :)
> >
> > So you are basically proposing a small update to Section 12 that would
> > read something like:
> >
> > 'When matching a config space address, the b,d,f bits are valid in the
> > ranges and left unmasked in the child. The r bits in the ranges
> > should be 0, and the r bits from the child are copied in the last DW
> > and masked off in the first DW.'
> >
> > Which, as you note, would not conflict with any existing usage..
> >
> > Though, I can already see that the above language is not good enough,
> > it doesn't elegantly represent standard ECAM, for instance.. How would
> > you even represent ECAM with ranges??
> >
> > Another wrinkle is that tegra not only has the per-bridge memory
> > mapped config space, bit it also has 4 'all port' ECAM-like memory
> > mapped config spaces. The rules around them are sufficiently
> > specialized that I'm not sure a ranges mapping would ever be
> > practical.. So right out of the gate we'd fail to capture memory
> > mapped config access via ranges on at least one SOC.
> >
> > So, to me, this seems like overkill. Only ECAM has a hope of having a
> > common implementation, everything else will be host driver specific,
> > so why go to the trouble to revise the OF PCI specification for config
> > space?
> >
> > Particularly since there are patches waiting on this question being
> > resolved :)
>
> Okay, so I'll just bow out. Do what you have to do.
So to recapitulate, we agree that configuration space can be translated
through the "ranges" property. That means the only missing link is a new
function to translate not only "assigned-addresses" but also the "reg"
property for PCI devices. Is that it?
And for Marvell the non-configuration-space per-root-port registers need
to be mapped separately from the configuration space windows?
Thierry
[-- Attachment #1.2: Type: application/pgp-signature, Size: 836 bytes --]
[-- Attachment #2: Type: text/plain, Size: 192 bytes --]
_______________________________________________
devicetree-discuss mailing list
devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ@public.gmane.org
https://lists.ozlabs.org/listinfo/devicetree-discuss
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-12 7:08 ` Thierry Reding
@ 2013-03-12 15:57 ` Jason Gunthorpe
2013-03-12 20:38 ` Thierry Reding
0 siblings, 1 reply; 59+ messages in thread
From: Jason Gunthorpe @ 2013-03-12 15:57 UTC (permalink / raw)
To: Thierry Reding
Cc: Mitch Bradley, Lior Amsalem, Russell King - ARM Linux,
Jason Cooper, Andrew Lunn, linux-pci, devicetree-discuss,
Eran Ben-Avi, Nadav Haklai, Maen Suleiman, Bjorn Helgaas,
Shadi Ammouri, Tawfik Bayouk, linux-arm-kernel
On Tue, Mar 12, 2013 at 08:08:52AM +0100, Thierry Reding wrote:
> So to recapitulate, we agree that configuration space can be translated
> through the "ranges" property. That means the only missing link is a new
> function to translate not only "assigned-addresses" but also the "reg"
> property for PCI devices. Is that it?
No, the first conclusion is that placing a config space in ranges is
against the language in the current spec (see section 12).
The second conclusion is that there is probably a way forward to
update the spec in a backwards compatible way to model ECAM, but would
require more analysis.
Finally, there was no objections to the approach in Thomas's patch,
other than the note to fix the '@1,0' and the extra device_type="pci"
..
> And for Marvell the non-configuration-space per-root-port registers need
> to be mapped separately from the configuration space windows?
No, there are no configuration space windows on Marvell, so only one mapping.
Regards,
Jason
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-12 15:57 ` Jason Gunthorpe
@ 2013-03-12 20:38 ` Thierry Reding
2013-03-12 21:03 ` Jason Gunthorpe
0 siblings, 1 reply; 59+ messages in thread
From: Thierry Reding @ 2013-03-12 20:38 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Mitch Bradley, Lior Amsalem, Russell King - ARM Linux,
Jason Cooper, Andrew Lunn, linux-pci, devicetree-discuss,
Eran Ben-Avi, Nadav Haklai, Maen Suleiman, Bjorn Helgaas,
Shadi Ammouri, Tawfik Bayouk, linux-arm-kernel
[-- Attachment #1: Type: text/plain, Size: 3072 bytes --]
On Tue, Mar 12, 2013 at 09:57:49AM -0600, Jason Gunthorpe wrote:
> On Tue, Mar 12, 2013 at 08:08:52AM +0100, Thierry Reding wrote:
>
> > So to recapitulate, we agree that configuration space can be translated
> > through the "ranges" property. That means the only missing link is a new
> > function to translate not only "assigned-addresses" but also the "reg"
> > property for PCI devices. Is that it?
>
> No, the first conclusion is that placing a config space in ranges is
> against the language in the current spec (see section 12).
>
> The second conclusion is that there is probably a way forward to
> update the spec in a backwards compatible way to model ECAM, but would
> require more analysis.
>
> Finally, there was no objections to the approach in Thomas's patch,
> other than the note to fix the '@1,0' and the extra device_type="pci"
That's not what I deduced from Mitch's comments. He said explicitly that
he liked the idea to represent mapped configuration space using the
ranges property. Mitch, correct me if I misinterpret what you said.
I've also verified that things actually do work with the binding that I
proposed for Tegra, even if I add device_type = "pci" to the root port
nodes. The reason is that the OF core looks up the type of the *parent*
node to find a matching bus. This happens to be the pcie-controller node
which is not a PCI device and therefore the address translation works
properly.
So let me quote the latest binding that I'm using:
pcie-controller {
compatible = "nvidia,tegra20-pcie";
reg = <0x80003000 0x00000800 /* PADS registers */
0x80003800 0x00000200 /* AFI registers */
0x90000000 0x10000000>; /* configuration space */
reg-names = "pads", "afi", "cs";
interrupts = <0 98 0x04 /* controller interrupt */
0 99 0x04>; /* MSI interrupt */
interrupt-names = "intr", "msi";
bus-range = <0x00 0xff>;
#address-cells = <3>;
#size-cells = <2>;
ranges = <0x00000800 0 0 0x80000000 0 0x00001000 /* port 0 configuration space */
0x00001000 0 0 0x80001000 0 0x00001000 /* port 1 configuration space */
0x81000000 0 0 0x82000000 0 0x00010000 /* downstream I/O */
0x82000000 0 0 0xa0000000 0 0x10000000 /* non-prefetchable memory */
0xc2000000 0 0 0xb0000000 0 0x10000000>; /* prefetchable memory */
clocks = <&tegra_car 70>, <&tegra_car 72>, <&tegra_car 74>,
<&tegra_car 118>;
clock-names = "pex", "afi", "pcie_xclk", "pll_e";
status = "disabled";
pci@1,0 {
device_type = "pci";
reg = <0x000800 0 0 0 0x1000>;
status = "disabled";
#address-cells = <3>;
#size-cells = <2>;
nvidia,num-lanes = <2>;
};
pci@2,0 {
device_type = "pci";
reg = <0x001000 0 0 0 0x1000>;
status = "disabled";
#address-cells = <3>;
#size-cells = <2>;
nvidia,num-lanes = <2>;
};
};
I can't find anything wrong in the above. As far as I can tell, there's
nothing in it that's in conflict with the specification.
Thierry
[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-12 20:38 ` Thierry Reding
@ 2013-03-12 21:03 ` Jason Gunthorpe
2013-03-12 21:30 ` Thierry Reding
0 siblings, 1 reply; 59+ messages in thread
From: Jason Gunthorpe @ 2013-03-12 21:03 UTC (permalink / raw)
To: Thierry Reding
Cc: Mitch Bradley, Lior Amsalem, Russell King - ARM Linux,
Jason Cooper, Andrew Lunn, linux-pci, devicetree-discuss,
Eran Ben-Avi, Nadav Haklai, Maen Suleiman, Bjorn Helgaas,
Shadi Ammouri, Tawfik Bayouk, linux-arm-kernel
On Tue, Mar 12, 2013 at 09:38:19PM +0100, Thierry Reding wrote:
> I've also verified that things actually do work with the binding that I
> proposed for Tegra, even if I add device_type = "pci" to the root port
> nodes. The reason is that the OF core looks up the type of the *parent*
> node to find a matching bus. This happens to be the pcie-controller node
> which is not a PCI device and therefore the address translation works
> properly.
Mitch pointed out early on that the parent node needs a 'device_type =
"pci"' as well.
The pcie-controller node would be classified as a 'bus node' according
to the spec.
Not going down the of_pci_* code paths for address translation at the
root port bridge nodes is certainly not right.
> ranges = <0x00000800 0 0 0x80000000 0 0x00001000 /* port 0 configuration space */
> 0x00001000 0 0 0x80001000 0 0x00001000 /* port 1 configuration space */
Section 12 says that the 'b,d,f' bits are 0 in ranges, so the above is
in conflict with that language:
In particular, the phys.hi fields of the child address spaces in the
"ranges" property for PCI does not contain the same information as
"reg" property entries within PCI nodes. The only information that is
present in "ranges" phys.hi entries are the non-relocatable,
prefetchable and the PCI address space bits for which the entry
applies. I.e., only the n, p and ss bits are present; the bbbbbbbb,
ddddd, fff and rrrrrrrr fields are 0.
And the further requirement:
When an address is to be mapped through a PCI bus bridge node, the
phys.hi value of the address to be mapped and the child field of a
"ranges" entry should be masked so that only the ss bits are compared.
Which is embodied by the code in of_bus_pci_map, meaning the OF core
won't even look at the bits in the high DW when searching ranges.
Regards,
Jason
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-12 21:03 ` Jason Gunthorpe
@ 2013-03-12 21:30 ` Thierry Reding
2013-03-12 22:08 ` Jason Gunthorpe
0 siblings, 1 reply; 59+ messages in thread
From: Thierry Reding @ 2013-03-12 21:30 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Mitch Bradley, Lior Amsalem, Russell King - ARM Linux,
Jason Cooper, Andrew Lunn, linux-pci, devicetree-discuss,
Eran Ben-Avi, Nadav Haklai, Maen Suleiman, Bjorn Helgaas,
Shadi Ammouri, Tawfik Bayouk, linux-arm-kernel
[-- Attachment #1: Type: text/plain, Size: 2918 bytes --]
On Tue, Mar 12, 2013 at 03:03:28PM -0600, Jason Gunthorpe wrote:
> On Tue, Mar 12, 2013 at 09:38:19PM +0100, Thierry Reding wrote:
>
> > I've also verified that things actually do work with the binding that I
> > proposed for Tegra, even if I add device_type = "pci" to the root port
> > nodes. The reason is that the OF core looks up the type of the *parent*
> > node to find a matching bus. This happens to be the pcie-controller node
> > which is not a PCI device and therefore the address translation works
> > properly.
>
> Mitch pointed out early on that the parent node needs a 'device_type =
> "pci"' as well.
>
> The pcie-controller node would be classified as a 'bus node' according
> to the spec.
>
> Not going down the of_pci_* code paths for address translation at the
> root port bridge nodes is certainly not right.
I'm not so sure. Why should the pcie-controller be a PCI device? It
really isn't. Instead it is a device on the processor/SoC/platform bus
that bridges to the PCI bus. As such its children should be PCI devices
to reflect that the translation happens within the pcie-controller node.
In the above it is logical to not set the pcie-controller's device_type
property to "pci".
> > ranges = <0x00000800 0 0 0x80000000 0 0x00001000 /* port 0 configuration space */
> > 0x00001000 0 0 0x80001000 0 0x00001000 /* port 1 configuration space */
>
> Section 12 says that the 'b,d,f' bits are 0 in ranges, so the above is
> in conflict with that language:
>
> In particular, the phys.hi fields of the child address spaces in the
> "ranges" property for PCI does not contain the same information as
> "reg" property entries within PCI nodes. The only information that is
> present in "ranges" phys.hi entries are the non-relocatable,
> prefetchable and the PCI address space bits for which the entry
> applies. I.e., only the n, p and ss bits are present; the bbbbbbbb,
> ddddd, fff and rrrrrrrr fields are 0.
>
> And the further requirement:
>
> When an address is to be mapped through a PCI bus bridge node, the
> phys.hi value of the address to be mapped and the child field of a
> "ranges" entry should be masked so that only the ss bits are compared.
>
> Which is embodied by the code in of_bus_pci_map, meaning the OF core
> won't even look at the bits in the high DW when searching ranges.
Okay, so the problem with that is that we won't be able to include more
entries for configuration space within a ranges property because we
can't uniquely map them (they both have the same ss field values). One
possibility would be to differentiate using the phys.mid and phys.low
cells but they need to be 0 according to the spec.
The other alternative would be to amend the specification. Besides the
fact that the specification says so I don't see any reason why this
shouldn't be allowed.
Thierry
[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-12 21:30 ` Thierry Reding
@ 2013-03-12 22:08 ` Jason Gunthorpe
2013-03-12 23:25 ` Mitch Bradley
2013-03-13 8:18 ` Thierry Reding
0 siblings, 2 replies; 59+ messages in thread
From: Jason Gunthorpe @ 2013-03-12 22:08 UTC (permalink / raw)
To: Thierry Reding
Cc: Mitch Bradley, Lior Amsalem, Russell King - ARM Linux,
Jason Cooper, Andrew Lunn, linux-pci, devicetree-discuss,
Eran Ben-Avi, Nadav Haklai, Maen Suleiman, Bjorn Helgaas,
Shadi Ammouri, Tawfik Bayouk, linux-arm-kernel
On Tue, Mar 12, 2013 at 10:30:06PM +0100, Thierry Reding wrote:
> > Not going down the of_pci_* code paths for address translation at the
> > root port bridge nodes is certainly not right.
>
> I'm not so sure. Why should the pcie-controller be a PCI device?
The spec is clear on that point as well:
device_type
A Standard Package conforming to this specification and corresponding
to a device that implements a PCI bus shall implement this property
with the string value "pci"
The children of a pcie-controller node are PCI devices, thus the
pcie-controller node 'implements a PCI bus'. Or, as you say 'device on
the processor/SoC/platform bus that bridges to the PCI bus'
Think of device_type as also meaning bus_type and it makes more
logical sense, the name is terrible, but its usage is governed by the
OF docs..
> The other alternative would be to amend the specification. Besides the
> fact that the specification says so I don't see any reason why this
> shouldn't be allowed.
'the specification says so' *IS* the reason. DT isn't a free for all
where you get to do whatever you want, or whatever 'feels' right. It
is supposed to be a stable, OS agnostic ABI. That means bindings have
to follow the specs (when available).
Maybe a future revision will support PCI-E ECAM, but we don't know
what that will look like, and I'm pretty sure you don't want to hold
up your patches until an IEEE committee gets around to deciding
something ;)
Jason
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-12 22:08 ` Jason Gunthorpe
@ 2013-03-12 23:25 ` Mitch Bradley
2013-03-13 8:18 ` Thierry Reding
1 sibling, 0 replies; 59+ messages in thread
From: Mitch Bradley @ 2013-03-12 23:25 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Lior Amsalem, Andrew Lunn, Russell King - ARM Linux, Jason Cooper,
Tawfik Bayouk, linux-pci, devicetree-discuss, Thierry Reding,
Eran Ben-Avi, Nadav Haklai, Maen Suleiman, Shadi Ammouri,
Bjorn Helgaas, linux-arm-kernel
On 3/12/2013 12:08 PM, Jason Gunthorpe wrote:
> On Tue, Mar 12, 2013 at 10:30:06PM +0100, Thierry Reding wrote:
>
>>> Not going down the of_pci_* code paths for address translation at the
>>> root port bridge nodes is certainly not right.
>>
>> I'm not so sure. Why should the pcie-controller be a PCI device?
>
> The spec is clear on that point as well:
>
> device_type
> A Standard Package conforming to this specification and corresponding
> to a device that implements a PCI bus shall implement this property
> with the string value "pci"
>
> The children of a pcie-controller node are PCI devices, thus the
> pcie-controller node 'implements a PCI bus'. Or, as you say 'device on
> the processor/SoC/platform bus that bridges to the PCI bus'
>
> Think of device_type as also meaning bus_type and it makes more
> logical sense, the name is terrible, but its usage is governed by the
> OF docs..
I beg to differ about the name. device_type means "what kind of
functionality is implemented by this device", not "what interface
connects this device to the rest of the system". The use of
"device_type" for that meaning is perfectly reasonable in the English
language.
The DT answers the "what interface connects me to the system" question
by looking at the parent.
>
>> The other alternative would be to amend the specification. Besides the
>> fact that the specification says so I don't see any reason why this
>> shouldn't be allowed.
>
> 'the specification says so' *IS* the reason. DT isn't a free for all
> where you get to do whatever you want, or whatever 'feels' right. It
> is supposed to be a stable, OS agnostic ABI. That means bindings have
> to follow the specs (when available).
>
> Maybe a future revision will support PCI-E ECAM, but we don't know
> what that will look like, and I'm pretty sure you don't want to hold
> up your patches until an IEEE committee gets around to deciding
> something ;)
The Open Firmware Working Group quit doing things under the IEEE banner
a long long time ago, precisely for the reason that the IEEE process was
too slow, cumbersome, and expensive. The group thereafter published
"Recommended Practices" documents under its own non-authority instead of
going through the IEEE ratification process. The group's only
"authority" arose from the fact that the members had expertise and
worked for companies that needed to use the results.
The Working Group is now inactive. I am in contact with many of the
former members, but by and large they are no longer interested in
firmware standardization. Many of them work for companies that now hold
their cards closely, that seemingly being the only way to make enough
money to stay in business.
In other words, there is no committee.
New versions of the document in question, or of any document for that
matter, can be issued by the simple expedient of interested parties
coming to agreement and writing down that agreement. In practice, that
is what happens in Linux land for nearly every issue.
At this time, the primary "interest group" for device tree issues
revolves around the Linux kernel - the devicetree-discuss list is where
all the action is. I suppose that Oracle still cares about Open
Firmware on its SPARC systems, but it is likely that Oracle is going to
do whatever they want no matter what Linux people say or do. In
principle, the DT is supposed to be OS-agnostic, but in practice, which
other OS both uses the DT and has enough users to matter? So it would
be entirely reasonable for a group of Linux kernel people to publish
revised documents, as has already been done in the form of
Documentation/ files setting down usage rules.
>
> Jason
>
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-12 22:08 ` Jason Gunthorpe
2013-03-12 23:25 ` Mitch Bradley
@ 2013-03-13 8:18 ` Thierry Reding
2013-03-13 17:02 ` Jason Gunthorpe
1 sibling, 1 reply; 59+ messages in thread
From: Thierry Reding @ 2013-03-13 8:18 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Lior Amsalem, Andrew Lunn, Russell King - ARM Linux, Jason Cooper,
Tawfik Bayouk, linux-pci, devicetree-discuss, Eran Ben-Avi,
Nadav Haklai, Maen Suleiman, Shadi Ammouri, Mitch Bradley,
Bjorn Helgaas, linux-arm-kernel
[-- Attachment #1.1: Type: text/plain, Size: 2206 bytes --]
On Tue, Mar 12, 2013 at 04:08:54PM -0600, Jason Gunthorpe wrote:
> On Tue, Mar 12, 2013 at 10:30:06PM +0100, Thierry Reding wrote:
>
> > > Not going down the of_pci_* code paths for address translation at the
> > > root port bridge nodes is certainly not right.
> >
> > I'm not so sure. Why should the pcie-controller be a PCI device?
>
> The spec is clear on that point as well:
>
> device_type
> A Standard Package conforming to this specification and corresponding
> to a device that implements a PCI bus shall implement this property
> with the string value "pci"
>
> The children of a pcie-controller node are PCI devices, thus the
> pcie-controller node 'implements a PCI bus'. Or, as you say 'device on
> the processor/SoC/platform bus that bridges to the PCI bus'
>
> Think of device_type as also meaning bus_type and it makes more
> logical sense, the name is terrible, but its usage is governed by the
> OF docs..
>
> > The other alternative would be to amend the specification. Besides the
> > fact that the specification says so I don't see any reason why this
> > shouldn't be allowed.
>
> 'the specification says so' *IS* the reason. DT isn't a free for all
> where you get to do whatever you want, or whatever 'feels' right. It
> is supposed to be a stable, OS agnostic ABI. That means bindings have
> to follow the specs (when available).
>
> Maybe a future revision will support PCI-E ECAM, but we don't know
> what that will look like, and I'm pretty sure you don't want to hold
> up your patches until an IEEE committee gets around to deciding
> something ;)
Mitch already answered this. The specification is now almost 15 years
old and it couldn't possibly have foreseen all of the future use-cases.
If the specification is too restrictive and Mitch gives his blessing to
remove some of the restrictions, I don't see any reason why we shouldn't
do so if it lets us represent the reality of hardware more accurately in
DT.
Furthermore we're not discussing radical changes. None of the changes
will be backwards-incompatible, but they will allow recent hardware to
be represented more correctly or conveniently.
Thierry
[-- Attachment #1.2: Type: application/pgp-signature, Size: 836 bytes --]
[-- Attachment #2: Type: text/plain, Size: 176 bytes --]
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-13 8:18 ` Thierry Reding
@ 2013-03-13 17:02 ` Jason Gunthorpe
[not found] ` <20130313170205.GB24042-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
0 siblings, 1 reply; 59+ messages in thread
From: Jason Gunthorpe @ 2013-03-13 17:02 UTC (permalink / raw)
To: Thierry Reding
Cc: Mitch Bradley, Lior Amsalem, Russell King - ARM Linux,
Jason Cooper, Andrew Lunn, linux-pci, devicetree-discuss,
Eran Ben-Avi, Nadav Haklai, Maen Suleiman, Bjorn Helgaas,
Shadi Ammouri, Tawfik Bayouk, linux-arm-kernel
On Wed, Mar 13, 2013 at 09:18:15AM +0100, Thierry Reding wrote:
> Mitch already answered this. The specification is now almost 15 years
> old and it couldn't possibly have foreseen all of the future use-cases.
> If the specification is too restrictive and Mitch gives his blessing to
> remove some of the restrictions, I don't see any reason why we shouldn't
> do so if it lets us represent the reality of hardware more accurately in
> DT.
I understand the spec is old, and I have no problem with making a
Linux specific revision, but do *that* - don't bury some random
deviation inside the bindings for a driver. I even suggested some
language, but I feel more thought is needed to consider how to model
the standardized ECAM mechanism..
> Furthermore we're not discussing radical changes. None of the changes
> will be backwards-incompatible, but they will allow recent hardware to
> be represented more correctly or conveniently.
Sure, but it is still inconsistent, one MMIO config mechansim is in
ranges, one is in regs. Plus I don't think tegra's case is a great
starting point to design a spec update, it's config access mechanism
is especially strange...
Anyhow, I think this has been hashed to death, as long as your final
binding has the 'device_type = pci' on the pcie-controller node I
think it will be fine.
Jason
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
[not found] ` <20130313170205.GB24042-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2013-03-13 19:26 ` Thierry Reding
2013-03-13 19:59 ` Jason Gunthorpe
2013-03-13 20:58 ` Mitch Bradley
0 siblings, 2 replies; 59+ messages in thread
From: Thierry Reding @ 2013-03-13 19:26 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Lior Amsalem, Andrew Lunn, Russell King - ARM Linux, Jason Cooper,
Tawfik Bayouk, linux-pci-u79uwXL29TY76Z2rM5mHXA,
devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ, Eran Ben-Avi,
Nadav Haklai, Maen Suleiman, Shadi Ammouri, Mitch Bradley,
Bjorn Helgaas, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
[-- Attachment #1.1: Type: text/plain, Size: 2725 bytes --]
On Wed, Mar 13, 2013 at 11:02:05AM -0600, Jason Gunthorpe wrote:
> On Wed, Mar 13, 2013 at 09:18:15AM +0100, Thierry Reding wrote:
>
> > Mitch already answered this. The specification is now almost 15 years
> > old and it couldn't possibly have foreseen all of the future use-cases.
> > If the specification is too restrictive and Mitch gives his blessing to
> > remove some of the restrictions, I don't see any reason why we shouldn't
> > do so if it lets us represent the reality of hardware more accurately in
> > DT.
>
> I understand the spec is old, and I have no problem with making a
> Linux specific revision, but do *that* - don't bury some random
> deviation inside the bindings for a driver. I even suggested some
> language, but I feel more thought is needed to consider how to model
> the standardized ECAM mechanism..
As Mitch already said there's very little chance that the specification
update will be ratified through IEEE, so I think that we might just as
well put a corresponding text somewhere below Documentation/devicetree.
Also note that this has absolutely nothing to do with ECAM. All I'm
proposing is to allow the reg property of a root port to be translated
into a CPU addressable memory region through the ranges property. This
turns out to actually work properly with the current OF core in Linux.
> > Furthermore we're not discussing radical changes. None of the changes
> > will be backwards-incompatible, but they will allow recent hardware to
> > be represented more correctly or conveniently.
>
> Sure, but it is still inconsistent, one MMIO config mechansim is in
> ranges, one is in regs. Plus I don't think tegra's case is a great
> starting point to design a spec update, it's config access mechanism
> is especially strange...
Again, it is still the most accurate way to describe the hardware. I
know it's not terribly beautiful and doesn't match with what you'd
expect from PC hardware. However it is still a reality and something the
kernel will have to deal with. Marvell hardware isn't very pretty either
but that shouldn't exclude it from being supported by Linux.
> Anyhow, I think this has been hashed to death, as long as your final
> binding has the 'device_type = pci' on the pcie-controller node I
> think it will be fine.
No. The pcie-controller is *not* a PCI device. It is a PCI host bridge.
It is the instance that translates from the platform to the PCI bus. Why
should it be device_type = "pci"? And we've also been over this before,
making that change stops the proposed binding from working properly
because the entry in the reg property can't be translated into a CPU
addressable memory region.
Thierry
[-- Attachment #1.2: Type: application/pgp-signature, Size: 836 bytes --]
[-- Attachment #2: Type: text/plain, Size: 192 bytes --]
_______________________________________________
devicetree-discuss mailing list
devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ@public.gmane.org
https://lists.ozlabs.org/listinfo/devicetree-discuss
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-13 19:26 ` Thierry Reding
@ 2013-03-13 19:59 ` Jason Gunthorpe
2013-03-13 20:54 ` Thierry Reding
2013-03-13 20:58 ` Mitch Bradley
1 sibling, 1 reply; 59+ messages in thread
From: Jason Gunthorpe @ 2013-03-13 19:59 UTC (permalink / raw)
To: Thierry Reding
Cc: Mitch Bradley, Lior Amsalem, Russell King - ARM Linux,
Jason Cooper, Andrew Lunn, linux-pci, devicetree-discuss,
Eran Ben-Avi, Nadav Haklai, Maen Suleiman, Bjorn Helgaas,
Shadi Ammouri, Tawfik Bayouk, linux-arm-kernel
On Wed, Mar 13, 2013 at 08:26:28PM +0100, Thierry Reding wrote:
> As Mitch already said there's very little chance that the specification
> update will be ratified through IEEE, so I think that we might just as
> well put a corresponding text somewhere below Documentation/devicetree.
Sure, I'm fine with that.
> Also note that this has absolutely nothing to do with ECAM. All I'm
> proposing is to allow the reg property of a root port to be translated
> into a CPU addressable memory region through the ranges property. This
> turns out to actually work properly with the current OF core in Linux.
ECAM is the only standard based mechanism that could possibly use this
new config space ranges concept. Whatever you do for tegra shouldn't
stomp on the possibility of making that work in the future. So at
least think about how ECAM could be modeled for a few mins before
going ahead.
Tegra is the non-standard unusual case here, an update to the PCI OF
binding should be of general use.
> > Anyhow, I think this has been hashed to death, as long as your final
> > binding has the 'device_type = pci' on the pcie-controller node I
> > think it will be fine.
>
> No. The pcie-controller is *not* a PCI device. It is a PCI host bridge.
> It is the instance that translates from the platform to the PCI bus. Why
> should it be device_type = "pci"?
Spec says so, Mitch said so, other PCI host bridge bindings in the
tree work like this. Do a grep in the powerpc tree, for instance.
arch/x86/platform/ce4100/falconfalls.dts is a pretty comprehensive
example.
> And we've also been over this before, making that change stops the
> proposed binding from working properly because the entry in the reg
> property can't be translated into a CPU addressable memory region.
Sigh, that's because the change makes things follow the OF PCI
spec. The Linux OF core follows the spec. Config space ranges binding
is specifically *defined to not work* in the spec. Why should it work
today?
You have to deal with this problem somehow, and omitting the top level
device_type is not a solution.
Jason
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-13 19:59 ` Jason Gunthorpe
@ 2013-03-13 20:54 ` Thierry Reding
0 siblings, 0 replies; 59+ messages in thread
From: Thierry Reding @ 2013-03-13 20:54 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Mitch Bradley, Lior Amsalem, Russell King - ARM Linux,
Jason Cooper, Andrew Lunn, linux-pci, devicetree-discuss,
Eran Ben-Avi, Nadav Haklai, Maen Suleiman, Bjorn Helgaas,
Shadi Ammouri, Tawfik Bayouk, linux-arm-kernel
[-- Attachment #1: Type: text/plain, Size: 4250 bytes --]
On Wed, Mar 13, 2013 at 01:59:01PM -0600, Jason Gunthorpe wrote:
> On Wed, Mar 13, 2013 at 08:26:28PM +0100, Thierry Reding wrote:
>
> > As Mitch already said there's very little chance that the specification
> > update will be ratified through IEEE, so I think that we might just as
> > well put a corresponding text somewhere below Documentation/devicetree.
>
> Sure, I'm fine with that.
>
> > Also note that this has absolutely nothing to do with ECAM. All I'm
> > proposing is to allow the reg property of a root port to be translated
> > into a CPU addressable memory region through the ranges property. This
> > turns out to actually work properly with the current OF core in Linux.
>
> ECAM is the only standard based mechanism that could possibly use this
> new config space ranges concept. Whatever you do for tegra shouldn't
> stomp on the possibility of making that work in the future. So at
> least think about how ECAM could be modeled for a few mins before
> going ahead.
>
> Tegra is the non-standard unusual case here, an update to the PCI OF
> binding should be of general use.
But I'm not trying to encode ECAM in the DT. ECAM isn't the problem at
all. The only problem is that each root port has a separate window in
the CPU address space that is used to access that port's configuration
space.
ECAM only comes into play when accessing the configuration space of
subordinate devices (via type 1 transactions). But at that point the
device tree doesn't need to know about it. The only thing that needs to
be in DT is the memory region that the controller should write to in
order to generate type 1 transactions. And that's a problem that has
long been solved.
> > > Anyhow, I think this has been hashed to death, as long as your final
> > > binding has the 'device_type = pci' on the pcie-controller node I
> > > think it will be fine.
> >
> > No. The pcie-controller is *not* a PCI device. It is a PCI host bridge.
> > It is the instance that translates from the platform to the PCI bus. Why
> > should it be device_type = "pci"?
>
> Spec says so, Mitch said so, other PCI host bridge bindings in the
> tree work like this. Do a grep in the powerpc tree, for instance.
>
> arch/x86/platform/ce4100/falconfalls.dts is a pretty comprehensive
> example.
Yes, yes. We all know by now what the spec says. The whole point is
that maybe the spec needs to be updated and this requirement lifted.
What I keep trying to say is that in case of Tegra the PCIe controller
does not provide the PCI bus. It doesn't appear as a device on the PCI
bus and it doesn't have configuration space itself.
Only the PCI root ports provide access to the subordinate busses. And
only they appear on the PCI bus.
> > And we've also been over this before, making that change stops the
> > proposed binding from working properly because the entry in the reg
> > property can't be translated into a CPU addressable memory region.
>
> Sigh, that's because the change makes things follow the OF PCI
> spec. The Linux OF core follows the spec. Config space ranges binding
> is specifically *defined to not work* in the spec. Why should it work
> today?
Okay. I keep understanding what you're saying. But the whole point of
this discussion is that maybe the spec should be amended to allow this
to work. In fact it does work with the current implementation in the
Linux kernel. And what I've been trying to explain is that I think this
to be the most accurate description of the Tegra hardware.
> You have to deal with this problem somehow, and omitting the top level
> device_type is not a solution.
So we've already agreed that the specification might be too restrictive,
and I'm trying to solve this problem by finding an accurate description
in DT to represent the hardware without following the spec to the letter
because it wasn't written with this kind of hardware in mind.
Tegra's PCIe controller doesn't actually implement the bus. In fact we
have to jump through hoops to make both root ports appear on the root
bus specifically because there's no common access to the configuration
space that includes the host bridge or the root ports.
Thierry
[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-13 19:26 ` Thierry Reding
2013-03-13 19:59 ` Jason Gunthorpe
@ 2013-03-13 20:58 ` Mitch Bradley
2013-03-13 21:33 ` Thierry Reding
` (2 more replies)
1 sibling, 3 replies; 59+ messages in thread
From: Mitch Bradley @ 2013-03-13 20:58 UTC (permalink / raw)
To: Thierry Reding
Cc: Lior Amsalem, Andrew Lunn, Russell King - ARM Linux, Jason Cooper,
Nadav Haklai, linux-pci, devicetree-discuss, Eran Ben-Avi,
Jason Gunthorpe, Maen Suleiman, Shadi Ammouri, Bjorn Helgaas,
Tawfik Bayouk, linux-arm-kernel
On 3/13/2013 9:26 AM, Thierry Reding wrote:
> On Wed, Mar 13, 2013 at 11:02:05AM -0600, Jason Gunthorpe wrote:
>> On Wed, Mar 13, 2013 at 09:18:15AM +0100, Thierry Reding wrote:
>>
>>> Mitch already answered this. The specification is now almost 15 years
>>> old and it couldn't possibly have foreseen all of the future use-cases.
>>> If the specification is too restrictive and Mitch gives his blessing to
>>> remove some of the restrictions, I don't see any reason why we shouldn't
>>> do so if it lets us represent the reality of hardware more accurately in
>>> DT.
>>
>> I understand the spec is old, and I have no problem with making a
>> Linux specific revision, but do *that* - don't bury some random
>> deviation inside the bindings for a driver. I even suggested some
>> language, but I feel more thought is needed to consider how to model
>> the standardized ECAM mechanism..
>
> As Mitch already said there's very little chance that the specification
> update will be ratified through IEEE, so I think that we might just as
> well put a corresponding text somewhere below Documentation/devicetree.
>
> Also note that this has absolutely nothing to do with ECAM. All I'm
> proposing is to allow the reg property of a root port to be translated
> into a CPU addressable memory region through the ranges property. This
> turns out to actually work properly with the current OF core in Linux.
>
>>> Furthermore we're not discussing radical changes. None of the changes
>>> will be backwards-incompatible, but they will allow recent hardware to
>>> be represented more correctly or conveniently.
>>
>> Sure, but it is still inconsistent, one MMIO config mechansim is in
>> ranges, one is in regs. Plus I don't think tegra's case is a great
>> starting point to design a spec update, it's config access mechanism
>> is especially strange...
>
> Again, it is still the most accurate way to describe the hardware. I
> know it's not terribly beautiful and doesn't match with what you'd
> expect from PC hardware. However it is still a reality and something the
> kernel will have to deal with. Marvell hardware isn't very pretty either
> but that shouldn't exclude it from being supported by Linux.
>
>> Anyhow, I think this has been hashed to death, as long as your final
>> binding has the 'device_type = pci' on the pcie-controller node I
>> think it will be fine.
>
> No. The pcie-controller is *not* a PCI device. It is a PCI host bridge.
> It is the instance that translates from the platform to the PCI bus. Why
> should it be device_type = "pci"?
device_type="pci" means that this device implements PCI functionality,
which means that it is either a PCI host bridge or a PCI-to-PCI bridge.
"device_type" answers the question "What does this device do?". The
alternative question "What bus does this device plug into?" is answered
by looking at the parent.
In this case, the answer to "what does pcie_controller do?" is "it
implements a PCI bus" below. So 'device_type = "pci"' is appropriate.
Having just re-read the code in drivers/of/address.c, I think I have a
deeper understanding of a few practical issues:
In order to translate a 3/2 PCI address via of_get_pci_address(), the
parent device (i.e. root complex / pcie-controller) must have both
'device_type = "pci"' (else of_match_bus() won't find the pci version of
"struct of_bus") and also 'name = "pci"' (else strcmp(bus->name, "pci")
will fail).
So it would seem that, if you want to use address.c verbatim (for the
interface between pcie-controller and its direct children), not only do
you need device_type = pci, but you also need to rename
"pcie-controller" to "pci".
But then you run into the problem that the pci variant of struct of_bus
uses "assigned-addresses" instead of "reg". So it still doesn't work
as-is. To make it work, you would need to add an "assigned-addresses"
property to each root port node. That property value could be in
non-relocatable memory space, translatable via normal rules, in which
case the "map config space to MMIO via ranges" discussion is moot for
Marvell.
Which re-raises a question for which I have not yet heard a compelling
answer: Why use 3/2 PCI addressing between Marvell pcie-controller and
its root ports (with their unusual programming model)? The Marvell
hardware address structure between controller and root ports is
definitely not 3/2. Certainly you need 3/2 addressing below
(implemented by) the root ports, but I just don't see the utility of
pretending 3/2 addressing at the complex-to-port interface. If the root
port hardware used the common programming model, with real config
headers and stuff, 3/2 would be good because you could use existing
drivers. But since you need a special root-port driver anyway, why go
to the trouble of emulating non-existent addressing?
And we've also been over this before,
> making that change stops the proposed binding from working properly
> because the entry in the reg property can't be translated into a CPU
> addressable memory region.
>
> Thierry
>
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-13 20:58 ` Mitch Bradley
@ 2013-03-13 21:33 ` Thierry Reding
2013-03-13 22:48 ` Mitch Bradley
2013-03-14 4:56 ` Stephen Warren
[not found] ` <5140E85A.3040900-D5eQfiDGL7eakBO8gow8eQ@public.gmane.org>
2013-03-13 22:22 ` Jason Gunthorpe
2 siblings, 2 replies; 59+ messages in thread
From: Thierry Reding @ 2013-03-13 21:33 UTC (permalink / raw)
To: Mitch Bradley
Cc: Jason Gunthorpe, Lior Amsalem, Russell King - ARM Linux,
Jason Cooper, Andrew Lunn, linux-pci, devicetree-discuss,
Eran Ben-Avi, Nadav Haklai, Maen Suleiman, Bjorn Helgaas,
Shadi Ammouri, Tawfik Bayouk, linux-arm-kernel
[-- Attachment #1: Type: text/plain, Size: 3079 bytes --]
On Wed, Mar 13, 2013 at 10:58:02AM -1000, Mitch Bradley wrote:
[...]
> In this case, the answer to "what does pcie_controller do?" is "it
> implements a PCI bus" below. So 'device_type = "pci"' is appropriate.
Alright, that's 2 against 1. I don't have much of a choice but to yield.
> Having just re-read the code in drivers/of/address.c, I think I have a
> deeper understanding of a few practical issues:
>
> In order to translate a 3/2 PCI address via of_get_pci_address(), the
> parent device (i.e. root complex / pcie-controller) must have both
> 'device_type = "pci"' (else of_match_bus() won't find the pci version of
> "struct of_bus") and also 'name = "pci"' (else strcmp(bus->name, "pci")
> will fail).
>
> So it would seem that, if you want to use address.c verbatim (for the
> interface between pcie-controller and its direct children), not only do
> you need device_type = pci, but you also need to rename
> "pcie-controller" to "pci".
bus->name refers to that of the of_bus structure, not the node name of
the controller.
> But then you run into the problem that the pci variant of struct of_bus
> uses "assigned-addresses" instead of "reg". So it still doesn't work
> as-is. To make it work, you would need to add an "assigned-addresses"
> property to each root port node. That property value could be in
> non-relocatable memory space, translatable via normal rules, in which
> case the "map config space to MMIO via ranges" discussion is moot for
> Marvell.
At the risk of quoting the spec: it says (4.1.2) that the entries in the
"assigned-addresses" property correspond to the function's configuration
space BARs. In this case however that's no longer true. But if everybody
thinks that repurposing the "assigned-addresses" property is more
acceptable than leaving out the device_type = "pci" property from the
pcie-controller node, then so be it.
> Which re-raises a question for which I have not yet heard a compelling
> answer: Why use 3/2 PCI addressing between Marvell pcie-controller and
> its root ports (with their unusual programming model)? The Marvell
> hardware address structure between controller and root ports is
> definitely not 3/2. Certainly you need 3/2 addressing below
> (implemented by) the root ports, but I just don't see the utility of
> pretending 3/2 addressing at the complex-to-port interface. If the root
> port hardware used the common programming model, with real config
> headers and stuff, 3/2 would be good because you could use existing
> drivers. But since you need a special root-port driver anyway, why go
> to the trouble of emulating non-existent addressing?
I think the reason is that reg needs to be in the standard format for
the device node matching code to work. There are various checks to make
sure the reg property has entries which are 3/2 each.
There's also the issue that the pcie-controller's ranges property
contains the mapping of the I/O and memory windows and therefore needs
the 3/2 addressing to encode the type.
Thierry
[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
[not found] ` <5140E85A.3040900-D5eQfiDGL7eakBO8gow8eQ@public.gmane.org>
@ 2013-03-13 22:02 ` Thierry Reding
2013-03-13 22:21 ` Jason Gunthorpe
0 siblings, 1 reply; 59+ messages in thread
From: Thierry Reding @ 2013-03-13 22:02 UTC (permalink / raw)
To: Mitch Bradley
Cc: Lior Amsalem, Andrew Lunn, Russell King - ARM Linux, Jason Cooper,
Nadav Haklai, linux-pci-u79uwXL29TY76Z2rM5mHXA,
devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ, Eran Ben-Avi,
Jason Gunthorpe, Maen Suleiman, Shadi Ammouri, Bjorn Helgaas,
Tawfik Bayouk, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
[-- Attachment #1.1: Type: text/plain, Size: 3248 bytes --]
On Wed, Mar 13, 2013 at 10:58:02AM -1000, Mitch Bradley wrote:
[...]
> But then you run into the problem that the pci variant of struct of_bus
> uses "assigned-addresses" instead of "reg". So it still doesn't work
> as-is. To make it work, you would need to add an "assigned-addresses"
> property to each root port node. That property value could be in
> non-relocatable memory space, translatable via normal rules, in which
> case the "map config space to MMIO via ranges" discussion is moot for
> Marvell.
So if I understand all of the above correctly, the Tegra binding should
look like this:
pcie-controller {
compatible = "nvidia,tegra20-pcie";
device_type = "pci";
reg = <0x80003000 0x00000800 /* PADS registers */
0x80003800 0x00000200 /* AFI registers */
0x90000000 0x10000000>; /* configuration space */
reg-names = "pads", "afi", "cs";
interrupts = <0 98 0x04 /* controller interrupt */
0 99 0x04>; /* MSI interrupt */
interrupt-names = "intr", "msi";
bus-range = <0x00 0xff>;
#address-cells = <3>;
#size-cells = <2>;
ranges = <0x82000800 0 0 0x80000000 0 0x00001000 /* port 0 registers */
0x82001000 0 0 0x80001000 0 0x00001000 /* port 1 registers */
0x81000000 0 0 0x82000000 0 0x00010000 /* downstream I/O */
0x82000000 0 0 0xa0000000 0 0x10000000 /* non-prefetchable memory */
0xc2000000 0 0 0xb0000000 0 0x10000000>; /* prefetchable memory */
clocks = <&tegra_car 70>, <&tegra_car 72>, <&tegra_car 74>,
<&tegra_car 118>;
clock-names = "pex", "afi", "pcie_xclk", "pll_e";
status = "disabled";
pci@1,0 {
device_type = "pci";
assigned-addresses = <0x82000800 0 0 0x80000000 0x1000>;
reg = <0x000800 0 0 0 0>;
status = "disabled";
#address-cells = <3>;
#size-cells = <2>;
nvidia,num-lanes = <2>;
};
pci@2,0 {
device_type = "pci";
assigned-addresses = <0x82001000 0 0 0x80001000 0x1000>;
reg = <0x001000 0 0 0 0>;
status = "disabled";
#address-cells = <3>;
#size-cells = <2>;
nvidia,num-lanes = <2>;
};
};
Does that look about correct?
One problem I see is that now the ranges property contains three entries
for 32-bit memory space:
0x82000800 0 0 0x80000000 0 0x00001000 /* port 0 registers */
0x82001000 0 0 0x80001000 0 0x00001000 /* port 1 registers */
0x82000000 0 0 0xa0000000 0 0x10000000 /* non-prefetchable memory */
But we used to rely on the ss field to parse the ranges property and
configure the windows in the controller properly so that accesses to
these regions will generate the correct transactions.
Now the code will actually match the first entry and assume that it
represents the non-prefetchable 32-bit memory space and program the
controller to forward accesses to the 0x80000000-0x80000fff region
as PCI memory write transactions.
I suppose we could add a check to verify that devfn == 0. However does
the above ranges property not imply that the brige on bus 0, device 1
has a 32-bit memory space from 0x80000000 to 0x80000fff and the bridge
on bus 0, device 2 has a 32-bit memory space from 0x80001000 to
0x80001fff? Which matches with the wording in the spec that the
"assigned-addresses" correspond to the bridge's BARs.
Thierry
[-- Attachment #1.2: Type: application/pgp-signature, Size: 836 bytes --]
[-- Attachment #2: Type: text/plain, Size: 192 bytes --]
_______________________________________________
devicetree-discuss mailing list
devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ@public.gmane.org
https://lists.ozlabs.org/listinfo/devicetree-discuss
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-13 22:02 ` Thierry Reding
@ 2013-03-13 22:21 ` Jason Gunthorpe
2013-03-14 9:01 ` Thierry Reding
0 siblings, 1 reply; 59+ messages in thread
From: Jason Gunthorpe @ 2013-03-13 22:21 UTC (permalink / raw)
To: Thierry Reding
Cc: Mitch Bradley, Lior Amsalem, Russell King - ARM Linux,
Jason Cooper, Andrew Lunn, linux-pci, devicetree-discuss,
Eran Ben-Avi, Nadav Haklai, Maen Suleiman, Bjorn Helgaas,
Shadi Ammouri, Tawfik Bayouk, linux-arm-kernel
On Wed, Mar 13, 2013 at 11:02:35PM +0100, Thierry Reding wrote:
> Does that look about correct?
By my reading of the spec the entries in ranges should not
have the b,d,f bits set..
Although it is not a big stretch to include them..
> Now the code will actually match the first entry and assume that it
> represents the non-prefetchable 32-bit memory space and program the
> controller to forward accesses to the 0x80000000-0x80000fff region
> as PCI memory write transactions.
Yah, it makes a mess of determining the host bridge aperture. Require
the aperture to be first/last?
Make use of the 2nd dword?
Go back to the idea of the top level reg?
Jason
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-13 20:58 ` Mitch Bradley
2013-03-13 21:33 ` Thierry Reding
[not found] ` <5140E85A.3040900-D5eQfiDGL7eakBO8gow8eQ@public.gmane.org>
@ 2013-03-13 22:22 ` Jason Gunthorpe
2 siblings, 0 replies; 59+ messages in thread
From: Jason Gunthorpe @ 2013-03-13 22:22 UTC (permalink / raw)
To: Mitch Bradley
Cc: Thierry Reding, Lior Amsalem, Russell King - ARM Linux,
Jason Cooper, Andrew Lunn, linux-pci, devicetree-discuss,
Eran Ben-Avi, Nadav Haklai, Maen Suleiman, Bjorn Helgaas,
Shadi Ammouri, Tawfik Bayouk, linux-arm-kernel
On Wed, Mar 13, 2013 at 10:58:02AM -1000, Mitch Bradley wrote:
> port hardware used the common programming model, with real config
> headers and stuff, 3/2 would be good because you could use existing
> drivers. But since you need a special root-port driver anyway, why go
> to the trouble of emulating non-existent addressing?
Yes, I'm sorry, I've tried not to explore the complexity behind this
reasoning here.. Those discussions went on for a long time and I'm
tired of rehashing them :)
It isn't existing drivers that are valuable, it is the entire common
PCI infrastructure for dealing with discovery and address assignment
that is has value.
The common infrastructure can only allocate addresses within fixed
regions assigned to each PCI domain - and it divides those addreses
regions between ports in the root complex via the standard PCI-E root
port bridge memory windows.
Now, the largest SOC has 10 of the PCI-E ports in it, with 32 bit
addressing. It is essential that a small region of address space be
set aside for PCI-E use, and thus it is critical that resource
assignment be done across all ports, drawing on a common pool of
address space.
Thus two choices:
1) Keep the PCI core the same and let the host bridge driver
provide the missing configuration elements of the root
complex
2) Upgrade the PCI common core to deal with dynamic cross
domain resource allocation, including hot plug.
Based on discussion/analysis #1 was chosen. Lots of reasons.
>From that choice flows the DT bindings, which are now required to
model the PCI bus 0.
Does that help explain the rational?
Cheers,
Jason
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-13 21:33 ` Thierry Reding
@ 2013-03-13 22:48 ` Mitch Bradley
2013-03-14 0:43 ` Rob Herring
2013-03-14 7:11 ` Thierry Reding
2013-03-14 4:56 ` Stephen Warren
1 sibling, 2 replies; 59+ messages in thread
From: Mitch Bradley @ 2013-03-13 22:48 UTC (permalink / raw)
To: Thierry Reding
Cc: Jason Gunthorpe, Lior Amsalem, Russell King - ARM Linux,
Jason Cooper, Andrew Lunn, linux-pci, devicetree-discuss,
Eran Ben-Avi, Nadav Haklai, Maen Suleiman, Bjorn Helgaas,
Shadi Ammouri, Tawfik Bayouk, linux-arm-kernel
On 3/13/2013 11:33 AM, Thierry Reding wrote:
> On Wed, Mar 13, 2013 at 10:58:02AM -1000, Mitch Bradley wrote:
> [...]
>> In this case, the answer to "what does pcie_controller do?" is "it
>> implements a PCI bus" below. So 'device_type = "pci"' is appropriate.
>
> Alright, that's 2 against 1. I don't have much of a choice but to yield.
All issues of "voting" aside, 'device_type = pci' is what tells
of_get_pci_address() to use the 3/2 interpretation. So if you want a
node to implement 3/2 addresses, it must say device_type = pci, unless
you do address translation some other way.
>
>> Having just re-read the code in drivers/of/address.c, I think I have a
>> deeper understanding of a few practical issues:
>>
>> In order to translate a 3/2 PCI address via of_get_pci_address(), the
>> parent device (i.e. root complex / pcie-controller) must have both
>> 'device_type = "pci"' (else of_match_bus() won't find the pci version of
>> "struct of_bus") and also 'name = "pci"' (else strcmp(bus->name, "pci")
>> will fail).
>>
>> So it would seem that, if you want to use address.c verbatim (for the
>> interface between pcie-controller and its direct children), not only do
>> you need device_type = pci, but you also need to rename
>> "pcie-controller" to "pci".
>
> bus->name refers to that of the of_bus structure, not the node name of
> the controller.
D'oh! Completely correct. That's good; I didn't like the idea of
needing both device_type and name = pci.
>
>> But then you run into the problem that the pci variant of struct of_bus
>> uses "assigned-addresses" instead of "reg". So it still doesn't work
>> as-is. To make it work, you would need to add an "assigned-addresses"
>> property to each root port node. That property value could be in
>> non-relocatable memory space, translatable via normal rules, in which
>> case the "map config space to MMIO via ranges" discussion is moot for
>> Marvell.
>
> At the risk of quoting the spec: it says (4.1.2) that the entries in the
> "assigned-addresses" property correspond to the function's configuration
> space BARs. In this case however that's no longer true.
Right, but we got into this mess by pretending that a PCI domain exists
despite the fact that the hardware is not PCI at that level, in order to
accommodate some existing Linux code. Since we are already "faking it
to deal with existing code", it doesn't seem particularly strange to
take advantage of how existing code works, in order to accomplish the
forgery.
> But if everybody
> thinks that repurposing the "assigned-addresses" property is more
> acceptable than leaving out the device_type = "pci" property from the
> pcie-controller node, then so be it.
I'm confused again. Why does leaving out device_type = pci make it work?
>
>> Which re-raises a question for which I have not yet heard a compelling
>> answer: Why use 3/2 PCI addressing between Marvell pcie-controller and
>> its root ports (with their unusual programming model)? The Marvell
>> hardware address structure between controller and root ports is
>> definitely not 3/2. Certainly you need 3/2 addressing below
>> (implemented by) the root ports, but I just don't see the utility of
>> pretending 3/2 addressing at the complex-to-port interface. If the root
>> port hardware used the common programming model, with real config
>> headers and stuff, 3/2 would be good because you could use existing
>> drivers. But since you need a special root-port driver anyway, why go
>> to the trouble of emulating non-existent addressing?
>
> I think the reason is that reg needs to be in the standard format for
> the device node matching code to work. There are various checks to make
> sure the reg property has entries which are 3/2 each.
Could you point me to some of those checks so I can understand this better?
>
> There's also the issue that the pcie-controller's ranges property
> contains the mapping of the I/O and memory windows and therefore needs
> the 3/2 addressing to encode the type.
Why can't that be done in each of the root port nodes?
>
> Thierry
>
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-13 22:48 ` Mitch Bradley
@ 2013-03-14 0:43 ` Rob Herring
2013-03-14 1:20 ` Mitch Bradley
2013-03-14 7:11 ` Thierry Reding
1 sibling, 1 reply; 59+ messages in thread
From: Rob Herring @ 2013-03-14 0:43 UTC (permalink / raw)
To: Mitch Bradley, Thierry Reding, Jason Gunthorpe
Cc: Lior Amsalem, Andrew Lunn, Russell King - ARM Linux, Jason Cooper,
Nadav Haklai, linux-pci, devicetree-discuss, Eran Ben-Avi,
Maen Suleiman, Shadi Ammouri, Bjorn Helgaas, Tawfik Bayouk,
linux-arm-kernel
On 03/13/2013 05:48 PM, Mitch Bradley wrote:
> On 3/13/2013 11:33 AM, Thierry Reding wrote:
>> On Wed, Mar 13, 2013 at 10:58:02AM -1000, Mitch Bradley wrote:
>> [...]
>>> In this case, the answer to "what does pcie_controller do?" is "it
>>> implements a PCI bus" below. So 'device_type = "pci"' is appropriate.
>>
>> Alright, that's 2 against 1. I don't have much of a choice but to yield.
>
> All issues of "voting" aside, 'device_type = pci' is what tells
> of_get_pci_address() to use the 3/2 interpretation. So if you want a
> node to implement 3/2 addresses, it must say device_type = pci, unless
> you do address translation some other way.
I should note that device_type is used for OF, but is supposed to not be
used for FDT as matching against compatible properties is preferred. I
don't have a good reason as to why, but Mitch may know the history.
However, there are numerous exceptions to that for compatibility and to
work with existing s/w. So this may be one of those cases.
Rob
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-14 0:43 ` Rob Herring
@ 2013-03-14 1:20 ` Mitch Bradley
0 siblings, 0 replies; 59+ messages in thread
From: Mitch Bradley @ 2013-03-14 1:20 UTC (permalink / raw)
To: Rob Herring
Cc: Thierry Reding, Jason Gunthorpe, Lior Amsalem, Andrew Lunn,
Russell King - ARM Linux, Jason Cooper, Nadav Haklai, linux-pci,
devicetree-discuss, Eran Ben-Avi, Maen Suleiman, Shadi Ammouri,
Bjorn Helgaas, Tawfik Bayouk, linux-arm-kernel
On 3/13/2013 2:43 PM, Rob Herring wrote:
> On 03/13/2013 05:48 PM, Mitch Bradley wrote:
>> On 3/13/2013 11:33 AM, Thierry Reding wrote:
>>> On Wed, Mar 13, 2013 at 10:58:02AM -1000, Mitch Bradley wrote:
>>> [...]
>>>> In this case, the answer to "what does pcie_controller do?" is "it
>>>> implements a PCI bus" below. So 'device_type = "pci"' is appropriate.
>>>
>>> Alright, that's 2 against 1. I don't have much of a choice but to yield.
>>
>> All issues of "voting" aside, 'device_type = pci' is what tells
>> of_get_pci_address() to use the 3/2 interpretation. So if you want a
>> node to implement 3/2 addresses, it must say device_type = pci, unless
>> you do address translation some other way.
>
> I should note that device_type is used for OF, but is supposed to not be
> used for FDT as matching against compatible properties is preferred. I
> don't have a good reason as to why, but Mitch may know the history.
> However, there are numerous exceptions to that for compatibility and to
> work with existing s/w. So this may be one of those cases.
Indeed. The history is that, originally, the "name" property was
supposed to be a very precise name, like what is now the most specific
part of "compatible". And "device_type" was "what, in general, does
this device do".
But with rule, pathnames quickly became unusable to humans:
/marvell,mv87230/marvell,svxyz/adaptec,12345/wd,8876
What the heck is that? Model names are like hashes; they have no rhyme
or reason and humans cannot remember them for long. In order for a
random person to have a clue about what something was, it was much
better for pathname components to represent the generic function:
/pci/pci/scsi/disk
Now it's clear that we are looking at a bridged PCI domain hosting a
SCSI disk. That's very useful for a system administrator; the
gobbledygook name is not.
Furthermore, in light of the facts that model numbers change much more
rapidly than programming interfaces, companies clone each others
programming interfaces, and companies change their names, a much more
capable way of describing programming models was needed.
So the "generic names" recommended practice changed things to make
"name" mean what "device_type" used to vaguely mean, "device_type" was
deprecated (it was never very precisely defined, and
mutually-conflicting usage patterns had developed), and "compatible" in
its present form was introduced. "Generic names" was quite
controversial at the time - I think David Kahn wanted to strangle me at
one point - but it was the right thing. (Not to mention the
embarrassing mistake of the underscore in device_type, mea culpa.)
That being the case, I personally would like for device_type to
disappear forever. But the fact exists that address.c currently uses it
(and fdt.c duly parses it), which I why I said that, if you want to use
of_get_pci_address() in its present form, device_type=pci must be
present in the bus node that implements the 3/2 address domain.
So, yeah, it's compability with existing s/w, namely the PCI support in
address.c .
>
> Rob
>
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-13 21:33 ` Thierry Reding
2013-03-13 22:48 ` Mitch Bradley
@ 2013-03-14 4:56 ` Stephen Warren
1 sibling, 0 replies; 59+ messages in thread
From: Stephen Warren @ 2013-03-14 4:56 UTC (permalink / raw)
To: Thierry Reding
Cc: Mitch Bradley, Lior Amsalem, Andrew Lunn,
Russell King - ARM Linux, Jason Cooper, Nadav Haklai, linux-pci,
devicetree-discuss, Eran Ben-Avi, Jason Gunthorpe, Maen Suleiman,
Shadi Ammouri, Bjorn Helgaas, Tawfik Bayouk, linux-arm-kernel
On 03/13/2013 03:33 PM, Thierry Reding wrote:
> On Wed, Mar 13, 2013 at 10:58:02AM -1000, Mitch Bradley wrote:
> [...]
>> In this case, the answer to "what does pcie_controller do?" is
>> "it implements a PCI bus" below. So 'device_type = "pci"' is
>> appropriate.
>
> Alright, that's 2 against 1. I don't have much of a choice but to
> yield.
Just one note here though re: how the Tegra HW works:
The Tegra "PCIe controller" HW translates from a SoC-internal bus to
another SoC-internal bus. The "PCIe root ports" translate from that
second SoC-internal bus to a PCIe bus. That's exactly why the PCIe
root port configuration registers don't show up via type 0 PCIe
configuration transactions. Thus, from a HW perspective, it really is
true that the PCIe root ports are PCI devices, but the "PCIe
controller" really isn't anything to do with PCIe.
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-13 22:48 ` Mitch Bradley
2013-03-14 0:43 ` Rob Herring
@ 2013-03-14 7:11 ` Thierry Reding
1 sibling, 0 replies; 59+ messages in thread
From: Thierry Reding @ 2013-03-14 7:11 UTC (permalink / raw)
To: Mitch Bradley
Cc: Jason Gunthorpe, Lior Amsalem, Russell King - ARM Linux,
Jason Cooper, Andrew Lunn, linux-pci, devicetree-discuss,
Eran Ben-Avi, Nadav Haklai, Maen Suleiman, Bjorn Helgaas,
Shadi Ammouri, Tawfik Bayouk, linux-arm-kernel
[-- Attachment #1: Type: text/plain, Size: 6625 bytes --]
On Wed, Mar 13, 2013 at 12:48:28PM -1000, Mitch Bradley wrote:
> On 3/13/2013 11:33 AM, Thierry Reding wrote:
> > On Wed, Mar 13, 2013 at 10:58:02AM -1000, Mitch Bradley wrote:
> > [...]
> >> In this case, the answer to "what does pcie_controller do?" is "it
> >> implements a PCI bus" below. So 'device_type = "pci"' is appropriate.
> >
> > Alright, that's 2 against 1. I don't have much of a choice but to yield.
>
> All issues of "voting" aside, 'device_type = pci' is what tells
> of_get_pci_address() to use the 3/2 interpretation. So if you want a
> node to implement 3/2 addresses, it must say device_type = pci, unless
> you do address translation some other way.
of_bus_default_map() can also handle #address-cells > 2. We explicitly
added that special case back when the bindings still looked very
different and we needed to match the exact entry.
This is the reason why it actually works to not use device_type = "pci"
for the pcie-controller. For subordinate device address translation the
code will look at the parent bus, see that it isn't a PCI device and use
the default bus, which in turn will simply memcmp() the first 3 cells of
the "reg" property entry against those in the "ranges" property.
> >> But then you run into the problem that the pci variant of struct of_bus
> >> uses "assigned-addresses" instead of "reg". So it still doesn't work
> >> as-is. To make it work, you would need to add an "assigned-addresses"
> >> property to each root port node. That property value could be in
> >> non-relocatable memory space, translatable via normal rules, in which
> >> case the "map config space to MMIO via ranges" discussion is moot for
> >> Marvell.
> >
> > At the risk of quoting the spec: it says (4.1.2) that the entries in the
> > "assigned-addresses" property correspond to the function's configuration
> > space BARs. In this case however that's no longer true.
>
> Right, but we got into this mess by pretending that a PCI domain exists
> despite the fact that the hardware is not PCI at that level, in order to
> accommodate some existing Linux code.
Actually at that level the hardware is indeed PCI. These entries are for
the root port nodes, which are in fact the bridges to the PCI bus. This
is what I've been trying to say all along. The PCIe controller is just a
collection of registers that allow CPU address space to be bridged to
an internal bus (FPCI, which is in fact a relic from the HyperTransport
roots of the controller IIUC) and setup/handle MSI.
The transition to the PCI fabric happens in the root ports. There is no
PCI host bridge.
> Since we are already "faking it to deal with existing code", it doesn't
> seem particularly strange to take advantage of how existing code works,
> in order to accomplish the forgery.
So we're back to square one. How is this kind of faking (repurposing the
"assigned-addresses" property) any better than translating the "reg"
property through an "ss = 00" entry in the "ranges" property?
> > But if everybody
> > thinks that repurposing the "assigned-addresses" property is more
> > acceptable than leaving out the device_type = "pci" property from the
> > pcie-controller node, then so be it.
>
> I'm confused again. Why does leaving out device_type = pci make it work?
I've already said this above. The default bus implementation handles the
case where #address-cells > 2 by comparing the first three cells, but
not go through the additional range matching that would otherwise need
to take place. Instead it treats the address cells as some kind of
identifier to match between the "reg" and "ranges" entries. That's very
much the way that the PCI bus implementation works, except for some
extra PCI-specific checks.
> >> Which re-raises a question for which I have not yet heard a compelling
> >> answer: Why use 3/2 PCI addressing between Marvell pcie-controller and
> >> its root ports (with their unusual programming model)? The Marvell
> >> hardware address structure between controller and root ports is
> >> definitely not 3/2. Certainly you need 3/2 addressing below
> >> (implemented by) the root ports, but I just don't see the utility of
> >> pretending 3/2 addressing at the complex-to-port interface. If the root
> >> port hardware used the common programming model, with real config
> >> headers and stuff, 3/2 would be good because you could use existing
> >> drivers. But since you need a special root-port driver anyway, why go
> >> to the trouble of emulating non-existent addressing?
> >
> > I think the reason is that reg needs to be in the standard format for
> > the device node matching code to work. There are various checks to make
> > sure the reg property has entries which are 3/2 each.
>
> Could you point me to some of those checks so I can understand this better?
The code for that is in drivers/of/of_pci.c. The __of_pci_pci_compare()
function looks like this:
static inline int __of_pci_pci_compare(struct device_node *node,
unsigned int devfn)
{
unsigned int size;
const __be32 *reg = of_get_property(node, "reg", &size);
if (!reg || size < 5 * sizeof(__be32))
return 0;
return ((be32_to_cpup(®[0]) >> 8) & 0xff) == devfn;
}
This is the function that's called for each node to check whether it
corresponds to a given PCI device by matching the devfn to the value
found in the first cell of a device node's "reg" property.
> > There's also the issue that the pcie-controller's ranges property
> > contains the mapping of the I/O and memory windows and therefore needs
> > the 3/2 addressing to encode the type.
>
> Why can't that be done in each of the root port nodes?
It could possibly be done, but we'd be lying. The way the hardware works
is that we configure a single window to map to each of the FPCI regions
for I/O and (non-)prefetchable memory spaces. So the pcie-controller
node's "ranges" property contains three entries, one for each of those
spaces which the driver uses to program the corresponding registers. The
same information is passed to the resource assignment code of the PCI
core so that it can partition these windows as appropriate and allocate
sub-windows that are assigned to the BARs of the root-ports.
Note that the programming of the root-port BARs is vital here. If memory
within one of the FPCI-mapped windows is accessed, the hardware will
look for a matching address in the root port BARs in order to determine
the port through which the transactions should be sent.
Thierry
[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-13 22:21 ` Jason Gunthorpe
@ 2013-03-14 9:01 ` Thierry Reding
2013-03-14 17:25 ` Jason Gunthorpe
0 siblings, 1 reply; 59+ messages in thread
From: Thierry Reding @ 2013-03-14 9:01 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Lior Amsalem, Andrew Lunn, Russell King - ARM Linux, Jason Cooper,
Tawfik Bayouk, linux-pci, devicetree-discuss, Eran Ben-Avi,
Nadav Haklai, Maen Suleiman, Shadi Ammouri, Mitch Bradley,
Bjorn Helgaas, linux-arm-kernel
[-- Attachment #1.1: Type: text/plain, Size: 2386 bytes --]
On Wed, Mar 13, 2013 at 04:21:02PM -0600, Jason Gunthorpe wrote:
> On Wed, Mar 13, 2013 at 11:02:35PM +0100, Thierry Reding wrote:
>
> > Does that look about correct?
>
> By my reading of the spec the entries in ranges should not
> have the b,d,f bits set..
>
> Although it is not a big stretch to include them..
>
> > Now the code will actually match the first entry and assume that it
> > represents the non-prefetchable 32-bit memory space and program the
> > controller to forward accesses to the 0x80000000-0x80000fff region
> > as PCI memory write transactions.
>
> Yah, it makes a mess of determining the host bridge aperture. Require
> the aperture to be first/last?
>
> Make use of the 2nd dword?
>
> Go back to the idea of the top level reg?
It turns out that this works with the Tegra driver because it uses the
new of_pci_process_ranges() function and simply overwrites earlier
matches by subsequent ones.
ranges = <0x82000000 0 0 0x80000000 0 0x00001000 /* port 0 registers */
0x82000000 0 0 0x80001000 0 0x00001000 /* port 1 registers */
0x81000000 0 0 0x82000000 0 0x00010000 /* downstream I/O */
0x82000000 0 0 0xa0000000 0 0x10000000 /* non-prefetchable memory */
0xc2000000 0 0 0xb0000000 0 0x10000000>; /* prefetchable memory */
So the second entry would overwrite the non-prefetchable memory region
as established by the first entry. Finally that will also be overwritten
by the fourth entry.
So the above will actually work along with the corresponding root-port
"assigned-addresses" properties. I still don't like it much because I
don't think it accurately reflects the hardware. Furthermore it has the
same kludgy, non-spec conformant smack that my original proposal had
because it uses assigned-addresses for something it wasn't intended to.
Also the mapping is actually not correct, because the first two entries
don't refer to 32-bit memory space. They really map configuration space
and therefore the ss field is wrong IMO (it should be 00, but that won't
work with the OF core). One of the very early versions of my Tegra PCIe
patch series actually introduced an IORESOURCE_CS type to account for
this but it didn't go anywhere and became obsolete when I switched to
the new binding which side-steps this issue by not using the PCI bus'
translation mechanism.
Thierry
[-- Attachment #1.2: Type: application/pgp-signature, Size: 836 bytes --]
[-- Attachment #2: Type: text/plain, Size: 176 bytes --]
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-14 9:01 ` Thierry Reding
@ 2013-03-14 17:25 ` Jason Gunthorpe
2013-03-14 20:38 ` Thierry Reding
2013-03-14 21:09 ` Thierry Reding
0 siblings, 2 replies; 59+ messages in thread
From: Jason Gunthorpe @ 2013-03-14 17:25 UTC (permalink / raw)
To: Thierry Reding
Cc: Mitch Bradley, linux-pci, devicetree-discuss, linux-arm-kernel
[trimm'd the cc list]
On Thu, Mar 14, 2013 at 10:01:20AM +0100, Thierry Reding wrote:
> It turns out that this works with the Tegra driver because it uses the
> new of_pci_process_ranges() function and simply overwrites earlier
> matches by subsequent ones.
>
> ranges = <0x82000000 0 0 0x80000000 0 0x00001000 /* port 0 registers */
> 0x82000000 0 0 0x80001000 0 0x00001000 /* port 1 registers */
> 0x81000000 0 0 0x82000000 0 0x00010000 /* downstream I/O */
> 0x82000000 0 0 0xa0000000 0 0x10000000 /* non-prefetchable memory */
> 0xc2000000 0 0 0xb0000000 0 0x10000000>; /* prefetchable memory */
Okay.. There is still something funny here, the 3rd dword of the child
address should not be 0 in every line and there shouldn't be overlaps
in the child address space.
I'm assuming 0x80000000, 0xa0000000 and 0xb0000000 are real CPU physical
addresses?
Then it should probably look like:
ranges = <0x82000000 0 0x80000000 0x80000000 0 0x00001000 /* port 0 registers */
0x82000000 0 0x80001000 0x80001000 0 0x00001000 /* port 1 registers */
0x81000000 0 0 0x82000000 0 0x00010000 /* downstream I/O */
0x82000000 0 0xa0000000 0xa0000000 0 0x10000000 /* non-prefetchable memory */
0xc2000000 0 0xb0000000 0xb0000000 0 0x10000000>; /* prefetchable memory */
Which says 'access to CPU address 0xa0000000 produces a PCI-E memory TLP with
address 0xa0000000' - this is the 'normal' case, I assume that is what
happens on tegra?
It also says 'access to CPU address 0x82000000 produces a PCI-E IO TLP
with address 0' - this translation is something Linux typically
expects..
Then you'd go on to have:
pci@1,0 {
device_type = "pci";
assigned-addresses = <0x82000000 0 0x80000000 0 0x1000>;
reg = <0x000800 0 0 0 0>;
}
pci@2,0 {
device_type = "pci";
assigned-addresses = <0x82000000 0 0x80001000 0 0x1000>;
reg = <0x001000 0 0 0 0>;
}
Notice I've made the upper dw of assigned-addresses's size 0 and
included the full 3dw from the appropriate ranges line.
> So the above will actually work along with the corresponding root-port
> "assigned-addresses" properties. I still don't like it much because I
> don't think it accurately reflects the hardware.
There are lots of valid ways to model the same HW :(
Bear in mind, for the PCI case - the OF PCI bindings model the HW
through the eyes of the abstractions in the PCI specification. That is
to say, they are not supposed to be an exact representation of the on
chip architecture.
Perhaps this would be clearer if you used 'pcie-root-complex' as the
name of the top level node?
> same kludgy, non-spec conformant smack that my original proposal had
> because it uses assigned-addresses for something it wasn't intended
> to.
Yes, only the top level 'reg' method avoids going outside any specs.
Regards,
Jason
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-14 17:25 ` Jason Gunthorpe
@ 2013-03-14 20:38 ` Thierry Reding
2013-03-14 21:05 ` Jason Gunthorpe
2013-03-14 21:10 ` Mitch Bradley
2013-03-14 21:09 ` Thierry Reding
1 sibling, 2 replies; 59+ messages in thread
From: Thierry Reding @ 2013-03-14 20:38 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Mitch Bradley, linux-pci, devicetree-discuss, linux-arm-kernel
[-- Attachment #1: Type: text/plain, Size: 3852 bytes --]
On Thu, Mar 14, 2013 at 11:25:55AM -0600, Jason Gunthorpe wrote:
> [trimm'd the cc list]
>
> On Thu, Mar 14, 2013 at 10:01:20AM +0100, Thierry Reding wrote:
>
> > It turns out that this works with the Tegra driver because it uses the
> > new of_pci_process_ranges() function and simply overwrites earlier
> > matches by subsequent ones.
> >
> > ranges = <0x82000000 0 0 0x80000000 0 0x00001000 /* port 0 registers */
> > 0x82000000 0 0 0x80001000 0 0x00001000 /* port 1 registers */
> > 0x81000000 0 0 0x82000000 0 0x00010000 /* downstream I/O */
> > 0x82000000 0 0 0xa0000000 0 0x10000000 /* non-prefetchable memory */
> > 0xc2000000 0 0 0xb0000000 0 0x10000000>; /* prefetchable memory */
>
> Okay.. There is still something funny here, the 3rd dword of the child
> address should not be 0 in every line and there shouldn't be overlaps
> in the child address space.
>
> I'm assuming 0x80000000, 0xa0000000 and 0xb0000000 are real CPU physical
> addresses?
Yes.
> Then it should probably look like:
>
> ranges = <0x82000000 0 0x80000000 0x80000000 0 0x00001000 /* port 0 registers */
> 0x82000000 0 0x80001000 0x80001000 0 0x00001000 /* port 1 registers */
> 0x81000000 0 0 0x82000000 0 0x00010000 /* downstream I/O */
> 0x82000000 0 0xa0000000 0xa0000000 0 0x10000000 /* non-prefetchable memory */
> 0xc2000000 0 0xb0000000 0xb0000000 0 0x10000000>; /* prefetchable memory */
>
> Which says 'access to CPU address 0xa0000000 produces a PCI-E memory TLP with
> address 0xa0000000' - this is the 'normal' case, I assume that is what
> happens on tegra?
>
> It also says 'access to CPU address 0x82000000 produces a PCI-E IO TLP
> with address 0' - this translation is something Linux typically
> expects..
>
> Then you'd go on to have:
>
> pci@1,0 {
> device_type = "pci";
> assigned-addresses = <0x82000000 0 0x80000000 0 0x1000>;
> reg = <0x000800 0 0 0 0>;
> }
> pci@2,0 {
> device_type = "pci";
> assigned-addresses = <0x82000000 0 0x80001000 0 0x1000>;
> reg = <0x001000 0 0 0 0>;
> }
>
> Notice I've made the upper dw of assigned-addresses's size 0 and
> included the full 3dw from the appropriate ranges line.
Okay, that all makes sense.
> > So the above will actually work along with the corresponding root-port
> > "assigned-addresses" properties. I still don't like it much because I
> > don't think it accurately reflects the hardware.
>
> There are lots of valid ways to model the same HW :(
>
> Bear in mind, for the PCI case - the OF PCI bindings model the HW
> through the eyes of the abstractions in the PCI specification. That is
> to say, they are not supposed to be an exact representation of the on
> chip architecture.
That seems to be at odds with most other uses of DT that I've come
across. Generally the guideline seems to be to describe hardware
irrespective of the underlying implementation and leave it up to the
driver to translate the DT description into something the OS can use.
> Perhaps this would be clearer if you used 'pcie-root-complex' as the
> name of the top level node?
Perhaps. I'm not sure.
> > same kludgy, non-spec conformant smack that my original proposal had
> > because it uses assigned-addresses for something it wasn't intended
> > to.
>
> Yes, only the top level 'reg' method avoids going outside any specs.
Yes. It has a couple of other disadvantages, though, so if what we've
been discussing here is in any way acceptable I'd rather go with that
solution, even if I'm not entirely happy about it either.
Thierry
[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-14 20:38 ` Thierry Reding
@ 2013-03-14 21:05 ` Jason Gunthorpe
2013-03-14 21:10 ` Mitch Bradley
1 sibling, 0 replies; 59+ messages in thread
From: Jason Gunthorpe @ 2013-03-14 21:05 UTC (permalink / raw)
To: Thierry Reding
Cc: Mitch Bradley, linux-pci, devicetree-discuss, linux-arm-kernel
On Thu, Mar 14, 2013 at 09:38:58PM +0100, Thierry Reding wrote:
> > pci@1,0 {
> > device_type = "pci";
> > assigned-addresses = <0x82000000 0 0x80000000 0 0x1000>;
Sorry, I missed this. The b,d,f bits should be set:
assigned-addresses = <0x82000800 0 0x80000000 0 0x1000>;
> > reg = <0x000800 0 0 0 0>;
> > }
> > pci@2,0 {
> > device_type = "pci";
> > assigned-addresses = <0x82000000 0 0x80001000 0 0x1000>;
assigned-addresses = <0x82001000 0 0x80001000 0 0x1000>;
The 'ranges' stays the same, the PCI 3/2 parse will ignore the b,d,f
bits when comparing against ranges.
The use of r == 0 for assigned-addresses should be enough to
disambiguate this case from any future use of assigned-addresses - the
normal case has r == offset of the associated BAR != 0.
Cheers,
Jason
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-14 17:25 ` Jason Gunthorpe
2013-03-14 20:38 ` Thierry Reding
@ 2013-03-14 21:09 ` Thierry Reding
2013-03-14 21:29 ` Jason Gunthorpe
1 sibling, 1 reply; 59+ messages in thread
From: Thierry Reding @ 2013-03-14 21:09 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Mitch Bradley, linux-pci, devicetree-discuss, linux-arm-kernel
[-- Attachment #1: Type: text/plain, Size: 1346 bytes --]
On Thu, Mar 14, 2013 at 11:25:55AM -0600, Jason Gunthorpe wrote:
[...]
> I'm assuming 0x80000000, 0xa0000000 and 0xb0000000 are real CPU physical
> addresses?
>
> Then it should probably look like:
>
> ranges = <0x82000000 0 0x80000000 0x80000000 0 0x00001000 /* port 0 registers */
> 0x82000000 0 0x80001000 0x80001000 0 0x00001000 /* port 1 registers */
> 0x81000000 0 0 0x82000000 0 0x00010000 /* downstream I/O */
> 0x82000000 0 0xa0000000 0xa0000000 0 0x10000000 /* non-prefetchable memory */
> 0xc2000000 0 0xb0000000 0xb0000000 0 0x10000000>; /* prefetchable memory */
>
> Which says 'access to CPU address 0xa0000000 produces a PCI-E memory TLP with
> address 0xa0000000' - this is the 'normal' case, I assume that is what
> happens on tegra?
>
> It also says 'access to CPU address 0x82000000 produces a PCI-E IO TLP
> with address 0' - this translation is something Linux typically
> expects..
Both of the above paragraphs are true. However accesses to the windows
at 0x80000000 and 0x80001000 don't generate memory TLPs but type 0
configuration space TLPs.
So my first instinct was to make the first cell of the first two entries
0, but that doesn't work, since the OF core expects to find either
memory or I/O spaces. ss == 2 isn't quite right here.
Thierry
[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-14 20:38 ` Thierry Reding
2013-03-14 21:05 ` Jason Gunthorpe
@ 2013-03-14 21:10 ` Mitch Bradley
1 sibling, 0 replies; 59+ messages in thread
From: Mitch Bradley @ 2013-03-14 21:10 UTC (permalink / raw)
To: Thierry Reding
Cc: Jason Gunthorpe, linux-pci, devicetree-discuss, linux-arm-kernel
On 3/14/2013 10:38 AM, Thierry Reding wrote:
> On Thu, Mar 14, 2013 at 11:25:55AM -0600, Jason Gunthorpe wrote:
>> [trimm'd the cc list]
>>
>> On Thu, Mar 14, 2013 at 10:01:20AM +0100, Thierry Reding wrote:
>>
>>> It turns out that this works with the Tegra driver because it uses the
>>> new of_pci_process_ranges() function and simply overwrites earlier
>>> matches by subsequent ones.
>>>
>>> ranges = <0x82000000 0 0 0x80000000 0 0x00001000 /* port 0 registers */
>>> 0x82000000 0 0 0x80001000 0 0x00001000 /* port 1 registers */
>>> 0x81000000 0 0 0x82000000 0 0x00010000 /* downstream I/O */
>>> 0x82000000 0 0 0xa0000000 0 0x10000000 /* non-prefetchable memory */
>>> 0xc2000000 0 0 0xb0000000 0 0x10000000>; /* prefetchable memory */
>>
>> Okay.. There is still something funny here, the 3rd dword of the child
>> address should not be 0 in every line and there shouldn't be overlaps
>> in the child address space.
>>
>> I'm assuming 0x80000000, 0xa0000000 and 0xb0000000 are real CPU physical
>> addresses?
>
> Yes.
>
>> Then it should probably look like:
>>
>> ranges = <0x82000000 0 0x80000000 0x80000000 0 0x00001000 /* port 0 registers */
>> 0x82000000 0 0x80001000 0x80001000 0 0x00001000 /* port 1 registers */
>> 0x81000000 0 0 0x82000000 0 0x00010000 /* downstream I/O */
>> 0x82000000 0 0xa0000000 0xa0000000 0 0x10000000 /* non-prefetchable memory */
>> 0xc2000000 0 0xb0000000 0xb0000000 0 0x10000000>; /* prefetchable memory */
>>
>> Which says 'access to CPU address 0xa0000000 produces a PCI-E memory TLP with
>> address 0xa0000000' - this is the 'normal' case, I assume that is what
>> happens on tegra?
>>
>> It also says 'access to CPU address 0x82000000 produces a PCI-E IO TLP
>> with address 0' - this translation is something Linux typically
>> expects..
>>
>> Then you'd go on to have:
>>
>> pci@1,0 {
>> device_type = "pci";
>> assigned-addresses = <0x82000000 0 0x80000000 0 0x1000>;
>> reg = <0x000800 0 0 0 0>;
>> }
>> pci@2,0 {
>> device_type = "pci";
>> assigned-addresses = <0x82000000 0 0x80001000 0 0x1000>;
>> reg = <0x001000 0 0 0 0>;
>> }
>>
>> Notice I've made the upper dw of assigned-addresses's size 0 and
>> included the full 3dw from the appropriate ranges line.
>
> Okay, that all makes sense.
>
>>> So the above will actually work along with the corresponding root-port
>>> "assigned-addresses" properties. I still don't like it much because I
>>> don't think it accurately reflects the hardware.
>>
>> There are lots of valid ways to model the same HW :(
>>
>> Bear in mind, for the PCI case - the OF PCI bindings model the HW
>> through the eyes of the abstractions in the PCI specification. That is
>> to say, they are not supposed to be an exact representation of the on
>> chip architecture.
>
> That seems to be at odds with most other uses of DT that I've come
> across. Generally the guideline seems to be to describe hardware
> irrespective of the underlying implementation and leave it up to the
> driver to translate the DT description into something the OS can use.
This can be a gray area where one just has to make a judgement call.
You can think of the hardware at various levels of abstraction and
sometimes it is not obvious where to draw the line. "Tell the truth"
tends to be more stable in the long run, but can be more work in the
short run, and it can be hard to predict the future well enough to
choose the optimum approach.
>
>> Perhaps this would be clearer if you used 'pcie-root-complex' as the
>> name of the top level node?
>
> Perhaps. I'm not sure.
>
>>> same kludgy, non-spec conformant smack that my original proposal had
>>> because it uses assigned-addresses for something it wasn't intended
>>> to.
>>
>> Yes, only the top level 'reg' method avoids going outside any specs.
>
> Yes. It has a couple of other disadvantages, though, so if what we've
> been discussing here is in any way acceptable I'd rather go with that
> solution, even if I'm not entirely happy about it either.
>
> Thierry
>
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-14 21:09 ` Thierry Reding
@ 2013-03-14 21:29 ` Jason Gunthorpe
2013-03-14 21:37 ` Thierry Reding
0 siblings, 1 reply; 59+ messages in thread
From: Jason Gunthorpe @ 2013-03-14 21:29 UTC (permalink / raw)
To: Thierry Reding
Cc: Mitch Bradley, linux-pci, devicetree-discuss, linux-arm-kernel
On Thu, Mar 14, 2013 at 10:09:26PM +0100, Thierry Reding wrote:
> > ranges = <0x82000000 0 0x80000000 0x80000000 0 0x00001000 /* port 0 registers */
> > 0x82000000 0 0x80001000 0x80001000 0 0x00001000 /* port 1 registers */
> > 0x81000000 0 0 0x82000000 0 0x00010000 /* downstream I/O */
> > 0x82000000 0 0xa0000000 0xa0000000 0 0x10000000 /* non-prefetchable memory */
> > 0xc2000000 0 0xb0000000 0xb0000000 0 0x10000000>; /* prefetchable memory */
> >
> > Which says 'access to CPU address 0xa0000000 produces a PCI-E memory TLP with
> > address 0xa0000000' - this is the 'normal' case, I assume that is what
> > happens on tegra?
> >
> > It also says 'access to CPU address 0x82000000 produces a PCI-E IO TLP
> > with address 0' - this translation is something Linux typically
> > expects..
>
> Both of the above paragraphs are true. However accesses to the windows
> at 0x80000000 and 0x80001000 don't generate memory TLPs but type 0
> configuration space TLPs.
By my understanding access to 0x80000000/0x80001000 doesn't generate
any externally visible TLPs?
IHMO modeling this register space as a controller-internal MMIO region
associated with the bridge is reasonable... After all, you are
iomapping it and accessing it with readl/writel - those are MMIO
functions..
> So my first instinct was to make the first cell of the first two entries
> 0, but that doesn't work, since the OF core expects to find either
> memory or I/O spaces.
Right, there is no specified way to translate config ss through
ranges.
Regards,
Jason
^ permalink raw reply [flat|nested] 59+ messages in thread
* Re: [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems
2013-03-14 21:29 ` Jason Gunthorpe
@ 2013-03-14 21:37 ` Thierry Reding
0 siblings, 0 replies; 59+ messages in thread
From: Thierry Reding @ 2013-03-14 21:37 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Mitch Bradley, linux-pci, devicetree-discuss, linux-arm-kernel
[-- Attachment #1: Type: text/plain, Size: 1566 bytes --]
On Thu, Mar 14, 2013 at 03:29:21PM -0600, Jason Gunthorpe wrote:
> On Thu, Mar 14, 2013 at 10:09:26PM +0100, Thierry Reding wrote:
>
> > > ranges = <0x82000000 0 0x80000000 0x80000000 0 0x00001000 /* port 0 registers */
> > > 0x82000000 0 0x80001000 0x80001000 0 0x00001000 /* port 1 registers */
> > > 0x81000000 0 0 0x82000000 0 0x00010000 /* downstream I/O */
> > > 0x82000000 0 0xa0000000 0xa0000000 0 0x10000000 /* non-prefetchable memory */
> > > 0xc2000000 0 0xb0000000 0xb0000000 0 0x10000000>; /* prefetchable memory */
> > >
> > > Which says 'access to CPU address 0xa0000000 produces a PCI-E memory TLP with
> > > address 0xa0000000' - this is the 'normal' case, I assume that is what
> > > happens on tegra?
> > >
> > > It also says 'access to CPU address 0x82000000 produces a PCI-E IO TLP
> > > with address 0' - this translation is something Linux typically
> > > expects..
> >
> > Both of the above paragraphs are true. However accesses to the windows
> > at 0x80000000 and 0x80001000 don't generate memory TLPs but type 0
> > configuration space TLPs.
>
> By my understanding access to 0x80000000/0x80001000 doesn't generate
> any externally visible TLPs?
Now that you mention it, that's probably correct, yes.
> IHMO modeling this register space as a controller-internal MMIO region
> associated with the bridge is reasonable... After all, you are
> iomapping it and accessing it with readl/writel - those are MMIO
> functions..
Yes, I think that'd be okay.
Thierry
[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply [flat|nested] 59+ messages in thread
end of thread, other threads:[~2013-03-14 21:37 UTC | newest]
Thread overview: 59+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <1360686546-24277-1-git-send-email-thomas.petazzoni@free-electrons.com>
[not found] ` <1360686546-24277-25-git-send-email-thomas.petazzoni@free-electrons.com>
[not found] ` <20130212223511.GB31555@obsidianresearch.com>
[not found] ` <20130306105441.4d24033e@skate>
[not found] ` <20130306121118.GA17079@avionic-0098.mockup.avionic-design.de>
[not found] ` <20130306180946.GA2433@obsidianresearch.com>
[not found] ` <20130307080832.GD3451@avionic-0098.mockup.avionic-design.de>
[not found] ` <20130307174955.GC20840@obsidianresearch.com>
[not found] ` <20130307194830.GA1811@avionic-0098.mockup.avionic-design.de>
[not found] ` <20130307200235.GB20695@obsidianresearch.com>
[not found] ` <20130307200235.GB20695-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2013-03-07 20:47 ` [PATCH 24/32] pci: PCIe driver for Marvell Armada 370/XP systems Thierry Reding
2013-03-08 0:05 ` Rob Herring
2013-03-08 7:14 ` Thierry Reding
2013-03-08 16:52 ` Jason Gunthorpe
2013-03-08 19:12 ` Thierry Reding
[not found] ` <20130308191227.GA6551-RM9K5IK7kjIQXX3q8xo1gnVAuStQJXxyR5q1nwbD4aMs9pC9oP6+/A@public.gmane.org>
2013-03-08 19:43 ` Mitch Bradley
2013-03-08 20:02 ` Jason Gunthorpe
[not found] ` <20130308200245.GC29435-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2013-03-08 20:13 ` Thierry Reding
2013-03-10 15:09 ` Thomas Petazzoni
2013-03-11 8:08 ` Thierry Reding
2013-03-08 23:46 ` Mitch Bradley
2013-03-09 1:31 ` Jason Gunthorpe
[not found] ` <20130309013152.GA3883-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2013-03-10 4:52 ` Mitch Bradley
2013-03-10 6:55 ` Jason Gunthorpe
[not found] ` <20130310065539.GA14704-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2013-03-11 5:46 ` Mitch Bradley
[not found] ` <513D6F9C.9000100-D5eQfiDGL7eakBO8gow8eQ@public.gmane.org>
2013-03-11 7:46 ` Thierry Reding
[not found] ` <20130311074615.GA6365-RM9K5IK7kjIQXX3q8xo1gnVAuStQJXxyR5q1nwbD4aMs9pC9oP6+/A@public.gmane.org>
2013-03-11 18:04 ` Mitch Bradley
2013-03-11 18:23 ` Jason Gunthorpe
[not found] ` <20130311182339.GB10992-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2013-03-11 19:49 ` Mitch Bradley
2013-03-11 18:15 ` Jason Gunthorpe
[not found] ` <20130311181554.GA10992-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2013-03-11 21:50 ` Mitch Bradley
2013-03-11 23:25 ` Jason Gunthorpe
[not found] ` <20130311232516.GA13873-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2013-03-11 23:38 ` Mitch Bradley
[not found] ` <513E6AFE.3090304-D5eQfiDGL7eakBO8gow8eQ@public.gmane.org>
2013-03-12 7:08 ` Thierry Reding
2013-03-12 15:57 ` Jason Gunthorpe
2013-03-12 20:38 ` Thierry Reding
2013-03-12 21:03 ` Jason Gunthorpe
2013-03-12 21:30 ` Thierry Reding
2013-03-12 22:08 ` Jason Gunthorpe
2013-03-12 23:25 ` Mitch Bradley
2013-03-13 8:18 ` Thierry Reding
2013-03-13 17:02 ` Jason Gunthorpe
[not found] ` <20130313170205.GB24042-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2013-03-13 19:26 ` Thierry Reding
2013-03-13 19:59 ` Jason Gunthorpe
2013-03-13 20:54 ` Thierry Reding
2013-03-13 20:58 ` Mitch Bradley
2013-03-13 21:33 ` Thierry Reding
2013-03-13 22:48 ` Mitch Bradley
2013-03-14 0:43 ` Rob Herring
2013-03-14 1:20 ` Mitch Bradley
2013-03-14 7:11 ` Thierry Reding
2013-03-14 4:56 ` Stephen Warren
[not found] ` <5140E85A.3040900-D5eQfiDGL7eakBO8gow8eQ@public.gmane.org>
2013-03-13 22:02 ` Thierry Reding
2013-03-13 22:21 ` Jason Gunthorpe
2013-03-14 9:01 ` Thierry Reding
2013-03-14 17:25 ` Jason Gunthorpe
2013-03-14 20:38 ` Thierry Reding
2013-03-14 21:05 ` Jason Gunthorpe
2013-03-14 21:10 ` Mitch Bradley
2013-03-14 21:09 ` Thierry Reding
2013-03-14 21:29 ` Jason Gunthorpe
2013-03-14 21:37 ` Thierry Reding
2013-03-13 22:22 ` Jason Gunthorpe
2013-03-09 8:58 ` Thomas Petazzoni
2013-03-08 23:12 ` Rob Herring
[not found] ` <513A7044.1020700-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2013-03-09 11:10 ` Thierry Reding
2013-03-10 5:04 ` Mitch Bradley
2013-03-10 15:06 ` Thomas Petazzoni
2013-03-10 18:33 ` Mitch Bradley
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).