mvebu-mbus: defining a DT binding

linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed

* mvebu-mbus: defining a DT binding
@ 2013-04-05 13:02 Thomas Petazzoni
  2013-04-05 13:17 ` Arnd Bergmann
  2013-04-05 16:27 ` Jason Gunthorpe
  0 siblings, 2 replies; 16+ messages in thread
From: Thomas Petazzoni @ 2013-04-05 13:02 UTC (permalink / raw)
  To: linux-arm-kernel

Jason, Arnd,

Now that the non-DT version of mvebu-mbus has been picked up by Jason
Cooper for 3.10, I'd like to start working on a proper DT binding for
mvebu-mbus, probably targeting 3.11. I've already started writing a bit
of code just to see how things were going, and I have a couple of
questions.

The last proposal from Jason Gunthorpe was the following one:

========================================================================

/* MAPDEF is a 2 dw value, top DW encodes the target id, bottom dword is
   usally 0. The special target id of '0' is no target/no window. */
#define MAPDEF(x,y,z) ((x << 16) | (y << 8) | z) 0
#define MAPDEF_INTERNAL MAPDEF(...)
[..]
#define PCI_MMIO_APERTURE 0xe0000000
#define PCI_IO_APERTURE 0xe8000000

mbus {
  ranges = <MAPDEF_INTERNAL 0xf1000000 0x100000
            MAPDEF_NAND     0xf4000000 0x10000
            MAPDEF_CRYPTO   0xf5000000 0x10000

 	    /* These two are *not* mbus windows but are required to
   	       model the PCI aperture abstraction. Windows are
	       allocated within these regions dynamically
               as neeed. */
            0 PCI_MMIO_APERTURE PCI_MMIO_APERTURE 0x08000000
            0 PCI_IO_APERTURE PCI_MMIO_APERTURE 0x00100000>
  #address-cells = <2>;
  [..]

  // Internal regs special target
  internal_regs at f1000000 {
       compatible = "simple-bus";
       ranges = <0x00000000 MAPDEF_INTERNAL 0x100000>;
       #address-cells = <1>;

       serial at 12000 {
              compatible = "ns16550a";
              reg = <0x12000 0x100>;
              reg-shift = <2>;
              interrupts = <33>;
        };
       [..]
  }

  // All the PCI-E targets
  pcie-controller {
     compatible = "marvell,armada-370-pcie";
     reg = <MAPDEF_INTERNAL + 0x40000 0x2000>,
           <MAPDEF_INTERNAL + 0x80000 0x2000>;
     ranges = <0x82000000 0 PCI_MMIO_APERTURE PCI_MMIO_APERTURE 0 0x08000000   /* non-prefetchable memory */
               0x81000000 0 0                 PCI_IO_APERTURE 0  0x00100000>; /* downstream I/O */
     [..]
     pcie at 0,0 {
          reg = <0x0800 0 0 0 0>;
          marvell,mbus-targets = <MAPDEF_PCIE0_MEM MAPDEF_PCIE0_IO>;
     }
  }

  // Target id 1,0x2f
  nand at 0xf4000000 {
          #address-cells = <1>;
          #size-cells = <1>;
          cle = <0>;
          ale = <1>;
          bank-width = <1>;
          compatible = "marvell,orion-nand";
          reg = <MAPDEF_NAND 0x400>;
  };

  // Target id 0x3,0x1
  crypto at f5000000 {
          reg = <MAPDEF_INTERNAL + 0x30000 0x10000>,
                <MAPDEF_CRYPTO 0x800>;
  }
}

========================================================================

It raises the following questions:

 * The address decoding windows are in fact defined by the ranges =
   <...> property of the mbus node. Here, in the example provided by
   Jason, this ranges = <...> property is part of the SoC-level .dtsi.
   However, some boards will have to add some board-specific windows
   (adjusted to the size of their NOR, or FPGA, for example).

   As far as I know, we can 'extend' an existing property in a .dts
   file, we have to completely overload it. So this means that boards
   wanting to add an additional address decoding window will have to
   copy/paste the 'ranges' property from the included .dtsi file and
   add their own entry. This is of course possible, but it doesn't look
   very nice: if a new window is added in the SoC-level .dtsi file for
   some reason, this change will have to be replicated on all the
   including .dts files that overload this ranges property.

   Is this the intended behavior?

 * I am not sure to understand why the nand, crypto and pcie-controller
   nodes are outside of the internal registers. Well, I understand it's
   because those devices need a special address decoding window. But it
   sounds strange because those devices also have internal registers
   (which is why Jason used 'MAPDEF_INTERNAL + offset' in the reg
   property).

   Is this really the desired DT binding?

 * Regarding the PCIe binding, Jason Gunthorpe suggests the use of a
   'marvell,mbus-targets' attribute so that the PCIe driver knows what
   target/attribute values to use to create the window. Currently, the
   PCIe driver uses a name (like "pcie0.3") that it gives to the
   mvebu-mbus driver, which then translates this name to the real
   target/attribute values using per-SoC tables that associate names
   (pcie0.3) with values (say 0x4 / 0x42).

   Is this marvell,mbus-targets really the suggested solution? This
   would basically mean that the entire name -> value mapping tables of
   the driver would ultimately become useless (note that I'm not
   necessarily saying it is bad, I just want to check where we are
   going before doing useless work). Of course, those tables would
   remain at the beginning, once all platforms are converted to the
   mvebu-mbus DT binding, which may take a bit of time since it requires
   quite a lot of code movement in the .dtsi/.dts files.

I already have some basic code in the mvebu-mbus driver that parses the
ranges = <...> property and creates the corresponding windows, it seems
to work fine.

Thanks,

Thomas
-- 
Thomas Petazzoni, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com

^ permalink raw reply	[flat|nested] 16+ messages in thread

* mvebu-mbus: defining a DT binding
  2013-04-05 13:02 mvebu-mbus: defining a DT binding Thomas Petazzoni
@ 2013-04-05 13:17 ` Arnd Bergmann
  2013-04-05 13:51   ` Thomas Petazzoni
  2013-04-05 16:27 ` Jason Gunthorpe
  1 sibling, 1 reply; 16+ messages in thread
From: Arnd Bergmann @ 2013-04-05 13:17 UTC (permalink / raw)
  To: linux-arm-kernel

On Friday 05 April 2013, Thomas Petazzoni wrote:

> ========================================================================
> 
> It raises the following questions:
> 
>  * The address decoding windows are in fact defined by the ranges =
>    <...> property of the mbus node. Here, in the example provided by
>    Jason, this ranges = <...> property is part of the SoC-level .dtsi.
>    However, some boards will have to add some board-specific windows
>    (adjusted to the size of their NOR, or FPGA, for example).
> 
>    As far as I know, we can 'extend' an existing property in a .dts
>    file, we have to completely overload it. So this means that boards
>    wanting to add an additional address decoding window will have to
>    copy/paste the 'ranges' property from the included .dtsi file and
>    add their own entry. This is of course possible, but it doesn't look
>    very nice: if a new window is added in the SoC-level .dtsi file for
>    some reason, this change will have to be replicated on all the
>    including .dts files that overload this ranges property.
> 
>    Is this the intended behavior?

I don't have a good solution. Maybe we can work around this with the
new preprocessor support by letting the board file provide a macro
with the actual size?

Another option would be to not map them by default but let the
mbus driver do as many mappings as possible at boot time based
on the devices that are actually present as children of the mbus
device and not marked as status="disabled".

Do you know which children of the mbus node (if any) we actually
need to access before initializing mbus?

>  * I am not sure to understand why the nand, crypto and pcie-controller
>    nodes are outside of the internal registers. Well, I understand it's
>    because those devices need a special address decoding window. But it
>    sounds strange because those devices also have internal registers
>    (which is why Jason used 'MAPDEF_INTERNAL + offset' in the reg
>    property).
> 
>    Is this really the desired DT binding?

I don't understant what "internal registers" mean in this context.

>  * Regarding the PCIe binding, Jason Gunthorpe suggests the use of a
>    'marvell,mbus-targets' attribute so that the PCIe driver knows what
>    target/attribute values to use to create the window. Currently, the
>    PCIe driver uses a name (like "pcie0.3") that it gives to the
>    mvebu-mbus driver, which then translates this name to the real
>    target/attribute values using per-SoC tables that associate names
>    (pcie0.3) with values (say 0x4 / 0x42).
> 
>    Is this marvell,mbus-targets really the suggested solution? This
>    would basically mean that the entire name -> value mapping tables of
>    the driver would ultimately become useless (note that I'm not
>    necessarily saying it is bad, I just want to check where we are
>    going before doing useless work). Of course, those tables would
>    remain at the beginning, once all platforms are converted to the
>    mvebu-mbus DT binding, which may take a bit of time since it requires
>    quite a lot of code movement in the .dtsi/.dts files.

I think it would be much nicer to express the specific mbus target using
the ranges properties of the device nodes in PCI. Maybe I'm missing what
the problem is here. 

> I already have some basic code in the mvebu-mbus driver that parses the
> ranges = <...> property and creates the corresponding windows, it seems
> to work fine.

Ok, nice! So I guess the next step is to dynamically insert additional
ranges for devices that are only mapped at run time, right?

	Arnd

^ permalink raw reply	[flat|nested] 16+ messages in thread

* mvebu-mbus: defining a DT binding
  2013-04-05 13:17 ` Arnd Bergmann
@ 2013-04-05 13:51   ` Thomas Petazzoni
  2013-04-05 14:36     ` Arnd Bergmann
  0 siblings, 1 reply; 16+ messages in thread
From: Thomas Petazzoni @ 2013-04-05 13:51 UTC (permalink / raw)
  To: linux-arm-kernel

Dear Arnd Bergmann,

On Fri, 5 Apr 2013 15:17:26 +0200, Arnd Bergmann wrote:
> On Friday 05 April 2013, Thomas Petazzoni wrote:
> 
> > ========================================================================
> > 
> > It raises the following questions:
> > 
> >  * The address decoding windows are in fact defined by the ranges =
> >    <...> property of the mbus node. Here, in the example provided by
> >    Jason, this ranges = <...> property is part of the SoC-level .dtsi.
> >    However, some boards will have to add some board-specific windows
> >    (adjusted to the size of their NOR, or FPGA, for example).
> > 
> >    As far as I know, we can 'extend' an existing property in a .dts
> >    file, we have to completely overload it. So this means that boards
> >    wanting to add an additional address decoding window will have to
> >    copy/paste the 'ranges' property from the included .dtsi file and
> >    add their own entry. This is of course possible, but it doesn't look
> >    very nice: if a new window is added in the SoC-level .dtsi file for
> >    some reason, this change will have to be replicated on all the
> >    including .dts files that overload this ranges property.
> > 
> >    Is this the intended behavior?
> 
> I don't have a good solution. Maybe we can work around this with the
> new preprocessor support by letting the board file provide a macro
> with the actual size?

The thing is that we don't know how many NOR, FPGA and other device-bus
devices the board will use. So you mean we should define a size of 0 by
default (in which case no window would be created), and if a board
overrides that with a non-zero value, then we would use that one?

I don't find the solution of using pre-processor magic to do that very
pretty. The 'inheritance' property of the DT normally solves that
nicely... but since we're encoding everything with ranges, it doesn't
work very well.

Maybe the 'ranges' property should not contain any static window
definition at all, and instead some other DT properties would define
the windows, and the 'ranges' property would be reconstructed by the
mbus driver at initialization time? I admit I would really prefer if
windows were described by name ("pcie0.0", "devbus-cs0", "bootrom")
instead of by encoding the target/attribute values in the DT. This
would make the DT a lot easier to read, understand and modify by people
porting the kernel to new boards. But using a name is not possible in
the 'ranges' property.

> Another option would be to not map them by default but let the
> mbus driver do as many mappings as possible at boot time based
> on the devices that are actually present as children of the mbus
> device and not marked as status="disabled".

So the 'ranges' property would be more or less empty? How would the
mbus driver know the target/attribute values for each window?

> Do you know which children of the mbus node (if any) we actually
> need to access before initializing mbus?

I'm sorry but I don't understand this question. There's nothing to do
to 'initialize mbus'. We can add/remove address decoding windows
whenever we want.

> >  * I am not sure to understand why the nand, crypto and pcie-controller
> >    nodes are outside of the internal registers. Well, I understand it's
> >    because those devices need a special address decoding window. But it
> >    sounds strange because those devices also have internal registers
> >    (which is why Jason used 'MAPDEF_INTERNAL + offset' in the reg
> >    property).
> > 
> >    Is this really the desired DT binding?
> 
> I don't understant what "internal registers" mean in this context.

On Marvell HW, the physical address space is split in multiple windows:
PCIe memory, PCIe I/O, mapping of NOR, mapping of FPGA, special
memories like crypto-engine SRAM, etc. One very special window is
called the 'internal registers window'. It is set up by the bootloader,
and contains all the registers for all the devices: UART, PCIe
interfaces, USB, SDIO, Ethernet, everything. All the 0xd00yyyyy
addresses in the current Armada 370/XP .dts(i) are part of the internal
registers window.

> >  * Regarding the PCIe binding, Jason Gunthorpe suggests the use of a
> >    'marvell,mbus-targets' attribute so that the PCIe driver knows what
> >    target/attribute values to use to create the window. Currently, the
> >    PCIe driver uses a name (like "pcie0.3") that it gives to the
> >    mvebu-mbus driver, which then translates this name to the real
> >    target/attribute values using per-SoC tables that associate names
> >    (pcie0.3) with values (say 0x4 / 0x42).
> > 
> >    Is this marvell,mbus-targets really the suggested solution? This
> >    would basically mean that the entire name -> value mapping tables of
> >    the driver would ultimately become useless (note that I'm not
> >    necessarily saying it is bad, I just want to check where we are
> >    going before doing useless work). Of course, those tables would
> >    remain at the beginning, once all platforms are converted to the
> >    mvebu-mbus DT binding, which may take a bit of time since it requires
> >    quite a lot of code movement in the .dtsi/.dts files.
> 
> I think it would be much nicer to express the specific mbus target using
> the ranges properties of the device nodes in PCI. Maybe I'm missing what
> the problem is here. 

Hum, this has already been discussed when doing the PCIe driver.

Basically it's not possible because the target/attribute values are per
PCIe interface, while the ranges we need to PCIe are *global*. Since we
don't know how much memory and I/O each PCIe interface will need, we
can't provision the right windows statically in the DT. The windows are
created dynamically by the PCIe driver depending on what gets
enumerated on the downstream bus.

So in the DT we have *one* global range for PCIe memory and *one* global
range for PCIe I/O.

But then, the PCIe driver needs to know for *each* PCIe interface what
is the right target/attribute tuple to create a memory or I/O window
for that particular interface.

Does that make sense?

> > I already have some basic code in the mvebu-mbus driver that parses the
> > ranges = <...> property and creates the corresponding windows, it seems
> > to work fine.
> 
> Ok, nice! So I guess the next step is to dynamically insert additional
> ranges for devices that are only mapped at run time, right?

What would be the need of this? Nobody will ever read the 'ranges'
property again at runtime, so why bother? Of course, if you need it for
good DT-compliance, I'll do it, but I don't see the point of having
kernel code to update a data structure that will never be read again by
anybody else.

Thanks for your comments!

Thomas
-- 
Thomas Petazzoni, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com

^ permalink raw reply	[flat|nested] 16+ messages in thread

* mvebu-mbus: defining a DT binding
  2013-04-05 13:51   ` Thomas Petazzoni
@ 2013-04-05 14:36     ` Arnd Bergmann
  2013-04-05 17:07       ` Jason Gunthorpe
  0 siblings, 1 reply; 16+ messages in thread
From: Arnd Bergmann @ 2013-04-05 14:36 UTC (permalink / raw)
  To: linux-arm-kernel

On Friday 05 April 2013, Thomas Petazzoni wrote:
> On Fri, 5 Apr 2013 15:17:26 +0200, Arnd Bergmann wrote:
> > On Friday 05 April 2013, Thomas Petazzoni wrote:
> > I don't have a good solution. Maybe we can work around this with the
> > new preprocessor support by letting the board file provide a macro
> > with the actual size?
> 
> The thing is that we don't know how many NOR, FPGA and other device-bus
> devices the board will use. So you mean we should define a size of 0 by
> default (in which case no window would be created), and if a board
> overrides that with a non-zero value, then we would use that one?
> 
> I don't find the solution of using pre-processor magic to do that very
> pretty. The 'inheritance' property of the DT normally solves that
> nicely... but since we're encoding everything with ranges, it doesn't
> work very well.

Right.

> Maybe the 'ranges' property should not contain any static window
> definition at all, and instead some other DT properties would define
> the windows, and the 'ranges' property would be reconstructed by the
> mbus driver at initialization time? I admit I would really prefer if
> windows were described by name ("pcie0.0", "devbus-cs0", "bootrom")
> instead of by encoding the target/attribute values in the DT. This
> would make the DT a lot easier to read, understand and modify by people
> porting the kernel to new boards. But using a name is not possible in
> the 'ranges' property.

Yes, that is unfortunately something we cannot change any more.
Setting up a minimum number of windows from a different property
would be possible, but I don't see a strong reason to do that
instead of using the ranges property for it.

> > Another option would be to not map them by default but let the
> > mbus driver do as many mappings as possible at boot time based
> > on the devices that are actually present as children of the mbus
> > device and not marked as status="disabled".
> 
> So the 'ranges' property would be more or less empty? How would the
> mbus driver know the target/attribute values for each window?

I think ideally the ranges property would be completely empty
(absent actually, but that is a detail) at boot, and the mbus driver
would fill the ranges property as needed. I can see two ways to
do that:

a) The mbus driver does a for_each_child_of_node() loop to iterate
through its children and allocates a new physical address for each
"reg" property it finds on all devices that are not status="disabled".
Since the windows are all power-of-two pages in size, a simple
"first-fit" algorith should be just fine here. The definition of 
the mbus address space guarantees that the "reg" property has
all the information needed to do that mapping.

b) We explicitly make all devices under mbus aware of the fact that
they they are on mbus, and export a function that they have to call
to map the window, like

	phys_addr_t mbus_map_reg(struct device_node *dn, int index);

The mbus driver then allocates a free physical address (as in a)) and
maps the window based on the address referred to by the reg property.

We can add a convenience function that combines this with request_resource
and ioremap for drivers that would do those anyway.

> > Do you know which children of the mbus node (if any) we actually
> > need to access before initializing mbus?
> 
> I'm sorry but I don't understand this question. There's nothing to do
> to 'initialize mbus'. We can add/remove address decoding windows
> whenever we want.

Well, we have to set up the windows at one point during the boot process,
and we cannot access anything behind the windows until they are set up.

So if for instance the UART we use for DEBUG_LL is behind an MBUS window,
we cannot easily change its address. My question is which devices fall
into this category, if any.

> > >  * I am not sure to understand why the nand, crypto and pcie-controller
> > >    nodes are outside of the internal registers. Well, I understand it's
> > >    because those devices need a special address decoding window. But it
> > >    sounds strange because those devices also have internal registers
> > >    (which is why Jason used 'MAPDEF_INTERNAL + offset' in the reg
> > >    property).
> > > 
> > >    Is this really the desired DT binding?
> > 
> > I don't understant what "internal registers" mean in this context.
> 
> On Marvell HW, the physical address space is split in multiple windows:
> PCIe memory, PCIe I/O, mapping of NOR, mapping of FPGA, special
> memories like crypto-engine SRAM, etc. One very special window is
> called the 'internal registers window'. It is set up by the bootloader,
> and contains all the registers for all the devices: UART, PCIe
> interfaces, USB, SDIO, Ethernet, everything. All the 0xd00yyyyy
> addresses in the current Armada 370/XP .dts(i) are part of the internal
> registers window.

Ok, I see. Two follow-up questions:

Does the internal registers window contain the MMIO space of the mbus
device as well?

Do you expect that we always need just one window to map all the internal
registers, or would there be a reason to split it up into multiple windows
to reduce the amount of physical address space consumed?

> > I think it would be much nicer to express the specific mbus target using
> > the ranges properties of the device nodes in PCI. Maybe I'm missing what
> > the problem is here. 
> 
> Hum, this has already been discussed when doing the PCIe driver.
> 
> Basically it's not possible because the target/attribute values are per
> PCIe interface, while the ranges we need to PCIe are *global*. Since we
> don't know how much memory and I/O each PCIe interface will need, we
> can't provision the right windows statically in the DT. The windows are
> created dynamically by the PCIe driver depending on what gets
> enumerated on the downstream bus.
> 
> So in the DT we have *one* global range for PCIe memory and *one* global
> range for PCIe I/O.
> 
> But then, the PCIe driver needs to know for *each* PCIe interface what
> is the right target/attribute tuple to create a memory or I/O window
> for that particular interface.
> 
> Does that make sense?

I don't get it yet. I would assume that each PCIe port maps exactly to
one address window, which solves the problem you describe above very
nicely. We can do away with the fake "one range for memory" concept
and just let the mbus driver pick any physical address when we enable
or hot-plug one of the ports.

> > > I already have some basic code in the mvebu-mbus driver that parses the
> > > ranges = <...> property and creates the corresponding windows, it seems
> > > to work fine.
> > 
> > Ok, nice! So I guess the next step is to dynamically insert additional
> > ranges for devices that are only mapped at run time, right?
> 
> What would be the need of this? Nobody will ever read the 'ranges'
> property again at runtime, so why bother? Of course, if you need it for
> good DT-compliance, I'll do it, but I don't see the point of having
> kernel code to update a data structure that will never be read again by
> anybody else.

The ranges properties are required so a device driver can call of_iomap().
Without it, there is no way to convert an address in mbus space into a
CPU phys_addr_t.

	Arnd

^ permalink raw reply	[flat|nested] 16+ messages in thread

* mvebu-mbus: defining a DT binding
  2013-04-05 14:36     ` Arnd Bergmann
@ 2013-04-05 17:07       ` Jason Gunthorpe
  2013-04-05 17:28         ` Arnd Bergmann
  0 siblings, 1 reply; 16+ messages in thread
From: Jason Gunthorpe @ 2013-04-05 17:07 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Apr 05, 2013 at 04:36:56PM +0200, Arnd Bergmann wrote:

> > > Another option would be to not map them by default but let the
> > > mbus driver do as many mappings as possible at boot time based
> > > on the devices that are actually present as children of the mbus
> > > device and not marked as status="disabled".
> > 
> > So the 'ranges' property would be more or less empty? How would the
> > mbus driver know the target/attribute values for each window?
> 
> I think ideally the ranges property would be completely empty
> (absent actually, but that is a detail) at boot, and the mbus driver
> would fill the ranges property as needed. I can see two ways to
> do that:

At a minimum it needs to have the special internal regs target. The
mbus driver cannot relocate that memory, and must be told where the
bootloader left it.

Beyond that, things get muddled, IMHO. DT is being asked to do two
slightly conflicting things - represent the address map from the
bootloader and also provide enough information for Linux to
dynamically reconfigure it.

> a) The mbus driver does a for_each_child_of_node() loop to iterate
> through its children and allocates a new physical address for each
> "reg" property it finds on all devices that are not status="disabled".
> Since the windows are all power-of-two pages in size, a simple
> "first-fit" algorith should be just fine here. The definition of 
> the mbus address space guarantees that the "reg" property has
> all the information needed to do that mapping.

Certainly doable.. ranges would have to be parsed as well, and it is a
bit complex, but very doable.

> b) We explicitly make all devices under mbus aware of the fact that
> they they are on mbus, and export a function that they have to call
> to map the window, like

IMHO, this is the nuclear option - it should be avoided. Existing
general drivers like serial and MTD should not need special wrappers
to work..

> > > Do you know which children of the mbus node (if any) we actually
> > > need to access before initializing mbus?
> > 
> > I'm sorry but I don't understand this question. There's nothing to do
> > to 'initialize mbus'. We can add/remove address decoding windows
> > whenever we want.
> 
> Well, we have to set up the windows at one point during the boot process,
> and we cannot access anything behind the windows until they are set up.
> 
> So if for instance the UART we use for DEBUG_LL is behind an MBUS window,
> we cannot easily change its address. My question is which devices fall
> into this category, if any.

Agree, the mbus driver should parse the DTB and construct all the
'static' windows, update the ranges, then populate the children. As is
usual in Linux the bus driver should ensure everything is setup for
the child nodes before their probe function is called.

PEX is a special case, since it has 'dynamic' windows.

> Does the internal registers window contain the MMIO space of the mbus
> device as well?

Yes, the registers the MBUS driver accesses are part of the internal
regs target.

> Do you expect that we always need just one window to map all the internal
> registers, or would there be a reason to split it up into multiple windows
> to reduce the amount of physical address space consumed?

Internal regs is special. There is a single dedicated aperture
register for it. There can be only be one mapping.

> I don't get it yet. I would assume that each PCIe port maps exactly to
> one address window, which solves the problem you describe above very
> nicely. We can do away with the fake "one range for memory" concept
> and just let the mbus driver pick any physical address when we enable
> or hot-plug one of the ports.

The Linux PCI core requires a single host bridge aperture, that is
what the ranges indicate. It is up to the PCI core to select the
address window(s) the PEX will use. When it does this it tells the PCI
driver, which tells the MBUS driver, which ultimately creates the
window. The address selection must be slaved to the PCI core, the MBUS
driver cannot operate autonomously.

This is an unavoidable consequence of merging all the PEX's into a
single root complex. If you recall this was required to support the 10
PEX case. Cross port address assignment is necessary to avoid address
space exhaustion.

> The ranges properties are required so a device driver can call of_iomap().
> Without it, there is no way to convert an address in mbus space into a
> CPU phys_addr_t.

Right. When the mbus driver populates the children nodes, the standard
OF code should be immediately able to get the correct physical address
through all normal OF means.

So, for every target the mbus driver either forces the mbus registers
to match the DT or dynamically assigns a physical address and updates
the DT to reflect that.

Jason

^ permalink raw reply	[flat|nested] 16+ messages in thread

* mvebu-mbus: defining a DT binding
  2013-04-05 17:07       ` Jason Gunthorpe
@ 2013-04-05 17:28         ` Arnd Bergmann
  2013-04-05 17:48           ` Jason Gunthorpe
  0 siblings, 1 reply; 16+ messages in thread
From: Arnd Bergmann @ 2013-04-05 17:28 UTC (permalink / raw)
  To: linux-arm-kernel

On Friday 05 April 2013, Jason Gunthorpe wrote:
> On Fri, Apr 05, 2013 at 04:36:56PM +0200, Arnd Bergmann wrote:
> > I think ideally the ranges property would be completely empty
> > (absent actually, but that is a detail) at boot, and the mbus driver
> > would fill the ranges property as needed. I can see two ways to
> > do that:
> 
> At a minimum it needs to have the special internal regs target. The
> mbus driver cannot relocate that memory, and must be told where the
> bootloader left it.

Ok, I see.

> Beyond that, things get muddled, IMHO. DT is being asked to do two
> slightly conflicting things - represent the address map from the
> bootloader and also provide enough information for Linux to
> dynamically reconfigure it.

Right.

> > a) The mbus driver does a for_each_child_of_node() loop to iterate
> > through its children and allocates a new physical address for each
> > "reg" property it finds on all devices that are not status="disabled".
> > Since the windows are all power-of-two pages in size, a simple
> > "first-fit" algorith should be just fine here. The definition of 
> > the mbus address space guarantees that the "reg" property has
> > all the information needed to do that mapping.
> 
> Certainly doable.. ranges would have to be parsed as well, and it is a
> bit complex, but very doable.

One small complication that I realized now is that the windows on the
PCIe bus must be contiguous and not have other windows in the middle
because of the way that PCI resource allocation works in Linux at
the moment.

This means the mbus driver would probably need to sort the windows
by size first and allocate space from one end for all non-PCIe devices
and then we can let the PCIe driver announce the remaining space as
available for other devices to the PCIe core.

> > b) We explicitly make all devices under mbus aware of the fact that
> > they they are on mbus, and export a function that they have to call
> > to map the window, like
> 
> IMHO, this is the nuclear option - it should be avoided. Existing
> general drivers like serial and MTD should not need special wrappers
> to work..

If the internal regs have to be a special case, most of the devices
just stay regular platform devices and keep working as before, we'd
just have to special-case the drivers that are not on the internal
register window.

> > Well, we have to set up the windows at one point during the boot process,
> > and we cannot access anything behind the windows until they are set up.
> > 
> > So if for instance the UART we use for DEBUG_LL is behind an MBUS window,
> > we cannot easily change its address. My question is which devices fall
> > into this category, if any.
> 
> Agree, the mbus driver should parse the DTB and construct all the
> 'static' windows, update the ranges, then populate the children. As is
> usual in Linux the bus driver should ensure everything is setup for
> the child nodes before their probe function is called.

We should be able to handle this very easily if we don't call the mbus
device itself compatible="simple-bus" because it is not. Instead we
the probe function of the mbus device can assign all the windows and
then add its children by calling of_platform_populate() on itself.

> > Does the internal registers window contain the MMIO space of the mbus
> > device as well?
> 
> Yes, the registers the MBUS driver accesses are part of the internal
> regs target.

ok.

> > Do you expect that we always need just one window to map all the internal
> > registers, or would there be a reason to split it up into multiple windows
> > to reduce the amount of physical address space consumed?
> 
> Internal regs is special. There is a single dedicated aperture
> register for it. There can be only be one mapping.

Ok, that simplifies the number of options we have, which is probably
a good thing.

> > I don't get it yet. I would assume that each PCIe port maps exactly to
> > one address window, which solves the problem you describe above very
> > nicely. We can do away with the fake "one range for memory" concept
> > and just let the mbus driver pick any physical address when we enable
> > or hot-plug one of the ports.
> 
> The Linux PCI core requires a single host bridge aperture, that is
> what the ranges indicate. It is up to the PCI core to select the
> address window(s) the PEX will use. When it does this it tells the PCI
> driver, which tells the MBUS driver, which ultimately creates the
> window. The address selection must be slaved to the PCI core, the MBUS
> driver cannot operate autonomously.
> 
> This is an unavoidable consequence of merging all the PEX's into a
> single root complex. If you recall this was required to support the 10
> PEX case. Cross port address assignment is necessary to avoid address
> space exhaustion.

Right, I remembered that already and wrote above that we need to leave
a single phys_addr_t range for all the PCIe ranges after assigning
everything else. However, I think the DT representation should still
reflect what the hardware looks like, meaning that the ranges property
of the root complex can have one static range for each port, translating
the 4GB address space of the port to the 4GB mbus window.

When the PCIe core actually gets around to enable the port and assign
a PCIe memory resource, we can use the information from the ranges
property and the PCIe memory aperture to set up the mbus window.

	Arnd

^ permalink raw reply	[flat|nested] 16+ messages in thread

* mvebu-mbus: defining a DT binding
  2013-04-05 17:28         ` Arnd Bergmann
@ 2013-04-05 17:48           ` Jason Gunthorpe
  2013-04-05 19:49             ` Arnd Bergmann
  0 siblings, 1 reply; 16+ messages in thread
From: Jason Gunthorpe @ 2013-04-05 17:48 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Apr 05, 2013 at 07:28:13PM +0200, Arnd Bergmann wrote:

> > The Linux PCI core requires a single host bridge aperture, that is
> > what the ranges indicate. It is up to the PCI core to select the
> > address window(s) the PEX will use. When it does this it tells the PCI
> > driver, which tells the MBUS driver, which ultimately creates the
> > window. The address selection must be slaved to the PCI core, the MBUS
> > driver cannot operate autonomously.
> > 
> > This is an unavoidable consequence of merging all the PEX's into a
> > single root complex. If you recall this was required to support the 10
> > PEX case. Cross port address assignment is necessary to avoid address
> > space exhaustion.
> 
> Right, I remembered that already and wrote above that we need to leave
> a single phys_addr_t range for all the PCIe ranges after assigning
> everything else. However, I think the DT representation should still
> reflect what the hardware looks like, meaning that the ranges property
> of the root complex can have one static range for each port, translating
> the 4GB address space of the port to the 4GB mbus window.

The DT representation reflects what the HW looks like 'from the view
point of the PCI-E specification' - which is appropriate since it is a
'device_type=pci' node and has special OF specifications governing its
behaviour.

PCI-E, and the OF PCI bindings want each port to be allocatable within
a 'host aperture', dynamically, based on need. Trying to statically
assign each port's address space in the DT is really struggling
against that :)

Further, there just isn't enough address space to do it. Example:

A system has 3GiB of low RAM, and should support a single VGA card
with a 256MiB BAR - plugged into any PEX port. It has about 512MiB of
low address space to allocate to the PCI host aperture and 10 ports.

It simply cannot be done statically. The kernel must go through all
the PEX's, find the VGA card, allocate 256MiB to that one PEX and then
allocate much smaller amounts to all others.

So, given all of this - can you write out an example PCI controller
DT binding that uses target id in ranges?

This is why I suggested to include the PEX target IDs as meta-data in
the PEX nodes, rather than trying to encode them in ranges.

Jason

^ permalink raw reply	[flat|nested] 16+ messages in thread

* mvebu-mbus: defining a DT binding
  2013-04-05 17:48           ` Jason Gunthorpe
@ 2013-04-05 19:49             ` Arnd Bergmann
  2013-04-05 20:36               ` Jason Gunthorpe
  0 siblings, 1 reply; 16+ messages in thread
From: Arnd Bergmann @ 2013-04-05 19:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Friday 05 April 2013, Jason Gunthorpe wrote:
> The DT representation reflects what the HW looks like 'from the view
> point of the PCI-E specification' - which is appropriate since it is a
> 'device_type=pci' node and has special OF specifications governing its
> behaviour.
> 
> PCI-E, and the OF PCI bindings want each port to be allocatable within
> a 'host aperture', dynamically, based on need. Trying to statically
> assign each port's address space in the DT is really struggling
> against that :)

I did not say anything about assigning the phys_addr_t range of
the port in DT, quite the contrary. What I meant is to describe
how a bus address on any of the PCIe ports is wired to the
mbus, which is really a 4GB space per port.

> Further, there just isn't enough address space to do it. Example:
> 
> A system has 3GiB of low RAM, and should support a single VGA card
> with a 256MiB BAR - plugged into any PEX port. It has about 512MiB of
> low address space to allocate to the PCI host aperture and 10 ports.
> 
> It simply cannot be done statically. The kernel must go through all
> the PEX's, find the VGA card, allocate 256MiB to that one PEX and then
> allocate much smaller amounts to all others.

Not what I meant.

> So, given all of this - can you write out an example PCI controller
> DT binding that uses target id in ranges?
> 
> This is why I suggested to include the PEX target IDs as meta-data in
> the PEX nodes, rather than trying to encode them in ranges.

I mean something like

/ {
	#address-cells = <1>;
	#size-cells = <1>;

	mbus {
		#address-cells = <2>;
		#size-cells = <1>;

		/* ignore the specific encoding here */

		ranges = <0x1 0 0xd0000000 0x100000>, /* internal regs */
			 <0xa 0xc0000000 0xc0000000 0x10000> /* PCI port a,
							already assigned */


		internal-bus {
			#address-cells = <1>;
			#size-cells = <0>;
			ranges = <0x1 0 0 0x100000>;
	
			...
		};


		pcie {
			#address-cells = <3>;
			#size-cells = <2>;

			ranges = <0x82000800 0 0 0xa 0 0x1 0>, # port a
				 <0x82001000 0 0 0xb 0 0x1 0>, # port b
				 <0x82001800 0 0 0xc 0 0x1 0>, # port c
				 <0x82002000 0 0 0xd 0 0x1 0>, # port d
				 <0x82002800 0 0 0xe 0 0x1 0>; # port e

			pci { /* port a at device 000800*/
				/* only decodes range 0xc0000000 to 0xc0010000 */
				ranges = <0x82000800 0 0xc0000000 0x82000800 0 0xc0000000 0 0x10000>;
			};
			pci { /* port b at device 001000 */
				/* or just translate everything, to be more sloppy ;-) */
				ranges;
			};
		};
	};
};

	Arnd

^ permalink raw reply	[flat|nested] 16+ messages in thread

* mvebu-mbus: defining a DT binding
  2013-04-05 19:49             ` Arnd Bergmann
@ 2013-04-05 20:36               ` Jason Gunthorpe
  2013-04-05 21:01                 ` Arnd Bergmann
  0 siblings, 1 reply; 16+ messages in thread
From: Jason Gunthorpe @ 2013-04-05 20:36 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Apr 05, 2013 at 09:49:33PM +0200, Arnd Bergmann wrote:

> > So, given all of this - can you write out an example PCI controller
> > DT binding that uses target id in ranges?
> > 
> > This is why I suggested to include the PEX target IDs as meta-data in
> > the PEX nodes, rather than trying to encode them in ranges.
> 
> I mean something like

Okay, I see now.. Bit more complex for the PCI driver.

Here is a version with more detail:

mbus {
  ranges = <MAPDEF_INTERNAL 0xf1000000 0x100000
            MAPDEF_PEX0     ......     0xFFFFFFFF
            MAPDEF_PEX0_IO  ......     0xFFFFFFFF
            MAPDEF_PEX1     ......     0xFFFFFFFF
            MAPDEF_PEX1_IO  ......     0xFFFFFFFF>

  pcie-controller {
     compatible = "marvell,armada-370-pcie";

     // Mandatory, we need to change to PCI 3dw addressing here
     ranges = <
          // For assigned-addresses
          0x82000000 MAPDEF_INTERNAL 0 0x100000
          // Non prefetchable memory
          0x82000000 MAPDEF_PEX0  MAPDEF_PEX0  0 0xFFFFFFFF
          0x82000000 MAPDEF_PEX1  MAPDEF_PEX1  0 0xFFFFFFFF
          // IO memory
          0x81000000 MAPDEF_PEX0_IO  MAPDEF_PEX0_IO  0 0xFFFFFFFF
          0x81000000 MAPDEF_PEX1_IO  MAPDEF_PEX1_IO  0 0xFFFFFFFF>

     pcie at 0,0 {
          reg = <0x0800 0 0 0 0>;
          /* Mandatory, the driver needs to parse this range to learn
	     the MAPDEF values. */
          ranges = <0x82000000 0 0  0x82000000 MAPDEF_PEX0  0xFFFFFFFF
                    0x81000000 0 0  0x81000000 MAPDEF_PEX0_IO 0xFFFFFFFF>;

          // The internal register block for this PEX
          assigned-addresses = <0x82000800 MAPDEF_INTERNAL + 0x40000  0 0x2000>,
     }
}

Notes
 - The PEX link cannot be encoded in the highest DWORD due to the OF
   PCI rules, lets just use the lower 2 dws instead. MAPDEP_X is a 2
   dword value.
 - The bridge ranges would be offset 0 length 4G-1 in the DT since
   the value is not known. However firmware could do PCI address
   assignment and fill in corrected values.
 - Although unnecessary for Linux, the PCI driver could fill in the bridge
   ranges to reflect the actual bridge setup.
 - The top level mbus ranges can reflect the address translation:

   MAPDEF_PEX0 + 0xC0000000   0xC0000000 0x10000 // Identity map MMIO
   MAPDEF_PEX0_IO             0xD0000000 0x10000 // Translate 0xD0000000 to IO 0
 - A full featured firmware could fill in the various ranges to
   represent a working PCI bus configuration

So the PCI driver would now have to dynamically determine on its own a
host bridge aperture based on available space in the resource tree,
this is done instead of parsing ranges. I guess we could have a
'linux,host-aperture' or something as a stop gap.

The driver should create each root bridge by pulling
 - reg = the device.fn of the config space
 - assigned-addresess = the PEX internal register block
 - ranges = The target ID's for the PEX

So there is no longer a need to assume in the DT binding that a
particular device.fn refers to a particular PEX.

If this is the way to go, I hope it can be a revision on top of the
existing PCI patch set?

Did I miss anything Arnd?

Jason

^ permalink raw reply	[flat|nested] 16+ messages in thread

* mvebu-mbus: defining a DT binding
  2013-04-05 20:36               ` Jason Gunthorpe
@ 2013-04-05 21:01                 ` Arnd Bergmann
  2013-04-05 21:21                   ` Jason Gunthorpe
  0 siblings, 1 reply; 16+ messages in thread
From: Arnd Bergmann @ 2013-04-05 21:01 UTC (permalink / raw)
  To: linux-arm-kernel

On Friday 05 April 2013, Jason Gunthorpe wrote:
> On Fri, Apr 05, 2013 at 09:49:33PM +0200, Arnd Bergmann wrote:

> Notes
>  - The PEX link cannot be encoded in the highest DWORD due to the OF
>    PCI rules, lets just use the lower 2 dws instead. MAPDEP_X is a 2
>    dword value.

Ok, I wondered about whether it's allowed but then concluded that it
was ok for the root bridge but not for P2P bridges. Your solution
nicely sidesteps the problem.

>  - The bridge ranges would be offset 0 length 4G-1 in the DT since
>    the value is not known. However firmware could do PCI address
>    assignment and fill in corrected values.

I don't undestand this part. It would make the topmost byte in the
4GB bus space unadressable, which seems strange. Why can't we use
the entire 4GB? Maybe we should leave at least a page?

>  - Although unnecessary for Linux, the PCI driver could fill in the bridge
>    ranges to reflect the actual bridge setup.

Correct. The IEEE PCI binding talks about this, and it seems reasonable
for actual OF implementations, but I think we can be a little sloppy
here.

>  - The top level mbus ranges can reflect the address translation:
> 
>    MAPDEF_PEX0 + 0xC0000000   0xC0000000 0x10000 // Identity map MMIO
>    MAPDEF_PEX0_IO             0xD0000000 0x10000 // Translate 0xD0000000 to IO 0

Right.

>  - A full featured firmware could fill in the various ranges to
>    represent a working PCI bus configuration

Yes, although Linux normally doesn't expect the firmware to do that. On
PowerPC, we support both ways, either trusting the firmware to completely
probe the PCI bus and fill all the properties, or doing everything
ourselves. Given that Linux already has to do it for PCIe hotplug,
there is little value in the former unless you run under a hypervisor
that does not let you reconfigure the PCI resources (e.g. some
versions of IBM pSeries).

> So the PCI driver would now have to dynamically determine on its own a
> host bridge aperture based on available space in the resource tree,
> this is done instead of parsing ranges. I guess we could have a
> 'linux,host-aperture' or something as a stop gap.

The PCI driver already has to communicate with the MBUS driver, so
it can just ask MBUS to return the largest unassigned continugous 
physical address range.

> The driver should create each root bridge by pulling
>  - reg = the device.fn of the config space
>  - assigned-addresess = the PEX internal register block
>  - ranges = The target ID's for the PEX
> 
> So there is no longer a need to assume in the DT binding that a
> particular device.fn refers to a particular PEX.
> 
> If this is the way to go, I hope it can be a revision on top of the
> existing PCI patch set?

Sounds good to me.

> Did I miss anything Arnd?

The only part I'm unsure about is the question above about the missing
byte in the 4GB address space.

	Arnd

^ permalink raw reply	[flat|nested] 16+ messages in thread

* mvebu-mbus: defining a DT binding
  2013-04-05 21:01                 ` Arnd Bergmann
@ 2013-04-05 21:21                   ` Jason Gunthorpe
  2013-04-06  8:39                     ` Arnd Bergmann
  0 siblings, 1 reply; 16+ messages in thread
From: Jason Gunthorpe @ 2013-04-05 21:21 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Apr 05, 2013 at 11:01:27PM +0200, Arnd Bergmann wrote:

> >  - The bridge ranges would be offset 0 length 4G-1 in the DT since
> >    the value is not known. However firmware could do PCI address
> >    assignment and fill in corrected values.
> 
> I don't undestand this part. It would make the topmost byte in the
> 4GB bus space unadressable, which seems strange. Why can't we use
> the entire 4GB? Maybe we should leave at least a page?

Oh, sorry, that was just a mostly arbitary hacky choice to fit within
1 size cell. Using 2 size cells requires an upgrade to skeleton64 for
all mvebu platforms, eg kirkwood.

Realistically this size should never be used so it doesn't matter if
it is 4G or 4G-1 - though obviously 4G is preferred in cases where
size cells is 2. :)

> >  - A full featured firmware could fill in the various ranges to
> >    represent a working PCI bus configuration
> 
> Yes, although Linux normally doesn't expect the firmware to do that. On
> PowerPC, we support both ways, either trusting the firmware to completely
> probe the PCI bus and fill all the properties, or doing everything
> ourselves. Given that Linux already has to do it for PCIe hotplug,
> there is little value in the former unless you run under a hypervisor
> that does not let you reconfigure the PCI resources (e.g. some
> versions of IBM pSeries).

Well, x86 is sort of half and half. If Linux is happy with the
firmware values it uses them, otherwise things can change. Hotplug
tries to work within those values in some cases.

There is a real need for this, any time the firmware is re-entered it
could touch any of the address map. Moving things around could break
that. On x86 there are re-entry points in AHCI, EFI, and system
management mode.

ARM seems to be developing PSCI which is a similar firmware re-entry
point, and 'secure mode' seems to be another (???)

So.. I don't think it matters today for mvebu, but something to think
about - shuffling the firmware's address map could be dangerous.

> > So the PCI driver would now have to dynamically determine on its own a
> > host bridge aperture based on available space in the resource tree,
> > this is done instead of parsing ranges. I guess we could have a
> > 'linux,host-aperture' or something as a stop gap.
> 
> The PCI driver already has to communicate with the MBUS driver, so
> it can just ask MBUS to return the largest unassigned continugous 
> physical address range.

Right, this information should be in the resource tracking stuff as
well. (/proc/iomem should work properly, for instance)

Jason

^ permalink raw reply	[flat|nested] 16+ messages in thread

* mvebu-mbus: defining a DT binding
  2013-04-05 21:21                   ` Jason Gunthorpe
@ 2013-04-06  8:39                     ` Arnd Bergmann
  2013-04-08 17:03                       ` Jason Gunthorpe
  0 siblings, 1 reply; 16+ messages in thread
From: Arnd Bergmann @ 2013-04-06  8:39 UTC (permalink / raw)
  To: linux-arm-kernel

On Friday 05 April 2013, Jason Gunthorpe wrote:
> On Fri, Apr 05, 2013 at 11:01:27PM +0200, Arnd Bergmann wrote:
> 
> > >  - The bridge ranges would be offset 0 length 4G-1 in the DT since
> > >    the value is not known. However firmware could do PCI address
> > >    assignment and fill in corrected values.
> > 
> > I don't undestand this part. It would make the topmost byte in the
> > 4GB bus space unadressable, which seems strange. Why can't we use
> > the entire 4GB? Maybe we should leave at least a page?
> 
> Oh, sorry, that was just a mostly arbitary hacky choice to fit within
> 1 size cell. Using 2 size cells requires an upgrade to skeleton64 for
> all mvebu platforms, eg kirkwood.
> 
> Realistically this size should never be used so it doesn't matter if
> it is 4G or 4G-1 - though obviously 4G is preferred in cases where
> size cells is 2. :)

But the PCI bus already is required to have #size-cells=<2>, and
we would only use this in the ranges property of the PCI node, right?

> So.. I don't think it matters today for mvebu, but something to think
> about - shuffling the firmware's address map could be dangerous.

I would at least expect a secure-mode firmware to prevent the OS
from changing any mappings that the firmware relies on, but I see
what you mean.

	Arnd

^ permalink raw reply	[flat|nested] 16+ messages in thread

* mvebu-mbus: defining a DT binding
  2013-04-06  8:39                     ` Arnd Bergmann
@ 2013-04-08 17:03                       ` Jason Gunthorpe
  0 siblings, 0 replies; 16+ messages in thread
From: Jason Gunthorpe @ 2013-04-08 17:03 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Apr 06, 2013 at 10:39:56AM +0200, Arnd Bergmann wrote:

> > Realistically this size should never be used so it doesn't matter if
> > it is 4G or 4G-1 - though obviously 4G is preferred in cases where
> > size cells is 2. :)
> 
> But the PCI bus already is required to have #size-cells=<2>, and
> we would only use this in the ranges property of the PCI node, right?

Yes, my bad, you are correct.

> > So.. I don't think it matters today for mvebu, but something to think
> > about - shuffling the firmware's address map could be dangerous.
> 
> I would at least expect a secure-mode firmware to prevent the OS
> from changing any mappings that the firmware relies on, but I see
> what you mean.

Right.

Jason

^ permalink raw reply	[flat|nested] 16+ messages in thread

* mvebu-mbus: defining a DT binding
  2013-04-05 13:02 mvebu-mbus: defining a DT binding Thomas Petazzoni
  2013-04-05 13:17 ` Arnd Bergmann
@ 2013-04-05 16:27 ` Jason Gunthorpe
  2013-04-05 16:58   ` Arnd Bergmann
  1 sibling, 1 reply; 16+ messages in thread
From: Jason Gunthorpe @ 2013-04-05 16:27 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Apr 05, 2013 at 03:02:00PM +0200, Thomas Petazzoni wrote:

>  * The address decoding windows are in fact defined by the ranges =
>    <...> property of the mbus node. Here, in the example provided by
>    Jason, this ranges = <...> property is part of the SoC-level .dtsi.
>    However, some boards will have to add some board-specific windows
>    (adjusted to the size of their NOR, or FPGA, for example).

Right. The things Linux does to try and re-use parts of the DT are
nice, but I don't think they should strongly guide the design of the
binding.. Someone recently said that the DT is designed to be machine
parseable not human readable..

In any event, the 'default' mbus window locations in the static DTB
should match the configuration the bootloader sets when it boots
Linux. Ideally some basic level of functionality should be possible
without the dynamic mbus driver..

So to me, this is already a guide that this might have to be board
specific anyhow.

>  * I am not sure to understand why the nand, crypto and pcie-controller
>    nodes are outside of the internal registers. Well, I understand it's
>    because those devices need a special address decoding window. But it
>    sounds strange because those devices also have internal registers
>    (which is why Jason used 'MAPDEF_INTERNAL + offset' in the reg
>    property).

This is a arbitary choice, of course. Arnd: The issue is how do you
refer to two MBUS windows in a single device node.

>From a design perspective, the chip has an internal regs blocks. I
modeled that block directly with a bus node (internal_regs at f1000000).
It has a single ranges, that is 0 based. This makes the 'regs' of the
children exactly match the datasheet.  Everything that couldn't fit in
there I treated as an exception, and placed directly under mbus
instead.

The main alternative made less sense to me:

mbus {
  ranges = <MAPDEF_INTERNAL 0xf1000000 0x100000
 [..]

  everything@?? {
       ranges = <0x00000000 MAPDEF_INTERNAL 0x100000
                 0x10000000 MAPDEF_CRYPTO 0x800>
  }
}

The 'everything' node is very muddled to me.

>  * Regarding the PCIe binding, Jason Gunthorpe suggests the use of a
>    'marvell,mbus-targets' attribute so that the PCIe driver knows what
>    target/attribute values to use to create the window. Currently, the
>    PCIe driver uses a name (like "pcie0.3") that it gives to the
>    mvebu-mbus driver, which then translates this name to the real
>    target/attribute values using per-SoC tables that associate names
>    (pcie0.3) with values (say 0x4 / 0x42).

Right, my intention with this was to eliminate the tables from the
mbus driver and move them into the DT. It made no sense to use the
MAPDEF stuff for the ranges, but then require driver hard-wiring for
PEX.

FWIW, I belive this should all be done on top of the nice stuff from
Steven Warren that enables the preprocessor for dts. That is why I
wrote the sample in preprocessor language.

Regards,
Jason

^ permalink raw reply	[flat|nested] 16+ messages in thread

* mvebu-mbus: defining a DT binding
  2013-04-05 16:27 ` Jason Gunthorpe
@ 2013-04-05 16:58   ` Arnd Bergmann
  2013-04-05 17:29     ` Jason Gunthorpe
  0 siblings, 1 reply; 16+ messages in thread
From: Arnd Bergmann @ 2013-04-05 16:58 UTC (permalink / raw)
  To: linux-arm-kernel

On Friday 05 April 2013, Jason Gunthorpe wrote:
> On Fri, Apr 05, 2013 at 03:02:00PM +0200, Thomas Petazzoni wrote:
> 
> >  * The address decoding windows are in fact defined by the ranges =
> >    <...> property of the mbus node. Here, in the example provided by
> >    Jason, this ranges = <...> property is part of the SoC-level .dtsi.
> >    However, some boards will have to add some board-specific windows
> >    (adjusted to the size of their NOR, or FPGA, for example).
> 
> Right. The things Linux does to try and re-use parts of the DT are
> nice, but I don't think they should strongly guide the design of the
> binding.. Someone recently said that the DT is designed to be machine
> parseable not human readable..
> 
> In any event, the 'default' mbus window locations in the static DTB
> should match the configuration the bootloader sets when it boots
> Linux. Ideally some basic level of functionality should be possible
> without the dynamic mbus driver..
> 
> So to me, this is already a guide that this might have to be board
> specific anyhow.

I mentioned three options before, what you describe is one of
them:

a) make the ranges in the DT refer to what the hardware is set to,
as you explain above. This makes it easy to run an OS without an
mbus driver, but somewhat harder to reassign the windows at run-time

b) set the ranges to the windows we want to have in Linux. This is closer
to what we have at the moment, where we basically configure mbus from
a static configuration, and allow the windows that are to be dynamic
(e.g. PCIe) to be left out of this.

c) Try to keep the ranges property empty at boot time, except possibly
the internal register window, and find the optimum address map at
boot time based on which devices are enabled.

> >  * I am not sure to understand why the nand, crypto and pcie-controller
> >    nodes are outside of the internal registers. Well, I understand it's
> >    because those devices need a special address decoding window. But it
> >    sounds strange because those devices also have internal registers
> >    (which is why Jason used 'MAPDEF_INTERNAL + offset' in the reg
> >    property).
> 
> This is a arbitary choice, of course. Arnd: The issue is how do you
> refer to two MBUS windows in a single device node.
> 
> From a design perspective, the chip has an internal regs blocks. I
> modeled that block directly with a bus node (internal_regs at f1000000).
> It has a single ranges, that is 0 based. This makes the 'regs' of the
> children exactly match the datasheet.  Everything that couldn't fit in
> there I treated as an exception, and placed directly under mbus
> instead.

Makes sense.
> The main alternative made less sense to me:
> 
> mbus {
>   ranges = <MAPDEF_INTERNAL 0xf1000000 0x100000
>  [..]
> 
>   everything@?? {
>        ranges = <0x00000000 MAPDEF_INTERNAL 0x100000
>                  0x10000000 MAPDEF_CRYPTO 0x800>
>   }
> }
> 
> The 'everything' node is very muddled to me.

Yes, that would neither clarify it to the reader nor actually
reflect what the hardware does.

	Arnd

^ permalink raw reply	[flat|nested] 16+ messages in thread

* mvebu-mbus: defining a DT binding
  2013-04-05 16:58   ` Arnd Bergmann
@ 2013-04-05 17:29     ` Jason Gunthorpe
  0 siblings, 0 replies; 16+ messages in thread
From: Jason Gunthorpe @ 2013-04-05 17:29 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Apr 05, 2013 at 06:58:05PM +0200, Arnd Bergmann wrote:

> > So to me, this is already a guide that this might have to be board
> > specific anyhow.
> 
> I mentioned three options before, what you describe is one of
> them:

There is also 

d) The DT contains the bootloader address map, but the mbus driver
   discards that and fully reassigns everything.

e) The DT contains the bootloader address map, but some targets may be
   missing. The address map is kept, and the missing targets are
   dynamically allocated.

> a) make the ranges in the DT refer to what the hardware is set to,
> as you explain above. This makes it easy to run an OS without an
> mbus driver, but somewhat harder to reassign the windows at run-time

I don't think it makes things harder, it is straightforward for the
driver to wipe and rewrite the ranges property based on actual
programming.

> b) set the ranges to the windows we want to have in Linux. This is closer
> to what we have at the moment, where we basically configure mbus from
> a static configuration, and allow the windows that are to be dynamic
> (e.g. PCIe) to be left out of this.

Keep in mind that in all cases the mbus driver will zero all the
windows and write its own information. This is necessary to ensure any
left over PEX windows are blown away, for instance. So from a
driver-design standpoint a and b are identical.

The only difference is what do you put in the board
file. Pragmatically it won't matter to many people, Both will work if
the kernel has the mbus driver.

> c) Try to keep the ranges property empty at boot time, except possibly
> the internal register window, and find the optimum address map at
> boot time based on which devices are enabled.

I think d or e is a better choice, since we marry the best feature of
a, and the best function of c.

For working on the driver a/b is certainly the simplest first
step. Adding future capability to go to d or e won't require a DT
change.

Jason

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2013-04-08 17:03 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-04-05 13:02 mvebu-mbus: defining a DT binding Thomas Petazzoni
2013-04-05 13:17 ` Arnd Bergmann
2013-04-05 13:51   ` Thomas Petazzoni
2013-04-05 14:36     ` Arnd Bergmann
2013-04-05 17:07       ` Jason Gunthorpe
2013-04-05 17:28         ` Arnd Bergmann
2013-04-05 17:48           ` Jason Gunthorpe
2013-04-05 19:49             ` Arnd Bergmann
2013-04-05 20:36               ` Jason Gunthorpe
2013-04-05 21:01                 ` Arnd Bergmann
2013-04-05 21:21                   ` Jason Gunthorpe
2013-04-06  8:39                     ` Arnd Bergmann
2013-04-08 17:03                       ` Jason Gunthorpe
2013-04-05 16:27 ` Jason Gunthorpe
2013-04-05 16:58   ` Arnd Bergmann
2013-04-05 17:29     ` Jason Gunthorpe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).