linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* pci_ioremap_set_mem_type(), pci_remap_iospace()
@ 2016-04-27 22:58 Bjorn Helgaas
  2016-04-28  7:21 ` Thomas Petazzoni
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Bjorn Helgaas @ 2016-04-27 22:58 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Thomas & Liviu,

You added pci_ioremap_set_mem_type(int mem_type) with 1c8c3cf0b523
("ARM: 8060/1: mm: allow sub-architectures to override PCI I/O memory
type").

I see this patch on the list: "[PATCH 3/3] ARM: mvebu: implement
L2/PCIe deadlock workaround"
(http://lists.infradead.org/pipermail/linux-arm-kernel/2014-March/242784.html)
that does call pci_ioremap_set_mem_type(), but it doesn't look like
that patch ever got merged.

Is it still useful to have pci_ioremap_set_mem_type() even though
nobody calls it?

I'm looking at the issue of how we ioremap memory-mapped ioport
spaces, and pci_ioremap_mem_type is currently used in the arm-specific
pci_ioremap_io():

  int pci_ioremap_io(unsigned int offset, phys_addr_t phys_addr)
  {
    BUG_ON(offset + SZ_64K > IO_SPACE_LIMIT);

    return ioremap_page_range(PCI_IO_VIRT_BASE + offset,
                              PCI_IO_VIRT_BASE + offset + SZ_64K,
                              phys_addr,
                              __pgprot(get_mem_type(pci_ioremap_mem_type)->prot_pte));
  }

Also, what about pci_remap_iospace(), added by 8b921acfeffd ("PCI: Add
pci_remap_iospace() to map bus I/O resources")?

  int __weak pci_remap_iospace(const struct resource *res, phys_addr_t phys_addr)
  {
  #if defined(PCI_IOBASE) && defined(CONFIG_MMU)
    unsigned long vaddr = (unsigned long)PCI_IOBASE + res->start;

    if (!(res->flags & IORESOURCE_IO))
      return -EINVAL;

    if (res->end > IO_SPACE_LIMIT)
      return -EINVAL;

    return ioremap_page_range(vaddr, vaddr + resource_size(res), phys_addr,
                              pgprot_device(PAGE_KERNEL));
    ...

pci_remap_iospace() is generic code from drivers/pci/pci.c.  Here we
also call ioremap_page_range(), but we use pgprot_device(PAGE_KERNEL)
(not __pgprot(get_mem_type(pci_ioremap_mem_type)->prot_pte)).

It seems like these two calls of ioremap_page_range() should use the
same prot argument.  Looking at __arm_ioremap_pfn_caller() makes me
suspect that pci_remap_iospace() is not safe on arm.

Bjorn

^ permalink raw reply	[flat|nested] 14+ messages in thread

* pci_ioremap_set_mem_type(), pci_remap_iospace()
  2016-04-27 22:58 pci_ioremap_set_mem_type(), pci_remap_iospace() Bjorn Helgaas
@ 2016-04-28  7:21 ` Thomas Petazzoni
  2016-04-28 12:06 ` Liviu Dudau
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 14+ messages in thread
From: Thomas Petazzoni @ 2016-04-28  7:21 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

On Wed, 27 Apr 2016 17:58:27 -0500, Bjorn Helgaas wrote:

> You added pci_ioremap_set_mem_type(int mem_type) with 1c8c3cf0b523
> ("ARM: 8060/1: mm: allow sub-architectures to override PCI I/O memory
> type").
> 
> I see this patch on the list: "[PATCH 3/3] ARM: mvebu: implement
> L2/PCIe deadlock workaround"
> (http://lists.infradead.org/pipermail/linux-arm-kernel/2014-March/242784.html)
> that does call pci_ioremap_set_mem_type(), but it doesn't look like
> that patch ever got merged.
> 
> Is it still useful to have pci_ioremap_set_mem_type() even though
> nobody calls it?

I am actually about to send a patch that makes use of it, I discussed
it with Arnd a few days ago / last week.

Essentially the story was the following. On Cortex-A9 based Marvell
SoCs, when HW I/O coherency is enabled, you need all non-RAM space to
be mapped strongly ordered. To this effect, I submitted this patch
series that introduces pci_ioremap_mem_type() to allow
platform-specific code to override the memory type used to map PCI I/O
space.

The reaction of Arnd after this patch series was to suggest that maybe
I should just change the memory type for all platforms, so I sent
another patch doing that.

But in the end, Russell merged the implementation of
pci_ioremap_mem_type() from the previous version of the patch series.
And the remaining patch fell through the cracks. Since PCI I/O is
rarely used these days, it wasn't noticed.

But right now, I have this in my stack of patches to be sent:

diff --git a/arch/arm/mach-mvebu/coherency.c b/arch/arm/mach-mvebu/coherency.c
index 7e989d6..2d1f06d 100644
--- a/arch/arm/mach-mvebu/coherency.c
+++ b/arch/arm/mach-mvebu/coherency.c
@@ -162,22 +162,16 @@ exit:
 }
 
 /*
- * This ioremap hook is used on Armada 375/38x to ensure that PCIe
- * memory areas are mapped as MT_UNCACHED instead of MT_DEVICE. This
- * is needed as a workaround for a deadlock issue between the PCIe
- * interface and the cache controller.
+ * This ioremap hook is used on Armada 375/38x to ensure that all MMIO
+ * areas are mapped as MT_UNCACHED instead of MT_DEVICE. This is
+ * needed as a workaround for a deadlock issue occurring when HW I/O
+ * coherency is used.
  */
 static void __iomem *
-armada_pcie_wa_ioremap_caller(phys_addr_t phys_addr, size_t size,
-                             unsigned int mtype, void *caller)
+armada_wa_ioremap_caller(phys_addr_t phys_addr, size_t size,
+                        unsigned int mtype, void *caller)
 {
-       struct resource pcie_mem;
-
-       mvebu_mbus_get_pcie_mem_aperture(&pcie_mem);
-
-       if (pcie_mem.start <= phys_addr && (phys_addr + size) <= pcie_mem.end)
-               mtype = MT_UNCACHED;
-
+       mtype = MT_UNCACHED;
        return __arm_ioremap_caller(phys_addr, size, mtype, caller);
 }
 
@@ -186,7 +180,8 @@ static void __init armada_375_380_coherency_init(struct device_node *np)
        struct device_node *cache_dn;
 
        coherency_cpu_base = of_iomap(np, 0);
-       arch_ioremap_caller = armada_pcie_wa_ioremap_caller;
+       arch_ioremap_caller = armada_wa_ioremap_caller;
+       pci_ioremap_set_mem_type(MT_UNCACHED);
 
        /*
         * We should switch the PL310 to I/O coherency mode only if

> I'm looking at the issue of how we ioremap memory-mapped ioport
> spaces, and pci_ioremap_mem_type is currently used in the arm-specific
> pci_ioremap_io():
> 
>   int pci_ioremap_io(unsigned int offset, phys_addr_t phys_addr)
>   {
>     BUG_ON(offset + SZ_64K > IO_SPACE_LIMIT);
> 
>     return ioremap_page_range(PCI_IO_VIRT_BASE + offset,
>                               PCI_IO_VIRT_BASE + offset + SZ_64K,
>                               phys_addr,
>                               __pgprot(get_mem_type(pci_ioremap_mem_type)->prot_pte));
>   }
> 
> Also, what about pci_remap_iospace(), added by 8b921acfeffd ("PCI: Add
> pci_remap_iospace() to map bus I/O resources")?
> 
>   int __weak pci_remap_iospace(const struct resource *res, phys_addr_t phys_addr)
>   {
>   #if defined(PCI_IOBASE) && defined(CONFIG_MMU)
>     unsigned long vaddr = (unsigned long)PCI_IOBASE + res->start;
> 
>     if (!(res->flags & IORESOURCE_IO))
>       return -EINVAL;
> 
>     if (res->end > IO_SPACE_LIMIT)
>       return -EINVAL;
> 
>     return ioremap_page_range(vaddr, vaddr + resource_size(res), phys_addr,
>                               pgprot_device(PAGE_KERNEL));
>     ...

Yes, I was also surprised to see two functions doing pretty much the
same, this probably needs to be refactored. There are 11 users of
pci_ioremap_io() (6 of them being Marvell related), and
pci_ioremap_io() is provided by ARM specific code.

On the other hand, pci_remap_iospace() is provided by
architecture-independent code, and use in 6 PCI controller drivers.

What about:

 1/ Providing an alternate version of pci_remap_iospace() that allows
    to pass the memory attribute to be used.

 2/ Convert all users of pci_ioremap_io() to pci_remap_iospace().

 3/ Get rid of pci_ioremap_io().

Best regards,

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* pci_ioremap_set_mem_type(), pci_remap_iospace()
  2016-04-27 22:58 pci_ioremap_set_mem_type(), pci_remap_iospace() Bjorn Helgaas
  2016-04-28  7:21 ` Thomas Petazzoni
@ 2016-04-28 12:06 ` Liviu Dudau
  2016-04-28 12:13   ` Thomas Petazzoni
  2016-04-28 14:19 ` Bjorn Helgaas
  2016-04-28 14:33 ` Bjorn Helgaas
  3 siblings, 1 reply; 14+ messages in thread
From: Liviu Dudau @ 2016-04-28 12:06 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Apr 27, 2016 at 05:58:27PM -0500, Bjorn Helgaas wrote:
> Hi Thomas & Liviu,

Hi Bjorn,

> 
> You added pci_ioremap_set_mem_type(int mem_type) with 1c8c3cf0b523
> ("ARM: 8060/1: mm: allow sub-architectures to override PCI I/O memory
> type").
> 
> I see this patch on the list: "[PATCH 3/3] ARM: mvebu: implement
> L2/PCIe deadlock workaround"
> (http://lists.infradead.org/pipermail/linux-arm-kernel/2014-March/242784.html)
> that does call pci_ioremap_set_mem_type(), but it doesn't look like
> that patch ever got merged.
> 
> Is it still useful to have pci_ioremap_set_mem_type() even though
> nobody calls it?
> 
> I'm looking at the issue of how we ioremap memory-mapped ioport
> spaces, and pci_ioremap_mem_type is currently used in the arm-specific
> pci_ioremap_io():
> 
>   int pci_ioremap_io(unsigned int offset, phys_addr_t phys_addr)
>   {
>     BUG_ON(offset + SZ_64K > IO_SPACE_LIMIT);
> 
>     return ioremap_page_range(PCI_IO_VIRT_BASE + offset,
>                               PCI_IO_VIRT_BASE + offset + SZ_64K,
>                               phys_addr,
>                               __pgprot(get_mem_type(pci_ioremap_mem_type)->prot_pte));
>   }
> 
> Also, what about pci_remap_iospace(), added by 8b921acfeffd ("PCI: Add
> pci_remap_iospace() to map bus I/O resources")?
> 
>   int __weak pci_remap_iospace(const struct resource *res, phys_addr_t phys_addr)
>   {
>   #if defined(PCI_IOBASE) && defined(CONFIG_MMU)
>     unsigned long vaddr = (unsigned long)PCI_IOBASE + res->start;
> 
>     if (!(res->flags & IORESOURCE_IO))
>       return -EINVAL;
> 
>     if (res->end > IO_SPACE_LIMIT)
>       return -EINVAL;
> 
>     return ioremap_page_range(vaddr, vaddr + resource_size(res), phys_addr,
>                               pgprot_device(PAGE_KERNEL));
>     ...
> 
> pci_remap_iospace() is generic code from drivers/pci/pci.c.  Here we
> also call ioremap_page_range(), but we use pgprot_device(PAGE_KERNEL)
> (not __pgprot(get_mem_type(pci_ioremap_mem_type)->prot_pte)).

The pci_ioremap_mem_type variable is an arch/arm thing and not generic.
I chose pgprot_device(PAGE_KERNEL) as the more generic way of describing
device type memory mapping for a kernel page, which is arguably the correct
description for PCI MMIO addresses. If a platform needs special treatment for
the mapping then they can redefine pgprot_device to be something other than
pgprot_uncached.

> 
> It seems like these two calls of ioremap_page_range() should use the
> same prot argument.  

Except that, according to Thomas, he wants to use custom prot attributes to work
around bugs in HW.

> Looking at __arm_ioremap_pfn_caller() makes me
> suspect that pci_remap_iospace() is not safe on arm.

Hmm, not sure how you reached that conclusion. __arm_ioremap_pfn_caller()
does call in the end ioremap_page_range() for ARMv7 || ARM_LPAE, which is
what pci_remap_iospace() does as well. Beside the possibly different pgprot_t
values, what would make it unsafe for arm?

I am OK with the suggestion that Thomas has to add a parameter to
pci_remap_iospace() to pass on the pgprot_t value one wants and get rid of
pci_remap_io(), but I would suggest first to him to convert the ARMADA XP
platform to generic PCI code and see if it doesn't work OK by default. We
(well, Lorenzo driven nowadays) are pushing in that direction for a while now.

Best regards,
Liviu

> 
> Bjorn
> 

-- 
====================
| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---------------
    ?\_(?)_/?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* pci_ioremap_set_mem_type(), pci_remap_iospace()
  2016-04-28 12:06 ` Liviu Dudau
@ 2016-04-28 12:13   ` Thomas Petazzoni
  2016-04-28 13:02     ` Liviu Dudau
  0 siblings, 1 reply; 14+ messages in thread
From: Thomas Petazzoni @ 2016-04-28 12:13 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

On Thu, 28 Apr 2016 13:06:24 +0100, Liviu Dudau wrote:

> I am OK with the suggestion that Thomas has to add a parameter to
> pci_remap_iospace() to pass on the pgprot_t value one wants and get rid of
> pci_remap_io(), but I would suggest first to him to convert the ARMADA XP
> platform to generic PCI code and see if it doesn't work OK by default. We
> (well, Lorenzo driven nowadays) are pushing in that direction for a while now.

Armada XP itself is not affected by the HW issue that requires use to
use strongly-ordered mappings, it's only the Cortex-A9 based SoC, i.e
Armada 375, 38x and 39x.

That being said, could you point to me to which bits of the generic PCI
code I should convert our PCI support to? I'd be happy to take a look.

Thanks!

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* pci_ioremap_set_mem_type(), pci_remap_iospace()
  2016-04-28 12:13   ` Thomas Petazzoni
@ 2016-04-28 13:02     ` Liviu Dudau
  2016-04-28 13:12       ` Thomas Petazzoni
  0 siblings, 1 reply; 14+ messages in thread
From: Liviu Dudau @ 2016-04-28 13:02 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Apr 28, 2016 at 02:13:36PM +0200, Thomas Petazzoni wrote:
> Hello,
> 
> On Thu, 28 Apr 2016 13:06:24 +0100, Liviu Dudau wrote:
> 
> > I am OK with the suggestion that Thomas has to add a parameter to
> > pci_remap_iospace() to pass on the pgprot_t value one wants and get rid of
> > pci_remap_io(), but I would suggest first to him to convert the ARMADA XP
> > platform to generic PCI code and see if it doesn't work OK by default. We
> > (well, Lorenzo driven nowadays) are pushing in that direction for a while now.
> 
> Armada XP itself is not affected by the HW issue that requires use to
> use strongly-ordered mappings, it's only the Cortex-A9 based SoC, i.e
> Armada 375, 38x and 39x.

Sorry, I blame my unfamiliarity with the Armada family of devices when I
said Armada XP (I know the name better than any other).

> 
> That being said, could you point to me to which bits of the generic PCI
> code I should convert our PCI support to? I'd be happy to take a look.

Hmm, looking at the DT bindings there seem to be a lot of custom stuff in there.
I would start with trying to see if you can replace the custom parsing of
ranges with the generic of_pci_range_to_resource() and then look at the
pcie-designware.c how they got rid of the pci_common_init_dev() and the
need to use hw_pci structure. You want to end up with calling
of_pci_get_host_bridge_resources() to get back your list of MEM and IO
resources (it parses the bus ranges as well) and then use those to map IO
space and start the root bus scanning.

Best regards,
Liviu

> 
> Thanks!
> 
> Thomas
> -- 
> Thomas Petazzoni, CTO, Free Electrons
> Embedded Linux, Kernel and Android engineering
> http://free-electrons.com
> 

-- 
====================
| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---------------
    ?\_(?)_/?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* pci_ioremap_set_mem_type(), pci_remap_iospace()
  2016-04-28 13:02     ` Liviu Dudau
@ 2016-04-28 13:12       ` Thomas Petazzoni
  2016-04-28 14:41         ` Liviu Dudau
  0 siblings, 1 reply; 14+ messages in thread
From: Thomas Petazzoni @ 2016-04-28 13:12 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

On Thu, 28 Apr 2016 14:02:28 +0100, Liviu Dudau wrote:

> > Armada XP itself is not affected by the HW issue that requires use to
> > use strongly-ordered mappings, it's only the Cortex-A9 based SoC, i.e
> > Armada 375, 38x and 39x.
> 
> Sorry, I blame my unfamiliarity with the Armada family of devices when I
> said Armada XP (I know the name better than any other).

No problem. It is impossible to understand the fine details of all SoC
families supported in the kernel. I was making this statement only for
the sake of precision.

> > That being said, could you point to me to which bits of the generic PCI
> > code I should convert our PCI support to? I'd be happy to take a look.
> 
> Hmm, looking at the DT bindings there seem to be a lot of custom stuff in there.
> I would start with trying to see if you can replace the custom parsing of
> ranges with the generic of_pci_range_to_resource() and then look at the
> pcie-designware.c how they got rid of the pci_common_init_dev() and the
> need to use hw_pci structure. You want to end up with calling
> of_pci_get_host_bridge_resources() to get back your list of MEM and IO
> resources (it parses the bus ranges as well) and then use those to map IO
> space and start the root bus scanning.

I will have a look, but there is clearly one thing that is not
possible: parsing the list of MEM and IO resources is not sufficient to
map the IO space.

With the pci-mvebu driver, all the MEM and IO mappings are dynamic. The
device tree does *not* contain the addresses at which the MEM and IO
mappings will be done. Indeed, we have too many PCIe interfaces and too
few MBus windows and physical address space to map everything
statically. So we have a logic that emulates a PCI bridge, for which we
trap the read/write accesses made by the kernel when scanning the PCI
busses, and we use what's written in the configuration space of the
emulated PCI bridge to on-demand create the MBus windows with the
appropriate size. Look at our usage of the DT:

     0x82000000 0x1 0     MBUS_ID(0x04, 0xe8) 0 1 0 /* Port 0.0 MEM */
     0x81000000 0x1 0     MBUS_ID(0x04, 0xe0) 0 1 0 /* Port 0.0 IO  */
     0x82000000 0x2 0     MBUS_ID(0x04, 0xd8) 0 1 0 /* Port 0.1 MEM */
     0x81000000 0x2 0     MBUS_ID(0x04, 0xd0) 0 1 0 /* Port 0.1 IO  */
     0x82000000 0x3 0     MBUS_ID(0x04, 0xb8) 0 1 0 /* Port 0.2 MEM */
     0x81000000 0x3 0     MBUS_ID(0x04, 0xb0) 0 1 0 /* Port 0.2 IO  */
     0x82000000 0x4 0     MBUS_ID(0x04, 0x78) 0 1 0 /* Port 0.3 MEM */
     0x81000000 0x4 0     MBUS_ID(0x04, 0x70) 0 1 0 /* Port 0.3 IO  */

     0x82000000 0x5 0     MBUS_ID(0x08, 0xe8) 0 1 0 /* Port 1.0 MEM */
     0x81000000 0x5 0     MBUS_ID(0x08, 0xe0) 0 1 0 /* Port 1.0 IO  */
     0x82000000 0x6 0     MBUS_ID(0x08, 0xd8) 0 1 0 /* Port 1.1 MEM */
     0x81000000 0x6 0     MBUS_ID(0x08, 0xd0) 0 1 0 /* Port 1.1 IO  */
     0x82000000 0x7 0     MBUS_ID(0x08, 0xb8) 0 1 0 /* Port 1.2 MEM */
     0x81000000 0x7 0     MBUS_ID(0x08, 0xb0) 0 1 0 /* Port 1.2 IO  */
     0x82000000 0x8 0     MBUS_ID(0x08, 0x78) 0 1 0 /* Port 1.3 MEM */
     0x81000000 0x8 0     MBUS_ID(0x08, 0x70) 0 1 0 /* Port 1.3 IO  */

     0x82000000 0x9 0     MBUS_ID(0x04, 0xf8) 0 1 0 /* Port 2.0 MEM */
     0x81000000 0x9 0     MBUS_ID(0x04, 0xf0) 0 1 0 /* Port 2.0 IO  */

     0x82000000 0xa 0     MBUS_ID(0x08, 0xf8) 0 1 0 /* Port 3.0 MEM */
     0x81000000 0xa 0     MBUS_ID(0x08, 0xf0) 0 1 0 /* Port 3.0 IO  */>;

See how the address and size are 0 ? We don't know at the moment of
scanning the DT, what will be the address and size of the different MEM
and IO mappings.

Best regards,

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* pci_ioremap_set_mem_type(), pci_remap_iospace()
  2016-04-27 22:58 pci_ioremap_set_mem_type(), pci_remap_iospace() Bjorn Helgaas
  2016-04-28  7:21 ` Thomas Petazzoni
  2016-04-28 12:06 ` Liviu Dudau
@ 2016-04-28 14:19 ` Bjorn Helgaas
  2016-04-28 14:36   ` Thomas Petazzoni
  2016-04-28 14:33 ` Bjorn Helgaas
  3 siblings, 1 reply; 14+ messages in thread
From: Bjorn Helgaas @ 2016-04-28 14:19 UTC (permalink / raw)
  To: linux-arm-kernel

[+cc linux-pci, sorry, due to the @google.com DMARC hassles I can't reply
directly to mail sent to bhelgaas at google.com or to helgaas at kernel.org
(which is currently forwarded to bhelgaas at google.com).  I *can* reply to
linux-pci mail because I have a non google.com account subscribed there.
Sorry again, I know this is TMI.  I'm going to manually insert Thomas'
response so I can respond.]

Thomas Petazzoni wrote:
> On Wed, Apr 27, 2016 at 05:58:27PM -0500, Bjorn Helgaas wrote:
>> Hi Thomas & Liviu,
>> 
>> You added pci_ioremap_set_mem_type(int mem_type) with 1c8c3cf0b523
>> ("ARM: 8060/1: mm: allow sub-architectures to override PCI I/O memory
>> type").
>> 
>> I see this patch on the list: "[PATCH 3/3] ARM: mvebu: implement
>> L2/PCIe deadlock workaround"
>> (http://lists.infradead.org/pipermail/linux-arm-kernel/2014-March/242784.html)
>> that does call pci_ioremap_set_mem_type(), but it doesn't look like
>> that patch ever got merged.
>> 
>> Is it still useful to have pci_ioremap_set_mem_type() even though
>> nobody calls it?

> I am actually about to send a patch that makes use of it, I discussed
> it with Arnd a few days ago / last week.
> 
> Essentially the story was the following. On Cortex-A9 based Marvell
> SoCs, when HW I/O coherency is enabled, you need all non-RAM space to
> be mapped strongly ordered. To this effect, I submitted this patch
> series that introduces pci_ioremap_mem_type() to allow
> platform-specific code to override the memory type used to map PCI I/O
> space.
> 
> The reaction of Arnd after this patch series was to suggest that maybe
> I should just change the memory type for all platforms, so I sent
> another patch doing that.
> 
> But in the end, Russell merged the implementation of
> pci_ioremap_mem_type() from the previous version of the patch series.
> And the remaining patch fell through the cracks. Since PCI I/O is
> rarely used these days, it wasn't noticed.
>
> But right now, I have this in my stack of patches to be sent:
> ...
> diff --git a/arch/arm/mach-mvebu/coherency.c b/arch/arm/mach-mvebu/coherency.c
> index 7e989d6..2d1f06d 100644
> --- a/arch/arm/mach-mvebu/coherency.c
> +++ b/arch/arm/mach-mvebu/coherency.c
> ...
> @@ -186,7 +180,8 @@ static void __init armada_375_380_coherency_init(struct device_node *np)
>         struct device_node *cache_dn;
> 
>         coherency_cpu_base = of_iomap(np, 0);
> -       arch_ioremap_caller = armada_pcie_wa_ioremap_caller;
> +       arch_ioremap_caller = armada_wa_ioremap_caller;
> +       pci_ioremap_set_mem_type(MT_UNCACHED);

This makes sense to me because I think this changes ioremap() to do
the right thing.  But my concern is that ioremap_page_range() doesn't
actually use ioremap(), so pci_ioremap_set_mem_type() has no effect on
that path.  pci_remap_iospace() uses ioremap_page_range(), so I
suspect it will use the wrong attributes.

>> I'm looking at the issue of how we ioremap memory-mapped ioport
>> spaces, and pci_ioremap_mem_type is currently used in the arm-specific
>> pci_ioremap_io():
>> 
>>   int pci_ioremap_io(unsigned int offset, phys_addr_t phys_addr)
>>   {
>>     BUG_ON(offset + SZ_64K > IO_SPACE_LIMIT);
>> 
>>     return ioremap_page_range(PCI_IO_VIRT_BASE + offset,
>>                               PCI_IO_VIRT_BASE + offset + SZ_64K,
>>                               phys_addr,
>>                               __pgprot(get_mem_type(pci_ioremap_mem_type)->prot_pte));
>>   }
>> 
>> Also, what about pci_remap_iospace(), added by 8b921acfeffd ("PCI: Add
>> pci_remap_iospace() to map bus I/O resources")?
>> 
>>   int __weak pci_remap_iospace(const struct resource *res, phys_addr_t phys_addr)
>>   {
>>   #if defined(PCI_IOBASE) && defined(CONFIG_MMU)
>>     unsigned long vaddr = (unsigned long)PCI_IOBASE + res->start;
>> 
>>     if (!(res->flags & IORESOURCE_IO))
>>       return -EINVAL;
>> 
>>     if (res->end > IO_SPACE_LIMIT)
>>       return -EINVAL;
>> 
>>     return ioremap_page_range(vaddr, vaddr + resource_size(res), phys_addr,
>>                               pgprot_device(PAGE_KERNEL));
>>     ...

> Yes, I was also surprised to see two functions doing pretty much the
> same, this probably needs to be refactored. There are 11 users of
> pci_ioremap_io() (6 of them being Marvell related), and
> pci_ioremap_io() is provided by ARM specific code.
> 
> On the other hand, pci_remap_iospace() is provided by
> architecture-independent code, and use in 6 PCI controller drivers.
> 
> What about:
> 
>  1/ Providing an alternate version of pci_remap_iospace() that allows
>     to pass the memory attribute to be used.

I'd rather fix ioremap_page_range() somehow, because that's an
exported interface, and if we only fix pci_remap_iospace(), we still
have to worry about ioremap_page_range() doing the wrong thing.

>  2/ Convert all users of pci_ioremap_io() to pci_remap_iospace().
> 
>  3/ Get rid of pci_ioremap_io().

^ permalink raw reply	[flat|nested] 14+ messages in thread

* pci_ioremap_set_mem_type(), pci_remap_iospace()
  2016-04-27 22:58 pci_ioremap_set_mem_type(), pci_remap_iospace() Bjorn Helgaas
                   ` (2 preceding siblings ...)
  2016-04-28 14:19 ` Bjorn Helgaas
@ 2016-04-28 14:33 ` Bjorn Helgaas
  2016-04-28 14:38   ` Thomas Petazzoni
  3 siblings, 1 reply; 14+ messages in thread
From: Bjorn Helgaas @ 2016-04-28 14:33 UTC (permalink / raw)
  To: linux-arm-kernel

[+cc linux-pci, manually inserted reply, sorry again]

Liviu wrote:
> On Wed, Apr 27, 2016 at 05:58:27PM -0500, Bjorn Helgaas wrote:
>> Looking at __arm_ioremap_pfn_caller() makes me
>> suspect that pci_remap_iospace() is not safe on arm.

> Hmm, not sure how you reached that conclusion. __arm_ioremap_pfn_caller()
> does call in the end ioremap_page_range() for ARMv7 || ARM_LPAE, which is
> what pci_remap_iospace() does as well. Beside the possibly different
> pgprot_t values, what would make it unsafe for arm?

The different pgprot_t is exactly what I'm worried about.  The callers of
pci_ioremap_io() probably get the right thing (a pgprot tweaked to deal
with this Cortex problem), but other callers of ioremap_page_range(),
pci_remap_iospace() in particular, don't get that tweak:

  pci_ioremap_io
    ioremap_page_range(..., __pgprot(get_mem_type(pci_ioremap_mem_type)->prot_pte))

  pci_remap_iospace
    ioremap_page_range(..., pgprot_device(PAGE_KERNEL))
      ioremap_pud_range(..., pgprot_device(PAGE_KERNEL))
        ioremap_pmd_range(..., pgprot_device(PAGE_KERNEL))
          ioremap_pte_range(..., pgprot_device(PAGE_KERNEL))
            set_pte_at(..., pfn_pte(pfn, pgprot_device(PAGE_KERNEL)))

^ permalink raw reply	[flat|nested] 14+ messages in thread

* pci_ioremap_set_mem_type(), pci_remap_iospace()
  2016-04-28 14:19 ` Bjorn Helgaas
@ 2016-04-28 14:36   ` Thomas Petazzoni
  2016-04-28 16:03     ` Bjorn Helgaas
  0 siblings, 1 reply; 14+ messages in thread
From: Thomas Petazzoni @ 2016-04-28 14:36 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

On Thu, 28 Apr 2016 09:19:20 -0500, Bjorn Helgaas wrote:

> >         coherency_cpu_base = of_iomap(np, 0);
> > -       arch_ioremap_caller = armada_pcie_wa_ioremap_caller;
> > +       arch_ioremap_caller = armada_wa_ioremap_caller;
> > +       pci_ioremap_set_mem_type(MT_UNCACHED);
> 
> This makes sense to me because I think this changes ioremap() to do
> the right thing.

What changes ioremap() to do the right thing for all mappings *except*
PCI I/O mappings:

   arch_ioremap_caller = armada_wa_ioremap_caller;

>  But my concern is that ioremap_page_range() doesn't
> actually use ioremap(), so pci_ioremap_set_mem_type() has no effect on
> that path.  pci_remap_iospace() uses ioremap_page_range(), so I
> suspect it will use the wrong attributes.

Indeed, ioremap_page_range() doesn't use ioremap(), and that's why
there is this separate pci_ioremap_set_mem_type() to ensure that
mappings done for PCI I/O space are done with MT_UNCACHED.

And indeed, pci_remap_iospace() would not work for me, as it wouldn't
use the memory attributes set by pci_ioremap_set_mem_type().

To be honest, I am wondering why we're not using the same memory
attribute for PCI I/O space as the memory attributes used for anything
else that's ioremapped. In practice, that's what happens on ARM:
ioremap() uses MT_DEVICE by default, and pci_ioremap_io() uses
MT_DEVICE as well.

> >  1/ Providing an alternate version of pci_remap_iospace() that allows
> >     to pass the memory attribute to be used.
> 
> I'd rather fix ioremap_page_range() somehow, because that's an
> exported interface, and if we only fix pci_remap_iospace(), we still
> have to worry about ioremap_page_range() doing the wrong thing.

What do you want to fix in ioremap_page_range() ? It already allows
passing the memory type that you want, so I'm not sure to understand.

It is worth mentioning that on ARM, the arch_ioremap_caller mechanism
allows platform-specific to do its own magic for each mapping. I.e, it
might decide to use some specific memory attributes, or some specific
way of doing the mappings.

It is used on 5 platforms:

 * EBSA110, where it does a 1:1 mapping with the physical address if I
   understand correctly:

static void __iomem *ebsa110_ioremap_caller(phys_addr_t cookie, size_t size,
                                            unsigned int flags, void *caller)
{
        return (void __iomem *)cookie;
}

 * i.MX3, where it forces the memory attribute to be
   MT_DEVICE_NONSHARED, but only for the mappings that are below
   0x80000000 and not within the L2 cache registers. Otherwise, the
   default memory attribute is used.

 * iop13xx, where it really does some magic calculations to get the
   virtual addresses matching some specific physical addresses (and
   falls back to a real ioremap for other areas).

 * ixp4xx, which does a normal ioremap() for non-PCI devices, and a 1:1
   mapping for PCI stuff.

 * mvebu, where we currently override the memory attribute to strongly
   ordered for PCI devices only (but I need to change that so that all
   mappings are strongly ordered)

All in all, the pgprot_device() mechanism added by Liviu is not really
sufficient to address all those customizations, which is why ARM has
this more complicated way of allowing platform specific code to
override the ioremap behavior.

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* pci_ioremap_set_mem_type(), pci_remap_iospace()
  2016-04-28 14:33 ` Bjorn Helgaas
@ 2016-04-28 14:38   ` Thomas Petazzoni
  0 siblings, 0 replies; 14+ messages in thread
From: Thomas Petazzoni @ 2016-04-28 14:38 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

On Thu, 28 Apr 2016 09:33:32 -0500, Bjorn Helgaas wrote:
> [+cc linux-pci, manually inserted reply, sorry again]
> 
> Liviu wrote:
> > On Wed, Apr 27, 2016 at 05:58:27PM -0500, Bjorn Helgaas wrote:
> >> Looking at __arm_ioremap_pfn_caller() makes me
> >> suspect that pci_remap_iospace() is not safe on arm.
> 
> > Hmm, not sure how you reached that conclusion. __arm_ioremap_pfn_caller()
> > does call in the end ioremap_page_range() for ARMv7 || ARM_LPAE, which is
> > what pci_remap_iospace() does as well. Beside the possibly different
> > pgprot_t values, what would make it unsafe for arm?
> 
> The different pgprot_t is exactly what I'm worried about.  The callers of
> pci_ioremap_io() probably get the right thing (a pgprot tweaked to deal
> with this Cortex problem), but other callers of ioremap_page_range(),
> pci_remap_iospace() in particular, don't get that tweak:
> 
>   pci_ioremap_io
>     ioremap_page_range(..., __pgprot(get_mem_type(pci_ioremap_mem_type)->prot_pte))
> 
>   pci_remap_iospace
>     ioremap_page_range(..., pgprot_device(PAGE_KERNEL))
>       ioremap_pud_range(..., pgprot_device(PAGE_KERNEL))
>         ioremap_pmd_range(..., pgprot_device(PAGE_KERNEL))
>           ioremap_pte_range(..., pgprot_device(PAGE_KERNEL))
>             set_pte_at(..., pfn_pte(pfn, pgprot_device(PAGE_KERNEL)))

Indeed. Which is why we should get rid of one of the two interfaces.
Since pci_remap_iospace() is the generic one, I guess that's the one we
should migrate to.

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* pci_ioremap_set_mem_type(), pci_remap_iospace()
  2016-04-28 13:12       ` Thomas Petazzoni
@ 2016-04-28 14:41         ` Liviu Dudau
  2016-04-28 14:47           ` Thomas Petazzoni
  0 siblings, 1 reply; 14+ messages in thread
From: Liviu Dudau @ 2016-04-28 14:41 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Apr 28, 2016 at 03:12:22PM +0200, Thomas Petazzoni wrote:
> Hello,
> 
> On Thu, 28 Apr 2016 14:02:28 +0100, Liviu Dudau wrote:
> 
> > > Armada XP itself is not affected by the HW issue that requires use to
> > > use strongly-ordered mappings, it's only the Cortex-A9 based SoC, i.e
> > > Armada 375, 38x and 39x.
> > 
> > Sorry, I blame my unfamiliarity with the Armada family of devices when I
> > said Armada XP (I know the name better than any other).
> 
> No problem. It is impossible to understand the fine details of all SoC
> families supported in the kernel. I was making this statement only for
> the sake of precision.
> 
> > > That being said, could you point to me to which bits of the generic PCI
> > > code I should convert our PCI support to? I'd be happy to take a look.
> > 
> > Hmm, looking at the DT bindings there seem to be a lot of custom stuff in there.
> > I would start with trying to see if you can replace the custom parsing of
> > ranges with the generic of_pci_range_to_resource() and then look at the
> > pcie-designware.c how they got rid of the pci_common_init_dev() and the
> > need to use hw_pci structure. You want to end up with calling
> > of_pci_get_host_bridge_resources() to get back your list of MEM and IO
> > resources (it parses the bus ranges as well) and then use those to map IO
> > space and start the root bus scanning.
> 
> I will have a look, but there is clearly one thing that is not
> possible: parsing the list of MEM and IO resources is not sufficient to
> map the IO space.
> 
> With the pci-mvebu driver, all the MEM and IO mappings are dynamic. The
> device tree does *not* contain the addresses at which the MEM and IO
> mappings will be done. Indeed, we have too many PCIe interfaces and too
> few MBus windows and physical address space to map everything
> statically. So we have a logic that emulates a PCI bridge, for which we
> trap the read/write accesses made by the kernel when scanning the PCI
> busses, and we use what's written in the configuration space of the
> emulated PCI bridge to on-demand create the MBus windows with the
> appropriate size. Look at our usage of the DT:
> 
>      0x82000000 0x1 0     MBUS_ID(0x04, 0xe8) 0 1 0 /* Port 0.0 MEM */
>      0x81000000 0x1 0     MBUS_ID(0x04, 0xe0) 0 1 0 /* Port 0.0 IO  */
>      0x82000000 0x2 0     MBUS_ID(0x04, 0xd8) 0 1 0 /* Port 0.1 MEM */
>      0x81000000 0x2 0     MBUS_ID(0x04, 0xd0) 0 1 0 /* Port 0.1 IO  */
>      0x82000000 0x3 0     MBUS_ID(0x04, 0xb8) 0 1 0 /* Port 0.2 MEM */
>      0x81000000 0x3 0     MBUS_ID(0x04, 0xb0) 0 1 0 /* Port 0.2 IO  */
>      0x82000000 0x4 0     MBUS_ID(0x04, 0x78) 0 1 0 /* Port 0.3 MEM */
>      0x81000000 0x4 0     MBUS_ID(0x04, 0x70) 0 1 0 /* Port 0.3 IO  */
> 
>      0x82000000 0x5 0     MBUS_ID(0x08, 0xe8) 0 1 0 /* Port 1.0 MEM */
>      0x81000000 0x5 0     MBUS_ID(0x08, 0xe0) 0 1 0 /* Port 1.0 IO  */
>      0x82000000 0x6 0     MBUS_ID(0x08, 0xd8) 0 1 0 /* Port 1.1 MEM */
>      0x81000000 0x6 0     MBUS_ID(0x08, 0xd0) 0 1 0 /* Port 1.1 IO  */
>      0x82000000 0x7 0     MBUS_ID(0x08, 0xb8) 0 1 0 /* Port 1.2 MEM */
>      0x81000000 0x7 0     MBUS_ID(0x08, 0xb0) 0 1 0 /* Port 1.2 IO  */
>      0x82000000 0x8 0     MBUS_ID(0x08, 0x78) 0 1 0 /* Port 1.3 MEM */
>      0x81000000 0x8 0     MBUS_ID(0x08, 0x70) 0 1 0 /* Port 1.3 IO  */
> 
>      0x82000000 0x9 0     MBUS_ID(0x04, 0xf8) 0 1 0 /* Port 2.0 MEM */
>      0x81000000 0x9 0     MBUS_ID(0x04, 0xf0) 0 1 0 /* Port 2.0 IO  */
> 
>      0x82000000 0xa 0     MBUS_ID(0x08, 0xf8) 0 1 0 /* Port 3.0 MEM */
>      0x81000000 0xa 0     MBUS_ID(0x08, 0xf0) 0 1 0 /* Port 3.0 IO  */>;
> 
> See how the address and size are 0 ? 

PCI or host address? Either are not zero, I'm afraid: PCI is 0x1 0, 0x2 0 ...
while host address is given by MBUS_ID(...) 0. Anyway ...

> We don't know at the moment of
> scanning the DT, what will be the address and size of the different MEM
> and IO mappings.

True, but you have bounds on the sizes of each region given the way you
encode the address translation. Trying to decode what the example above is
telling me:
  - each port has 0x80000_00000000 possible IO space allocated based on
    the MBUS_ID split of address space
  - same for MEM space.

And IMO you *should* have address and sizes for the MEM and IO mappings, as
they act as *upper* boundaries. No one says you need to reserve the whole
space, you are just describing how the hardware translates addresses between
the busses.

Just for my own clarification, is the reason why the ranges are declared like
this due to the fact that each port is a separate entity and multiple ports
cannot be served by the same MBus window? What stops you from having one
MBus window assigned to all IO space and the other window(s) assigned to MEM
for individual ports?

Best regards,
Liviu

> 
> Best regards,
> 
> Thomas
> -- 
> Thomas Petazzoni, CTO, Free Electrons
> Embedded Linux, Kernel and Android engineering
> http://free-electrons.com
> 

-- 
====================
| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---------------
    ?\_(?)_/?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* pci_ioremap_set_mem_type(), pci_remap_iospace()
  2016-04-28 14:41         ` Liviu Dudau
@ 2016-04-28 14:47           ` Thomas Petazzoni
  2016-04-28 16:23             ` Arnd Bergmann
  0 siblings, 1 reply; 14+ messages in thread
From: Thomas Petazzoni @ 2016-04-28 14:47 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

Adding in Cc Arnd Bergmann, since he participated to the discussion
several years ago around the PCI DT binding for the Armada platform.

On Thu, 28 Apr 2016 15:41:17 +0100, Liviu Dudau wrote:

> > See how the address and size are 0 ? 
> 
> PCI or host address? Either are not zero, I'm afraid: PCI is 0x1 0, 0x2 0 ...
> while host address is given by MBUS_ID(...) 0. Anyway ...

The host address is kind of a "fake address". "MBUS_ID(..., ....) 0" is
not something that you can ioremap. See
Documentation/devicetree/bindings/pci/mvebu-pci.txt:

"""
Since the location and size of the different MBus windows is not fixed
in
hardware, and only determined in runtime, those ranges cover the full
first
4 GB of the physical address space, and do not translate into a valid
CPU
address.
"""

> > We don't know at the moment of
> > scanning the DT, what will be the address and size of the different MEM
> > and IO mappings.
> 
> True, but you have bounds on the sizes of each region given the way you
> encode the address translation. Trying to decode what the example above is
> telling me:
>   - each port has 0x80000_00000000 possible IO space allocated based on
>     the MBUS_ID split of address space
>   - same for MEM space.
> 
> And IMO you *should* have address and sizes for the MEM and IO mappings, as
> they act as *upper* boundaries. No one says you need to reserve the whole
> space, you are just describing how the hardware translates addresses between
> the busses.

Hm, not sure what would be the benefit of changing the Device Tree with
this, but I'm probably missing something.

> Just for my own clarification, is the reason why the ranges are declared like
> this due to the fact that each port is a separate entity and multiple ports
> cannot be served by the same MBus window? What stops you from having one
> MBus window assigned to all IO space and the other window(s) assigned to MEM
> for individual ports?

Each port needs its own MBus window for the PCI memory space and PCI
I/O space that is accessed by this port. The MBUS_ID(x, y) thing is
here to encode the information that are needed to create such windows,
which as you can see are different for each port.

So we cannot have a single MBus window for all the IO space, we need
one per PCI port.

Best regards,

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* pci_ioremap_set_mem_type(), pci_remap_iospace()
  2016-04-28 14:36   ` Thomas Petazzoni
@ 2016-04-28 16:03     ` Bjorn Helgaas
  0 siblings, 0 replies; 14+ messages in thread
From: Bjorn Helgaas @ 2016-04-28 16:03 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Apr 28, 2016 at 04:36:46PM +0200, Thomas Petazzoni wrote:
> Hello,
> 
> On Thu, 28 Apr 2016 09:19:20 -0500, Bjorn Helgaas wrote:
> 
> > >         coherency_cpu_base = of_iomap(np, 0);
> > > -       arch_ioremap_caller = armada_pcie_wa_ioremap_caller;
> > > +       arch_ioremap_caller = armada_wa_ioremap_caller;
> > > +       pci_ioremap_set_mem_type(MT_UNCACHED);
> > 
> > This makes sense to me because I think this changes ioremap() to do
> > the right thing.
> 
> What changes ioremap() to do the right thing for all mappings *except*
> PCI I/O mappings:
> 
>    arch_ioremap_caller = armada_wa_ioremap_caller;
> 
> >  But my concern is that ioremap_page_range() doesn't
> > actually use ioremap(), so pci_ioremap_set_mem_type() has no effect on
> > that path.  pci_remap_iospace() uses ioremap_page_range(), so I
> > suspect it will use the wrong attributes.
> 
> Indeed, ioremap_page_range() doesn't use ioremap(), and that's why
> there is this separate pci_ioremap_set_mem_type() to ensure that
> mappings done for PCI I/O space are done with MT_UNCACHED.
> 
> And indeed, pci_remap_iospace() would not work for me, as it wouldn't
> use the memory attributes set by pci_ioremap_set_mem_type().

Right.  pci_remap_iospace() is a generic interface and should work
on all arches.  It *is* declared __weak, but there's no arch-specific
implementation of it, and I think if we made ioremap_page_range()
truly generic, we could make pci_remap_iospace() non-weak.

> To be honest, I am wondering why we're not using the same memory
> attribute for PCI I/O space as the memory attributes used for anything
> else that's ioremapped. In practice, that's what happens on ARM:
> ioremap() uses MT_DEVICE by default, and pci_ioremap_io() uses
> MT_DEVICE as well.
> 
> > >  1/ Providing an alternate version of pci_remap_iospace() that allows
> > >     to pass the memory attribute to be used.
> > 
> > I'd rather fix ioremap_page_range() somehow, because that's an
> > exported interface, and if we only fix pci_remap_iospace(), we still
> > have to worry about ioremap_page_range() doing the wrong thing.
> 
> What do you want to fix in ioremap_page_range() ? It already allows
> passing the memory type that you want, so I'm not sure to understand.

Yes, but in order for ioremap_page_range() to work on arm, we have to
pass in an arm-specific memory type
(__pgprot(get_mem_type(pci_ioremap_mem_type)->prot_pte)).

We can't do that because ioremap_page_range() is a generic function,
implemented in lib/ioremap.c and exported to modules, and generic
callers only know about pgprot_noncached, pgprot_writecombine,
pgprot_writethrough, pgprot_device, etc.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* pci_ioremap_set_mem_type(), pci_remap_iospace()
  2016-04-28 14:47           ` Thomas Petazzoni
@ 2016-04-28 16:23             ` Arnd Bergmann
  0 siblings, 0 replies; 14+ messages in thread
From: Arnd Bergmann @ 2016-04-28 16:23 UTC (permalink / raw)
  To: linux-arm-kernel

On Thursday 28 April 2016 16:47:34 Thomas Petazzoni wrote:

> > > We don't know at the moment of
> > > scanning the DT, what will be the address and size of the different MEM
> > > and IO mappings.
> > 
> > True, but you have bounds on the sizes of each region given the way you
> > encode the address translation. Trying to decode what the example above is
> > telling me:
> >   - each port has 0x80000_00000000 possible IO space allocated based on
> >     the MBUS_ID split of address space
> >   - same for MEM space.
> > 
> > And IMO you *should* have address and sizes for the MEM and IO mappings, as
> > they act as *upper* boundaries. No one says you need to reserve the whole
> > space, you are just describing how the hardware translates addresses between
> > the busses.
> 
> Hm, not sure what would be the benefit of changing the Device Tree with
> this, but I'm probably missing something.

Everything's fine here: we correctly describe how for each MEM and IO
window, the hardware has a 4GB address space within the MBUS.

Part of the debate back then was whether we should also fix a mapping
between the PCIs MBUS windows and the host CPU address space as we
do for all other MBUS slaves, but we decided in the end to use dynamic
allocation of MBUS windows to give us more flexibility: there are cases
where we don't know in advance how much address space to give to
each port, or whether there are enough MBUS windows to map all I/O
spaces.

> > Just for my own clarification, is the reason why the ranges are declared like
> > this due to the fact that each port is a separate entity and multiple ports
> > cannot be served by the same MBus window? What stops you from having one
> > MBus window assigned to all IO space and the other window(s) assigned to MEM
> > for individual ports?
> 
> Each port needs its own MBus window for the PCI memory space and PCI
> I/O space that is accessed by this port. The MBUS_ID(x, y) thing is
> here to encode the information that are needed to create such windows,
> which as you can see are different for each port.
> 
> So we cannot have a single MBus window for all the IO space, we need
> one per PCI port.

Right.

	Arnd

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2016-04-28 16:23 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-04-27 22:58 pci_ioremap_set_mem_type(), pci_remap_iospace() Bjorn Helgaas
2016-04-28  7:21 ` Thomas Petazzoni
2016-04-28 12:06 ` Liviu Dudau
2016-04-28 12:13   ` Thomas Petazzoni
2016-04-28 13:02     ` Liviu Dudau
2016-04-28 13:12       ` Thomas Petazzoni
2016-04-28 14:41         ` Liviu Dudau
2016-04-28 14:47           ` Thomas Petazzoni
2016-04-28 16:23             ` Arnd Bergmann
2016-04-28 14:19 ` Bjorn Helgaas
2016-04-28 14:36   ` Thomas Petazzoni
2016-04-28 16:03     ` Bjorn Helgaas
2016-04-28 14:33 ` Bjorn Helgaas
2016-04-28 14:38   ` Thomas Petazzoni

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).