* [PATCH v7 00/16] ARM: Add support for the Large Physical Address Extensions
@ 2011-08-10 15:03 Catalin Marinas
  2011-08-10 15:03 ` [PATCH v7 01/16] ARM: LPAE: add ISBs around MMU enabling code Catalin Marinas
                   ` (15 more replies)
  0 siblings, 16 replies; 46+ messages in thread
From: Catalin Marinas @ 2011-08-10 15:03 UTC
  To: linux-arm-kernel

Hi,

This is version 7 of the set of patches adding support for the Large
Physical Address Extensions on the ARM architecture (available with the
Cortex-A15 processor). LPAE comes with a new 3-level page table format
(compared to 2-level for the classic one), allowing up to 40-bit
physical address space.
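
As a rough illustration of the 3-level format (a sketch, not code taken
from this series): with 4KB pages, LPAE splits a 32-bit virtual address
into a 2-bit first-level index, two 9-bit indices and a 12-bit page
offset. Descriptors are 64 bits wide, which leaves room for the 40-bit
output address. The LPAE_* names and the lpae_split() helper below are
hypothetical and exist only for this example.

```c
#include <stdint.h>

/* Assumed index layout for the LPAE long-descriptor format, 4KB pages. */
#define LPAE_PAGE_SHIFT		12	/* 4KB pages */
#define LPAE_PMD_SHIFT		21	/* one level-2 entry maps 2MB */
#define LPAE_PGDIR_SHIFT	30	/* one level-1 entry maps 1GB */
#define LPAE_PTRS_PER_PTE	512
#define LPAE_PTRS_PER_PMD	512
#define LPAE_PTRS_PER_PGD	4

struct lpae_index {
	unsigned int pgd;	/* level-1 index, VA bits [31:30] */
	unsigned int pmd;	/* level-2 index, VA bits [29:21] */
	unsigned int pte;	/* level-3 index, VA bits [20:12] */
	unsigned int offset;	/* byte offset within the 4KB page */
};

/* Split a 32-bit virtual address into its three table indices. */
static struct lpae_index lpae_split(uint32_t va)
{
	struct lpae_index idx;

	idx.pgd = va >> LPAE_PGDIR_SHIFT;
	idx.pmd = (va >> LPAE_PMD_SHIFT) & (LPAE_PTRS_PER_PMD - 1);
	idx.pte = (va >> LPAE_PAGE_SHIFT) & (LPAE_PTRS_PER_PTE - 1);
	idx.offset = va & ((1u << LPAE_PAGE_SHIFT) - 1);
	return idx;
}
```

For example, lpae_split(0xC0123456) selects first-level entry 3,
second-level entry 0, third-level entry 0x123 and page offset 0x456.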

This set of patches against Linux 3.1-rc1 is available on this branch:

git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-2.6-cm.git for-next

The full set of patches against Linux 3.0 (LPAE, support for an emulated
Versatile Express with Cortex-A15 tile and generic timers) is available
on this branch:

git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-2.6-cm.git arm-lpae


Changelog (from v6):

- Rebased against Linux 3.1-rc1.
- pteval_t, pmdval_t etc. typedefs reworked for consistency.
- __phys_to_virt/__virt_to_phys macros typecasting patch removed (an
  explicit cast to unsigned long has been added to dma_to_virt).
- Generic dma_addr_t patch removed (already fixed upstream).

Known issues:

- RMK's nopud patch not in mainline yet (compiler warnings) but carried
  over in the above branches.
- No decision on using the ISB during MMU enabling (no simple
  alternative solution).
- The patch converting PGDIR_* to PMD_* may be further modified
  depending on the merging of the DMA coherent patches and other changes
  to prepare_page_tables() proposed by RMK.


Catalin Marinas (13):
  ARM: LPAE: Cast the dma_addr_t argument to unsigned long in
    dma_to_virt
  ARM: LPAE: Use PMD_(SHIFT|SIZE|MASK) instead of PGDIR_*
  ARM: LPAE: Factor out 2-level page table definitions into separate
    files
  ARM: LPAE: Add (pte|pmd)val_t type definitions as u32
  ARM: LPAE: Use a mask for physical addresses in page table entries
  ARM: LPAE: Introduce the 3-level page table format definitions
  ARM: LPAE: Page table maintenance for the 3-level format
  ARM: LPAE: MMU setup for the 3-level page table format
  ARM: LPAE: Invalidate the TLB before freeing the PMD
  ARM: LPAE: Add fault handling support
  ARM: LPAE: Add context switching support
  ARM: LPAE: Add identity mapping support for the 3-level page table
    format
  ARM: LPAE: Add the Kconfig entries

Will Deacon (3):
  ARM: LPAE: add ISBs around MMU enabling code
  ARM: LPAE: mark memory banks with start > ULONG_MAX as highmem
  ARM: LPAE: add support for ATAG_MEM64

 arch/arm/Kconfig                            |    2 +-
 arch/arm/boot/compressed/head.S             |    1 +
 arch/arm/include/asm/assembler.h            |   11 +
 arch/arm/include/asm/dma-mapping.h          |    2 +-
 arch/arm/include/asm/page.h                 |   44 +---
 arch/arm/include/asm/pgalloc.h              |   28 ++-
 arch/arm/include/asm/pgtable-2level-hwdef.h |   93 ++++++
 arch/arm/include/asm/pgtable-2level-types.h |   67 +++++
 arch/arm/include/asm/pgtable-2level.h       |  143 +++++++++
 arch/arm/include/asm/pgtable-3level-hwdef.h |   82 ++++++
 arch/arm/include/asm/pgtable-3level-types.h |   70 +++++
 arch/arm/include/asm/pgtable-3level.h       |  107 +++++++
 arch/arm/include/asm/pgtable-hwdef.h        |   81 +-----
 arch/arm/include/asm/pgtable.h              |  211 +++++---------
 arch/arm/include/asm/proc-fns.h             |   21 ++
 arch/arm/include/asm/setup.h                |    8 +
 arch/arm/include/asm/tlb.h                  |   11 +-
 arch/arm/include/asm/tlbflush.h             |    4 +-
 arch/arm/kernel/head.S                      |  119 ++++++---
 arch/arm/kernel/module.c                    |    2 +-
 arch/arm/kernel/setup.c                     |   10 +
 arch/arm/kernel/sleep.S                     |    2 +
 arch/arm/mm/Kconfig                         |   13 +
 arch/arm/mm/Makefile                        |    4 +
 arch/arm/mm/alignment.c                     |    8 +-
 arch/arm/mm/context.c                       |   19 +-
 arch/arm/mm/dma-mapping.c                   |    6 +-
 arch/arm/mm/fault.c                         |   87 ++++++
 arch/arm/mm/idmap.c                         |   36 +++-
 arch/arm/mm/ioremap.c                       |    8 +-
 arch/arm/mm/mm.h                            |    4 +-
 arch/arm/mm/mmu.c                           |   53 +++-
 arch/arm/mm/pgd.c                           |   51 +++-
 arch/arm/mm/proc-macros.S                   |    5 +-
 arch/arm/mm/proc-v7lpae.S                   |  422 +++++++++++++++++++++++++++
 35 files changed, 1504 insertions(+), 331 deletions(-)
 create mode 100644 arch/arm/include/asm/pgtable-2level-hwdef.h
 create mode 100644 arch/arm/include/asm/pgtable-2level-types.h
 create mode 100644 arch/arm/include/asm/pgtable-2level.h
 create mode 100644 arch/arm/include/asm/pgtable-3level-hwdef.h
 create mode 100644 arch/arm/include/asm/pgtable-3level-types.h
 create mode 100644 arch/arm/include/asm/pgtable-3level.h
 create mode 100644 arch/arm/mm/proc-v7lpae.S

* [PATCH v7 01/16] ARM: LPAE: add ISBs around MMU enabling code
  2011-08-10 15:03 [PATCH v7 00/16] ARM: Add support for the Large Physical Address Extensions Catalin Marinas
@ 2011-08-10 15:03 ` Catalin Marinas
  2011-08-10 15:03 ` [PATCH v7 02/16] ARM: LPAE: Cast the dma_addr_t argument to unsigned long in dma_to_virt Catalin Marinas
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 46+ messages in thread
From: Catalin Marinas @ 2011-08-10 15:03 UTC
  To: linux-arm-kernel

From: Will Deacon <will.deacon@arm.com>

Before we enable the MMU, we must ensure that the TTBR registers contain
sane values. After the MMU has been enabled, we jump to the *virtual*
address of the following function, so we also need to ensure that the
SCTLR write has taken effect.

This patch adds ISB instructions around the SCTLR write to ensure the
visibility of the above.
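
The intended ordering can be modelled in C as follows. This is only a
sketch: the instr_sync() function is a hypothetical C counterpart of
the assembler macro added by this patch, the enum and
turn_mmu_on_model() are made up for illustration, and on non-ARMv7
builds the helper degrades to a compiler-only barrier.

```c
#include <stddef.h>

/* Hypothetical C counterpart of the instr_sync assembler macro. */
static inline void instr_sync(void)
{
#if defined(__arm__) && defined(__ARM_ARCH) && (__ARM_ARCH >= 7)
	__asm__ __volatile__("isb" : : : "memory");
#else
	/* No ISB instruction assumed here: compiler barrier only. */
	__asm__ __volatile__("" : : : "memory");
#endif
}

enum mmu_step {
	ISB_BEFORE,	/* earlier TTBR writes are visible */
	WRITE_SCTLR,	/* mcr p15, 0, r0, c1, c0, 0 */
	READ_ID_REG,	/* mrc p15, 0, r3, c0, c0, 0 */
	ISB_AFTER,	/* the SCTLR write has taken effect */
	JUMP_VIRTUAL,	/* mov pc, r3 */
	MMU_STEPS
};

/* Record the order of operations in __turn_mmu_on after this patch. */
static size_t turn_mmu_on_model(enum mmu_step log[MMU_STEPS])
{
	size_t n = 0;

	log[n++] = ISB_BEFORE;
	instr_sync();
	log[n++] = WRITE_SCTLR;
	log[n++] = READ_ID_REG;
	log[n++] = ISB_AFTER;
	instr_sync();
	log[n++] = JUMP_VIRTUAL;
	return n;
}
```

The model does not touch any system register; it only documents that
one barrier precedes the SCTLR write and one follows it, before the
jump to the virtual address.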

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/boot/compressed/head.S  |    1 +
 arch/arm/include/asm/assembler.h |   11 +++++++++++
 arch/arm/kernel/head.S           |    2 ++
 arch/arm/kernel/sleep.S          |    2 ++
 4 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S
index e95a598..716c7ba 100644
--- a/arch/arm/boot/compressed/head.S
+++ b/arch/arm/boot/compressed/head.S
@@ -551,6 +551,7 @@ __armv7_mmu_cache_on:
 		mcrne	p15, 0, r3, c2, c0, 0	@ load page table pointer
 		mcrne	p15, 0, r1, c3, c0, 0	@ load domain access control
 #endif
+		mcr	p15, 0, r0, c7, c5, 4	@ ISB
 		mcr	p15, 0, r0, c1, c0, 0	@ load control register
 		mrc	p15, 0, r0, c1, c0, 0	@ and read it back
 		mov	r0, #0
diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
index 29035e8..b6e65de 100644
--- a/arch/arm/include/asm/assembler.h
+++ b/arch/arm/include/asm/assembler.h
@@ -187,6 +187,17 @@
 #endif
 
 /*
+ * Instruction barrier
+ */
+	.macro	instr_sync
+#if __LINUX_ARM_ARCH__ >= 7
+	isb
+#elif __LINUX_ARM_ARCH__ == 6
+	mcr	p15, 0, r0, c7, c5, 4
+#endif
+	.endm
+
+/*
  * SMP data memory barrier
  */
 	.macro	smp_dmb mode
diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 742b610..d8231b2 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -396,8 +396,10 @@ ENDPROC(__enable_mmu)
 	.align	5
 __turn_mmu_on:
 	mov	r0, r0
+	instr_sync
 	mcr	p15, 0, r0, c1, c0, 0		@ write control reg
 	mrc	p15, 0, r3, c0, c0, 0		@ read id reg
+	instr_sync
 	mov	r3, r3
 	mov	r3, r13
 	mov	pc, r3
diff --git a/arch/arm/kernel/sleep.S b/arch/arm/kernel/sleep.S
index dc902f2..ecece65 100644
--- a/arch/arm/kernel/sleep.S
+++ b/arch/arm/kernel/sleep.S
@@ -85,8 +85,10 @@ ENDPROC(cpu_resume_mmu)
 	.ltorg
 	.align	5
 cpu_resume_turn_mmu_on:
+	instr_sync
 	mcr	p15, 0, r1, c1, c0, 0	@ turn on MMU, I-cache, etc
 	mrc	p15, 0, r1, c0, c0, 0	@ read id reg
+	instr_sync
 	mov	r1, r1
 	mov	r1, r1
 	mov	pc, r3			@ jump to virtual address

* [PATCH v7 02/16] ARM: LPAE: Cast the dma_addr_t argument to unsigned long in dma_to_virt
  2011-08-10 15:03 [PATCH v7 00/16] ARM: Add support for the Large Physical Address Extensions Catalin Marinas
  2011-08-10 15:03 ` [PATCH v7 01/16] ARM: LPAE: add ISBs around MMU enabling code Catalin Marinas
@ 2011-08-10 15:03 ` Catalin Marinas
  2011-08-13 14:33   ` Russell King - ARM Linux
  2011-08-10 15:03 ` [PATCH v7 03/16] ARM: LPAE: Use PMD_(SHIFT|SIZE|MASK) instead of PGDIR_* Catalin Marinas
                   ` (13 subsequent siblings)
  15 siblings, 1 reply; 46+ messages in thread
From: Catalin Marinas @ 2011-08-10 15:03 UTC
  To: linux-arm-kernel

This is to avoid a compiler warning when invoking the __bus_to_virt()
macro. The dma_to_virt() function gets addresses within the 32-bit
range.
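
A user-space sketch of why the narrowing cast is harmless here,
assuming dma_addr_t is 64 bits wide in the configuration that triggers
the warning and that unsigned long is 32 bits on the target. The
ulong32_t typedef, the bus_to_virt32() helper and the PHYS_OFFSET value
are hypothetical stand-ins, not the kernel definitions.

```c
#include <stdint.h>

typedef uint64_t dma_addr_t;	/* assumed 64-bit for this sketch */
typedef uint32_t ulong32_t;	/* stands in for a 32-bit unsigned long */

#define PAGE_OFFSET	0xC0000000u
#define PHYS_OFFSET	0x80000000u	/* hypothetical RAM base */

/* Simplified linear bus-to-virtual translation (illustration only). */
static ulong32_t bus_to_virt32(ulong32_t bus)
{
	return bus - PHYS_OFFSET + PAGE_OFFSET;
}

/*
 * Narrow the DMA address explicitly before translating it. Nothing is
 * lost as long as the address lies within the 32-bit range.
 */
static ulong32_t dma_to_virt32(dma_addr_t addr)
{
	return bus_to_virt32((ulong32_t)addr);
}
```

With these assumed offsets, dma_to_virt32(0x80001000) yields
0xC0001000. An address above 4GB would be silently truncated, which is
why the cast relies on the 32-bit range guarantee stated above.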

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/include/asm/dma-mapping.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
index 7a21d0b..28b7ee8 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -32,7 +32,7 @@ static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
 
 static inline void *dma_to_virt(struct device *dev, dma_addr_t addr)
 {
-	return (void *)__bus_to_virt(addr);
+	return (void *)__bus_to_virt((unsigned long)addr);
 }
 
 static inline dma_addr_t virt_to_dma(struct device *dev, void *addr)

* [PATCH v7 03/16] ARM: LPAE: Use PMD_(SHIFT|SIZE|MASK) instead of PGDIR_*
  2011-08-10 15:03 [PATCH v7 00/16] ARM: Add support for the Large Physical Address Extensions Catalin Marinas
  2011-08-10 15:03 ` [PATCH v7 01/16] ARM: LPAE: add ISBs around MMU enabling code Catalin Marinas
  2011-08-10 15:03 ` [PATCH v7 02/16] ARM: LPAE: Cast the dma_addr_t argument to unsigned long in dma_to_virt Catalin Marinas
@ 2011-08-10 15:03 ` Catalin Marinas
  2011-08-13 14:34   ` Russell King - ARM Linux
                     ` (2 more replies)
  2011-08-10 15:03 ` [PATCH v7 04/16] ARM: LPAE: Factor out 2-level page table definitions into separate files Catalin Marinas
                   ` (12 subsequent siblings)
  15 siblings, 3 replies; 46+ messages in thread
From: Catalin Marinas @ 2011-08-10 15:03 UTC
  To: linux-arm-kernel

PGDIR_SHIFT and PMD_SHIFT for the classic 2-level page table format have
the same value (21). This patch converts the PGDIR_* uses in the kernel
to the PMD_* equivalent so that LPAE builds can reuse the same code.
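
The arithmetic that the converted call sites rely on can be sketched
as below. The macro values match the classic 2-level definitions; the
pmd_round_up() and pmd_count() helpers are hypothetical names used only
for this example.

```c
#define PMD_SHIFT	21
#define PMD_SIZE	(1UL << PMD_SHIFT)	/* 2MB */
#define PMD_MASK	(~(PMD_SIZE - 1))

/* Round an address up to the next 2MB boundary, as MODULES_VADDR does. */
static unsigned long pmd_round_up(unsigned long addr)
{
	return (addr + ~PMD_MASK) & PMD_MASK;
}

/*
 * Count how many pmd_clear() calls a loop of the form
 * "for (addr = start; addr < end; addr += PMD_SIZE)" would make.
 */
static unsigned long pmd_count(unsigned long start, unsigned long end)
{
	unsigned long addr, n = 0;

	for (addr = start; addr < end; addr += PMD_SIZE)
		n++;
	return n;
}
```

For instance, pmd_round_up(0xC0345678) gives 0xC0400000, and clearing
the range from 0 up to an assumed MODULES_VADDR of 0xBF000000 takes
1528 steps of PMD_SIZE.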

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/kernel/module.c  |    2 +-
 arch/arm/mm/dma-mapping.c |    6 +++---
 arch/arm/mm/mmu.c         |   10 +++++-----
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c
index 05b3776..91befcf 100644
--- a/arch/arm/kernel/module.c
+++ b/arch/arm/kernel/module.c
@@ -33,7 +33,7 @@
  * recompiling the whole kernel when CONFIG_XIP_KERNEL is turned on/off.
  */
 #undef MODULES_VADDR
-#define MODULES_VADDR	(((unsigned long)_etext + ~PGDIR_MASK) & PGDIR_MASK)
+#define MODULES_VADDR	(((unsigned long)_etext + ~PMD_MASK) & PMD_MASK)
 #endif
 
 #ifdef CONFIG_MMU
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 0a0a1e7..b6a867c 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -123,8 +123,8 @@ static void __dma_free_buffer(struct page *page, size_t size)
 #endif
 
 #define CONSISTENT_OFFSET(x)	(((unsigned long)(x) - CONSISTENT_BASE) >> PAGE_SHIFT)
-#define CONSISTENT_PTE_INDEX(x) (((unsigned long)(x) - CONSISTENT_BASE) >> PGDIR_SHIFT)
-#define NUM_CONSISTENT_PTES (CONSISTENT_DMA_SIZE >> PGDIR_SHIFT)
+#define CONSISTENT_PTE_INDEX(x) (((unsigned long)(x) - CONSISTENT_BASE) >> PMD_SHIFT)
+#define NUM_CONSISTENT_PTES (CONSISTENT_DMA_SIZE >> PMD_SHIFT)
 
 /*
  * These are the page tables (2MB each) covering uncached, DMA consistent allocations
@@ -183,7 +183,7 @@ static int __init consistent_init(void)
 		}
 
 		consistent_pte[i++] = pte;
-		base += (1 << PGDIR_SHIFT);
+		base += (1 << PMD_SHIFT);
 	} while (base < CONSISTENT_END);
 
 	return ret;
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 594d677..dc26858 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -863,14 +863,14 @@ static inline void prepare_page_table(void)
 	/*
 	 * Clear out all the mappings below the kernel image.
 	 */
-	for (addr = 0; addr < MODULES_VADDR; addr += PGDIR_SIZE)
+	for (addr = 0; addr < MODULES_VADDR; addr += PMD_SIZE)
 		pmd_clear(pmd_off_k(addr));
 
 #ifdef CONFIG_XIP_KERNEL
 	/* The XIP kernel is mapped in the module area -- skip over it */
-	addr = ((unsigned long)_etext + PGDIR_SIZE - 1) & PGDIR_MASK;
+	addr = ((unsigned long)_etext + PMD_SIZE - 1) & PMD_MASK;
 #endif
-	for ( ; addr < PAGE_OFFSET; addr += PGDIR_SIZE)
+	for ( ; addr < PAGE_OFFSET; addr += PMD_SIZE)
 		pmd_clear(pmd_off_k(addr));
 
 	/*
@@ -885,7 +885,7 @@ static inline void prepare_page_table(void)
 	 * memory bank, up to the end of the vmalloc region.
 	 */
 	for (addr = __phys_to_virt(end);
-	     addr < VMALLOC_END; addr += PGDIR_SIZE)
+	     addr < VMALLOC_END; addr += PMD_SIZE)
 		pmd_clear(pmd_off_k(addr));
 }
 
@@ -926,7 +926,7 @@ static void __init devicemaps_init(struct machine_desc *mdesc)
 	 */
 	vectors_page = early_alloc(PAGE_SIZE);
 
-	for (addr = VMALLOC_END; addr; addr += PGDIR_SIZE)
+	for (addr = VMALLOC_END; addr; addr += PMD_SIZE)
 		pmd_clear(pmd_off_k(addr));
 
 	/*

* [PATCH v7 04/16] ARM: LPAE: Factor out 2-level page table definitions into separate files
  2011-08-10 15:03 [PATCH v7 00/16] ARM: Add support for the Large Physical Address Extensions Catalin Marinas
                   ` (2 preceding siblings ...)
  2011-08-10 15:03 ` [PATCH v7 03/16] ARM: LPAE: Use PMD_(SHIFT|SIZE|MASK) instead of PGDIR_* Catalin Marinas
@ 2011-08-10 15:03 ` Catalin Marinas
  2011-08-10 15:03 ` [PATCH v7 05/16] ARM: LPAE: Add (pte|pmd)val_t type definitions as u32 Catalin Marinas
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 46+ messages in thread
From: Catalin Marinas @ 2011-08-10 15:03 UTC
  To: linux-arm-kernel

This patch moves page table definitions from asm/page.h, asm/pgtable.h
and asm/pgtable-hwdef.h into corresponding *-2level* files.
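
As a sanity check on the constants being moved, the sketch below
restates the 2-level geometry: 512 Linux PTEs of 4KB cover one 2MB PGD
entry, and 2048 such entries cover the 4GB address space. PAGE_SHIFT of
12 is assumed, and the pgd_index2()/pte_index2() helpers are
hypothetical names for this example, not the kernel macros.

```c
#include <stdint.h>

#define PAGE_SHIFT	12	/* assumed 4KB pages */
#define PTRS_PER_PTE	512
#define PTRS_PER_PGD	2048
#define PGDIR_SHIFT	21

/* One Linux "PTE" table (512 entries of 4KB) spans one 2MB PGD entry. */
_Static_assert(((uint64_t)PTRS_PER_PTE << PAGE_SHIFT) ==
	       ((uint64_t)1 << PGDIR_SHIFT), "PTE table must span 2MB");

/* 2048 PGD entries of 2MB span the whole 32-bit address space. */
_Static_assert(((uint64_t)PTRS_PER_PGD << PGDIR_SHIFT) ==
	       ((uint64_t)1 << 32), "PGD must span 4GB");

/* Index of the 8-byte Linux PGD entry covering a virtual address. */
static unsigned int pgd_index2(uint32_t va)
{
	return va >> PGDIR_SHIFT;
}

/* Index into the 512-entry Linux PTE table for a virtual address. */
static unsigned int pte_index2(uint32_t va)
{
	return (va >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
}
```

For example, virtual address 0xC0123456 falls in PGD entry 1536 and
PTE slot 0x123.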

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/include/asm/page.h                 |   42 +--------
 arch/arm/include/asm/pgtable-2level-hwdef.h |   91 +++++++++++++++++
 arch/arm/include/asm/pgtable-2level-types.h |   64 ++++++++++++
 arch/arm/include/asm/pgtable-2level.h       |  143 +++++++++++++++++++++++++++
 arch/arm/include/asm/pgtable-hwdef.h        |   77 +--------------
 arch/arm/include/asm/pgtable.h              |  135 +-------------------------
 6 files changed, 302 insertions(+), 250 deletions(-)
 create mode 100644 arch/arm/include/asm/pgtable-2level-hwdef.h
 create mode 100644 arch/arm/include/asm/pgtable-2level-types.h
 create mode 100644 arch/arm/include/asm/pgtable-2level.h

diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
index ac75d08..ca94653 100644
--- a/arch/arm/include/asm/page.h
+++ b/arch/arm/include/asm/page.h
@@ -151,47 +151,7 @@ extern void __cpu_copy_user_highpage(struct page *to, struct page *from,
 #define clear_page(page)	memset((void *)(page), 0, PAGE_SIZE)
 extern void copy_page(void *to, const void *from);
 
-typedef unsigned long pteval_t;
-
-#undef STRICT_MM_TYPECHECKS
-
-#ifdef STRICT_MM_TYPECHECKS
-/*
- * These are used to make use of C type-checking..
- */
-typedef struct { pteval_t pte; } pte_t;
-typedef struct { unsigned long pmd; } pmd_t;
-typedef struct { unsigned long pgd[2]; } pgd_t;
-typedef struct { unsigned long pgprot; } pgprot_t;
-
-#define pte_val(x)      ((x).pte)
-#define pmd_val(x)      ((x).pmd)
-#define pgd_val(x)	((x).pgd[0])
-#define pgprot_val(x)   ((x).pgprot)
-
-#define __pte(x)        ((pte_t) { (x) } )
-#define __pmd(x)        ((pmd_t) { (x) } )
-#define __pgprot(x)     ((pgprot_t) { (x) } )
-
-#else
-/*
- * .. while these make it easier on the compiler
- */
-typedef pteval_t pte_t;
-typedef unsigned long pmd_t;
-typedef unsigned long pgd_t[2];
-typedef unsigned long pgprot_t;
-
-#define pte_val(x)      (x)
-#define pmd_val(x)      (x)
-#define pgd_val(x)	((x)[0])
-#define pgprot_val(x)   (x)
-
-#define __pte(x)        (x)
-#define __pmd(x)        (x)
-#define __pgprot(x)     (x)
-
-#endif /* STRICT_MM_TYPECHECKS */
+#include <asm/pgtable-2level-types.h>
 
 #endif /* CONFIG_MMU */
 
diff --git a/arch/arm/include/asm/pgtable-2level-hwdef.h b/arch/arm/include/asm/pgtable-2level-hwdef.h
new file mode 100644
index 0000000..436529c
--- /dev/null
+++ b/arch/arm/include/asm/pgtable-2level-hwdef.h
@@ -0,0 +1,91 @@
+/*
+ *  arch/arm/include/asm/pgtable-2level-hwdef.h
+ *
+ *  Copyright (C) 1995-2002 Russell King
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#ifndef _ASM_PGTABLE_2LEVEL_HWDEF_H
+#define _ASM_PGTABLE_2LEVEL_HWDEF_H
+
+/*
+ * Hardware page table definitions.
+ *
+ * + Level 1 descriptor (PMD)
+ *   - common
+ */
+#define PMD_TYPE_MASK		(3 << 0)
+#define PMD_TYPE_FAULT		(0 << 0)
+#define PMD_TYPE_TABLE		(1 << 0)
+#define PMD_TYPE_SECT		(2 << 0)
+#define PMD_BIT4		(1 << 4)
+#define PMD_DOMAIN(x)		((x) << 5)
+#define PMD_PROTECTION		(1 << 9)	/* v5 */
+/*
+ *   - section
+ */
+#define PMD_SECT_BUFFERABLE	(1 << 2)
+#define PMD_SECT_CACHEABLE	(1 << 3)
+#define PMD_SECT_XN		(1 << 4)	/* v6 */
+#define PMD_SECT_AP_WRITE	(1 << 10)
+#define PMD_SECT_AP_READ	(1 << 11)
+#define PMD_SECT_TEX(x)		((x) << 12)	/* v5 */
+#define PMD_SECT_APX		(1 << 15)	/* v6 */
+#define PMD_SECT_S		(1 << 16)	/* v6 */
+#define PMD_SECT_nG		(1 << 17)	/* v6 */
+#define PMD_SECT_SUPER		(1 << 18)	/* v6 */
+#define PMD_SECT_AF		(0)
+
+#define PMD_SECT_UNCACHED	(0)
+#define PMD_SECT_BUFFERED	(PMD_SECT_BUFFERABLE)
+#define PMD_SECT_WT		(PMD_SECT_CACHEABLE)
+#define PMD_SECT_WB		(PMD_SECT_CACHEABLE | PMD_SECT_BUFFERABLE)
+#define PMD_SECT_MINICACHE	(PMD_SECT_TEX(1) | PMD_SECT_CACHEABLE)
+#define PMD_SECT_WBWA		(PMD_SECT_TEX(1) | PMD_SECT_CACHEABLE | PMD_SECT_BUFFERABLE)
+#define PMD_SECT_NONSHARED_DEV	(PMD_SECT_TEX(2))
+
+/*
+ *   - coarse table (not used)
+ */
+
+/*
+ * + Level 2 descriptor (PTE)
+ *   - common
+ */
+#define PTE_TYPE_MASK		(3 << 0)
+#define PTE_TYPE_FAULT		(0 << 0)
+#define PTE_TYPE_LARGE		(1 << 0)
+#define PTE_TYPE_SMALL		(2 << 0)
+#define PTE_TYPE_EXT		(3 << 0)	/* v5 */
+#define PTE_BUFFERABLE		(1 << 2)
+#define PTE_CACHEABLE		(1 << 3)
+
+/*
+ *   - extended small page/tiny page
+ */
+#define PTE_EXT_XN		(1 << 0)	/* v6 */
+#define PTE_EXT_AP_MASK		(3 << 4)
+#define PTE_EXT_AP0		(1 << 4)
+#define PTE_EXT_AP1		(2 << 4)
+#define PTE_EXT_AP_UNO_SRO	(0 << 4)
+#define PTE_EXT_AP_UNO_SRW	(PTE_EXT_AP0)
+#define PTE_EXT_AP_URO_SRW	(PTE_EXT_AP1)
+#define PTE_EXT_AP_URW_SRW	(PTE_EXT_AP1|PTE_EXT_AP0)
+#define PTE_EXT_TEX(x)		((x) << 6)	/* v5 */
+#define PTE_EXT_APX		(1 << 9)	/* v6 */
+#define PTE_EXT_COHERENT	(1 << 9)	/* XScale3 */
+#define PTE_EXT_SHARED		(1 << 10)	/* v6 */
+#define PTE_EXT_NG		(1 << 11)	/* v6 */
+
+/*
+ *   - small page
+ */
+#define PTE_SMALL_AP_MASK	(0xff << 4)
+#define PTE_SMALL_AP_UNO_SRO	(0x00 << 4)
+#define PTE_SMALL_AP_UNO_SRW	(0x55 << 4)
+#define PTE_SMALL_AP_URO_SRW	(0xaa << 4)
+#define PTE_SMALL_AP_URW_SRW	(0xff << 4)
+
+#endif
diff --git a/arch/arm/include/asm/pgtable-2level-types.h b/arch/arm/include/asm/pgtable-2level-types.h
new file mode 100644
index 0000000..8a01f62
--- /dev/null
+++ b/arch/arm/include/asm/pgtable-2level-types.h
@@ -0,0 +1,64 @@
+/*
+ * arch/arm/include/asm/pgtable-2level-types.h
+ *
+ * Copyright (C) 1995-2003 Russell King
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+#ifndef _ASM_PGTABLE_2LEVEL_TYPES_H
+#define _ASM_PGTABLE_2LEVEL_TYPES_H
+
+typedef unsigned long pteval_t;
+
+#undef STRICT_MM_TYPECHECKS
+
+#ifdef STRICT_MM_TYPECHECKS
+/*
+ * These are used to make use of C type-checking..
+ */
+typedef struct { pteval_t pte; } pte_t;
+typedef struct { unsigned long pmd; } pmd_t;
+typedef struct { unsigned long pgd[2]; } pgd_t;
+typedef struct { unsigned long pgprot; } pgprot_t;
+
+#define pte_val(x)      ((x).pte)
+#define pmd_val(x)      ((x).pmd)
+#define pgd_val(x)	((x).pgd[0])
+#define pgprot_val(x)   ((x).pgprot)
+
+#define __pte(x)        ((pte_t) { (x) } )
+#define __pmd(x)        ((pmd_t) { (x) } )
+#define __pgprot(x)     ((pgprot_t) { (x) } )
+
+#else
+/*
+ * .. while these make it easier on the compiler
+ */
+typedef pteval_t pte_t;
+typedef unsigned long pmd_t;
+typedef unsigned long pgd_t[2];
+typedef unsigned long pgprot_t;
+
+#define pte_val(x)      (x)
+#define pmd_val(x)      (x)
+#define pgd_val(x)	((x)[0])
+#define pgprot_val(x)   (x)
+
+#define __pte(x)        (x)
+#define __pmd(x)        (x)
+#define __pgprot(x)     (x)
+
+#endif /* STRICT_MM_TYPECHECKS */
+
+#endif	/* _ASM_PGTABLE_2LEVEL_TYPES_H */
diff --git a/arch/arm/include/asm/pgtable-2level.h b/arch/arm/include/asm/pgtable-2level.h
new file mode 100644
index 0000000..470457e
--- /dev/null
+++ b/arch/arm/include/asm/pgtable-2level.h
@@ -0,0 +1,143 @@
+/*
+ *  arch/arm/include/asm/pgtable-2level.h
+ *
+ *  Copyright (C) 1995-2002 Russell King
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#ifndef _ASM_PGTABLE_2LEVEL_H
+#define _ASM_PGTABLE_2LEVEL_H
+
+/*
+ * Hardware-wise, we have a two level page table structure, where the first
+ * level has 4096 entries, and the second level has 256 entries.  Each entry
+ * is one 32-bit word.  Most of the bits in the second level entry are used
+ * by hardware, and there aren't any "accessed" and "dirty" bits.
+ *
+ * Linux on the other hand has a three level page table structure, which can
+ * be wrapped to fit a two level page table structure easily - using the PGD
+ * and PTE only.  However, Linux also expects one "PTE" table per page, and
+ * at least a "dirty" bit.
+ *
+ * Therefore, we tweak the implementation slightly - we tell Linux that we
+ * have 2048 entries in the first level, each of which is 8 bytes (iow, two
+ * hardware pointers to the second level.)  The second level contains two
+ * hardware PTE tables arranged contiguously, preceded by Linux versions
+ * which contain the state information Linux needs.  We, therefore, end up
+ * with 512 entries in the "PTE" level.
+ *
+ * This leads to the page tables having the following layout:
+ *
+ *    pgd             pte
+ * |        |
+ * +--------+
+ * |        |       +------------+ +0
+ * +- - - - +       | Linux pt 0 |
+ * |        |       +------------+ +1024
+ * +--------+ +0    | Linux pt 1 |
+ * |        |-----> +------------+ +2048
+ * +- - - - + +4    |  h/w pt 0  |
+ * |        |-----> +------------+ +3072
+ * +--------+ +8    |  h/w pt 1  |
+ * |        |       +------------+ +4096
+ *
+ * See L_PTE_xxx below for definitions of bits in the "Linux pt", and
+ * PTE_xxx for definitions of bits appearing in the "h/w pt".
+ *
+ * PMD_xxx definitions refer to bits in the first level page table.
+ *
+ * The "dirty" bit is emulated by only granting hardware write permission
+ * iff the page is marked "writable" and "dirty" in the Linux PTE.  This
+ * means that a write to a clean page will cause a permission fault, and
+ * the Linux MM layer will mark the page dirty via handle_pte_fault().
+ * For the hardware to notice the permission change, the TLB entry must
+ * be flushed, and ptep_set_access_flags() does that for us.
+ *
+ * The "accessed" or "young" bit is emulated by a similar method; we only
+ * allow accesses to the page if the "young" bit is set.  Accesses to the
+ * page will cause a fault, and handle_pte_fault() will set the young bit
+ * for us as long as the page is marked present in the corresponding Linux
+ * PTE entry.  Again, ptep_set_access_flags() will ensure that the TLB is
+ * up to date.
+ *
+ * However, when the "young" bit is cleared, we deny access to the page
+ * by clearing the hardware PTE.  Currently Linux does not flush the TLB
+ * for us in this case, which means the TLB will retain the translation
+ * until either the TLB entry is evicted under pressure, or a context
+ * switch which changes the user space mapping occurs.
+ */
+#define PTRS_PER_PTE		512
+#define PTRS_PER_PMD		1
+#define PTRS_PER_PGD		2048
+
+#define PTE_HWTABLE_PTRS	(PTRS_PER_PTE)
+#define PTE_HWTABLE_OFF		(PTE_HWTABLE_PTRS * sizeof(pte_t))
+#define PTE_HWTABLE_SIZE	(PTRS_PER_PTE * sizeof(u32))
+
+/*
+ * PMD_SHIFT determines the size of the area a second-level page table can map
+ * PGDIR_SHIFT determines what a third-level page table entry can map
+ */
+#define PMD_SHIFT		21
+#define PGDIR_SHIFT		21
+
+#define PMD_SIZE		(1UL << PMD_SHIFT)
+#define PMD_MASK		(~(PMD_SIZE-1))
+#define PGDIR_SIZE		(1UL << PGDIR_SHIFT)
+#define PGDIR_MASK		(~(PGDIR_SIZE-1))
+
+/*
+ * section address mask and size definitions.
+ */
+#define SECTION_SHIFT		20
+#define SECTION_SIZE		(1UL << SECTION_SHIFT)
+#define SECTION_MASK		(~(SECTION_SIZE-1))
+
+/*
+ * ARMv6 supersection address mask and size definitions.
+ */
+#define SUPERSECTION_SHIFT	24
+#define SUPERSECTION_SIZE	(1UL << SUPERSECTION_SHIFT)
+#define SUPERSECTION_MASK	(~(SUPERSECTION_SIZE-1))
+
+#define USER_PTRS_PER_PGD	(TASK_SIZE / PGDIR_SIZE)
+
+/*
+ * "Linux" PTE definitions.
+ *
+ * We keep two sets of PTEs - the hardware and the linux version.
+ * This allows greater flexibility in the way we map the Linux bits
+ * onto the hardware tables, and allows us to have YOUNG and DIRTY
+ * bits.
+ *
+ * The PTE table pointer refers to the hardware entries; the "Linux"
+ * entries are stored 1024 bytes below.
+ */
+#define L_PTE_PRESENT		(_AT(pteval_t, 1) << 0)
+#define L_PTE_YOUNG		(_AT(pteval_t, 1) << 1)
+#define L_PTE_FILE		(_AT(pteval_t, 1) << 2)	/* only when !PRESENT */
+#define L_PTE_DIRTY		(_AT(pteval_t, 1) << 6)
+#define L_PTE_RDONLY		(_AT(pteval_t, 1) << 7)
+#define L_PTE_USER		(_AT(pteval_t, 1) << 8)
+#define L_PTE_XN		(_AT(pteval_t, 1) << 9)
+#define L_PTE_SHARED		(_AT(pteval_t, 1) << 10)	/* shared(v6), coherent(xsc3) */
+
+/*
+ * These are the memory types, defined to be compatible with
+ * pre-ARMv6 CPUs cacheable and bufferable bits:   XXCB
+ */
+#define L_PTE_MT_UNCACHED	(_AT(pteval_t, 0x00) << 2)	/* 0000 */
+#define L_PTE_MT_BUFFERABLE	(_AT(pteval_t, 0x01) << 2)	/* 0001 */
+#define L_PTE_MT_WRITETHROUGH	(_AT(pteval_t, 0x02) << 2)	/* 0010 */
+#define L_PTE_MT_WRITEBACK	(_AT(pteval_t, 0x03) << 2)	/* 0011 */
+#define L_PTE_MT_MINICACHE	(_AT(pteval_t, 0x06) << 2)	/* 0110 (sa1100, xscale) */
+#define L_PTE_MT_WRITEALLOC	(_AT(pteval_t, 0x07) << 2)	/* 0111 */
+#define L_PTE_MT_DEV_SHARED	(_AT(pteval_t, 0x04) << 2)	/* 0100 */
+#define L_PTE_MT_DEV_NONSHARED	(_AT(pteval_t, 0x0c) << 2)	/* 1100 */
+#define L_PTE_MT_DEV_WC		(_AT(pteval_t, 0x09) << 2)	/* 1001 */
+#define L_PTE_MT_DEV_CACHED	(_AT(pteval_t, 0x0b) << 2)	/* 1011 */
+#define L_PTE_MT_MASK		(_AT(pteval_t, 0x0f) << 2)
+
+#endif /* _ASM_PGTABLE_2LEVEL_H */
diff --git a/arch/arm/include/asm/pgtable-hwdef.h b/arch/arm/include/asm/pgtable-hwdef.h
index fd1521d..1831111 100644
--- a/arch/arm/include/asm/pgtable-hwdef.h
+++ b/arch/arm/include/asm/pgtable-hwdef.h
@@ -10,81 +10,6 @@
 #ifndef _ASMARM_PGTABLE_HWDEF_H
 #define _ASMARM_PGTABLE_HWDEF_H
 
-/*
- * Hardware page table definitions.
- *
- * + Level 1 descriptor (PMD)
- *   - common
- */
-#define PMD_TYPE_MASK		(3 << 0)
-#define PMD_TYPE_FAULT		(0 << 0)
-#define PMD_TYPE_TABLE		(1 << 0)
-#define PMD_TYPE_SECT		(2 << 0)
-#define PMD_BIT4		(1 << 4)
-#define PMD_DOMAIN(x)		((x) << 5)
-#define PMD_PROTECTION		(1 << 9)	/* v5 */
-/*
- *   - section
- */
-#define PMD_SECT_BUFFERABLE	(1 << 2)
-#define PMD_SECT_CACHEABLE	(1 << 3)
-#define PMD_SECT_XN		(1 << 4)	/* v6 */
-#define PMD_SECT_AP_WRITE	(1 << 10)
-#define PMD_SECT_AP_READ	(1 << 11)
-#define PMD_SECT_TEX(x)		((x) << 12)	/* v5 */
-#define PMD_SECT_APX		(1 << 15)	/* v6 */
-#define PMD_SECT_S		(1 << 16)	/* v6 */
-#define PMD_SECT_nG		(1 << 17)	/* v6 */
-#define PMD_SECT_SUPER		(1 << 18)	/* v6 */
-
-#define PMD_SECT_UNCACHED	(0)
-#define PMD_SECT_BUFFERED	(PMD_SECT_BUFFERABLE)
-#define PMD_SECT_WT		(PMD_SECT_CACHEABLE)
-#define PMD_SECT_WB		(PMD_SECT_CACHEABLE | PMD_SECT_BUFFERABLE)
-#define PMD_SECT_MINICACHE	(PMD_SECT_TEX(1) | PMD_SECT_CACHEABLE)
-#define PMD_SECT_WBWA		(PMD_SECT_TEX(1) | PMD_SECT_CACHEABLE | PMD_SECT_BUFFERABLE)
-#define PMD_SECT_NONSHARED_DEV	(PMD_SECT_TEX(2))
-
-/*
- *   - coarse table (not used)
- */
-
-/*
- * + Level 2 descriptor (PTE)
- *   - common
- */
-#define PTE_TYPE_MASK		(3 << 0)
-#define PTE_TYPE_FAULT		(0 << 0)
-#define PTE_TYPE_LARGE		(1 << 0)
-#define PTE_TYPE_SMALL		(2 << 0)
-#define PTE_TYPE_EXT		(3 << 0)	/* v5 */
-#define PTE_BUFFERABLE		(1 << 2)
-#define PTE_CACHEABLE		(1 << 3)
-
-/*
- *   - extended small page/tiny page
- */
-#define PTE_EXT_XN		(1 << 0)	/* v6 */
-#define PTE_EXT_AP_MASK		(3 << 4)
-#define PTE_EXT_AP0		(1 << 4)
-#define PTE_EXT_AP1		(2 << 4)
-#define PTE_EXT_AP_UNO_SRO	(0 << 4)
-#define PTE_EXT_AP_UNO_SRW	(PTE_EXT_AP0)
-#define PTE_EXT_AP_URO_SRW	(PTE_EXT_AP1)
-#define PTE_EXT_AP_URW_SRW	(PTE_EXT_AP1|PTE_EXT_AP0)
-#define PTE_EXT_TEX(x)		((x) << 6)	/* v5 */
-#define PTE_EXT_APX		(1 << 9)	/* v6 */
-#define PTE_EXT_COHERENT	(1 << 9)	/* XScale3 */
-#define PTE_EXT_SHARED		(1 << 10)	/* v6 */
-#define PTE_EXT_NG		(1 << 11)	/* v6 */
-
-/*
- *   - small page
- */
-#define PTE_SMALL_AP_MASK	(0xff << 4)
-#define PTE_SMALL_AP_UNO_SRO	(0x00 << 4)
-#define PTE_SMALL_AP_UNO_SRW	(0x55 << 4)
-#define PTE_SMALL_AP_URO_SRW	(0xaa << 4)
-#define PTE_SMALL_AP_URW_SRW	(0xff << 4)
+#include <asm/pgtable-2level-hwdef.h>
 
 #endif
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index c2663f4..9618052 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -25,6 +25,8 @@
 #include <mach/vmalloc.h>
 #include <asm/pgtable-hwdef.h>
 
+#include <asm/pgtable-2level.h>
+
 /*
  * Just any arbitrary offset to the start of the vmalloc VM area: the
  * current 8MB value just means that there will be a 8MB "hole" after the
@@ -42,79 +44,6 @@
 #define VMALLOC_START		(((unsigned long)high_memory + VMALLOC_OFFSET) & ~(VMALLOC_OFFSET-1))
 #endif
 
-/*
- * Hardware-wise, we have a two level page table structure, where the first
- * level has 4096 entries, and the second level has 256 entries.  Each entry
- * is one 32-bit word.  Most of the bits in the second level entry are used
- * by hardware, and there aren't any "accessed" and "dirty" bits.
- *
- * Linux on the other hand has a three level page table structure, which can
- * be wrapped to fit a two level page table structure easily - using the PGD
- * and PTE only.  However, Linux also expects one "PTE" table per page, and
- * at least a "dirty" bit.
- *
- * Therefore, we tweak the implementation slightly - we tell Linux that we
- * have 2048 entries in the first level, each of which is 8 bytes (iow, two
- * hardware pointers to the second level.)  The second level contains two
- * hardware PTE tables arranged contiguously, preceded by Linux versions
- * which contain the state information Linux needs.  We, therefore, end up
- * with 512 entries in the "PTE" level.
- *
- * This leads to the page tables having the following layout:
- *
- *    pgd             pte
- * |        |
- * +--------+
- * |        |       +------------+ +0
- * +- - - - +       | Linux pt 0 |
- * |        |       +------------+ +1024
- * +--------+ +0    | Linux pt 1 |
- * |        |-----> +------------+ +2048
- * +- - - - + +4    |  h/w pt 0  |
- * |        |-----> +------------+ +3072
- * +--------+ +8    |  h/w pt 1  |
- * |        |       +------------+ +4096
- *
- * See L_PTE_xxx below for definitions of bits in the "Linux pt", and
- * PTE_xxx for definitions of bits appearing in the "h/w pt".
- *
- * PMD_xxx definitions refer to bits in the first level page table.
- *
- * The "dirty" bit is emulated by only granting hardware write permission
- * iff the page is marked "writable" and "dirty" in the Linux PTE.  This
- * means that a write to a clean page will cause a permission fault, and
- * the Linux MM layer will mark the page dirty via handle_pte_fault().
- * For the hardware to notice the permission change, the TLB entry must
- * be flushed, and ptep_set_access_flags() does that for us.
- *
- * The "accessed" or "young" bit is emulated by a similar method; we only
- * allow accesses to the page if the "young" bit is set.  Accesses to the
- * page will cause a fault, and handle_pte_fault() will set the young bit
- * for us as long as the page is marked present in the corresponding Linux
- * PTE entry.  Again, ptep_set_access_flags() will ensure that the TLB is
- * up to date.
- *
- * However, when the "young" bit is cleared, we deny access to the page
- * by clearing the hardware PTE.  Currently Linux does not flush the TLB
- * for us in this case, which means the TLB will retain the transation
- * until either the TLB entry is evicted under pressure, or a context
- * switch which changes the user space mapping occurs.
- */
-#define PTRS_PER_PTE		512
-#define PTRS_PER_PMD		1
-#define PTRS_PER_PGD		2048
-
-#define PTE_HWTABLE_PTRS	(PTRS_PER_PTE)
-#define PTE_HWTABLE_OFF		(PTE_HWTABLE_PTRS * sizeof(pte_t))
-#define PTE_HWTABLE_SIZE	(PTRS_PER_PTE * sizeof(u32))
-
-/*
- * PMD_SHIFT determines the size of the area a second-level page table can map
- * PGDIR_SHIFT determines what a third-level page table entry can map
- */
-#define PMD_SHIFT		21
-#define PGDIR_SHIFT		21
-
 #define LIBRARY_TEXT_START	0x0c000000
 
 #ifndef __ASSEMBLY__
@@ -125,12 +54,6 @@ extern void __pgd_error(const char *file, int line, pgd_t);
 #define pte_ERROR(pte)		__pte_error(__FILE__, __LINE__, pte)
 #define pmd_ERROR(pmd)		__pmd_error(__FILE__, __LINE__, pmd)
 #define pgd_ERROR(pgd)		__pgd_error(__FILE__, __LINE__, pgd)
-#endif /* !__ASSEMBLY__ */
-
-#define PMD_SIZE		(1UL << PMD_SHIFT)
-#define PMD_MASK		(~(PMD_SIZE-1))
-#define PGDIR_SIZE		(1UL << PGDIR_SHIFT)
-#define PGDIR_MASK		(~(PGDIR_SIZE-1))
 
 /*
  * This is the lowest virtual address we can permit any user space
@@ -139,60 +62,6 @@ extern void __pgd_error(const char *file, int line, pgd_t);
  */
 #define FIRST_USER_ADDRESS	PAGE_SIZE
 
-#define USER_PTRS_PER_PGD	(TASK_SIZE / PGDIR_SIZE)
-
-/*
- * section address mask and size definitions.
- */
-#define SECTION_SHIFT		20
-#define SECTION_SIZE		(1UL << SECTION_SHIFT)
-#define SECTION_MASK		(~(SECTION_SIZE-1))
-
-/*
- * ARMv6 supersection address mask and size definitions.
- */
-#define SUPERSECTION_SHIFT	24
-#define SUPERSECTION_SIZE	(1UL << SUPERSECTION_SHIFT)
-#define SUPERSECTION_MASK	(~(SUPERSECTION_SIZE-1))
-
-/*
- * "Linux" PTE definitions.
- *
- * We keep two sets of PTEs - the hardware and the linux version.
- * This allows greater flexibility in the way we map the Linux bits
- * onto the hardware tables, and allows us to have YOUNG and DIRTY
- * bits.
- *
- * The PTE table pointer refers to the hardware entries; the "Linux"
- * entries are stored 1024 bytes below.
- */
-#define L_PTE_PRESENT		(_AT(pteval_t, 1) << 0)
-#define L_PTE_YOUNG		(_AT(pteval_t, 1) << 1)
-#define L_PTE_FILE		(_AT(pteval_t, 1) << 2)	/* only when !PRESENT */
-#define L_PTE_DIRTY		(_AT(pteval_t, 1) << 6)
-#define L_PTE_RDONLY		(_AT(pteval_t, 1) << 7)
-#define L_PTE_USER		(_AT(pteval_t, 1) << 8)
-#define L_PTE_XN		(_AT(pteval_t, 1) << 9)
-#define L_PTE_SHARED		(_AT(pteval_t, 1) << 10)	/* shared(v6), coherent(xsc3) */
-
-/*
- * These are the memory types, defined to be compatible with
- * pre-ARMv6 CPUs cacheable and bufferable bits:   XXCB
- */
-#define L_PTE_MT_UNCACHED	(_AT(pteval_t, 0x00) << 2)	/* 0000 */
-#define L_PTE_MT_BUFFERABLE	(_AT(pteval_t, 0x01) << 2)	/* 0001 */
-#define L_PTE_MT_WRITETHROUGH	(_AT(pteval_t, 0x02) << 2)	/* 0010 */
-#define L_PTE_MT_WRITEBACK	(_AT(pteval_t, 0x03) << 2)	/* 0011 */
-#define L_PTE_MT_MINICACHE	(_AT(pteval_t, 0x06) << 2)	/* 0110 (sa1100, xscale) */
-#define L_PTE_MT_WRITEALLOC	(_AT(pteval_t, 0x07) << 2)	/* 0111 */
-#define L_PTE_MT_DEV_SHARED	(_AT(pteval_t, 0x04) << 2)	/* 0100 */
-#define L_PTE_MT_DEV_NONSHARED	(_AT(pteval_t, 0x0c) << 2)	/* 1100 */
-#define L_PTE_MT_DEV_WC		(_AT(pteval_t, 0x09) << 2)	/* 1001 */
-#define L_PTE_MT_DEV_CACHED	(_AT(pteval_t, 0x0b) << 2)	/* 1011 */
-#define L_PTE_MT_MASK		(_AT(pteval_t, 0x0f) << 2)
-
-#ifndef __ASSEMBLY__
-
 /*
  * The pgprot_* and protection_map entries will be fixed up in runtime
  * to include the cachable and bufferable bits based on memory policy,

* [PATCH v7 05/16] ARM: LPAE: Add (pte|pmd)val_t type definitions as u32
  2011-08-10 15:03 [PATCH v7 00/16] ARM: Add support for the Large Physical Address Extensions Catalin Marinas
                   ` (3 preceding siblings ...)
  2011-08-10 15:03 ` [PATCH v7 04/16] ARM: LPAE: Factor out 2-level page table definitions into separate files Catalin Marinas
@ 2011-08-10 15:03 ` Catalin Marinas
  2011-08-10 15:03 ` [PATCH v7 06/16] ARM: LPAE: Use a mask for physical addresses in page table entries Catalin Marinas
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 46+ messages in thread
From: Catalin Marinas @ 2011-08-10 15:03 UTC (permalink / raw)
  To: linux-arm-kernel

This patch defines the (pte|pmd)val_t as u32 and changes the page table
types to be based on these. The PMD bits are converted to the
corresponding type using the _AT macro.

The argument of flush_pmd_entry/clean_pmd_entry is changed to (void *)
so that both functions can be used with PGD as well as PMD pointers,
avoiding code duplication.
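
As a rough illustration of the _AT conversion described above, the
following stand-alone sketch builds a first-level table descriptor from
typed constants. The _AT definition mirrors the one in <linux/const.h>;
the stdint typedefs and the make_table_desc() helper are assumptions
made for this example only and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Local stand-in for the kernel macro: _AT() applies the cast in C and
 * drops it when the header is included from assembly.
 */
#ifdef __ASSEMBLY__
#define _AT(T, X)	X
#else
#define _AT(T, X)	((T)(X))
#endif

/* The classic 2-level format uses 32-bit descriptors. */
typedef uint32_t pteval_t;
typedef uint32_t pmdval_t;

#define PMD_TYPE_TABLE		(_AT(pmdval_t, 1) << 0)
#define PMD_BIT4		(_AT(pmdval_t, 1) << 4)
#define PMD_DOMAIN(x)		(_AT(pmdval_t, (x)) << 5)

/*
 * Hypothetical helper (not in the patch): combine the physical address
 * of a hardware PTE table with the table type, bit 4 and domain bits.
 */
static inline pmdval_t make_table_desc(uint32_t pte_phys, unsigned int domain)
{
	/* a classic second-level table is 1KB aligned */
	assert((pte_phys & 0x3ff) == 0);
	return pte_phys | PMD_TYPE_TABLE | PMD_BIT4 | PMD_DOMAIN(domain);
}
```

Because every constant already carries the pmdval_t type, widening the
typedef later (as the 3-level format does) should not require touching
the individual bit definitions.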

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/include/asm/pgalloc.h              |    4 +-
 arch/arm/include/asm/pgtable-2level-hwdef.h |   82 +++++++++++++-------------
 arch/arm/include/asm/pgtable-2level-types.h |   17 +++--
 arch/arm/include/asm/tlbflush.h             |    4 +-
 arch/arm/mm/mm.h                            |    4 +-
 arch/arm/mm/mmu.c                           |    4 +-
 6 files changed, 59 insertions(+), 56 deletions(-)

diff --git a/arch/arm/include/asm/pgalloc.h b/arch/arm/include/asm/pgalloc.h
index a87d4cf..7418894 100644
--- a/arch/arm/include/asm/pgalloc.h
+++ b/arch/arm/include/asm/pgalloc.h
@@ -105,9 +105,9 @@ static inline void pte_free(struct mm_struct *mm, pgtable_t pte)
 }
 
 static inline void __pmd_populate(pmd_t *pmdp, phys_addr_t pte,
-	unsigned long prot)
+				  pmdval_t prot)
 {
-	unsigned long pmdval = (pte + PTE_HWTABLE_OFF) | prot;
+	pmdval_t pmdval = (pte + PTE_HWTABLE_OFF) | prot;
 	pmdp[0] = __pmd(pmdval);
 	pmdp[1] = __pmd(pmdval + 256 * sizeof(pte_t));
 	flush_pmd_entry(pmdp);
diff --git a/arch/arm/include/asm/pgtable-2level-hwdef.h b/arch/arm/include/asm/pgtable-2level-hwdef.h
index 436529c..2b52c40 100644
--- a/arch/arm/include/asm/pgtable-2level-hwdef.h
+++ b/arch/arm/include/asm/pgtable-2level-hwdef.h
@@ -16,29 +16,29 @@
  * + Level 1 descriptor (PMD)
  *   - common
  */
-#define PMD_TYPE_MASK		(3 << 0)
-#define PMD_TYPE_FAULT		(0 << 0)
-#define PMD_TYPE_TABLE		(1 << 0)
-#define PMD_TYPE_SECT		(2 << 0)
-#define PMD_BIT4		(1 << 4)
-#define PMD_DOMAIN(x)		((x) << 5)
-#define PMD_PROTECTION		(1 << 9)	/* v5 */
+#define PMD_TYPE_MASK		(_AT(pmdval_t, 3) << 0)
+#define PMD_TYPE_FAULT		(_AT(pmdval_t, 0) << 0)
+#define PMD_TYPE_TABLE		(_AT(pmdval_t, 1) << 0)
+#define PMD_TYPE_SECT		(_AT(pmdval_t, 2) << 0)
+#define PMD_BIT4		(_AT(pmdval_t, 1) << 4)
+#define PMD_DOMAIN(x)		(_AT(pmdval_t, (x)) << 5)
+#define PMD_PROTECTION		(_AT(pmdval_t, 1) << 9)		/* v5 */
 /*
  *   - section
  */
-#define PMD_SECT_BUFFERABLE	(1 << 2)
-#define PMD_SECT_CACHEABLE	(1 << 3)
-#define PMD_SECT_XN		(1 << 4)	/* v6 */
-#define PMD_SECT_AP_WRITE	(1 << 10)
-#define PMD_SECT_AP_READ	(1 << 11)
-#define PMD_SECT_TEX(x)		((x) << 12)	/* v5 */
-#define PMD_SECT_APX		(1 << 15)	/* v6 */
-#define PMD_SECT_S		(1 << 16)	/* v6 */
-#define PMD_SECT_nG		(1 << 17)	/* v6 */
-#define PMD_SECT_SUPER		(1 << 18)	/* v6 */
-#define PMD_SECT_AF		(0)
+#define PMD_SECT_BUFFERABLE	(_AT(pmdval_t, 1) << 2)
+#define PMD_SECT_CACHEABLE	(_AT(pmdval_t, 1) << 3)
+#define PMD_SECT_XN		(_AT(pmdval_t, 1) << 4)		/* v6 */
+#define PMD_SECT_AP_WRITE	(_AT(pmdval_t, 1) << 10)
+#define PMD_SECT_AP_READ	(_AT(pmdval_t, 1) << 11)
+#define PMD_SECT_TEX(x)		(_AT(pmdval_t, (x)) << 12)	/* v5 */
+#define PMD_SECT_APX		(_AT(pmdval_t, 1) << 15)	/* v6 */
+#define PMD_SECT_S		(_AT(pmdval_t, 1) << 16)	/* v6 */
+#define PMD_SECT_nG		(_AT(pmdval_t, 1) << 17)	/* v6 */
+#define PMD_SECT_SUPER		(_AT(pmdval_t, 1) << 18)	/* v6 */
+#define PMD_SECT_AF		(_AT(pmdval_t, 0))
 
-#define PMD_SECT_UNCACHED	(0)
+#define PMD_SECT_UNCACHED	(_AT(pmdval_t, 0))
 #define PMD_SECT_BUFFERED	(PMD_SECT_BUFFERABLE)
 #define PMD_SECT_WT		(PMD_SECT_CACHEABLE)
 #define PMD_SECT_WB		(PMD_SECT_CACHEABLE | PMD_SECT_BUFFERABLE)
@@ -54,38 +54,38 @@
  * + Level 2 descriptor (PTE)
  *   - common
  */
-#define PTE_TYPE_MASK		(3 << 0)
-#define PTE_TYPE_FAULT		(0 << 0)
-#define PTE_TYPE_LARGE		(1 << 0)
-#define PTE_TYPE_SMALL		(2 << 0)
-#define PTE_TYPE_EXT		(3 << 0)	/* v5 */
-#define PTE_BUFFERABLE		(1 << 2)
-#define PTE_CACHEABLE		(1 << 3)
+#define PTE_TYPE_MASK		(_AT(pteval_t, 3) << 0)
+#define PTE_TYPE_FAULT		(_AT(pteval_t, 0) << 0)
+#define PTE_TYPE_LARGE		(_AT(pteval_t, 1) << 0)
+#define PTE_TYPE_SMALL		(_AT(pteval_t, 2) << 0)
+#define PTE_TYPE_EXT		(_AT(pteval_t, 3) << 0)		/* v5 */
+#define PTE_BUFFERABLE		(_AT(pteval_t, 1) << 2)
+#define PTE_CACHEABLE		(_AT(pteval_t, 1) << 3)
 
 /*
  *   - extended small page/tiny page
  */
-#define PTE_EXT_XN		(1 << 0)	/* v6 */
-#define PTE_EXT_AP_MASK		(3 << 4)
-#define PTE_EXT_AP0		(1 << 4)
-#define PTE_EXT_AP1		(2 << 4)
-#define PTE_EXT_AP_UNO_SRO	(0 << 4)
+#define PTE_EXT_XN		(_AT(pteval_t, 1) << 0)		/* v6 */
+#define PTE_EXT_AP_MASK		(_AT(pteval_t, 3) << 4)
+#define PTE_EXT_AP0		(_AT(pteval_t, 1) << 4)
+#define PTE_EXT_AP1		(_AT(pteval_t, 2) << 4)
+#define PTE_EXT_AP_UNO_SRO	(_AT(pteval_t, 0) << 4)
 #define PTE_EXT_AP_UNO_SRW	(PTE_EXT_AP0)
 #define PTE_EXT_AP_URO_SRW	(PTE_EXT_AP1)
 #define PTE_EXT_AP_URW_SRW	(PTE_EXT_AP1|PTE_EXT_AP0)
-#define PTE_EXT_TEX(x)		((x) << 6)	/* v5 */
-#define PTE_EXT_APX		(1 << 9)	/* v6 */
-#define PTE_EXT_COHERENT	(1 << 9)	/* XScale3 */
-#define PTE_EXT_SHARED		(1 << 10)	/* v6 */
-#define PTE_EXT_NG		(1 << 11)	/* v6 */
+#define PTE_EXT_TEX(x)		(_AT(pteval_t, (x)) << 6)	/* v5 */
+#define PTE_EXT_APX		(_AT(pteval_t, 1) << 9)		/* v6 */
+#define PTE_EXT_COHERENT	(_AT(pteval_t, 1) << 9)		/* XScale3 */
+#define PTE_EXT_SHARED		(_AT(pteval_t, 1) << 10)	/* v6 */
+#define PTE_EXT_NG		(_AT(pteval_t, 1) << 11)	/* v6 */
 
 /*
  *   - small page
  */
-#define PTE_SMALL_AP_MASK	(0xff << 4)
-#define PTE_SMALL_AP_UNO_SRO	(0x00 << 4)
-#define PTE_SMALL_AP_UNO_SRW	(0x55 << 4)
-#define PTE_SMALL_AP_URO_SRW	(0xaa << 4)
-#define PTE_SMALL_AP_URW_SRW	(0xff << 4)
+#define PTE_SMALL_AP_MASK	(_AT(pteval_t, 0xff) << 4)
+#define PTE_SMALL_AP_UNO_SRO	(_AT(pteval_t, 0x00) << 4)
+#define PTE_SMALL_AP_UNO_SRW	(_AT(pteval_t, 0x55) << 4)
+#define PTE_SMALL_AP_URO_SRW	(_AT(pteval_t, 0xaa) << 4)
+#define PTE_SMALL_AP_URW_SRW	(_AT(pteval_t, 0xff) << 4)
 
 #endif
diff --git a/arch/arm/include/asm/pgtable-2level-types.h b/arch/arm/include/asm/pgtable-2level-types.h
index 8a01f62..66cb5b0 100644
--- a/arch/arm/include/asm/pgtable-2level-types.h
+++ b/arch/arm/include/asm/pgtable-2level-types.h
@@ -19,7 +19,10 @@
 #ifndef _ASM_PGTABLE_2LEVEL_TYPES_H
 #define _ASM_PGTABLE_2LEVEL_TYPES_H
 
-typedef unsigned long pteval_t;
+#include <asm/types.h>
+
+typedef u32 pteval_t;
+typedef u32 pmdval_t;
 
 #undef STRICT_MM_TYPECHECKS
 
@@ -28,9 +31,9 @@ typedef unsigned long pteval_t;
  * These are used to make use of C type-checking..
  */
 typedef struct { pteval_t pte; } pte_t;
-typedef struct { unsigned long pmd; } pmd_t;
-typedef struct { unsigned long pgd[2]; } pgd_t;
-typedef struct { unsigned long pgprot; } pgprot_t;
+typedef struct { pmdval_t pmd; } pmd_t;
+typedef struct { pmdval_t pgd[2]; } pgd_t;
+typedef struct { pteval_t pgprot; } pgprot_t;
 
 #define pte_val(x)      ((x).pte)
 #define pmd_val(x)      ((x).pmd)
@@ -46,9 +49,9 @@ typedef struct { unsigned long pgprot; } pgprot_t;
  * .. while these make it easier on the compiler
  */
 typedef pteval_t pte_t;
-typedef unsigned long pmd_t;
-typedef unsigned long pgd_t[2];
-typedef unsigned long pgprot_t;
+typedef pmdval_t pmd_t;
+typedef pmdval_t pgd_t[2];
+typedef pteval_t pgprot_t;
 
 #define pte_val(x)      (x)
 #define pmd_val(x)      (x)
diff --git a/arch/arm/include/asm/tlbflush.h b/arch/arm/include/asm/tlbflush.h
index 8077145..02b2f82 100644
--- a/arch/arm/include/asm/tlbflush.h
+++ b/arch/arm/include/asm/tlbflush.h
@@ -471,7 +471,7 @@ static inline void local_flush_tlb_kernel_page(unsigned long kaddr)
  *	these operations.  This is typically used when we are removing
  *	PMD entries.
  */
-static inline void flush_pmd_entry(pmd_t *pmd)
+static inline void flush_pmd_entry(void *pmd)
 {
 	const unsigned int __tlb_flag = __cpu_tlb_flags;
 
@@ -487,7 +487,7 @@ static inline void flush_pmd_entry(pmd_t *pmd)
 		dsb();
 }
 
-static inline void clean_pmd_entry(pmd_t *pmd)
+static inline void clean_pmd_entry(void *pmd)
 {
 	const unsigned int __tlb_flag = __cpu_tlb_flags;
 
diff --git a/arch/arm/mm/mm.h b/arch/arm/mm/mm.h
index 0105667..ad7cce3 100644
--- a/arch/arm/mm/mm.h
+++ b/arch/arm/mm/mm.h
@@ -12,8 +12,8 @@ static inline pmd_t *pmd_off_k(unsigned long virt)
 
 struct mem_type {
 	pteval_t prot_pte;
-	unsigned int prot_l1;
-	unsigned int prot_sect;
+	pmdval_t prot_l1;
+	pmdval_t prot_sect;
 	unsigned int domain;
 };
 
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index dc26858..c990280 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -60,7 +60,7 @@ EXPORT_SYMBOL(pgprot_kernel);
 struct cachepolicy {
 	const char	policy[16];
 	unsigned int	cr_mask;
-	unsigned int	pmd;
+	pmdval_t	pmd;
 	pteval_t	pte;
 };
 
@@ -288,7 +288,7 @@ static void __init build_mem_type_table(void)
 {
 	struct cachepolicy *cp;
 	unsigned int cr = get_cr();
-	unsigned int user_pgprot, kern_pgprot, vecs_pgprot;
+	pteval_t user_pgprot, kern_pgprot, vecs_pgprot;
 	int cpu_arch = cpu_architecture();
 	int i;
 

* [PATCH v7 06/16] ARM: LPAE: Use a mask for physical addresses in page table entries
  2011-08-10 15:03 [PATCH v7 00/16] ARM: Add support for the Large Physical Address Extensions Catalin Marinas
                   ` (4 preceding siblings ...)
  2011-08-10 15:03 ` [PATCH v7 05/16] ARM: LPAE: Add (pte|pmd)val_t type definitions as u32 Catalin Marinas
@ 2011-08-10 15:03 ` Catalin Marinas
  2011-08-10 15:03 ` [PATCH v7 07/16] ARM: LPAE: Introduce the 3-level page table format definitions Catalin Marinas
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 46+ messages in thread
From: Catalin Marinas @ 2011-08-10 15:03 UTC (permalink / raw)
  To: linux-arm-kernel

With LPAE, physical addresses are 40 bits wide while page table entries
are 64 bits wide, so the address has to be masked out of an entry. This
patch introduces PHYS_MASK and defines it as ~0UL for the 2-level page
table format, where the masking is a no-op.
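
To show why the mask matters once entries become 64-bit, here is a
minimal sketch that assumes an LPAE-sized entry with a 40-bit address
field and an attribute bit in the upper word. The helper names
(mk_pte_sketch, pte_pfn_masked, pte_pfn_unmasked) are hypothetical and
exist only for this example; the bit positions follow the LPAE
descriptor layout as I understand it.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT	12

typedef uint64_t pteval_t;		/* assumed: 64-bit LPAE entry */

#define PHYS_MASK_SHIFT	40
#define PHYS_MASK	((1ULL << PHYS_MASK_SHIFT) - 1)
#define PTE_EXT_XN	((pteval_t)1 << 54)	/* attribute above the address */

/* Hypothetical helper: combine a physical address with attribute bits. */
static inline pteval_t mk_pte_sketch(uint64_t phys, pteval_t prot)
{
	assert((phys & ~PHYS_MASK) == 0);	/* must fit in 40 bits */
	return phys | prot;
}

/* Without the mask, upper attribute bits leak into the frame number. */
static inline uint64_t pte_pfn_unmasked(pteval_t pte)
{
	return pte >> PAGE_SHIFT;
}

/* With the mask, only the address bits contribute. */
static inline uint64_t pte_pfn_masked(pteval_t pte)
{
	return (pte & PHYS_MASK) >> PAGE_SHIFT;
}
```

With 32-bit entries and PHYS_MASK set to ~0UL both helpers would return
the same value, which is why the classic format is unaffected.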

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/include/asm/pgtable-2level-hwdef.h |    2 ++
 arch/arm/include/asm/pgtable.h              |    6 +++---
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/pgtable-2level-hwdef.h b/arch/arm/include/asm/pgtable-2level-hwdef.h
index 2b52c40..5cfba15 100644
--- a/arch/arm/include/asm/pgtable-2level-hwdef.h
+++ b/arch/arm/include/asm/pgtable-2level-hwdef.h
@@ -88,4 +88,6 @@
 #define PTE_SMALL_AP_URO_SRW	(_AT(pteval_t, 0xaa) << 4)
 #define PTE_SMALL_AP_URW_SRW	(_AT(pteval_t, 0xff) << 4)
 
+#define PHYS_MASK		(~0UL)
+
 #endif
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index 9618052..8f9e1dd 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -199,10 +199,10 @@ static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)
 
 static inline pte_t *pmd_page_vaddr(pmd_t pmd)
 {
-	return __va(pmd_val(pmd) & PAGE_MASK);
+	return __va(pmd_val(pmd) & PHYS_MASK & (s32)PAGE_MASK);
 }
 
-#define pmd_page(pmd)		pfn_to_page(__phys_to_pfn(pmd_val(pmd)))
+#define pmd_page(pmd)		pfn_to_page(__phys_to_pfn(pmd_val(pmd) & PHYS_MASK))
 
 /* we don't need complex calculations here as the pmd is folded into the pgd */
 #define pmd_addr_end(addr,end)	(end)
@@ -223,7 +223,7 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
 #define pte_offset_map(pmd,addr)	(__pte_map(pmd) + pte_index(addr))
 #define pte_unmap(pte)			__pte_unmap(pte)
 
-#define pte_pfn(pte)		(pte_val(pte) >> PAGE_SHIFT)
+#define pte_pfn(pte)		((pte_val(pte) & PHYS_MASK) >> PAGE_SHIFT)
 #define pfn_pte(pfn,prot)	__pte(__pfn_to_phys(pfn) | pgprot_val(prot))
 
 #define pte_page(pte)		pfn_to_page(pte_pfn(pte))

* [PATCH v7 07/16] ARM: LPAE: Introduce the 3-level page table format definitions
  2011-08-10 15:03 [PATCH v7 00/16] ARM: Add support for the Large Physical Address Extensions Catalin Marinas
                   ` (5 preceding siblings ...)
  2011-08-10 15:03 ` [PATCH v7 06/16] ARM: LPAE: Use a mask for physical addresses in page table entries Catalin Marinas
@ 2011-08-10 15:03 ` Catalin Marinas
  2011-08-10 15:03 ` [PATCH v7 08/16] ARM: LPAE: Page table maintenance for the 3-level format Catalin Marinas
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 46+ messages in thread
From: Catalin Marinas @ 2011-08-10 15:03 UTC (permalink / raw)
  To: linux-arm-kernel

This patch introduces the pgtable-3level*.h files with definitions
specific to the LPAE page table format (3 levels of page tables).

Each table is 4KB and has 512 64-bit entries. An entry can point to a
40-bit physical address. The young, write and exec software bits share
the corresponding hardware bits (negated). Other software bits use spare
bits in the PTE.
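
The table geometry above can be sketched as a split of a 32-bit
virtual address into three indices. The shift and PTRS_PER_* values are
the ones defined in pgtable-3level.h in this patch; the lpae_index
structure and lpae_split() are illustrative names that the patch does
not define.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT	12
#define PMD_SHIFT	21
#define PGDIR_SHIFT	30

#define PTRS_PER_PTE	512
#define PTRS_PER_PMD	512
#define PTRS_PER_PGD	4

/* Illustrative container for the three table indices. */
struct lpae_index {
	unsigned int pgd;	/* selects one of four 1GB ranges */
	unsigned int pmd;	/* selects a 2MB range within it */
	unsigned int pte;	/* selects a 4KB page within that */
};

static inline struct lpae_index lpae_split(uint32_t vaddr)
{
	struct lpae_index idx = {
		.pgd = vaddr >> PGDIR_SHIFT,
		.pmd = (vaddr >> PMD_SHIFT) & (PTRS_PER_PMD - 1),
		.pte = (vaddr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1),
	};

	/* a 32-bit input address never needs more than four PGD entries */
	assert(idx.pgd < PTRS_PER_PGD);
	return idx;
}
```

Each of the lower two levels consumes 9 address bits (512 entries), and
512 entries of 8 bytes fill exactly one 4KB page.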

The patch also changes some variable types from unsigned long or int to
pteval_t or pgprot_t.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/include/asm/page.h                 |    4 +
 arch/arm/include/asm/pgtable-3level-hwdef.h |   82 +++++++++++++++++++++
 arch/arm/include/asm/pgtable-3level-types.h |   70 ++++++++++++++++++
 arch/arm/include/asm/pgtable-3level.h       |  102 +++++++++++++++++++++++++++
 arch/arm/include/asm/pgtable-hwdef.h        |    4 +
 arch/arm/include/asm/pgtable.h              |    4 +
 6 files changed, 266 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm/include/asm/pgtable-3level-hwdef.h
 create mode 100644 arch/arm/include/asm/pgtable-3level-types.h
 create mode 100644 arch/arm/include/asm/pgtable-3level.h

diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
index ca94653..97b440c 100644
--- a/arch/arm/include/asm/page.h
+++ b/arch/arm/include/asm/page.h
@@ -151,7 +151,11 @@ extern void __cpu_copy_user_highpage(struct page *to, struct page *from,
 #define clear_page(page)	memset((void *)(page), 0, PAGE_SIZE)
 extern void copy_page(void *to, const void *from);
 
+#ifdef CONFIG_ARM_LPAE
+#include <asm/pgtable-3level-types.h>
+#else
 #include <asm/pgtable-2level-types.h>
+#endif
 
 #endif /* CONFIG_MMU */
 
diff --git a/arch/arm/include/asm/pgtable-3level-hwdef.h b/arch/arm/include/asm/pgtable-3level-hwdef.h
new file mode 100644
index 0000000..7c238a3
--- /dev/null
+++ b/arch/arm/include/asm/pgtable-3level-hwdef.h
@@ -0,0 +1,82 @@
+/*
+ * arch/arm/include/asm/pgtable-3level-hwdef.h
+ *
+ * Copyright (C) 2011 ARM Ltd.
+ * Author: Catalin Marinas <catalin.marinas@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+#ifndef _ASM_PGTABLE_3LEVEL_HWDEF_H
+#define _ASM_PGTABLE_3LEVEL_HWDEF_H
+
+/*
+ * Hardware page table definitions.
+ *
+ * + Level 1/2 descriptor
+ *   - common
+ */
+#define PMD_TYPE_MASK		(_AT(pmdval_t, 3) << 0)
+#define PMD_TYPE_FAULT		(_AT(pmdval_t, 0) << 0)
+#define PMD_TYPE_TABLE		(_AT(pmdval_t, 3) << 0)
+#define PMD_TYPE_SECT		(_AT(pmdval_t, 1) << 0)
+#define PMD_BIT4		(_AT(pmdval_t, 0))
+#define PMD_DOMAIN(x)		(_AT(pmdval_t, 0))
+
+/*
+ *   - section
+ */
+#define PMD_SECT_BUFFERABLE	(_AT(pmdval_t, 1) << 2)
+#define PMD_SECT_CACHEABLE	(_AT(pmdval_t, 1) << 3)
+#define PMD_SECT_S		(_AT(pmdval_t, 3) << 8)
+#define PMD_SECT_AF		(_AT(pmdval_t, 1) << 10)
+#define PMD_SECT_nG		(_AT(pmdval_t, 1) << 11)
+#ifdef __ASSEMBLY__
+/* avoid 'shift count out of range' warning */
+#define PMD_SECT_XN		(0)
+#else
+#define PMD_SECT_XN		((pmdval_t)1 << 54)
+#endif
+#define PMD_SECT_AP_WRITE	(_AT(pmdval_t, 0))
+#define PMD_SECT_AP_READ	(_AT(pmdval_t, 0))
+#define PMD_SECT_TEX(x)		(_AT(pmdval_t, 0))
+
+/*
+ * AttrIndx[2:0] encoding (mapping attributes defined in the MAIR* registers).
+ */
+#define PMD_SECT_UNCACHED	(_AT(pmdval_t, 0) << 2)	/* strongly ordered */
+#define PMD_SECT_BUFFERED	(_AT(pmdval_t, 1) << 2)	/* normal non-cacheable */
+#define PMD_SECT_WT		(_AT(pmdval_t, 2) << 2)	/* normal inner write-through */
+#define PMD_SECT_WB		(_AT(pmdval_t, 3) << 2)	/* normal inner write-back */
+#define PMD_SECT_WBWA		(_AT(pmdval_t, 7) << 2)	/* normal inner write-alloc */
+
+/*
+ * + Level 3 descriptor (PTE)
+ */
+#define PTE_TYPE_MASK		(_AT(pteval_t, 3) << 0)
+#define PTE_TYPE_FAULT		(_AT(pteval_t, 0) << 0)
+#define PTE_TYPE_PAGE		(_AT(pteval_t, 3) << 0)
+#define PTE_BUFFERABLE		(_AT(pteval_t, 1) << 2)		/* AttrIndx[0] */
+#define PTE_CACHEABLE		(_AT(pteval_t, 1) << 3)		/* AttrIndx[1] */
+#define PTE_EXT_SHARED		(_AT(pteval_t, 3) << 8)		/* SH[1:0], inner shareable */
+#define PTE_EXT_AF		(_AT(pteval_t, 1) << 10)	/* Access Flag */
+#define PTE_EXT_NG		(_AT(pteval_t, 1) << 11)	/* nG */
+#define PTE_EXT_XN		(_AT(pteval_t, 1) << 54)	/* XN */
+
+/*
+ * 40-bit physical address supported.
+ */
+#define PHYS_MASK_SHIFT		(40)
+#define PHYS_MASK		((1ULL << PHYS_MASK_SHIFT) - 1)
+
+#endif
diff --git a/arch/arm/include/asm/pgtable-3level-types.h b/arch/arm/include/asm/pgtable-3level-types.h
new file mode 100644
index 0000000..921aa30
--- /dev/null
+++ b/arch/arm/include/asm/pgtable-3level-types.h
@@ -0,0 +1,70 @@
+/*
+ * arch/arm/include/asm/pgtable-3level-types.h
+ *
+ * Copyright (C) 2011 ARM Ltd.
+ * Author: Catalin Marinas <catalin.marinas@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+#ifndef _ASM_PGTABLE_3LEVEL_TYPES_H
+#define _ASM_PGTABLE_3LEVEL_TYPES_H
+
+#include <asm/types.h>
+
+typedef u64 pteval_t;
+typedef u64 pmdval_t;
+typedef u64 pgdval_t;
+
+#undef STRICT_MM_TYPECHECKS
+
+#ifdef STRICT_MM_TYPECHECKS
+
+/*
+ * These are used to make use of C type-checking..
+ */
+typedef struct { pteval_t pte; } pte_t;
+typedef struct { pmdval_t pmd; } pmd_t;
+typedef struct { pgdval_t pgd; } pgd_t;
+typedef struct { pteval_t pgprot; } pgprot_t;
+
+#define pte_val(x)      ((x).pte)
+#define pmd_val(x)      ((x).pmd)
+#define pgd_val(x)	((x).pgd)
+#define pgprot_val(x)   ((x).pgprot)
+
+#define __pte(x)        ((pte_t) { (x) } )
+#define __pmd(x)        ((pmd_t) { (x) } )
+#define __pgd(x)	((pgd_t) { (x) } )
+#define __pgprot(x)     ((pgprot_t) { (x) } )
+
+#else	/* !STRICT_MM_TYPECHECKS */
+
+typedef pteval_t pte_t;
+typedef pmdval_t pmd_t;
+typedef pgdval_t pgd_t;
+typedef pteval_t pgprot_t;
+
+#define pte_val(x)	(x)
+#define pmd_val(x)	(x)
+#define pgd_val(x)	(x)
+#define pgprot_val(x)	(x)
+
+#define __pte(x)	(x)
+#define __pmd(x)	(x)
+#define __pgd(x)	(x)
+#define __pgprot(x)	(x)
+
+#endif	/* STRICT_MM_TYPECHECKS */
+
+#endif	/* _ASM_PGTABLE_3LEVEL_TYPES_H */
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
new file mode 100644
index 0000000..79bf0ac
--- /dev/null
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -0,0 +1,102 @@
+/*
+ * arch/arm/include/asm/pgtable-3level.h
+ *
+ * Copyright (C) 2011 ARM Ltd.
+ * Author: Catalin Marinas <catalin.marinas@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+#ifndef _ASM_PGTABLE_3LEVEL_H
+#define _ASM_PGTABLE_3LEVEL_H
+
+/*
+ * With LPAE, there are 3 levels of page tables. Each level has 512 entries of
+ * 8 bytes each, occupying a 4K page. The first level table covers a range of
+ * 512GB, each entry representing 1GB. Since we are limited to 4GB input
+ * address range, only 4 entries in the PGD are used.
+ *
+ * There are enough spare bits in a page table entry for the kernel specific
+ * state.
+ */
+#define PTRS_PER_PTE		512
+#define PTRS_PER_PMD		512
+#define PTRS_PER_PGD		4
+
+#define PTE_HWTABLE_PTRS	(PTRS_PER_PTE)
+#define PTE_HWTABLE_OFF		(0)
+#define PTE_HWTABLE_SIZE	(PTRS_PER_PTE * sizeof(u64))
+
+/*
+ * PGDIR_SHIFT determines the size a top-level page table entry can map.
+ */
+#define PGDIR_SHIFT		30
+
+/*
+ * PMD_SHIFT determines the size a middle-level page table entry can map.
+ */
+#define PMD_SHIFT		21
+
+#define PMD_SIZE		(1UL << PMD_SHIFT)
+#define PMD_MASK		(~(PMD_SIZE-1))
+#define PGDIR_SIZE		(1UL << PGDIR_SHIFT)
+#define PGDIR_MASK		(~(PGDIR_SIZE-1))
+
+/*
+ * section address mask and size definitions.
+ */
+#define SECTION_SHIFT		21
+#define SECTION_SIZE		(1UL << SECTION_SHIFT)
+#define SECTION_MASK		(~(SECTION_SIZE-1))
+
+#define USER_PTRS_PER_PGD	(PAGE_OFFSET / PGDIR_SIZE)
+
+/*
+ * "Linux" PTE definitions for LPAE.
+ *
+ * These bits overlap with the hardware bits but the naming is preserved for
+ * consistency with the classic page table format.
+ */
+#define L_PTE_PRESENT		(_AT(pteval_t, 3) << 0)		/* Valid */
+#define L_PTE_FILE		(_AT(pteval_t, 1) << 2)		/* only when !PRESENT */
+#define L_PTE_BUFFERABLE	(_AT(pteval_t, 1) << 2)		/* AttrIndx[0] */
+#define L_PTE_CACHEABLE		(_AT(pteval_t, 1) << 3)		/* AttrIndx[1] */
+#define L_PTE_USER		(_AT(pteval_t, 1) << 6)		/* AP[1] */
+#define L_PTE_RDONLY		(_AT(pteval_t, 1) << 7)		/* AP[2] */
+#define L_PTE_SHARED		(_AT(pteval_t, 3) << 8)		/* SH[1:0], inner shareable */
+#define L_PTE_YOUNG		(_AT(pteval_t, 1) << 10)	/* AF */
+#define L_PTE_XN		(_AT(pteval_t, 1) << 54)	/* XN */
+#define L_PTE_DIRTY		(_AT(pteval_t, 1) << 55)	/* unused */
+#define L_PTE_SPECIAL		(_AT(pteval_t, 1) << 56)	/* unused */
+
+/*
+ * To be used in assembly code with the upper page attributes.
+ */
+#define L_PTE_XN_HIGH		(1 << (54 - 32))
+#define L_PTE_DIRTY_HIGH	(1 << (55 - 32))
+
+/*
+ * AttrIndx[2:0] encoding (mapping attributes defined in the MAIR* registers).
+ */
+#define L_PTE_MT_UNCACHED	(_AT(pteval_t, 0) << 2)	/* strongly ordered */
+#define L_PTE_MT_BUFFERABLE	(_AT(pteval_t, 1) << 2)	/* normal non-cacheable */
+#define L_PTE_MT_WRITETHROUGH	(_AT(pteval_t, 2) << 2)	/* normal inner write-through */
+#define L_PTE_MT_WRITEBACK	(_AT(pteval_t, 3) << 2)	/* normal inner write-back */
+#define L_PTE_MT_WRITEALLOC	(_AT(pteval_t, 7) << 2)	/* normal inner write-alloc */
+#define L_PTE_MT_DEV_SHARED	(_AT(pteval_t, 4) << 2)	/* device */
+#define L_PTE_MT_DEV_NONSHARED	(_AT(pteval_t, 4) << 2)	/* device */
+#define L_PTE_MT_DEV_WC		(_AT(pteval_t, 1) << 2)	/* normal non-cacheable */
+#define L_PTE_MT_DEV_CACHED	(_AT(pteval_t, 3) << 2)	/* normal inner write-back */
+#define L_PTE_MT_MASK		(_AT(pteval_t, 7) << 2)
+
+#endif /* _ASM_PGTABLE_3LEVEL_H */
diff --git a/arch/arm/include/asm/pgtable-hwdef.h b/arch/arm/include/asm/pgtable-hwdef.h
index 1831111..8426229 100644
--- a/arch/arm/include/asm/pgtable-hwdef.h
+++ b/arch/arm/include/asm/pgtable-hwdef.h
@@ -10,6 +10,10 @@
 #ifndef _ASMARM_PGTABLE_HWDEF_H
 #define _ASMARM_PGTABLE_HWDEF_H
 
+#ifdef CONFIG_ARM_LPAE
+#include <asm/pgtable-3level-hwdef.h>
+#else
 #include <asm/pgtable-2level-hwdef.h>
+#endif
 
 #endif
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index 8f9e1dd..95fefd9 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -25,7 +25,11 @@
 #include <mach/vmalloc.h>
 #include <asm/pgtable-hwdef.h>
 
+#ifdef CONFIG_ARM_LPAE
+#include <asm/pgtable-3level.h>
+#else
 #include <asm/pgtable-2level.h>
+#endif
 
 /*
  * Just any arbitrary offset to the start of the vmalloc VM area: the

* [PATCH v7 08/16] ARM: LPAE: Page table maintenance for the 3-level format
  2011-08-10 15:03 [PATCH v7 00/16] ARM: Add support for the Large Physical Address Extensions Catalin Marinas
                   ` (6 preceding siblings ...)
  2011-08-10 15:03 ` [PATCH v7 07/16] ARM: LPAE: Introduce the 3-level page table format definitions Catalin Marinas
@ 2011-08-10 15:03 ` Catalin Marinas
  2011-10-23 11:56   ` Russell King - ARM Linux
  2011-08-10 15:03 ` [PATCH v7 09/16] ARM: LPAE: MMU setup for the 3-level page table format Catalin Marinas
                   ` (7 subsequent siblings)
  15 siblings, 1 reply; 46+ messages in thread
From: Catalin Marinas @ 2011-08-10 15:03 UTC (permalink / raw)
  To: linux-arm-kernel

This patch modifies the pgd/pmd/pte manipulation functions to support
the 3-level page table format. Since there is no need for an 'ext'
argument to cpu_set_pte_ext(), this patch conditionally defines a
different prototype for this function when CONFIG_ARM_LPAE is enabled.
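
As a rough user-space illustration of this prototype change (not kernel
code: the sketch_ names and the plain uint64_t PTE type are assumptions
made for the example), callers can keep the three-argument macro while
the 'ext' bits are simply ORed into the PTE value:

```c
#include <stdint.h>

/* Hypothetical stand-in for the 64-bit LPAE pte_t; not the kernel type. */
typedef uint64_t sketch_pte_t;

/* Two-argument setter, mirroring the LPAE prototype without 'ext'. */
static inline void sketch_cpu_set_pte_ext(sketch_pte_t *ptep, sketch_pte_t pte)
{
	*ptep = pte;
}

/* Callers keep the three-argument form; 'ext' is ORed into the value. */
#define sketch_set_pte_ext(ptep, pte, ext) \
	sketch_cpu_set_pte_ext((ptep), (sketch_pte_t)(pte) | (ext))
```

The real cpu_set_pte_ext() also performs cache maintenance on the entry,
which this sketch leaves out.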

The patch also introduces the L_PGD_SWAPPER flag to mark pgd entries
pointing to pmd tables pre-allocated in the swapper_pg_dir and avoid
trying to free them at run-time. This flag is 0 with the classic page
table format.
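
The effect of the flag on the free path can be sketched in user space as
follows. This is only an illustration under stated assumptions: the
sketch_ names are hypothetical, entries are modelled as raw uint64_t
values, and the function counts the entries a pgd_free()-style loop
would release instead of actually freeing anything.

```c
#include <stdint.h>
#include <stddef.h>

#define SKETCH_PTRS_PER_PGD	4
#define SKETCH_L_PGD_SWAPPER	((uint64_t)1 << 55)	/* same bit as the patch */

/*
 * Count the entries a pgd_free()-style loop would release: non-empty
 * entries that do not carry the swapper flag.
 */
static inline size_t sketch_count_freeable(const uint64_t *pgd)
{
	size_t i, n = 0;

	for (i = 0; i < SKETCH_PTRS_PER_PGD; i++) {
		if (!pgd[i])
			continue;	/* nothing allocated here */
		if (pgd[i] & SKETCH_L_PGD_SWAPPER)
			continue;	/* pre-allocated in swapper_pg_dir */
		n++;
	}
	return n;
}
```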

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/include/asm/pgalloc.h        |   24 +++++++++++++
 arch/arm/include/asm/pgtable-3level.h |    5 +++
 arch/arm/include/asm/pgtable.h        |   62 ++++++++++++++++++++++++++++++++-
 arch/arm/include/asm/proc-fns.h       |   21 +++++++++++
 arch/arm/mm/ioremap.c                 |    8 +++--
 arch/arm/mm/pgd.c                     |   51 +++++++++++++++++++++++++--
 6 files changed, 163 insertions(+), 8 deletions(-)

diff --git a/arch/arm/include/asm/pgalloc.h b/arch/arm/include/asm/pgalloc.h
index 7418894..943504f 100644
--- a/arch/arm/include/asm/pgalloc.h
+++ b/arch/arm/include/asm/pgalloc.h
@@ -25,6 +25,26 @@
 #define _PAGE_USER_TABLE	(PMD_TYPE_TABLE | PMD_BIT4 | PMD_DOMAIN(DOMAIN_USER))
 #define _PAGE_KERNEL_TABLE	(PMD_TYPE_TABLE | PMD_BIT4 | PMD_DOMAIN(DOMAIN_KERNEL))
 
+#ifdef CONFIG_ARM_LPAE
+
+static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr)
+{
+	return (pmd_t *)get_zeroed_page(GFP_KERNEL | __GFP_REPEAT);
+}
+
+static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd)
+{
+	BUG_ON((unsigned long)pmd & (PAGE_SIZE-1));
+	free_page((unsigned long)pmd);
+}
+
+static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd)
+{
+	set_pud(pud, __pud(__pa(pmd) | PMD_TYPE_TABLE));
+}
+
+#else	/* !CONFIG_ARM_LPAE */
+
 /*
  * Since we have only two-level page tables, these are trivial
  */
@@ -32,6 +52,8 @@
 #define pmd_free(mm, pmd)		do { } while (0)
 #define pud_populate(mm,pmd,pte)	BUG()
 
+#endif	/* CONFIG_ARM_LPAE */
+
 extern pgd_t *pgd_alloc(struct mm_struct *mm);
 extern void pgd_free(struct mm_struct *mm, pgd_t *pgd);
 
@@ -109,7 +131,9 @@ static inline void __pmd_populate(pmd_t *pmdp, phys_addr_t pte,
 {
 	pmdval_t pmdval = (pte + PTE_HWTABLE_OFF) | prot;
 	pmdp[0] = __pmd(pmdval);
+#ifndef CONFIG_ARM_LPAE
 	pmdp[1] = __pmd(pmdval + 256 * sizeof(pte_t));
+#endif
 	flush_pmd_entry(pmdp);
 }
 
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index 79bf0ac..a6261f5 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -99,4 +99,9 @@
 #define L_PTE_MT_DEV_CACHED	(_AT(pteval_t, 3) << 2)	/* normal inner write-back */
 #define L_PTE_MT_MASK		(_AT(pteval_t, 7) << 2)
 
+/*
+ * Software PGD flags.
+ */
+#define L_PGD_SWAPPER		(_AT(pgdval_t, 1) << 55)	/* swapper_pg_dir entry */
+
 #endif /* _ASM_PGTABLE_3LEVEL_H */
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index 95fefd9..1db9ad6 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -165,6 +165,31 @@ extern pgd_t swapper_pg_dir[PTRS_PER_PGD];
 /* to find an entry in a kernel page-table-directory */
 #define pgd_offset_k(addr)	pgd_offset(&init_mm, addr)
 
+#ifdef CONFIG_ARM_LPAE
+
+#define pud_none(pud)		(!pud_val(pud))
+#define pud_bad(pud)		(!(pud_val(pud) & 2))
+#define pud_present(pud)	(pud_val(pud))
+
+#define pud_clear(pudp)			\
+	do {				\
+		*pudp = __pud(0);	\
+		clean_pmd_entry(pudp);	\
+	} while (0)
+
+#define set_pud(pudp, pud)		\
+	do {				\
+		*pudp = pud;		\
+		flush_pmd_entry(pudp);	\
+	} while (0)
+
+static inline pmd_t *pud_page_vaddr(pud_t pud)
+{
+	return __va(pud_val(pud) & PHYS_MASK & (s32)PAGE_MASK);
+}
+
+#else	/* !CONFIG_ARM_LPAE */
+
 /*
  * The "pud_xxx()" functions here are trivial when the pmd is folded into
  * the pud: the pud entry is never bad, always exists, and can't be set or
@@ -176,15 +201,43 @@ extern pgd_t swapper_pg_dir[PTRS_PER_PGD];
 #define pud_clear(pudp)		do { } while (0)
 #define set_pud(pud,pudp)	do { } while (0)
 
+#endif	/* CONFIG_ARM_LPAE */
 
 /* Find an entry in the second-level page table.. */
+#ifdef CONFIG_ARM_LPAE
+#define pmd_index(addr)		(((addr) >> PMD_SHIFT) & (PTRS_PER_PMD - 1))
+static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)
+{
+	return (pmd_t *)pud_page_vaddr(*pud) + pmd_index(addr);
+}
+#else
 static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)
 {
 	return (pmd_t *)pud;
 }
+#endif
 
 #define pmd_none(pmd)		(!pmd_val(pmd))
 #define pmd_present(pmd)	(pmd_val(pmd))
+
+#ifdef CONFIG_ARM_LPAE
+
+#define pmd_bad(pmd)		(!(pmd_val(pmd) & 2))
+
+#define copy_pmd(pmdpd,pmdps)		\
+	do {				\
+		*pmdpd = *pmdps;	\
+		flush_pmd_entry(pmdpd);	\
+	} while (0)
+
+#define pmd_clear(pmdp)			\
+	do {				\
+		*pmdp = __pmd(0);	\
+		clean_pmd_entry(pmdp);	\
+	} while (0)
+
+#else	/* !CONFIG_ARM_LPAE */
+
 #define pmd_bad(pmd)		(pmd_val(pmd) & 2)
 
 #define copy_pmd(pmdpd,pmdps)		\
@@ -201,6 +254,8 @@ static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)
 		clean_pmd_entry(pmdp);	\
 	} while (0)
 
+#endif	/* CONFIG_ARM_LPAE */
+
 static inline pte_t *pmd_page_vaddr(pmd_t pmd)
 {
 	return __va(pmd_val(pmd) & PHYS_MASK & (s32)PAGE_MASK);
@@ -233,9 +288,14 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
 #define pte_page(pte)		pfn_to_page(pte_pfn(pte))
 #define mk_pte(page,prot)	pfn_pte(page_to_pfn(page), prot)
 
-#define set_pte_ext(ptep,pte,ext) cpu_set_pte_ext(ptep,pte,ext)
 #define pte_clear(mm,addr,ptep)	set_pte_ext(ptep, __pte(0), 0)
 
+#ifdef CONFIG_ARM_LPAE
+#define set_pte_ext(ptep,pte,ext) cpu_set_pte_ext(ptep,__pte(pte_val(pte)|(ext)))
+#else
+#define set_pte_ext(ptep,pte,ext) cpu_set_pte_ext(ptep,pte,ext)
+#endif
+
 #if __LINUX_ARM_ARCH__ < 6
 static inline void __sync_icache_dcache(pte_t pteval)
 {
diff --git a/arch/arm/include/asm/proc-fns.h b/arch/arm/include/asm/proc-fns.h
index 633d1cb..34e852c 100644
--- a/arch/arm/include/asm/proc-fns.h
+++ b/arch/arm/include/asm/proc-fns.h
@@ -65,7 +65,11 @@ extern struct processor {
 	 * Set a possibly extended PTE.  Non-extended PTEs should
 	 * ignore 'ext'.
 	 */
+#ifdef CONFIG_ARM_LPAE
+	void (*set_pte_ext)(pte_t *ptep, pte_t pte);
+#else
 	void (*set_pte_ext)(pte_t *ptep, pte_t pte, unsigned int ext);
+#endif
 
 	/* Suspend/resume */
 	unsigned int suspend_size;
@@ -79,7 +83,11 @@ extern void cpu_proc_fin(void);
 extern int cpu_do_idle(void);
 extern void cpu_dcache_clean_area(void *, int);
 extern void cpu_do_switch_mm(unsigned long pgd_phys, struct mm_struct *mm);
+#ifdef CONFIG_ARM_LPAE
+extern void cpu_set_pte_ext(pte_t *ptep, pte_t pte);
+#else
 extern void cpu_set_pte_ext(pte_t *ptep, pte_t pte, unsigned int ext);
+#endif
 extern void cpu_reset(unsigned long addr) __attribute__((noreturn));
 #else
 #define cpu_proc_init			processor._proc_init
@@ -99,6 +107,18 @@ extern void cpu_resume(void);
 
 #define cpu_switch_mm(pgd,mm) cpu_do_switch_mm(virt_to_phys(pgd),mm)
 
+#ifdef CONFIG_ARM_LPAE
+#define cpu_get_pgd()	\
+	({						\
+		unsigned long pg, pg2;			\
+		__asm__("mrrc	p15, 0, %0, %1, c2"	\
+			: "=r" (pg), "=r" (pg2)		\
+			:				\
+			: "cc");			\
+		pg &= ~(PTRS_PER_PGD*sizeof(pgd_t)-1);	\
+		(pgd_t *)phys_to_virt(pg);		\
+	})
+#else
 #define cpu_get_pgd()	\
 	({						\
 		unsigned long pg;			\
@@ -107,6 +127,7 @@ extern void cpu_resume(void);
 		pg &= ~0x3fff;				\
 		(pgd_t *)phys_to_virt(pg);		\
 	})
+#endif
 
 #endif
 
diff --git a/arch/arm/mm/ioremap.c b/arch/arm/mm/ioremap.c
index ab50627..6bdf42c 100644
--- a/arch/arm/mm/ioremap.c
+++ b/arch/arm/mm/ioremap.c
@@ -64,7 +64,7 @@ void __check_kvm_seq(struct mm_struct *mm)
 	} while (seq != init_mm.context.kvm_seq);
 }
 
-#ifndef CONFIG_SMP
+#if !defined(CONFIG_SMP) && !defined(CONFIG_ARM_LPAE)
 /*
  * Section support is unsafe on SMP - If you iounmap and ioremap a region,
  * the other CPUs will not see this change until their next context switch.
@@ -195,11 +195,13 @@ void __iomem * __arm_ioremap_pfn_caller(unsigned long pfn,
 	unsigned long addr;
  	struct vm_struct * area;
 
+#ifndef CONFIG_ARM_LPAE
 	/*
 	 * High mappings must be supersection aligned
 	 */
 	if (pfn >= 0x100000 && (__pfn_to_phys(pfn) & ~SUPERSECTION_MASK))
 		return NULL;
+#endif
 
 	/*
 	 * Don't allow RAM to be mapped - this causes problems with ARMv6+
@@ -221,7 +223,7 @@ void __iomem * __arm_ioremap_pfn_caller(unsigned long pfn,
  		return NULL;
  	addr = (unsigned long)area->addr;
 
-#ifndef CONFIG_SMP
+#if !defined(CONFIG_SMP) && !defined(CONFIG_ARM_LPAE)
 	if (DOMAIN_IO == 0 &&
 	    (((cpu_architecture() >= CPU_ARCH_ARMv6) && (get_cr() & CR_XP)) ||
 	       cpu_is_xsc3()) && pfn >= 0x100000 &&
@@ -292,7 +294,7 @@ EXPORT_SYMBOL(__arm_ioremap);
 void __iounmap(volatile void __iomem *io_addr)
 {
 	void *addr = (void *)(PAGE_MASK & (unsigned long)io_addr);
-#ifndef CONFIG_SMP
+#if !defined(CONFIG_SMP) && !defined(CONFIG_ARM_LPAE)
 	struct vm_struct **p, *tmp;
 
 	/*
diff --git a/arch/arm/mm/pgd.c b/arch/arm/mm/pgd.c
index b2027c1..a3e78cc 100644
--- a/arch/arm/mm/pgd.c
+++ b/arch/arm/mm/pgd.c
@@ -10,6 +10,7 @@
 #include <linux/mm.h>
 #include <linux/gfp.h>
 #include <linux/highmem.h>
+#include <linux/slab.h>
 
 #include <asm/pgalloc.h>
 #include <asm/page.h>
@@ -17,6 +18,14 @@
 
 #include "mm.h"
 
+#ifdef CONFIG_ARM_LPAE
+#define __pgd_alloc()	kmalloc(PTRS_PER_PGD * sizeof(pgd_t), GFP_KERNEL)
+#define __pgd_free(pgd)	kfree(pgd)
+#else
+#define __pgd_alloc()	(pgd_t *)__get_free_pages(GFP_KERNEL, 2)
+#define __pgd_free(pgd)	free_pages((unsigned long)pgd, 2)
+#endif
+
 /*
  * need to get a 16k page for level 1
  */
@@ -27,7 +36,7 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
 	pmd_t *new_pmd, *init_pmd;
 	pte_t *new_pte, *init_pte;
 
-	new_pgd = (pgd_t *)__get_free_pages(GFP_KERNEL, 2);
+	new_pgd = __pgd_alloc();
 	if (!new_pgd)
 		goto no_pgd;
 
@@ -42,10 +51,25 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
 
 	clean_dcache_area(new_pgd, PTRS_PER_PGD * sizeof(pgd_t));
 
+#ifdef CONFIG_ARM_LPAE
+	/*
+	 * Allocate PMD table for modules and pkmap mappings.
+	 */
+	new_pud = pud_alloc(mm, new_pgd + pgd_index(MODULES_VADDR),
+			    MODULES_VADDR);
+	if (!new_pud)
+		goto no_pud;
+
+	new_pmd = pmd_alloc(mm, new_pud, 0);
+	if (!new_pmd)
+		goto no_pmd;
+#endif
+
 	if (!vectors_high()) {
 		/*
 		 * On ARM, first page must always be allocated since it
-		 * contains the machine vectors.
+		 * contains the machine vectors. The vectors are always high
+		 * with LPAE.
 		 */
 		new_pud = pud_alloc(mm, new_pgd, 0);
 		if (!new_pud)
@@ -74,7 +98,7 @@ no_pte:
 no_pmd:
 	pud_free(mm, new_pud);
 no_pud:
-	free_pages((unsigned long)new_pgd, 2);
+	__pgd_free(new_pgd);
 no_pgd:
 	return NULL;
 }
@@ -111,5 +135,24 @@ no_pud:
 	pgd_clear(pgd);
 	pud_free(mm, pud);
 no_pgd:
-	free_pages((unsigned long) pgd_base, 2);
+#ifdef CONFIG_ARM_LPAE
+	/*
+	 * Free modules/pkmap or identity pmd tables.
+	 */
+	for (pgd = pgd_base; pgd < pgd_base + PTRS_PER_PGD; pgd++) {
+		if (pgd_none_or_clear_bad(pgd))
+			continue;
+		if (pgd_val(*pgd) & L_PGD_SWAPPER)
+			continue;
+		pud = pud_offset(pgd, 0);
+		if (pud_none_or_clear_bad(pud))
+			continue;
+		pmd = pmd_offset(pud, 0);
+		pud_clear(pud);
+		pmd_free(mm, pmd);
+		pgd_clear(pgd);
+		pud_free(mm, pud);
+	}
+#endif
+	__pgd_free(pgd_base);
 }


* [PATCH v7 09/16] ARM: LPAE: MMU setup for the 3-level page table format
  2011-08-10 15:03 [PATCH v7 00/16] ARM: Add support for the Large Physical Address Extensions Catalin Marinas
                   ` (7 preceding siblings ...)
  2011-08-10 15:03 ` [PATCH v7 08/16] ARM: LPAE: Page table maintenance for the 3-level format Catalin Marinas
@ 2011-08-10 15:03 ` Catalin Marinas
  2011-08-13 11:49   ` Vasily Khoruzhick
                     ` (3 more replies)
  2011-08-10 15:03 ` [PATCH v7 10/16] ARM: LPAE: Invalidate the TLB before freeing the PMD Catalin Marinas
                   ` (6 subsequent siblings)
  15 siblings, 4 replies; 46+ messages in thread
From: Catalin Marinas @ 2011-08-10 15:03 UTC (permalink / raw)
  To: linux-arm-kernel

This patch adds the MMU initialisation for the LPAE page table format.
The swapper_pg_dir size with LPAE is 5 rather than 4 pages. A new
proc-v7lpae.S file contains the initialisation, context switch and
save/restore code for ARMv7 with the LPAE. The TTBRx split is based on
the PAGE_OFFSET with TTBR1 used for the kernel mappings. The 36-bit
mappings (supersections) and a few other memory types in mmu.c are
conditionally compiled.
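
The two numbers above can be checked with a small user-space sketch. It
is not kernel code: the sketch_ names are hypothetical, and it assumes 4
PGD entries, 512 entries per PMD table and 8-byte descriptors, which is
consistent with the five-page swapper_pg_dir. The T1SZ helper mirrors
the expression used in __v7_setup to derive the TTBCR field from
PAGE_OFFSET.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE	4096u
#define SKETCH_PTRS_PER_PGD	4u
#define SKETCH_PTRS_PER_PMD	512u	/* assumed entries per PMD table */
#define SKETCH_PMD_ENTRY_SIZE	8u	/* 64-bit descriptors */

/* One page for the PGD plus four pages of PMD tables. */
static inline uint32_t sketch_swapper_pg_dir_size(void)
{
	return SKETCH_PAGE_SIZE +
	       SKETCH_PTRS_PER_PGD * SKETCH_PTRS_PER_PMD * SKETCH_PMD_ENTRY_SIZE;
}

/* TTBCR.T1SZ field (bits 18:16) derived from PAGE_OFFSET. */
static inline uint32_t sketch_ttbcr_t1sz(uint32_t page_offset)
{
	return ((page_offset >> 30) - 1) << 16;
}
```

With these assumptions the size comes out as 0x5000, and a PAGE_OFFSET
of 0x80000000 or 0xc0000000 gives T1SZ values of 1 and 2 respectively.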

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/kernel/head.S    |  117 +++++++++----
 arch/arm/mm/Makefile      |    4 +
 arch/arm/mm/mmu.c         |   34 ++++-
 arch/arm/mm/proc-macros.S |    5 +-
 arch/arm/mm/proc-v7lpae.S |  422 +++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 542 insertions(+), 40 deletions(-)
 create mode 100644 arch/arm/mm/proc-v7lpae.S

diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index d8231b2..0bdafc4 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -21,6 +21,7 @@
 #include <asm/memory.h>
 #include <asm/thread_info.h>
 #include <asm/system.h>
+#include <asm/pgtable.h>
 
 #ifdef CONFIG_DEBUG_LL
 #include <mach/debug-macro.S>
@@ -38,11 +39,20 @@
 #error KERNEL_RAM_VADDR must start at 0xXXXX8000
 #endif
 
+#ifdef CONFIG_ARM_LPAE
+	/* LPAE requires an additional page for the PGD */
+#define PG_DIR_SIZE	0x5000
+#define PMD_ORDER	3
+#else
+#define PG_DIR_SIZE	0x4000
+#define PMD_ORDER	2
+#endif
+
 	.globl	swapper_pg_dir
-	.equ	swapper_pg_dir, KERNEL_RAM_VADDR - 0x4000
+	.equ	swapper_pg_dir, KERNEL_RAM_VADDR - PG_DIR_SIZE
 
 	.macro	pgtbl, rd, phys
-	add	\rd, \phys, #TEXT_OFFSET - 0x4000
+	add	\rd, \phys, #TEXT_OFFSET - PG_DIR_SIZE
 	.endm
 
 #ifdef CONFIG_XIP_KERNEL
@@ -148,11 +158,11 @@ __create_page_tables:
 	pgtbl	r4, r8				@ page table address
 
 	/*
-	 * Clear the 16K level 1 swapper page table
+	 * Clear the swapper page table
 	 */
 	mov	r0, r4
 	mov	r3, #0
-	add	r6, r0, #0x4000
+	add	r6, r0, #PG_DIR_SIZE
 1:	str	r3, [r0], #4
 	str	r3, [r0], #4
 	str	r3, [r0], #4
@@ -160,6 +170,25 @@ __create_page_tables:
 	teq	r0, r6
 	bne	1b
 
+#ifdef CONFIG_ARM_LPAE
+	/*
+	 * Build the PGD table (first level) to point to the PMD table. A PGD
+	 * entry is 64-bit wide.
+	 */
+	mov	r0, r4
+	add	r3, r4, #0x1000			@ first PMD table address
+	orr	r3, r3, #3			@ PGD block type
+	mov	r6, #4				@ PTRS_PER_PGD
+	mov	r7, #1 << (55 - 32)		@ L_PGD_SWAPPER
+1:	str	r3, [r0], #4			@ set bottom PGD entry bits
+	str	r7, [r0], #4			@ set top PGD entry bits
+	add	r3, r3, #0x1000			@ next PMD table
+	subs	r6, r6, #1
+	bne	1b
+
+	add	r4, r4, #0x1000			@ point to the PMD tables
+#endif
+
 	ldr	r7, [r10, #PROCINFO_MM_MMUFLAGS] @ mm_mmuflags
 
 	/*
@@ -171,30 +200,30 @@ __create_page_tables:
 	sub	r0, r0, r3			@ virt->phys offset
 	add	r5, r5, r0			@ phys __enable_mmu
 	add	r6, r6, r0			@ phys __enable_mmu_end
-	mov	r5, r5, lsr #20
-	mov	r6, r6, lsr #20
+	mov	r5, r5, lsr #SECTION_SHIFT
+	mov	r6, r6, lsr #SECTION_SHIFT
 
-1:	orr	r3, r7, r5, lsl #20		@ flags + kernel base
-	str	r3, [r4, r5, lsl #2]		@ identity mapping
-	teq	r5, r6
-	addne	r5, r5, #1			@ next section
-	bne	1b
+1:	orr	r3, r7, r5, lsl #SECTION_SHIFT	@ flags + kernel base
+	str	r3, [r4, r5, lsl #PMD_ORDER]	@ identity mapping
+	cmp	r5, r6
+	addlo	r5, r5, #1			@ next section
+	blo	1b
 
 	/*
 	 * Now setup the pagetables for our kernel direct
 	 * mapped region.
 	 */
 	mov	r3, pc
-	mov	r3, r3, lsr #20
-	orr	r3, r7, r3, lsl #20
-	add	r0, r4,  #(KERNEL_START & 0xff000000) >> 18
-	str	r3, [r0, #(KERNEL_START & 0x00f00000) >> 18]!
+	mov	r3, r3, lsr #SECTION_SHIFT
+	orr	r3, r7, r3, lsl #SECTION_SHIFT
+	add	r0, r4,  #(KERNEL_START & 0xff000000) >> (SECTION_SHIFT - PMD_ORDER)
+	str	r3, [r0, #(KERNEL_START & 0x00e00000) >> (SECTION_SHIFT - PMD_ORDER)]!
 	ldr	r6, =(KERNEL_END - 1)
-	add	r0, r0, #4
-	add	r6, r4, r6, lsr #18
+	add	r0, r0, #1 << PMD_ORDER
+	add	r6, r4, r6, lsr #(SECTION_SHIFT - PMD_ORDER)
 1:	cmp	r0, r6
-	add	r3, r3, #1 << 20
-	strls	r3, [r0], #4
+	add	r3, r3, #1 << SECTION_SHIFT
+	strls	r3, [r0], #1 << PMD_ORDER
 	bls	1b
 
 #ifdef CONFIG_XIP_KERNEL
@@ -203,11 +232,11 @@ __create_page_tables:
 	 */
 	add	r3, r8, #TEXT_OFFSET
 	orr	r3, r3, r7
-	add	r0, r4,  #(KERNEL_RAM_VADDR & 0xff000000) >> 18
-	str	r3, [r0, #(KERNEL_RAM_VADDR & 0x00f00000) >> 18]!
+	add	r0, r4,  #(KERNEL_RAM_VADDR & 0xff000000) >> (SECTION_SHIFT - PMD_ORDER)
+	str	r3, [r0, #(KERNEL_RAM_VADDR & 0x00f00000) >> (SECTION_SHIFT - PMD_ORDER)]!
 	ldr	r6, =(_end - 1)
 	add	r0, r0, #4
-	add	r6, r4, r6, lsr #18
+	add	r6, r4, r6, lsr #(SECTION_SHIFT - PMD_ORDER)
 1:	cmp	r0, r6
 	add	r3, r3, #1 << 20
 	strls	r3, [r0], #4
@@ -215,15 +244,15 @@ __create_page_tables:
 #endif
 
 	/*
-	 * Then map boot params address in r2 or
-	 * the first 1MB of ram if boot params address is not specified.
+	 * Then map boot params address in r2 or the first 1MB (2MB with LPAE)
+	 * of ram if boot params address is not specified.
 	 */
-	mov	r0, r2, lsr #20
-	movs	r0, r0, lsl #20
+	mov	r0, r2, lsr #SECTION_SHIFT
+	movs	r0, r0, lsl #SECTION_SHIFT
 	moveq	r0, r8
 	sub	r3, r0, r8
 	add	r3, r3, #PAGE_OFFSET
-	add	r3, r4, r3, lsr #18
+	add	r3, r4, r3, lsr #(SECTION_SHIFT - PMD_ORDER)
 	orr	r6, r7, r0
 	str	r6, [r3]
 
@@ -236,21 +265,27 @@ __create_page_tables:
 	 */
 	addruart r7, r3
 
-	mov	r3, r3, lsr #20
-	mov	r3, r3, lsl #2
+	mov	r3, r3, lsr #SECTION_SHIFT
+	mov	r3, r3, lsl #PMD_ORDER
 
 	add	r0, r4, r3
 	rsb	r3, r3, #0x4000			@ PTRS_PER_PGD*sizeof(long)
 	cmp	r3, #0x0800			@ limit to 512MB
 	movhi	r3, #0x0800
 	add	r6, r0, r3
-	mov	r3, r7, lsr #20
+	mov	r3, r7, lsr #SECTION_SHIFT
 	ldr	r7, [r10, #PROCINFO_IO_MMUFLAGS] @ io_mmuflags
-	orr	r3, r7, r3, lsl #20
+	orr	r3, r7, r3, lsl #SECTION_SHIFT
+#ifdef CONFIG_ARM_LPAE
+	mov	r7, #1 << (54 - 32)		@ XN
+#endif
 1:	str	r3, [r0], #4
-	add	r3, r3, #1 << 20
-	teq	r0, r6
-	bne	1b
+#ifdef CONFIG_ARM_LPAE
+	str	r7, [r0], #4
+#endif
+	add	r3, r3, #1 << SECTION_SHIFT
+	cmp	r0, r6
+	blo	1b
 
 #else /* CONFIG_DEBUG_ICEDCC */
 	/* we don't need any serial debugging mappings for ICEDCC */
@@ -262,7 +297,7 @@ __create_page_tables:
 	 * If we're using the NetWinder or CATS, we also need to map
 	 * in the 16550-type serial port for the debug messages
 	 */
-	add	r0, r4, #0xff000000 >> 18
+	add	r0, r4, #0xff000000 >> (SECTION_SHIFT - PMD_ORDER)
 	orr	r3, r7, #0x7c000000
 	str	r3, [r0]
 #endif
@@ -272,13 +307,16 @@ __create_page_tables:
 	 * Similar reasons here - for debug.  This is
 	 * only for Acorn RiscPC architectures.
 	 */
-	add	r0, r4, #0x02000000 >> 18
+	add	r0, r4, #0x02000000 >> (SECTION_SHIFT - PMD_ORDER)
 	orr	r3, r7, #0x02000000
 	str	r3, [r0]
-	add	r0, r4, #0xd8000000 >> 18
+	add	r0, r4, #0xd8000000 >> (SECTION_SHIFT - PMD_ORDER)
 	str	r3, [r0]
 #endif
 #endif
+#ifdef CONFIG_ARM_LPAE
+	sub	r4, r4, #0x1000		@ point to the PGD table
+#endif
 	mov	pc, lr
 ENDPROC(__create_page_tables)
 	.ltorg
@@ -370,12 +408,17 @@ __enable_mmu:
 #ifdef CONFIG_CPU_ICACHE_DISABLE
 	bic	r0, r0, #CR_I
 #endif
+#ifdef CONFIG_ARM_LPAE
+	mov	r5, #0
+	mcrr	p15, 0, r4, r5, c2		@ load TTBR0
+#else
 	mov	r5, #(domain_val(DOMAIN_USER, DOMAIN_MANAGER) | \
 		      domain_val(DOMAIN_KERNEL, DOMAIN_MANAGER) | \
 		      domain_val(DOMAIN_TABLE, DOMAIN_MANAGER) | \
 		      domain_val(DOMAIN_IO, DOMAIN_CLIENT))
 	mcr	p15, 0, r5, c3, c0, 0		@ load domain access register
 	mcr	p15, 0, r4, c2, c0, 0		@ load page table pointer
+#endif
 	b	__turn_mmu_on
 ENDPROC(__enable_mmu)
 
diff --git a/arch/arm/mm/Makefile b/arch/arm/mm/Makefile
index bca7e61..48639e7 100644
--- a/arch/arm/mm/Makefile
+++ b/arch/arm/mm/Makefile
@@ -91,7 +91,11 @@ obj-$(CONFIG_CPU_MOHAWK)	+= proc-mohawk.o
 obj-$(CONFIG_CPU_FEROCEON)	+= proc-feroceon.o
 obj-$(CONFIG_CPU_V6)		+= proc-v6.o
 obj-$(CONFIG_CPU_V6K)		+= proc-v6.o
+ifeq ($(CONFIG_ARM_LPAE),y)
+obj-$(CONFIG_CPU_V7)		+= proc-v7lpae.o
+else
 obj-$(CONFIG_CPU_V7)		+= proc-v7.o
+endif
 
 AFLAGS_proc-v6.o	:=-Wa,-march=armv6
 AFLAGS_proc-v7.o	:=-Wa,-march=armv7-a
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index c990280..1ba2a5a 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -150,6 +150,7 @@ static int __init early_nowrite(char *__unused)
 }
 early_param("nowb", early_nowrite);
 
+#ifndef CONFIG_ARM_LPAE
 static int __init early_ecc(char *p)
 {
 	if (memcmp(p, "on", 2) == 0)
@@ -159,6 +160,7 @@ static int __init early_ecc(char *p)
 	return 0;
 }
 early_param("ecc", early_ecc);
+#endif
 
 static int __init noalign_setup(char *__unused)
 {
@@ -228,10 +230,12 @@ static struct mem_type mem_types[] = {
 		.prot_sect = PMD_TYPE_SECT | PMD_SECT_XN,
 		.domain    = DOMAIN_KERNEL,
 	},
+#ifndef CONFIG_ARM_LPAE
 	[MT_MINICLEAN] = {
 		.prot_sect = PMD_TYPE_SECT | PMD_SECT_XN | PMD_SECT_MINICACHE,
 		.domain    = DOMAIN_KERNEL,
 	},
+#endif
 	[MT_LOW_VECTORS] = {
 		.prot_pte  = L_PTE_PRESENT | L_PTE_YOUNG | L_PTE_DIRTY |
 				L_PTE_RDONLY,
@@ -421,6 +425,7 @@ static void __init build_mem_type_table(void)
 	 * ARMv6 and above have extended page tables.
 	 */
 	if (cpu_arch >= CPU_ARCH_ARMv6 && (cr & CR_XP)) {
+#ifndef CONFIG_ARM_LPAE
 		/*
 		 * Mark cache clean areas and XIP ROM read only
 		 * from SVC mode and no access from userspace.
@@ -428,6 +433,7 @@ static void __init build_mem_type_table(void)
 		mem_types[MT_ROM].prot_sect |= PMD_SECT_APX|PMD_SECT_AP_WRITE;
 		mem_types[MT_MINICLEAN].prot_sect |= PMD_SECT_APX|PMD_SECT_AP_WRITE;
 		mem_types[MT_CACHECLEAN].prot_sect |= PMD_SECT_APX|PMD_SECT_AP_WRITE;
+#endif
 
 		if (is_smp()) {
 			/*
@@ -466,6 +472,18 @@ static void __init build_mem_type_table(void)
 		mem_types[MT_MEMORY_NONCACHED].prot_sect |= PMD_SECT_BUFFERABLE;
 	}
 
+#ifdef CONFIG_ARM_LPAE
+	/*
+	 * Do not generate access flag faults for the kernel mappings.
+	 */
+	for (i = 0; i < ARRAY_SIZE(mem_types); i++) {
+		mem_types[i].prot_pte |= PTE_EXT_AF;
+		mem_types[i].prot_sect |= PMD_SECT_AF;
+	}
+	kern_pgprot |= PTE_EXT_AF;
+	vecs_pgprot |= PTE_EXT_AF;
+#endif
+
 	for (i = 0; i < 16; i++) {
 		unsigned long v = pgprot_val(protection_map[i]);
 		protection_map[i] = __pgprot(v | user_pgprot);
@@ -564,8 +582,10 @@ static void __init alloc_init_section(pud_t *pud, unsigned long addr,
 	if (((addr | end | phys) & ~SECTION_MASK) == 0) {
 		pmd_t *p = pmd;
 
+#ifndef CONFIG_ARM_LPAE
 		if (addr & SECTION_SIZE)
 			pmd++;
+#endif
 
 		do {
 			*pmd = __pmd(phys | type->prot_sect);
@@ -595,6 +615,7 @@ static void alloc_init_pud(pgd_t *pgd, unsigned long addr, unsigned long end,
 	} while (pud++, addr = next, addr != end);
 }
 
+#ifndef CONFIG_ARM_LPAE
 static void __init create_36bit_mapping(struct map_desc *md,
 					const struct mem_type *type)
 {
@@ -654,6 +675,7 @@ static void __init create_36bit_mapping(struct map_desc *md,
 		pgd += SUPERSECTION_SIZE >> PGDIR_SHIFT;
 	} while (addr != end);
 }
+#endif	/* !CONFIG_ARM_LPAE */
 
 /*
  * Create the page directory entries and any necessary
@@ -685,6 +707,7 @@ static void __init create_mapping(struct map_desc *md)
 
 	type = &mem_types[md->type];
 
+#ifndef CONFIG_ARM_LPAE
 	/*
 	 * Catch 36-bit addresses
 	 */
@@ -692,6 +715,7 @@ static void __init create_mapping(struct map_desc *md)
 		create_36bit_mapping(md, type);
 		return;
 	}
+#endif
 
 	addr = md->virtual & PAGE_MASK;
 	phys = __pfn_to_phys(md->pfn);
@@ -889,6 +913,14 @@ static inline void prepare_page_table(void)
 		pmd_clear(pmd_off_k(addr));
 }
 
+#ifdef CONFIG_ARM_LPAE
+/* the first page is reserved for pgd */
+#define SWAPPER_PG_DIR_SIZE	(PAGE_SIZE + \
+				 PTRS_PER_PGD * PTRS_PER_PMD * sizeof(pmd_t))
+#else
+#define SWAPPER_PG_DIR_SIZE	(PTRS_PER_PGD * sizeof(pgd_t))
+#endif
+
 /*
  * Reserve the special regions of memory
  */
@@ -898,7 +930,7 @@ void __init arm_mm_memblock_reserve(void)
 	 * Reserve the page tables.  These are already in use,
 	 * and can only be in node 0.
 	 */
-	memblock_reserve(__pa(swapper_pg_dir), PTRS_PER_PGD * sizeof(pgd_t));
+	memblock_reserve(__pa(swapper_pg_dir), SWAPPER_PG_DIR_SIZE);
 
 #ifdef CONFIG_SA1111
 	/*
diff --git a/arch/arm/mm/proc-macros.S b/arch/arm/mm/proc-macros.S
index 307a4de..2d8ff3a 100644
--- a/arch/arm/mm/proc-macros.S
+++ b/arch/arm/mm/proc-macros.S
@@ -91,8 +91,9 @@
 #if L_PTE_SHARED != PTE_EXT_SHARED
 #error PTE shared bit mismatch
 #endif
-#if (L_PTE_XN+L_PTE_USER+L_PTE_RDONLY+L_PTE_DIRTY+L_PTE_YOUNG+\
-     L_PTE_FILE+L_PTE_PRESENT) > L_PTE_SHARED
+#if !defined (CONFIG_ARM_LPAE) && \
+	(L_PTE_XN+L_PTE_USER+L_PTE_RDONLY+L_PTE_DIRTY+L_PTE_YOUNG+\
+	 L_PTE_FILE+L_PTE_PRESENT) > L_PTE_SHARED
 #error Invalid Linux PTE bit settings
 #endif
 #endif	/* CONFIG_MMU */
diff --git a/arch/arm/mm/proc-v7lpae.S b/arch/arm/mm/proc-v7lpae.S
new file mode 100644
index 0000000..0bee213
--- /dev/null
+++ b/arch/arm/mm/proc-v7lpae.S
@@ -0,0 +1,422 @@
+/*
+ * arch/arm/mm/proc-v7lpae.S
+ *
+ * Copyright (C) 2001 Deep Blue Solutions Ltd.
+ * Copyright (C) 2011 ARM Ltd.
+ * Author: Catalin Marinas <catalin.marinas@arm.com>
+ *   based on arch/arm/mm/proc-v7.S
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+#include <linux/init.h>
+#include <linux/linkage.h>
+#include <asm/assembler.h>
+#include <asm/asm-offsets.h>
+#include <asm/hwcap.h>
+#include <asm/pgtable-hwdef.h>
+#include <asm/pgtable.h>
+
+#include "proc-macros.S"
+
+#define TTB_IRGN_NC	(0 << 8)
+#define TTB_IRGN_WBWA	(1 << 8)
+#define TTB_IRGN_WT	(2 << 8)
+#define TTB_IRGN_WB	(3 << 8)
+#define TTB_RGN_NC	(0 << 10)
+#define TTB_RGN_OC_WBWA	(1 << 10)
+#define TTB_RGN_OC_WT	(2 << 10)
+#define TTB_RGN_OC_WB	(3 << 10)
+#define TTB_S		(3 << 12)
+#define TTB_EAE		(1 << 31)
+
+/* PTWs cacheable, inner WB not shareable, outer WB not shareable */
+#define TTB_FLAGS_UP	(TTB_IRGN_WB|TTB_RGN_OC_WB)
+#define PMD_FLAGS_UP	(PMD_SECT_WB)
+
+/* PTWs cacheable, inner WBWA shareable, outer WBWA not shareable */
+#define TTB_FLAGS_SMP	(TTB_IRGN_WBWA|TTB_S|TTB_RGN_OC_WBWA)
+#define PMD_FLAGS_SMP	(PMD_SECT_WBWA|PMD_SECT_S)
+
+ENTRY(cpu_v7_proc_init)
+	mov	pc, lr
+ENDPROC(cpu_v7_proc_init)
+
+ENTRY(cpu_v7_proc_fin)
+	mrc	p15, 0, r0, c1, c0, 0		@ ctrl register
+	bic	r0, r0, #0x1000			@ ...i............
+	bic	r0, r0, #0x0006			@ .............ca.
+	mcr	p15, 0, r0, c1, c0, 0		@ disable caches
+	mov	pc, lr
+ENDPROC(cpu_v7_proc_fin)
+
+/*
+ *	cpu_v7_reset(loc)
+ *
+ *	Perform a soft reset of the system.  Put the CPU into the
+ *	same state as it would be if it had been reset, and branch
+ *	to what would be the reset vector.
+ *
+ *	- loc   - location to jump to for soft reset
+ */
+	.align	5
+ENTRY(cpu_v7_reset)
+	mov	pc, r0
+ENDPROC(cpu_v7_reset)
+
+/*
+ *	cpu_v7_do_idle()
+ *
+ *	Idle the processor (eg, wait for interrupt).
+ *
+ *	IRQs are already disabled.
+ */
+ENTRY(cpu_v7_do_idle)
+	dsb					@ WFI may enter a low-power mode
+	wfi
+	mov	pc, lr
+ENDPROC(cpu_v7_do_idle)
+
+ENTRY(cpu_v7_dcache_clean_area)
+#ifndef TLB_CAN_READ_FROM_L1_CACHE
+	dcache_line_size r2, r3
+1:	mcr	p15, 0, r0, c7, c10, 1		@ clean D entry
+	add	r0, r0, r2
+	subs	r1, r1, r2
+	bhi	1b
+	dsb
+#endif
+	mov	pc, lr
+ENDPROC(cpu_v7_dcache_clean_area)
+
+/*
+ *	cpu_v7_switch_mm(pgd_phys, tsk)
+ *
+ *	Set the translation table base pointer to be pgd_phys
+ *
+ *	- pgd_phys - physical address of new TTB
+ *
+ *	It is assumed that:
+ *	- we are not using split page tables
+ */
+ENTRY(cpu_v7_switch_mm)
+#ifdef CONFIG_MMU
+	ldr	r1, [r1, #MM_CONTEXT_ID]	@ get mm->context.id
+	mov	r2, #0
+	and	r3, r1, #0xff
+	mov	r3, r3, lsl #(48 - 32)		@ ASID
+	mcrr	p15, 0, r0, r3, c2		@ set TTB 0
+	isb
+#endif
+	mov	pc, lr
+ENDPROC(cpu_v7_switch_mm)
+
+/*
+ *	cpu_v7_set_pte_ext(ptep, pte)
+ *
+ *	Set a level 2 translation table entry.
+ *
+ *	- ptep  - pointer to level 2 translation table entry
+ *		  (hardware version is stored at +2048 bytes)
+ *	- pte   - PTE value to store
+ *	- ext	- value for extended PTE bits
+ */
+ENTRY(cpu_v7_set_pte_ext)
+#ifdef CONFIG_MMU
+	tst	r2, #L_PTE_PRESENT
+	beq	1f
+	tst	r3, #1 << (55 - 32)		@ L_PTE_DIRTY
+	orreq	r2, #L_PTE_RDONLY
+1:	strd	r2, r3, [r0]
+	mcr	p15, 0, r0, c7, c10, 1		@ flush_pte
+#endif
+	mov	pc, lr
+ENDPROC(cpu_v7_set_pte_ext)
+
+cpu_v7_name:
+	.ascii	"ARMv7 Processor"
+	.align
+
+	/*
+	 * Memory region attributes for LPAE (defined in pgtable-3level.h):
+	 *
+	 *   n = AttrIndx[2:0]
+	 *
+	 *			n	MAIR
+	 *   UNCACHED		000	00000000
+	 *   BUFFERABLE		001	01000100
+	 *   DEV_WC		001	01000100
+	 *   WRITETHROUGH	010	10101010
+	 *   WRITEBACK		011	11101110
+	 *   DEV_CACHED		011	11101110
+	 *   DEV_SHARED		100	00000100
+	 *   DEV_NONSHARED	100	00000100
+	 *   unused		101
+	 *   unused		110
+	 *   WRITEALLOC		111	11111111
+	 */
+.equ	MAIR0,	0xeeaa4400			@ MAIR0
+.equ	MAIR1,	0xff000004			@ MAIR1
+
+/* Suspend/resume support: derived from arch/arm/mach-s5pv210/sleep.S */
+.globl	cpu_v7_suspend_size
+.equ	cpu_v7_suspend_size, 4 * 10
+#ifdef CONFIG_PM_SLEEP
+ENTRY(cpu_v7_do_suspend)
+	stmfd	sp!, {r4 - r11, lr}
+	mrc	p15, 0, r4, c13, c0, 0	@ FCSE/PID
+	mrc	p15, 0, r5, c13, c0, 1	@ Context ID
+	mrc	p15, 0, r6, c3, c0, 0	@ Domain ID
+	mrrc	p15, 0, r7, r8, c2	@ TTB 0
+	mrrc	p15, 1, r2, r3, c2	@ TTB 1
+	mrc	p15, 0, r9, c1, c0, 0	@ Control register
+	mrc	p15, 0, r10, c1, c0, 1	@ Auxiliary control register
+	mrc	p15, 0, r11, c1, c0, 2	@ Co-processor access control
+	stmia	r0, {r2 - r11}
+	ldmfd	sp!, {r4 - r11, pc}
+ENDPROC(cpu_v7_do_suspend)
+
+ENTRY(cpu_v7_do_resume)
+	mov	ip, #0
+	mcr	p15, 0, ip, c8, c7, 0	@ invalidate TLBs
+	mcr	p15, 0, ip, c7, c5, 0	@ invalidate I cache
+	ldmia	r0, {r2 - r11}
+	mcr	p15, 0, r4, c13, c0, 0	@ FCSE/PID
+	mcr	p15, 0, r5, c13, c0, 1	@ Context ID
+	mcr	p15, 0, r6, c3, c0, 0	@ Domain ID
+	mcrr	p15, 0, r7, r8, c2	@ TTB 0
+	mcrr	p15, 1, r2, r3, c2	@ TTB 1
+	mcr	p15, 0, ip, c2, c0, 2	@ TTB control register
+	mcr	p15, 0, r10, c1, c0, 1	@ Auxiliary control register
+	mcr	p15, 0, r11, c1, c0, 2	@ Co-processor access control
+	ldr	r4, =MAIR0
+	ldr	r5, =MAIR1
+	mcr	p15, 0, r4, c10, c2, 0	@ write MAIR0
+	mcr	p15, 0, r5, c10, c2, 1	@ write MAIR1
+	isb
+	mov	r0, r9			@ control register
+	mov	r2, r7, lsr #14		@ get TTB0 base
+	mov	r2, r2, lsl #14
+	ldr	r3, cpu_resume_l1_flags
+	b	cpu_resume_mmu
+ENDPROC(cpu_v7_do_resume)
+cpu_resume_l1_flags:
+	ALT_SMP(.long PMD_TYPE_SECT | PMD_SECT_AP_WRITE | PMD_FLAGS_SMP)
+	ALT_UP(.long  PMD_TYPE_SECT | PMD_SECT_AP_WRITE | PMD_FLAGS_UP)
+#else
+#define cpu_v7_do_suspend	0
+#define cpu_v7_do_resume	0
+#endif
+
+	__CPUINIT
+
+/*
+ *	__v7_setup
+ *
+ *	Initialise TLB, Caches, and MMU state ready to switch the MMU
+ *	on. Return in r0 the new CP15 C1 control register setting.
+ *
+ *	This should be able to cover all ARMv7 cores with LPAE.
+ *
+ *	It is assumed that:
+ *	- cache type register is implemented
+ */
+__v7_ca15mp_setup:
+	mov	r10, #0
+1:
+#ifdef CONFIG_SMP
+	ALT_SMP(mrc	p15, 0, r0, c1, c0, 1)
+	ALT_UP(mov	r0, #(1 << 6))		@ fake it for UP
+	tst	r0, #(1 << 6)			@ SMP/nAMP mode enabled?
+	orreq	r0, r0, #(1 << 6)		@ Enable SMP/nAMP mode
+	orreq	r0, r0, r10			@ Enable CPU-specific SMP bits
+	mcreq	p15, 0, r0, c1, c0, 1
+#endif
+__v7_setup:
+	adr	r12, __v7_setup_stack		@ the local stack
+	stmia	r12, {r0-r5, r7, r9, r11, lr}
+	bl	v7_flush_dcache_all
+	ldmia	r12, {r0-r5, r7, r9, r11, lr}
+
+	mov	r10, #0
+	mcr	p15, 0, r10, c7, c5, 0		@ I+BTB cache invalidate
+	dsb
+#ifdef CONFIG_MMU
+	mcr	p15, 0, r10, c8, c7, 0		@ invalidate I + D TLBs
+	mov	r5, #TTB_EAE
+	ALT_SMP(orr	r5, r5, #TTB_FLAGS_SMP)
+	ALT_SMP(orr	r5, r5, #TTB_FLAGS_SMP << 16)
+	ALT_UP(orr	r5, r5, #TTB_FLAGS_UP)
+	ALT_UP(orr	r5, r5, #TTB_FLAGS_UP << 16)
+	mrc	p15, 0, r10, c2, c0, 2
+	orr	r10, r10, r5
+#if PHYS_OFFSET <= PAGE_OFFSET
+	/*
+	 * TTBR0/TTBR1 split (PAGE_OFFSET):
+	 *   0x40000000: T0SZ = 2, T1SZ = 0 (not used)
+	 *   0x80000000: T0SZ = 0, T1SZ = 1
+	 *   0xc0000000: T0SZ = 0, T1SZ = 2
+	 *
+	 * Only use this feature if PHYS_OFFSET <= PAGE_OFFSET, otherwise
+	 * booting secondary CPUs would end up using TTBR1 for the identity
+	 * mapping set up in TTBR0.
+	 */
+	orr	r10, r10, #(((PAGE_OFFSET >> 30) - 1) << 16)	@ TTBCR.T1SZ
+#endif
+	mcr	p15, 0, r10, c2, c0, 2		@ TTB control register
+	mov	r5, #0
+#if defined CONFIG_VMSPLIT_2G
+	/* PAGE_OFFSET == 0x80000000, T1SZ == 1 */
+	add	r6, r8, #1 << 4			@ skip two L1 entries
+#elif defined CONFIG_VMSPLIT_3G
+	/* PAGE_OFFSET == 0xc0000000, T1SZ == 2 */
+	add	r6, r8, #4096 * (1 + 3)		@ only L2 used, skip pgd+3*pmd
+#else
+	mov	r6, r8
+#endif
+	mcrr	p15, 1, r6, r5, c2		@ load TTBR1
+	ldr	r5, =MAIR0
+	ldr	r6, =MAIR1
+	mcr	p15, 0, r5, c10, c2, 0		@ write MAIR0
+	mcr	p15, 0, r6, c10, c2, 1		@ write MAIR1
+#endif
+	adr	r5, v7_crval
+	ldmia	r5, {r5, r6}
+#ifdef CONFIG_CPU_ENDIAN_BE8
+	orr	r6, r6, #1 << 25		@ big-endian page tables
+#endif
+#ifdef CONFIG_SWP_EMULATE
+	orr     r5, r5, #(1 << 10)              @ set SW bit in "clear"
+	bic     r6, r6, #(1 << 10)              @ clear it in "mmuset"
+#endif
+	mrc	p15, 0, r0, c1, c0, 0		@ read control register
+	bic	r0, r0, r5			@ clear them
+	orr	r0, r0, r6			@ set them
+ THUMB(	orr	r0, r0, #1 << 30	)	@ Thumb exceptions
+	mov	pc, lr				@ return to head.S:__ret
+ENDPROC(__v7_setup)
+
+	/*   AT
+	 *  TFR   EV X F   IHD LR    S
+	 * .EEE ..EE PUI. .TAT 4RVI ZWRS BLDP WCAM
+	 * rxxx rrxx xxx0 0101 xxxx xxxx x111 xxxx < forced
+	 *   11    0 110    1  0011 1100 .111 1101 < we want
+	 */
+	.type	v7_crval, #object
+v7_crval:
+	crval	clear=0x0120c302, mmuset=0x30c23c7d, ucset=0x00c01c7c
+
+__v7_setup_stack:
+	.space	4 * 11				@ 11 registers
+
+	__INITDATA
+
+	.type	v7_processor_functions, #object
+ENTRY(v7_processor_functions)
+	.word	v7_early_abort
+	.word	v7_pabort
+	.word	cpu_v7_proc_init
+	.word	cpu_v7_proc_fin
+	.word	cpu_v7_reset
+	.word	cpu_v7_do_idle
+	.word	cpu_v7_dcache_clean_area
+	.word	cpu_v7_switch_mm
+	.word	cpu_v7_set_pte_ext
+	.word	0
+	.word	0
+	.word	0
+	.size	v7_processor_functions, . - v7_processor_functions
+
+	.section ".rodata"
+
+	.type	cpu_arch_name, #object
+cpu_arch_name:
+	.asciz	"armv7"
+	.size	cpu_arch_name, . - cpu_arch_name
+
+	.type	cpu_elf_name, #object
+cpu_elf_name:
+	.asciz	"v7"
+	.size	cpu_elf_name, . - cpu_elf_name
+	.align
+
+	.section ".proc.info.init", #alloc, #execinstr
+
+	.type	__v7_ca15mp_proc_info, #object
+__v7_ca15mp_proc_info:
+	.long	0x410fc0f0		@ Required ID value
+	.long	0xff0ffff0		@ Mask for ID
+	ALT_SMP(.long \
+		PMD_TYPE_SECT | \
+		PMD_SECT_AP_WRITE | \
+		PMD_SECT_AP_READ | \
+		PMD_SECT_AF | \
+		PMD_FLAGS_SMP)
+	ALT_UP(.long \
+		PMD_TYPE_SECT | \
+		PMD_SECT_AP_WRITE | \
+		PMD_SECT_AP_READ | \
+		PMD_SECT_AF | \
+		PMD_FLAGS_UP)
+		/* PMD_SECT_XN is set explicitly in head.S for LPAE */
+	.long   PMD_TYPE_SECT | \
+		PMD_SECT_XN | \
+		PMD_SECT_AP_WRITE | \
+		PMD_SECT_AP_READ | \
+		PMD_SECT_AF
+	b	__v7_ca15mp_setup
+	.long	cpu_arch_name
+	.long	cpu_elf_name
+	.long	HWCAP_SWP|HWCAP_HALF|HWCAP_THUMB|HWCAP_FAST_MULT|HWCAP_EDSP|HWCAP_TLS
+	.long	cpu_v7_name
+	.long	v7_processor_functions
+	.long	v7wbi_tlb_fns
+	.long	v6_user_fns
+	.long	v7_cache_fns
+	.size	__v7_ca15mp_proc_info, . - __v7_ca15mp_proc_info
+
+	/*
+	 * Match any ARMv7 processor core.
+	 */
+	.type	__v7_proc_info, #object
+__v7_proc_info:
+	.long	0x000f0000		@ Required ID value
+	.long	0x000f0000		@ Mask for ID
+	ALT_SMP(.long \
+		PMD_TYPE_SECT | \
+		PMD_SECT_AP_WRITE | \
+		PMD_SECT_AP_READ | \
+		PMD_SECT_AF | \
+		PMD_FLAGS_SMP)
+	ALT_UP(.long \
+		PMD_TYPE_SECT | \
+		PMD_SECT_AP_WRITE | \
+		PMD_SECT_AP_READ | \
+		PMD_SECT_AF | \
+		PMD_FLAGS_UP)
+		/* PMD_SECT_XN is set explicitly in head.S for LPAE */
+	.long   PMD_TYPE_SECT | \
+		PMD_SECT_XN | \
+		PMD_SECT_AP_WRITE | \
+		PMD_SECT_AP_READ | \
+		PMD_SECT_AF
+	W(b)	__v7_setup
+	.long	cpu_arch_name
+	.long	cpu_elf_name
+	.long	HWCAP_SWP|HWCAP_HALF|HWCAP_THUMB|HWCAP_FAST_MULT|HWCAP_EDSP|HWCAP_TLS
+	.long	cpu_v7_name
+	.long	v7_processor_functions
+	.long	v7wbi_tlb_fns
+	.long	v6_user_fns
+	.long	v7_cache_fns
+	.size	__v7_proc_info, . - __v7_proc_info

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v7 10/16] ARM: LPAE: Invalidate the TLB before freeing the PMD
  2011-08-10 15:03 [PATCH v7 00/16] ARM: Add support for the Large Physical Address Extensions Catalin Marinas
                   ` (8 preceding siblings ...)
  2011-08-10 15:03 ` [PATCH v7 09/16] ARM: LPAE: MMU setup for the 3-level page table format Catalin Marinas
@ 2011-08-10 15:03 ` Catalin Marinas
  2011-08-10 15:03 ` [PATCH v7 11/16] ARM: LPAE: Add fault handling support Catalin Marinas
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 46+ messages in thread
From: Catalin Marinas @ 2011-08-10 15:03 UTC (permalink / raw)
  To: linux-arm-kernel

Similar to the PTE freeing, this patch introduces __pmd_free_tlb(), which
invalidates the TLB before freeing a PMD page. This is needed because on
newer processors the entry in the upper page table may be cached by the
TLB and point to random data after the PMD has been freed.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/include/asm/tlb.h |   11 ++++++++++-
 1 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/arch/arm/include/asm/tlb.h b/arch/arm/include/asm/tlb.h
index b509e44..5d3ed7e3 100644
--- a/arch/arm/include/asm/tlb.h
+++ b/arch/arm/include/asm/tlb.h
@@ -202,8 +202,17 @@ static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte,
 	tlb_remove_page(tlb, pte);
 }
 
+static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmdp,
+				  unsigned long addr)
+{
+#ifdef CONFIG_ARM_LPAE
+	tlb_add_flush(tlb, addr);
+	tlb_remove_page(tlb, virt_to_page(pmdp));
+#endif
+}
+
 #define pte_free_tlb(tlb, ptep, addr)	__pte_free_tlb(tlb, ptep, addr)
-#define pmd_free_tlb(tlb, pmdp, addr)	pmd_free((tlb)->mm, pmdp)
+#define pmd_free_tlb(tlb, pmdp, addr)	__pmd_free_tlb(tlb, pmdp, addr)
 #define pud_free_tlb(tlb, pudp, addr)	pud_free((tlb)->mm, pudp)
 
 #define tlb_migrate_finish(mm)		do { } while (0)

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v7 11/16] ARM: LPAE: Add fault handling support
  2011-08-10 15:03 [PATCH v7 00/16] ARM: Add support for the Large Physical Address Extensions Catalin Marinas
                   ` (9 preceding siblings ...)
  2011-08-10 15:03 ` [PATCH v7 10/16] ARM: LPAE: Invalidate the TLB before freeing the PMD Catalin Marinas
@ 2011-08-10 15:03 ` Catalin Marinas
  2011-10-23 11:57   ` Russell King - ARM Linux
  2011-08-10 15:03 ` [PATCH v7 12/16] ARM: LPAE: Add context switching support Catalin Marinas
                   ` (4 subsequent siblings)
  15 siblings, 1 reply; 46+ messages in thread
From: Catalin Marinas @ 2011-08-10 15:03 UTC (permalink / raw)
  To: linux-arm-kernel

The DFSR and IFSR register format is different when LPAE is enabled. In
addition, DFSR and IFSR have similar definitions for the fault type.
This patch modifies the fault code to correctly handle the new format.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
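Note for readers: the change to fsr_fs() can be tried outside the kernel.
The following is only a minimal sketch, assuming the bit positions used in
fault.c; the function names fsr_fs_classic() and fsr_fs_lpae() are
hypothetical and do not appear in the patch.

```c
#include <stdint.h>

/* Classic (short-descriptor) layout: FS[3:0] in bits [3:0], FS[4] in bit 10. */
#define FSR_FS4		(1u << 10)
#define FSR_FS3_0	0x0fu
/* LPAE (long-descriptor) layout: the status is the contiguous bits [5:0]. */
#define FSR_FS5_0	0x3fu

/* Hypothetical helper: reassemble the 5-bit classic status code. */
static unsigned int fsr_fs_classic(uint32_t fsr)
{
	return (fsr & FSR_FS3_0) | ((fsr & FSR_FS4) >> 6);
}

/* Hypothetical helper: with LPAE a single mask is enough. */
static unsigned int fsr_fs_lpae(uint32_t fsr)
{
	return fsr & FSR_FS5_0;
}
```

With these helpers, an LPAE status value of 0x21 decodes to 33, which is the
index this patch uses for ALIGNMENT_FAULT, whereas the classic alignment
fault is status 1.
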
 arch/arm/mm/alignment.c |    8 ++++-
 arch/arm/mm/fault.c     |   87 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 94 insertions(+), 1 deletions(-)

diff --git a/arch/arm/mm/alignment.c b/arch/arm/mm/alignment.c
index be7c638..f0bf61a 100644
--- a/arch/arm/mm/alignment.c
+++ b/arch/arm/mm/alignment.c
@@ -909,6 +909,12 @@ do_alignment(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 	return 0;
 }
 
+#ifdef CONFIG_ARM_LPAE
+#define ALIGNMENT_FAULT		33
+#else
+#define ALIGNMENT_FAULT		1
+#endif
+
 /*
  * This needs to be done after sysctl_init, otherwise sys/ will be
  * overwritten.  Actually, this shouldn't be in sys/ at all since
@@ -942,7 +948,7 @@ static int __init alignment_init(void)
 		ai_usermode = UM_FIXUP;
 	}
 
-	hook_fault_code(1, do_alignment, SIGBUS, BUS_ADRALN,
+	hook_fault_code(ALIGNMENT_FAULT, do_alignment, SIGBUS, BUS_ADRALN,
 			"alignment exception");
 
 	/*
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index 3b5ea68..91d1768 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -33,10 +33,15 @@
 #define FSR_WRITE		(1 << 11)
 #define FSR_FS4			(1 << 10)
 #define FSR_FS3_0		(15)
+#define FSR_FS5_0		(0x3f)
 
 static inline int fsr_fs(unsigned int fsr)
 {
+#ifdef CONFIG_ARM_LPAE
+	return fsr & FSR_FS5_0;
+#else
 	return (fsr & FSR_FS3_0) | (fsr & FSR_FS4) >> 6;
+#endif
 }
 
 #ifdef CONFIG_MMU
@@ -122,8 +127,10 @@ void show_pte(struct mm_struct *mm, unsigned long addr)
 
 		pte = pte_offset_map(pmd, addr);
 		printk(", *pte=%08llx", (long long)pte_val(*pte));
+#ifndef CONFIG_ARM_LPAE
 		printk(", *ppte=%08llx",
 		       (long long)pte_val(pte[PTE_HWTABLE_PTRS]));
+#endif
 		pte_unmap(pte);
 	} while(0);
 
@@ -440,6 +447,12 @@ do_translation_fault(unsigned long addr, unsigned int fsr,
 	pmd = pmd_offset(pud, addr);
 	pmd_k = pmd_offset(pud_k, addr);
 
+#ifdef CONFIG_ARM_LPAE
+	/*
+	 * Only one hardware entry per PMD with LPAE.
+	 */
+	index = 0;
+#else
 	/*
 	 * On ARM one Linux PGD entry contains two hardware entries (see page
 	 * tables layout in pgtable.h). We normally guarantee that we always
@@ -449,6 +462,7 @@ do_translation_fault(unsigned long addr, unsigned int fsr,
 	 * for the first of pair.
 	 */
 	index = (addr >> SECTION_SHIFT) & 1;
+#endif
 	if (pmd_none(pmd_k[index]))
 		goto bad_area;
 
@@ -494,6 +508,72 @@ static struct fsr_info {
 	int	code;
 	const char *name;
 } fsr_info[] = {
+#ifdef CONFIG_ARM_LPAE
+	{ do_bad,		SIGBUS,  0,		"unknown 0"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 1"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 2"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 3"			},
+	{ do_bad,		SIGBUS,  0,		"reserved translation fault"	},
+	{ do_translation_fault,	SIGSEGV, SEGV_MAPERR,	"level 1 translation fault"	},
+	{ do_translation_fault,	SIGSEGV, SEGV_MAPERR,	"level 2 translation fault"	},
+	{ do_page_fault,	SIGSEGV, SEGV_MAPERR,	"level 3 translation fault"	},
+	{ do_bad,		SIGBUS,  0,		"reserved access flag fault"	},
+	{ do_bad,		SIGSEGV, SEGV_ACCERR,	"level 1 access flag fault"	},
+	{ do_bad,		SIGSEGV, SEGV_ACCERR,	"level 2 access flag fault"	},
+	{ do_page_fault,	SIGSEGV, SEGV_ACCERR,	"level 3 access flag fault"	},
+	{ do_bad,		SIGBUS,  0,		"reserved permission fault"	},
+	{ do_bad,		SIGSEGV, SEGV_ACCERR,	"level 1 permission fault"	},
+	{ do_sect_fault,	SIGSEGV, SEGV_ACCERR,	"level 2 permission fault"	},
+	{ do_page_fault,	SIGSEGV, SEGV_ACCERR,	"level 3 permission fault"	},
+	{ do_bad,		SIGBUS,  0,		"synchronous external abort"	},
+	{ do_bad,		SIGBUS,  0,		"asynchronous external abort"	},
+	{ do_bad,		SIGBUS,  0,		"unknown 18"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 19"			},
+	{ do_bad,		SIGBUS,  0,		"synchronous abort (translation table walk)" },
+	{ do_bad,		SIGBUS,  0,		"synchronous abort (translation table walk)" },
+	{ do_bad,		SIGBUS,  0,		"synchronous abort (translation table walk)" },
+	{ do_bad,		SIGBUS,  0,		"synchronous abort (translation table walk)" },
+	{ do_bad,		SIGBUS,  0,		"synchronous parity error"	},
+	{ do_bad,		SIGBUS,  0,		"asynchronous parity error"	},
+	{ do_bad,		SIGBUS,  0,		"unknown 26"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 27"			},
+	{ do_bad,		SIGBUS,  0,		"synchronous parity error (translation table walk)" },
+	{ do_bad,		SIGBUS,  0,		"synchronous parity error (translation table walk)" },
+	{ do_bad,		SIGBUS,  0,		"synchronous parity error (translation table walk)" },
+	{ do_bad,		SIGBUS,  0,		"synchronous parity error (translation table walk)" },
+	{ do_bad,		SIGBUS,  0,		"unknown 32"			},
+	{ do_bad,		SIGBUS,  BUS_ADRALN,	"alignment fault"		},
+	{ do_bad,		SIGBUS,  0,		"debug event"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 35"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 36"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 37"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 38"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 39"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 40"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 41"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 42"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 43"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 44"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 45"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 46"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 47"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 48"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 49"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 50"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 51"			},
+	{ do_bad,		SIGBUS,  0,		"implementation fault (lockdown abort)" },
+	{ do_bad,		SIGBUS,  0,		"unknown 53"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 54"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 55"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 56"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 57"			},
+	{ do_bad,		SIGBUS,  0,		"implementation fault (coprocessor abort)" },
+	{ do_bad,		SIGBUS,  0,		"unknown 59"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 60"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 61"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 62"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 63"			},
+#else	/* !CONFIG_ARM_LPAE */
 	/*
 	 * The following are the standard ARMv3 and ARMv4 aborts.  ARMv5
 	 * defines these to be "precise" aborts.
@@ -535,6 +615,7 @@ static struct fsr_info {
 	{ do_bad,		SIGBUS,  0,		"unknown 29"			   },
 	{ do_bad,		SIGBUS,  0,		"unknown 30"			   },
 	{ do_bad,		SIGBUS,  0,		"unknown 31"			   }
+#endif	/* CONFIG_ARM_LPAE */
 };
 
 void __init
@@ -573,6 +654,9 @@ do_DataAbort(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 }
 
 
+#ifdef CONFIG_ARM_LPAE
+#define ifsr_info	fsr_info
+#else	/* !CONFIG_ARM_LPAE */
 static struct fsr_info ifsr_info[] = {
 	{ do_bad,		SIGBUS,  0,		"unknown 0"			   },
 	{ do_bad,		SIGBUS,  0,		"unknown 1"			   },
@@ -607,6 +691,7 @@ static struct fsr_info ifsr_info[] = {
 	{ do_bad,		SIGBUS,  0,		"unknown 30"			   },
 	{ do_bad,		SIGBUS,  0,		"unknown 31"			   },
 };
+#endif	/* CONFIG_ARM_LPAE */
 
 void __init
 hook_ifault_code(int nr, int (*fn)(unsigned long, unsigned int, struct pt_regs *),
@@ -642,6 +727,7 @@ do_PrefetchAbort(unsigned long addr, unsigned int ifsr, struct pt_regs *regs)
 
 static int __init exceptions_init(void)
 {
+#ifndef CONFIG_ARM_LPAE
 	if (cpu_architecture() >= CPU_ARCH_ARMv6) {
 		hook_fault_code(4, do_translation_fault, SIGSEGV, SEGV_MAPERR,
 				"I-cache maintenance fault");
@@ -657,6 +743,7 @@ static int __init exceptions_init(void)
 		hook_fault_code(6, do_bad, SIGSEGV, SEGV_MAPERR,
 				"section access flag fault");
 	}
+#endif
 
 	return 0;
 }

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v7 12/16] ARM: LPAE: Add context switching support
  2011-08-10 15:03 [PATCH v7 00/16] ARM: Add support for the Large Physical Address Extensions Catalin Marinas
                   ` (10 preceding siblings ...)
  2011-08-10 15:03 ` [PATCH v7 11/16] ARM: LPAE: Add fault handling support Catalin Marinas
@ 2011-08-10 15:03 ` Catalin Marinas
  2011-08-10 15:03 ` [PATCH v7 13/16] ARM: LPAE: Add identity mapping support for the 3-level page table format Catalin Marinas
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 46+ messages in thread
From: Catalin Marinas @ 2011-08-10 15:03 UTC (permalink / raw)
  To: linux-arm-kernel

With LPAE, TTBRx registers are 64-bit. The ASID is stored in TTBR0
rather than a separate Context ID register. This patch makes the
necessary changes to handle context switching on LPAE.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
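Note for readers: the register arithmetic done by the LPAE cpu_set_asid()
can be modelled in plain C. This is only a sketch, assuming an 8-bit ASID
(so that "asid & ~ASID_MASK" keeps the low 8 bits); ttbr0_set_asid() is a
hypothetical name, not a kernel function.

```c
#include <stdint.h>

#define ASID_BITS	8	/* assumption: 8-bit hardware ASID */

/*
 * Hypothetical model of the mrrc/mov/mcrr sequence: the low word of
 * TTBR0 is kept as read, the high word is replaced by the ASID shifted
 * to bit 48 of the 64-bit register (bit 16 of the high word).
 */
static uint64_t ttbr0_set_asid(uint64_t ttbr0, unsigned int asid)
{
	uint32_t lo = (uint32_t)ttbr0;
	uint32_t hi = (asid & ((1u << ASID_BITS) - 1)) << (48 - 32);

	return ((uint64_t)hi << 32) | lo;
}
```

For a TTBR0 low word of 0x80004000 and a context id of 0x1a5, the model
yields 0x00a5000080004000. As in the macro, the previous high word is
discarded, so the sketch assumes the table base fits in 32 bits.
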
 arch/arm/mm/context.c |   19 +++++++++++++++++--
 1 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/arch/arm/mm/context.c b/arch/arm/mm/context.c
index b0ee9ba..fcdb101 100644
--- a/arch/arm/mm/context.c
+++ b/arch/arm/mm/context.c
@@ -22,6 +22,21 @@ unsigned int cpu_last_asid = ASID_FIRST_VERSION;
 DEFINE_PER_CPU(struct mm_struct *, current_mm);
 #endif
 
+#ifdef CONFIG_ARM_LPAE
+#define cpu_set_asid(asid) {						\
+	unsigned long ttbl, ttbh;					\
+	asm volatile(							\
+	"	mrrc	p15, 0, %0, %1, c2		@ read TTBR0\n"	\
+	"	mov	%1, %2, lsl #(48 - 32)		@ set ASID\n"	\
+	"	mcrr	p15, 0, %0, %1, c2		@ set TTBR0\n"	\
+	: "=&r" (ttbl), "=&r" (ttbh)					\
+	: "r" (asid & ~ASID_MASK));					\
+}
+#else
+#define cpu_set_asid(asid) \
+	asm("	mcr	p15, 0, %0, c13, c0, 1\n" : : "r" (asid))
+#endif
+
 /*
  * We fork()ed a process, and we need a new context for the child
  * to run in.  We reserve version 0 for initial tasks so we will
@@ -37,7 +52,7 @@ void __init_new_context(struct task_struct *tsk, struct mm_struct *mm)
 static void flush_context(void)
 {
 	/* set the reserved ASID before flushing the TLB */
-	asm("mcr	p15, 0, %0, c13, c0, 1\n" : : "r" (0));
+	cpu_set_asid(0);
 	isb();
 	local_flush_tlb_all();
 	if (icache_is_vivt_asid_tagged()) {
@@ -99,7 +114,7 @@ static void reset_context(void *info)
 	set_mm_context(mm, asid);
 
 	/* set the new ASID */
-	asm("mcr	p15, 0, %0, c13, c0, 1\n" : : "r" (mm->context.id));
+	cpu_set_asid(mm->context.id);
 	isb();
 }
 

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v7 13/16] ARM: LPAE: Add identity mapping support for the 3-level page table format
  2011-08-10 15:03 [PATCH v7 00/16] ARM: Add support for the Large Physical Address Extensions Catalin Marinas
                   ` (11 preceding siblings ...)
  2011-08-10 15:03 ` [PATCH v7 12/16] ARM: LPAE: Add context switching support Catalin Marinas
@ 2011-08-10 15:03 ` Catalin Marinas
  2011-10-23 11:59   ` Russell King - ARM Linux
  2011-08-10 15:03 ` [PATCH v7 14/16] ARM: LPAE: mark memory banks with start > ULONG_MAX as highmem Catalin Marinas
                   ` (2 subsequent siblings)
  15 siblings, 1 reply; 46+ messages in thread
From: Catalin Marinas @ 2011-08-10 15:03 UTC (permalink / raw)
  To: linux-arm-kernel

With LPAE, the pgd is a separate page table with entries pointing to the
pmd. The identity_mapping_add() function needs to ensure that the pgd is
populated before populating the pmd level. The do..while blocks now loop
over the pmd in order to have the same implementation for the two page
table formats. The pmd_addr_end() definition has been removed and the
generic one used instead. The pmd clean-up is done in the pgd_free()
function.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
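Note for readers: the LPAE do..while loop over the pmd can be exercised in
user space. The sketch below assumes 2MB sections (PMD_SHIFT of 21), models
addresses as 32-bit values, and writes into a flat array instead of a real
page table; idmap_fill() and the prot value passed to it are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define PMD_SHIFT	21	/* assumption: 2MB per LPAE pmd entry */
#define PMD_SIZE	((uint32_t)1 << PMD_SHIFT)
#define PMD_MASK	(~(PMD_SIZE - 1))

/* Next pmd boundary after addr, clamped to end. */
static uint32_t pmd_addr_end(uint32_t addr, uint32_t end)
{
	uint32_t boundary = (addr + PMD_SIZE) & PMD_MASK;

	return (boundary - 1 < end - 1) ? boundary : end;
}

/*
 * Hypothetical stand-in for the loop in idmap_add_pmd(): one section
 * entry per 2MB step, each holding the section base ORed with prot.
 * The caller must provide room for every entry. Returns the count.
 */
static size_t idmap_fill(uint64_t *pmd, uint32_t addr, uint32_t end,
			 uint64_t prot)
{
	size_t n = 0;
	uint32_t next;

	do {
		next = pmd_addr_end(addr, end);
		pmd[n++] = (addr & PMD_MASK) | prot;
		addr = next;
	} while (addr != end);

	return n;
}
```

Mapping 0x80100000..0x80500000 touches three sections (bases 0x80000000,
0x80200000 and 0x80400000), because both ends of the range are rounded to
the enclosing 2MB section.
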
 arch/arm/include/asm/pgtable.h |    4 ----
 arch/arm/mm/idmap.c            |   36 ++++++++++++++++++++++++++++++++++--
 2 files changed, 34 insertions(+), 6 deletions(-)

diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index 1db9ad6..9645e52 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -263,10 +263,6 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
 
 #define pmd_page(pmd)		pfn_to_page(__phys_to_pfn(pmd_val(pmd) & PHYS_MASK))
 
-/* we don't need complex calculations here as the pmd is folded into the pgd */
-#define pmd_addr_end(addr,end)	(end)
-
-
 #ifndef CONFIG_HIGHPTE
 #define __pte_map(pmd)		pmd_page_vaddr(*(pmd))
 #define __pte_unmap(pte)	do { } while (0)
diff --git a/arch/arm/mm/idmap.c b/arch/arm/mm/idmap.c
index 2be9139..24e0655 100644
--- a/arch/arm/mm/idmap.c
+++ b/arch/arm/mm/idmap.c
@@ -1,9 +1,36 @@
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include <linux/kernel.h>
 
 #include <asm/cputype.h>
 #include <asm/pgalloc.h>
 #include <asm/pgtable.h>
 
+#ifdef CONFIG_ARM_LPAE
+static void idmap_add_pmd(pud_t *pud, unsigned long addr, unsigned long end,
+	unsigned long prot)
+{
+	pmd_t *pmd;
+	unsigned long next;
+
+	if (pud_none_or_clear_bad(pud) || (pud_val(*pud) & L_PGD_SWAPPER)) {
+		pmd = pmd_alloc_one(NULL, addr);
+		if (!pmd) {
+			pr_warning("Failed to allocate identity pmd.\n");
+			return;
+		}
+		pud_populate(NULL, pud, pmd);
+		pmd += pmd_index(addr);
+	} else
+		pmd = pmd_offset(pud, addr);
+
+	do {
+		next = pmd_addr_end(addr, end);
+		*pmd = __pmd((addr & PMD_MASK) | prot);
+		flush_pmd_entry(pmd);
+	} while (pmd++, addr = next, addr != end);
+}
+#else	/* !CONFIG_ARM_LPAE */
 static void idmap_add_pmd(pud_t *pud, unsigned long addr, unsigned long end,
 	unsigned long prot)
 {
@@ -15,6 +42,7 @@ static void idmap_add_pmd(pud_t *pud, unsigned long addr, unsigned long end,
 	pmd[1] = __pmd(addr);
 	flush_pmd_entry(pmd);
 }
+#endif	/* CONFIG_ARM_LPAE */
 
 static void idmap_add_pud(pgd_t *pgd, unsigned long addr, unsigned long end,
 	unsigned long prot)
@@ -32,7 +60,7 @@ void identity_mapping_add(pgd_t *pgd, unsigned long addr, unsigned long end)
 {
 	unsigned long prot, next;
 
-	prot = PMD_TYPE_SECT | PMD_SECT_AP_WRITE;
+	prot = PMD_TYPE_SECT | PMD_SECT_AP_WRITE | PMD_SECT_AF;
 	if (cpu_architecture() <= CPU_ARCH_ARMv5TEJ && !cpu_is_xscale())
 		prot |= PMD_BIT4;
 
@@ -46,7 +74,11 @@ void identity_mapping_add(pgd_t *pgd, unsigned long addr, unsigned long end)
 #ifdef CONFIG_SMP
 static void idmap_del_pmd(pud_t *pud, unsigned long addr, unsigned long end)
 {
-	pmd_t *pmd = pmd_offset(pud, addr);
+	pmd_t *pmd;
+
+	if (pud_none_or_clear_bad(pud))
+		return;
+	pmd = pmd_offset(pud, addr);
 	pmd_clear(pmd);
 }
 

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v7 14/16] ARM: LPAE: mark memory banks with start > ULONG_MAX as highmem
  2011-08-10 15:03 [PATCH v7 00/16] ARM: Add support for the Large Physical Address Extensions Catalin Marinas
                   ` (12 preceding siblings ...)
  2011-08-10 15:03 ` [PATCH v7 13/16] ARM: LPAE: Add identity mapping support for the 3-level page table format Catalin Marinas
@ 2011-08-10 15:03 ` Catalin Marinas
  2011-08-10 15:03 ` [PATCH v7 15/16] ARM: LPAE: add support for ATAG_MEM64 Catalin Marinas
  2011-08-10 15:03 ` [PATCH v7 16/16] ARM: LPAE: Add the Kconfig entries Catalin Marinas
  15 siblings, 0 replies; 46+ messages in thread
From: Catalin Marinas @ 2011-08-10 15:03 UTC (permalink / raw)
  To: linux-arm-kernel

From: Will Deacon <will.deacon@arm.com>

Memory banks living outside of the 32-bit physical address
space do not have a 1:1 pa <-> va mapping and therefore the
__va macro may wrap.

This patch ensures that such banks are marked as highmem so
that the kernel doesn't try to split them up when it sees that
the wrapped virtual address overlaps the vmalloc space.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
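Note for readers: the wrap described above can be reproduced with 32-bit
arithmetic. This is only a sketch; PHYS_OFFSET, PAGE_OFFSET and VMALLOC_MIN
are hypothetical values chosen for illustration, UINT32_MAX stands in for
ULONG_MAX on a 32-bit kernel, and va_of() and bank_is_highmem() are not
kernel functions.

```c
#include <stdint.h>

/* Hypothetical platform constants, for illustration only. */
#define PHYS_OFFSET	0x80000000u
#define PAGE_OFFSET	0xc0000000u
#define VMALLOC_MIN	0xf0000000u

/* 32-bit linear map: the physical address is truncated before use. */
static uint32_t va_of(uint64_t phys)
{
	return (uint32_t)phys - PHYS_OFFSET + PAGE_OFFSET;
}

/* Mirrors the highmem test in sanity_check_meminfo() after this patch. */
static int bank_is_highmem(uint64_t start)
{
	return start > UINT32_MAX ||
	       va_of(start) >= VMALLOC_MIN ||
	       va_of(start) < PAGE_OFFSET;
}
```

With these constants a bank at 0x180000000 wraps to the virtual address
0xc0000000, which looks like ordinary lowmem; only the explicit comparison
of the start address against the 32-bit limit marks it as highmem.
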
 arch/arm/mm/mmu.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 1ba2a5a..85aa25e 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -792,7 +792,8 @@ void __init sanity_check_meminfo(void)
 		*bank = meminfo.bank[i];
 
 #ifdef CONFIG_HIGHMEM
-		if (__va(bank->start) >= vmalloc_min ||
+		if (bank->start > ULONG_MAX ||
+		    __va(bank->start) >= vmalloc_min ||
 		    __va(bank->start) < (void *)PAGE_OFFSET)
 			highmem = 1;
 
@@ -802,7 +803,7 @@ void __init sanity_check_meminfo(void)
 		 * Split those memory banks which are partially overlapping
 		 * the vmalloc area greatly simplifying things later.
 		 */
-		if (__va(bank->start) < vmalloc_min &&
+		if (!highmem && __va(bank->start) < vmalloc_min &&
 		    bank->size > vmalloc_min - __va(bank->start)) {
 			if (meminfo.nr_banks >= NR_BANKS) {
 				printk(KERN_CRIT "NR_BANKS too low, "

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v7 15/16] ARM: LPAE: add support for ATAG_MEM64
  2011-08-10 15:03 [PATCH v7 00/16] ARM: Add support for the Large Physical Address Extensions Catalin Marinas
                   ` (13 preceding siblings ...)
  2011-08-10 15:03 ` [PATCH v7 14/16] ARM: LPAE: mark memory banks with start > ULONG_MAX as highmem Catalin Marinas
@ 2011-08-10 15:03 ` Catalin Marinas
  2011-10-23 11:59   ` Russell King - ARM Linux
  2011-08-10 15:03 ` [PATCH v7 16/16] ARM: LPAE: Add the Kconfig entries Catalin Marinas
  15 siblings, 1 reply; 46+ messages in thread
From: Catalin Marinas @ 2011-08-10 15:03 UTC (permalink / raw)
  To: linux-arm-kernel

From: Will Deacon <will.deacon@arm.com>

LPAE provides support for memory banks with physical addresses of up
to 40 bits.

This patch adds a new atag, ATAG_MEM64, so that the kernel can be
informed about memory that exists above the 4GB boundary.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
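Note for readers: parse_tag_mem64() passes the full 64-bit start address
but casts the size to unsigned long. The sketch below models that with
fixed-width types, assuming a 32-bit unsigned long; struct mem_bank and
bank_from_tag() are hypothetical and only illustrate the truncation.

```c
#include <stdint.h>

/* Same layout as the tag_mem64 added by this patch. */
struct tag_mem64 {
	uint64_t size;
	uint64_t start;	/* physical start address */
};

/* Hypothetical result type: 64-bit start, 32-bit size. */
struct mem_bank {
	uint64_t start;
	uint32_t size;
};

/* Hypothetical helper: keep the start, truncate the size to 32 bits. */
static struct mem_bank bank_from_tag(const struct tag_mem64 *t)
{
	struct mem_bank b = {
		.start = t->start,
		.size = (uint32_t)t->size,
	};

	return b;
}
```

A tag describing 0x120000000 bytes at 0x880000000 therefore ends up with a
size of 0x20000000, so a boot loader would presumably need to describe such
a region with several tags of less than 4GB each.
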
 arch/arm/include/asm/setup.h |    8 ++++++++
 arch/arm/kernel/setup.c      |   10 ++++++++++
 2 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/arch/arm/include/asm/setup.h b/arch/arm/include/asm/setup.h
index 915696d..a3ca303 100644
--- a/arch/arm/include/asm/setup.h
+++ b/arch/arm/include/asm/setup.h
@@ -43,6 +43,13 @@ struct tag_mem32 {
 	__u32	start;	/* physical start address */
 };
 
+#define ATAG_MEM64	0x54420002
+
+struct tag_mem64 {
+	__u64	size;
+	__u64	start;	/* physical start address */
+};
+
 /* VGA text type displays */
 #define ATAG_VIDEOTEXT	0x54410003
 
@@ -148,6 +155,7 @@ struct tag {
 	union {
 		struct tag_core		core;
 		struct tag_mem32	mem;
+		struct tag_mem64	mem64;
 		struct tag_videotext	videotext;
 		struct tag_ramdisk	ramdisk;
 		struct tag_initrd	initrd;
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 70bca64..a126558 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -608,6 +608,16 @@ static int __init parse_tag_mem32(const struct tag *tag)
 
 __tagtable(ATAG_MEM, parse_tag_mem32);
 
+#ifdef CONFIG_PHYS_ADDR_T_64BIT
+static int __init parse_tag_mem64(const struct tag *tag)
+{
+	/* We only use 32-bits for the size. */
+	return arm_add_memory(tag->u.mem64.start, (unsigned long)tag->u.mem64.size);
+}
+
+__tagtable(ATAG_MEM64, parse_tag_mem64);
+#endif /* CONFIG_PHYS_ADDR_T_64BIT */
+
 #if defined(CONFIG_VGA_CONSOLE) || defined(CONFIG_DUMMY_CONSOLE)
 struct screen_info screen_info = {
  .orig_video_lines	= 30,

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v7 16/16] ARM: LPAE: Add the Kconfig entries
  2011-08-10 15:03 [PATCH v7 00/16] ARM: Add support for the Large Physical Address Extensions Catalin Marinas
                   ` (14 preceding siblings ...)
  2011-08-10 15:03 ` [PATCH v7 15/16] ARM: LPAE: add support for ATAG_MEM64 Catalin Marinas
@ 2011-08-10 15:03 ` Catalin Marinas
  2011-10-23 12:00   ` Russell King - ARM Linux
  2011-11-02 17:21   ` Russell King - ARM Linux
  15 siblings, 2 replies; 46+ messages in thread
From: Catalin Marinas @ 2011-08-10 15:03 UTC (permalink / raw)
  To: linux-arm-kernel

This patch adds the ARM_LPAE and ARCH_PHYS_ADDR_T_64BIT Kconfig entries
allowing LPAE support to be compiled into the kernel.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/Kconfig    |    2 +-
 arch/arm/mm/Kconfig |   13 +++++++++++++
 2 files changed, 14 insertions(+), 1 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 2c71a8f..2a81b90 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1820,7 +1820,7 @@ endchoice
 
 config XIP_KERNEL
 	bool "Kernel Execute-In-Place from ROM"
-	depends on !ZBOOT_ROM
+	depends on !ZBOOT_ROM && !ARM_LPAE
 	help
 	  Execute-In-Place allows the kernel to run from non-volatile storage
 	  directly addressable by the CPU, such as NOR flash. This saves RAM
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index 88633fe..2df5504 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -629,6 +629,19 @@ config IO_36
 
 comment "Processor Features"
 
+config ARM_LPAE
+	bool "Support for the Large Physical Address Extension"
+	depends on MMU && CPU_V7
+	help
+	  Say Y if you have an ARMv7 processor supporting the LPAE page table
+	  format and you would like to access memory beyond the 4GB limit.
+
+config ARCH_PHYS_ADDR_T_64BIT
+	def_bool ARM_LPAE
+
+config ARCH_DMA_ADDR_T_64BIT
+	def_bool ARM_LPAE
+
 config ARM_THUMB
 	bool "Support Thumb user binaries"
 	depends on CPU_ARM720T || CPU_ARM740T || CPU_ARM920T || CPU_ARM922T || CPU_ARM925T || CPU_ARM926T || CPU_ARM940T || CPU_ARM946E || CPU_ARM1020 || CPU_ARM1020E || CPU_ARM1022 || CPU_ARM1026 || CPU_XSCALE || CPU_XSC3 || CPU_MOHAWK || CPU_V6 || CPU_V6K || CPU_V7 || CPU_FEROCEON

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v7 09/16] ARM: LPAE: MMU setup for the 3-level page table format
  2011-08-10 15:03 ` [PATCH v7 09/16] ARM: LPAE: MMU setup for the 3-level page table format Catalin Marinas
@ 2011-08-13 11:49   ` Vasily Khoruzhick
  2011-08-13 12:56   ` Vasily Khoruzhick
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 46+ messages in thread
From: Vasily Khoruzhick @ 2011-08-13 11:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday 10 August 2011 18:03:32 Catalin Marinas wrote:
> This patch adds the MMU initialisation for the LPAE page table format.
> The swapper_pg_dir size with LPAE is 5 rather than 4 pages. A new
> proc-v7lpae.S file contains the initialisation, context switch and
> save/restore code for ARMv7 with the LPAE. The TTBRx split is based on
> the PAGE_OFFSET with TTBR1 used for the kernel mappings. The 36-bit
> mappings (supersections) and a few other memory types in mmu.c are
> conditionally compiled.

Looks like this patch breaks ARMv4. I can no longer boot the kernel on my
s3c2442-based PDA after this patch. Reverting it helps. Any ideas?

> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
>  arch/arm/kernel/head.S    |  117 +++++++++----
>  arch/arm/mm/Makefile      |    4 +
>  arch/arm/mm/mmu.c         |   34 ++++-
>  arch/arm/mm/proc-macros.S |    5 +-
>  arch/arm/mm/proc-v7lpae.S |  422 +++++++++++++++++++++++++++++++++++++++++++++
>  5 files changed, 542 insertions(+), 40 deletions(-)
>  create mode 100644 arch/arm/mm/proc-v7lpae.S
> 
> diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
> index d8231b2..0bdafc4 100644
> --- a/arch/arm/kernel/head.S
> +++ b/arch/arm/kernel/head.S
> @@ -21,6 +21,7 @@
>  #include <asm/memory.h>
>  #include <asm/thread_info.h>
>  #include <asm/system.h>
> +#include <asm/pgtable.h>
> 
>  #ifdef CONFIG_DEBUG_LL
>  #include <mach/debug-macro.S>
> @@ -38,11 +39,20 @@
>  #error KERNEL_RAM_VADDR must start at 0xXXXX8000
>  #endif
> 
> +#ifdef CONFIG_ARM_LPAE
> +	/* LPAE requires an additional page for the PGD */
> +#define PG_DIR_SIZE	0x5000
> +#define PMD_ORDER	3
> +#else
> +#define PG_DIR_SIZE	0x4000
> +#define PMD_ORDER	2
> +#endif
> +
>  	.globl	swapper_pg_dir
> -	.equ	swapper_pg_dir, KERNEL_RAM_VADDR - 0x4000
> +	.equ	swapper_pg_dir, KERNEL_RAM_VADDR - PG_DIR_SIZE
> 
>  	.macro	pgtbl, rd, phys
> -	add	\rd, \phys, #TEXT_OFFSET - 0x4000
> +	add	\rd, \phys, #TEXT_OFFSET - PG_DIR_SIZE
>  	.endm
> 
>  #ifdef CONFIG_XIP_KERNEL
> @@ -148,11 +158,11 @@ __create_page_tables:
>  	pgtbl	r4, r8				@ page table address
> 
>  	/*
> -	 * Clear the 16K level 1 swapper page table
> +	 * Clear the swapper page table
>  	 */
>  	mov	r0, r4
>  	mov	r3, #0
> -	add	r6, r0, #0x4000
> +	add	r6, r0, #PG_DIR_SIZE
>  1:	str	r3, [r0], #4
>  	str	r3, [r0], #4
>  	str	r3, [r0], #4
> @@ -160,6 +170,25 @@ __create_page_tables:
>  	teq	r0, r6
>  	bne	1b
> 
> +#ifdef CONFIG_ARM_LPAE
> +	/*
> +	 * Build the PGD table (first level) to point to the PMD table. A PGD
> +	 * entry is 64-bit wide.
> +	 */
> +	mov	r0, r4
> +	add	r3, r4, #0x1000			@ first PMD table address
> +	orr	r3, r3, #3			@ PGD block type
> +	mov	r6, #4				@ PTRS_PER_PGD
> +	mov	r7, #1 << (55 - 32)		@ L_PGD_SWAPPER
> +1:	str	r3, [r0], #4			@ set bottom PGD entry bits
> +	str	r7, [r0], #4			@ set top PGD entry bits
> +	add	r3, r3, #0x1000			@ next PMD table
> +	subs	r6, r6, #1
> +	bne	1b
> +
> +	add	r4, r4, #0x1000			@ point to the PMD tables
> +#endif
> +
>  	ldr	r7, [r10, #PROCINFO_MM_MMUFLAGS] @ mm_mmuflags
> 
>  	/*
> @@ -171,30 +200,30 @@ __create_page_tables:
>  	sub	r0, r0, r3			@ virt->phys offset
>  	add	r5, r5, r0			@ phys __enable_mmu
>  	add	r6, r6, r0			@ phys __enable_mmu_end
> -	mov	r5, r5, lsr #20
> -	mov	r6, r6, lsr #20
> +	mov	r5, r5, lsr #SECTION_SHIFT
> +	mov	r6, r6, lsr #SECTION_SHIFT
> 
> -1:	orr	r3, r7, r5, lsl #20		@ flags + kernel base
> -	str	r3, [r4, r5, lsl #2]		@ identity mapping
> -	teq	r5, r6
> -	addne	r5, r5, #1			@ next section
> -	bne	1b
> +1:	orr	r3, r7, r5, lsl #SECTION_SHIFT	@ flags + kernel base
> +	str	r3, [r4, r5, lsl #PMD_ORDER]	@ identity mapping
> +	cmp	r5, r6
> +	addlo	r5, r5, #SECTION_SHIFT >> 20	@ next section
> +	blo	1b
> 
>  	/*
>  	 * Now setup the pagetables for our kernel direct
>  	 * mapped region.
>  	 */
>  	mov	r3, pc
> -	mov	r3, r3, lsr #20
> -	orr	r3, r7, r3, lsl #20
> -	add	r0, r4,  #(KERNEL_START & 0xff000000) >> 18
> -	str	r3, [r0, #(KERNEL_START & 0x00f00000) >> 18]!
> +	mov	r3, r3, lsr #SECTION_SHIFT
> +	orr	r3, r7, r3, lsl #SECTION_SHIFT
> +	add	r0, r4,  #(KERNEL_START & 0xff000000) >> (SECTION_SHIFT - PMD_ORDER)
> +	str	r3, [r0, #(KERNEL_START & 0x00e00000) >> (SECTION_SHIFT - PMD_ORDER)]!
>  	ldr	r6, =(KERNEL_END - 1)
> -	add	r0, r0, #4
> -	add	r6, r4, r6, lsr #18
> +	add	r0, r0, #1 << PMD_ORDER
> +	add	r6, r4, r6, lsr #(SECTION_SHIFT - PMD_ORDER)
>  1:	cmp	r0, r6
> -	add	r3, r3, #1 << 20
> -	strls	r3, [r0], #4
> +	add	r3, r3, #1 << SECTION_SHIFT
> +	strls	r3, [r0], #1 << PMD_ORDER
>  	bls	1b
> 
>  #ifdef CONFIG_XIP_KERNEL
> @@ -203,11 +232,11 @@ __create_page_tables:
>  	 */
>  	add	r3, r8, #TEXT_OFFSET
>  	orr	r3, r3, r7
> -	add	r0, r4,  #(KERNEL_RAM_VADDR & 0xff000000) >> 18
> -	str	r3, [r0, #(KERNEL_RAM_VADDR & 0x00f00000) >> 18]!
> +	add	r0, r4,  #(KERNEL_RAM_VADDR & 0xff000000) >> (SECTION_SHIFT - PMD_ORDER)
> +	str	r3, [r0, #(KERNEL_RAM_VADDR & 0x00f00000) >> (SECTION_SHIFT - PMD_ORDER)]!
>  	ldr	r6, =(_end - 1)
>  	add	r0, r0, #4
> -	add	r6, r4, r6, lsr #18
> +	add	r6, r4, r6, lsr #(SECTION_SHIFT - PMD_ORDER)
>  1:	cmp	r0, r6
>  	add	r3, r3, #1 << 20
>  	strls	r3, [r0], #4
> @@ -215,15 +244,15 @@ __create_page_tables:
>  #endif
> 
>  	/*
> -	 * Then map boot params address in r2 or
> -	 * the first 1MB of ram if boot params address is not specified.
> +	 * Then map boot params address in r2 or the first 1MB (2MB with LPAE)
> +	 * of ram if boot params address is not specified.
>  	 */
> -	mov	r0, r2, lsr #20
> -	movs	r0, r0, lsl #20
> +	mov	r0, r2, lsr #SECTION_SHIFT
> +	movs	r0, r0, lsl #SECTION_SHIFT
>  	moveq	r0, r8
>  	sub	r3, r0, r8
>  	add	r3, r3, #PAGE_OFFSET
> -	add	r3, r4, r3, lsr #18
> +	add	r3, r4, r3, lsr #(SECTION_SHIFT - PMD_ORDER)
>  	orr	r6, r7, r0
>  	str	r6, [r3]
> 
> @@ -236,21 +265,27 @@ __create_page_tables:
>  	 */
>  	addruart r7, r3
> 
> -	mov	r3, r3, lsr #20
> -	mov	r3, r3, lsl #2
> +	mov	r3, r3, lsr #SECTION_SHIFT
> +	mov	r3, r3, lsl #PMD_ORDER
> 
>  	add	r0, r4, r3
>  	rsb	r3, r3, #0x4000			@ PTRS_PER_PGD*sizeof(long)
>  	cmp	r3, #0x0800			@ limit to 512MB
>  	movhi	r3, #0x0800
>  	add	r6, r0, r3
> -	mov	r3, r7, lsr #20
> +	mov	r3, r7, lsr #SECTION_SHIFT
>  	ldr	r7, [r10, #PROCINFO_IO_MMUFLAGS] @ io_mmuflags
> -	orr	r3, r7, r3, lsl #20
> +	orr	r3, r7, r3, lsl #SECTION_SHIFT
> +#ifdef CONFIG_ARM_LPAE
> +	mov	r7, #1 << (54 - 32)		@ XN
> +#endif
>  1:	str	r3, [r0], #4
> -	add	r3, r3, #1 << 20
> -	teq	r0, r6
> -	bne	1b
> +#ifdef CONFIG_ARM_LPAE
> +	str	r7, [r0], #4
> +#endif
> +	add	r3, r3, #1 << SECTION_SHIFT
> +	cmp	r0, r6
> +	blo	1b
> 
>  #else /* CONFIG_DEBUG_ICEDCC */
>  	/* we don't need any serial debugging mappings for ICEDCC */
> @@ -262,7 +297,7 @@ __create_page_tables:
>  	 * If we're using the NetWinder or CATS, we also need to map
>  	 * in the 16550-type serial port for the debug messages
>  	 */
> -	add	r0, r4, #0xff000000 >> 18
> +	add	r0, r4, #0xff000000 >> (SECTION_SHIFT - PMD_ORDER)
>  	orr	r3, r7, #0x7c000000
>  	str	r3, [r0]
>  #endif
> @@ -272,13 +307,16 @@ __create_page_tables:
>  	 * Similar reasons here - for debug.  This is
>  	 * only for Acorn RiscPC architectures.
>  	 */
> -	add	r0, r4, #0x02000000 >> 18
> +	add	r0, r4, #0x02000000 >> (SECTION_SHIFT - PMD_ORDER)
>  	orr	r3, r7, #0x02000000
>  	str	r3, [r0]
> -	add	r0, r4, #0xd8000000 >> 18
> +	add	r0, r4, #0xd8000000 >> (SECTION_SHIFT - PMD_ORDER)
>  	str	r3, [r0]
>  #endif
>  #endif
> +#ifdef CONFIG_ARM_LPAE
> +	sub	r4, r4, #0x1000		@ point to the PGD table
> +#endif
>  	mov	pc, lr
>  ENDPROC(__create_page_tables)
>  	.ltorg
> @@ -370,12 +408,17 @@ __enable_mmu:
>  #ifdef CONFIG_CPU_ICACHE_DISABLE
>  	bic	r0, r0, #CR_I
>  #endif
> +#ifdef CONFIG_ARM_LPAE
> +	mov	r5, #0
> +	mcrr	p15, 0, r4, r5, c2		@ load TTBR0
> +#else
>  	mov	r5, #(domain_val(DOMAIN_USER, DOMAIN_MANAGER) | \
>  		      domain_val(DOMAIN_KERNEL, DOMAIN_MANAGER) | \
>  		      domain_val(DOMAIN_TABLE, DOMAIN_MANAGER) | \
>  		      domain_val(DOMAIN_IO, DOMAIN_CLIENT))
>  	mcr	p15, 0, r5, c3, c0, 0		@ load domain access register
>  	mcr	p15, 0, r4, c2, c0, 0		@ load page table pointer
> +#endif
>  	b	__turn_mmu_on
>  ENDPROC(__enable_mmu)
> 
> diff --git a/arch/arm/mm/Makefile b/arch/arm/mm/Makefile
> index bca7e61..48639e7 100644
> --- a/arch/arm/mm/Makefile
> +++ b/arch/arm/mm/Makefile
> @@ -91,7 +91,11 @@ obj-$(CONFIG_CPU_MOHAWK)	+= proc-mohawk.o
>  obj-$(CONFIG_CPU_FEROCEON)	+= proc-feroceon.o
>  obj-$(CONFIG_CPU_V6)		+= proc-v6.o
>  obj-$(CONFIG_CPU_V6K)		+= proc-v6.o
> +ifeq ($(CONFIG_ARM_LPAE),y)
> +obj-$(CONFIG_CPU_V7)		+= proc-v7lpae.o
> +else
>  obj-$(CONFIG_CPU_V7)		+= proc-v7.o
> +endif
> 
>  AFLAGS_proc-v6.o	:=-Wa,-march=armv6
>  AFLAGS_proc-v7.o	:=-Wa,-march=armv7-a
> diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> index c990280..1ba2a5a 100644
> --- a/arch/arm/mm/mmu.c
> +++ b/arch/arm/mm/mmu.c
> @@ -150,6 +150,7 @@ static int __init early_nowrite(char *__unused)
>  }
>  early_param("nowb", early_nowrite);
> 
> +#ifndef CONFIG_ARM_LPAE
>  static int __init early_ecc(char *p)
>  {
>  	if (memcmp(p, "on", 2) == 0)
> @@ -159,6 +160,7 @@ static int __init early_ecc(char *p)
>  	return 0;
>  }
>  early_param("ecc", early_ecc);
> +#endif
> 
>  static int __init noalign_setup(char *__unused)
>  {
> @@ -228,10 +230,12 @@ static struct mem_type mem_types[] = {
>  		.prot_sect = PMD_TYPE_SECT | PMD_SECT_XN,
>  		.domain    = DOMAIN_KERNEL,
>  	},
> +#ifndef CONFIG_ARM_LPAE
>  	[MT_MINICLEAN] = {
>  		.prot_sect = PMD_TYPE_SECT | PMD_SECT_XN | PMD_SECT_MINICACHE,
>  		.domain    = DOMAIN_KERNEL,
>  	},
> +#endif
>  	[MT_LOW_VECTORS] = {
>  		.prot_pte  = L_PTE_PRESENT | L_PTE_YOUNG | L_PTE_DIRTY |
>  				L_PTE_RDONLY,
> @@ -421,6 +425,7 @@ static void __init build_mem_type_table(void)
>  	 * ARMv6 and above have extended page tables.
>  	 */
>  	if (cpu_arch >= CPU_ARCH_ARMv6 && (cr & CR_XP)) {
> +#ifndef CONFIG_ARM_LPAE
>  		/*
>  		 * Mark cache clean areas and XIP ROM read only
>  		 * from SVC mode and no access from userspace.
> @@ -428,6 +433,7 @@ static void __init build_mem_type_table(void)
>  		mem_types[MT_ROM].prot_sect |= PMD_SECT_APX|PMD_SECT_AP_WRITE;
>  		mem_types[MT_MINICLEAN].prot_sect |= PMD_SECT_APX|PMD_SECT_AP_WRITE;
>  		mem_types[MT_CACHECLEAN].prot_sect |= PMD_SECT_APX|PMD_SECT_AP_WRITE;
> +#endif
> 
>  		if (is_smp()) {
>  			/*
> @@ -466,6 +472,18 @@ static void __init build_mem_type_table(void)
>  		mem_types[MT_MEMORY_NONCACHED].prot_sect |= PMD_SECT_BUFFERABLE;
>  	}
> 
> +#ifdef CONFIG_ARM_LPAE
> +	/*
> +	 * Do not generate access flag faults for the kernel mappings.
> +	 */
> +	for (i = 0; i < ARRAY_SIZE(mem_types); i++) {
> +		mem_types[i].prot_pte |= PTE_EXT_AF;
> +		mem_types[i].prot_sect |= PMD_SECT_AF;
> +	}
> +	kern_pgprot |= PTE_EXT_AF;
> +	vecs_pgprot |= PTE_EXT_AF;
> +#endif
> +
>  	for (i = 0; i < 16; i++) {
>  		unsigned long v = pgprot_val(protection_map[i]);
>  		protection_map[i] = __pgprot(v | user_pgprot);
> @@ -564,8 +582,10 @@ static void __init alloc_init_section(pud_t *pud, unsigned long addr,
>  	if (((addr | end | phys) & ~SECTION_MASK) == 0) {
>  		pmd_t *p = pmd;
> 
> +#ifndef CONFIG_ARM_LPAE
>  		if (addr & SECTION_SIZE)
>  			pmd++;
> +#endif
> 
>  		do {
>  			*pmd = __pmd(phys | type->prot_sect);
> @@ -595,6 +615,7 @@ static void alloc_init_pud(pgd_t *pgd, unsigned long addr, unsigned long end,
>  	} while (pud++, addr = next, addr != end);
>  }
> 
> +#ifndef CONFIG_ARM_LPAE
>  static void __init create_36bit_mapping(struct map_desc *md,
>  					const struct mem_type *type)
>  {
> @@ -654,6 +675,7 @@ static void __init create_36bit_mapping(struct map_desc *md,
>  		pgd += SUPERSECTION_SIZE >> PGDIR_SHIFT;
>  	} while (addr != end);
>  }
> +#endif	/* !CONFIG_ARM_LPAE */
> 
>  /*
>   * Create the page directory entries and any necessary
> @@ -685,6 +707,7 @@ static void __init create_mapping(struct map_desc *md)
> 
>  	type = &mem_types[md->type];
> 
> +#ifndef CONFIG_ARM_LPAE
>  	/*
>  	 * Catch 36-bit addresses
>  	 */
> @@ -692,6 +715,7 @@ static void __init create_mapping(struct map_desc *md)
>  		create_36bit_mapping(md, type);
>  		return;
>  	}
> +#endif
> 
>  	addr = md->virtual & PAGE_MASK;
>  	phys = __pfn_to_phys(md->pfn);
> @@ -889,6 +913,14 @@ static inline void prepare_page_table(void)
>  		pmd_clear(pmd_off_k(addr));
>  }
> 
> +#ifdef CONFIG_ARM_LPAE
> +/* the first page is reserved for pgd */
> +#define SWAPPER_PG_DIR_SIZE	(PAGE_SIZE + \
> +				 PTRS_PER_PGD * PTRS_PER_PMD * sizeof(pmd_t))
> +#else
> +#define SWAPPER_PG_DIR_SIZE	(PTRS_PER_PGD * sizeof(pgd_t))
> +#endif
> +
>  /*
>   * Reserve the special regions of memory
>   */
> @@ -898,7 +930,7 @@ void __init arm_mm_memblock_reserve(void)
>  	 * Reserve the page tables.  These are already in use,
>  	 * and can only be in node 0.
>  	 */
> -	memblock_reserve(__pa(swapper_pg_dir), PTRS_PER_PGD * sizeof(pgd_t));
> +	memblock_reserve(__pa(swapper_pg_dir), SWAPPER_PG_DIR_SIZE);
> 
>  #ifdef CONFIG_SA1111
>  	/*
> diff --git a/arch/arm/mm/proc-macros.S b/arch/arm/mm/proc-macros.S
> index 307a4de..2d8ff3a 100644
> --- a/arch/arm/mm/proc-macros.S
> +++ b/arch/arm/mm/proc-macros.S
> @@ -91,8 +91,9 @@
>  #if L_PTE_SHARED != PTE_EXT_SHARED
>  #error PTE shared bit mismatch
>  #endif
> -#if (L_PTE_XN+L_PTE_USER+L_PTE_RDONLY+L_PTE_DIRTY+L_PTE_YOUNG+\
> -     L_PTE_FILE+L_PTE_PRESENT) > L_PTE_SHARED
> +#if !defined (CONFIG_ARM_LPAE) && \
> +	(L_PTE_XN+L_PTE_USER+L_PTE_RDONLY+L_PTE_DIRTY+L_PTE_YOUNG+\
> +	 L_PTE_FILE+L_PTE_PRESENT) > L_PTE_SHARED
>  #error Invalid Linux PTE bit settings
>  #endif
>  #endif	/* CONFIG_MMU */
> diff --git a/arch/arm/mm/proc-v7lpae.S b/arch/arm/mm/proc-v7lpae.S
> new file mode 100644
> index 0000000..0bee213
> --- /dev/null
> +++ b/arch/arm/mm/proc-v7lpae.S
> @@ -0,0 +1,422 @@
> +/*
> + * arch/arm/mm/proc-v7lpae.S
> + *
> + * Copyright (C) 2001 Deep Blue Solutions Ltd.
> + * Copyright (C) 2011 ARM Ltd.
> + * Author: Catalin Marinas <catalin.marinas@arm.com>
> + *   based on arch/arm/mm/proc-v7.S
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
> + */
> +#include <linux/init.h>
> +#include <linux/linkage.h>
> +#include <asm/assembler.h>
> +#include <asm/asm-offsets.h>
> +#include <asm/hwcap.h>
> +#include <asm/pgtable-hwdef.h>
> +#include <asm/pgtable.h>
> +
> +#include "proc-macros.S"
> +
> +#define TTB_IRGN_NC	(0 << 8)
> +#define TTB_IRGN_WBWA	(1 << 8)
> +#define TTB_IRGN_WT	(2 << 8)
> +#define TTB_IRGN_WB	(3 << 8)
> +#define TTB_RGN_NC	(0 << 10)
> +#define TTB_RGN_OC_WBWA	(1 << 10)
> +#define TTB_RGN_OC_WT	(2 << 10)
> +#define TTB_RGN_OC_WB	(3 << 10)
> +#define TTB_S		(3 << 12)
> +#define TTB_EAE		(1 << 31)
> +
> +/* PTWs cacheable, inner WB not shareable, outer WB not shareable */
> +#define TTB_FLAGS_UP	(TTB_IRGN_WB|TTB_RGN_OC_WB)
> +#define PMD_FLAGS_UP	(PMD_SECT_WB)
> +
> +/* PTWs cacheable, inner WBWA shareable, outer WBWA not shareable */
> +#define TTB_FLAGS_SMP	(TTB_IRGN_WBWA|TTB_S|TTB_RGN_OC_WBWA)
> +#define PMD_FLAGS_SMP	(PMD_SECT_WBWA|PMD_SECT_S)
> +
> +ENTRY(cpu_v7_proc_init)
> +	mov	pc, lr
> +ENDPROC(cpu_v7_proc_init)
> +
> +ENTRY(cpu_v7_proc_fin)
> +	mrc	p15, 0, r0, c1, c0, 0		@ ctrl register
> +	bic	r0, r0, #0x1000			@ ...i............
> +	bic	r0, r0, #0x0006			@ .............ca.
> +	mcr	p15, 0, r0, c1, c0, 0		@ disable caches
> +	mov	pc, lr
> +ENDPROC(cpu_v7_proc_fin)
> +
> +/*
> + *	cpu_v7_reset(loc)
> + *
> + *	Perform a soft reset of the system.  Put the CPU into the
> + *	same state as it would be if it had been reset, and branch
> + *	to what would be the reset vector.
> + *
> + *	- loc   - location to jump to for soft reset
> + */
> +	.align	5
> +ENTRY(cpu_v7_reset)
> +	mov	pc, r0
> +ENDPROC(cpu_v7_reset)
> +
> +/*
> + *	cpu_v7_do_idle()
> + *
> + *	Idle the processor (eg, wait for interrupt).
> + *
> + *	IRQs are already disabled.
> + */
> +ENTRY(cpu_v7_do_idle)
> +	dsb					@ WFI may enter a low-power mode
> +	wfi
> +	mov	pc, lr
> +ENDPROC(cpu_v7_do_idle)
> +
> +ENTRY(cpu_v7_dcache_clean_area)
> +#ifndef TLB_CAN_READ_FROM_L1_CACHE
> +	dcache_line_size r2, r3
> +1:	mcr	p15, 0, r0, c7, c10, 1		@ clean D entry
> +	add	r0, r0, r2
> +	subs	r1, r1, r2
> +	bhi	1b
> +	dsb
> +#endif
> +	mov	pc, lr
> +ENDPROC(cpu_v7_dcache_clean_area)
> +
> +/*
> + *	cpu_v7_switch_mm(pgd_phys, tsk)
> + *
> + *	Set the translation table base pointer to be pgd_phys
> + *
> + *	- pgd_phys - physical address of new TTB
> + *
> + *	It is assumed that:
> + *	- we are not using split page tables
> + */
> +ENTRY(cpu_v7_switch_mm)
> +#ifdef CONFIG_MMU
> +	ldr	r1, [r1, #MM_CONTEXT_ID]	@ get mm->context.id
> +	mov	r2, #0
> +	and	r3, r1, #0xff
> +	mov	r3, r3, lsl #(48 - 32)		@ ASID
> +	mcrr	p15, 0, r0, r3, c2		@ set TTB 0
> +	isb
> +#endif
> +	mov	pc, lr
> +ENDPROC(cpu_v7_switch_mm)
> +
> +/*
> + *	cpu_v7_set_pte_ext(ptep, pte)
> + *
> + *	Set a level 2 translation table entry.
> + *
> + *	- ptep  - pointer to level 2 translation table entry
> + *		  (hardware version is stored at +2048 bytes)
> + *	- pte   - PTE value to store
> + *	- ext	- value for extended PTE bits
> + */
> +ENTRY(cpu_v7_set_pte_ext)
> +#ifdef CONFIG_MMU
> +	tst	r2, #L_PTE_PRESENT
> +	beq	1f
> +	tst	r3, #1 << (55 - 32)		@ L_PTE_DIRTY
> +	orreq	r2, #L_PTE_RDONLY
> +1:	strd	r2, r3, [r0]
> +	mcr	p15, 0, r0, c7, c10, 1		@ flush_pte
> +#endif
> +	mov	pc, lr
> +ENDPROC(cpu_v7_set_pte_ext)
> +
> +cpu_v7_name:
> +	.ascii	"ARMv7 Processor"
> +	.align
> +
> +	/*
> +	 * Memory region attributes for LPAE (defined in pgtable-3level.h):
> +	 *
> +	 *   n = AttrIndx[2:0]
> +	 *
> +	 *			n	MAIR
> +	 *   UNCACHED		000	00000000
> +	 *   BUFFERABLE		001	01000100
> +	 *   DEV_WC		001	01000100
> +	 *   WRITETHROUGH	010	10101010
> +	 *   WRITEBACK		011	11101110
> +	 *   DEV_CACHED		011	11101110
> +	 *   DEV_SHARED		100	00000100
> +	 *   DEV_NONSHARED	100	00000100
> +	 *   unused		101
> +	 *   unused		110
> +	 *   WRITEALLOC		111	11111111
> +	 */
> +.equ	MAIR0,	0xeeaa4400			@ MAIR0
> +.equ	MAIR1,	0xff000004			@ MAIR1
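
For reference, the two constants above pack one 8-bit attribute per AttrIndx value, four per register. A minimal C sketch of the lookup, assuming only the MAIR0/MAIR1 values from the table (mair_attr is a hypothetical helper for illustration, not part of the patch):

```c
#include <assert.h>
#include <stdint.h>

#define MAIR0	0xeeaa4400u	/* attributes for AttrIndx 0..3 */
#define MAIR1	0xff000004u	/* attributes for AttrIndx 4..7 */

/* Hypothetical helper: return the 8-bit attribute selected by AttrIndx n. */
static uint8_t mair_attr(unsigned int n)
{
	assert(n < 8);
	uint32_t reg = (n < 4) ? MAIR0 : MAIR1;

	/* Each register holds four attribute bytes, lowest index first. */
	return (reg >> (8 * (n % 4))) & 0xff;
}
```

Index 3 (WRITEBACK) yields 0xee and index 7 (WRITEALLOC) yields 0xff, which agrees with the binary values listed in the comment.
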
> +
> +/* Suspend/resume support: derived from arch/arm/mach-s5pv210/sleep.S */
> +.globl	cpu_v7_suspend_size
> +.equ	cpu_v7_suspend_size, 4 * 10
> +#ifdef CONFIG_PM_SLEEP
> +ENTRY(cpu_v7_do_suspend)
> +	stmfd	sp!, {r4 - r11, lr}
> +	mrc	p15, 0, r4, c13, c0, 0	@ FCSE/PID
> +	mrc	p15, 0, r5, c13, c0, 1	@ Context ID
> +	mrc	p15, 0, r6, c3, c0, 0	@ Domain ID
> +	mrrc	p15, 0, r7, r8, c2	@ TTB 0
> +	mrrc	p15, 1, r2, r3, c2	@ TTB 1
> +	mrc	p15, 0, r9, c1, c0, 0	@ Control register
> +	mrc	p15, 0, r10, c1, c0, 1	@ Auxiliary control register
> +	mrc	p15, 0, r11, c1, c0, 2	@ Co-processor access control
> +	stmia	r0, {r2 - r11}
> +	ldmfd	sp!, {r4 - r11, pc}
> +ENDPROC(cpu_v7_do_suspend)
> +
> +ENTRY(cpu_v7_do_resume)
> +	mov	ip, #0
> +	mcr	p15, 0, ip, c8, c7, 0	@ invalidate TLBs
> +	mcr	p15, 0, ip, c7, c5, 0	@ invalidate I cache
> +	ldmia	r0, {r2 - r11}
> +	mcr	p15, 0, r4, c13, c0, 0	@ FCSE/PID
> +	mcr	p15, 0, r5, c13, c0, 1	@ Context ID
> +	mcr	p15, 0, r6, c3, c0, 0	@ Domain ID
> +	mcrr	p15, 0, r7, r8, c2	@ TTB 0
> +	mcrr	p15, 1, r2, r3, c2	@ TTB 1
> +	mcr	p15, 0, ip, c2, c0, 2	@ TTB control register
> +	mcr	p15, 0, r10, c1, c0, 1	@ Auxiliary control register
> +	mcr	p15, 0, r11, c1, c0, 2	@ Co-processor access control
> +	ldr	r4, =MAIR0
> +	ldr	r5, =MAIR1
> +	mcr	p15, 0, r4, c10, c2, 0	@ write MAIR0
> +	mcr	p15, 0, r5, c10, c2, 1	@ write MAIR1
> +	isb
> +	mov	r0, r9			@ control register
> +	mov	r2, r7, lsr #14		@ get TTB0 base
> +	mov	r2, r2, lsl #14
> +	ldr	r3, cpu_resume_l1_flags
> +	b	cpu_resume_mmu
> +ENDPROC(cpu_v7_do_resume)
> +cpu_resume_l1_flags:
> +	ALT_SMP(.long PMD_TYPE_SECT | PMD_SECT_AP_WRITE | PMD_FLAGS_SMP)
> +	ALT_UP(.long  PMD_TYPE_SECT | PMD_SECT_AP_WRITE | PMD_FLAGS_UP)
> +#else
> +#define cpu_v7_do_suspend	0
> +#define cpu_v7_do_resume	0
> +#endif
> +
> +	__CPUINIT
> +
> +/*
> + *	__v7_setup
> + *
> + *	Initialise TLB, Caches, and MMU state ready to switch the MMU
> + *	on. Return in r0 the new CP15 C1 control register setting.
> + *
> + *	This should be able to cover all ARMv7 cores with LPAE.
> + *
> + *	It is assumed that:
> + *	- cache type register is implemented
> + */
> +__v7_ca15mp_setup:
> +	mov	r10, #0
> +1:
> +#ifdef CONFIG_SMP
> +	ALT_SMP(mrc	p15, 0, r0, c1, c0, 1)
> +	ALT_UP(mov	r0, #(1 << 6))		@ fake it for UP
> +	tst	r0, #(1 << 6)			@ SMP/nAMP mode enabled?
> +	orreq	r0, r0, #(1 << 6)		@ Enable SMP/nAMP mode
> +	orreq	r0, r0, r10			@ Enable CPU-specific SMP bits
> +	mcreq	p15, 0, r0, c1, c0, 1
> +#endif
> +__v7_setup:
> +	adr	r12, __v7_setup_stack		@ the local stack
> +	stmia	r12, {r0-r5, r7, r9, r11, lr}
> +	bl	v7_flush_dcache_all
> +	ldmia	r12, {r0-r5, r7, r9, r11, lr}
> +
> +	mov	r10, #0
> +	mcr	p15, 0, r10, c7, c5, 0		@ I+BTB cache invalidate
> +	dsb
> +#ifdef CONFIG_MMU
> +	mcr	p15, 0, r10, c8, c7, 0		@ invalidate I + D TLBs
> +	mov	r5, #TTB_EAE
> +	ALT_SMP(orr	r5, r5, #TTB_FLAGS_SMP)
> +	ALT_SMP(orr	r5, r5, #TTB_FLAGS_SMP << 16)
> +	ALT_UP(orr	r5, r5, #TTB_FLAGS_UP)
> +	ALT_UP(orr	r5, r5, #TTB_FLAGS_UP << 16)
> +	mrc	p15, 0, r10, c2, c0, 2
> +	orr	r10, r10, r5
> +#if PHYS_OFFSET <= PAGE_OFFSET
> +	/*
> +	 * TTBR0/TTBR1 split (PAGE_OFFSET):
> +	 *   0x40000000: T0SZ = 2, T1SZ = 0 (not used)
> +	 *   0x80000000: T0SZ = 0, T1SZ = 1
> +	 *   0xc0000000: T0SZ = 0, T1SZ = 2
> +	 *
> +	 * Only use this feature if PHYS_OFFSET <= PAGE_OFFSET, otherwise
> +	 * booting secondary CPUs would end up using TTBR1 for the identity
> +	 * mapping set up in TTBR0.
> +	 */
> +	orr	r10, r10, #(((PAGE_OFFSET >> 30) - 1) << 16)	@ TTBCR.T1SZ
> +#endif
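
The T1SZ expression above can be checked against the table in the comment. A small C sketch, assuming nothing beyond the expression in the orr instruction (ttbcr_t1sz_bits is a hypothetical name used only here):

```c
#include <stdint.h>

/*
 * Value ORed into TTBCR for the T1SZ field (shifted to bit 16),
 * computed exactly as in the orr instruction above.
 */
static uint32_t ttbcr_t1sz_bits(uint32_t page_offset)
{
	return ((page_offset >> 30) - 1) << 16;
}
```

For PAGE_OFFSET values of 0x40000000, 0x80000000 and 0xc0000000 this gives T1SZ = 0, 1 and 2 respectively, as the comment states.
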
> +	mcr	p15, 0, r10, c2, c0, 2		@ TTB control register
> +	mov	r5, #0
> +#if defined CONFIG_VMSPLIT_2G
> +	/* PAGE_OFFSET == 0x80000000, T1SZ == 1 */
> +	add	r6, r8, #1 << 4			@ skip two L1 entries
> +#elif defined CONFIG_VMSPLIT_3G
> +	/* PAGE_OFFSET == 0xc0000000, T1SZ == 2 */
> +	add	r6, r8, #4096 * (1 + 3)		@ only L2 used, skip pgd+3*pmd
> +#else
> +	mov	r6, r8
> +#endif
> +	mcrr	p15, 1, r6, r5, c2		@ load TTBR1
> +	ldr	r5, =MAIR0
> +	ldr	r6, =MAIR1
> +	mcr	p15, 0, r5, c10, c2, 0		@ write MAIR0
> +	mcr	p15, 0, r6, c10, c2, 1		@ write MAIR1
> +#endif
> +	adr	r5, v7_crval
> +	ldmia	r5, {r5, r6}
> +#ifdef CONFIG_CPU_ENDIAN_BE8
> +	orr	r6, r6, #1 << 25		@ big-endian page tables
> +#endif
> +#ifdef CONFIG_SWP_EMULATE
> +	orr     r5, r5, #(1 << 10)              @ set SW bit in "clear"
> +	bic     r6, r6, #(1 << 10)              @ clear it in "mmuset"
> +#endif
> +	mrc	p15, 0, r0, c1, c0, 0		@ read control register
> +	bic	r0, r0, r5			@ clear them
> +	orr	r0, r0, r6			@ set them
> + THUMB(	orr	r0, r0, #1 << 30	)	@ Thumb exceptions
> +	mov	pc, lr				@ return to head.S:__ret
> +ENDPROC(__v7_setup)
> +
> +	/*   AT
> +	 *  TFR   EV X F   IHD LR    S
> +	 * .EEE ..EE PUI. .TAT 4RVI ZWRS BLDP WCAM
> +	 * rxxx rrxx xxx0 0101 xxxx xxxx x111 xxxx < forced
> +	 *   11    0 110    1  0011 1100 .111 1101 < we want
> +	 */
> +	.type	v7_crval, #object
> +v7_crval:
> +	crval	clear=0x0120c302, mmuset=0x30c23c7d, ucset=0x00c01c7c
> +
> +__v7_setup_stack:
> +	.space	4 * 11				@ 11 registers
> +
> +	__INITDATA
> +
> +	.type	v7_processor_functions, #object
> +ENTRY(v7_processor_functions)
> +	.word	v7_early_abort
> +	.word	v7_pabort
> +	.word	cpu_v7_proc_init
> +	.word	cpu_v7_proc_fin
> +	.word	cpu_v7_reset
> +	.word	cpu_v7_do_idle
> +	.word	cpu_v7_dcache_clean_area
> +	.word	cpu_v7_switch_mm
> +	.word	cpu_v7_set_pte_ext
> +	.word	0
> +	.word	0
> +	.word	0
> +	.size	v7_processor_functions, . - v7_processor_functions
> +
> +	.section ".rodata"
> +
> +	.type	cpu_arch_name, #object
> +cpu_arch_name:
> +	.asciz	"armv7"
> +	.size	cpu_arch_name, . - cpu_arch_name
> +
> +	.type	cpu_elf_name, #object
> +cpu_elf_name:
> +	.asciz	"v7"
> +	.size	cpu_elf_name, . - cpu_elf_name
> +	.align
> +
> +	.section ".proc.info.init", #alloc, #execinstr
> +
> +	.type	__v7_ca15mp_proc_info, #object
> +__v7_ca15mp_proc_info:
> +	.long	0x410fc0f0		@ Required ID value
> +	.long	0xff0ffff0		@ Mask for ID
> +	ALT_SMP(.long \
> +		PMD_TYPE_SECT | \
> +		PMD_SECT_AP_WRITE | \
> +		PMD_SECT_AP_READ | \
> +		PMD_SECT_AF | \
> +		PMD_FLAGS_SMP)
> +	ALT_UP(.long \
> +		PMD_TYPE_SECT | \
> +		PMD_SECT_AP_WRITE | \
> +		PMD_SECT_AP_READ | \
> +		PMD_SECT_AF | \
> +		PMD_FLAGS_UP)
> +		/* PMD_SECT_XN is set explicitly in head.S for LPAE */
> +	.long   PMD_TYPE_SECT | \
> +		PMD_SECT_XN | \
> +		PMD_SECT_AP_WRITE | \
> +		PMD_SECT_AP_READ | \
> +		PMD_SECT_AF
> +	b	__v7_ca15mp_setup
> +	.long	cpu_arch_name
> +	.long	cpu_elf_name
> +	.long	HWCAP_SWP|HWCAP_HALF|HWCAP_THUMB|HWCAP_FAST_MULT|HWCAP_EDSP|HWCAP_TLS
> +	.long	cpu_v7_name
> +	.long	v7_processor_functions
> +	.long	v7wbi_tlb_fns
> +	.long	v6_user_fns
> +	.long	v7_cache_fns
> +	.size	__v7_ca15mp_proc_info, . - __v7_ca15mp_proc_info
> +
> +	/*
> +	 * Match any ARMv7 processor core.
> +	 */
> +	.type	__v7_proc_info, #object
> +__v7_proc_info:
> +	.long	0x000f0000		@ Required ID value
> +	.long	0x000f0000		@ Mask for ID
> +	ALT_SMP(.long \
> +		PMD_TYPE_SECT | \
> +		PMD_SECT_AP_WRITE | \
> +		PMD_SECT_AP_READ | \
> +		PMD_SECT_AF | \
> +		PMD_FLAGS_SMP)
> +	ALT_UP(.long \
> +		PMD_TYPE_SECT | \
> +		PMD_SECT_AP_WRITE | \
> +		PMD_SECT_AP_READ | \
> +		PMD_SECT_AF | \
> +		PMD_FLAGS_UP)
> +		/* PMD_SECT_XN is set explicitly in head.S for LPAE */
> +	.long   PMD_TYPE_SECT | \
> +		PMD_SECT_XN | \
> +		PMD_SECT_AP_WRITE | \
> +		PMD_SECT_AP_READ | \
> +		PMD_SECT_AF
> +	W(b)	__v7_setup
> +	.long	cpu_arch_name
> +	.long	cpu_elf_name
> +	.long	HWCAP_SWP|HWCAP_HALF|HWCAP_THUMB|HWCAP_FAST_MULT|HWCAP_EDSP|HWCAP_TLS
> +	.long	cpu_v7_name
> +	.long	v7_processor_functions
> +	.long	v7wbi_tlb_fns
> +	.long	v6_user_fns
> +	.long	v7_cache_fns
> +	.size	__v7_proc_info, . - __v7_proc_info
> 
> 


* [PATCH v7 09/16] ARM: LPAE: MMU setup for the 3-level page table format
  2011-08-10 15:03 ` [PATCH v7 09/16] ARM: LPAE: MMU setup for the 3-level page table format Catalin Marinas
  2011-08-13 11:49   ` Vasily Khoruzhick
@ 2011-08-13 12:56   ` Vasily Khoruzhick
  2011-08-13 12:58     ` [PATCH] Fix non-LPAE boot regression Vasily Khoruzhick
  2011-08-15 16:51   ` [PATCH v7 09/16] ARM: LPAE: MMU setup for the 3-level page table format Catalin Marinas
  2011-08-19 10:25   ` Ian Campbell
  3 siblings, 1 reply; 46+ messages in thread
From: Vasily Khoruzhick @ 2011-08-13 12:56 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday 10 August 2011 18:03:32 Catalin Marinas wrote:
> This patch adds the MMU initialisation for the LPAE page table format.
> The swapper_pg_dir size with LPAE is 5 rather than 4 pages. A new
> proc-v7lpae.S file contains the initialisation, context switch and
> save/restore code for ARMv7 with the LPAE. The TTBRx split is based on
> the PAGE_OFFSET with TTBR1 used for the kernel mappings. The 36-bit
> mappings (supersections) and a few other memory types in mmu.c are
> conditionally compiled.
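
The 5-page versus 4-page figure follows from the table geometry. A minimal C sketch of the arithmetic, assuming 4K pages, 512 eight-byte entries per PMD table with LPAE, and 2048 eight-byte PGD entries in the classic layout (these constants are assumptions stated for illustration; the function names are hypothetical):

```c
#include <stddef.h>
#include <stdint.h>

/* LPAE: one PGD page plus PTRS_PER_PGD PMD tables of 64-bit entries. */
static size_t swapper_pg_dir_size_lpae(void)
{
	const size_t page_size = 4096;
	const size_t ptrs_per_pgd = 4;		/* as loaded into r6 in head.S */
	const size_t ptrs_per_pmd = 512;	/* assumed: 4K table / 8-byte entry */

	return page_size + ptrs_per_pgd * ptrs_per_pmd * sizeof(uint64_t);
}

/* Classic 2-level format: assumed 2048 PGD entries of 8 bytes each. */
static size_t swapper_pg_dir_size_classic(void)
{
	return 2048 * 8;
}
```

The results are 0x5000 and 0x4000 bytes, matching the two PG_DIR_SIZE values in head.S.
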
> 
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
>  arch/arm/kernel/head.S    |  117 +++++++++----
>  arch/arm/mm/Makefile      |    4 +
>  arch/arm/mm/mmu.c         |   34 ++++-
>  arch/arm/mm/proc-macros.S |    5 +-
>  arch/arm/mm/proc-v7lpae.S |  422 +++++++++++++++++++++++++++++++++++++++++++++
>  5 files changed, 542 insertions(+), 40 deletions(-)
>  create mode 100644 arch/arm/mm/proc-v7lpae.S
> 
> diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
> index d8231b2..0bdafc4 100644
> --- a/arch/arm/kernel/head.S
> +++ b/arch/arm/kernel/head.S
> @@ -21,6 +21,7 @@
>  #include <asm/memory.h>
>  #include <asm/thread_info.h>
>  #include <asm/system.h>
> +#include <asm/pgtable.h>
> 
>  #ifdef CONFIG_DEBUG_LL
>  #include <mach/debug-macro.S>
> @@ -38,11 +39,20 @@
>  #error KERNEL_RAM_VADDR must start at 0xXXXX8000
>  #endif
> 
> +#ifdef CONFIG_ARM_LPAE
> +	/* LPAE requires an additional page for the PGD */
> +#define PG_DIR_SIZE	0x5000
> +#define PMD_ORDER	3
> +#else
> +#define PG_DIR_SIZE	0x4000
> +#define PMD_ORDER	2
> +#endif
> +
>  	.globl	swapper_pg_dir
> -	.equ	swapper_pg_dir, KERNEL_RAM_VADDR - 0x4000
> +	.equ	swapper_pg_dir, KERNEL_RAM_VADDR - PG_DIR_SIZE
> 
>  	.macro	pgtbl, rd, phys
> -	add	\rd, \phys, #TEXT_OFFSET - 0x4000
> +	add	\rd, \phys, #TEXT_OFFSET - PG_DIR_SIZE
>  	.endm
> 
>  #ifdef CONFIG_XIP_KERNEL
> @@ -148,11 +158,11 @@ __create_page_tables:
>  	pgtbl	r4, r8				@ page table address
> 
>  	/*
> -	 * Clear the 16K level 1 swapper page table
> +	 * Clear the swapper page table
>  	 */
>  	mov	r0, r4
>  	mov	r3, #0
> -	add	r6, r0, #0x4000
> +	add	r6, r0, #PG_DIR_SIZE
>  1:	str	r3, [r0], #4
>  	str	r3, [r0], #4
>  	str	r3, [r0], #4
> @@ -160,6 +170,25 @@ __create_page_tables:
>  	teq	r0, r6
>  	bne	1b
> 
> +#ifdef CONFIG_ARM_LPAE
> +	/*
> +	 * Build the PGD table (first level) to point to the PMD table. A PGD
> +	 * entry is 64-bit wide.
> +	 */
> +	mov	r0, r4
> +	add	r3, r4, #0x1000			@ first PMD table address
> +	orr	r3, r3, #3			@ PGD block type
> +	mov	r6, #4				@ PTRS_PER_PGD
> +	mov	r7, #1 << (55 - 32)		@ L_PGD_SWAPPER
> +1:	str	r3, [r0], #4			@ set bottom PGD entry bits
> +	str	r7, [r0], #4			@ set top PGD entry bits
> +	add	r3, r3, #0x1000			@ next PMD table
> +	subs	r6, r6, #1
> +	bne	1b
> +
> +	add	r4, r4, #0x1000			@ point to the PMD tables
> +#endif
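
To make the layout produced by this loop easier to follow, here is a rough C equivalent. It is only a sketch: build_swapper_pgd and the example physical address used to exercise it are hypothetical, and the entries are assumed to be stored little-endian (low word first) as the two str instructions do.

```c
#include <stdint.h>

#define PTRS_PER_PGD	4

/*
 * Mirror of the assembly loop: each 64-bit entry holds the physical
 * address of one PMD page with the low two bits set, plus bit 55
 * (L_PGD_SWAPPER) in the upper word.
 */
static void build_swapper_pgd(uint64_t pgd[PTRS_PER_PGD], uint32_t pgd_phys)
{
	uint32_t low = (pgd_phys + 0x1000) | 3;		/* first PMD table + type bits */
	const uint64_t high = (uint64_t)1 << 55;	/* L_PGD_SWAPPER */

	for (int i = 0; i < PTRS_PER_PGD; i++) {
		pgd[i] = high | low;
		low += 0x1000;				/* next PMD table */
	}
}
```

With a PGD at a hypothetical physical address of 0x80003000, the four entries point at the PMD pages at 0x80004000 through 0x80007000.
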
> +
>  	ldr	r7, [r10, #PROCINFO_MM_MMUFLAGS] @ mm_mmuflags
> 
>  	/*
> @@ -171,30 +200,30 @@ __create_page_tables:
>  	sub	r0, r0, r3			@ virt->phys offset
>  	add	r5, r5, r0			@ phys __enable_mmu
>  	add	r6, r6, r0			@ phys __enable_mmu_end
> -	mov	r5, r5, lsr #20
> -	mov	r6, r6, lsr #20
> +	mov	r5, r5, lsr #SECTION_SHIFT
> +	mov	r6, r6, lsr #SECTION_SHIFT
> 
> -1:	orr	r3, r7, r5, lsl #20		@ flags + kernel base
> -	str	r3, [r4, r5, lsl #2]		@ identity mapping
> -	teq	r5, r6
> -	addne	r5, r5, #1			@ next section
> -	bne	1b
> +1:	orr	r3, r7, r5, lsl #SECTION_SHIFT	@ flags + kernel base
> +	str	r3, [r4, r5, lsl #PMD_ORDER]	@ identity mapping
> +	cmp	r5, r6
> +	addlo	r5, r5, #SECTION_SHIFT >> 20	@ next section

Without LPAE, SECTION_SHIFT >> 20 is 20 >> 20 = 0, not 1 as in the original version.
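
A small C model of the cmp/addlo/blo loop shows the effect of a zero step. This is only a sketch: the function names are made up for illustration and the section indices passed in are arbitrary.

```c
#include <stdint.h>

/* Step as written in the posted patch: #SECTION_SHIFT >> 20 */
static uint32_t step_as_posted(uint32_t section_shift)
{
	return section_shift >> 20;
}

/*
 * Model of the identity-mapping loop. Returns the number of entries
 * written, or -1 if the index would never advance.
 */
static int identity_map_entries(uint32_t first, uint32_t last, uint32_t step)
{
	int written = 0;
	uint32_t idx = first;

	for (;;) {
		written++;		/* str r3, [r4, r5, lsl #PMD_ORDER] */
		if (idx >= last)	/* cmp r5, r6: continue only if r5 < r6 */
			return written;
		if (step == 0)
			return -1;	/* addlo adds 0, so blo would spin forever */
		idx += step;
	}
}
```

The step is 0 for SECTION_SHIFT values of both 20 and 21. The loop still terminates when the mapped code fits in one section (first == last), but not when it crosses a section boundary.
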

> +	blo	1b
> 
>  	/*
>  	 * Now setup the pagetables for our kernel direct
>  	 * mapped region.
>  	 */
>  	mov	r3, pc
> -	mov	r3, r3, lsr #20
> -	orr	r3, r7, r3, lsl #20
> -	add	r0, r4,  #(KERNEL_START & 0xff000000) >> 18
> -	str	r3, [r0, #(KERNEL_START & 0x00f00000) >> 18]!
> +	mov	r3, r3, lsr #SECTION_SHIFT
> +	orr	r3, r7, r3, lsl #SECTION_SHIFT
> +	add	r0, r4,  #(KERNEL_START & 0xff000000) >> (SECTION_SHIFT - PMD_ORDER)
> +	str	r3, [r0, #(KERNEL_START & 0x00e00000) >> (SECTION_SHIFT - PMD_ORDER)]!

0x00e00000 should be 0x00f00000 here.

>  	ldr	r6, =(KERNEL_END - 1)
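
To illustrate the mask issue in the str above: with 1MB sections bit 20 of the address selects between adjacent entries, so masking it out points at the wrong slot. A minimal C sketch of the offset expression, where pmd_offset and the example addresses are hypothetical:

```c
#include <stdint.h>

/* Byte offset into the section table, as the patch computes it. */
static uint32_t pmd_offset(uint32_t vaddr, uint32_t mask,
			   unsigned int section_shift, unsigned int pmd_order)
{
	return (vaddr & mask) >> (section_shift - pmd_order);
}
```

For an address such as 0xc0108000 with 1MB sections (shift 20, order 2), the 0x00f00000 mask gives offset 4 while 0x00e00000 gives 0, one entry too low. With 2MB LPAE sections (shift 21, order 3), bit 20 lies inside a section, so 0x00e00000 loses nothing there.
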
> -	add	r0, r0, #4
> -	add	r6, r4, r6, lsr #18
> +	add	r0, r0, #1 << PMD_ORDER
> +	add	r6, r4, r6, lsr #(SECTION_SHIFT - PMD_ORDER)
>  1:	cmp	r0, r6
> -	add	r3, r3, #1 << 20
> -	strls	r3, [r0], #4
> +	add	r3, r3, #1 << SECTION_SHIFT
> +	strls	r3, [r0], #1 << PMD_ORDER
>  	bls	1b
> 
>  #ifdef CONFIG_XIP_KERNEL
> @@ -203,11 +232,11 @@ __create_page_tables:
>  	 */
>  	add	r3, r8, #TEXT_OFFSET
>  	orr	r3, r3, r7
> -	add	r0, r4,  #(KERNEL_RAM_VADDR & 0xff000000) >> 18
> -	str	r3, [r0, #(KERNEL_RAM_VADDR & 0x00f00000) >> 18]!
> +	add	r0, r4,  #(KERNEL_RAM_VADDR & 0xff000000) >> (SECTION_SHIFT - PMD_ORDER)
> +	str	r3, [r0, #(KERNEL_RAM_VADDR & 0x00f00000) >> (SECTION_SHIFT - PMD_ORDER)]!
>  	ldr	r6, =(_end - 1)
>  	add	r0, r0, #4
> -	add	r6, r4, r6, lsr #18
> +	add	r6, r4, r6, lsr #(SECTION_SHIFT - PMD_ORDER)
>  1:	cmp	r0, r6
>  	add	r3, r3, #1 << 20
>  	strls	r3, [r0], #4
> @@ -215,15 +244,15 @@ __create_page_tables:
>  #endif
> 
>  	/*
> -	 * Then map boot params address in r2 or
> -	 * the first 1MB of ram if boot params address is not specified.
> +	 * Then map boot params address in r2 or the first 1MB (2MB with LPAE)
> +	 * of ram if boot params address is not specified.
>  	 */
> -	mov	r0, r2, lsr #20
> -	movs	r0, r0, lsl #20
> +	mov	r0, r2, lsr #SECTION_SHIFT
> +	movs	r0, r0, lsl #SECTION_SHIFT
>  	moveq	r0, r8
>  	sub	r3, r0, r8
>  	add	r3, r3, #PAGE_OFFSET
> -	add	r3, r4, r3, lsr #18
> +	add	r3, r4, r3, lsr #(SECTION_SHIFT - PMD_ORDER)
>  	orr	r6, r7, r0
>  	str	r6, [r3]
> 
> @@ -236,21 +265,27 @@ __create_page_tables:
>  	 */
>  	addruart r7, r3
> 
> -	mov	r3, r3, lsr #20
> -	mov	r3, r3, lsl #2
> +	mov	r3, r3, lsr #SECTION_SHIFT
> +	mov	r3, r3, lsl #PMD_ORDER
> 
>  	add	r0, r4, r3
>  	rsb	r3, r3, #0x4000			@ PTRS_PER_PGD*sizeof(long)
>  	cmp	r3, #0x0800			@ limit to 512MB
>  	movhi	r3, #0x0800
>  	add	r6, r0, r3
> -	mov	r3, r7, lsr #20
> +	mov	r3, r7, lsr #SECTION_SHIFT
>  	ldr	r7, [r10, #PROCINFO_IO_MMUFLAGS] @ io_mmuflags
> -	orr	r3, r7, r3, lsl #20
> +	orr	r3, r7, r3, lsl #SECTION_SHIFT
> +#ifdef CONFIG_ARM_LPAE
> +	mov	r7, #1 << (54 - 32)		@ XN
> +#endif
>  1:	str	r3, [r0], #4
> -	add	r3, r3, #1 << 20
> -	teq	r0, r6
> -	bne	1b
> +#ifdef CONFIG_ARM_LPAE
> +	str	r7, [r0], #4
> +#endif
> +	add	r3, r3, #1 << SECTION_SHIFT
> +	cmp	r0, r6
> +	blo	1b
> 
>  #else /* CONFIG_DEBUG_ICEDCC */
>  	/* we don't need any serial debugging mappings for ICEDCC */
> @@ -262,7 +297,7 @@ __create_page_tables:
>  	 * If we're using the NetWinder or CATS, we also need to map
>  	 * in the 16550-type serial port for the debug messages
>  	 */
> -	add	r0, r4, #0xff000000 >> 18
> +	add	r0, r4, #0xff000000 >> (SECTION_SHIFT - PMD_ORDER)
>  	orr	r3, r7, #0x7c000000
>  	str	r3, [r0]
>  #endif
> @@ -272,13 +307,16 @@ __create_page_tables:
>  	 * Similar reasons here - for debug.  This is
>  	 * only for Acorn RiscPC architectures.
>  	 */
> -	add	r0, r4, #0x02000000 >> 18
> +	add	r0, r4, #0x02000000 >> (SECTION_SHIFT - PMD_ORDER)
>  	orr	r3, r7, #0x02000000
>  	str	r3, [r0]
> -	add	r0, r4, #0xd8000000 >> 18
> +	add	r0, r4, #0xd8000000 >> (SECTION_SHIFT - PMD_ORDER)
>  	str	r3, [r0]
>  #endif
>  #endif
> +#ifdef CONFIG_ARM_LPAE
> +	sub	r4, r4, #0x1000		@ point to the PGD table
> +#endif
>  	mov	pc, lr
>  ENDPROC(__create_page_tables)
>  	.ltorg
> @@ -370,12 +408,17 @@ __enable_mmu:
>  #ifdef CONFIG_CPU_ICACHE_DISABLE
>  	bic	r0, r0, #CR_I
>  #endif
> +#ifdef CONFIG_ARM_LPAE
> +	mov	r5, #0
> +	mcrr	p15, 0, r4, r5, c2		@ load TTBR0
> +#else
>  	mov	r5, #(domain_val(DOMAIN_USER, DOMAIN_MANAGER) | \
>  		      domain_val(DOMAIN_KERNEL, DOMAIN_MANAGER) | \
>  		      domain_val(DOMAIN_TABLE, DOMAIN_MANAGER) | \
>  		      domain_val(DOMAIN_IO, DOMAIN_CLIENT))
>  	mcr	p15, 0, r5, c3, c0, 0		@ load domain access register
>  	mcr	p15, 0, r4, c2, c0, 0		@ load page table pointer
> +#endif
>  	b	__turn_mmu_on
>  ENDPROC(__enable_mmu)
> 
> diff --git a/arch/arm/mm/Makefile b/arch/arm/mm/Makefile
> index bca7e61..48639e7 100644
> --- a/arch/arm/mm/Makefile
> +++ b/arch/arm/mm/Makefile
> @@ -91,7 +91,11 @@ obj-$(CONFIG_CPU_MOHAWK)	+= proc-mohawk.o
>  obj-$(CONFIG_CPU_FEROCEON)	+= proc-feroceon.o
>  obj-$(CONFIG_CPU_V6)		+= proc-v6.o
>  obj-$(CONFIG_CPU_V6K)		+= proc-v6.o
> +ifeq ($(CONFIG_ARM_LPAE),y)
> +obj-$(CONFIG_CPU_V7)		+= proc-v7lpae.o
> +else
>  obj-$(CONFIG_CPU_V7)		+= proc-v7.o
> +endif
> 
>  AFLAGS_proc-v6.o	:=-Wa,-march=armv6
>  AFLAGS_proc-v7.o	:=-Wa,-march=armv7-a
> diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> index c990280..1ba2a5a 100644
> --- a/arch/arm/mm/mmu.c
> +++ b/arch/arm/mm/mmu.c
> @@ -150,6 +150,7 @@ static int __init early_nowrite(char *__unused)
>  }
>  early_param("nowb", early_nowrite);
> 
> +#ifndef CONFIG_ARM_LPAE
>  static int __init early_ecc(char *p)
>  {
>  	if (memcmp(p, "on", 2) == 0)
> @@ -159,6 +160,7 @@ static int __init early_ecc(char *p)
>  	return 0;
>  }
>  early_param("ecc", early_ecc);
> +#endif
> 
>  static int __init noalign_setup(char *__unused)
>  {
> @@ -228,10 +230,12 @@ static struct mem_type mem_types[] = {
>  		.prot_sect = PMD_TYPE_SECT | PMD_SECT_XN,
>  		.domain    = DOMAIN_KERNEL,
>  	},
> +#ifndef CONFIG_ARM_LPAE
>  	[MT_MINICLEAN] = {
>  		.prot_sect = PMD_TYPE_SECT | PMD_SECT_XN | PMD_SECT_MINICACHE,
>  		.domain    = DOMAIN_KERNEL,
>  	},
> +#endif
>  	[MT_LOW_VECTORS] = {
>  		.prot_pte  = L_PTE_PRESENT | L_PTE_YOUNG | L_PTE_DIRTY |
>  				L_PTE_RDONLY,
> @@ -421,6 +425,7 @@ static void __init build_mem_type_table(void)
>  	 * ARMv6 and above have extended page tables.
>  	 */
>  	if (cpu_arch >= CPU_ARCH_ARMv6 && (cr & CR_XP)) {
> +#ifndef CONFIG_ARM_LPAE
>  		/*
>  		 * Mark cache clean areas and XIP ROM read only
>  		 * from SVC mode and no access from userspace.
> @@ -428,6 +433,7 @@ static void __init build_mem_type_table(void)
>  		mem_types[MT_ROM].prot_sect |= PMD_SECT_APX|PMD_SECT_AP_WRITE;
>  		mem_types[MT_MINICLEAN].prot_sect |= PMD_SECT_APX|PMD_SECT_AP_WRITE;
>  		mem_types[MT_CACHECLEAN].prot_sect |= PMD_SECT_APX|PMD_SECT_AP_WRITE;
> +#endif
> 
>  		if (is_smp()) {
>  			/*
> @@ -466,6 +472,18 @@ static void __init build_mem_type_table(void)
>  		mem_types[MT_MEMORY_NONCACHED].prot_sect |= PMD_SECT_BUFFERABLE;
>  	}
> 
> +#ifdef CONFIG_ARM_LPAE
> +	/*
> +	 * Do not generate access flag faults for the kernel mappings.
> +	 */
> +	for (i = 0; i < ARRAY_SIZE(mem_types); i++) {
> +		mem_types[i].prot_pte |= PTE_EXT_AF;
> +		mem_types[i].prot_sect |= PMD_SECT_AF;
> +	}
> +	kern_pgprot |= PTE_EXT_AF;
> +	vecs_pgprot |= PTE_EXT_AF;
> +#endif
> +
>  	for (i = 0; i < 16; i++) {
>  		unsigned long v = pgprot_val(protection_map[i]);
>  		protection_map[i] = __pgprot(v | user_pgprot);
> @@ -564,8 +582,10 @@ static void __init alloc_init_section(pud_t *pud, unsigned long addr,
>  	if (((addr | end | phys) & ~SECTION_MASK) == 0) {
>  		pmd_t *p = pmd;
> 
> +#ifndef CONFIG_ARM_LPAE
>  		if (addr & SECTION_SIZE)
>  			pmd++;
> +#endif
> 
>  		do {
>  			*pmd = __pmd(phys | type->prot_sect);
> @@ -595,6 +615,7 @@ static void alloc_init_pud(pgd_t *pgd, unsigned long addr, unsigned long end,
>  	} while (pud++, addr = next, addr != end);
>  }
> 
> +#ifndef CONFIG_ARM_LPAE
>  static void __init create_36bit_mapping(struct map_desc *md,
>  					const struct mem_type *type)
>  {
> @@ -654,6 +675,7 @@ static void __init create_36bit_mapping(struct map_desc *md,
>  		pgd += SUPERSECTION_SIZE >> PGDIR_SHIFT;
>  	} while (addr != end);
>  }
> +#endif	/* !CONFIG_ARM_LPAE */
> 
>  /*
>   * Create the page directory entries and any necessary
> @@ -685,6 +707,7 @@ static void __init create_mapping(struct map_desc *md)
> 
>  	type = &mem_types[md->type];
> 
> +#ifndef CONFIG_ARM_LPAE
>  	/*
>  	 * Catch 36-bit addresses
>  	 */
> @@ -692,6 +715,7 @@ static void __init create_mapping(struct map_desc *md)
>  		create_36bit_mapping(md, type);
>  		return;
>  	}
> +#endif
> 
>  	addr = md->virtual & PAGE_MASK;
>  	phys = __pfn_to_phys(md->pfn);
> @@ -889,6 +913,14 @@ static inline void prepare_page_table(void)
>  		pmd_clear(pmd_off_k(addr));
>  }
> 
> +#ifdef CONFIG_ARM_LPAE
> +/* the first page is reserved for pgd */
> +#define SWAPPER_PG_DIR_SIZE	(PAGE_SIZE + \
> +				 PTRS_PER_PGD * PTRS_PER_PMD * sizeof(pmd_t))
> +#else
> +#define SWAPPER_PG_DIR_SIZE	(PTRS_PER_PGD * sizeof(pgd_t))
> +#endif
> +
>  /*
>   * Reserve the special regions of memory
>   */
> @@ -898,7 +930,7 @@ void __init arm_mm_memblock_reserve(void)
>  	 * Reserve the page tables.  These are already in use,
>  	 * and can only be in node 0.
>  	 */
> -	memblock_reserve(__pa(swapper_pg_dir), PTRS_PER_PGD * sizeof(pgd_t));
> +	memblock_reserve(__pa(swapper_pg_dir), SWAPPER_PG_DIR_SIZE);
> 
>  #ifdef CONFIG_SA1111
>  	/*
> diff --git a/arch/arm/mm/proc-macros.S b/arch/arm/mm/proc-macros.S
> index 307a4de..2d8ff3a 100644
> --- a/arch/arm/mm/proc-macros.S
> +++ b/arch/arm/mm/proc-macros.S
> @@ -91,8 +91,9 @@
>  #if L_PTE_SHARED != PTE_EXT_SHARED
>  #error PTE shared bit mismatch
>  #endif
> -#if (L_PTE_XN+L_PTE_USER+L_PTE_RDONLY+L_PTE_DIRTY+L_PTE_YOUNG+\
> -     L_PTE_FILE+L_PTE_PRESENT) > L_PTE_SHARED
> +#if !defined (CONFIG_ARM_LPAE) && \
> +	(L_PTE_XN+L_PTE_USER+L_PTE_RDONLY+L_PTE_DIRTY+L_PTE_YOUNG+\
> +	 L_PTE_FILE+L_PTE_PRESENT) > L_PTE_SHARED
>  #error Invalid Linux PTE bit settings
>  #endif
>  #endif	/* CONFIG_MMU */
> diff --git a/arch/arm/mm/proc-v7lpae.S b/arch/arm/mm/proc-v7lpae.S
> new file mode 100644
> index 0000000..0bee213
> --- /dev/null
> +++ b/arch/arm/mm/proc-v7lpae.S
> @@ -0,0 +1,422 @@
> +/*
> + * arch/arm/mm/proc-v7lpae.S
> + *
> + * Copyright (C) 2001 Deep Blue Solutions Ltd.
> + * Copyright (C) 2011 ARM Ltd.
> + * Author: Catalin Marinas <catalin.marinas@arm.com>
> + *   based on arch/arm/mm/proc-v7.S
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
> + */
> +#include <linux/init.h>
> +#include <linux/linkage.h>
> +#include <asm/assembler.h>
> +#include <asm/asm-offsets.h>
> +#include <asm/hwcap.h>
> +#include <asm/pgtable-hwdef.h>
> +#include <asm/pgtable.h>
> +
> +#include "proc-macros.S"
> +
> +#define TTB_IRGN_NC	(0 << 8)
> +#define TTB_IRGN_WBWA	(1 << 8)
> +#define TTB_IRGN_WT	(2 << 8)
> +#define TTB_IRGN_WB	(3 << 8)
> +#define TTB_RGN_NC	(0 << 10)
> +#define TTB_RGN_OC_WBWA	(1 << 10)
> +#define TTB_RGN_OC_WT	(2 << 10)
> +#define TTB_RGN_OC_WB	(3 << 10)
> +#define TTB_S		(3 << 12)
> +#define TTB_EAE		(1 << 31)
> +
> +/* PTWs cacheable, inner WB not shareable, outer WB not shareable */
> +#define TTB_FLAGS_UP	(TTB_IRGN_WB|TTB_RGN_OC_WB)
> +#define PMD_FLAGS_UP	(PMD_SECT_WB)
> +
> +/* PTWs cacheable, inner WBWA shareable, outer WBWA not shareable */
> +#define TTB_FLAGS_SMP	(TTB_IRGN_WBWA|TTB_S|TTB_RGN_OC_WBWA)
> +#define PMD_FLAGS_SMP	(PMD_SECT_WBWA|PMD_SECT_S)
> +
> +ENTRY(cpu_v7_proc_init)
> +	mov	pc, lr
> +ENDPROC(cpu_v7_proc_init)
> +
> +ENTRY(cpu_v7_proc_fin)
> +	mrc	p15, 0, r0, c1, c0, 0		@ ctrl register
> +	bic	r0, r0, #0x1000			@ ...i............
> +	bic	r0, r0, #0x0006			@ .............ca.
> +	mcr	p15, 0, r0, c1, c0, 0		@ disable caches
> +	mov	pc, lr
> +ENDPROC(cpu_v7_proc_fin)
> +
> +/*
> + *	cpu_v7_reset(loc)
> + *
> + *	Perform a soft reset of the system.  Put the CPU into the
> + *	same state as it would be if it had been reset, and branch
> + *	to what would be the reset vector.
> + *
> + *	- loc   - location to jump to for soft reset
> + */
> +	.align	5
> +ENTRY(cpu_v7_reset)
> +	mov	pc, r0
> +ENDPROC(cpu_v7_reset)
> +
> +/*
> + *	cpu_v7_do_idle()
> + *
> + *	Idle the processor (eg, wait for interrupt).
> + *
> + *	IRQs are already disabled.
> + */
> +ENTRY(cpu_v7_do_idle)
> +	dsb					@ WFI may enter a low-power mode
> +	wfi
> +	mov	pc, lr
> +ENDPROC(cpu_v7_do_idle)
> +
> +ENTRY(cpu_v7_dcache_clean_area)
> +#ifndef TLB_CAN_READ_FROM_L1_CACHE
> +	dcache_line_size r2, r3
> +1:	mcr	p15, 0, r0, c7, c10, 1		@ clean D entry
> +	add	r0, r0, r2
> +	subs	r1, r1, r2
> +	bhi	1b
> +	dsb
> +#endif
> +	mov	pc, lr
> +ENDPROC(cpu_v7_dcache_clean_area)
> +
> +/*
> + *	cpu_v7_switch_mm(pgd_phys, tsk)
> + *
> + *	Set the translation table base pointer to be pgd_phys
> + *
> + *	- pgd_phys - physical address of new TTB
> + *
> + *	It is assumed that:
> + *	- we are not using split page tables
> + */
> +ENTRY(cpu_v7_switch_mm)
> +#ifdef CONFIG_MMU
> +	ldr	r1, [r1, #MM_CONTEXT_ID]	@ get mm->context.id
> +	mov	r2, #0
> +	and	r3, r1, #0xff
> +	mov	r3, r3, lsl #(48 - 32)		@ ASID
> +	mcrr	p15, 0, r0, r3, c2		@ set TTB 0
> +	isb
> +#endif
> +	mov	pc, lr
> +ENDPROC(cpu_v7_switch_mm)
> +
> +/*
> + *	cpu_v7_set_pte_ext(ptep, pte)
> + *
> + *	Set a level 2 translation table entry.
> + *
> + *	- ptep  - pointer to level 2 translation table entry
> + *		  (hardware version is stored at +2048 bytes)
> + *	- pte   - PTE value to store
> + *	- ext	- value for extended PTE bits
> + */
> +ENTRY(cpu_v7_set_pte_ext)
> +#ifdef CONFIG_MMU
> +	tst	r2, #L_PTE_PRESENT
> +	beq	1f
> +	tst	r3, #1 << (55 - 32)		@ L_PTE_DIRTY
> +	orreq	r2, #L_PTE_RDONLY
> +1:	strd	r2, r3, [r0]
> +	mcr	p15, 0, r0, c7, c10, 1		@ flush_pte
> +#endif
> +	mov	pc, lr
> +ENDPROC(cpu_v7_set_pte_ext)
> +
> +cpu_v7_name:
> +	.ascii	"ARMv7 Processor"
> +	.align
> +
> +	/*
> +	 * Memory region attributes for LPAE (defined in pgtable-3level.h):
> +	 *
> +	 *   n = AttrIndx[2:0]
> +	 *
> +	 *			n	MAIR
> +	 *   UNCACHED		000	00000000
> +	 *   BUFFERABLE		001	01000100
> +	 *   DEV_WC		001	01000100
> +	 *   WRITETHROUGH	010	10101010
> +	 *   WRITEBACK		011	11101110
> +	 *   DEV_CACHED		011	11101110
> +	 *   DEV_SHARED		100	00000100
> +	 *   DEV_NONSHARED	100	00000100
> +	 *   unused		101
> +	 *   unused		110
> +	 *   WRITEALLOC		111	11111111
> +	 */
> +.equ	MAIR0,	0xeeaa4400			@ MAIR0
> +.equ	MAIR1,	0xff000004			@ MAIR1
> +
> +/* Suspend/resume support: derived from arch/arm/mach-s5pv210/sleep.S */
> +.globl	cpu_v7_suspend_size
> +.equ	cpu_v7_suspend_size, 4 * 10
> +#ifdef CONFIG_PM_SLEEP
> +ENTRY(cpu_v7_do_suspend)
> +	stmfd	sp!, {r4 - r11, lr}
> +	mrc	p15, 0, r4, c13, c0, 0	@ FCSE/PID
> +	mrc	p15, 0, r5, c13, c0, 1	@ Context ID
> +	mrc	p15, 0, r6, c3, c0, 0	@ Domain ID
> +	mrrc	p15, 0, r7, r8, c2	@ TTB 0
> +	mrrc	p15, 1, r2, r3, c2	@ TTB 1
> +	mrc	p15, 0, r9, c1, c0, 0	@ Control register
> +	mrc	p15, 0, r10, c1, c0, 1	@ Auxiliary control register
> +	mrc	p15, 0, r11, c1, c0, 2	@ Co-processor access control
> +	stmia	r0, {r2 - r11}
> +	ldmfd	sp!, {r4 - r11, pc}
> +ENDPROC(cpu_v7_do_suspend)
> +
> +ENTRY(cpu_v7_do_resume)
> +	mov	ip, #0
> +	mcr	p15, 0, ip, c8, c7, 0	@ invalidate TLBs
> +	mcr	p15, 0, ip, c7, c5, 0	@ invalidate I cache
> +	ldmia	r0, {r2 - r11}
> +	mcr	p15, 0, r4, c13, c0, 0	@ FCSE/PID
> +	mcr	p15, 0, r5, c13, c0, 1	@ Context ID
> +	mcr	p15, 0, r6, c3, c0, 0	@ Domain ID
> +	mcrr	p15, 0, r7, r8, c2	@ TTB 0
> +	mcrr	p15, 1, r2, r3, c2	@ TTB 1
> +	mcr	p15, 0, ip, c2, c0, 2	@ TTB control register
> +	mcr	p15, 0, r10, c1, c0, 1	@ Auxiliary control register
> +	mcr	p15, 0, r11, c1, c0, 2	@ Co-processor access control
> +	ldr	r4, =MAIR0
> +	ldr	r5, =MAIR1
> +	mcr	p15, 0, r4, c10, c2, 0	@ write MAIR0
> +	mcr	p15, 0, r5, c10, c2, 1	@ write MAIR1
> +	isb
> +	mov	r0, r9			@ control register
> +	mov	r2, r7, lsr #14		@ get TTB0 base
> +	mov	r2, r2, lsl #14
> +	ldr	r3, cpu_resume_l1_flags
> +	b	cpu_resume_mmu
> +ENDPROC(cpu_v7_do_resume)
> +cpu_resume_l1_flags:
> +	ALT_SMP(.long PMD_TYPE_SECT | PMD_SECT_AP_WRITE | PMD_FLAGS_SMP)
> +	ALT_UP(.long  PMD_TYPE_SECT | PMD_SECT_AP_WRITE | PMD_FLAGS_UP)
> +#else
> +#define cpu_v7_do_suspend	0
> +#define cpu_v7_do_resume	0
> +#endif
> +
> +	__CPUINIT
> +
> +/*
> + *	__v7_setup
> + *
> + *	Initialise TLB, Caches, and MMU state ready to switch the MMU
> + *	on. Return in r0 the new CP15 C1 control register setting.
> + *
> + *	This should be able to cover all ARMv7 cores with LPAE.
> + *
> + *	It is assumed that:
> + *	- cache type register is implemented
> + */
> +__v7_ca15mp_setup:
> +	mov	r10, #0
> +1:
> +#ifdef CONFIG_SMP
> +	ALT_SMP(mrc	p15, 0, r0, c1, c0, 1)
> +	ALT_UP(mov	r0, #(1 << 6))		@ fake it for UP
> +	tst	r0, #(1 << 6)			@ SMP/nAMP mode enabled?
> +	orreq	r0, r0, #(1 << 6)		@ Enable SMP/nAMP mode
> +	orreq	r0, r0, r10			@ Enable CPU-specific SMP bits
> +	mcreq	p15, 0, r0, c1, c0, 1
> +#endif
> +__v7_setup:
> +	adr	r12, __v7_setup_stack		@ the local stack
> +	stmia	r12, {r0-r5, r7, r9, r11, lr}
> +	bl	v7_flush_dcache_all
> +	ldmia	r12, {r0-r5, r7, r9, r11, lr}
> +
> +	mov	r10, #0
> +	mcr	p15, 0, r10, c7, c5, 0		@ I+BTB cache invalidate
> +	dsb
> +#ifdef CONFIG_MMU
> +	mcr	p15, 0, r10, c8, c7, 0		@ invalidate I + D TLBs
> +	mov	r5, #TTB_EAE
> +	ALT_SMP(orr	r5, r5, #TTB_FLAGS_SMP)
> +	ALT_SMP(orr	r5, r5, #TTB_FLAGS_SMP << 16)
> +	ALT_UP(orr	r5, r5, #TTB_FLAGS_UP)
> +	ALT_UP(orr	r5, r5, #TTB_FLAGS_UP << 16)
> +	mrc	p15, 0, r10, c2, c0, 2
> +	orr	r10, r10, r5
> +#if PHYS_OFFSET <= PAGE_OFFSET
> +	/*
> +	 * TTBR0/TTBR1 split (PAGE_OFFSET):
> +	 *   0x40000000: T0SZ = 2, T1SZ = 0 (not used)
> +	 *   0x80000000: T0SZ = 0, T1SZ = 1
> +	 *   0xc0000000: T0SZ = 0, T1SZ = 2
> +	 *
> +	 * Only use this feature if PHYS_OFFSET <= PAGE_OFFSET, otherwise
> +	 * booting secondary CPUs would end up using TTBR1 for the identity
> +	 * mapping set up in TTBR0.
> +	 */
> +	orr	r10, r10, #(((PAGE_OFFSET >> 30) - 1) << 16)	@ TTBCR.T1SZ
> +#endif
> +	mcr	p15, 0, r10, c2, c0, 2		@ TTB control register
> +	mov	r5, #0
> +#if defined CONFIG_VMSPLIT_2G
> +	/* PAGE_OFFSET == 0x80000000, T1SZ == 1 */
> +	add	r6, r8, #1 << 4			@ skip two L1 entries
> +#elif defined CONFIG_VMSPLIT_3G
> +	/* PAGE_OFFSET == 0xc0000000, T1SZ == 2 */
> +	add	r6, r8, #4096 * (1 + 3)		@ only L2 used, skip pgd+3*pmd
> +#else
> +	mov	r6, r8
> +#endif
> +	mcrr	p15, 1, r6, r5, c2		@ load TTBR1
> +	ldr	r5, =MAIR0
> +	ldr	r6, =MAIR1
> +	mcr	p15, 0, r5, c10, c2, 0		@ write MAIR0
> +	mcr	p15, 0, r6, c10, c2, 1		@ write MAIR1
> +#endif
> +	adr	r5, v7_crval
> +	ldmia	r5, {r5, r6}
> +#ifdef CONFIG_CPU_ENDIAN_BE8
> +	orr	r6, r6, #1 << 25		@ big-endian page tables
> +#endif
> +#ifdef CONFIG_SWP_EMULATE
> +	orr     r5, r5, #(1 << 10)              @ set SW bit in "clear"
> +	bic     r6, r6, #(1 << 10)              @ clear it in "mmuset"
> +#endif
> +	mrc	p15, 0, r0, c1, c0, 0		@ read control register
> +	bic	r0, r0, r5			@ clear them
> +	orr	r0, r0, r6			@ set them
> + THUMB(	orr	r0, r0, #1 << 30	)	@ Thumb exceptions
> +	mov	pc, lr				@ return to head.S:__ret
> +ENDPROC(__v7_setup)
> +
> +	/*   AT
> +	 *  TFR   EV X F   IHD LR    S
> +	 * .EEE ..EE PUI. .TAT 4RVI ZWRS BLDP WCAM
> +	 * rxxx rrxx xxx0 0101 xxxx xxxx x111 xxxx < forced
> +	 *   11    0 110    1  0011 1100 .111 1101 < we want
> +	 */
> +	.type	v7_crval, #object
> +v7_crval:
> +	crval	clear=0x0120c302, mmuset=0x30c23c7d, ucset=0x00c01c7c
> +
> +__v7_setup_stack:
> +	.space	4 * 11				@ 11 registers
> +
> +	__INITDATA
> +
> +	.type	v7_processor_functions, #object
> +ENTRY(v7_processor_functions)
> +	.word	v7_early_abort
> +	.word	v7_pabort
> +	.word	cpu_v7_proc_init
> +	.word	cpu_v7_proc_fin
> +	.word	cpu_v7_reset
> +	.word	cpu_v7_do_idle
> +	.word	cpu_v7_dcache_clean_area
> +	.word	cpu_v7_switch_mm
> +	.word	cpu_v7_set_pte_ext
> +	.word	0
> +	.word	0
> +	.word	0
> +	.size	v7_processor_functions, . - v7_processor_functions
> +
> +	.section ".rodata"
> +
> +	.type	cpu_arch_name, #object
> +cpu_arch_name:
> +	.asciz	"armv7"
> +	.size	cpu_arch_name, . - cpu_arch_name
> +
> +	.type	cpu_elf_name, #object
> +cpu_elf_name:
> +	.asciz	"v7"
> +	.size	cpu_elf_name, . - cpu_elf_name
> +	.align
> +
> +	.section ".proc.info.init", #alloc, #execinstr
> +
> +	.type	__v7_ca15mp_proc_info, #object
> +__v7_ca15mp_proc_info:
> +	.long	0x410fc0f0		@ Required ID value
> +	.long	0xff0ffff0		@ Mask for ID
> +	ALT_SMP(.long \
> +		PMD_TYPE_SECT | \
> +		PMD_SECT_AP_WRITE | \
> +		PMD_SECT_AP_READ | \
> +		PMD_SECT_AF | \
> +		PMD_FLAGS_SMP)
> +	ALT_UP(.long \
> +		PMD_TYPE_SECT | \
> +		PMD_SECT_AP_WRITE | \
> +		PMD_SECT_AP_READ | \
> +		PMD_SECT_AF | \
> +		PMD_FLAGS_UP)
> +		/* PMD_SECT_XN is set explicitly in head.S for LPAE */
> +	.long   PMD_TYPE_SECT | \
> +		PMD_SECT_XN | \
> +		PMD_SECT_AP_WRITE | \
> +		PMD_SECT_AP_READ | \
> +		PMD_SECT_AF
> +	b	__v7_ca15mp_setup
> +	.long	cpu_arch_name
> +	.long	cpu_elf_name
> +	.long	HWCAP_SWP|HWCAP_HALF|HWCAP_THUMB|HWCAP_FAST_MULT|HWCAP_EDSP|HWCAP_TLS
> +	.long	cpu_v7_name
> +	.long	v7_processor_functions
> +	.long	v7wbi_tlb_fns
> +	.long	v6_user_fns
> +	.long	v7_cache_fns
> +	.size	__v7_ca15mp_proc_info, . - __v7_ca15mp_proc_info
> +
> +	/*
> +	 * Match any ARMv7 processor core.
> +	 */
> +	.type	__v7_proc_info, #object
> +__v7_proc_info:
> +	.long	0x000f0000		@ Required ID value
> +	.long	0x000f0000		@ Mask for ID
> +	ALT_SMP(.long \
> +		PMD_TYPE_SECT | \
> +		PMD_SECT_AP_WRITE | \
> +		PMD_SECT_AP_READ | \
> +		PMD_SECT_AF | \
> +		PMD_FLAGS_SMP)
> +	ALT_UP(.long \
> +		PMD_TYPE_SECT | \
> +		PMD_SECT_AP_WRITE | \
> +		PMD_SECT_AP_READ | \
> +		PMD_SECT_AF | \
> +		PMD_FLAGS_UP)
> +		/* PMD_SECT_XN is set explicitly in head.S for LPAE */
> +	.long   PMD_TYPE_SECT | \
> +		PMD_SECT_XN | \
> +		PMD_SECT_AP_WRITE | \
> +		PMD_SECT_AP_READ | \
> +		PMD_SECT_AF
> +	W(b)	__v7_setup
> +	.long	cpu_arch_name
> +	.long	cpu_elf_name
> +	.long	HWCAP_SWP|HWCAP_HALF|HWCAP_THUMB|HWCAP_FAST_MULT|HWCAP_EDSP|HWCAP_TLS
> +	.long	cpu_v7_name
> +	.long	v7_processor_functions
> +	.long	v7wbi_tlb_fns
> +	.long	v6_user_fns
> +	.long	v7_cache_fns
> +	.size	__v7_proc_info, . - __v7_proc_info
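
As a cross-check on the MAIR table in the quoted proc-v7lpae.S, here is a
minimal C sketch that packs the per-index attribute bytes into the two
32-bit register values. It assumes that AttrIndx n selects byte n, with
indices 0-3 in MAIR0 and 4-7 in MAIR1. The names mair_attr and mair_pack
are illustrative only and do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Attribute bytes indexed by AttrIndx, copied from the quoted table. */
static const uint8_t mair_attr[8] = {
	0x00,	/* 000: UNCACHED */
	0x44,	/* 001: BUFFERABLE, DEV_WC */
	0xaa,	/* 010: WRITETHROUGH */
	0xee,	/* 011: WRITEBACK, DEV_CACHED */
	0x04,	/* 100: DEV_SHARED, DEV_NONSHARED */
	0x00,	/* 101: unused */
	0x00,	/* 110: unused */
	0xff,	/* 111: WRITEALLOC */
};

/*
 * Pack four consecutive attribute bytes into one register value,
 * lowest index in the least significant byte (hypothetical helper).
 */
static uint32_t mair_pack(unsigned int first)
{
	uint32_t val = 0;
	unsigned int i;

	assert(first == 0 || first == 4);	/* MAIR0 or MAIR1 */
	for (i = 0; i < 4; i++)
		val |= (uint32_t)mair_attr[first + i] << (8 * i);
	return val;
}
```

Under that assumption, mair_pack(0) and mair_pack(4) reproduce the
0xeeaa4400 and 0xff000004 constants given in the .equ lines.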

Will send a fix soon

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH] Fix non-LPAE boot regression.
  2011-08-13 12:56   ` Vasily Khoruzhick
@ 2011-08-13 12:58     ` Vasily Khoruzhick
  2011-08-13 14:14       ` Catalin Marinas
  2011-08-15 12:09       ` Catalin Marinas
  0 siblings, 2 replies; 46+ messages in thread
From: Vasily Khoruzhick @ 2011-08-13 12:58 UTC (permalink / raw)
  To: linux-arm-kernel

It was introduced by  407f8b4cb07cbc5c1c7cc386f231224e2524ccea
ARM: LPAE: MMU setup for the 3-level page table format

Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
---
 arch/arm/kernel/head.S |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 0bdafc4..5add5f5 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -206,7 +206,7 @@ __create_page_tables:
 1:	orr	r3, r7, r5, lsl #SECTION_SHIFT	@ flags + kernel base
 	str	r3, [r4, r5, lsl #PMD_ORDER]	@ identity mapping
 	cmp	r5, r6
-	addlo	r5, r5, #SECTION_SHIFT >> 20	@ next section
+	addlo	r5, r5, #1			@ next section
 	blo	1b
 
 	/*
@@ -217,7 +217,7 @@ __create_page_tables:
 	mov	r3, r3, lsr #SECTION_SHIFT
 	orr	r3, r7, r3, lsl #SECTION_SHIFT
 	add	r0, r4,  #(KERNEL_START & 0xff000000) >> (SECTION_SHIFT - PMD_ORDER)
-	str	r3, [r0, #(KERNEL_START & 0x00e00000) >> (SECTION_SHIFT - PMD_ORDER)]!
+	str	r3, [r0, #(KERNEL_START & 0x00f00000) >> (SECTION_SHIFT - PMD_ORDER)]!
 	ldr	r6, =(KERNEL_END - 1)
 	add	r0, r0, #1 << PMD_ORDER
 	add	r6, r4, r6, lsr #(SECTION_SHIFT - PMD_ORDER)
-- 
1.7.5.rc3

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH] Fix non-LPAE boot regression.
  2011-08-13 12:58     ` [PATCH] Fix non-LPAE boot regression Vasily Khoruzhick
@ 2011-08-13 14:14       ` Catalin Marinas
  2011-08-13 14:39         ` Russell King - ARM Linux
  2011-08-15 12:09       ` Catalin Marinas
  1 sibling, 1 reply; 46+ messages in thread
From: Catalin Marinas @ 2011-08-13 14:14 UTC (permalink / raw)
  To: linux-arm-kernel

On Saturday, 13 August 2011, Vasily Khoruzhick <anarsoul@gmail.com> wrote:
> It was introduced by 407f8b4cb07cbc5c1c7cc386f231224e2524ccea
> ARM: LPAE: MMU setup for the 3-level page table format
>
> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
> ---
>  arch/arm/kernel/head.S |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
> index 0bdafc4..5add5f5 100644
> --- a/arch/arm/kernel/head.S
> +++ b/arch/arm/kernel/head.S
> @@ -206,7 +206,7 @@ __create_page_tables:
>  1:	orr	r3, r7, r5, lsl #SECTION_SHIFT	@ flags + kernel base
>  	str	r3, [r4, r5, lsl #PMD_ORDER]	@ identity mapping
>  	cmp	r5, r6
> -	addlo	r5, r5, #SECTION_SHIFT >> 20	@ next section
> +	addlo	r5, r5, #1			@ next section
>  	blo	1b

Thanks for this. The original code was indeed broken but I think the
fix should be to use SECTION_SIZE instead of SHIFT. I'll have a look
on Monday.

-- 
Catalin

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v7 02/16] ARM: LPAE: Cast the dma_addr_t argument to unsigned long in dma_to_virt
  2011-08-10 15:03 ` [PATCH v7 02/16] ARM: LPAE: Cast the dma_addr_t argument to unsigned long in dma_to_virt Catalin Marinas
@ 2011-08-13 14:33   ` Russell King - ARM Linux
  2011-08-23 11:15     ` Russell King - ARM Linux
  0 siblings, 1 reply; 46+ messages in thread
From: Russell King - ARM Linux @ 2011-08-13 14:33 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 10, 2011 at 04:03:25PM +0100, Catalin Marinas wrote:
> This is to avoid a compiler warning when invoking the __bus_to_virt()
> macro. The dma_to_virt() function gets addresses within the 32-bit
> range.

Ok.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v7 03/16] ARM: LPAE: Use PMD_(SHIFT|SIZE|MASK) instead of PGDIR_*
  2011-08-10 15:03 ` [PATCH v7 03/16] ARM: LPAE: Use PMD_(SHIFT|SIZE|MASK) instead of PGDIR_* Catalin Marinas
@ 2011-08-13 14:34   ` Russell King - ARM Linux
  2011-08-15 16:48   ` Catalin Marinas
  2011-08-23 11:15   ` Russell King - ARM Linux
  2 siblings, 0 replies; 46+ messages in thread
From: Russell King - ARM Linux @ 2011-08-13 14:34 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 10, 2011 at 04:03:26PM +0100, Catalin Marinas wrote:
> PGDIR_SHIFT and PMD_SHIFT for the classic 2-level page table format have
> the same value (21). This patch converts the PGDIR_* uses in the kernel
> to the PMD_* equivalent so that LPAE builds can reuse the same code.

Ok.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH] Fix non-LPAE boot regression.
  2011-08-13 14:14       ` Catalin Marinas
@ 2011-08-13 14:39         ` Russell King - ARM Linux
  2011-08-13 14:45           ` Catalin Marinas
  2011-08-15 11:41           ` Catalin Marinas
  0 siblings, 2 replies; 46+ messages in thread
From: Russell King - ARM Linux @ 2011-08-13 14:39 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Aug 13, 2011 at 03:14:30PM +0100, Catalin Marinas wrote:
> Thanks for this. The original code was indeed broken but I think the
> fix should be to use SECTION_SIZE instead of SHIFT. I'll have a look
> on Monday.

No, the original code is not broken.  Look at what it's doing:

        mov     r5, r5, lsr #20
        mov     r6, r6, lsr #20

1:      orr     r3, r7, r5, lsl #20             @ flags + kernel base
        str     r3, [r4, r5, lsl #2]            @ identity mapping
        teq     r5, r6
        addne   r5, r5, #1                      @ next section
        bne     1b

The addition of one is to step us to the next page table entry.  It's
not SECTION_SHIFT >> 20 or anything like that.

Let's rewrite it in C:

	pmd_idx = r5 >> 20;
	pmd_end = r6 >> 20;

	do {
		pmd[pmd_idx] = flags | (pmd_idx << 20);
		if (pmd_idx == pmd_end)
			break;
		pmd_idx++;
	} while (1);

which is quite correct for non-LPAE.  Those shifts of 20 could well have
been SECTION_SHIFT instead to make it more clear what's going on there.

Now, with LPAE, where pmds are now 64-bit, the fact that SECTION_SHIFT
becomes 21 is merely coincidental.  That doesn't mean that the add
instruction should be SECTION_SIZE >> 20, as you're using apples to
describe oranges there.

With SECTION_SIZE >> 20, your modified code looks like this for LPAE:

+       mov     r5, r5, lsr #21
+       mov     r6, r6, lsr #21

+1:     orr     r3, r7, r5, lsl #21             @ flags + kernel base
+       str     r3, [r4, r5, lsl #3]            @ identity mapping
+       cmp     r5, r6
+       addlo   r5, r5, #2                      @ next section
+       blo     1b

So: for LPAE:
	r5 increments by 2, so r3 increments by 2 << 21.
	[r4, r5, lsl #3] increments by 2<<3 = 16.
for non-LPAE (from above):
	r5 increments by 1, so r3 increments by 1 << 20.
	[r4, r5, lsl #2] increments by 1<<2 = 4.

so that's not correct either.  Rather than incrementing by one section
on LPAE, we increment by two.  Not only that, but the pointer also
increments by twice as much.

So, this should become something like this instead:

        mov     r5, r5, lsr #SECTION_SHIFT
        mov     r6, r6, lsr #SECTION_SHIFT

1:      orr     r3, r7, r5, lsl #SECTION_SHIFT  @ flags + kernel base
        str     r3, [r4, r5, lsl #PMD_ORDER]	@ identity mapping
        teq     r5, r6
        addne   r5, r5, #1                      @ next section
        bne     1b

which is what Vasily's patch does.

I think this patch is trying to do too much in one go.  It needs splitting
up into two, just like is done with the C PGDIR_SHIFT vs PMD_SHIFT stuff
(and arguably the first part should be combined with the patch fixing the
PGDIR_SHIFT stuff.)
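
The increment arithmetic above can be checked with a small C model. This
is only a sketch: it takes the shift and entry-size values quoted in this
message (20/2 for classic, 21/3 for LPAE) and models the
str / cmp / addlo / blo sequence with a plain loop. The helper names
phys_stride, table_stride and entries_written are made up for
illustration and are not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the discussion above. */
enum {
	CLASSIC_SECTION_SHIFT = 20,	/* 1MB sections */
	CLASSIC_PMD_ORDER     = 2,	/* 4-byte entries */
	LPAE_SECTION_SHIFT    = 21,	/* 2MB sections */
	LPAE_PMD_ORDER        = 3,	/* 8-byte entries */
};

/* Physical address advance per iteration: r5 is shifted by SECTION_SHIFT. */
static uint32_t phys_stride(unsigned int section_shift, unsigned int step)
{
	return (uint32_t)step << section_shift;
}

/* Table pointer advance per iteration: r5 is shifted by PMD_ORDER. */
static uint32_t table_stride(unsigned int pmd_order, unsigned int step)
{
	return (uint32_t)step << pmd_order;
}

/* Number of entries the loop writes for the range [start, end]. */
static unsigned int entries_written(uint32_t start, uint32_t end,
				    unsigned int section_shift,
				    unsigned int step)
{
	uint32_t idx = start >> section_shift;
	uint32_t last = end >> section_shift;
	unsigned int count = 0;

	assert(step > 0);		/* a step of 0 would never terminate */
	for (;;) {
		count++;		/* str r3, [r4, r5, lsl #PMD_ORDER] */
		if (idx >= last)	/* cmp r5, r6 */
			break;
		idx += step;		/* addlo r5, r5, #step */
	}
	return count;
}
```

With a step of 1 the strides are one section and one entry in both
formats (1MB and 4 bytes for classic, 2MB and 8 bytes for LPAE). With a
step of 2 on LPAE they become 4MB and 16 bytes, which is the double
increment described above.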

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH] Fix non-LPAE boot regression.
  2011-08-13 14:39         ` Russell King - ARM Linux
@ 2011-08-13 14:45           ` Catalin Marinas
  2011-08-15 11:41           ` Catalin Marinas
  1 sibling, 0 replies; 46+ messages in thread
From: Catalin Marinas @ 2011-08-13 14:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Saturday, 13 August 2011, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Sat, Aug 13, 2011 at 03:14:30PM +0100, Catalin Marinas wrote:
>> Thanks for this. The original code was indeed broken but I think the
>> fix should be to use SECTION_SIZE instead of SHIFT. I'll have a look
>> on Monday.
>
> No, the original code is not broken.  Look at what it's doing:

I meant my original LPAE code (though strangely it was booting fine, I
guess it didn't use more than a section from those mappings). I'll
look at the details on Monday, I'm away from my PC now.

-- 
Catalin

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH] Fix non-LPAE boot regression.
  2011-08-13 14:39         ` Russell King - ARM Linux
  2011-08-13 14:45           ` Catalin Marinas
@ 2011-08-15 11:41           ` Catalin Marinas
  1 sibling, 0 replies; 46+ messages in thread
From: Catalin Marinas @ 2011-08-15 11:41 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Aug 13, 2011 at 03:39:03PM +0100, Russell King - ARM Linux wrote:
> I think this patch is trying to do too much in one go.  It needs splitting
> up into two, just like is done with the C PGDIR_SHIFT vs PMD_SHIFT stuff
> (and arguably the first part should be combined with the patch fixing the
> PGDIR_SHIFT stuff.)

OK, I'll do this and repost the two patches in reply to this one (and
merge one of them with the PMD_SHIFT patch, but I would rather avoid
reposting the full series again).

For patch clarity, I'll fold in Vasily's fix as well rather than keeping
it in a separate patch.

-- 
Catalin

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH] Fix non-LPAE boot regression.
  2011-08-13 12:58     ` [PATCH] Fix non-LPAE boot regression Vasily Khoruzhick
  2011-08-13 14:14       ` Catalin Marinas
@ 2011-08-15 12:09       ` Catalin Marinas
  2011-08-15 12:31         ` Vasily Khoruzhick
  1 sibling, 1 reply; 46+ messages in thread
From: Catalin Marinas @ 2011-08-15 12:09 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Vasily,

On Sat, Aug 13, 2011 at 01:58:19PM +0100, Vasily Khoruzhick wrote:
> It was introduced by  407f8b4cb07cbc5c1c7cc386f231224e2524ccea
> ARM: LPAE: MMU setup for the 3-level page table format
> 
> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
> ---
>  arch/arm/kernel/head.S |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
> index 0bdafc4..5add5f5 100644
> --- a/arch/arm/kernel/head.S
> +++ b/arch/arm/kernel/head.S
> @@ -206,7 +206,7 @@ __create_page_tables:
>  1:	orr	r3, r7, r5, lsl #SECTION_SHIFT	@ flags + kernel base
>  	str	r3, [r4, r5, lsl #PMD_ORDER]	@ identity mapping
>  	cmp	r5, r6
> -	addlo	r5, r5, #SECTION_SHIFT >> 20	@ next section
> +	addlo	r5, r5, #1			@ next section
>  	blo	1b

That's correct.

>  	/*
> @@ -217,7 +217,7 @@ __create_page_tables:
>  	mov	r3, r3, lsr #SECTION_SHIFT
>  	orr	r3, r7, r3, lsl #SECTION_SHIFT
>  	add	r0, r4,  #(KERNEL_START & 0xff000000) >> (SECTION_SHIFT - PMD_ORDER)
> -	str	r3, [r0, #(KERNEL_START & 0x00e00000) >> (SECTION_SHIFT - PMD_ORDER)]!
> +	str	r3, [r0, #(KERNEL_START & 0x00f00000) >> (SECTION_SHIFT - PMD_ORDER)]!

The reason for this was that the sections are 2MB with LPAE and a page
table entry is 64-bit wide. We always shift that value by 18 but with
LPAE we don't want to write in the middle of a page table entry if
KERNEL_START is not 2MB aligned.

But if KERNEL_START is not 2MB aligned, I think we get the wrong
physical address by 1MB (with the classic page table format).

There are a few alternatives to fixing this:

1. Different KERNEL_START masking for classic or LPAE page tables.
2. Always force 2MB section and the code above moving the phys addr into
   r3 would need to take this into account.

I would go for 1 with some shifting like below:

+	str	r3, [r0, #((KERNEL_START & 0x00f00000) >> SECTION_SHIFT) << PMD_ORDER]!

This should give us 0x00f00000 with classic page tables and 0x00e00000
with LPAE.
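
The effect of that expression can be sketched in C. The assembler
evaluates it at build time, so this only models the arithmetic; the
function names are hypothetical, and the sample KERNEL_START values in
the note below are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Byte offset used by the str instruction:
 * ((KERNEL_START & 0x00f00000) >> SECTION_SHIFT) << PMD_ORDER
 */
static uint32_t kernel_pmd_offset(uint32_t kernel_start,
				  unsigned int section_shift,
				  unsigned int pmd_order)
{
	assert(section_shift == 20 || section_shift == 21);
	return ((kernel_start & 0x00f00000u) >> section_shift) << pmd_order;
}

/*
 * KERNEL_START bits that still contribute after shifting down by
 * SECTION_SHIFT, i.e. the mask the expression effectively applies.
 */
static uint32_t effective_mask(unsigned int section_shift)
{
	return (0x00f00000u >> section_shift) << section_shift;
}
```

For SECTION_SHIFT 20 the effective mask is 0x00f00000 and for 21 it is
0x00e00000. A KERNEL_START of 0xc0108000 (assumed here) gives offset 4
with classic entries and 0 with LPAE, so an LPAE store always lands on an
8-byte entry boundary.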

Thanks.

-- 
Catalin

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH] Fix non-LPAE boot regression.
  2011-08-15 12:09       ` Catalin Marinas
@ 2011-08-15 12:31         ` Vasily Khoruzhick
  2011-08-24  8:16           ` Vasily Khoruzhick
  0 siblings, 1 reply; 46+ messages in thread
From: Vasily Khoruzhick @ 2011-08-15 12:31 UTC (permalink / raw)
  To: linux-arm-kernel

On Monday 15 August 2011 15:09:14 Catalin Marinas wrote:
> Hi Vasily,
> 
> On Sat, Aug 13, 2011 at 01:58:19PM +0100, Vasily Khoruzhick wrote:
> > It was introduced by  407f8b4cb07cbc5c1c7cc386f231224e2524ccea
> > ARM: LPAE: MMU setup for the 3-level page table format
> > 
> > Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
> > ---
> > 
> >  arch/arm/kernel/head.S |    4 ++--
> >  1 files changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
> > index 0bdafc4..5add5f5 100644
> > --- a/arch/arm/kernel/head.S
> > +++ b/arch/arm/kernel/head.S
> > 
> > @@ -206,7 +206,7 @@ __create_page_tables:
> >  1:	orr	r3, r7, r5, lsl #SECTION_SHIFT	@ flags + kernel base
> >  
> >  	str	r3, [r4, r5, lsl #PMD_ORDER]	@ identity mapping
> >  	cmp	r5, r6
> > 
> > -	addlo	r5, r5, #SECTION_SHIFT >> 20	@ next section
> > +	addlo	r5, r5, #1			@ next section
> > 
> >  	blo	1b
> 
> That's correct.
> 
> >  	/*
> > 
> > @@ -217,7 +217,7 @@ __create_page_tables:
> >  	mov	r3, r3, lsr #SECTION_SHIFT
> >  	orr	r3, r7, r3, lsl #SECTION_SHIFT
> >  	add	r0, r4,  #(KERNEL_START & 0xff000000) >> (SECTION_SHIFT - PMD_ORDER)
> > -	str	r3, [r0, #(KERNEL_START & 0x00e00000) >> (SECTION_SHIFT - PMD_ORDER)]!
> > +	str	r3, [r0, #(KERNEL_START & 0x00f00000) >> (SECTION_SHIFT - PMD_ORDER)]!
> 
> The reason for this was that the sections are 2MB with LPAE and a page
> table entry is 64-bit wide. We always shift that value by 18 but with
> LPAE we don't want to write in the middle of a page table entry if
> KERNEL_START is not 2MB aligned.
> 
> But if KERNEL_START is not 2MB aligned, I think we get the wrong
> physical address by 1MB (with the classic page table format).

Yep, for my case KERNEL_START is not 2MB aligned (TEXT_OFFSET is 0x00108000). 
(CONFIG_PM_H1940 is set)

> There are a few alternatives to fixing this:
> 
> 1. Different KERNEL_START masking for classic or LPAE page tables.
> 2. Always force 2MB section and the code above moving the phys addr into
>    r3 would need to take this into account.
> 
> I would go for 1 with some shifting like below:
> 
> +	str	r3, [r0, #((KERNEL_START & 0x00f00000) >> SECTION_SHIFT) << PMD_ORDER]!
> 
> This should give us 0x00f00000 with classic page tables and 0x00e00000
> with LPAE.
> 
> Thanks.

Ok, I'll test this change when I get home.

Regards
Vasily

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v7 03/16] ARM: LPAE: Use PMD_(SHIFT|SIZE|MASK) instead of PGDIR_*
  2011-08-10 15:03 ` [PATCH v7 03/16] ARM: LPAE: Use PMD_(SHIFT|SIZE|MASK) instead of PGDIR_* Catalin Marinas
  2011-08-13 14:34   ` Russell King - ARM Linux
@ 2011-08-15 16:48   ` Catalin Marinas
  2011-08-23 11:15   ` Russell King - ARM Linux
  2 siblings, 0 replies; 46+ messages in thread
From: Catalin Marinas @ 2011-08-15 16:48 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 10, 2011 at 04:03:26PM +0100, Catalin Marinas wrote:
> PGDIR_SHIFT and PMD_SHIFT for the classic 2-level page table format have
> the same value (21). This patch converts the PGDIR_* uses in the kernel
> to the PMD_* equivalent so that LPAE builds can reuse the same code.
> 
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Following Russell's advice to split the MMU setup patch, the diff below
will be folded into this patch for easier reviewing (and the commit will
be changed slightly to reflect this):

8<-----------------------

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v7 09/16] ARM: LPAE: MMU setup for the 3-level page table format
  2011-08-10 15:03 ` [PATCH v7 09/16] ARM: LPAE: MMU setup for the 3-level page table format Catalin Marinas
  2011-08-13 11:49   ` Vasily Khoruzhick
  2011-08-13 12:56   ` Vasily Khoruzhick
@ 2011-08-15 16:51   ` Catalin Marinas
  2011-08-19 10:25   ` Ian Campbell
  3 siblings, 0 replies; 46+ messages in thread
From: Catalin Marinas @ 2011-08-15 16:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 10, 2011 at 04:03:32PM +0100, Catalin Marinas wrote:
> This patch adds the MMU initialisation for the LPAE page table format.
> The swapper_pg_dir size with LPAE is 5 rather than 4 pages. A new
> proc-v7lpae.S file contains the initialisation, context switch and
> save/restore code for ARMv7 with the LPAE. The TTBRx split is based on
> the PAGE_OFFSET with TTBR1 used for the kernel mappings. The 36-bit
> mappings (supersections) and a few other memory types in mmu.c are
> conditionally compiled.
> 
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Following the split of this patch, it becomes:

8<--------------------

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v7 09/16] ARM: LPAE: MMU setup for the 3-level page table format
  2011-08-10 15:03 ` [PATCH v7 09/16] ARM: LPAE: MMU setup for the 3-level page table format Catalin Marinas
                     ` (2 preceding siblings ...)
  2011-08-15 16:51   ` [PATCH v7 09/16] ARM: LPAE: MMU setup for the 3-level page table format Catalin Marinas
@ 2011-08-19 10:25   ` Ian Campbell
  2011-08-19 11:10     ` Catalin Marinas
  3 siblings, 1 reply; 46+ messages in thread
From: Ian Campbell @ 2011-08-19 10:25 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 2011-08-10 at 16:03 +0100, Catalin Marinas wrote:
> +/*
> + *     cpu_v7_set_pte_ext(ptep, pte)
> + *
> + *     Set a level 2 translation table entry.
> + *
> + *     - ptep  - pointer to level 2 translation table entry
> + *               (hardware version is stored at +2048 bytes)

+2048 thing not true for LPAE?

> + *     - pte   - PTE value to store
> + *     - ext   - value for extended PTE bits

"ext" is not actually present/used in this variant, rather pte is split
between r1 and r2?

> + */
> +ENTRY(cpu_v7_set_pte_ext)
> +#ifdef CONFIG_MMU
> +       tst     r2, #L_PTE_PRESENT
> +       beq     1f
> +       tst     r3, #1 << (55 - 32)             @ L_PTE_DIRTY
> +       orreq   r2, #L_PTE_RDONLY
> +1:     strd    r2, r3, [r0]

AIUI this 64-bit store is not atomic. Is there something about the ARM
architecture which would prevent the MMU prefetching the half written
entry and caching it in the TLB?

i.e. If you are transitioning from a
	"0..0 | 0..0 (!L_PTE_PRESENT)"
entry to a
        "ST   | UFF  ( L_PTE_PRESENT)"
entry you will temporarily be in the
        "0..0 | UFF  ( L_PTE_PRESENT)"
state. (or vice versa going the other way if you do the writes in the
other order). This might mean that a subsequent access through the VA
corresponding to this PTE goes to the wrong place.
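
To make the window concrete, here is a host-side C model of a split
update (a sketch only; PTE_PRESENT as bit 0 and the helper names are
assumptions, and real hardware behaviour depends on the architecture's
atomicity guarantees):

```c
#include <assert.h>
#include <stdint.h>

#define PTE_PRESENT	(1u << 0)	/* hypothetical "present" bit in the low word */

/* Model a 64-bit descriptor as two words a table walker could read between stores. */
struct pte64 {
	uint32_t lo;
	uint32_t hi;
};

static uint64_t pte_read(const struct pte64 *p)
{
	return ((uint64_t)p->hi << 32) | p->lo;
}

/*
 * Non-atomic update, low word first. *mid receives the value a walker
 * would observe between the two 32-bit stores.
 */
static void pte_set_split(struct pte64 *p, uint64_t val, uint64_t *mid)
{
	assert(p && mid);
	p->lo = (uint32_t)val;
	*mid = pte_read(p);
	p->hi = (uint32_t)(val >> 32);
}
```

Starting from an all-zero entry, the intermediate value already has the
present bit set but is still missing the upper address bits, which is
exactly the state described above.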

I'm asking because we had a very subtle bug on x86 Xen relating to this
sort of issue ages ago, it was hell to debug ;-).

Ian.

> +       mcr     p15, 0, r0, c7, c10, 1          @ flush_pte
> +#endif
> +       mov     pc, lr
> +ENDPROC(cpu_v7_set_pte_ext) 
-- 
Ian Campbell

Working with Julie Andrews is like getting hit over the head with a valentine.
		-- Christopher Plummer

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v7 09/16] ARM: LPAE: MMU setup for the 3-level page table format
  2011-08-19 10:25   ` Ian Campbell
@ 2011-08-19 11:10     ` Catalin Marinas
  2011-08-19 11:47       ` Ian Campbell
  0 siblings, 1 reply; 46+ messages in thread
From: Catalin Marinas @ 2011-08-19 11:10 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Aug 19, 2011 at 11:25:57AM +0100, Ian Campbell wrote:
> On Wed, 2011-08-10 at 16:03 +0100, Catalin Marinas wrote:
> > +/*
> > + *     cpu_v7_set_pte_ext(ptep, pte)
> > + *
> > + *     Set a level 2 translation table entry.
> > + *
> > + *     - ptep  - pointer to level 2 translation table entry
> > + *               (hardware version is stored at +2048 bytes)
> 
> +2048 thing not true for LPAE?
> 
> > + *     - pte   - PTE value to store
> > + *     - ext   - value for extended PTE bits
> 
> "ext" is not actually present/used in this variant, rather pte is split
> between r1 and r2?

Yes, you are right, the comments have just been copied from proc-v7.S.
I'll go through them again to make sure they are still valid.

> > + */
> > +ENTRY(cpu_v7_set_pte_ext)
> > +#ifdef CONFIG_MMU
> > +       tst     r2, #L_PTE_PRESENT
> > +       beq     1f
> > +       tst     r3, #1 << (55 - 32)             @ L_PTE_DIRTY
> > +       orreq   r2, #L_PTE_RDONLY
> > +1:     strd    r2, r3, [r0]
> 
> AIUI this 64-bit store is not atomic. Is there something about the ARM
> architecture which would prevent the MMU prefetching the half written
> entry and caching it in the TLB?

CPU implementations that include LPAE guarantee the atomicity of a
double-word store (STRD) if the alignment is correct.

Thanks.

-- 
Catalin

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v7 09/16] ARM: LPAE: MMU setup for the 3-level page table format
  2011-08-19 11:10     ` Catalin Marinas
@ 2011-08-19 11:47       ` Ian Campbell
  0 siblings, 0 replies; 46+ messages in thread
From: Ian Campbell @ 2011-08-19 11:47 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 2011-08-19 at 12:10 +0100, Catalin Marinas wrote:
> 
> > > + */
> > > +ENTRY(cpu_v7_set_pte_ext)
> > > +#ifdef CONFIG_MMU
> > > +       tst     r2, #L_PTE_PRESENT
> > > +       beq     1f
> > > +       tst     r3, #1 << (55 - 32)             @ L_PTE_DIRTY
> > > +       orreq   r2, #L_PTE_RDONLY
> > > +1:     strd    r2, r3, [r0]
> > 
> > AIUI this 64-bit store is not atomic. Is there something about the
> ARM
> > architecture which would prevent the MMU prefetching the half
> written
> > entry and caching it in the TLB?
> 
> CPU implementations that include LPAE guarantee the atomicity of a
> double-word store (STRD) if the alignment is correct. 

Ah, I was looking at the standard v7 docs and not the LPAE extensions, I
see it now.

Thanks,
Ian.

-- 
Ian Campbell

The disks are getting full; purge a file today.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v7 03/16] ARM: LPAE: Use PMD_(SHIFT|SIZE|MASK) instead of PGDIR_*
  2011-08-10 15:03 ` [PATCH v7 03/16] ARM: LPAE: Use PMD_(SHIFT|SIZE|MASK) instead of PGDIR_* Catalin Marinas
  2011-08-13 14:34   ` Russell King - ARM Linux
  2011-08-15 16:48   ` Catalin Marinas
@ 2011-08-23 11:15   ` Russell King - ARM Linux
  2011-08-23 13:09     ` Catalin Marinas
  2 siblings, 1 reply; 46+ messages in thread
From: Russell King - ARM Linux @ 2011-08-23 11:15 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 10, 2011 at 04:03:26PM +0100, Catalin Marinas wrote:
> @@ -183,7 +183,7 @@ static int __init consistent_init(void)
>  		}
>  
>  		consistent_pte[i++] = pte;
> -		base += (1 << PGDIR_SHIFT);
> +		base += (1 << PMD_SHIFT);

Please replace with PMD_SIZE.
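
Something along these lines, shown as a standalone sketch (PMD_SHIFT of 21
is only assumed here and next_pmd_base() is a made-up name for the
illustration):

```c
#include <assert.h>

#define PMD_SHIFT	21			/* value assumed for illustration */
#define PMD_SIZE	(1UL << PMD_SHIFT)
#define PMD_MASK	(~(PMD_SIZE - 1))

/* Advance to the region covered by the next consistent_pte[] slot. */
static unsigned long next_pmd_base(unsigned long base)
{
	assert((base & ~PMD_MASK) == 0);	/* base is expected to be PMD aligned */
	return base + PMD_SIZE;			/* instead of base + (1 << PMD_SHIFT) */
}
```

The value is the same, but PMD_SIZE states the intent and is already an
unsigned long constant.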

Once combined with the "fixup! ARM: LPAE: Use PMD_(SHIFT|SIZE|MASK) instead of PGDIR_*"
patch (and once whatever "fixup!" is done) this can be submitted to the
patch system.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v7 02/16] ARM: LPAE: Cast the dma_addr_t argument to unsigned long in dma_to_virt
  2011-08-13 14:33   ` Russell King - ARM Linux
@ 2011-08-23 11:15     ` Russell King - ARM Linux
  0 siblings, 0 replies; 46+ messages in thread
From: Russell King - ARM Linux @ 2011-08-23 11:15 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Aug 13, 2011 at 03:33:58PM +0100, Russell King - ARM Linux wrote:
> On Wed, Aug 10, 2011 at 04:03:25PM +0100, Catalin Marinas wrote:
> > This is to avoid a compiler warning when invoking the __bus_to_virt()
> > macro. The dma_to_virt() function gets addresses within the 32-bit
> > range.
> 
> Ok.

I haven't noticed anything happening with this patch yet...

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v7 03/16] ARM: LPAE: Use PMD_(SHIFT|SIZE|MASK) instead of PGDIR_*
  2011-08-23 11:15   ` Russell King - ARM Linux
@ 2011-08-23 13:09     ` Catalin Marinas
  0 siblings, 0 replies; 46+ messages in thread
From: Catalin Marinas @ 2011-08-23 13:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Aug 23, 2011 at 12:15:33PM +0100, Russell King - ARM Linux wrote:
> On Wed, Aug 10, 2011 at 04:03:26PM +0100, Catalin Marinas wrote:
> > @@ -183,7 +183,7 @@ static int __init consistent_init(void)
> >  		}
> >  
> >  		consistent_pte[i++] = pte;
> > -		base += (1 << PGDIR_SHIFT);
> > +		base += (1 << PMD_SHIFT);
> 
> Please replace with PMD_SIZE.
> 
> Once combined with the "fixup! ARM: LPAE: Use PMD_(SHIFT|SIZE|MASK) instead of PGDIR_*"
> patch (and once whatever "fixup!" is done) this can be submitted to the
> patch system.

I sent this one (with the folded fix-ups) and the other type casting
patch to your patch system (the "fixup!" is generated with
git commit --fixup, and git rebase -i --autosquash does the folding
automatically).

-- 
Catalin

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH] Fix non-LPAE boot regression.
  2011-08-15 12:31         ` Vasily Khoruzhick
@ 2011-08-24  8:16           ` Vasily Khoruzhick
  0 siblings, 0 replies; 46+ messages in thread
From: Vasily Khoruzhick @ 2011-08-24  8:16 UTC (permalink / raw)
  To: linux-arm-kernel

On Monday 15 August 2011 15:31:44 Vasily Khoruzhick wrote:
> On Monday 15 August 2011 15:09:14 Catalin Marinas wrote:
> > Hi Vasily,
> > 
> > On Sat, Aug 13, 2011 at 01:58:19PM +0100, Vasily Khoruzhick wrote:
> > > It was introduced by  407f8b4cb07cbc5c1c7cc386f231224e2524ccea
> > > ARM: LPAE: MMU setup for the 3-level page table format
> > > 
> > > Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
> > > ---
> > > 
> > >  arch/arm/kernel/head.S |    4 ++--
> > >  1 files changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
> > > index 0bdafc4..5add5f5 100644
> > > --- a/arch/arm/kernel/head.S
> > > +++ b/arch/arm/kernel/head.S
> > > 
> > > @@ -206,7 +206,7 @@ __create_page_tables:
> > >  1:	orr	r3, r7, r5, lsl #SECTION_SHIFT	@ flags + kernel base
> > >  
> > >  	str	r3, [r4, r5, lsl #PMD_ORDER]	@ identity mapping
> > >  	cmp	r5, r6
> > > 
> > > -	addlo	r5, r5, #SECTION_SHIFT >> 20	@ next section
> > > +	addlo	r5, r5, #1			@ next section
> > > 
> > >  	blo	1b
> > 
> > That's correct.
> > 
> > >  	/*
> > > 
> > > @@ -217,7 +217,7 @@ __create_page_tables:
> > >  	mov	r3, r3, lsr #SECTION_SHIFT
> > >  	orr	r3, r7, r3, lsl #SECTION_SHIFT
> > >  	add	r0, r4,  #(KERNEL_START & 0xff000000) >> (SECTION_SHIFT - PMD_ORDER)
> > > 
> > > -	str	r3, [r0, #(KERNEL_START & 0x00e00000) >> (SECTION_SHIFT - PMD_ORDER)]!
> > > +	str	r3, [r0, #(KERNEL_START & 0x00f00000) >> (SECTION_SHIFT - PMD_ORDER)]!
> > 
> > The reason for this was that the sections are 2MB with LPAE and a page
> > table entry is 64-bit wide. We always shift that value by 18 but with
> > LPAE we don't want to write in the middle of a page table entry if
> > KERNEL_START is not 2MB aligned.
> > 
> > But if KERNEL_START is not 2MB aligned, I think we get the wrong
> > physical address by 1MB (with the classic page table format).
> 
> Yep, for my case KERNEL_START is not 2MB aligned (TEXT_OFFSET is 0x00108000). 
> (CONFIG_PM_H1940 is set)
> 
> > There are a few alternatives to fixing this:
> > 
> > 1. Different KERNEL_START masking for classic or LPAE page tables.
> > 2. Always force 2MB section and the code above moving the phys addr into
> >    r3 would need to take this into account.
> > 
> > I would go for 1 with some shifting like below:
> > 
> > +	str	r3, [r0, #((KERNEL_START & 0x00f00000) >> SECTION_SHIFT) << PMD_ORDER]!
> > 
> > This should give us 0x00f00000 with classic page tables and 0x00e00000
> > with LPAE.
> > 
> > Thanks.
> 
> Ok, I'll test this change when I get home.

Works for me on s3c24xx. Sorry for the delay.

Regards
Vasily

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v7 08/16] ARM: LPAE: Page table maintenance for the 3-level format
  2011-08-10 15:03 ` [PATCH v7 08/16] ARM: LPAE: Page table maintenance for the 3-level format Catalin Marinas
@ 2011-10-23 11:56   ` Russell King - ARM Linux
  2011-10-23 12:49     ` Catalin Marinas
  0 siblings, 1 reply; 46+ messages in thread
From: Russell King - ARM Linux @ 2011-10-23 11:56 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 10, 2011 at 04:03:31PM +0100, Catalin Marinas wrote:
> This patch modifies the pgd/pmd/pte manipulation functions to support
> the 3-level page table format. Since there is no need for an 'ext'
> argument to cpu_set_pte_ext(), this patch conditionally defines a
> different prototype for this function when CONFIG_ARM_LPAE.

This has a really large number of ifdefs.  You've split the 2 and 3
level page table stuff into two different header files already,
conditionalized on CONFIG_ARM_LPAE, yet we still end up with lots of
junk in the common header file conditionalized on that symbol.  Can't
we find a way to restructure pgtable.h to sort this out more cleanly?

Do we really need to change the set_pte_ext() prototype as well - do
we _really_ need ifdefs around its declaration, and every usage of it
as well?  Can't we just leave the 3rd argument as zero?

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v7 11/16] ARM: LPAE: Add fault handling support
  2011-08-10 15:03 ` [PATCH v7 11/16] ARM: LPAE: Add fault handling support Catalin Marinas
@ 2011-10-23 11:57   ` Russell King - ARM Linux
  2011-11-02 17:02     ` Catalin Marinas
  0 siblings, 1 reply; 46+ messages in thread
From: Russell King - ARM Linux @ 2011-10-23 11:57 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 10, 2011 at 04:03:34PM +0100, Catalin Marinas wrote:
> diff --git a/arch/arm/mm/alignment.c b/arch/arm/mm/alignment.c
> index be7c638..f0bf61a 100644
> --- a/arch/arm/mm/alignment.c
> +++ b/arch/arm/mm/alignment.c
> @@ -909,6 +909,12 @@ do_alignment(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
>  	return 0;
>  }
>  
> +#ifdef CONFIG_ARM_LPAE
> +#define ALIGNMENT_FAULT		33
> +#else
> +#define ALIGNMENT_FAULT		1
> +#endif

Probably makes sense to move this into a header, along with the other
fault codes, other FSR bits and fsr_fs().
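
Roughly what such a header could contain, written as a standalone sketch
so both decodings can be compared side by side (the _classic/_lpae
suffixes are made up here; in the kernel a single fsr_fs() would be
selected by CONFIG_ARM_LPAE):

```c
#include <assert.h>

#define FSR_FS4		(1u << 10)
#define FSR_FS3_0	(15u)
#define FSR_FS5_0	(0x3fu)

/* Classic format: status is FS[3:0] plus FS[4], which lives in bit 10. */
static unsigned int fsr_fs_classic(unsigned int fsr)
{
	unsigned int fs = (fsr & FSR_FS3_0) | ((fsr & FSR_FS4) >> 6);

	assert(fs < 32);	/* indexes a 32-entry table */
	return fs;
}

/* LPAE format: status is the contiguous field in bits [5:0]. */
static unsigned int fsr_fs_lpae(unsigned int fsr)
{
	return fsr & FSR_FS5_0;
}

/* Alignment fault codes, as in the quoted patch. */
#define ALIGNMENT_FAULT_CLASSIC	1
#define ALIGNMENT_FAULT_LPAE	33
```

With both pieces in one place, alignment.c would not need its own #ifdef
for the fault number.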

> +
>  /*
>   * This needs to be done after sysctl_init, otherwise sys/ will be
>   * overwritten.  Actually, this shouldn't be in sys/ at all since
> @@ -942,7 +948,7 @@ static int __init alignment_init(void)
>  		ai_usermode = UM_FIXUP;
>  	}
>  
> -	hook_fault_code(1, do_alignment, SIGBUS, BUS_ADRALN,
> +	hook_fault_code(ALIGNMENT_FAULT, do_alignment, SIGBUS, BUS_ADRALN,
>  			"alignment exception");
>  
>  	/*
> diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
> index 3b5ea68..91d1768 100644
> --- a/arch/arm/mm/fault.c
> +++ b/arch/arm/mm/fault.c
> @@ -33,10 +33,15 @@
>  #define FSR_WRITE		(1 << 11)
>  #define FSR_FS4			(1 << 10)
>  #define FSR_FS3_0		(15)
> +#define FSR_FS5_0		(0x3f)
>  
>  static inline int fsr_fs(unsigned int fsr)
>  {
> +#ifdef CONFIG_ARM_LPAE
> +	return fsr & FSR_FS5_0;
> +#else
>  	return (fsr & FSR_FS3_0) | (fsr & FSR_FS4) >> 6;
> +#endif
>  }
>  
>  #ifdef CONFIG_MMU
> @@ -122,8 +127,10 @@ void show_pte(struct mm_struct *mm, unsigned long addr)
>  
>  		pte = pte_offset_map(pmd, addr);
>  		printk(", *pte=%08llx", (long long)pte_val(*pte));
> +#ifndef CONFIG_ARM_LPAE
>  		printk(", *ppte=%08llx",
>  		       (long long)pte_val(pte[PTE_HWTABLE_PTRS]));
> +#endif
>  		pte_unmap(pte);
>  	} while(0);
>  
> @@ -440,6 +447,12 @@ do_translation_fault(unsigned long addr, unsigned int fsr,
>  	pmd = pmd_offset(pud, addr);
>  	pmd_k = pmd_offset(pud_k, addr);
>  
> +#ifdef CONFIG_ARM_LPAE
> +	/*
> +	 * Only one hardware entry per PMD with LPAE.
> +	 */
> +	index = 0;
> +#else
>  	/*
>  	 * On ARM one Linux PGD entry contains two hardware entries (see page
>  	 * tables layout in pgtable.h). We normally guarantee that we always
> @@ -449,6 +462,7 @@ do_translation_fault(unsigned long addr, unsigned int fsr,
>  	 * for the first of pair.
>  	 */
>  	index = (addr >> SECTION_SHIFT) & 1;
> +#endif
>  	if (pmd_none(pmd_k[index]))
>  		goto bad_area;
>  
> @@ -494,6 +508,72 @@ static struct fsr_info {
>  	int	code;
>  	const char *name;
>  } fsr_info[] = {
> +#ifdef CONFIG_ARM_LPAE
> +	{ do_bad,		SIGBUS,  0,		"unknown 0"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 1"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 2"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 3"			},
> +	{ do_bad,		SIGBUS,  0,		"reserved translation fault"	},
> +	{ do_translation_fault,	SIGSEGV, SEGV_MAPERR,	"level 1 translation fault"	},
> +	{ do_translation_fault,	SIGSEGV, SEGV_MAPERR,	"level 2 translation fault"	},
> +	{ do_page_fault,	SIGSEGV, SEGV_MAPERR,	"level 3 translation fault"	},
> +	{ do_bad,		SIGBUS,  0,		"reserved access flag fault"	},
> +	{ do_bad,		SIGSEGV, SEGV_ACCERR,	"level 1 access flag fault"	},
> +	{ do_bad,		SIGSEGV, SEGV_ACCERR,	"level 2 access flag fault"	},
> +	{ do_page_fault,	SIGSEGV, SEGV_ACCERR,	"level 3 access flag fault"	},
> +	{ do_bad,		SIGBUS,  0,		"reserved permission fault"	},
> +	{ do_bad,		SIGSEGV, SEGV_ACCERR,	"level 1 permission fault"	},
> +	{ do_sect_fault,	SIGSEGV, SEGV_ACCERR,	"level 2 permission fault"	},
> +	{ do_page_fault,	SIGSEGV, SEGV_ACCERR,	"level 3 permission fault"	},
> +	{ do_bad,		SIGBUS,  0,		"synchronous external abort"	},
> +	{ do_bad,		SIGBUS,  0,		"asynchronous external abort"	},
> +	{ do_bad,		SIGBUS,  0,		"unknown 18"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 19"			},
> +	{ do_bad,		SIGBUS,  0,		"synchronous abort (translation table walk)" },
> +	{ do_bad,		SIGBUS,  0,		"synchronous abort (translation table walk)" },
> +	{ do_bad,		SIGBUS,  0,		"synchronous abort (translation table walk)" },
> +	{ do_bad,		SIGBUS,  0,		"synchronous abort (translation table walk)" },
> +	{ do_bad,		SIGBUS,  0,		"synchronous parity error"	},
> +	{ do_bad,		SIGBUS,  0,		"asynchronous parity error"	},
> +	{ do_bad,		SIGBUS,  0,		"unknown 26"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 27"			},
> +	{ do_bad,		SIGBUS,  0,		"synchronous parity error (translation table walk" },
> +	{ do_bad,		SIGBUS,  0,		"synchronous parity error (translation table walk" },
> +	{ do_bad,		SIGBUS,  0,		"synchronous parity error (translation table walk" },
> +	{ do_bad,		SIGBUS,  0,		"synchronous parity error (translation table walk" },
> +	{ do_bad,		SIGBUS,  0,		"unknown 32"			},
> +	{ do_bad,		SIGBUS,  BUS_ADRALN,	"alignment fault"		},
> +	{ do_bad,		SIGBUS,  0,		"debug event"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 35"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 36"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 37"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 38"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 39"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 40"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 41"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 42"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 43"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 44"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 45"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 46"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 47"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 48"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 49"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 50"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 51"			},
> +	{ do_bad,		SIGBUS,  0,		"implementation fault (lockdown abort)" },
> +	{ do_bad,		SIGBUS,  0,		"unknown 53"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 54"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 55"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 56"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 57"			},
> +	{ do_bad,		SIGBUS,  0,		"implementation fault (coprocessor abort)" },
> +	{ do_bad,		SIGBUS,  0,		"unknown 59"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 60"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 61"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 62"			},
> +	{ do_bad,		SIGBUS,  0,		"unknown 63"			},
> +#else	/* !CONFIG_ARM_LPAE */
>  	/*
>  	 * The following are the standard ARMv3 and ARMv4 aborts.  ARMv5
>  	 * defines these to be "precise" aborts.
> @@ -535,6 +615,7 @@ static struct fsr_info {
>  	{ do_bad,		SIGBUS,  0,		"unknown 29"			   },
>  	{ do_bad,		SIGBUS,  0,		"unknown 30"			   },
>  	{ do_bad,		SIGBUS,  0,		"unknown 31"			   }
> +#endif	/* CONFIG_ARM_LPAE */

Can't we do better than this?

>  };
>  
>  void __init
> @@ -573,6 +654,9 @@ do_DataAbort(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
>  }
>  
>  
> +#ifdef CONFIG_ARM_LPAE
> +#define ifsr_info	fsr_info
> +#else	/* !CONFIG_ARM_LPAE */
>  static struct fsr_info ifsr_info[] = {
>  	{ do_bad,		SIGBUS,  0,		"unknown 0"			   },
>  	{ do_bad,		SIGBUS,  0,		"unknown 1"			   },
> @@ -607,6 +691,7 @@ static struct fsr_info ifsr_info[] = {
>  	{ do_bad,		SIGBUS,  0,		"unknown 30"			   },
>  	{ do_bad,		SIGBUS,  0,		"unknown 31"			   },
>  };
> +#endif	/* CONFIG_ARM_LPAE */
>  
>  void __init
>  hook_ifault_code(int nr, int (*fn)(unsigned long, unsigned int, struct pt_regs *),
> @@ -642,6 +727,7 @@ do_PrefetchAbort(unsigned long addr, unsigned int ifsr, struct pt_regs *regs)
>  
>  static int __init exceptions_init(void)
>  {
> +#ifndef CONFIG_ARM_LPAE
>  	if (cpu_architecture() >= CPU_ARCH_ARMv6) {
>  		hook_fault_code(4, do_translation_fault, SIGSEGV, SEGV_MAPERR,
>  				"I-cache maintenance fault");
> @@ -657,6 +743,7 @@ static int __init exceptions_init(void)
>  		hook_fault_code(6, do_bad, SIGSEGV, SEGV_MAPERR,
>  				"section access flag fault");
>  	}
> +#endif
>  
>  	return 0;
>  }

Do we even need exceptions_init() at all for LPAE?  If not, can't we
avoid the whole of this including the useless init call.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v7 13/16] ARM: LPAE: Add identity mapping support for the 3-level page table format
  2011-08-10 15:03 ` [PATCH v7 13/16] ARM: LPAE: Add identity mapping support for the 3-level page table format Catalin Marinas
@ 2011-10-23 11:59   ` Russell King - ARM Linux
  0 siblings, 0 replies; 46+ messages in thread
From: Russell King - ARM Linux @ 2011-10-23 11:59 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 10, 2011 at 04:03:36PM +0100, Catalin Marinas wrote:
> The pmd_addr_end() definition has been removed and the
> generic one used instead.

NAK - it helps to read code comments.

commit c0ba10b512eb2e2a3888b6e6cc0e089f5e7a191b
Author: Russell King <rmk+kernel@arm.linux.org.uk>
Date:   Sun Nov 21 14:42:47 2010 +0000

    ARM: improve compiler's ability to optimize page tables
    
    Allow the compiler to better optimize the page table walking code
    by avoiding over-complex pmd_addr_end() calculations.  These
    calculations prevent the compiler spotting that we'll never iterate
    over the PMD table, causing it to create double nested loops where
    a single loop will do.
    
    Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index b155414..53d1d5d 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -374,6 +374,9 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
 
 #define pmd_page(pmd)          pfn_to_page(__phys_to_pfn(pmd_val(pmd)))
 
+/* we don't need complex calculations here as the pmd is folded into the pgd */
+#define pmd_addr_end(addr,end) (end)
+
 /*
  * Conversion functions: convert a page and protection to a page entry,
  * and a page entry and page directory to the page they refer to.
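
To illustrate the difference, here is a host-side sketch comparing the
generic calculation with the folded one from the commit above. PMD_SHIFT
of 21 is assumed, and the function names and count_steps() helper exist
only for this example.

```c
#include <assert.h>

#define PMD_SHIFT	21			/* assumed for illustration */
#define PMD_SIZE	(1UL << PMD_SHIFT)
#define PMD_MASK	(~(PMD_SIZE - 1))

/*
 * Generic form: stop at the next PMD boundary or at end, whichever comes
 * first. The "- 1" keeps the comparison correct if a value wraps to 0.
 */
static unsigned long pmd_addr_end_generic(unsigned long addr, unsigned long end)
{
	unsigned long boundary = (addr + PMD_SIZE) & PMD_MASK;

	return (boundary - 1 < end - 1) ? boundary : end;
}

/* Folded form: one step always covers the whole [addr, end) range. */
static unsigned long pmd_addr_end_folded(unsigned long addr, unsigned long end)
{
	(void)addr;
	return end;
}

/* Count how many iterations a walker loop would make over [addr, end). */
static unsigned int count_steps(unsigned long addr, unsigned long end,
				unsigned long (*next)(unsigned long, unsigned long))
{
	unsigned int n = 0;

	assert(addr != end);
	do {
		addr = next(addr, end);
		n++;
	} while (addr != end);
	return n;
}
```

For a 4MB range that is not PMD aligned, the generic version takes three
steps where the folded one takes a single step, which is what lets the
compiler drop the inner loop.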

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v7 15/16] ARM: LPAE: add support for ATAG_MEM64
  2011-08-10 15:03 ` [PATCH v7 15/16] ARM: LPAE: add support for ATAG_MEM64 Catalin Marinas
@ 2011-10-23 11:59   ` Russell King - ARM Linux
  0 siblings, 0 replies; 46+ messages in thread
From: Russell King - ARM Linux @ 2011-10-23 11:59 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 10, 2011 at 04:03:38PM +0100, Catalin Marinas wrote:
> From: Will Deacon <will.deacon@arm.com>
> 
> LPAE provides support for memory banks with physical addresses of up
> to 40 bits.
> 
> This patch adds a new atag, ATAG_MEM64, so that the Kernel can be
> informed about memory that exists above the 4GB boundary.
> 
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
>  arch/arm/include/asm/setup.h |    8 ++++++++
>  arch/arm/kernel/setup.c      |   10 ++++++++++
>  2 files changed, 18 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/arm/include/asm/setup.h b/arch/arm/include/asm/setup.h
> index 915696d..a3ca303 100644
> --- a/arch/arm/include/asm/setup.h
> +++ b/arch/arm/include/asm/setup.h
> @@ -43,6 +43,13 @@ struct tag_mem32 {
>  	__u32	start;	/* physical start address */
>  };
>  
> +#define ATAG_MEM64	0x54420002
> +
> +struct tag_mem64 {
> +	__u64	size;
> +	__u64	start;	/* physical start address */
> +};
> +
>  /* VGA text type displays */
>  #define ATAG_VIDEOTEXT	0x54410003
>  
> @@ -148,6 +155,7 @@ struct tag {
>  	union {
>  		struct tag_core		core;
>  		struct tag_mem32	mem;
> +		struct tag_mem64	mem64;
>  		struct tag_videotext	videotext;
>  		struct tag_ramdisk	ramdisk;
>  		struct tag_initrd	initrd;
> diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
> index 70bca64..a126558 100644
> --- a/arch/arm/kernel/setup.c
> +++ b/arch/arm/kernel/setup.c
> @@ -608,6 +608,16 @@ static int __init parse_tag_mem32(const struct tag *tag)
>  
>  __tagtable(ATAG_MEM, parse_tag_mem32);
>  
> +#ifdef CONFIG_PHYS_ADDR_T_64BIT
> +static int __init parse_tag_mem64(const struct tag *tag)
> +{
> +	/* We only use 32-bits for the size. */
> +	return arm_add_memory(tag->u.mem64.start, (unsigned long)tag->u.mem64.size);
> +}
> +
> +__tagtable(ATAG_MEM64, parse_tag_mem64);
> +#endif /* CONFIG_PHYS_ADDR_T_64BIT */
> +

We should allow this even on 32-bit only kernels - but avoiding adding
>32-bit memory to the system in that case.
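
One possible shape for that check, as a host-side sketch only
(mem64_usable_size() and PHYS32_LIMIT are hypothetical names, not existing
kernel symbols): trim each bank to the part below 4GB and ignore banks
that start above it.

```c
#include <assert.h>
#include <stdint.h>

#define PHYS32_LIMIT	(1ULL << 32)	/* first address a 32-bit phys_addr_t cannot hold */

/*
 * Hypothetical helper: trim a 64-bit bank to what a 32-bit-only kernel
 * can address. Returns the usable size in bytes, or 0 if the whole bank
 * has to be ignored.
 */
static uint64_t mem64_usable_size(uint64_t start, uint64_t size)
{
	if (start >= PHYS32_LIMIT)
		return 0;			/* bank lies entirely above 4GB */
	if (size > PHYS32_LIMIT - start)
		size = PHYS32_LIMIT - start;	/* drop the part above 4GB */
	assert(start + size <= PHYS32_LIMIT);
	return size;
}
```

A parse_tag_mem64() built without CONFIG_PHYS_ADDR_T_64BIT could then
pass the trimmed bank to arm_add_memory() and skip it when the result is
zero.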

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v7 16/16] ARM: LPAE: Add the Kconfig entries
  2011-08-10 15:03 ` [PATCH v7 16/16] ARM: LPAE: Add the Kconfig entries Catalin Marinas
@ 2011-10-23 12:00   ` Russell King - ARM Linux
  2011-11-02 17:21   ` Russell King - ARM Linux
  1 sibling, 0 replies; 46+ messages in thread
From: Russell King - ARM Linux @ 2011-10-23 12:00 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 10, 2011 at 04:03:39PM +0100, Catalin Marinas wrote:
> diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
> index 88633fe..2df5504 100644
> --- a/arch/arm/mm/Kconfig
> +++ b/arch/arm/mm/Kconfig
> @@ -629,6 +629,19 @@ config IO_36
>  
>  comment "Processor Features"
>  
> +config ARM_LPAE
> +	bool "Support for the Large Physical Address Extension"
> +	depends on MMU && CPU_V7
> +	help
> +	  Say Y if you have an ARMv7 processor supporting the LPAE page table
> +	  format and you would like to access memory beyond the 4GB limit.

This help text is entirely insufficient.  It doesn't tell people what the
implications are for enabling this option on CPUs without LPAE support.
(It won't boot.)  It doesn't suggest what people should select when they're
unsure what setting to use.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v7 08/16] ARM: LPAE: Page table maintenance for the 3-level format
  2011-10-23 11:56   ` Russell King - ARM Linux
@ 2011-10-23 12:49     ` Catalin Marinas
  0 siblings, 0 replies; 46+ messages in thread
From: Catalin Marinas @ 2011-10-23 12:49 UTC (permalink / raw)
  To: linux-arm-kernel

On 23 October 2011 13:56, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Wed, Aug 10, 2011 at 04:03:31PM +0100, Catalin Marinas wrote:
>> This patch modifies the pgd/pmd/pte manipulation functions to support
>> the 3-level page table format. Since there is no need for an 'ext'
>> argument to cpu_set_pte_ext(), this patch conditionally defines a
>> different prototype for this function when CONFIG_ARM_LPAE.
>
> This has a really large number of ifdefs.  You've split the 2 and 3
> level page table stuff into two different header files already,
> conditionalized on CONFIG_ARM_LPAE, yet we still end up with lots of
> junk in the common header file conditionalized on that symbol.  Can't
> we find a way to restructure pgtable.h to sort this out more cleanly?

I'll look into this.

> Do we really need to change the set_pte_ext() prototype as well - do
> we _really_ need ifdefs around its declaration, and every usage of it
> as well?  Can't we just leave the 3rd argument as zero?

The LPAE variant of cpu_v7_set_pte_ext takes the second argument as a
64-bit value:

r0 - ptep
r2, r3 - pteval

If we pass a third argument, that would go on the stack.
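
In C terms, the two prototypes look like the sketch below. The register
comments follow the ARM procedure call standard; the function names and
the pteval_t typedef are placeholders for illustration, and the bodies
only stand in for the real assembly.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t pteval_t;	/* 64-bit descriptor, as with LPAE */

/*
 * The first four 32-bit argument slots are r0-r3, and a 64-bit argument
 * takes an even/odd register pair:
 *   ptep -> r0, (r1 skipped), pte -> r2:r3
 */
static void set_pte_2args(pteval_t *ptep, pteval_t pte)
{
	assert(ptep != 0);
	*ptep = pte;
}

/* r0-r3 are already consumed, so ext would be passed on the stack. */
static void set_pte_3args(pteval_t *ptep, pteval_t pte, unsigned int ext)
{
	assert(ptep != 0);
	(void)ext;		/* unused with LPAE */
	*ptep = pte;
}
```

Keeping the two-argument form avoids the stack store and load for a
value the LPAE variant never reads.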

-- 
Catalin

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v7 11/16] ARM: LPAE: Add fault handling support
  2011-10-23 11:57   ` Russell King - ARM Linux
@ 2011-11-02 17:02     ` Catalin Marinas
  0 siblings, 0 replies; 46+ messages in thread
From: Catalin Marinas @ 2011-11-02 17:02 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Oct 23, 2011 at 12:57:56PM +0100, Russell King - ARM Linux wrote:
> On Wed, Aug 10, 2011 at 04:03:34PM +0100, Catalin Marinas wrote:
> > @@ -494,6 +508,72 @@ static struct fsr_info {
> >  	int	code;
> >  	const char *name;
> >  } fsr_info[] = {
> > +#ifdef CONFIG_ARM_LPAE
...
> > +#else	/* !CONFIG_ARM_LPAE */
> >  	/*
> >  	 * The following are the standard ARMv3 and ARMv4 aborts.  ARMv5
> >  	 * defines these to be "precise" aborts.
> > @@ -535,6 +615,7 @@ static struct fsr_info {
> >  	{ do_bad,		SIGBUS,  0,		"unknown 29"			   },
> >  	{ do_bad,		SIGBUS,  0,		"unknown 30"			   },
> >  	{ do_bad,		SIGBUS,  0,		"unknown 31"			   }
> > +#endif	/* CONFIG_ARM_LPAE */
> 
> Can't we do better than this?

My first thought was to define the fsr_info array in a different file,
but that would mean exposing functions from fault.c which are
currently defined as static.

The other variant is including a C file directly in fault.c. It's not
very nice but at least we remove the big #ifdef.

-- 
Catalin


* [PATCH v7 16/16] ARM: LPAE: Add the Kconfig entries
  2011-08-10 15:03 ` [PATCH v7 16/16] ARM: LPAE: Add the Kconfig entries Catalin Marinas
  2011-10-23 12:00   ` Russell King - ARM Linux
@ 2011-11-02 17:21   ` Russell King - ARM Linux
  2011-11-02 18:07     ` Catalin Marinas
  1 sibling, 1 reply; 46+ messages in thread
From: Russell King - ARM Linux @ 2011-11-02 17:21 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 10, 2011 at 04:03:39PM +0100, Catalin Marinas wrote:
> +config ARCH_DMA_ADDR_T_64BIT
> +	def_bool ARM_LPAE
> +

I think this should be selected only when we have a DMA engine supporting
64-bit addresses.  Technically LPAE itself doesn't give us that assurance.

If you have this kind of a setup:

CPU <==++==> RAM
       ||
     IOMMU
       |
   DMA device

where == and || mean >32-bit addressing, and | means 32-bit addressing.

In such a setup, having dma_addr_t be 64-bit is pointless because it
should never see 64-bit addresses (the DMA API should deal with the
IOMMU and provide a list of DMA addresses to be placed into RAM for
the DMA device which takes account of the mappings set up in the IOMMU.)

So, I think 64-bit dma_addr_t should be a property of the DMA devices
present in the system rather than whether the CPU's MMU can deal with
>32-bit addresses or not.


* [PATCH v7 16/16] ARM: LPAE: Add the Kconfig entries
  2011-11-02 17:21   ` Russell King - ARM Linux
@ 2011-11-02 18:07     ` Catalin Marinas
  0 siblings, 0 replies; 46+ messages in thread
From: Catalin Marinas @ 2011-11-02 18:07 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Nov 02, 2011 at 05:21:42PM +0000, Russell King - ARM Linux wrote:
> On Wed, Aug 10, 2011 at 04:03:39PM +0100, Catalin Marinas wrote:
> > +config ARCH_DMA_ADDR_T_64BIT
> > +	def_bool ARM_LPAE
> > +
> 
> I think this should be selected only when we have a DMA engine supporting
> 64-bit addresses.  Technically LPAE itself doesn't give us that assurance.
> 
> If you have this kind of a setup:
> 
> CPU <==++==> RAM
>        ||
>      IOMMU
>        |
>    DMA device
> 
> where == and || mean >32-bit addressing, and | means 32-bit addressing.

That's one configuration but there are other configurations with PCI
devices that can access 64-bit addresses without requiring an IOMMU.

> In such a setup, having dma_addr_t be 64-bit is pointless because it
> should never see 64-bit addresses (the DMA API should deal with the
> IOMMU and provide a list of DMA addresses to be placed into RAM for
> the DMA device which takes account of the mappings set up in the IOMMU.)
> 
> So, I think 64-bit dma_addr_t should be a property of the DMA devices
> present in the system rather than whether the CPU's MMU can deal with
> >32-bit addresses or not.

I agree, that's a property of the device but it doesn't mean that we
can't have a dma_addr_t as u64.

The current DMA allocator uses pages that can be placed anywhere if the
mask is 0xffffffff, so making this type u32 does not change where such
pages are allocated; it only drops the top bits of the bus address. What
I think we need for this case is GFP_DMA32, assuming that ZONE_DMA32 is
set up:

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index e4e7f6c..a51026b 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -81,6 +81,8 @@ static struct page *__dma_alloc_buffer(struct device *dev, size_t size, gfp_t gf
 
 	if (mask < 0xffffffffULL)
 		gfp |= GFP_DMA;
+	else if (mask == 0xffffffffULL)
+		gfp |= GFP_DMA32;
 
 	page = alloc_pages(gfp, order);
 	if (!page)
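
The mask test in the diff above, and the truncation a 32-bit dma_addr_t
would cause, can be tried outside the kernel. A minimal sketch under
stated assumptions: the SKETCH_* flag values are placeholders rather
than the kernel's real gfp_t bits, both function names are hypothetical,
and a non-zero mask is assumed:

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder flag bits for this sketch only; not the kernel's gfp_t values. */
#define SKETCH_GFP_DMA		0x1u
#define SKETCH_GFP_DMA32	0x2u

/*
 * Mirror the mask test from the diff: a mask narrower than 32 bits asks
 * for the DMA zone, a mask of exactly 32 bits asks for the DMA32 zone,
 * and a wider mask leaves the flags unchanged.
 */
static unsigned int sketch_dma_gfp(uint64_t mask, unsigned int gfp)
{
	/* Assumption of this sketch: the caller never passes a zero mask. */
	assert(mask != 0);

	if (mask < 0xffffffffULL)
		gfp |= SKETCH_GFP_DMA;
	else if (mask == 0xffffffffULL)
		gfp |= SKETCH_GFP_DMA32;
	return gfp;
}

/* What a 32-bit dma_addr_t would keep of a wider bus address. */
static uint32_t sketch_truncate_bus_addr(uint64_t bus_addr)
{
	return (uint32_t)bus_addr;
}
```

For example, a page at bus address 0x880000000 stored in a 32-bit type
becomes 0x80000000, which is why restricting where the page is allocated
matters more than the width of the type.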

-- 
Catalin


end of thread, other threads:[~2011-11-02 18:07 UTC | newest]

Thread overview: 46+ messages
2011-08-10 15:03 [PATCH v7 00/16] ARM: Add support for the Large Physical Address Extensions Catalin Marinas
2011-08-10 15:03 ` [PATCH v7 01/16] ARM: LPAE: add ISBs around MMU enabling code Catalin Marinas
2011-08-10 15:03 ` [PATCH v7 02/16] ARM: LPAE: Cast the dma_addr_t argument to unsigned long in dma_to_virt Catalin Marinas
2011-08-13 14:33   ` Russell King - ARM Linux
2011-08-23 11:15     ` Russell King - ARM Linux
2011-08-10 15:03 ` [PATCH v7 03/16] ARM: LPAE: Use PMD_(SHIFT|SIZE|MASK) instead of PGDIR_* Catalin Marinas
2011-08-13 14:34   ` Russell King - ARM Linux
2011-08-15 16:48   ` Catalin Marinas
2011-08-23 11:15   ` Russell King - ARM Linux
2011-08-23 13:09     ` Catalin Marinas
2011-08-10 15:03 ` [PATCH v7 04/16] ARM: LPAE: Factor out 2-level page table definitions into separate files Catalin Marinas
2011-08-10 15:03 ` [PATCH v7 05/16] ARM: LPAE: Add (pte|pmd)val_t type definitions as u32 Catalin Marinas
2011-08-10 15:03 ` [PATCH v7 06/16] ARM: LPAE: Use a mask for physical addresses in page table entries Catalin Marinas
2011-08-10 15:03 ` [PATCH v7 07/16] ARM: LPAE: Introduce the 3-level page table format definitions Catalin Marinas
2011-08-10 15:03 ` [PATCH v7 08/16] ARM: LPAE: Page table maintenance for the 3-level format Catalin Marinas
2011-10-23 11:56   ` Russell King - ARM Linux
2011-10-23 12:49     ` Catalin Marinas
2011-08-10 15:03 ` [PATCH v7 09/16] ARM: LPAE: MMU setup for the 3-level page table format Catalin Marinas
2011-08-13 11:49   ` Vasily Khoruzhick
2011-08-13 12:56   ` Vasily Khoruzhick
2011-08-13 12:58     ` [PATCH] Fix non-LPAE boot regression Vasily Khoruzhick
2011-08-13 14:14       ` Catalin Marinas
2011-08-13 14:39         ` Russell King - ARM Linux
2011-08-13 14:45           ` Catalin Marinas
2011-08-15 11:41           ` Catalin Marinas
2011-08-15 12:09       ` Catalin Marinas
2011-08-15 12:31         ` Vasily Khoruzhick
2011-08-24  8:16           ` Vasily Khoruzhick
2011-08-15 16:51   ` [PATCH v7 09/16] ARM: LPAE: MMU setup for the 3-level page table format Catalin Marinas
2011-08-19 10:25   ` Ian Campbell
2011-08-19 11:10     ` Catalin Marinas
2011-08-19 11:47       ` Ian Campbell
2011-08-10 15:03 ` [PATCH v7 10/16] ARM: LPAE: Invalidate the TLB before freeing the PMD Catalin Marinas
2011-08-10 15:03 ` [PATCH v7 11/16] ARM: LPAE: Add fault handling support Catalin Marinas
2011-10-23 11:57   ` Russell King - ARM Linux
2011-11-02 17:02     ` Catalin Marinas
2011-08-10 15:03 ` [PATCH v7 12/16] ARM: LPAE: Add context switching support Catalin Marinas
2011-08-10 15:03 ` [PATCH v7 13/16] ARM: LPAE: Add identity mapping support for the 3-level page table format Catalin Marinas
2011-10-23 11:59   ` Russell King - ARM Linux
2011-08-10 15:03 ` [PATCH v7 14/16] ARM: LPAE: mark memory banks with start > ULONG_MAX as highmem Catalin Marinas
2011-08-10 15:03 ` [PATCH v7 15/16] ARM: LPAE: add support for ATAG_MEM64 Catalin Marinas
2011-10-23 11:59   ` Russell King - ARM Linux
2011-08-10 15:03 ` [PATCH v7 16/16] ARM: LPAE: Add the Kconfig entries Catalin Marinas
2011-10-23 12:00   ` Russell King - ARM Linux
2011-11-02 17:21   ` Russell King - ARM Linux
2011-11-02 18:07     ` Catalin Marinas
