* [PATCH v2 0/2] flag contiguous PTEs in linear mapping
  From: Jeremy Linton @ 2016-02-17 16:29 UTC
  To: linux-arm-kernel

This is a rebase of the previous contiguous-PTEs-in-linear-map patches on
top of Mark Rutland's fixmap changes. Those changes appear to be sufficient
to allow this patch set to boot on Juno R2, Seattle and the X-Gene/m400.
I've also done basic testing with RODATA turned on, in all cases on ACPI
systems.

This patch set also adds the ability to align 64k kernels on the 2M CONT
boundary, which helps to ensure that a number of the sections are completely
mapped with CONT bits. There remain a number of holes due to smaller
remapping operations that don't really affect the page protection states.
These could be worked around in a number of cases if the code were smart
enough to detect that the break doesn't result in an actual change in
permissions. Some of these are visible in the included example, where it's
not initially obvious why a contiguous 2M region isn't marked as such until
you dig into it.

Changed v1->v2:
  16k kernels now use CONT_SIZE rather than SECTION_SIZE for alignment.

With 64k pages and section alignment enabled, the kernel mapping looks like:

---[ Kernel Mapping ]---
0xfffffe0000000000-0xfffffe0000200000      2M  RW NX SHD AF CON UXN MEM/NORMAL
0xfffffe0000200000-0xfffffe0001200000     16M  ro x  SHD AF CON UXN MEM/NORMAL
0xfffffe0001200000-0xfffffe0001400000      2M  RW NX SHD AF CON UXN MEM/NORMAL
0xfffffe0001400000-0xfffffe0001600000      2M  RW NX SHD AF     UXN MEM/NORMAL
0xfffffe0001600000-0xfffffe0002600000     16M  RW NX SHD AF CON UXN MEM/NORMAL
0xfffffe0002600000-0xfffffe0002800000      2M  RW NX SHD AF     UXN MEM/NORMAL
0xfffffe0002800000-0xfffffe0020000000    472M  RW NX SHD AF CON UXN MEM/NORMAL
0xfffffe0020000000-0xfffffe0060000000      1G  RW NX SHD AF BLK UXN MEM/NORMAL
0xfffffe00600f0000-0xfffffe0060200000   1088K  RW NX SHD AF     UXN MEM/NORMAL
0xfffffe0060200000-0xfffffe0076400000    354M  RW NX SHD AF CON UXN MEM/NORMAL
0xfffffe0076400000-0xfffffe0076600000      2M  RW NX SHD AF     UXN MEM/NORMAL
0xfffffe0076600000-0xfffffe0078e00000     40M  RW NX SHD AF CON UXN MEM/NORMAL
0xfffffe00793b0000-0xfffffe0079400000    320K  RW NX SHD AF     UXN MEM/NORMAL
0xfffffe0079400000-0xfffffe007e200000     78M  RW NX SHD AF CON UXN MEM/NORMAL
0xfffffe007e200000-0xfffffe007e3d0000   1856K  RW NX SHD AF     UXN MEM/NORMAL
0xfffffe007e420000-0xfffffe007e600000   1920K  RW NX SHD AF     UXN MEM/NORMAL
0xfffffe007e600000-0xfffffe007f000000     10M  RW NX SHD AF CON UXN MEM/NORMAL
0xfffffe0800000000-0xfffffe0980000000      6G  RW NX SHD AF BLK UXN MEM/NORMAL

With 4k pages:

---[ Kernel Mapping ]---
0xffffffc000000000-0xffffffc000200000      2M  RW NX SHD AF BLK UXN MEM/NORMAL
0xffffffc000200000-0xffffffc001200000     16M  ro x  SHD AF BLK UXN MEM/NORMAL
0xffffffc001200000-0xffffffc001400000      2M  RW NX SHD AF BLK UXN MEM/NORMAL
0xffffffc001400000-0xffffffc0015c0000   1792K  RW NX SHD AF CON UXN MEM/NORMAL
0xffffffc0015c0000-0xffffffc0015d0000     64K  RW NX SHD AF     UXN MEM/NORMAL
0xffffffc0015d0000-0xffffffc001600000    192K  RW NX SHD AF CON UXN MEM/NORMAL
0xffffffc001600000-0xffffffc002600000     16M  RW NX SHD AF BLK UXN MEM/NORMAL
0xffffffc002600000-0xffffffc002620000    128K  RW NX SHD AF CON UXN MEM/NORMAL
0xffffffc002620000-0xffffffc002630000     64K  RW NX SHD AF     UXN MEM/NORMAL
0xffffffc002630000-0xffffffc002800000   1856K  RW NX SHD AF CON UXN MEM/NORMAL
0xffffffc002800000-0xffffffc060000000   1496M  RW NX SHD AF BLK UXN MEM/NORMAL
0xffffffc0600f0000-0xffffffc060200000   1088K  RW NX SHD AF CON UXN MEM/NORMAL
0xffffffc060200000-0xffffffc076400000    354M  RW NX SHD AF BLK UXN MEM/NORMAL
0xffffffc076400000-0xffffffc076590000   1600K  RW NX SHD AF CON UXN MEM/NORMAL
0xffffffc076590000-0xffffffc076595000     20K  RW NX SHD AF     UXN MEM/NORMAL
0xffffffc076596000-0xffffffc0765a0000     40K  RW NX SHD AF     UXN MEM/NORMAL
0xffffffc0765a0000-0xffffffc076600000    384K  RW NX SHD AF CON UXN MEM/NORMAL
0xffffffc076600000-0xffffffc078e00000     40M  RW NX SHD AF BLK UXN MEM/NORMAL
0xffffffc0793b0000-0xffffffc079400000    320K  RW NX SHD AF CON UXN MEM/NORMAL
0xffffffc079400000-0xffffffc07e200000     78M  RW NX SHD AF BLK UXN MEM/NORMAL
0xffffffc07e200000-0xffffffc07e3d0000   1856K  RW NX SHD AF CON UXN MEM/NORMAL
0xffffffc07e420000-0xffffffc07e600000   1920K  RW NX SHD AF CON UXN MEM/NORMAL
0xffffffc07e600000-0xffffffc07f000000     10M  RW NX SHD AF BLK UXN MEM/NORMAL
0xffffffc800000000-0xffffffc980000000      6G  RW NX SHD AF BLK UXN MEM/NORMAL

Jeremy Linton (2):
  arm64: mm: Enable CONT_SIZE aligned sections for 64k page kernels.
  arm64: Mark kernel page ranges contiguous

 arch/arm64/Kconfig.debug        | 12 ++++----
 arch/arm64/kernel/vmlinux.lds.S | 11 +++----
 arch/arm64/mm/mmu.c             | 64 +++++++++++++++++++++++++++++++++++++----
 3 files changed, 70 insertions(+), 17 deletions(-)

--
2.4.3
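For reference, the contiguous-hint geometry behind the CON regions above
works out to 2M for both the 16k and 64k granules. The sketch below only
illustrates that arithmetic; the cont_size() helper and the per-granule
entry counts (16 adjacent PTEs for 4k pages, 128 for 16k, 32 for 64k) are
assumptions stated here for illustration, not code from the patches:

/* Standalone sketch: size covered by one contiguous-hint run per granule. */
#include <stdio.h>

static unsigned long cont_size(unsigned long page_size, unsigned long cont_ptes)
{
	/* A CONT run covers cont_ptes adjacent, naturally aligned PTEs. */
	return page_size * cont_ptes;
}

int main(void)
{
	printf("4k  pages: %luK per CONT run\n", cont_size(4096, 16) >> 10);
	printf("16k pages: %luM per CONT run\n", cont_size(16384, 128) >> 20);
	printf("64k pages: %luM per CONT run\n", cont_size(65536, 32) >> 20);
	return 0;
}

This matches the dumps above: with 4k pages a 2M-aligned region can use a
2M BLK mapping instead, while with 64k pages the same region relies on 2M
CONT runs.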
* [PATCH v2 1/2] arm64: mm: Enable CONT_SIZE aligned sections for 64k page kernels.
  From: Jeremy Linton @ 2016-02-17 16:29 UTC
  To: linux-arm-kernel

This change allows ALIGN_RODATA for 16k and 64k kernels.
In the case of 64k kernels it actually aligns to the CONT_SIZE
rather than the SECTION_SIZE (which is 512M). This makes it generally
more useful, especially for CONT enabled kernels.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 arch/arm64/Kconfig.debug        | 12 ++++++------
 arch/arm64/kernel/vmlinux.lds.S | 11 ++++++-----
 2 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/Kconfig.debug b/arch/arm64/Kconfig.debug
index e13c4bf..65705ee 100644
--- a/arch/arm64/Kconfig.debug
+++ b/arch/arm64/Kconfig.debug
@@ -59,15 +59,15 @@ config DEBUG_RODATA
 	  If in doubt, say Y

 config DEBUG_ALIGN_RODATA
-	depends on DEBUG_RODATA && ARM64_4K_PAGES
+	depends on DEBUG_RODATA
 	bool "Align linker sections up to SECTION_SIZE"
 	help
 	  If this option is enabled, sections that may potentially be marked as
-	  read only or non-executable will be aligned up to the section size of
-	  the kernel. This prevents sections from being split into pages and
-	  avoids a potential TLB penalty. The downside is an increase in
-	  alignment and potentially wasted space. Turn on this option if
-	  performance is more important than memory pressure.
+	  read only or non-executable will be aligned up to the section size
+	  or contiguous hint size of the kernel. This prevents sections from
+	  being split into pages and avoids a potential TLB penalty. The downside
+	  is an increase in alignment and potentially wasted space. Turn on
+	  this option if performance is more important than memory pressure.

 	  If in doubt, say N

diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index b78a3c7..8f4fc2c 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -63,13 +63,14 @@ PECOFF_FILE_ALIGNMENT = 0x200;
 #endif

 #if defined(CONFIG_DEBUG_ALIGN_RODATA)
-#define ALIGN_DEBUG_RO			. = ALIGN(1<<SECTION_SHIFT);
-#define ALIGN_DEBUG_RO_MIN(min)		ALIGN_DEBUG_RO
+#if defined(CONFIG_ARM64_4K_PAGES)
+#define ALIGN_DEBUG_RO_MIN(min)		. = ALIGN(SECTION_SIZE);
+#else
+#define ALIGN_DEBUG_RO_MIN(min)		. = ALIGN(CONT_SIZE);
+#endif
 #elif defined(CONFIG_DEBUG_RODATA)
-#define ALIGN_DEBUG_RO			. = ALIGN(1<<PAGE_SHIFT);
-#define ALIGN_DEBUG_RO_MIN(min)		ALIGN_DEBUG_RO
+#define ALIGN_DEBUG_RO_MIN(min)		. = ALIGN(PAGE_SIZE);
 #else
-#define ALIGN_DEBUG_RO
 #define ALIGN_DEBUG_RO_MIN(min)		. = ALIGN(min);
 #endif

--
2.4.3
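The linker's ALIGN(x) in the macros above rounds the location counter up to
the next multiple of x, so with this patch the read-only section boundaries
land on SECTION_SIZE (4k pages) or CONT_SIZE (16k/64k pages) boundaries. A
minimal C model of that rounding, assuming a 2M alignment value and using a
hypothetical align_up() helper that is not part of the patch:

/* Standalone sketch of the ALIGN() rounding done by the linker script. */
#include <stdio.h>

/* Round addr up to the next multiple of align (align must be a power of two). */
static unsigned long align_up(unsigned long addr, unsigned long align)
{
	return (addr + align - 1) & ~(align - 1);
}

int main(void)
{
	/* Assumed 2M: SECTION_SIZE with 4k pages, CONT_SIZE with 16k/64k pages. */
	unsigned long align = 2UL * 1024 * 1024;

	/* An end-of-section address part way into a 2M region is pushed to the next boundary. */
	printf("0x%lx -> 0x%lx\n", 0x1234567UL, align_up(0x1234567UL, align));
	return 0;
}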
* [PATCH v2 1/2] arm64: mm: Enable CONT_SIZE aligned sections for 64k page kernels.
  From: Ard Biesheuvel @ 2016-02-17 17:17 UTC
  To: linux-arm-kernel

On 17 February 2016 at 17:29, Jeremy Linton <jeremy.linton@arm.com> wrote:
> This change allows ALIGN_RODATA for 16k and 64k kernels.
> In the case of 64k kernels it actually aligns to the CONT_SIZE

... and 16k kernels ...

> rather than the SECTION_SIZE (which is 512M). This makes it generally
> more useful, especially for CONT enabled kernels.
>
> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>

It probably makes sense to mention here that the alignment is 2 MB for
all page sizes.

Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

> ---
>  arch/arm64/Kconfig.debug        | 12 ++++++------
>  arch/arm64/kernel/vmlinux.lds.S | 11 ++++++-----
>  2 files changed, 12 insertions(+), 11 deletions(-)
>
> diff --git a/arch/arm64/Kconfig.debug b/arch/arm64/Kconfig.debug
> index e13c4bf..65705ee 100644
> --- a/arch/arm64/Kconfig.debug
> +++ b/arch/arm64/Kconfig.debug
> @@ -59,15 +59,15 @@ config DEBUG_RODATA
>  	  If in doubt, say Y
>
>  config DEBUG_ALIGN_RODATA
> -	depends on DEBUG_RODATA && ARM64_4K_PAGES
> +	depends on DEBUG_RODATA
> 	bool "Align linker sections up to SECTION_SIZE"
> 	help
> 	  If this option is enabled, sections that may potentially be marked as
> -	  read only or non-executable will be aligned up to the section size of
> -	  the kernel. This prevents sections from being split into pages and
> -	  avoids a potential TLB penalty. The downside is an increase in
> -	  alignment and potentially wasted space. Turn on this option if
> -	  performance is more important than memory pressure.
> +	  read only or non-executable will be aligned up to the section size
> +	  or contiguous hint size of the kernel. This prevents sections from
> +	  being split into pages and avoids a potential TLB penalty. The downside
> +	  is an increase in alignment and potentially wasted space. Turn on
> +	  this option if performance is more important than memory pressure.
>
> 	  If in doubt, say N
>
> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
> index b78a3c7..8f4fc2c 100644
> --- a/arch/arm64/kernel/vmlinux.lds.S
> +++ b/arch/arm64/kernel/vmlinux.lds.S
> @@ -63,13 +63,14 @@ PECOFF_FILE_ALIGNMENT = 0x200;
>  #endif
>
>  #if defined(CONFIG_DEBUG_ALIGN_RODATA)
> -#define ALIGN_DEBUG_RO			. = ALIGN(1<<SECTION_SHIFT);
> -#define ALIGN_DEBUG_RO_MIN(min)		ALIGN_DEBUG_RO
> +#if defined(CONFIG_ARM64_4K_PAGES)
> +#define ALIGN_DEBUG_RO_MIN(min)		. = ALIGN(SECTION_SIZE);
> +#else
> +#define ALIGN_DEBUG_RO_MIN(min)		. = ALIGN(CONT_SIZE);
> +#endif
>  #elif defined(CONFIG_DEBUG_RODATA)
> -#define ALIGN_DEBUG_RO			. = ALIGN(1<<PAGE_SHIFT);
> -#define ALIGN_DEBUG_RO_MIN(min)		ALIGN_DEBUG_RO
> +#define ALIGN_DEBUG_RO_MIN(min)		. = ALIGN(PAGE_SIZE);
>  #else
> -#define ALIGN_DEBUG_RO
>  #define ALIGN_DEBUG_RO_MIN(min)		. = ALIGN(min);
>  #endif
>
> --
> 2.4.3
* [PATCH v2 2/2] arm64: Mark kernel page ranges contiguous
  From: Jeremy Linton @ 2016-02-17 16:29 UTC
  To: linux-arm-kernel

With 64k pages, the next larger segment size is 512M. The Linux kernel
also uses different protection flags to cover its code and data, and
because of this the vast majority of the kernel code and data structures
end up being mapped with 64k pages instead of the larger pages common
with a 4k page kernel.

Recent ARM processors support a contiguous bit in the page tables which
allows a TLB entry to cover a range larger than a single PTE if that
range is mapped into physically contiguous RAM. So, for the kernel, it's
a good idea to set this flag. Some basic micro-benchmarks show it can
significantly reduce the number of L1 dTLB refills.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 arch/arm64/mm/mmu.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 58 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 7711554..ab69a99 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1,3 +1,4 @@
+
 /*
  * Based on arch/arm/mm/mmu.c
  *
@@ -103,17 +104,49 @@ static void split_pmd(pmd_t *pmd, pte_t *pte)
 		 * Need to have the least restrictive permissions available
 		 * permissions will be fixed up later
 		 */
-		set_pte(pte, pfn_pte(pfn, PAGE_KERNEL_EXEC));
+		set_pte(pte, pfn_pte(pfn, PAGE_KERNEL_EXEC_CONT));
 		pfn++;
 	} while (pte++, i++, i < PTRS_PER_PTE);
 }

+static void clear_cont_pte_range(pte_t *pte, unsigned long addr)
+{
+	int i;
+
+	pte -= CONT_RANGE_OFFSET(addr);
+	for (i = 0; i < CONT_PTES; i++) {
+		if (pte_cont(*pte))
+			set_pte(pte, pte_mknoncont(*pte));
+		pte++;
+	}
+	flush_tlb_all();
+}
+
+/*
+ * Given a range of PTEs set the pfn and provided page protection flags
+ */
+static void __populate_init_pte(pte_t *pte, unsigned long addr,
+				unsigned long end, phys_addr_t phys,
+				pgprot_t prot)
+{
+	unsigned long pfn = __phys_to_pfn(phys);
+
+	do {
+		/* clear all the bits except the pfn, then apply the prot */
+		set_pte(pte, pfn_pte(pfn, prot));
+		pte++;
+		pfn++;
+		addr += PAGE_SIZE;
+	} while (addr != end);
+}
+
 static void alloc_init_pte(pmd_t *pmd, unsigned long addr,
-				  unsigned long end, unsigned long pfn,
+				  unsigned long end, phys_addr_t phys,
 				  pgprot_t prot,
 				  phys_addr_t (*pgtable_alloc)(void))
 {
 	pte_t *pte;
+	unsigned long next;

 	if (pmd_none(*pmd) || pmd_sect(*pmd)) {
 		phys_addr_t pte_phys = pgtable_alloc();
@@ -127,10 +160,29 @@ static void alloc_init_pte(pmd_t *pmd, unsigned long addr,
 	BUG_ON(pmd_bad(*pmd));

 	pte = pte_set_fixmap_offset(pmd, addr);
+
 	do {
-		set_pte(pte, pfn_pte(pfn, prot));
-		pfn++;
-	} while (pte++, addr += PAGE_SIZE, addr != end);
+		next = min(end, (addr + CONT_SIZE) & CONT_MASK);
+		if (((addr | next | phys) & ~CONT_MASK) == 0) {
+			/* a block of CONT_PTES */
+			__populate_init_pte(pte, addr, next, phys,
+					    prot | __pgprot(PTE_CONT));
+		} else {
+			/*
+			 * If the range being split is already inside of a
+			 * contiguous range but this PTE isn't going to be
+			 * contiguous, then we want to unmark the adjacent
+			 * ranges, then update the portion of the range we
+			 * are interested in.
+			 */
+			clear_cont_pte_range(pte, addr);
+			__populate_init_pte(pte, addr, next, phys, prot);
+		}
+
+		pte += (next - addr) >> PAGE_SHIFT;
+		phys += next - addr;
+		addr = next;
+	} while (addr != end);

 	pte_clear_fixmap();
 }
@@ -194,7 +246,7 @@ static void alloc_init_pmd(pud_t *pud, unsigned long addr, unsigned long end,
 			}
 		}
 	} else {
-		alloc_init_pte(pmd, addr, next, __phys_to_pfn(phys),
+		alloc_init_pte(pmd, addr, next, phys,
 			       prot, pgtable_alloc);
 	}
 	phys += next - addr;

--
2.4.3
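The heart of the change above is the eligibility test in alloc_init_pte():
a chunk is only marked contiguous when its virtual start, virtual end and
physical base all sit on CONT_SIZE boundaries. Below is a standalone C
model of that decision, with CONT_SIZE hard-coded to the 2M value of a 64k
page kernel as an assumption for illustration (the kernel uses its own
CONT_SIZE/CONT_MASK definitions):

/* Standalone sketch of the contiguous-hint eligibility check in alloc_init_pte(). */
#include <stdbool.h>
#include <stdio.h>

#define CONT_SIZE	(2UL * 1024 * 1024)	/* assumed: 32 x 64K pages */
#define CONT_MASK	(~(CONT_SIZE - 1))

/*
 * A [addr, next) chunk backed by phys can take PTE_CONT only when the chunk
 * starts and ends on CONT_SIZE boundaries both virtually and physically --
 * the ((addr | next | phys) & ~CONT_MASK) == 0 test in the patch.
 */
static bool can_mark_contiguous(unsigned long addr, unsigned long next,
				unsigned long phys)
{
	return ((addr | next | phys) & ~CONT_MASK) == 0;
}

int main(void)
{
	/* Fully aligned 2M chunk: eligible for the contiguous hint. */
	printf("%d\n", can_mark_contiguous(0xffffffc000200000UL,
					   0xffffffc000400000UL,
					   0x80200000UL));
	/* Chunk that ends part way into a 2M region: falls back to plain PTEs. */
	printf("%d\n", can_mark_contiguous(0xffffffc000200000UL,
					   0xffffffc0002f0000UL,
					   0x80200000UL));
	return 0;
}

This is also why the cover-letter dumps show short non-CON runs around
holes: any chunk that is not aligned at both ends is populated with plain
PTEs, and clear_cont_pte_range() strips the hint from any previously marked
run it overlaps.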