* [PATCH 0/5] arm64: avoid block entries that we need to split later @ 2016-06-29 12:51 Ard Biesheuvel 2016-06-29 12:51 ` [PATCH 1/5] arm64: mm: add param to force create_pgd_mapping() to use page mappings Ard Biesheuvel ` (5 more replies) 0 siblings, 6 replies; 20+ messages in thread From: Ard Biesheuvel @ 2016-06-29 12:51 UTC (permalink / raw) To: linux-arm-kernel This is a followup to Catalin's series 'arm64: Avoid overlapping EFI regions' [0] whose aim was to get rid of the split_pmd() and split_pud() routines in mm/mmu.c, which may violate the architecture when used in the wrong way. This series updates the MMU mapping routines invoked by the code that sets up the UEFI runtime services environment so that regions that may be subject to splitting are always mapped down to pages. On the one hand, these are regions that are covered by the Memory Attributes table as well as the UEFI memory map, in which case we need to map down to pages in the first pass to ensure that the more granular permission attributes do not require splitting block entries. On the other hand, any region that is not aligned to the OS's page size is mapped down to pages as well, so that the page such a region may share with the next region is always mapped using pages, and no splitting is required for mapping the second region. Finally, remove the split_pmd() and split_pud() routines. Patch #5 is a cleanup patch. [0] http://thread.gmane.org/gmane.linux.kernel.efi/8649 Ard Biesheuvel (4): arm64: mm: add param to force create_pgd_mapping() to use page mappings arm64: efi: always map runtime services code and data regions down to pages arm64: efi: avoid block mappings for unaligned UEFI memory regions arm64: mm: fold init_pgd() into __create_pgd_mapping() Catalin Marinas (1): arm64: mm: Remove split_p*d() functions arch/arm64/include/asm/efi.h | 3 +- arch/arm64/include/asm/mmu.h | 2 +- arch/arm64/kernel/efi.c | 50 +++++++- arch/arm64/mm/mmu.c | 126 +++++--------------- 4 files changed, 82 insertions(+), 99 deletions(-) -- 2.7.4 ^ permalink raw reply [flat|nested] 20+ messages in thread
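For illustration, the policy described in the cover letter reduces to a simple per-region decision, sketched below as standalone C. The struct, constants and helper name are made up for the example (they are not the kernel's), and a 64 KB kernel page size is assumed:

  #include <stdbool.h>
  #include <stdint.h>

  #define EFI_PAGE_SHIFT  12              /* UEFI describes regions in 4 KB pages */
  #define OS_PAGE_SIZE    (64 * 1024)     /* example kernel page size */

  enum region_type { RT_CODE, RT_DATA, MMIO, OTHER };

  struct region {
          uint64_t phys_addr;
          uint64_t num_pages;             /* in 4 KB EFI pages */
          enum region_type type;
  };

  /* May this region be mapped using block (section) entries? */
  static bool may_use_block_mappings(const struct region *r)
  {
          uint64_t size = r->num_pages << EFI_PAGE_SHIFT;

          /*
           * Code/data regions get stricter per-page permissions applied
           * by the Memory Attributes table pass, so map them with pages
           * from the start (patch #2).
           */
          if (r->type == RT_CODE || r->type == RT_DATA)
                  return false;

          /*
           * A region that does not start and end on a kernel page
           * boundary may share its first or last page frame with a
           * neighbouring region; mapping it with pages means the shared
           * frame never sits under a block entry that would have to be
           * split when the neighbour is mapped (patch #3).
           */
          if ((r->phys_addr % OS_PAGE_SIZE) || (size % OS_PAGE_SIZE))
                  return false;

          return true;
  }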
* [PATCH 1/5] arm64: mm: add param to force create_pgd_mapping() to use page mappings 2016-06-29 12:51 [PATCH 0/5] arm64: avoid block entries that we need to split later Ard Biesheuvel @ 2016-06-29 12:51 ` Ard Biesheuvel 2016-06-29 12:51 ` [PATCH 2/5] arm64: efi: always map runtime services code and data regions down to pages Ard Biesheuvel ` (4 subsequent siblings) 5 siblings, 0 replies; 20+ messages in thread From: Ard Biesheuvel @ 2016-06-29 12:51 UTC (permalink / raw) To: linux-arm-kernel Add a bool parameter 'allow_block_mappings' to create_pgd_mapping() and the various helper functions that it descends into, to give the caller control over whether block entries may be used to create the mapping. The UEFI runtime mapping routines will use this to avoid creating block entries that would need to split up into page entries when applying the permissions listed in the Memory Attributes firmware table. This also replaces the block_mappings_allowed() helper function that was added for DEBUG_PAGEALLOC functionality, but the resulting code is functionally equivalent (given that debug_page_alloc does not operate on EFI page table entries anyway) Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> --- arch/arm64/include/asm/mmu.h | 2 +- arch/arm64/kernel/efi.c | 2 +- arch/arm64/mm/mmu.c | 67 ++++++++------------ 3 files changed, 29 insertions(+), 42 deletions(-) diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h index 97b1d8f26b9c..8d9fce037b2f 100644 --- a/arch/arm64/include/asm/mmu.h +++ b/arch/arm64/include/asm/mmu.h @@ -34,7 +34,7 @@ extern void __iomem *early_io_map(phys_addr_t phys, unsigned long virt); extern void init_mem_pgprot(void); extern void create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys, unsigned long virt, phys_addr_t size, - pgprot_t prot); + pgprot_t prot, bool allow_block_mappings); extern void *fixmap_remap_fdt(phys_addr_t dt_phys); #endif diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c index 78f52488f9ff..981604948521 100644 --- a/arch/arm64/kernel/efi.c +++ b/arch/arm64/kernel/efi.c @@ -65,7 +65,7 @@ int __init efi_create_mapping(struct mm_struct *mm, efi_memory_desc_t *md) create_pgd_mapping(mm, md->phys_addr, md->virt_addr, md->num_pages << EFI_PAGE_SHIFT, - __pgprot(prot_val | PTE_NG)); + __pgprot(prot_val | PTE_NG), true); return 0; } diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 0f85a46c3e18..a86d1acb3a7b 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -156,29 +156,10 @@ static void split_pud(pud_t *old_pud, pmd_t *pmd) } while (pmd++, i++, i < PTRS_PER_PMD); } -#ifdef CONFIG_DEBUG_PAGEALLOC -static bool block_mappings_allowed(phys_addr_t (*pgtable_alloc)(void)) -{ - - /* - * If debug_page_alloc is enabled we must map the linear map - * using pages. However, other mappings created by - * create_mapping_noalloc must use sections in some cases. Allow - * sections to be used in those cases, where no pgtable_alloc - * function is provided. 
- */ - return !pgtable_alloc || !debug_pagealloc_enabled(); -} -#else -static bool block_mappings_allowed(phys_addr_t (*pgtable_alloc)(void)) -{ - return true; -} -#endif - static void alloc_init_pmd(pud_t *pud, unsigned long addr, unsigned long end, phys_addr_t phys, pgprot_t prot, - phys_addr_t (*pgtable_alloc)(void)) + phys_addr_t (*pgtable_alloc)(void), + bool allow_block_mappings) { pmd_t *pmd; unsigned long next; @@ -209,7 +190,7 @@ static void alloc_init_pmd(pud_t *pud, unsigned long addr, unsigned long end, next = pmd_addr_end(addr, end); /* try section mapping first */ if (((addr | next | phys) & ~SECTION_MASK) == 0 && - block_mappings_allowed(pgtable_alloc)) { + allow_block_mappings) { pmd_t old_pmd =*pmd; pmd_set_huge(pmd, phys, prot); /* @@ -248,7 +229,8 @@ static inline bool use_1G_block(unsigned long addr, unsigned long next, static void alloc_init_pud(pgd_t *pgd, unsigned long addr, unsigned long end, phys_addr_t phys, pgprot_t prot, - phys_addr_t (*pgtable_alloc)(void)) + phys_addr_t (*pgtable_alloc)(void), + bool allow_block_mappings) { pud_t *pud; unsigned long next; @@ -268,8 +250,7 @@ static void alloc_init_pud(pgd_t *pgd, unsigned long addr, unsigned long end, /* * For 4K granule only, attempt to put down a 1GB block */ - if (use_1G_block(addr, next, phys) && - block_mappings_allowed(pgtable_alloc)) { + if (use_1G_block(addr, next, phys) && allow_block_mappings) { pud_t old_pud = *pud; pud_set_huge(pud, phys, prot); @@ -290,7 +271,7 @@ static void alloc_init_pud(pgd_t *pgd, unsigned long addr, unsigned long end, } } else { alloc_init_pmd(pud, addr, next, phys, prot, - pgtable_alloc); + pgtable_alloc, allow_block_mappings); } phys += next - addr; } while (pud++, addr = next, addr != end); @@ -304,7 +285,8 @@ static void alloc_init_pud(pgd_t *pgd, unsigned long addr, unsigned long end, */ static void init_pgd(pgd_t *pgd, phys_addr_t phys, unsigned long virt, phys_addr_t size, pgprot_t prot, - phys_addr_t (*pgtable_alloc)(void)) + phys_addr_t (*pgtable_alloc)(void), + bool allow_block_mappings) { unsigned long addr, length, end, next; @@ -322,7 +304,8 @@ static void init_pgd(pgd_t *pgd, phys_addr_t phys, unsigned long virt, end = addr + length; do { next = pgd_addr_end(addr, end); - alloc_init_pud(pgd, addr, next, phys, prot, pgtable_alloc); + alloc_init_pud(pgd, addr, next, phys, prot, pgtable_alloc, + allow_block_mappings); phys += next - addr; } while (pgd++, addr = next, addr != end); } @@ -340,9 +323,11 @@ static phys_addr_t late_pgtable_alloc(void) static void __create_pgd_mapping(pgd_t *pgdir, phys_addr_t phys, unsigned long virt, phys_addr_t size, pgprot_t prot, - phys_addr_t (*alloc)(void)) + phys_addr_t (*alloc)(void), + bool allow_block_mappings) { - init_pgd(pgd_offset_raw(pgdir, virt), phys, virt, size, prot, alloc); + init_pgd(pgd_offset_raw(pgdir, virt), phys, virt, size, prot, alloc, + allow_block_mappings); } /* @@ -358,16 +343,15 @@ static void __init create_mapping_noalloc(phys_addr_t phys, unsigned long virt, &phys, virt); return; } - __create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, - NULL); + __create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, NULL, true); } void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys, unsigned long virt, phys_addr_t size, - pgprot_t prot) + pgprot_t prot, bool allow_block_mappings) { __create_pgd_mapping(mm->pgd, phys, virt, size, prot, - late_pgtable_alloc); + late_pgtable_alloc, allow_block_mappings); } static void create_mapping_late(phys_addr_t phys, unsigned long virt, @@ -380,7 +364,7 @@ 
static void create_mapping_late(phys_addr_t phys, unsigned long virt, } __create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, - late_pgtable_alloc); + late_pgtable_alloc, !debug_pagealloc_enabled()); } static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end) @@ -397,7 +381,8 @@ static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end if (end < kernel_start || start >= kernel_end) { __create_pgd_mapping(pgd, start, __phys_to_virt(start), end - start, PAGE_KERNEL, - early_pgtable_alloc); + early_pgtable_alloc, + !debug_pagealloc_enabled()); return; } @@ -409,12 +394,14 @@ static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end __create_pgd_mapping(pgd, start, __phys_to_virt(start), kernel_start - start, PAGE_KERNEL, - early_pgtable_alloc); + early_pgtable_alloc, + !debug_pagealloc_enabled()); if (kernel_end < end) __create_pgd_mapping(pgd, kernel_end, __phys_to_virt(kernel_end), end - kernel_end, PAGE_KERNEL, - early_pgtable_alloc); + early_pgtable_alloc, + !debug_pagealloc_enabled()); /* * Map the linear alias of the [_text, _etext) interval as @@ -424,7 +411,7 @@ static void __init __map_memblock(pgd_t *pgd, phys_addr_t start, phys_addr_t end */ __create_pgd_mapping(pgd, kernel_start, __phys_to_virt(kernel_start), kernel_end - kernel_start, PAGE_KERNEL_RO, - early_pgtable_alloc); + early_pgtable_alloc, !debug_pagealloc_enabled()); } static void __init map_mem(pgd_t *pgd) @@ -481,7 +468,7 @@ static void __init map_kernel_segment(pgd_t *pgd, void *va_start, void *va_end, BUG_ON(!PAGE_ALIGNED(size)); __create_pgd_mapping(pgd, pa_start, (unsigned long)va_start, size, prot, - early_pgtable_alloc); + early_pgtable_alloc, !debug_pagealloc_enabled()); vma->addr = va_start; vma->phys_addr = pa_start; -- 2.7.4 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 2/5] arm64: efi: always map runtime services code and data regions down to pages 2016-06-29 12:51 [PATCH 0/5] arm64: avoid block entries that we need to split later Ard Biesheuvel 2016-06-29 12:51 ` [PATCH 1/5] arm64: mm: add param to force create_pgd_mapping() to use page mappings Ard Biesheuvel @ 2016-06-29 12:51 ` Ard Biesheuvel 2016-07-22 14:30 ` Sudeep Holla 2016-06-29 12:51 ` [PATCH 3/5] arm64: efi: avoid block mappings for unaligned UEFI memory regions Ard Biesheuvel ` (3 subsequent siblings) 5 siblings, 1 reply; 20+ messages in thread From: Ard Biesheuvel @ 2016-06-29 12:51 UTC (permalink / raw) To: linux-arm-kernel To avoid triggering diagnostics in the MMU code that are finicky about splitting block mappings into more granular mappings, ensure that regions that are likely to appear in the Memory Attributes table as well as the UEFI memory map are always mapped down to pages. This way, we can use apply_to_page_range() instead of create_pgd_mapping() for the second pass, which cannot split or merge block entries, and operates strictly on PTEs. Note that this aligns the arm64 Memory Attributes table handling code with the ARM code, which already uses apply_to_page_range() to set the strict permissions. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> --- arch/arm64/include/asm/efi.h | 3 +- arch/arm64/kernel/efi.c | 36 +++++++++++++++++++- 2 files changed, 36 insertions(+), 3 deletions(-) diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h index 622db3c6474e..8b13476cdf96 100644 --- a/arch/arm64/include/asm/efi.h +++ b/arch/arm64/include/asm/efi.h @@ -14,8 +14,7 @@ extern void efi_init(void); #endif int efi_create_mapping(struct mm_struct *mm, efi_memory_desc_t *md); - -#define efi_set_mapping_permissions efi_create_mapping +int efi_set_mapping_permissions(struct mm_struct *mm, efi_memory_desc_t *md); #define arch_efi_call_virt_setup() \ ({ \ diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c index 981604948521..4aef89f37049 100644 --- a/arch/arm64/kernel/efi.c +++ b/arch/arm64/kernel/efi.c @@ -62,13 +62,47 @@ struct screen_info screen_info __section(.data); int __init efi_create_mapping(struct mm_struct *mm, efi_memory_desc_t *md) { pteval_t prot_val = create_mapping_protection(md); + bool allow_block_mappings = (md->type != EFI_RUNTIME_SERVICES_CODE && + md->type != EFI_RUNTIME_SERVICES_DATA); create_pgd_mapping(mm, md->phys_addr, md->virt_addr, md->num_pages << EFI_PAGE_SHIFT, - __pgprot(prot_val | PTE_NG), true); + __pgprot(prot_val | PTE_NG), allow_block_mappings); return 0; } +static int __init set_permissions(pte_t *ptep, pgtable_t token, + unsigned long addr, void *data) +{ + efi_memory_desc_t *md = data; + pte_t pte = *ptep; + + if (md->attribute & EFI_MEMORY_RO) + pte = set_pte_bit(pte, __pgprot(PTE_RDONLY)); + if (md->attribute & EFI_MEMORY_XP) + pte = set_pte_bit(pte, __pgprot(PTE_PXN)); + set_pte(ptep, pte); + return 0; +} + +int __init efi_set_mapping_permissions(struct mm_struct *mm, + efi_memory_desc_t *md) +{ + BUG_ON(md->type != EFI_RUNTIME_SERVICES_CODE && + md->type != EFI_RUNTIME_SERVICES_DATA); + + /* + * Calling apply_to_page_range() is only safe on regions that are + * guaranteed to be mapped down to pages. Since we are only called + * for regions that have been mapped using efi_create_mapping() above + * (and this is checked by the generic Memory Attributes table parsing + * routines), there is no need to check that again here. 
+ */ + return apply_to_page_range(mm, md->virt_addr, + md->num_pages << EFI_PAGE_SHIFT, + set_permissions, md); +} + static int __init arm64_dmi_init(void) { /* -- 2.7.4 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 2/5] arm64: efi: always map runtime services code and data regions down to pages 2016-06-29 12:51 ` [PATCH 2/5] arm64: efi: always map runtime services code and data regions down to pages Ard Biesheuvel @ 2016-07-22 14:30 ` Sudeep Holla 2016-07-22 16:27 ` Ard Biesheuvel 0 siblings, 1 reply; 20+ messages in thread From: Sudeep Holla @ 2016-07-22 14:30 UTC (permalink / raw) To: linux-arm-kernel Hi Ard, On 29/06/16 13:51, Ard Biesheuvel wrote: > To avoid triggering diagnostics in the MMU code that are finicky about > splitting block mappings into more granular mappings, ensure that regions > that are likely to appear in the Memory Attributes table as well as the > UEFI memory map are always mapped down to pages. This way, we can use > apply_to_page_range() instead of create_pgd_mapping() for the second pass, > which cannot split or merge block entries, and operates strictly on PTEs. > > Note that this aligns the arm64 Memory Attributes table handling code with > the ARM code, which already uses apply_to_page_range() to set the strict > permissions. > This patch is merged in arm64/for-next/core now and when I try that branch with defconfig + CONFIG_PROVE_LOCKING, I get the following splat on boot and it fails to boot further on Juno. I could bisect that to this patch(Commit bd264d046aad ("arm64: efi: always map runtime services code and data regions down to pages") in that branch) Regards, Sudeep -->8 efi: memattr: Processing EFI Memory Attributes table: efi: memattr: 0x0000f9400000-0x0000f942ffff [Runtime Data |RUN| | |XP| | | | | | | | ] Unable to handle kernel NULL pointer dereference at virtual address 00000018 pgd = ffff000009aa4000 [00000018] *pgd=00000009ffffe003, *pud=00000009ffffd003, *pmd=0000000000000000 Internal error: Oops: 96000004 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.7.0-rc7-next-20160722 #134 Hardware name: ARM Juno development board (r2) (DT) task: ffff800976ca0000 task.stack: ffff800976c3c000 PC is at __lock_acquire+0x13c/0x19e0 LR is at lock_acquire+0x4c/0x68 pc : [<ffff000008104544>] lr : [<ffff000008106114>] pstate: 200000c5 .... __lock_acquire+0x13c/0x19e0 lock_acquire+0x4c/0x68 _raw_spin_lock+0x40/0x58 apply_to_page_range+0x18c/0x388 efi_set_mapping_permissions+0x34/0x44 efi_memattr_apply_permissions+0x200/0x2a8 arm_enable_runtime_services+0x1b4/0x1fc do_one_initcall+0x38/0x128 kernel_init_freeable+0x84/0x1f0 kernel_init+0x10/0x100 ret_from_fork+0x10/0x40 Code: 5280003c 79004401 140000b5 b000b880 (f9400282) ---[ end trace 892120beb6681b4e ]--- ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 2/5] arm64: efi: always map runtime services code and data regions down to pages 2016-07-22 14:30 ` Sudeep Holla @ 2016-07-22 16:27 ` Ard Biesheuvel 2016-07-22 16:36 ` Suzuki K Poulose 0 siblings, 1 reply; 20+ messages in thread From: Ard Biesheuvel @ 2016-07-22 16:27 UTC (permalink / raw) To: linux-arm-kernel On 22 July 2016 at 16:30, Sudeep Holla <sudeep.holla@arm.com> wrote: > Hi Ard, > > On 29/06/16 13:51, Ard Biesheuvel wrote: >> >> To avoid triggering diagnostics in the MMU code that are finicky about >> splitting block mappings into more granular mappings, ensure that regions >> that are likely to appear in the Memory Attributes table as well as the >> UEFI memory map are always mapped down to pages. This way, we can use >> apply_to_page_range() instead of create_pgd_mapping() for the second pass, >> which cannot split or merge block entries, and operates strictly on PTEs. >> >> Note that this aligns the arm64 Memory Attributes table handling code with >> the ARM code, which already uses apply_to_page_range() to set the strict >> permissions. >> > > This patch is merged in arm64/for-next/core now and when I try that > branch with defconfig + CONFIG_PROVE_LOCKING, I get the following splat > on boot and it fails to boot further on Juno. > > I could bisect that to this patch(Commit bd264d046aad ("arm64: efi: > always map runtime services code and data regions down to pages") in > that branch) > Hi Sudeep, I can reproduce this on QEMU as well. It appears that apply_to_page_range() expects pages containing translation tables to have their per-page spinlock initialized if they are not part of init_mm. This --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -272,6 +272,7 @@ static phys_addr_t late_pgtable_alloc(void) { void *ptr = (void *)__get_free_page(PGALLOC_GFP); BUG_ON(!ptr); + BUG_ON(!pgtable_page_ctor(virt_to_page(ptr))); /* Ensure the zeroed page is visible to the page table walker */ dsb(ishst); makes the problem go away for me (just as a temporary hack) but I will try to come up with something more appropriate, and check if ARM has the same issue (since it uses apply_to_page_range() as well) -- Ard. > -->8 > > efi: memattr: Processing EFI Memory Attributes table: > efi: memattr: 0x0000f9400000-0x0000f942ffff [Runtime Data |RUN| | > |XP| | | | | | | | ] > Unable to handle kernel NULL pointer dereference at virtual address 00000018 > pgd = ffff000009aa4000 > [00000018] *pgd=00000009ffffe003, *pud=00000009ffffd003, > *pmd=0000000000000000 > Internal error: Oops: 96000004 [#1] PREEMPT SMP > Modules linked in: > CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.7.0-rc7-next-20160722 #134 > Hardware name: ARM Juno development board (r2) (DT) > task: ffff800976ca0000 task.stack: ffff800976c3c000 > PC is at __lock_acquire+0x13c/0x19e0 > LR is at lock_acquire+0x4c/0x68 > pc : [<ffff000008104544>] lr : [<ffff000008106114>] pstate: 200000c5 > .... > __lock_acquire+0x13c/0x19e0 > lock_acquire+0x4c/0x68 > _raw_spin_lock+0x40/0x58 > apply_to_page_range+0x18c/0x388 > efi_set_mapping_permissions+0x34/0x44 > efi_memattr_apply_permissions+0x200/0x2a8 > arm_enable_runtime_services+0x1b4/0x1fc > do_one_initcall+0x38/0x128 > kernel_init_freeable+0x84/0x1f0 > kernel_init+0x10/0x100 > ret_from_fork+0x10/0x40 > Code: 5280003c 79004401 140000b5 b000b880 (f9400282) > ---[ end trace 892120beb6681b4e ]--- > ^ permalink raw reply [flat|nested] 20+ messages in thread
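The mechanism behind this fault, roughly (a reconstruction offered for context, so treat the specifics as an assumption rather than as part of the report above): with CONFIG_PROVE_LOCKING, spinlock_t grows too large to be embedded in struct page, so the split PTE lock lives behind the page->ptl pointer, which pgtable_page_ctor() is responsible for allocating and initialising. late_pgtable_alloc() never calls the ctor, so for the EFI page tables that pointer is still NULL by the time apply_to_page_range() tries to take the lock:

  apply_to_page_range()
    apply_to_pte_range()                        /* mm != &init_mm          */
      pte_alloc_map_lock(mm, pmd, addr, &ptl)
        pte_lockptr(mm, pmd)                    /* -> pmd_page(*pmd)->ptl  */
        spin_lock(ptl)                          /* never initialised: oops */

which matches the NULL dereference at a small offset (00000018) inside __lock_acquire() in the trace.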
* [PATCH 2/5] arm64: efi: always map runtime services code and data regions down to pages 2016-07-22 16:27 ` Ard Biesheuvel @ 2016-07-22 16:36 ` Suzuki K Poulose 2016-07-22 16:45 ` Ard Biesheuvel 0 siblings, 1 reply; 20+ messages in thread From: Suzuki K Poulose @ 2016-07-22 16:36 UTC (permalink / raw) To: linux-arm-kernel On 22/07/16 17:27, Ard Biesheuvel wrote: > On 22 July 2016 at 16:30, Sudeep Holla <sudeep.holla@arm.com> wrote: >> Hi Ard, >> >> On 29/06/16 13:51, Ard Biesheuvel wrote: >>> >>> To avoid triggering diagnostics in the MMU code that are finicky about >>> splitting block mappings into more granular mappings, ensure that regions >>> that are likely to appear in the Memory Attributes table as well as the >>> UEFI memory map are always mapped down to pages. This way, we can use >>> apply_to_page_range() instead of create_pgd_mapping() for the second pass, >>> which cannot split or merge block entries, and operates strictly on PTEs. >>> >>> Note that this aligns the arm64 Memory Attributes table handling code with >>> the ARM code, which already uses apply_to_page_range() to set the strict >>> permissions. >>> >> >> This patch is merged in arm64/for-next/core now and when I try that >> branch with defconfig + CONFIG_PROVE_LOCKING, I get the following splat >> on boot and it fails to boot further on Juno. >> >> I could bisect that to this patch(Commit bd264d046aad ("arm64: efi: >> always map runtime services code and data regions down to pages") in >> that branch) >> > > Hi Sudeep, > > I can reproduce this on QEMU as well. It appears that > apply_to_page_range() expects pages containing translation tables to > have their per-page spinlock initialized if they are not part of > init_mm. > > This > > --- a/arch/arm64/mm/mmu.c > +++ b/arch/arm64/mm/mmu.c > @@ -272,6 +272,7 @@ static phys_addr_t late_pgtable_alloc(void) > { > void *ptr = (void *)__get_free_page(PGALLOC_GFP); > BUG_ON(!ptr); > + BUG_ON(!pgtable_page_ctor(virt_to_page(ptr))); > > /* Ensure the zeroed page is visible to the page table walker */ > dsb(ishst); > > makes the problem go away for me (just as a temporary hack) but I will > try to come up with something more appropriate, and check if ARM has > the same issue (since it uses apply_to_page_range() as well) > Ard, I took a quick look at it. Looks like we don't initialise the page-table pages allocated via late_pgtable_alloc. Since we allocate it for an mm != init_mm, the lock validator comes into picture and finds a lock which is not initialised. The following patch fixes the issue. But is not a perfect one. May need to polish it a little bit. ----8>---- diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index a96a241..d312667 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -270,12 +270,12 @@ static void __create_pgd_mapping(pgd_t *pgdir, phys_addr_t phys, static phys_addr_t late_pgtable_alloc(void) { - void *ptr = (void *)__get_free_page(PGALLOC_GFP); - BUG_ON(!ptr); + struct page *page = pte_alloc_one(NULL, 0); + BUG_ON(!page); /* Ensure the zeroed page is visible to the page table walker */ dsb(ishst); - return __pa(ptr); + return __pa(page_address(page)); } Thanks Suzuki ^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 2/5] arm64: efi: always map runtime services code and data regions down to pages 2016-07-22 16:36 ` Suzuki K Poulose @ 2016-07-22 16:45 ` Ard Biesheuvel 0 siblings, 0 replies; 20+ messages in thread From: Ard Biesheuvel @ 2016-07-22 16:45 UTC (permalink / raw) To: linux-arm-kernel On 22 July 2016 at 18:36, Suzuki K Poulose <Suzuki.Poulose@arm.com> wrote: > On 22/07/16 17:27, Ard Biesheuvel wrote: >> >> On 22 July 2016 at 16:30, Sudeep Holla <sudeep.holla@arm.com> wrote: >>> >>> Hi Ard, >>> >>> On 29/06/16 13:51, Ard Biesheuvel wrote: >>>> >>>> >>>> To avoid triggering diagnostics in the MMU code that are finicky about >>>> splitting block mappings into more granular mappings, ensure that >>>> regions >>>> that are likely to appear in the Memory Attributes table as well as the >>>> UEFI memory map are always mapped down to pages. This way, we can use >>>> apply_to_page_range() instead of create_pgd_mapping() for the second >>>> pass, >>>> which cannot split or merge block entries, and operates strictly on >>>> PTEs. >>>> >>>> Note that this aligns the arm64 Memory Attributes table handling code >>>> with >>>> the ARM code, which already uses apply_to_page_range() to set the strict >>>> permissions. >>>> >>> >>> This patch is merged in arm64/for-next/core now and when I try that >>> branch with defconfig + CONFIG_PROVE_LOCKING, I get the following splat >>> on boot and it fails to boot further on Juno. >>> >>> I could bisect that to this patch(Commit bd264d046aad ("arm64: efi: >>> always map runtime services code and data regions down to pages") in >>> that branch) >>> >> >> Hi Sudeep, >> >> I can reproduce this on QEMU as well. It appears that >> apply_to_page_range() expects pages containing translation tables to >> have their per-page spinlock initialized if they are not part of >> init_mm. >> >> This >> >> --- a/arch/arm64/mm/mmu.c >> +++ b/arch/arm64/mm/mmu.c >> @@ -272,6 +272,7 @@ static phys_addr_t late_pgtable_alloc(void) >> { >> void *ptr = (void *)__get_free_page(PGALLOC_GFP); >> BUG_ON(!ptr); >> + BUG_ON(!pgtable_page_ctor(virt_to_page(ptr))); >> >> /* Ensure the zeroed page is visible to the page table walker */ >> dsb(ishst); >> >> makes the problem go away for me (just as a temporary hack) but I will >> try to come up with something more appropriate, and check if ARM has >> the same issue (since it uses apply_to_page_range() as well) >> > > Ard, > > I took a quick look at it. Looks like we don't initialise the page-table > pages allocated via late_pgtable_alloc. Since we allocate it for an mm != > init_mm, > the lock validator comes into picture and finds a lock which is not > initialised. > The following patch fixes the issue. But is not a perfect one. May need to > polish it > a little bit. > > ----8>---- > > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c > index a96a241..d312667 100644 > --- a/arch/arm64/mm/mmu.c > +++ b/arch/arm64/mm/mmu.c > @@ -270,12 +270,12 @@ static void __create_pgd_mapping(pgd_t *pgdir, > phys_addr_t phys, > static phys_addr_t late_pgtable_alloc(void) > { > - void *ptr = (void *)__get_free_page(PGALLOC_GFP); > - BUG_ON(!ptr); > + struct page *page = pte_alloc_one(NULL, 0); > + BUG_ON(!page); > /* Ensure the zeroed page is visible to the page table walker */ > dsb(ishst); > - return __pa(ptr); > + return __pa(page_address(page)); > } > Actually, I just sent a response to the same effect. 
Alternatively, we could educate apply_to_page_range() to treat the EFI page tables specially (or simply all statically allocated mm_struct instances), e.g., diff --git a/mm/memory.c b/mm/memory.c index 15322b73636b..dc6145129170 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1852,8 +1852,11 @@ static int apply_to_pte_range(struct mm_struct *mm, pmd_t *pmd, int err; pgtable_t token; spinlock_t *uninitialized_var(ptl); + bool is_kernel_mm; - pte = (mm == &init_mm) ? + is_kernel_mm = (mm == &init_mm || core_kernel_data((unsigned long)mm)); + + pte = is_kernel_mm ? pte_alloc_kernel(pmd, addr) : pte_alloc_map_lock(mm, pmd, addr, &ptl); if (!pte) @@ -1873,7 +1876,7 @@ static int apply_to_pte_range(struct mm_struct *mm, pmd_t *pmd, arch_leave_lazy_mmu_mode(); - if (mm != &init_mm) + if (!is_kernel_mm) pte_unmap_unlock(pte-1, ptl); return err; } but it seems more appropriate to initialize the struct pages correctly, to preemptively deal with other code that may make similar assumptions. I also noticed that create_mapping_late() and create_mapping_noalloc() are essentially the same, since the only two invocations of the former should not split block entries, and simply remaps regions that have already been mapped with stricter permissions. This means late_pgtable_alloc is only used by create_pgd_mapping, which is only used by the EFI code. So that allows for some shortcuts to be taken, I would think. The only downside is that I will need to fix it in two places (arm64 and ARM) Anyway, thanks for the suggestion. -- Ard. ^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 3/5] arm64: efi: avoid block mappings for unaligned UEFI memory regions 2016-06-29 12:51 [PATCH 0/5] arm64: avoid block entries that we need to split later Ard Biesheuvel 2016-06-29 12:51 ` [PATCH 1/5] arm64: mm: add param to force create_pgd_mapping() to use page mappings Ard Biesheuvel 2016-06-29 12:51 ` [PATCH 2/5] arm64: efi: always map runtime services code and data regions down to pages Ard Biesheuvel @ 2016-06-29 12:51 ` Ard Biesheuvel 2016-06-29 16:45 ` Catalin Marinas 2016-06-29 12:51 ` [PATCH 4/5] arm64: mm: Remove split_p*d() functions Ard Biesheuvel ` (2 subsequent siblings) 5 siblings, 1 reply; 20+ messages in thread From: Ard Biesheuvel @ 2016-06-29 12:51 UTC (permalink / raw) To: linux-arm-kernel When running the OS with a page size > 4 KB, we need to round up mappings for regions that are not aligned to the OS's page size. We already avoid block mappings for EfiRuntimeServicesCode/Data regions for other reasons, but in the unlikely event that other unaligned regions exist that have the EFI_MEMORY_RUNTIME attribute set, ensure that unaligned regions are always mapped down to pages. This way, the overlapping page is guaranteed not to be covered by a block mapping that needs to be split. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> --- arch/arm64/kernel/efi.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c index 4aef89f37049..ba9bee389fd5 100644 --- a/arch/arm64/kernel/efi.c +++ b/arch/arm64/kernel/efi.c @@ -65,6 +65,20 @@ int __init efi_create_mapping(struct mm_struct *mm, efi_memory_desc_t *md) bool allow_block_mappings = (md->type != EFI_RUNTIME_SERVICES_CODE && md->type != EFI_RUNTIME_SERVICES_DATA); + if (!PAGE_ALIGNED(md->phys_addr) || + !PAGE_ALIGNED(md->num_pages << EFI_PAGE_SHIFT)) { + /* + * If the end address of this region is not aligned to page + * size, the mapping is rounded up, and may end up sharing a + * page frame with the next UEFI memory region. If we create + * a block entry now, we may need to split it again when mapping + * the next region, and support for that is going to be removed + * from the MMU routines. So avoid block mappings altogether in + * that case. + */ + allow_block_mappings = false; + } + create_pgd_mapping(mm, md->phys_addr, md->virt_addr, md->num_pages << EFI_PAGE_SHIFT, __pgprot(prot_val | PTE_NG), allow_block_mappings); -- 2.7.4 ^ permalink raw reply related [flat|nested] 20+ messages in thread
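As a concrete illustration of the comment in the hunk above (the addresses are invented for the example), assume 64 KB kernel pages and two adjacent EFI_MEMORY_RUNTIME regions described with 4 KB granularity:

  region A: 0x80000000 - 0x80034fff
  region B: 0x80035000 - 0x8009ffff

Rounded to the 64 KB page size, A's mapping covers 0x80000000 - 0x8003ffff and B's covers 0x80030000 - 0x8009ffff, so both mappings include the page frame at 0x80030000. Mapping A down to pages guarantees that this shared frame sits under a PTE rather than under a block entry, so mapping B (possibly with different attributes) never requires splitting an entry created for A.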
* [PATCH 3/5] arm64: efi: avoid block mappings for unaligned UEFI memory regions 2016-06-29 12:51 ` [PATCH 3/5] arm64: efi: avoid block mappings for unaligned UEFI memory regions Ard Biesheuvel @ 2016-06-29 16:45 ` Catalin Marinas 2016-06-29 16:50 ` Ard Biesheuvel 0 siblings, 1 reply; 20+ messages in thread From: Catalin Marinas @ 2016-06-29 16:45 UTC (permalink / raw) To: linux-arm-kernel On Wed, Jun 29, 2016 at 02:51:28PM +0200, Ard Biesheuvel wrote: > + if (!PAGE_ALIGNED(md->phys_addr) || > + !PAGE_ALIGNED(md->num_pages << EFI_PAGE_SHIFT)) { > + /* > + * If the end address of this region is not aligned to page > + * size, the mapping is rounded up, and may end up sharing a > + * page frame with the next UEFI memory region. If we create > + * a block entry now, we may need to split it again when mapping > + * the next region, and support for that is going to be removed > + * from the MMU routines. So avoid block mappings altogether in > + * that case. > + */ > + allow_block_mappings = false; > + } How common is it for large areas to have unaligned start/end? I wonder whether it's worth implementing my approach to look ahead and explicitly check the overlap with the next section instead of disabling block mappings altogether for this region. -- Catalin ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 3/5] arm64: efi: avoid block mappings for unaligned UEFI memory regions 2016-06-29 16:45 ` Catalin Marinas @ 2016-06-29 16:50 ` Ard Biesheuvel 2016-06-29 16:53 ` Ard Biesheuvel 0 siblings, 1 reply; 20+ messages in thread From: Ard Biesheuvel @ 2016-06-29 16:50 UTC (permalink / raw) To: linux-arm-kernel On 29 June 2016 at 18:45, Catalin Marinas <catalin.marinas@arm.com> wrote: > On Wed, Jun 29, 2016 at 02:51:28PM +0200, Ard Biesheuvel wrote: >> + if (!PAGE_ALIGNED(md->phys_addr) || >> + !PAGE_ALIGNED(md->num_pages << EFI_PAGE_SHIFT)) { >> + /* >> + * If the end address of this region is not aligned to page >> + * size, the mapping is rounded up, and may end up sharing a >> + * page frame with the next UEFI memory region. If we create >> + * a block entry now, we may need to split it again when mapping >> + * the next region, and support for that is going to be removed >> + * from the MMU routines. So avoid block mappings altogether in >> + * that case. >> + */ >> + allow_block_mappings = false; >> + } > > How common is it for large areas to have unaligned start/end? I wonder > whether it's worth implementing my approach to look ahead and explicitly > check the overlap with the next section instead of disabling block > mappings altogether for this region. > Very uncommon. Typically, only MMIO regions that represent NOR flash are larger than a couple of pages. Taken from QEMU: ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 3/5] arm64: efi: avoid block mappings for unaligned UEFI memory regions 2016-06-29 16:50 ` Ard Biesheuvel @ 2016-06-29 16:53 ` Ard Biesheuvel 2016-06-29 17:00 ` Leif Lindholm 0 siblings, 1 reply; 20+ messages in thread From: Ard Biesheuvel @ 2016-06-29 16:53 UTC (permalink / raw) To: linux-arm-kernel On 29 June 2016 at 18:50, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: > On 29 June 2016 at 18:45, Catalin Marinas <catalin.marinas@arm.com> wrote: >> On Wed, Jun 29, 2016 at 02:51:28PM +0200, Ard Biesheuvel wrote: >>> + if (!PAGE_ALIGNED(md->phys_addr) || >>> + !PAGE_ALIGNED(md->num_pages << EFI_PAGE_SHIFT)) { >>> + /* >>> + * If the end address of this region is not aligned to page >>> + * size, the mapping is rounded up, and may end up sharing a >>> + * page frame with the next UEFI memory region. If we create >>> + * a block entry now, we may need to split it again when mapping >>> + * the next region, and support for that is going to be removed >>> + * from the MMU routines. So avoid block mappings altogether in >>> + * that case. >>> + */ >>> + allow_block_mappings = false; >>> + } >> >> How common is it for large areas to have unaligned start/end? I wonder >> whether it's worth implementing my approach to look ahead and explicitly >> check the overlap with the next section instead of disabling block >> mappings altogether for this region. >> > > Very uncommon. Typically, only MMIO regions that represent NOR flash > are larger than a couple of pages. Taken from QEMU: RT_Code : 640 Pages (2,621,440 Bytes) RT_Data : 880 Pages (3,604,480 Bytes) so all RT_Code regions *combined* are 2.5 MB in total, and all RT_Data regions 3.5 MB. Ideally, they are grouped together, but in reality, there are always a couple of regions of each type, so there is little to gain here from using block mappings ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 3/5] arm64: efi: avoid block mappings for unaligned UEFI memory regions 2016-06-29 16:53 ` Ard Biesheuvel @ 2016-06-29 17:00 ` Leif Lindholm 2016-06-29 17:04 ` Ard Biesheuvel 0 siblings, 1 reply; 20+ messages in thread From: Leif Lindholm @ 2016-06-29 17:00 UTC (permalink / raw) To: linux-arm-kernel On Wed, Jun 29, 2016 at 06:53:18PM +0200, Ard Biesheuvel wrote: > On 29 June 2016 at 18:50, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: > > On 29 June 2016 at 18:45, Catalin Marinas <catalin.marinas@arm.com> wrote: > >> On Wed, Jun 29, 2016 at 02:51:28PM +0200, Ard Biesheuvel wrote: > >>> + if (!PAGE_ALIGNED(md->phys_addr) || > >>> + !PAGE_ALIGNED(md->num_pages << EFI_PAGE_SHIFT)) { > >>> + /* > >>> + * If the end address of this region is not aligned to page > >>> + * size, the mapping is rounded up, and may end up sharing a > >>> + * page frame with the next UEFI memory region. If we create > >>> + * a block entry now, we may need to split it again when mapping > >>> + * the next region, and support for that is going to be removed > >>> + * from the MMU routines. So avoid block mappings altogether in > >>> + * that case. > >>> + */ > >>> + allow_block_mappings = false; > >>> + } > >> > >> How common is it for large areas to have unaligned start/end? I wonder > >> whether it's worth implementing my approach to look ahead and explicitly > >> check the overlap with the next section instead of disabling block > >> mappings altogether for this region. > >> > > > > Very uncommon. Typically, only MMIO regions that represent NOR flash > > are larger than a couple of pages. Taken from QEMU: > > RT_Code : 640 Pages (2,621,440 Bytes) > RT_Data : 880 Pages (3,604,480 Bytes) > > so all RT_Code regions *combined* are 2.5 MB in total, and all RT_Data > regions 3.5 MB. Ideally, they are grouped together, but in reality, > there are always a couple of regions of each type, so there is little > to gain here from using block mappings Is this representative for real platforms? What about efifb and reserved regions? My (x86) Lenovo workstation has one 64MB and one 16MB Runtime/Uncached MMIO region. As well as a 3MB and a 4MB RT_Data one. / Leif ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 3/5] arm64: efi: avoid block mappings for unaligned UEFI memory regions 2016-06-29 17:00 ` Leif Lindholm @ 2016-06-29 17:04 ` Ard Biesheuvel 2016-06-29 17:40 ` Ard Biesheuvel 2016-06-29 17:54 ` Leif Lindholm 0 siblings, 2 replies; 20+ messages in thread From: Ard Biesheuvel @ 2016-06-29 17:04 UTC (permalink / raw) To: linux-arm-kernel On 29 June 2016 at 19:00, Leif Lindholm <leif.lindholm@linaro.org> wrote: > On Wed, Jun 29, 2016 at 06:53:18PM +0200, Ard Biesheuvel wrote: >> On 29 June 2016 at 18:50, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: >> > On 29 June 2016 at 18:45, Catalin Marinas <catalin.marinas@arm.com> wrote: >> >> On Wed, Jun 29, 2016 at 02:51:28PM +0200, Ard Biesheuvel wrote: >> >>> + if (!PAGE_ALIGNED(md->phys_addr) || >> >>> + !PAGE_ALIGNED(md->num_pages << EFI_PAGE_SHIFT)) { >> >>> + /* >> >>> + * If the end address of this region is not aligned to page >> >>> + * size, the mapping is rounded up, and may end up sharing a >> >>> + * page frame with the next UEFI memory region. If we create >> >>> + * a block entry now, we may need to split it again when mapping >> >>> + * the next region, and support for that is going to be removed >> >>> + * from the MMU routines. So avoid block mappings altogether in >> >>> + * that case. >> >>> + */ >> >>> + allow_block_mappings = false; >> >>> + } >> >> >> >> How common is it for large areas to have unaligned start/end? I wonder >> >> whether it's worth implementing my approach to look ahead and explicitly >> >> check the overlap with the next section instead of disabling block >> >> mappings altogether for this region. >> >> >> > >> > Very uncommon. Typically, only MMIO regions that represent NOR flash >> > are larger than a couple of pages. Taken from QEMU: >> >> RT_Code : 640 Pages (2,621,440 Bytes) >> RT_Data : 880 Pages (3,604,480 Bytes) >> >> so all RT_Code regions *combined* are 2.5 MB in total, and all RT_Data >> regions 3.5 MB. Ideally, they are grouped together, but in reality, >> there are always a couple of regions of each type, so there is little >> to gain here from using block mappings > > Is this representative for real platforms? > I think it is a reasonable ballpark figure > What about efifb and reserved regions? > Those are not tagged as EFI_MEMORY_RUNTIME so they are not covered by the UEFI runtime mappings, and not relevant to this discussion. > My (x86) Lenovo workstation has one 64MB and one 16MB Runtime/Uncached > MMIO region. As well as a 3MB and a 4MB RT_Data one. > Are those MMIO regions naturally aligned? And how about the RT_Data ones? ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 3/5] arm64: efi: avoid block mappings for unaligned UEFI memory regions 2016-06-29 17:04 ` Ard Biesheuvel @ 2016-06-29 17:40 ` Ard Biesheuvel 2016-06-29 17:54 ` Leif Lindholm 1 sibling, 0 replies; 20+ messages in thread From: Ard Biesheuvel @ 2016-06-29 17:40 UTC (permalink / raw) To: linux-arm-kernel On 29 June 2016 at 19:04, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: > On 29 June 2016 at 19:00, Leif Lindholm <leif.lindholm@linaro.org> wrote: >> On Wed, Jun 29, 2016 at 06:53:18PM +0200, Ard Biesheuvel wrote: >>> On 29 June 2016 at 18:50, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: >>> > On 29 June 2016 at 18:45, Catalin Marinas <catalin.marinas@arm.com> wrote: >>> >> On Wed, Jun 29, 2016 at 02:51:28PM +0200, Ard Biesheuvel wrote: >>> >>> + if (!PAGE_ALIGNED(md->phys_addr) || >>> >>> + !PAGE_ALIGNED(md->num_pages << EFI_PAGE_SHIFT)) { >>> >>> + /* >>> >>> + * If the end address of this region is not aligned to page >>> >>> + * size, the mapping is rounded up, and may end up sharing a >>> >>> + * page frame with the next UEFI memory region. If we create >>> >>> + * a block entry now, we may need to split it again when mapping >>> >>> + * the next region, and support for that is going to be removed >>> >>> + * from the MMU routines. So avoid block mappings altogether in >>> >>> + * that case. >>> >>> + */ >>> >>> + allow_block_mappings = false; >>> >>> + } >>> >> >>> >> How common is it for large areas to have unaligned start/end? I wonder >>> >> whether it's worth implementing my approach to look ahead and explicitly >>> >> check the overlap with the next section instead of disabling block >>> >> mappings altogether for this region. >>> >> >>> > >>> > Very uncommon. Typically, only MMIO regions that represent NOR flash >>> > are larger than a couple of pages. Taken from QEMU: >>> >>> RT_Code : 640 Pages (2,621,440 Bytes) >>> RT_Data : 880 Pages (3,604,480 Bytes) >>> >>> so all RT_Code regions *combined* are 2.5 MB in total, and all RT_Data >>> regions 3.5 MB. Ideally, they are grouped together, but in reality, >>> there are always a couple of regions of each type, so there is little >>> to gain here from using block mappings >> >> Is this representative for real platforms? >> > > I think it is a reasonable ballpark figure > >> What about efifb and reserved regions? >> > > Those are not tagged as EFI_MEMORY_RUNTIME so they are not covered by > the UEFI runtime mappings, and not relevant to this discussion. > >> My (x86) Lenovo workstation has one 64MB and one 16MB Runtime/Uncached >> MMIO region. As well as a 3MB and a 4MB RT_Data one. >> > > Are those MMIO regions naturally aligned? And how about the RT_Data ones? Just to be clear: we are talking about regions that a) are larger than 2 MB b) whose naturally aligned 2 MB subregions are relatively aligned with their physical mapping c) share a 16 KB or 64 KB page frame at either end with an adjacent region So first of all, the virtual mapping code in the stub does take the 2 MB alignment into account, but only if the physical base of the region is aligned to 2 MB and the size of the region is at least 2 MB. This is intended for things like the NOR flash mapping (or the MMIO regions that Leif refers to). But a region that happens to exceed 2 MB is not mapped with the relative physical alignment in mind, and usually does not end up aligned correctly for block mappings to be usable. 
On top of that, the v2.6 Memory Attributes table feature (which I think should be recommended in all cases, but is optional in the spec) breaks RT_Code regions into read-only and non-exec slices, so even if a RT_Code region existed whose size and alignment would happen to allow a block mapping to be used, it will subsequently be split anyway. The bottom line is that I don't think it makes a huge amount of sense to go out of our way to deal with large regions with unaligned boundaries until we encounter any in real life. -- Ard. ^ permalink raw reply [flat|nested] 20+ messages in thread
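For reference, conditions (a) and (b) above amount to the usual congruence requirement: a 2 MB block entry can only be used where the virtual and physical addresses share the same offset within a naturally aligned 2 MB window, and at least one such window must fit entirely inside the region. A minimal standalone sketch of that check (the function name and the hard-coded 4 KB-granule section size are for illustration only):

  #include <stdbool.h>
  #include <stdint.h>

  #define SZ_2M   (2ull << 20)    /* PMD/section size with 4 KB granule */

  static bool block_mappable(uint64_t virt, uint64_t phys, uint64_t size)
  {
          /* virt and phys must be congruent modulo 2 MB ... */
          if ((virt ^ phys) & (SZ_2M - 1))
                  return false;

          /* ... and at least one fully aligned 2 MB window must fit */
          uint64_t first_block = (virt + SZ_2M - 1) & ~(SZ_2M - 1);

          return size >= (first_block - virt) + SZ_2M;
  }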
* [PATCH 3/5] arm64: efi: avoid block mappings for unaligned UEFI memory regions 2016-06-29 17:04 ` Ard Biesheuvel 2016-06-29 17:40 ` Ard Biesheuvel @ 2016-06-29 17:54 ` Leif Lindholm 2016-06-29 17:56 ` Ard Biesheuvel 1 sibling, 1 reply; 20+ messages in thread From: Leif Lindholm @ 2016-06-29 17:54 UTC (permalink / raw) To: linux-arm-kernel On Wed, Jun 29, 2016 at 07:04:57PM +0200, Ard Biesheuvel wrote: > On 29 June 2016 at 19:00, Leif Lindholm <leif.lindholm@linaro.org> wrote: > > On Wed, Jun 29, 2016 at 06:53:18PM +0200, Ard Biesheuvel wrote: > >> On 29 June 2016 at 18:50, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: > >> > On 29 June 2016 at 18:45, Catalin Marinas <catalin.marinas@arm.com> wrote: > >> >> On Wed, Jun 29, 2016 at 02:51:28PM +0200, Ard Biesheuvel wrote: > >> >>> + if (!PAGE_ALIGNED(md->phys_addr) || > >> >>> + !PAGE_ALIGNED(md->num_pages << EFI_PAGE_SHIFT)) { > >> >>> + /* > >> >>> + * If the end address of this region is not aligned to page > >> >>> + * size, the mapping is rounded up, and may end up sharing a > >> >>> + * page frame with the next UEFI memory region. If we create > >> >>> + * a block entry now, we may need to split it again when mapping > >> >>> + * the next region, and support for that is going to be removed > >> >>> + * from the MMU routines. So avoid block mappings altogether in > >> >>> + * that case. > >> >>> + */ > >> >>> + allow_block_mappings = false; > >> >>> + } > >> >> > >> >> How common is it for large areas to have unaligned start/end? I wonder > >> >> whether it's worth implementing my approach to look ahead and explicitly > >> >> check the overlap with the next section instead of disabling block > >> >> mappings altogether for this region. > >> >> > >> > > >> > Very uncommon. Typically, only MMIO regions that represent NOR flash > >> > are larger than a couple of pages. Taken from QEMU: > >> > >> RT_Code : 640 Pages (2,621,440 Bytes) > >> RT_Data : 880 Pages (3,604,480 Bytes) > >> > >> so all RT_Code regions *combined* are 2.5 MB in total, and all RT_Data > >> regions 3.5 MB. Ideally, they are grouped together, but in reality, > >> there are always a couple of regions of each type, so there is little > >> to gain here from using block mappings > > > > Is this representative for real platforms? > > I think it is a reasonable ballpark figure > > > What about efifb and reserved regions? > > Those are not tagged as EFI_MEMORY_RUNTIME so they are not covered by > the UEFI runtime mappings, and not relevant to this discussion. OK. > > My (x86) Lenovo workstation has one 64MB and one 16MB Runtime/Uncached > > MMIO region. As well as a 3MB and a 4MB RT_Data one. > > Are those MMIO regions naturally aligned? And how about the RT_Data ones? So, I've now gone home and don't have access to the Lenovo, however I have a machine at home also with an AMI UEFI implementation, and identical MMIO regions. And they do look naturally aligned. The RT_Data ones are not naturally aligned. / Leif ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 3/5] arm64: efi: avoid block mappings for unaligned UEFI memory regions 2016-06-29 17:54 ` Leif Lindholm @ 2016-06-29 17:56 ` Ard Biesheuvel 0 siblings, 0 replies; 20+ messages in thread From: Ard Biesheuvel @ 2016-06-29 17:56 UTC (permalink / raw) To: linux-arm-kernel On 29 June 2016 at 19:54, Leif Lindholm <leif.lindholm@linaro.org> wrote: > On Wed, Jun 29, 2016 at 07:04:57PM +0200, Ard Biesheuvel wrote: >> On 29 June 2016 at 19:00, Leif Lindholm <leif.lindholm@linaro.org> wrote: >> > On Wed, Jun 29, 2016 at 06:53:18PM +0200, Ard Biesheuvel wrote: >> >> On 29 June 2016 at 18:50, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: >> >> > On 29 June 2016 at 18:45, Catalin Marinas <catalin.marinas@arm.com> wrote: >> >> >> On Wed, Jun 29, 2016 at 02:51:28PM +0200, Ard Biesheuvel wrote: >> >> >>> + if (!PAGE_ALIGNED(md->phys_addr) || >> >> >>> + !PAGE_ALIGNED(md->num_pages << EFI_PAGE_SHIFT)) { >> >> >>> + /* >> >> >>> + * If the end address of this region is not aligned to page >> >> >>> + * size, the mapping is rounded up, and may end up sharing a >> >> >>> + * page frame with the next UEFI memory region. If we create >> >> >>> + * a block entry now, we may need to split it again when mapping >> >> >>> + * the next region, and support for that is going to be removed >> >> >>> + * from the MMU routines. So avoid block mappings altogether in >> >> >>> + * that case. >> >> >>> + */ >> >> >>> + allow_block_mappings = false; >> >> >>> + } >> >> >> >> >> >> How common is it for large areas to have unaligned start/end? I wonder >> >> >> whether it's worth implementing my approach to look ahead and explicitly >> >> >> check the overlap with the next section instead of disabling block >> >> >> mappings altogether for this region. >> >> >> >> >> > >> >> > Very uncommon. Typically, only MMIO regions that represent NOR flash >> >> > are larger than a couple of pages. Taken from QEMU: >> >> >> >> RT_Code : 640 Pages (2,621,440 Bytes) >> >> RT_Data : 880 Pages (3,604,480 Bytes) >> >> >> >> so all RT_Code regions *combined* are 2.5 MB in total, and all RT_Data >> >> regions 3.5 MB. Ideally, they are grouped together, but in reality, >> >> there are always a couple of regions of each type, so there is little >> >> to gain here from using block mappings >> > >> > Is this representative for real platforms? >> >> I think it is a reasonable ballpark figure >> >> > What about efifb and reserved regions? >> >> Those are not tagged as EFI_MEMORY_RUNTIME so they are not covered by >> the UEFI runtime mappings, and not relevant to this discussion. > > OK. > >> > My (x86) Lenovo workstation has one 64MB and one 16MB Runtime/Uncached >> > MMIO region. As well as a 3MB and a 4MB RT_Data one. >> >> Are those MMIO regions naturally aligned? And how about the RT_Data ones? > > So, I've now gone home and don't have access to the Lenovo, however I > have a machine at home also with an AMI UEFI implementation, and > identical MMIO regions. And they do look naturally aligned. > > The RT_Data ones are not naturally aligned. > OK, that confirms my suspicion. See other email ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 4/5] arm64: mm: Remove split_p*d() functions 2016-06-29 12:51 [PATCH 0/5] arm64: avoid block entries that we need to split later Ard Biesheuvel ` (2 preceding siblings ...) 2016-06-29 12:51 ` [PATCH 3/5] arm64: efi: avoid block mappings for unaligned UEFI memory regions Ard Biesheuvel @ 2016-06-29 12:51 ` Ard Biesheuvel 2016-06-29 12:51 ` [PATCH 5/5] arm64: mm: fold init_pgd() into __create_pgd_mapping() Ard Biesheuvel 2016-06-30 16:13 ` [PATCH 0/5] arm64: avoid block entries that we need to split later Catalin Marinas 5 siblings, 0 replies; 20+ messages in thread From: Ard Biesheuvel @ 2016-06-29 12:51 UTC (permalink / raw) To: linux-arm-kernel From: Catalin Marinas <catalin.marinas@arm.com> Since the efi_create_mapping() no longer generates block mappings and being the last user of the split_p*d code, remove these functions and the corresponding TLBI. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> [ardb: replace 'overlapping regions' with 'block mappings' in commit log] Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> --- arch/arm64/mm/mmu.c | 47 ++------------------ 1 file changed, 4 insertions(+), 43 deletions(-) diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index a86d1acb3a7b..f233c885042d 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -97,24 +97,6 @@ static phys_addr_t __init early_pgtable_alloc(void) return phys; } -/* - * remap a PMD into pages - */ -static void split_pmd(pmd_t *pmd, pte_t *pte) -{ - unsigned long pfn = pmd_pfn(*pmd); - int i = 0; - - do { - /* - * Need to have the least restrictive permissions available - * permissions will be fixed up later - */ - set_pte(pte, pfn_pte(pfn, PAGE_KERNEL_EXEC)); - pfn++; - } while (pte++, i++, i < PTRS_PER_PTE); -} - static void alloc_init_pte(pmd_t *pmd, unsigned long addr, unsigned long end, unsigned long pfn, pgprot_t prot, @@ -122,15 +104,13 @@ static void alloc_init_pte(pmd_t *pmd, unsigned long addr, { pte_t *pte; - if (pmd_none(*pmd) || pmd_sect(*pmd)) { + BUG_ON(pmd_sect(*pmd)); + if (pmd_none(*pmd)) { phys_addr_t pte_phys; BUG_ON(!pgtable_alloc); pte_phys = pgtable_alloc(); pte = pte_set_fixmap(pte_phys); - if (pmd_sect(*pmd)) - split_pmd(pmd, pte); __pmd_populate(pmd, pte_phys, PMD_TYPE_TABLE); - flush_tlb_all(); pte_clear_fixmap(); } BUG_ON(pmd_bad(*pmd)); @@ -144,18 +124,6 @@ static void alloc_init_pte(pmd_t *pmd, unsigned long addr, pte_clear_fixmap(); } -static void split_pud(pud_t *old_pud, pmd_t *pmd) -{ - unsigned long addr = pud_pfn(*old_pud) << PAGE_SHIFT; - pgprot_t prot = __pgprot(pud_val(*old_pud) ^ addr); - int i = 0; - - do { - set_pmd(pmd, __pmd(addr | pgprot_val(prot))); - addr += PMD_SIZE; - } while (pmd++, i++, i < PTRS_PER_PMD); -} - static void alloc_init_pmd(pud_t *pud, unsigned long addr, unsigned long end, phys_addr_t phys, pgprot_t prot, phys_addr_t (*pgtable_alloc)(void), @@ -167,20 +135,13 @@ static void alloc_init_pmd(pud_t *pud, unsigned long addr, unsigned long end, /* * Check for initial section mappings in the pgd/pud and remove them. */ - if (pud_none(*pud) || pud_sect(*pud)) { + BUG_ON(pud_sect(*pud)); + if (pud_none(*pud)) { phys_addr_t pmd_phys; BUG_ON(!pgtable_alloc); pmd_phys = pgtable_alloc(); pmd = pmd_set_fixmap(pmd_phys); - if (pud_sect(*pud)) { - /* - * need to have the 1G of mappings continue to be - * present - */ - split_pud(pud, pmd); - } __pud_populate(pud, pmd_phys, PUD_TYPE_TABLE); - flush_tlb_all(); pmd_clear_fixmap(); } BUG_ON(pud_bad(*pud)); -- 2.7.4 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 5/5] arm64: mm: fold init_pgd() into __create_pgd_mapping() 2016-06-29 12:51 [PATCH 0/5] arm64: avoid block entries that we need to split later Ard Biesheuvel ` (3 preceding siblings ...) 2016-06-29 12:51 ` [PATCH 4/5] arm64: mm: Remove split_p*d() functions Ard Biesheuvel @ 2016-06-29 12:51 ` Ard Biesheuvel 2016-06-30 16:13 ` [PATCH 0/5] arm64: avoid block entries that we need to split later Catalin Marinas 5 siblings, 0 replies; 20+ messages in thread From: Ard Biesheuvel @ 2016-06-29 12:51 UTC (permalink / raw) To: linux-arm-kernel The routine __create_pgd_mapping() does nothing except calling init_pgd(), which has no other callers. So fold the latter into the former. Also, drop a comment that has gone stale. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> --- arch/arm64/mm/mmu.c | 24 +++++--------------- 1 file changed, 6 insertions(+), 18 deletions(-) diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index f233c885042d..8cf4da36909a 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -240,16 +240,14 @@ static void alloc_init_pud(pgd_t *pgd, unsigned long addr, unsigned long end, pud_clear_fixmap(); } -/* - * Create the page directory entries and any necessary page tables for the - * mapping specified by 'md'. - */ -static void init_pgd(pgd_t *pgd, phys_addr_t phys, unsigned long virt, - phys_addr_t size, pgprot_t prot, - phys_addr_t (*pgtable_alloc)(void), - bool allow_block_mappings) +static void __create_pgd_mapping(pgd_t *pgdir, phys_addr_t phys, + unsigned long virt, phys_addr_t size, + pgprot_t prot, + phys_addr_t (*pgtable_alloc)(void), + bool allow_block_mappings) { unsigned long addr, length, end, next; + pgd_t *pgd = pgd_offset_raw(pgdir, virt); /* * If the virtual and physical address don't have the same offset @@ -281,16 +279,6 @@ static phys_addr_t late_pgtable_alloc(void) return __pa(ptr); } -static void __create_pgd_mapping(pgd_t *pgdir, phys_addr_t phys, - unsigned long virt, phys_addr_t size, - pgprot_t prot, - phys_addr_t (*alloc)(void), - bool allow_block_mappings) -{ - init_pgd(pgd_offset_raw(pgdir, virt), phys, virt, size, prot, alloc, - allow_block_mappings); -} - /* * This function can only be used to modify existing table entries, * without allocating new levels of table. Note that this permits the -- 2.7.4 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 0/5] arm64: avoid block entries that we need to split later 2016-06-29 12:51 [PATCH 0/5] arm64: avoid block entries that we need to split later Ard Biesheuvel ` (4 preceding siblings ...) 2016-06-29 12:51 ` [PATCH 5/5] arm64: mm: fold init_pgd() into __create_pgd_mapping() Ard Biesheuvel @ 2016-06-30 16:13 ` Catalin Marinas 2016-06-30 16:16 ` Ard Biesheuvel 5 siblings, 1 reply; 20+ messages in thread From: Catalin Marinas @ 2016-06-30 16:13 UTC (permalink / raw) To: linux-arm-kernel On Wed, Jun 29, 2016 at 02:51:25PM +0200, Ard Biesheuvel wrote: > Ard Biesheuvel (4): > arm64: mm: add param to force create_pgd_mapping() to use page > mappings > arm64: efi: always map runtime services code and data regions down to > pages > arm64: efi: avoid block mappings for unaligned UEFI memory regions > arm64: mm: fold init_pgd() into __create_pgd_mapping() > > Catalin Marinas (1): > arm64: mm: Remove split_p*d() functions I plan to queue these patches for 4.8. Do you have any other patches based on these so that I need to create a separate branch? Otherwise, I'll just push them on top of arm64 for-next/core branch. Thanks. -- Catalin ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 0/5] arm64: avoid block entries that we need to split later 2016-06-30 16:13 ` [PATCH 0/5] arm64: avoid block entries that we need to split later Catalin Marinas @ 2016-06-30 16:16 ` Ard Biesheuvel 0 siblings, 0 replies; 20+ messages in thread From: Ard Biesheuvel @ 2016-06-30 16:16 UTC (permalink / raw) To: linux-arm-kernel On 30 June 2016 at 18:13, Catalin Marinas <catalin.marinas@arm.com> wrote: > On Wed, Jun 29, 2016 at 02:51:25PM +0200, Ard Biesheuvel wrote: >> Ard Biesheuvel (4): >> arm64: mm: add param to force create_pgd_mapping() to use page >> mappings >> arm64: efi: always map runtime services code and data regions down to >> pages >> arm64: efi: avoid block mappings for unaligned UEFI memory regions >> arm64: mm: fold init_pgd() into __create_pgd_mapping() >> >> Catalin Marinas (1): >> arm64: mm: Remove split_p*d() functions > > I plan to queue these patches for 4.8. Do you have any other patches > based on these so that I need to create a separate branch? Otherwise, > I'll just push them on top of arm64 for-next/core branch. > No, they can go right on top of arm64/for-next/core AFAICT ^ permalink raw reply [flat|nested] 20+ messages in thread