* [PATCH] efi/arm64: efistub: don't abort if base of DRAM is occupied @ 2014-07-14 15:25 Ard Biesheuvel 2014-07-14 15:27 ` Ard Biesheuvel 2014-07-14 18:40 ` Mark Salter 0 siblings, 2 replies; 17+ messages in thread From: Ard Biesheuvel @ 2014-07-14 15:25 UTC (permalink / raw) To: linux-arm-kernel If we fail to relocate the kernel Image to its preferred offset of TEXT_OFFSET bytes above the base of DRAM, accept the lowest alternative mapping available instead of aborting. We may lose a bit of memory at the low end, but we can still proceed normally otherwise. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> --- This is a proposed bug fix for arm64 platforms that fail to boot through EFI due to the fact that some bits of EFI itself are occupying the low end of DRAM. Note that this code now triggers an 'unused function' warning for efi_relocate_kernel(), as that is no longer used. This warning will disappear automatically once the already queued up EFISTUB refactoring patches will get merged for 3.17. arch/arm64/kernel/efi-stub.c | 18 ++++++------------ 1 file changed, 6 insertions(+), 12 deletions(-) diff --git a/arch/arm64/kernel/efi-stub.c b/arch/arm64/kernel/efi-stub.c index 60e98a639ac5..5165b3accefe 100644 --- a/arch/arm64/kernel/efi-stub.c +++ b/arch/arm64/kernel/efi-stub.c @@ -60,20 +60,16 @@ static efi_status_t handle_kernel_image(efi_system_table_t *sys_table, kernel_size = _edata - _text; if (*image_addr != (dram_base + TEXT_OFFSET)) { kernel_memsize = kernel_size + (_end - _edata); - status = efi_relocate_kernel(sys_table, image_addr, - kernel_size, kernel_memsize, - dram_base + TEXT_OFFSET, - PAGE_SIZE); + status = efi_low_alloc(sys_table, kernel_memsize + TEXT_OFFSET, + SZ_2M, reserve_addr); if (status != EFI_SUCCESS) { pr_efi_err(sys_table, "Failed to relocate kernel\n"); return status; } - if (*image_addr != (dram_base + TEXT_OFFSET)) { - pr_efi_err(sys_table, "Failed to alloc kernel memory\n"); - efi_free(sys_table, kernel_memsize, *image_addr); - return EFI_ERROR; - } - *image_size = kernel_memsize; + memcpy((void *)*reserve_addr + TEXT_OFFSET, (void *)*image_addr, + kernel_size); + *image_addr = *reserve_addr + TEXT_OFFSET; + *reserve_size = kernel_memsize; } -- 1.8.3.2 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH] efi/arm64: efistub: don't abort if base of DRAM is occupied 2014-07-14 15:25 [PATCH] efi/arm64: efistub: don't abort if base of DRAM is occupied Ard Biesheuvel @ 2014-07-14 15:27 ` Ard Biesheuvel 2014-07-14 18:40 ` Mark Salter 1 sibling, 0 replies; 17+ messages in thread From: Ard Biesheuvel @ 2014-07-14 15:27 UTC (permalink / raw) To: linux-arm-kernel On 14 July 2014 17:25, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: > If we fail to relocate the kernel Image to its preferred offset of TEXT_OFFSET > bytes above the base of DRAM, accept the lowest alternative mapping available > instead of aborting. We may lose a bit of memory at the low end, but we can > still proceed normally otherwise. > > Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> > --- > This is a proposed bug fix for arm64 platforms that fail to boot through EFI > due to the fact that some bits of EFI itself are occupying the low end of DRAM. > > Note that this code now triggers an 'unused function' warning for > efi_relocate_kernel(), as that is no longer used. This warning will disappear > automatically once the already queued up EFISTUB refactoring patches will > get merged for 3.17. > > arch/arm64/kernel/efi-stub.c | 18 ++++++------------ > 1 file changed, 6 insertions(+), 12 deletions(-) > > diff --git a/arch/arm64/kernel/efi-stub.c b/arch/arm64/kernel/efi-stub.c > index 60e98a639ac5..5165b3accefe 100644 > --- a/arch/arm64/kernel/efi-stub.c > +++ b/arch/arm64/kernel/efi-stub.c > @@ -60,20 +60,16 @@ static efi_status_t handle_kernel_image(efi_system_table_t *sys_table, > kernel_size = _edata - _text; > if (*image_addr != (dram_base + TEXT_OFFSET)) { > kernel_memsize = kernel_size + (_end - _edata); > - status = efi_relocate_kernel(sys_table, image_addr, > - kernel_size, kernel_memsize, > - dram_base + TEXT_OFFSET, > - PAGE_SIZE); > + status = efi_low_alloc(sys_table, kernel_memsize + TEXT_OFFSET, > + SZ_2M, reserve_addr); > if (status != EFI_SUCCESS) { > pr_efi_err(sys_table, "Failed to relocate kernel\n"); > return status; > } > - if (*image_addr != (dram_base + TEXT_OFFSET)) { > - pr_efi_err(sys_table, "Failed to alloc kernel memory\n"); > - efi_free(sys_table, kernel_memsize, *image_addr); > - return EFI_ERROR; > - } > - *image_size = kernel_memsize; > + memcpy((void *)*reserve_addr + TEXT_OFFSET, (void *)*image_addr, > + kernel_size); > + *image_addr = *reserve_addr + TEXT_OFFSET; > + *reserve_size = kernel_memsize; This still needs a '+ TEXT_OFFSET' btw ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH] efi/arm64: efistub: don't abort if base of DRAM is occupied 2014-07-14 15:25 [PATCH] efi/arm64: efistub: don't abort if base of DRAM is occupied Ard Biesheuvel 2014-07-14 15:27 ` Ard Biesheuvel @ 2014-07-14 18:40 ` Mark Salter 2014-07-15 10:02 ` Leif Lindholm 2014-07-15 11:00 ` Mark Rutland 1 sibling, 2 replies; 17+ messages in thread From: Mark Salter @ 2014-07-14 18:40 UTC (permalink / raw) To: linux-arm-kernel On Mon, 2014-07-14 at 17:25 +0200, Ard Biesheuvel wrote: > If we fail to relocate the kernel Image to its preferred offset of TEXT_OFFSET > bytes above the base of DRAM, accept the lowest alternative mapping available > instead of aborting. We may lose a bit of memory at the low end, but we can > still proceed normally otherwise. This breaks APM Mustang because the spin-table holding pen for secondary CPUs is marked as reserved memory in the TEXT_OFFSET area and the kernel placement using your patch makes it unreachable by kernel. Here is a patch I've been working with to solve the same problem: From: Mark Salter <msalter@redhat.com> Date: Thu, 10 Jul 2014 09:25:30 -0400 Subject: [PATCH] arm64/efi: try to handle firmware located below kernel The rule for arm64 kernel image placement is that it must be located TEXT_OFFSET bytes past a 2MiB boundary. The kernel itself will use the TEXT_OFFSET sized area for initial page tables but that area is not part of the kernel image itself. The current EFI stub code finds the base of DRAM from the EFI memmap and relocates the kernel to dram_base+TEXT_OFFSET. This assumes that the low memory is not being used and the kernel relocation simply fails if the base memory allocation fails. At least one vendor has firmware which occupies memory near dram_base so kernel relocations always fail. This patch attempts to work with such firmware by searching the EFI memmap for the lowest available memory which may be used for the kernel image. There are several pitfalls remaining which may lead to boot failure: * The stub does not allocate the TEXT_OFFSET region, so it is required that the firmware not be using that area for anything which may interfere or overlap with the initial kernel page tables. We can't simply include that area in our search for available memory because firmware using the spin-table method for booting secondary CPUs may place the CPU pen in an out of the way part of that region and mark it as reserved memory. * The current code requires FDT to be placed within first 512MiB of DRAM (with the kernel below it). This requirement can be removed in the future, but would involve changes to generic stub code shared by other architectures. Signed-off-by: Mark Salter <msalter@redhat.com> --- arch/arm64/kernel/efi-stub.c | 45 +++++++++++++++++++++++++++++++++++++------- arch/arm64/kernel/efi.c | 19 ++++++++++++++++--- 2 files changed, 54 insertions(+), 10 deletions(-) diff --git a/arch/arm64/kernel/efi-stub.c b/arch/arm64/kernel/efi-stub.c index 60e98a63..f5da27f 100644 --- a/arch/arm64/kernel/efi-stub.c +++ b/arch/arm64/kernel/efi-stub.c @@ -54,21 +54,53 @@ static efi_status_t handle_kernel_image(efi_system_table_t *sys_table, efi_loaded_image_t *image) { efi_status_t status; - unsigned long kernel_size, kernel_memsize = 0; + unsigned long kernel_size, kernel_memsize; + unsigned long desired_base = dram_base + TEXT_OFFSET; + unsigned long desired_end; + unsigned long map_size; + struct efi_memory_map map; + efi_memory_desc_t *md; /* Relocate the image, if required. */ kernel_size = _edata - _text; - if (*image_addr != (dram_base + TEXT_OFFSET)) { - kernel_memsize = kernel_size + (_end - _edata); + kernel_memsize = kernel_size + (_end - _edata); + + desired_end = desired_base + kernel_size; + + /* find lowest available address for kernel to live */ + status = efi_get_memory_map(sys_table, (efi_memory_desc_t **)&map.map, + &map_size, &map.desc_size, NULL, NULL); + if (status == EFI_SUCCESS) { + map.map_end = map.map + map_size; + for_each_efi_memory_desc(&map, md) { + unsigned long start, end, offset; + if (!(md->attribute & EFI_MEMORY_WB)) + continue; + if (md->type != EFI_CONVENTIONAL_MEMORY) + continue; + start = md->phys_addr; + end = start + (md->num_pages << EFI_PAGE_SHIFT); + offset = start & (SZ_2M - 1); + if (offset < TEXT_OFFSET) + start += (TEXT_OFFSET - offset); + else if (offset > TEXT_OFFSET) + start = ALIGN(start, SZ_2M) + TEXT_OFFSET; + if (start < end && (start + kernel_memsize) <= end) { + desired_base = start; + break; + } + } + } + + if (*image_addr != desired_base) { status = efi_relocate_kernel(sys_table, image_addr, kernel_size, kernel_memsize, - dram_base + TEXT_OFFSET, - PAGE_SIZE); + desired_base, PAGE_SIZE); if (status != EFI_SUCCESS) { pr_efi_err(sys_table, "Failed to relocate kernel\n"); return status; } - if (*image_addr != (dram_base + TEXT_OFFSET)) { + if (*image_addr != desired_base) { pr_efi_err(sys_table, "Failed to alloc kernel memory\n"); efi_free(sys_table, kernel_memsize, *image_addr); return EFI_ERROR; @@ -76,6 +108,5 @@ static efi_status_t handle_kernel_image(efi_system_table_t *sys_table, *image_size = kernel_memsize; } - return EFI_SUCCESS; } diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c index 453b7f8..2bc6469 100644 --- a/arch/arm64/kernel/efi.c +++ b/arch/arm64/kernel/efi.c @@ -180,9 +180,18 @@ static __init void reserve_regions(void) if (is_reserve_region(md) || md->type == EFI_BOOT_SERVICES_CODE || md->type == EFI_BOOT_SERVICES_DATA) { - memblock_reserve(paddr, size); - if (uefi_debug) - pr_cont("*"); + if (paddr < PHYS_OFFSET) { + if ((paddr + size) > PHYS_OFFSET) { + size -= (PHYS_OFFSET - paddr); + memblock_reserve(PHYS_OFFSET, size); + if (uefi_debug) + pr_cont("**"); + } + } else { + memblock_reserve(paddr, size); + if (uefi_debug) + pr_cont("*"); + } } if (uefi_debug) @@ -273,6 +282,10 @@ static void __init free_boot_services(void) continue; } + /* Don't free anything below kernel */ + if (md->phys_addr < PHYS_OFFSET) + continue; + /* * We want to free memory from this region. */ -- 1.8.3.1 > > Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> > --- > This is a proposed bug fix for arm64 platforms that fail to boot through EFI > due to the fact that some bits of EFI itself are occupying the low end of DRAM. > > Note that this code now triggers an 'unused function' warning for > efi_relocate_kernel(), as that is no longer used. This warning will disappear > automatically once the already queued up EFISTUB refactoring patches will > get merged for 3.17. > > arch/arm64/kernel/efi-stub.c | 18 ++++++------------ > 1 file changed, 6 insertions(+), 12 deletions(-) > > diff --git a/arch/arm64/kernel/efi-stub.c b/arch/arm64/kernel/efi-stub.c > index 60e98a639ac5..5165b3accefe 100644 > --- a/arch/arm64/kernel/efi-stub.c > +++ b/arch/arm64/kernel/efi-stub.c > @@ -60,20 +60,16 @@ static efi_status_t handle_kernel_image(efi_system_table_t *sys_table, > kernel_size = _edata - _text; > if (*image_addr != (dram_base + TEXT_OFFSET)) { > kernel_memsize = kernel_size + (_end - _edata); > - status = efi_relocate_kernel(sys_table, image_addr, > - kernel_size, kernel_memsize, > - dram_base + TEXT_OFFSET, > - PAGE_SIZE); > + status = efi_low_alloc(sys_table, kernel_memsize + TEXT_OFFSET, > + SZ_2M, reserve_addr); > if (status != EFI_SUCCESS) { > pr_efi_err(sys_table, "Failed to relocate kernel\n"); > return status; > } > - if (*image_addr != (dram_base + TEXT_OFFSET)) { > - pr_efi_err(sys_table, "Failed to alloc kernel memory\n"); > - efi_free(sys_table, kernel_memsize, *image_addr); > - return EFI_ERROR; > - } > - *image_size = kernel_memsize; > + memcpy((void *)*reserve_addr + TEXT_OFFSET, (void *)*image_addr, > + kernel_size); > + *image_addr = *reserve_addr + TEXT_OFFSET; > + *reserve_size = kernel_memsize; > } > > ^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH] efi/arm64: efistub: don't abort if base of DRAM is occupied 2014-07-14 18:40 ` Mark Salter @ 2014-07-15 10:02 ` Leif Lindholm 2014-07-15 11:05 ` Mark Rutland 2014-07-15 13:11 ` Mark Salter 2014-07-15 11:00 ` Mark Rutland 1 sibling, 2 replies; 17+ messages in thread From: Leif Lindholm @ 2014-07-15 10:02 UTC (permalink / raw) To: linux-arm-kernel On Mon, Jul 14, 2014 at 02:40:48PM -0400, Mark Salter wrote: > On Mon, 2014-07-14 at 17:25 +0200, Ard Biesheuvel wrote: > > If we fail to relocate the kernel Image to its preferred offset of TEXT_OFFSET > > bytes above the base of DRAM, accept the lowest alternative mapping available > > instead of aborting. We may lose a bit of memory at the low end, but we can > > still proceed normally otherwise. > > This breaks APM Mustang because the spin-table holding pen for secondary > CPUs is marked as reserved memory in the TEXT_OFFSET area and the kernel > placement using your patch makes it unreachable by kernel. Here is a > patch I've been working with to solve the same problem: Hmm. The thing I don't like about the below approach is that it hard wires in the "memory below TEXT_OFFSET cannot be used" aspect, beyond the current prectical limitation. Since we are likely to see platforms with UEFI memory in use around start of RAM, that is a limitation we should probably try to get rid of. > From: Mark Salter <msalter@redhat.com> > Date: Thu, 10 Jul 2014 09:25:30 -0400 > Subject: [PATCH] arm64/efi: try to handle firmware located below kernel > > The rule for arm64 kernel image placement is that it must be located > TEXT_OFFSET bytes past a 2MiB boundary. The kernel itself will use the > TEXT_OFFSET sized area for initial page tables but that area is not > part of the kernel image itself. > > The current EFI stub code finds the base of DRAM from the EFI memmap > and relocates the kernel to dram_base+TEXT_OFFSET. This assumes that > the low memory is not being used and the kernel relocation simply > fails if the base memory allocation fails. > > At least one vendor has firmware which occupies memory near dram_base > so kernel relocations always fail. This patch attempts to work with > such firmware by searching the EFI memmap for the lowest available > memory which may be used for the kernel image. There are several > pitfalls remaining which may lead to boot failure: > > * The stub does not allocate the TEXT_OFFSET region, so it is > required that the firmware not be using that area for anything > which may interfere or overlap with the initial kernel page > tables. We can't simply include that area in our search for > available memory because firmware using the spin-table method > for booting secondary CPUs may place the CPU pen in an out of > the way part of that region and mark it as reserved memory. > > * The current code requires FDT to be placed within first 512MiB > of DRAM (with the kernel below it). This requirement can be > removed in the future, but would involve changes to generic > stub code shared by other architectures. > > Signed-off-by: Mark Salter <msalter@redhat.com> > --- > arch/arm64/kernel/efi-stub.c | 45 +++++++++++++++++++++++++++++++++++++------- > arch/arm64/kernel/efi.c | 19 ++++++++++++++++--- > 2 files changed, 54 insertions(+), 10 deletions(-) > > diff --git a/arch/arm64/kernel/efi-stub.c b/arch/arm64/kernel/efi-stub.c > index 60e98a63..f5da27f 100644 > --- a/arch/arm64/kernel/efi-stub.c > +++ b/arch/arm64/kernel/efi-stub.c > @@ -54,21 +54,53 @@ static efi_status_t handle_kernel_image(efi_system_table_t *sys_table, > efi_loaded_image_t *image) > { > efi_status_t status; > - unsigned long kernel_size, kernel_memsize = 0; > + unsigned long kernel_size, kernel_memsize; > + unsigned long desired_base = dram_base + TEXT_OFFSET; > + unsigned long desired_end; > + unsigned long map_size; > + struct efi_memory_map map; > + efi_memory_desc_t *md; > > /* Relocate the image, if required. */ > kernel_size = _edata - _text; > - if (*image_addr != (dram_base + TEXT_OFFSET)) { > - kernel_memsize = kernel_size + (_end - _edata); > + kernel_memsize = kernel_size + (_end - _edata); > + > + desired_end = desired_base + kernel_size; > + > + /* find lowest available address for kernel to live */ > + status = efi_get_memory_map(sys_table, (efi_memory_desc_t **)&map.map, > + &map_size, &map.desc_size, NULL, NULL); > + if (status == EFI_SUCCESS) { > + map.map_end = map.map + map_size; > + for_each_efi_memory_desc(&map, md) { > + unsigned long start, end, offset; > + if (!(md->attribute & EFI_MEMORY_WB)) > + continue; > + if (md->type != EFI_CONVENTIONAL_MEMORY) > + continue; > + start = md->phys_addr; > + end = start + (md->num_pages << EFI_PAGE_SHIFT); > + offset = start & (SZ_2M - 1); > + if (offset < TEXT_OFFSET) > + start += (TEXT_OFFSET - offset); > + else if (offset > TEXT_OFFSET) > + start = ALIGN(start, SZ_2M) + TEXT_OFFSET; > + if (start < end && (start + kernel_memsize) <= end) { > + desired_base = start; > + break; > + } > + } > + } I think this would be useful as a helper function - efi_alloc_above() or something. > + > + if (*image_addr != desired_base) { > status = efi_relocate_kernel(sys_table, image_addr, > kernel_size, kernel_memsize, > - dram_base + TEXT_OFFSET, > - PAGE_SIZE); > + desired_base, PAGE_SIZE); > if (status != EFI_SUCCESS) { > pr_efi_err(sys_table, "Failed to relocate kernel\n"); > return status; > } > - if (*image_addr != (dram_base + TEXT_OFFSET)) { > + if (*image_addr != desired_base) { > pr_efi_err(sys_table, "Failed to alloc kernel memory\n"); > efi_free(sys_table, kernel_memsize, *image_addr); > return EFI_ERROR; > @@ -76,6 +108,5 @@ static efi_status_t handle_kernel_image(efi_system_table_t *sys_table, > *image_size = kernel_memsize; > } > > - > return EFI_SUCCESS; > } > diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c > index 453b7f8..2bc6469 100644 > --- a/arch/arm64/kernel/efi.c > +++ b/arch/arm64/kernel/efi.c > @@ -180,9 +180,18 @@ static __init void reserve_regions(void) > if (is_reserve_region(md) || > md->type == EFI_BOOT_SERVICES_CODE || > md->type == EFI_BOOT_SERVICES_DATA) { > - memblock_reserve(paddr, size); > - if (uefi_debug) > - pr_cont("*"); > + if (paddr < PHYS_OFFSET) { > + if ((paddr + size) > PHYS_OFFSET) { > + size -= (PHYS_OFFSET - paddr); > + memblock_reserve(PHYS_OFFSET, size); > + if (uefi_debug) > + pr_cont("**"); > + } > + } else { > + memblock_reserve(paddr, size); > + if (uefi_debug) > + pr_cont("*"); > + } > } > > if (uefi_debug) > @@ -273,6 +282,10 @@ static void __init free_boot_services(void) > continue; > } > > + /* Don't free anything below kernel */ > + if (md->phys_addr < PHYS_OFFSET) > + continue; > + Is the spin table area really allocated as BOOT_SERVICES_*? / Leif ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH] efi/arm64: efistub: don't abort if base of DRAM is occupied 2014-07-15 10:02 ` Leif Lindholm @ 2014-07-15 11:05 ` Mark Rutland 2014-07-15 13:11 ` Mark Salter 1 sibling, 0 replies; 17+ messages in thread From: Mark Rutland @ 2014-07-15 11:05 UTC (permalink / raw) To: linux-arm-kernel On Tue, Jul 15, 2014 at 11:02:22AM +0100, Leif Lindholm wrote: > On Mon, Jul 14, 2014 at 02:40:48PM -0400, Mark Salter wrote: > > On Mon, 2014-07-14 at 17:25 +0200, Ard Biesheuvel wrote: > > > If we fail to relocate the kernel Image to its preferred offset of TEXT_OFFSET > > > bytes above the base of DRAM, accept the lowest alternative mapping available > > > instead of aborting. We may lose a bit of memory at the low end, but we can > > > still proceed normally otherwise. > > > > This breaks APM Mustang because the spin-table holding pen for secondary > > CPUs is marked as reserved memory in the TEXT_OFFSET area and the kernel > > placement using your patch makes it unreachable by kernel. Here is a > > patch I've been working with to solve the same problem: > > Hmm. The thing I don't like about the below approach is that it hard > wires in the "memory below TEXT_OFFSET cannot be used" aspect, beyond > the current prectical limitation. > > Since we are likely to see platforms with UEFI memory in use around > start of RAM, that is a limitation we should probably try to get rid of. This isn't just an issue for UEFI. There are other reasons one might want to load a kernel away from the start of RAM while still wanting to address said RAM(e.g. kdump). We should address that. [...] > > @@ -273,6 +282,10 @@ static void __init free_boot_services(void) > > continue; > > } > > > > + /* Don't free anything below kernel */ > > + if (md->phys_addr < PHYS_OFFSET) > > + continue; > > + > > Is the spin table area really allocated as BOOT_SERVICES_*? If that is the case, this platform is _broken_. The spin-table memory (both the code and the mailboxes) needs to live around forever in case you don't boot all of the secondaries. Thanks, Mark. ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH] efi/arm64: efistub: don't abort if base of DRAM is occupied 2014-07-15 10:02 ` Leif Lindholm 2014-07-15 11:05 ` Mark Rutland @ 2014-07-15 13:11 ` Mark Salter 2014-07-15 13:54 ` Leif Lindholm 1 sibling, 1 reply; 17+ messages in thread From: Mark Salter @ 2014-07-15 13:11 UTC (permalink / raw) To: linux-arm-kernel On Tue, 2014-07-15 at 11:02 +0100, Leif Lindholm wrote: > > @@ -273,6 +282,10 @@ static void __init free_boot_services(void) > > continue; > > } > > > > + /* Don't free anything below kernel */ > > + if (md->phys_addr < PHYS_OFFSET) > > + continue; > > + > > Is the spin table area really allocated as BOOT_SERVICES_*? > No. It is EFI_RESERVED_TYPE. But if UEFI is allowed below the kernel, then there could be BS code/data memory which we'd want to ignore. ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH] efi/arm64: efistub: don't abort if base of DRAM is occupied 2014-07-15 13:11 ` Mark Salter @ 2014-07-15 13:54 ` Leif Lindholm 2014-07-15 14:23 ` Mark Salter 0 siblings, 1 reply; 17+ messages in thread From: Leif Lindholm @ 2014-07-15 13:54 UTC (permalink / raw) To: linux-arm-kernel On Tue, Jul 15, 2014 at 09:11:00AM -0400, Mark Salter wrote: > On Tue, 2014-07-15 at 11:02 +0100, Leif Lindholm wrote: > > > @@ -273,6 +282,10 @@ static void __init free_boot_services(void) > > > continue; > > > } > > > > > > + /* Don't free anything below kernel */ > > > + if (md->phys_addr < PHYS_OFFSET) > > > + continue; > > > + > > > > Is the spin table area really allocated as BOOT_SERVICES_*? > > No. It is EFI_RESERVED_TYPE. But if UEFI is allowed below the kernel, > then there could be BS code/data memory which we'd want to ignore. Well, if it is boot service code/data - then there is no need for us to keep it around after ExitBootServices(). / Leif ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH] efi/arm64: efistub: don't abort if base of DRAM is occupied 2014-07-15 13:54 ` Leif Lindholm @ 2014-07-15 14:23 ` Mark Salter 2014-07-15 14:31 ` Mark Rutland 2014-07-15 14:49 ` Leif Lindholm 0 siblings, 2 replies; 17+ messages in thread From: Mark Salter @ 2014-07-15 14:23 UTC (permalink / raw) To: linux-arm-kernel On Tue, 2014-07-15 at 14:54 +0100, Leif Lindholm wrote: > On Tue, Jul 15, 2014 at 09:11:00AM -0400, Mark Salter wrote: > > On Tue, 2014-07-15 at 11:02 +0100, Leif Lindholm wrote: > > > > @@ -273,6 +282,10 @@ static void __init free_boot_services(void) > > > > continue; > > > > } > > > > > > > > + /* Don't free anything below kernel */ > > > > + if (md->phys_addr < PHYS_OFFSET) > > > > + continue; > > > > + > > > > > > Is the spin table area really allocated as BOOT_SERVICES_*? > > > > No. It is EFI_RESERVED_TYPE. But if UEFI is allowed below the kernel, > > then there could be BS code/data memory which we'd want to ignore. > > Well, if it is boot service code/data - then there is no need for us > to keep it around after ExitBootServices(). > > / > Leif One would think, but EFI has proven to be less than strictly compliant in that regard in the past. I'm inclined to keep the boot services around until after SetVirtualAddressMap just in case. ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH] efi/arm64: efistub: don't abort if base of DRAM is occupied 2014-07-15 14:23 ` Mark Salter @ 2014-07-15 14:31 ` Mark Rutland 2014-07-15 14:49 ` Leif Lindholm 1 sibling, 0 replies; 17+ messages in thread From: Mark Rutland @ 2014-07-15 14:31 UTC (permalink / raw) To: linux-arm-kernel On Tue, Jul 15, 2014 at 03:23:26PM +0100, Mark Salter wrote: > On Tue, 2014-07-15 at 14:54 +0100, Leif Lindholm wrote: > > On Tue, Jul 15, 2014 at 09:11:00AM -0400, Mark Salter wrote: > > > On Tue, 2014-07-15 at 11:02 +0100, Leif Lindholm wrote: > > > > > @@ -273,6 +282,10 @@ static void __init free_boot_services(void) > > > > > continue; > > > > > } > > > > > > > > > > + /* Don't free anything below kernel */ > > > > > + if (md->phys_addr < PHYS_OFFSET) > > > > > + continue; > > > > > + > > > > > > > > Is the spin table area really allocated as BOOT_SERVICES_*? > > > > > > No. It is EFI_RESERVED_TYPE. But if UEFI is allowed below the kernel, > > > then there could be BS code/data memory which we'd want to ignore. > > > > Well, if it is boot service code/data - then there is no need for us > > to keep it around after ExitBootServices(). > > > > / > > Leif > > One would think, but EFI has proven to be less than strictly compliant > in that regard in the past. I'm inclined to keep the boot services > around until after SetVirtualAddressMap just in case. Why should we add a work around for a potential bug that doesn't exist yet? That just provides fertile ground for such a bug to spring into existence and for people to ignore it when bringing up their SoC. The comment doesn't explain the rationale and the code doesn't make sense given a sane implementation. For the moment it's better to be strict, IMO. Otherwise there are plenty of other potential bugs we could attempt to work around to enable people to write firmware with even lower standards... If we have to work around something then we should have an actual issue to work around first. Thanks, Mark. ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH] efi/arm64: efistub: don't abort if base of DRAM is occupied 2014-07-15 14:23 ` Mark Salter 2014-07-15 14:31 ` Mark Rutland @ 2014-07-15 14:49 ` Leif Lindholm 2014-07-15 15:04 ` Mark Salter 1 sibling, 1 reply; 17+ messages in thread From: Leif Lindholm @ 2014-07-15 14:49 UTC (permalink / raw) To: linux-arm-kernel On Tue, Jul 15, 2014 at 10:23:26AM -0400, Mark Salter wrote: > On Tue, 2014-07-15 at 14:54 +0100, Leif Lindholm wrote: > > On Tue, Jul 15, 2014 at 09:11:00AM -0400, Mark Salter wrote: > > > On Tue, 2014-07-15 at 11:02 +0100, Leif Lindholm wrote: > > > > > @@ -273,6 +282,10 @@ static void __init free_boot_services(void) > > > > > continue; > > > > > } > > > > > > > > > > + /* Don't free anything below kernel */ > > > > > + if (md->phys_addr < PHYS_OFFSET) > > > > > + continue; > > > > > + > > > > > > > > Is the spin table area really allocated as BOOT_SERVICES_*? > > > > > > No. It is EFI_RESERVED_TYPE. But if UEFI is allowed below the kernel, > > > then there could be BS code/data memory which we'd want to ignore. > > > > Well, if it is boot service code/data - then there is no need for us > > to keep it around after ExitBootServices(). > > One would think, but EFI has proven to be less than strictly compliant > in that regard in the past. I'm inclined to keep the boot services > around until after SetVirtualAddressMap just in case. But the function you add this clause to will still throw away all boot services code/data regions - just with this modification it skips those that happen to lie lower in the address space than the kernel. (And I do agree with Mark R here - let's not work around bugs that don't exist yet.) / Leif ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH] efi/arm64: efistub: don't abort if base of DRAM is occupied 2014-07-15 14:49 ` Leif Lindholm @ 2014-07-15 15:04 ` Mark Salter 2014-07-15 15:28 ` Leif Lindholm 0 siblings, 1 reply; 17+ messages in thread From: Mark Salter @ 2014-07-15 15:04 UTC (permalink / raw) To: linux-arm-kernel On Tue, 2014-07-15 at 15:49 +0100, Leif Lindholm wrote: > On Tue, Jul 15, 2014 at 10:23:26AM -0400, Mark Salter wrote: > > On Tue, 2014-07-15 at 14:54 +0100, Leif Lindholm wrote: > > > On Tue, Jul 15, 2014 at 09:11:00AM -0400, Mark Salter wrote: > > > > On Tue, 2014-07-15 at 11:02 +0100, Leif Lindholm wrote: > > > > > > @@ -273,6 +282,10 @@ static void __init free_boot_services(void) > > > > > > continue; > > > > > > } > > > > > > > > > > > > + /* Don't free anything below kernel */ > > > > > > + if (md->phys_addr < PHYS_OFFSET) > > > > > > + continue; > > > > > > + > > > > > > > > > > Is the spin table area really allocated as BOOT_SERVICES_*? > > > > > > > > No. It is EFI_RESERVED_TYPE. But if UEFI is allowed below the kernel, > > > > then there could be BS code/data memory which we'd want to ignore. > > > > > > Well, if it is boot service code/data - then there is no need for us > > > to keep it around after ExitBootServices(). > > > > One would think, but EFI has proven to be less than strictly compliant > > in that regard in the past. I'm inclined to keep the boot services > > around until after SetVirtualAddressMap just in case. > > But the function you add this clause to will still throw away all boot > services code/data regions - just with this modification it skips > those that happen to lie lower in the address space than the kernel. > > (And I do agree with Mark R here - let's not work around bugs that > don't exist yet.) > I'm not sure if they still exist or not, but on Foundation, I saw a crash in SetVirtualAddressMap unless I kept BS regions around. ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH] efi/arm64: efistub: don't abort if base of DRAM is occupied 2014-07-15 15:04 ` Mark Salter @ 2014-07-15 15:28 ` Leif Lindholm 2014-07-16 13:13 ` Mark Salter 0 siblings, 1 reply; 17+ messages in thread From: Leif Lindholm @ 2014-07-15 15:28 UTC (permalink / raw) To: linux-arm-kernel On Tue, Jul 15, 2014 at 11:04:37AM -0400, Mark Salter wrote: > On Tue, 2014-07-15 at 15:49 +0100, Leif Lindholm wrote: > > On Tue, Jul 15, 2014 at 10:23:26AM -0400, Mark Salter wrote: > > > On Tue, 2014-07-15 at 14:54 +0100, Leif Lindholm wrote: > > > > On Tue, Jul 15, 2014 at 09:11:00AM -0400, Mark Salter wrote: > > > > > On Tue, 2014-07-15 at 11:02 +0100, Leif Lindholm wrote: > > > > > > > @@ -273,6 +282,10 @@ static void __init free_boot_services(void) > > > > > > > continue; > > > > > > > } > > > > > > > > > > > > > > + /* Don't free anything below kernel */ > > > > > > > + if (md->phys_addr < PHYS_OFFSET) > > > > > > > + continue; > > > > > > > + > > > > > > > > > > > > Is the spin table area really allocated as BOOT_SERVICES_*? > > > > > > > > > > No. It is EFI_RESERVED_TYPE. But if UEFI is allowed below the kernel, > > > > > then there could be BS code/data memory which we'd want to ignore. > > > > > > > > Well, if it is boot service code/data - then there is no need for us > > > > to keep it around after ExitBootServices(). > > > > > > One would think, but EFI has proven to be less than strictly compliant > > > in that regard in the past. I'm inclined to keep the boot services > > > around until after SetVirtualAddressMap just in case. > > > > But the function you add this clause to will still throw away all boot > > services code/data regions - just with this modification it skips > > those that happen to lie lower in the address space than the kernel. Returning to the actual code we are discussing here: The hunk above has no bearing on whether boot services regions are generally unmapped or not. It only filters explicitly those boot services regions that happen to be lower in memory than the kernel, and keep them around for the duration of the system. > > > > (And I do agree with Mark R here - let's not work around bugs that > > don't exist yet.) > > > > I'm not sure if they still exist or not, but on Foundation, I saw a > crash in SetVirtualAddressMap unless I kept BS regions around. For the topic of keeping boot services code around: I did also see issues with not keeping boot services regions on v7 - ages ago. I have not seen it this year, and I _really_ want to see if any such issues resurface. So post-3.16 I would quite like to see the call to free_boot_services() moved earlier to flush out any such issues before we see large-scale deployments. The foundation model is a development tool, not a production system, so any issue reproducable only there is a pure bonus - since that firmware can _always_ be updated. / Leif ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH] efi/arm64: efistub: don't abort if base of DRAM is occupied 2014-07-15 15:28 ` Leif Lindholm @ 2014-07-16 13:13 ` Mark Salter 2014-07-22 17:08 ` Leif Lindholm 0 siblings, 1 reply; 17+ messages in thread From: Mark Salter @ 2014-07-16 13:13 UTC (permalink / raw) To: linux-arm-kernel On Tue, 2014-07-15 at 16:28 +0100, Leif Lindholm wrote: > On Tue, Jul 15, 2014 at 11:04:37AM -0400, Mark Salter wrote: > > On Tue, 2014-07-15 at 15:49 +0100, Leif Lindholm wrote: > > > On Tue, Jul 15, 2014 at 10:23:26AM -0400, Mark Salter wrote: > > > > On Tue, 2014-07-15 at 14:54 +0100, Leif Lindholm wrote: > > > > > On Tue, Jul 15, 2014 at 09:11:00AM -0400, Mark Salter wrote: > > > > > > On Tue, 2014-07-15 at 11:02 +0100, Leif Lindholm wrote: > > > > > > > > @@ -273,6 +282,10 @@ static void __init free_boot_services(void) > > > > > > > > continue; > > > > > > > > } > > > > > > > > > > > > > > > > + /* Don't free anything below kernel */ > > > > > > > > + if (md->phys_addr < PHYS_OFFSET) > > > > > > > > + continue; > > > > > > > > + > > > > > > > > > > > > > > Is the spin table area really allocated as BOOT_SERVICES_*? > > > > > > > > > > > > No. It is EFI_RESERVED_TYPE. But if UEFI is allowed below the kernel, > > > > > > then there could be BS code/data memory which we'd want to ignore. > > > > > > > > > > Well, if it is boot service code/data - then there is no need for us > > > > > to keep it around after ExitBootServices(). > > > > > > > > One would think, but EFI has proven to be less than strictly compliant > > > > in that regard in the past. I'm inclined to keep the boot services > > > > around until after SetVirtualAddressMap just in case. > > > > > > But the function you add this clause to will still throw away all boot > > > services code/data regions - just with this modification it skips > > > those that happen to lie lower in the address space than the kernel. > > Returning to the actual code we are discussing here: > The hunk above has no bearing on whether boot services regions are > generally unmapped or not. It only filters explicitly those boot > services regions that happen to be lower in memory than the kernel, > and keep them around for the duration of the system. It doesn't filter them to keep them around, it filters them to avoid calling free_bootmem_late() with an invalid address. If there are UEFI regions below the kernel, we don't want to call memblock_reserve() or free_bootmem_late() for them. > > > > > > > (And I do agree with Mark R here - let's not work around bugs that > > > don't exist yet.) > > > > > > > I'm not sure if they still exist or not, but on Foundation, I saw a > > crash in SetVirtualAddressMap unless I kept BS regions around. > > For the topic of keeping boot services code around: > I did also see issues with not keeping boot services regions on v7 - > ages ago. I have not seen it this year, and I _really_ want to see if > any such issues resurface. My view is that a problem has been seen in the past with tianocore for arm64. There is no harm in delaying the freeing of BS regions. The memory becomes usable for general kernel use at early_initcall time. This issue has also been seen with x86 firmware and some of those same vendors will be providing arm64 firmware. The problem isn't reproducible now, but I'm not sure if there was a bug fix for it or if it just went underground for some reason. Kernel boot may succeed by chance if some needed BS memory isn't reused by kernel. > So post-3.16 I would quite like to see the > call to free_boot_services() moved earlier to flush out any such > issues before we see large-scale deployments. > You can just get rid of it altogether: diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c index 453b7f8..06b59d9 100644 --- a/arch/arm64/kernel/efi.c +++ b/arch/arm64/kernel/efi.c @@ -177,9 +177,7 @@ static __init void reserve_regions(void) if (is_normal_ram(md)) early_init_dt_add_memory_arch(paddr, size); - if (is_reserve_region(md) || - md->type == EFI_BOOT_SERVICES_CODE || - md->type == EFI_BOOT_SERVICES_DATA) { + if (is_reserve_region(md)) { memblock_reserve(paddr, size); if (uefi_debug) pr_cont("*"); @@ -191,122 +189,6 @@ static __init void reserve_regions(void) } -static u64 __init free_one_region(u64 start, u64 end) -{ - u64 size = end - start; - - if (uefi_debug) - pr_info(" EFI freeing: 0x%012llx-0x%012llx\n", start, end - 1); - - free_bootmem_late(start, size); - return size; -} - -static u64 __init free_region(u64 start, u64 end) -{ - u64 map_start, map_end, total = 0; - - if (end <= start) - return total; - - map_start = (u64)memmap.phys_map; - map_end = PAGE_ALIGN(map_start + (memmap.map_end - memmap.map)); - map_start &= PAGE_MASK; - - if (start < map_end && end > map_start) { - /* region overlaps UEFI memmap */ - if (start < map_start) - total += free_one_region(start, map_start); - - if (map_end < end) - total += free_one_region(map_end, end); - } else - total += free_one_region(start, end); - - return total; -} - -static void __init free_boot_services(void) -{ - u64 total_freed = 0; - u64 keep_end, free_start, free_end; - efi_memory_desc_t *md; - - /* - * If kernel uses larger pages than UEFI, we have to be careful - * not to inadvertantly free memory we want to keep if there is - * overlap@the kernel page size alignment. We do not want to - * free is_reserve_region() memory nor the UEFI memmap itself. - * - * The memory map is sorted, so we keep track of the end of - * any previous region we want to keep, remember any region - * we want to free and defer freeing it until we encounter - * the next region we want to keep. This way, before freeing - * it, we can clip it as needed to avoid freeing memory we - * want to keep for UEFI. - */ - - keep_end = 0; - free_start = 0; - - for_each_efi_memory_desc(&memmap, md) { - u64 paddr, npages, size; - - if (is_reserve_region(md)) { - /* - * We don't want to free any memory from this region. - */ - if (free_start) { - /* adjust free_end then free region */ - if (free_end > md->phys_addr) - free_end -= PAGE_SIZE; - total_freed += free_region(free_start, free_end); - free_start = 0; - } - keep_end = md->phys_addr + (md->num_pages << EFI_PAGE_SHIFT); - continue; - } - - if (md->type != EFI_BOOT_SERVICES_CODE && - md->type != EFI_BOOT_SERVICES_DATA) { - /* no need to free this region */ - continue; - } - - /* - * We want to free memory from this region. - */ - paddr = md->phys_addr; - npages = md->num_pages; - memrange_efi_to_native(&paddr, &npages); - size = npages << PAGE_SHIFT; - - if (free_start) { - if (paddr <= free_end) - free_end = paddr + size; - else { - total_freed += free_region(free_start, free_end); - free_start = paddr; - free_end = paddr + size; - } - } else { - free_start = paddr; - free_end = paddr + size; - } - if (free_start < keep_end) { - free_start += PAGE_SIZE; - if (free_start >= free_end) - free_start = 0; - } - } - if (free_start) - total_freed += free_region(free_start, free_end); - - if (total_freed) - pr_info("Freed 0x%llx bytes of EFI boot services memory", - total_freed); -} - void __init efi_init(void) { struct efi_fdt_params params; @@ -439,8 +321,6 @@ static int __init arm64_enter_virtual_mode(void) kfree(virtmap); - free_boot_services(); - if (status != EFI_SUCCESS) { pr_err("Failed to set EFI virtual address map! [%lx]\n", status); ^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH] efi/arm64: efistub: don't abort if base of DRAM is occupied 2014-07-16 13:13 ` Mark Salter @ 2014-07-22 17:08 ` Leif Lindholm 2014-07-22 19:28 ` Mark Salter 0 siblings, 1 reply; 17+ messages in thread From: Leif Lindholm @ 2014-07-22 17:08 UTC (permalink / raw) To: linux-arm-kernel (Argh, late reply due to broken mail filters.) On Wed, Jul 16, 2014 at 09:13:48AM -0400, Mark Salter wrote: > > > > > > > > Is the spin table area really allocated as BOOT_SERVICES_*? > > > > > > > > > > > > > > No. It is EFI_RESERVED_TYPE. But if UEFI is allowed below the kernel, > > > > > > > then there could be BS code/data memory which we'd want to ignore. > > > > > > > > > > > > Well, if it is boot service code/data - then there is no need for us > > > > > > to keep it around after ExitBootServices(). > > > > > > > > > > One would think, but EFI has proven to be less than strictly compliant > > > > > in that regard in the past. I'm inclined to keep the boot services > > > > > around until after SetVirtualAddressMap just in case. > > > > > > > > But the function you add this clause to will still throw away all boot > > > > services code/data regions - just with this modification it skips > > > > those that happen to lie lower in the address space than the kernel. > > > > Returning to the actual code we are discussing here: > > The hunk above has no bearing on whether boot services regions are > > generally unmapped or not. It only filters explicitly those boot > > services regions that happen to be lower in memory than the kernel, > > and keep them around for the duration of the system. > > It doesn't filter them to keep them around, it filters them to avoid > calling free_bootmem_late() with an invalid address. If there are UEFI > regions below the kernel, we don't want to call memblock_reserve() or > free_bootmem_late() for them. Then why not just flip things around and do like the arm port and only add the blocks we actually want to keep around to begin with? > > > > (And I do agree with Mark R here - let's not work around bugs that > > > > don't exist yet.) > > > > > > > > > > I'm not sure if they still exist or not, but on Foundation, I saw a > > > crash in SetVirtualAddressMap unless I kept BS regions around. > > > > For the topic of keeping boot services code around: > > I did also see issues with not keeping boot services regions on v7 - > > ages ago. I have not seen it this year, and I _really_ want to see if > > any such issues resurface. > > My view is that a problem has been seen in the past with tianocore for > arm64. There is no harm in delaying the freeing of BS regions. There is a huge harm. > The > memory becomes usable for general kernel use at early_initcall time. > This issue has also been seen with x86 firmware and some of those same > vendors will be providing arm64 firmware. This issue has been seen with x86 firmware because in the early days (last year) noone bothered validating anything other than CSM. They no longer have that luxury. The Linux kernel, currently being the most avid tester of existing arm64 UEFI firmware, falling over itself to cater for hypothetical broken implementations pretty much guarantees the situation will end up just as bad as it ever was on x86 - without us even having CSM. > The problem isn't reproducible > now, but I'm not sure if there was a bug fix for it or if it just went > underground for some reason. Kernel boot may succeed by chance if some > needed BS memory isn't reused by kernel. And it may succeed by chance anyway. I'm not saying we won't see broken firmware - I'm saying that this is the window we have to try to _help_ people (and ourselves) by letting broken firmware fail - before it happens in the data centre. > > So post-3.16 I would quite like to see the > > call to free_boot_services() moved earlier to flush out any such > > issues before we see large-scale deployments. > > > > You can just get rid of it altogether: Well, clearly, that would not be my preference :) > diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c > index 453b7f8..06b59d9 100644 > --- a/arch/arm64/kernel/efi.c > +++ b/arch/arm64/kernel/efi.c > @@ -177,9 +177,7 @@ static __init void reserve_regions(void) > if (is_normal_ram(md)) > early_init_dt_add_memory_arch(paddr, size); > > - if (is_reserve_region(md) || > - md->type == EFI_BOOT_SERVICES_CODE || > - md->type == EFI_BOOT_SERVICES_DATA) { > + if (is_reserve_region(md)) { > memblock_reserve(paddr, size); > if (uefi_debug) > pr_cont("*"); > @@ -191,122 +189,6 @@ static __init void reserve_regions(void) > } > > > -static u64 __init free_one_region(u64 start, u64 end) > -{ > - u64 size = end - start; > - > - if (uefi_debug) > - pr_info(" EFI freeing: 0x%012llx-0x%012llx\n", start, end - 1); > - > - free_bootmem_late(start, size); > - return size; > -} > - > -static u64 __init free_region(u64 start, u64 end) > -{ > - u64 map_start, map_end, total = 0; > - > - if (end <= start) > - return total; > - > - map_start = (u64)memmap.phys_map; > - map_end = PAGE_ALIGN(map_start + (memmap.map_end - memmap.map)); > - map_start &= PAGE_MASK; > - > - if (start < map_end && end > map_start) { > - /* region overlaps UEFI memmap */ > - if (start < map_start) > - total += free_one_region(start, map_start); > - > - if (map_end < end) > - total += free_one_region(map_end, end); > - } else > - total += free_one_region(start, end); > - > - return total; > -} > - > -static void __init free_boot_services(void) > -{ > - u64 total_freed = 0; > - u64 keep_end, free_start, free_end; > - efi_memory_desc_t *md; > - > - /* > - * If kernel uses larger pages than UEFI, we have to be careful > - * not to inadvertantly free memory we want to keep if there is > - * overlap at the kernel page size alignment. We do not want to > - * free is_reserve_region() memory nor the UEFI memmap itself. > - * > - * The memory map is sorted, so we keep track of the end of > - * any previous region we want to keep, remember any region > - * we want to free and defer freeing it until we encounter > - * the next region we want to keep. This way, before freeing > - * it, we can clip it as needed to avoid freeing memory we > - * want to keep for UEFI. > - */ > - > - keep_end = 0; > - free_start = 0; > - > - for_each_efi_memory_desc(&memmap, md) { > - u64 paddr, npages, size; > - > - if (is_reserve_region(md)) { > - /* > - * We don't want to free any memory from this region. > - */ > - if (free_start) { > - /* adjust free_end then free region */ > - if (free_end > md->phys_addr) > - free_end -= PAGE_SIZE; > - total_freed += free_region(free_start, free_end); > - free_start = 0; > - } > - keep_end = md->phys_addr + (md->num_pages << EFI_PAGE_SHIFT); > - continue; > - } > - > - if (md->type != EFI_BOOT_SERVICES_CODE && > - md->type != EFI_BOOT_SERVICES_DATA) { > - /* no need to free this region */ > - continue; > - } > - > - /* > - * We want to free memory from this region. > - */ > - paddr = md->phys_addr; > - npages = md->num_pages; > - memrange_efi_to_native(&paddr, &npages); > - size = npages << PAGE_SHIFT; > - > - if (free_start) { > - if (paddr <= free_end) > - free_end = paddr + size; > - else { > - total_freed += free_region(free_start, free_end); > - free_start = paddr; > - free_end = paddr + size; > - } > - } else { > - free_start = paddr; > - free_end = paddr + size; > - } > - if (free_start < keep_end) { > - free_start += PAGE_SIZE; > - if (free_start >= free_end) > - free_start = 0; > - } > - } > - if (free_start) > - total_freed += free_region(free_start, free_end); > - > - if (total_freed) > - pr_info("Freed 0x%llx bytes of EFI boot services memory", > - total_freed); > -} > - > void __init efi_init(void) > { > struct efi_fdt_params params; > @@ -439,8 +321,6 @@ static int __init arm64_enter_virtual_mode(void) > > kfree(virtmap); > > - free_boot_services(); > - > if (status != EFI_SUCCESS) { > pr_err("Failed to set EFI virtual address map! [%lx]\n", > status); > > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH] efi/arm64: efistub: don't abort if base of DRAM is occupied 2014-07-22 17:08 ` Leif Lindholm @ 2014-07-22 19:28 ` Mark Salter 2014-07-22 21:28 ` Leif Lindholm 0 siblings, 1 reply; 17+ messages in thread From: Mark Salter @ 2014-07-22 19:28 UTC (permalink / raw) To: linux-arm-kernel On Tue, 2014-07-22 at 18:08 +0100, Leif Lindholm wrote: > (Argh, late reply due to broken mail filters.) > > On Wed, Jul 16, 2014 at 09:13:48AM -0400, Mark Salter wrote: > > > > > > > > > Is the spin table area really allocated as BOOT_SERVICES_*? > > > > > > > > > > > > > > > > No. It is EFI_RESERVED_TYPE. But if UEFI is allowed below the kernel, > > > > > > > > then there could be BS code/data memory which we'd want to ignore. > > > > > > > > > > > > > > Well, if it is boot service code/data - then there is no need for us > > > > > > > to keep it around after ExitBootServices(). > > > > > > > > > > > > One would think, but EFI has proven to be less than strictly compliant > > > > > > in that regard in the past. I'm inclined to keep the boot services > > > > > > around until after SetVirtualAddressMap just in case. > > > > > > > > > > But the function you add this clause to will still throw away all boot > > > > > services code/data regions - just with this modification it skips > > > > > those that happen to lie lower in the address space than the kernel. > > > > > > Returning to the actual code we are discussing here: > > > The hunk above has no bearing on whether boot services regions are > > > generally unmapped or not. It only filters explicitly those boot > > > services regions that happen to be lower in memory than the kernel, > > > and keep them around for the duration of the system. > > > > It doesn't filter them to keep them around, it filters them to avoid > > calling free_bootmem_late() with an invalid address. If there are UEFI > > regions below the kernel, we don't want to call memblock_reserve() or > > free_bootmem_late() for them. > > Then why not just flip things around and do like the arm port and only > add the blocks we actually want to keep around to begin with? I'd rather leave it as-is with everything which can be covered by the normal kernel mem mapping. > > > > > > (And I do agree with Mark R here - let's not work around bugs that > > > > > don't exist yet.) > > > > > > > > > > > > > I'm not sure if they still exist or not, but on Foundation, I saw a > > > > crash in SetVirtualAddressMap unless I kept BS regions around. > > > > > > For the topic of keeping boot services code around: > > > I did also see issues with not keeping boot services regions on v7 - > > > ages ago. I have not seen it this year, and I _really_ want to see if > > > any such issues resurface. > > > > My view is that a problem has been seen in the past with tianocore for > > arm64. There is no harm in delaying the freeing of BS regions. > > There is a huge harm. huge? really? > > > The > > memory becomes usable for general kernel use at early_initcall time. > > This issue has also been seen with x86 firmware and some of those same > > vendors will be providing arm64 firmware. > > This issue has been seen with x86 firmware because in the early days > (last year) noone bothered validating anything other than CSM. They no > longer have that luxury. > > The Linux kernel, currently being the most avid tester of existing > arm64 UEFI firmware, falling over itself to cater for hypothetical > broken implementations pretty much guarantees the situation will end > up just as bad as it ever was on x86 - without us even having CSM. It is hardly falling over itself. And if the problem is hypothetical, why is this in the arm32 EFI patches: +/* + * If you need to (temporarily) support buggy firmware, set to 0. + */ +#define DISCARD_BOOT_SERVICES_REGIONS 1 > > > The problem isn't reproducible > > now, but I'm not sure if there was a bug fix for it or if it just went > > underground for some reason. Kernel boot may succeed by chance if some > > needed BS memory isn't reused by kernel. > > And it may succeed by chance anyway. > I'm not saying we won't see broken firmware - I'm saying that this is > the window we have to try to _help_ people (and ourselves) by letting > broken firmware fail - before it happens in the data centre. In this particular case, we are removing a workaround to a problem which has been seen in the past. So we would open a door to allow this particular problem to reach the data center. > > > > So post-3.16 I would quite like to see the > > > call to free_boot_services() moved earlier to flush out any such > > > issues before we see large-scale deployments. > > > > > > > You can just get rid of it altogether: > > Well, clearly, that would not be my preference :) Why not? There's no point in keeping it if it isn't wanted/needed. Or at least make it optional with a #define as in arm32. Anyway, my opinion is known and I'm really not that attached to the code. So, if someone wants to submit a patch to take it out, I'm not going to make a fuss over it. ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH] efi/arm64: efistub: don't abort if base of DRAM is occupied 2014-07-22 19:28 ` Mark Salter @ 2014-07-22 21:28 ` Leif Lindholm 0 siblings, 0 replies; 17+ messages in thread From: Leif Lindholm @ 2014-07-22 21:28 UTC (permalink / raw) To: linux-arm-kernel On Tue, Jul 22, 2014 at 03:28:50PM -0400, Mark Salter wrote: > > > It doesn't filter them to keep them around, it filters them to avoid > > > calling free_bootmem_late() with an invalid address. If there are UEFI > > > regions below the kernel, we don't want to call memblock_reserve() or > > > free_bootmem_late() for them. > > > > Then why not just flip things around and do like the arm port and only > > add the blocks we actually want to keep around to begin with? > > I'd rather leave it as-is with everything which can be covered by the > normal kernel mem mapping. > > > > > > > (And I do agree with Mark R here - let's not work around bugs that > > > > > > don't exist yet.) > > > > > > > > > > > > > > > > I'm not sure if they still exist or not, but on Foundation, I saw a > > > > > crash in SetVirtualAddressMap unless I kept BS regions around. > > > > > > > > For the topic of keeping boot services code around: > > > > I did also see issues with not keeping boot services regions on v7 - > > > > ages ago. I have not seen it this year, and I _really_ want to see if > > > > any such issues resurface. > > > > > > My view is that a problem has been seen in the past with tianocore for > > > arm64. There is no harm in delaying the freeing of BS regions. > > > > There is a huge harm. > > huge? really? Yes. It is setting a precedent that "we will try to second-guess how you might code incorrectly and just deal with it - in those few places where this is possible, and not in others". > > > The > > > memory becomes usable for general kernel use at early_initcall time. > > > This issue has also been seen with x86 firmware and some of those same > > > vendors will be providing arm64 firmware. > > > > This issue has been seen with x86 firmware because in the early days > > (last year) noone bothered validating anything other than CSM. They no > > longer have that luxury. > > > > The Linux kernel, currently being the most avid tester of existing > > arm64 UEFI firmware, falling over itself to cater for hypothetical > > broken implementations pretty much guarantees the situation will end > > up just as bad as it ever was on x86 - without us even having CSM. > > It is hardly falling over itself. And if the problem is hypothetical, > why is this in the arm32 EFI patches: > > +/* > + * If you need to (temporarily) support buggy firmware, set to 0. > + */ > +#define DISCARD_BOOT_SERVICES_REGIONS 1 Because in the early days we did indeed have such an issue. Had we not had this enabled, we never would have spotted it. Felt useful to have around _and_disabled_ in case we ever run into anything like it again. Haven't happened for over a year. > > > The problem isn't reproducible > > > now, but I'm not sure if there was a bug fix for it or if it just went > > > underground for some reason. Kernel boot may succeed by chance if some > > > needed BS memory isn't reused by kernel. > > > > And it may succeed by chance anyway. > > I'm not saying we won't see broken firmware - I'm saying that this is > > the window we have to try to _help_ people (and ourselves) by letting > > broken firmware fail - before it happens in the data centre. > > In this particular case, we are removing a workaround to a problem > which has been seen in the past. So we would open a door to allow > this particular problem to reach the data center. No, we are removing a workaround to a problem that has possibly been seen in the past, not sure of the circumstances, not sure if it was a model cache management bug. A UEFI implementation that does not keep track of local and global symbols is just as likely to not update a pointer between runtime services regions as it is to not stop using boot services regions after ExitBootServices(). Keeping boot services regions around does not solve the problem - it slightly reduces the likelihood of it manifesting. > > > > So post-3.16 I would quite like to see the > > > > call to free_boot_services() moved earlier to flush out any such > > > > issues before we see large-scale deployments. > > > > > > You can just get rid of it altogether: > > > > Well, clearly, that would not be my preference :) > > Why not? There's no point in keeping it if it isn't wanted/needed. Or at > least make it optional with a #define as in arm32. Anyway, my opinion is > known and I'm really not that attached to the code. So, if someone wants > to submit a patch to take it out, I'm not going to make a fuss over it. I'll try to get that together tomorrow. I certainly don't mind keeping a #define around (and it might enable more code sharing arm/arm64). / Leif ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH] efi/arm64: efistub: don't abort if base of DRAM is occupied 2014-07-14 18:40 ` Mark Salter 2014-07-15 10:02 ` Leif Lindholm @ 2014-07-15 11:00 ` Mark Rutland 1 sibling, 0 replies; 17+ messages in thread From: Mark Rutland @ 2014-07-15 11:00 UTC (permalink / raw) To: linux-arm-kernel On Mon, Jul 14, 2014 at 07:40:48PM +0100, Mark Salter wrote: > On Mon, 2014-07-14 at 17:25 +0200, Ard Biesheuvel wrote: > > If we fail to relocate the kernel Image to its preferred offset of TEXT_OFFSET > > bytes above the base of DRAM, accept the lowest alternative mapping available > > instead of aborting. We may lose a bit of memory at the low end, but we can > > still proceed normally otherwise. > > This breaks APM Mustang because the spin-table holding pen for secondary > CPUs is marked as reserved memory in the TEXT_OFFSET area and the kernel > placement using your patch makes it unreachable by kernel. Here is a > patch I've been working with to solve the same problem: I'm not sure that this is strictly speaking an issue with UEFI or the relocation strategy (which sounds sane to me). I believe we could easily hit similar issues with spin-table elsewhere, and I think we can fix this more generally without complicating the EFI stub. As I see it, we have two issues here: 1) The linear mapping starts at VA:PAGE_OFFSET+TEXT_OFFSET / PA:PHYS_OFFSET+TEXT_OFFSET, and we cannot access memory below this start address. This seems like a general issue we need to address, as it forces bootloader code to go through a tricky/impossible dance to get the kernel as close to the start of RAM as possible. 2) We cannot access a given cpu-release-addr if it is not in the linear mapping. This is the problem we're encountering now. We can solve (2) now by using a temporary mapping to write to the cpu-release-addr. Does the below patch (untested) fix your issue with spin-table? For (1) we need to rework the arm64 VA layout to decouple the kernel text mapping from the linear map, but that's a lot more work. Cheers, Mark. ---->8---- >From 73812b654a07f497f71bd38dfb4a6753fb0ad23e Mon Sep 17 00:00:00 2001 From: Mark Rutland <mark.rutland@arm.com> Date: Tue, 15 Jul 2014 11:32:53 +0100 Subject: [PATCH] arm64: spin-table: handle unmapped cpu-release-addrs In certain cases the cpu-release-addr of a CPU may not fall in the linear mapping (e.g. when the kernel is loaded above this address due to the presence of other images in memory). This is problematic for the spin-table code as it assumes that it can trivially convert a cpu-release-addr to a valid VA in the linear map. This patch modifies the spin-table code to use a temporary cached mapping to write to a given cpu-release-addr, enabling us to support addresses regardless of whether they are covered by the linear mapping. Signed-off-by: Mark Rutland <mark.rutland@arm.com> --- arch/arm64/kernel/smp_spin_table.c | 21 ++++++++++++++++----- 1 file changed, 16 insertions(+), 5 deletions(-) diff --git a/arch/arm64/kernel/smp_spin_table.c b/arch/arm64/kernel/smp_spin_table.c index 0347d38..70181c1 100644 --- a/arch/arm64/kernel/smp_spin_table.c +++ b/arch/arm64/kernel/smp_spin_table.c @@ -20,6 +20,7 @@ #include <linux/init.h> #include <linux/of.h> #include <linux/smp.h> +#include <linux/types.h> #include <asm/cacheflush.h> #include <asm/cpu_ops.h> @@ -65,12 +66,21 @@ static int smp_spin_table_cpu_init(struct device_node *dn, unsigned int cpu) static int smp_spin_table_cpu_prepare(unsigned int cpu) { - void **release_addr; + __le64 __iomem *release_addr; if (!cpu_release_addr[cpu]) return -ENODEV; - release_addr = __va(cpu_release_addr[cpu]); + /* + * The cpu-release-addr may or may not be inside the linear mapping. + * As ioremap_cache will either give us a new mapping or reuse the + * existing linear mapping, we can use it to cover both cases. In + * either case the memory will be MT_NORMAL. + */ + release_addr = ioremap_cache(cpu_release_addr[cpu], + sizeof(*release_addr)); + if (!release_addr) + return -ENOMEM; /* * We write the release address as LE regardless of the native @@ -79,15 +89,16 @@ static int smp_spin_table_cpu_prepare(unsigned int cpu) * boot-loader's endianess before jumping. This is mandated by * the boot protocol. */ - release_addr[0] = (void *) cpu_to_le64(__pa(secondary_holding_pen)); - - __flush_dcache_area(release_addr, sizeof(release_addr[0])); + writeq_relaxed(__pa(secondary_holding_pen), release_addr); + __flush_dcache_area(release_addr, sizeof(*release_addr)); /* * Send an event to wake up the secondary CPU. */ sev(); + iounmap(release_addr); + return 0; } -- 1.9.1 ^ permalink raw reply related [flat|nested] 17+ messages in thread
end of thread, other threads:[~2014-07-22 21:28 UTC | newest] Thread overview: 17+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-07-14 15:25 [PATCH] efi/arm64: efistub: don't abort if base of DRAM is occupied Ard Biesheuvel 2014-07-14 15:27 ` Ard Biesheuvel 2014-07-14 18:40 ` Mark Salter 2014-07-15 10:02 ` Leif Lindholm 2014-07-15 11:05 ` Mark Rutland 2014-07-15 13:11 ` Mark Salter 2014-07-15 13:54 ` Leif Lindholm 2014-07-15 14:23 ` Mark Salter 2014-07-15 14:31 ` Mark Rutland 2014-07-15 14:49 ` Leif Lindholm 2014-07-15 15:04 ` Mark Salter 2014-07-15 15:28 ` Leif Lindholm 2014-07-16 13:13 ` Mark Salter 2014-07-22 17:08 ` Leif Lindholm 2014-07-22 19:28 ` Mark Salter 2014-07-22 21:28 ` Leif Lindholm 2014-07-15 11:00 ` Mark Rutland
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).