linux-efi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Matt Fleming <matt@codeblueprint.co.uk>
To: Ingo Molnar <mingo@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	"H . Peter Anvin" <hpa@zytor.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>,
	Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	linux-kernel@vger.kernel.org, linux-efi@vger.kernel.org,
	Borislav Petkov <bp@alien8.de>, Dave Young <dyoung@redhat.com>,
	Leif Lindholm <leif.lindholm@linaro.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Peter Jones <pjones@redhat.com>
Subject: [PATCH 08/29] efi: Allow drivers to reserve boot services forever
Date: Fri,  9 Sep 2016 16:18:30 +0100	[thread overview]
Message-ID: <20160909151851.27577-9-matt@codeblueprint.co.uk> (raw)
In-Reply-To: <20160909151851.27577-1-matt@codeblueprint.co.uk>

Today, it is not possible for drivers to reserve EFI boot services for
access after efi_free_boot_services() has been called on x86. For
ARM/arm64 it can be done simply by calling memblock_reserve().

Having this ability for all three architectures is desirable for a
couple of reasons,

  1) It saves drivers copying data out of those regions
  2) kexec reboot can now make use of things like ESRT

Instead of using the standard memblock_reserve() which is insufficient
to reserve the region on x86 (see efi_reserve_boot_services()), a new
API is introduced in this patch; efi_mem_reserve().

efi.memmap now always represents which EFI memory regions are
available. On x86 the EFI boot services regions that have not been
reserved via efi_mem_reserve() will be removed from efi.memmap during
efi_free_boot_services().

This has implications for kexec, since it is not possible for a newly
kexec'd kernel to access the same boot services regions that the
initial boot kernel had access to unless they are reserved by every
kexec kernel in the chain.

Tested-by: Dave Young <dyoung@redhat.com> [kexec/kdump]
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> [arm]
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Cc: Peter Jones <pjones@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
---
 arch/x86/platform/efi/quirks.c | 121 +++++++++++++++++++++++++++++++++++++----
 drivers/firmware/efi/efi.c     |  30 ++++++++++
 include/linux/efi.h            |   1 +
 3 files changed, 141 insertions(+), 11 deletions(-)

diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index 9faf18874692..f14b7a9da24b 100644
--- a/arch/x86/platform/efi/quirks.c
+++ b/arch/x86/platform/efi/quirks.c
@@ -164,6 +164,71 @@ efi_status_t efi_query_variable_store(u32 attributes, unsigned long size,
 EXPORT_SYMBOL_GPL(efi_query_variable_store);
 
 /*
+ * The UEFI specification makes it clear that the operating system is
+ * free to do whatever it wants with boot services code after
+ * ExitBootServices() has been called. Ignoring this recommendation a
+ * significant bunch of EFI implementations continue calling into boot
+ * services code (SetVirtualAddressMap). In order to work around such
+ * buggy implementations we reserve boot services region during EFI
+ * init and make sure it stays executable. Then, after
+ * SetVirtualAddressMap(), it is discarded.
+ *
+ * However, some boot services regions contain data that is required
+ * by drivers, so we need to track which memory ranges can never be
+ * freed. This is done by tagging those regions with the
+ * EFI_MEMORY_RUNTIME attribute.
+ *
+ * Any driver that wants to mark a region as reserved must use
+ * efi_mem_reserve() which will insert a new EFI memory descriptor
+ * into efi.memmap (splitting existing regions if necessary) and tag
+ * it with EFI_MEMORY_RUNTIME.
+ */
+void __init efi_arch_mem_reserve(phys_addr_t addr, u64 size)
+{
+	phys_addr_t new_phys, new_size;
+	struct efi_mem_range mr;
+	efi_memory_desc_t md;
+	int num_entries;
+	void *new;
+
+	if (efi_mem_desc_lookup(addr, &md)) {
+		pr_err("Failed to lookup EFI memory descriptor for %pa\n", &addr);
+		return;
+	}
+
+	if (addr + size > md.phys_addr + (md.num_pages << EFI_PAGE_SHIFT)) {
+		pr_err("Region spans EFI memory descriptors, %pa\n", &addr);
+		return;
+	}
+
+	mr.range.start = addr;
+	mr.range.end = addr + size;
+	mr.attribute = md.attribute | EFI_MEMORY_RUNTIME;
+
+	num_entries = efi_memmap_split_count(&md, &mr.range);
+	num_entries += efi.memmap.nr_map;
+
+	new_size = efi.memmap.desc_size * num_entries;
+
+	new_phys = memblock_alloc(new_size, 0);
+	if (!new_phys) {
+		pr_err("Could not allocate boot services memmap\n");
+		return;
+	}
+
+	new = early_memremap(new_phys, new_size);
+	if (!new) {
+		pr_err("Failed to map new boot services memmap\n");
+		return;
+	}
+
+	efi_memmap_insert(&efi.memmap, new, &mr);
+	early_memunmap(new, new_size);
+
+	efi_memmap_install(new_phys, num_entries);
+}
+
+/*
  * Helper function for efi_reserve_boot_services() to figure out if we
  * can free regions in efi_free_boot_services().
  *
@@ -184,15 +249,6 @@ static bool can_free_region(u64 start, u64 size)
 	return true;
 }
 
-/*
- * The UEFI specification makes it clear that the operating system is free to do
- * whatever it wants with boot services code after ExitBootServices() has been
- * called. Ignoring this recommendation a significant bunch of EFI implementations 
- * continue calling into boot services code (SetVirtualAddressMap). In order to 
- * work around such buggy implementations we reserve boot services region during 
- * EFI init and make sure it stays executable. Then, after SetVirtualAddressMap(), it
-* is discarded.
-*/
 void __init efi_reserve_boot_services(void)
 {
 	efi_memory_desc_t *md;
@@ -249,7 +305,10 @@ void __init efi_reserve_boot_services(void)
 
 void __init efi_free_boot_services(void)
 {
+	phys_addr_t new_phys, new_size;
 	efi_memory_desc_t *md;
+	int num_entries = 0;
+	void *new, *new_md;
 
 	for_each_efi_memory_desc(md) {
 		unsigned long long start = md->phys_addr;
@@ -257,12 +316,16 @@ void __init efi_free_boot_services(void)
 		size_t rm_size;
 
 		if (md->type != EFI_BOOT_SERVICES_CODE &&
-		    md->type != EFI_BOOT_SERVICES_DATA)
+		    md->type != EFI_BOOT_SERVICES_DATA) {
+			num_entries++;
 			continue;
+		}
 
 		/* Do not free, someone else owns it: */
-		if (md->attribute & EFI_MEMORY_RUNTIME)
+		if (md->attribute & EFI_MEMORY_RUNTIME) {
+			num_entries++;
 			continue;
+		}
 
 		/*
 		 * Nasty quirk: if all sub-1MB memory is used for boot
@@ -286,6 +349,42 @@ void __init efi_free_boot_services(void)
 
 		free_bootmem_late(start, size);
 	}
+
+	new_size = efi.memmap.desc_size * num_entries;
+	new_phys = memblock_alloc(new_size, 0);
+	if (!new_phys) {
+		pr_err("Failed to allocate new EFI memmap\n");
+		return;
+	}
+
+	new = memremap(new_phys, new_size, MEMREMAP_WB);
+	if (!new) {
+		pr_err("Failed to map new EFI memmap\n");
+		return;
+	}
+
+	/*
+	 * Build a new EFI memmap that excludes any boot services
+	 * regions that are not tagged EFI_MEMORY_RUNTIME, since those
+	 * regions have now been freed.
+	 */
+	new_md = new;
+	for_each_efi_memory_desc(md) {
+		if (!(md->attribute & EFI_MEMORY_RUNTIME) &&
+		    (md->type == EFI_BOOT_SERVICES_CODE ||
+		     md->type == EFI_BOOT_SERVICES_DATA))
+			continue;
+
+		memcpy(new_md, md, efi.memmap.desc_size);
+		new_md += efi.memmap.desc_size;
+	}
+
+	memunmap(new);
+
+	if (efi_memmap_install(new_phys, num_entries)) {
+		pr_err("Could not install new EFI memmap\n");
+		return;
+	}
 }
 
 /*
diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index d4886fd50c16..dfe07316cae5 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -27,6 +27,7 @@
 #include <linux/slab.h>
 #include <linux/acpi.h>
 #include <linux/ucs2_string.h>
+#include <linux/memblock.h>
 
 #include <asm/early_ioremap.h>
 
@@ -396,6 +397,35 @@ u64 __init efi_mem_desc_end(efi_memory_desc_t *md)
 	return end;
 }
 
+void __init __weak efi_arch_mem_reserve(phys_addr_t addr, u64 size) {}
+
+/**
+ * efi_mem_reserve - Reserve an EFI memory region
+ * @addr: Physical address to reserve
+ * @size: Size of reservation
+ *
+ * Mark a region as reserved from general kernel allocation and
+ * prevent it being released by efi_free_boot_services().
+ *
+ * This function should be called drivers once they've parsed EFI
+ * configuration tables to figure out where their data lives, e.g.
+ * efi_esrt_init().
+ */
+void __init efi_mem_reserve(phys_addr_t addr, u64 size)
+{
+	if (!memblock_is_region_reserved(addr, size))
+		memblock_reserve(addr, size);
+
+	/*
+	 * Some architectures (x86) reserve all boot services ranges
+	 * until efi_free_boot_services() because of buggy firmware
+	 * implementations. This means the above memblock_reserve() is
+	 * superfluous on x86 and instead what it needs to do is
+	 * ensure the @start, @size is not freed.
+	 */
+	efi_arch_mem_reserve(addr, size);
+}
+
 static __initdata efi_config_table_type_t common_tables[] = {
 	{ACPI_20_TABLE_GUID, "ACPI 2.0", &efi.acpi20},
 	{ACPI_TABLE_GUID, "ACPI", &efi.acpi},
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 987c18f6fcae..3fe4f3c47834 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -944,6 +944,7 @@ extern u64 efi_mem_attribute (unsigned long phys_addr, unsigned long size);
 extern int __init efi_uart_console_only (void);
 extern u64 efi_mem_desc_end(efi_memory_desc_t *md);
 extern int efi_mem_desc_lookup(u64 phys_addr, efi_memory_desc_t *out_md);
+extern void efi_mem_reserve(phys_addr_t addr, u64 size);
 extern void efi_initialize_iomem_resources(struct resource *code_resource,
 		struct resource *data_resource, struct resource *bss_resource);
 extern void efi_reserve_boot_services(void);
-- 
2.9.3

  parent reply	other threads:[~2016-09-09 15:18 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-09-09 15:18 [GIT PULL 00/29] EFI changes for v4.9 Matt Fleming
2016-09-09 15:18 ` [PATCH 01/29] x86/efi: Test for EFI_MEMMAP functionality when iterating EFI memmap Matt Fleming
2016-09-09 15:18 ` [PATCH 03/29] efi: Refactor efi_memmap_init_early() into arch-neutral code Matt Fleming
2016-09-09 15:18 ` [PATCH 05/29] efi/fake_mem: Refactor main two code chunks into functions Matt Fleming
2016-09-09 15:18 ` [PATCH 06/29] efi: Split out EFI memory map functions into new file Matt Fleming
2016-09-09 15:18 ` Matt Fleming [this message]
2017-01-04  2:48   ` [PATCH 08/29] efi: Allow drivers to reserve boot services forever Dan Williams
     [not found]     ` <CAA9_cmffnH0CH+3DaUW3ytVnRWbZCHa1VcAfWgwMCfbncN_QcA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-01-04  5:28       ` Dave Young
2017-01-04 14:25       ` Peter Jones
2017-01-04 17:45       ` Nicolai Stange
2017-01-04 18:40         ` Dan Williams
2016-09-09 15:18 ` [PATCH 09/29] efi/runtime-map: Use efi.memmap directly instead of a copy Matt Fleming
2016-09-09 15:18 ` [PATCH 11/29] x86/efi-bgrt: Use efi_mem_reserve() to avoid copying image data Matt Fleming
2016-09-09 15:18 ` [PATCH 14/29] efi: Use a file local lock for efivars Matt Fleming
     [not found] ` <20160909151851.27577-1-matt-mF/unelCI9GS6iBeEJttW/XRex20P6io@public.gmane.org>
2016-09-09 15:18   ` [PATCH 02/29] x86/efi: Consolidate region mapping logic Matt Fleming
2016-09-09 15:18   ` [PATCH 04/29] efi: Add efi_memmap_init_late() for permanent EFI memmap Matt Fleming
2016-09-09 15:18   ` [PATCH 07/29] efi: Add efi_memmap_install() for installing new EFI memory maps Matt Fleming
2016-09-09 15:18   ` [PATCH 10/29] efi/esrt: Use efi_mem_reserve() and avoid a kmalloc() Matt Fleming
2016-09-09 15:18   ` [PATCH 12/29] efi/esrt: Use memremap not ioremap to access ESRT table in memory Matt Fleming
2016-09-09 15:18   ` [PATCH 13/29] efi/arm*: esrt: Add missing call to efi_esrt_init() Matt Fleming
2016-09-09 15:18   ` [PATCH 15/29] efi: Don't use spinlocks for efi vars Matt Fleming
2016-09-09 15:18   ` [PATCH 18/29] firmware-gsmi: Delete an unnecessary check before the function call "dma_pool_destroy" Matt Fleming
2016-09-09 15:18   ` [PATCH 20/29] x86/efi: Map in physical addresses in efi_map_region_fixed Matt Fleming
2016-09-09 15:18   ` [PATCH 24/29] x86/efi: Defer efi_esrt_init until after memblock_x86_fill Matt Fleming
2016-09-09 15:18   ` [PATCH 26/29] efi/arm64: Treat regions with WT/WC set but WB cleared as memory Matt Fleming
2016-09-09 15:18   ` [PATCH 28/29] x86/efi: Optimize away setup_gop32/64 if unused Matt Fleming
2016-09-12 10:58   ` [GIT PULL 00/29] EFI changes for v4.9 Matt Fleming
     [not found]     ` <20160912105813.GC3872-mF/unelCI9GS6iBeEJttW/XRex20P6io@public.gmane.org>
2016-09-12 11:42       ` Ingo Molnar
2016-09-09 15:18 ` [PATCH 16/29] efi: Replace runtime services spinlock with semaphore Matt Fleming
2016-09-09 15:18 ` [PATCH 17/29] x86/efi: Initialize status to ensure garbage is not returned on small size Matt Fleming
2016-09-09 15:18 ` [PATCH 19/29] lib/ucs2_string: Speed up ucs2_utf8size() Matt Fleming
2016-09-09 15:18 ` [PATCH 21/29] fs/efivarfs: Fix double kfree() in error path Matt Fleming
2016-09-09 15:18 ` [PATCH 22/29] x86/efi: Remove unused find_bits() function Matt Fleming
2016-09-09 15:18 ` [PATCH 23/29] efi/arm64: Add debugfs node to dump UEFI runtime page tables Matt Fleming
2016-09-09 15:18 ` [PATCH 25/29] efi: Add efi_test driver for exporting UEFI runtime service interfaces Matt Fleming
2016-09-09 15:18 ` [PATCH 27/29] x86/efi: Use kmalloc_array() in efi_call_phys_prolog() Matt Fleming
2016-09-09 15:18 ` [PATCH 29/29] x86/efi: Allow invocation of arbitrary boot services Matt Fleming
2016-09-13 18:32 ` [GIT PULL 00/29] EFI changes for v4.9 Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160909151851.27577-9-matt@codeblueprint.co.uk \
    --to=matt@codeblueprint.co.uk \
    --cc=ard.biesheuvel@linaro.org \
    --cc=bp@alien8.de \
    --cc=dyoung@redhat.com \
    --cc=hpa@zytor.com \
    --cc=leif.lindholm@linaro.org \
    --cc=linux-efi@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mingo@kernel.org \
    --cc=pjones@redhat.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).