public inbox for linux-kernel@vger.kernel.org
* [PATCH v3 00/17] efi/x86: Avoid the need to mangle the EFI memory map
@ 2026-04-23 15:20 Ard Biesheuvel
  2026-04-23 15:20 ` [PATCH v3 01/17] x86/efi: Omit redundant kernel image overlap check Ard Biesheuvel
                   ` (16 more replies)
  0 siblings, 17 replies; 18+ messages in thread
From: Ard Biesheuvel @ 2026-04-23 15:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-efi, x86, Ard Biesheuvel, Mike Rapoport (Microsoft),
	Benjamin Herrenschmidt, Dave Young, Gregory Price

From: Ard Biesheuvel <ardb@kernel.org>

At boot, x86 uses E820 tables (3 different versions!), memblock tables
and the EFI memory map to reason about which parts of system RAM are
available to the OS, and which are reserved.

While other EFI architectures treat the EFI memory map as immutable, the
x86 boot code modifies it to keep track of memory reservations of boot
services data regions, in order to distinguish which parts have been
memblock_reserve()'d permanently, and which ones have been reserved only
temporarily to work around buggy implementations of the EFI runtime
service [SetVirtualAddressMap()] that reconfigures the VA space of the
runtime services themselves.

This method is mostly fine for marking entire regions as reserved, but
it gets complicated when the code decides to split EFI memory map
entries in order to mark some of it permanently reserved, and the rest
of it temporarily reserved.

Let's clean this up, by
- marking permanent reservations of EFI boot services data memory as
  MEMBLOCK_RSRV_KERN
- taking this marking into account when deciding whether or not an EFI
  boot services data region can be freed
- dropping all of the EFI memory map insertion/splitting logic and the
  allocation/freeing logic, all of which have become redundant.

Changes since v2:
- Avoid relying on memblock tables after those may have been freed
  already (spotted by Sashiko). Instead, tweak the ranges_to_free code
  added recently by Mike so that the array can grow arbitrarily, and
  carry multiple entries per EFI boot services data region.
- Fix use of the memory attributes table after kexec too, which is now
  feasible given that memblock_reserve()'ing EFI boot services memory is
  no longer broken - this supersedes [0]
- Drop memblock changes that were merged into 7.1-rc0 (formerly #1-#2)

Changes since v1:
- Also get rid of all reallocation logic, and just reuse the initial
  allocation throughout, and keep track of the number of valid entries
- Drop abuse of the EFI_MEMORY_RUNTIME flag
- Add acks from Mike to #1-#2

[0] https://lore.kernel.org/all/20260326132655.1733873-7-ardb+git@google.com/

Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Dave Young <ruirui.yang@linux.dev>
Cc: Gregory Price <gourry@gourry.net>

Ard Biesheuvel (17):
  x86/efi: Omit redundant kernel image overlap check
  x86/efi: Drop redundant EFI_PARAVIRT check
  x86/efi: Only merge EFI memory map entries on 32-bit systems
  x86/efi: Defer sub-1M check from unmap to free stage
  x86/efi: Simplify real mode trampoline allocation quirk
  x86/efi: Unmap kernel-reserved boot regions from EFI page tables
  x86/efi: Drop EFI_MEMORY_RUNTIME check from __ioremap_check_other()
  x86/efi: Allow ranges_to_free array to grow beyond initial size
  x86/efi: Intersect ranges_to_free with MEMBLOCK_RSRV_KERN regions
  x86/efi: Do not rely on EFI_MEMORY_RUNTIME bit and avoid entry
    splitting
  efi: Use nr_map not map_end to find the last valid memory map entry
  x86/efi: Clean the memory map using iterator and filter API
  x86/efi: Update the runtime map in place
  x86/efi: Reuse memory map instead of reallocating it
  x86/efi: Merge two traversals of the memory map when freeing boot
    regions
  x86/efi: Avoid EFI_MEMORY_RUNTIME for early EFI boot memory
    reservations
  x86/efi: Drop kexec quirk for the EFI memory attributes table

 arch/x86/include/asm/efi.h           |  15 +-
 arch/x86/mm/ioremap.c                |   8 +-
 arch/x86/platform/efi/Makefile       |   2 +-
 arch/x86/platform/efi/efi.c          | 167 +++--------
 arch/x86/platform/efi/efi_32.c       |  31 +++
 arch/x86/platform/efi/memmap.c       | 247 -----------------
 arch/x86/platform/efi/quirks.c       | 291 +++++++-------------
 arch/x86/platform/efi/runtime-map.c  |   4 +-
 drivers/firmware/efi/arm-runtime.c   |   2 +-
 drivers/firmware/efi/memmap.c        |   8 +-
 drivers/firmware/efi/riscv-runtime.c |   2 +-
 include/linux/efi.h                  |  13 +-
 12 files changed, 197 insertions(+), 593 deletions(-)
 delete mode 100644 arch/x86/platform/efi/memmap.c


base-commit: 2e68039281932e6dc37718a1ea7cbb8e2cda42e6
-- 
2.54.0.rc2.544.gc7ae2d5bb8-goog



* [PATCH v3 01/17] x86/efi: Omit redundant kernel image overlap check
  2026-04-23 15:20 [PATCH v3 00/17] efi/x86: Avoid the need to mangle the EFI memory map Ard Biesheuvel
@ 2026-04-23 15:20 ` Ard Biesheuvel
  2026-04-23 15:20 ` [PATCH v3 02/17] x86/efi: Drop redundant EFI_PARAVIRT check Ard Biesheuvel
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Ard Biesheuvel @ 2026-04-23 15:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-efi, x86, Ard Biesheuvel, Mike Rapoport (Microsoft),
	Benjamin Herrenschmidt, Dave Young, Gregory Price

From: Ard Biesheuvel <ardb@kernel.org>

The physical region covering the kernel's executable image is
memblock_reserve()'d in early_mem_reserve(), and so it is guaranteed not
to intersect with the regions passed to can_free_region(). So remove the
pointless overlap check.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/platform/efi/quirks.c | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index df24ffc6105d..4d8de7c6ce59 100644
--- a/arch/x86/platform/efi/quirks.c
+++ b/arch/x86/platform/efi/quirks.c
@@ -305,16 +305,11 @@ void __init efi_arch_mem_reserve(phys_addr_t addr, u64 size)
  * can free regions in efi_free_boot_services().
  *
  * Use this function to ensure we do not free regions owned by somebody
- * else. We must only reserve (and then free) regions:
- *
- * - Not within any part of the kernel
- * - Not the BIOS reserved area (E820_TYPE_RESERVED, E820_TYPE_NVS, etc)
+ * else. We must only reserve (and then free) regions that do not intersect
+ * with the BIOS reserved area (E820_TYPE_RESERVED, E820_TYPE_NVS, etc)
  */
 static __init bool can_free_region(u64 start, u64 size)
 {
-	if (start + size > __pa_symbol(_text) && start <= __pa_symbol(_end))
-		return false;
-
 	if (!e820__mapped_all(start, start+size, E820_TYPE_RAM))
 		return false;
 
@@ -343,10 +338,8 @@ void __init efi_reserve_boot_services(void)
 		 * Because the following memblock_reserve() is paired
 		 * with free_reserved_area() for this region in
 		 * efi_free_boot_services(), we must be extremely
-		 * careful not to reserve, and subsequently free,
-		 * critical regions of memory (like the kernel image) or
-		 * those regions that somebody else has already
-		 * reserved.
+		 * careful not to reserve, and subsequently free, critical
+		 * regions of memory that somebody else has already reserved.
 		 *
 		 * A good example of a critical region that must not be
 		 * freed is page zero (first 4Kb of memory), which may
-- 
2.54.0.rc2.544.gc7ae2d5bb8-goog



* [PATCH v3 02/17] x86/efi: Drop redundant EFI_PARAVIRT check
  2026-04-23 15:20 [PATCH v3 00/17] efi/x86: Avoid the need to mangle the EFI memory map Ard Biesheuvel
  2026-04-23 15:20 ` [PATCH v3 01/17] x86/efi: Omit redundant kernel image overlap check Ard Biesheuvel
@ 2026-04-23 15:20 ` Ard Biesheuvel
  2026-04-23 15:20 ` [PATCH v3 03/17] x86/efi: Only merge EFI memory map entries on 32-bit systems Ard Biesheuvel
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Ard Biesheuvel @ 2026-04-23 15:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-efi, x86, Ard Biesheuvel, Mike Rapoport (Microsoft),
	Benjamin Herrenschmidt, Dave Young, Gregory Price

From: Ard Biesheuvel <ardb@kernel.org>

efi_memblock_x86_reserve_range() exits early if EFI_PARAVIRT is set, so
there is no point in checking it a second time further down.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/platform/efi/efi.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 0c39adb96b91..1b7a0cd54d08 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -211,11 +211,9 @@ int __init efi_memblock_x86_reserve_range(void)
 	data.desc_size		= e->efi_memdesc_size;
 	data.desc_version	= e->efi_memdesc_version;
 
-	if (!efi_enabled(EFI_PARAVIRT)) {
-		rv = efi_memmap_init_early(&data);
-		if (rv)
-			return rv;
-	}
+	rv = efi_memmap_init_early(&data);
+	if (rv)
+		return rv;
 
 	if (add_efi_memmap || do_efi_soft_reserve())
 		do_add_efi_memmap();
-- 
2.54.0.rc2.544.gc7ae2d5bb8-goog



* [PATCH v3 03/17] x86/efi: Only merge EFI memory map entries on 32-bit systems
  2026-04-23 15:20 [PATCH v3 00/17] efi/x86: Avoid the need to mangle the EFI memory map Ard Biesheuvel
  2026-04-23 15:20 ` [PATCH v3 01/17] x86/efi: Omit redundant kernel image overlap check Ard Biesheuvel
  2026-04-23 15:20 ` [PATCH v3 02/17] x86/efi: Drop redundant EFI_PARAVIRT check Ard Biesheuvel
@ 2026-04-23 15:20 ` Ard Biesheuvel
  2026-04-23 15:20 ` [PATCH v3 04/17] x86/efi: Defer sub-1M check from unmap to free stage Ard Biesheuvel
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Ard Biesheuvel @ 2026-04-23 15:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-efi, x86, Ard Biesheuvel, Mike Rapoport (Microsoft),
	Benjamin Herrenschmidt, Dave Young, Gregory Price

From: Ard Biesheuvel <ardb@kernel.org>

Commit

  202f9d0a4180 ("x86, efi: Merge contiguous memory regions of the same type and attribute")

introduced a pass over the EFI memory map, ensuring that contiguous
regions of the same type and attribute are coalesced into a single
entry. This was needed because relative references may exist between
those regions, and so the virtual remapping needs to preserve the
relative placement of these regions. This virtual remapping was based on
ioremap() at the time, which does not guarantee that adjacent physical
addresses are mapped adjacently in the virtual space.

Commit

  d2f7cbe7b26a ("x86/efi: Runtime services virtual mapping")

introduced a new strategy for virtually remapping the EFI runtime
services, which is now the only remaining one, and commit

  a5caa209ba9c ("x86/efi: Fix boot crash by mapping EFI memmap entries bottom-up at runtime, instead of top-down")

tweaked the logic to ensure that the relative offset of adjacent regions
of any type is preserved on 64-bit systems, by reversing the order in
which the EFI memory map is traversed when choosing the virtual
placement.

This means that merging regions is no longer needed on 64-bit, given
that the relative placement of adjacent regions is guaranteed to be
preserved in the virtual space. So make this hack 32-bit only.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/include/asm/efi.h     |  6 ++++
 arch/x86/platform/efi/efi.c    | 31 --------------------
 arch/x86/platform/efi/efi_32.c | 31 ++++++++++++++++++++
 3 files changed, 37 insertions(+), 31 deletions(-)

diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index dc8fe1361c18..f291845b403a 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -143,6 +143,12 @@ extern void efi_unmap_boot_services(void);
 void arch_efi_call_virt_setup(void);
 void arch_efi_call_virt_teardown(void);
 
+#ifdef CONFIG_X86_32
+void efi_merge_regions(void);
+#else
+static inline void efi_merge_regions(void) {}
+#endif
+
 extern u64 efi_setup;
 
 #ifdef CONFIG_EFI
diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 1b7a0cd54d08..c0195b5eab21 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -502,37 +502,6 @@ void __init efi_init(void)
 		efi_print_memmap();
 }
 
-/* Merge contiguous regions of the same type and attribute */
-static void __init efi_merge_regions(void)
-{
-	efi_memory_desc_t *md, *prev_md = NULL;
-
-	for_each_efi_memory_desc(md) {
-		u64 prev_size;
-
-		if (!prev_md) {
-			prev_md = md;
-			continue;
-		}
-
-		if (prev_md->type != md->type ||
-		    prev_md->attribute != md->attribute) {
-			prev_md = md;
-			continue;
-		}
-
-		prev_size = prev_md->num_pages << EFI_PAGE_SHIFT;
-
-		if (md->phys_addr == (prev_md->phys_addr + prev_size)) {
-			prev_md->num_pages += md->num_pages;
-			md->type = EFI_RESERVED_TYPE;
-			md->attribute = 0;
-			continue;
-		}
-		prev_md = md;
-	}
-}
-
 static void *realloc_pages(void *old_memmap, int old_shift)
 {
 	void *ret;
diff --git a/arch/x86/platform/efi/efi_32.c b/arch/x86/platform/efi/efi_32.c
index b2cc7b4552a1..886ede4117b5 100644
--- a/arch/x86/platform/efi/efi_32.c
+++ b/arch/x86/platform/efi/efi_32.c
@@ -152,3 +152,34 @@ void arch_efi_call_virt_teardown(void)
 	firmware_restrict_branch_speculation_end();
 	efi_fpu_end();
 }
+
+/* Merge contiguous regions of the same type and attribute */
+void __init efi_merge_regions(void)
+{
+	efi_memory_desc_t *md, *prev_md = NULL;
+
+	for_each_efi_memory_desc(md) {
+		u64 prev_size;
+
+		if (!prev_md) {
+			prev_md = md;
+			continue;
+		}
+
+		if (prev_md->type != md->type ||
+		    prev_md->attribute != md->attribute) {
+			prev_md = md;
+			continue;
+		}
+
+		prev_size = prev_md->num_pages << EFI_PAGE_SHIFT;
+
+		if (md->phys_addr == (prev_md->phys_addr + prev_size)) {
+			prev_md->num_pages += md->num_pages;
+			md->type = EFI_RESERVED_TYPE;
+			md->attribute = 0;
+			continue;
+		}
+		prev_md = md;
+	}
+}
-- 
2.54.0.rc2.544.gc7ae2d5bb8-goog



* [PATCH v3 04/17] x86/efi: Defer sub-1M check from unmap to free stage
  2026-04-23 15:20 [PATCH v3 00/17] efi/x86: Avoid the need to mangle the EFI memory map Ard Biesheuvel
                   ` (2 preceding siblings ...)
  2026-04-23 15:20 ` [PATCH v3 03/17] x86/efi: Only merge EFI memory map entries on 32-bit systems Ard Biesheuvel
@ 2026-04-23 15:20 ` Ard Biesheuvel
  2026-04-23 15:20 ` [PATCH v3 05/17] x86/efi: Simplify real mode trampoline allocation quirk Ard Biesheuvel
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Ard Biesheuvel @ 2026-04-23 15:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-efi, x86, Ard Biesheuvel, Mike Rapoport (Microsoft),
	Benjamin Herrenschmidt, Dave Young, Gregory Price

From: Ard Biesheuvel <ardb@kernel.org>

As a first step towards moving the free logic to a later stage
altogether, and only keeping the unmap and the realmode trampoline hack
during the early stage of freeing the boot service code and data
regions, move the logic that avoids freeing memory below 1M to the later
stage.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/platform/efi/quirks.c | 28 +++++++++-----------
 1 file changed, 12 insertions(+), 16 deletions(-)

diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index 4d8de7c6ce59..e2e57e9201a9 100644
--- a/arch/x86/platform/efi/quirks.c
+++ b/arch/x86/platform/efi/quirks.c
@@ -468,18 +468,6 @@ void __init efi_unmap_boot_services(void)
 			size -= rm_size;
 		}
 
-		/*
-		 * Don't free memory under 1M for two reasons:
-		 * - BIOS might clobber it
-		 * - Crash kernel needs it to be reserved
-		 */
-		if (start + size < SZ_1M)
-			continue;
-		if (start < SZ_1M) {
-			size -= (SZ_1M - start);
-			start = SZ_1M;
-		}
-
 		/*
 		 * With CONFIG_DEFERRED_STRUCT_PAGE_INIT parts of the memory
 		 * map are still not initialized and we can't reliably free
@@ -537,12 +525,20 @@ static int __init efi_free_boot_services(void)
 	if (!ranges_to_free)
 		return 0;
 
-	while (range->start) {
-		void *start = phys_to_virt(range->start);
+	while (range->start || range->end) {
+		/*
+		 * Don't free memory under 1M for two reasons:
+		 * - BIOS might clobber it
+		 * - Crash kernel needs it to be reserved
+		 */
+		unsigned long s = max(range->start, SZ_1M);
+		void *start = phys_to_virt(s);
 		void *end = phys_to_virt(range->end);
 
-		free_reserved_area(start, end, -1, NULL);
-		freed += (end - start);
+		if (start < end) {
+			free_reserved_area(start, end, -1, NULL);
+			freed += (end - start);
+		}
 		range++;
 	}
 	kfree(ranges_to_free);
-- 
2.54.0.rc2.544.gc7ae2d5bb8-goog
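
For illustration only (not part of the patch): a minimal userspace sketch
of the clamping that this patch moves into the free stage. The names
freeable_range and bytes_above_1m are hypothetical stand-ins, and the
byte count takes the place of the free_reserved_area() call. It assumes a
zero-terminated array and shows why the loop has to test both fields: an
entry that starts at physical address 0 is valid data, not the
terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_1M 0x100000ULL

/* Hypothetical stand-in for the kernel's struct efi_freeable_range. */
struct freeable_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Walk a zero-terminated array of ranges and return the number of bytes
 * that lie at or above 1 MiB. In the kernel, the call that frees the
 * range would sit where the byte count is accumulated.
 */
static uint64_t bytes_above_1m(const struct freeable_range *range)
{
	uint64_t total = 0;

	/* Only an entry with both fields zero terminates the array. */
	for (; range->start || range->end; range++) {
		/* Clamp the lower bound so nothing below 1 MiB is counted. */
		uint64_t start = range->start > SZ_1M ? range->start : SZ_1M;

		/* Ranges that end at or below 1 MiB become empty and are skipped. */
		if (start < range->end)
			total += range->end - start;
	}
	return total;
}
```

A range that lies entirely below 1 MiB contributes nothing, a range that
straddles the boundary contributes only its upper part, and a range above
the boundary is counted in full.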



* [PATCH v3 05/17] x86/efi: Simplify real mode trampoline allocation quirk
  2026-04-23 15:20 [PATCH v3 00/17] efi/x86: Avoid the need to mangle the EFI memory map Ard Biesheuvel
                   ` (3 preceding siblings ...)
  2026-04-23 15:20 ` [PATCH v3 04/17] x86/efi: Defer sub-1M check from unmap to free stage Ard Biesheuvel
@ 2026-04-23 15:20 ` Ard Biesheuvel
  2026-04-23 15:20 ` [PATCH v3 06/17] x86/efi: Unmap kernel-reserved boot regions from EFI page tables Ard Biesheuvel
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Ard Biesheuvel @ 2026-04-23 15:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-efi, x86, Ard Biesheuvel, Mike Rapoport (Microsoft),
	Benjamin Herrenschmidt, Dave Young, Gregory Price

From: Ard Biesheuvel <ardb@kernel.org>

To work around a common bug in EFI firmware for x86 systems, Linux
reserves all EFI boot services code and data regions until after it has
invoked the SetVirtualAddressMap() EFI runtime service. This is needed
because those regions may still be accessed by the firmware during that
call, even though the EFI spec says that they shouldn't.

This includes any boot services data regions below 1M, which might mean
that by the time the real mode trampoline is being allocated, all memory
below 1M is already exhausted.

Commit

  5bc653b73182 ("x86/efi: Allocate a trampoline if needed in efi_free_boot_services()")

added a quirk to detect this condition, and to make another attempt at
allocating the real mode trampoline when freeing those boot services
regions again. This is a rather crude hack, which gets in the way of
cleanup work on the EFI/x86 memory map handling code.

Given that

- the real mode trampoline is normally allocated soon after all EFI boot
  services regions are reserved temporarily,
- this allocation logic marks all memory below 1M as reserved,
- the trampoline memory is not actually populated until an early
  initcall,

there is actually no need to reserve any boot services regions below 1M,
even if they are mapped into the EFI page tables during the call to
SetVirtualAddressMap(). So cap the lower bound of the reserved regions
to 1M, and fix up the size accordingly when making the reservation. This
allows the additional quirk to be dropped entirely.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/platform/efi/quirks.c | 34 ++++----------------
 1 file changed, 6 insertions(+), 28 deletions(-)

diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index e2e57e9201a9..e79fb94c1bf6 100644
--- a/arch/x86/platform/efi/quirks.c
+++ b/arch/x86/platform/efi/quirks.c
@@ -324,10 +324,14 @@ void __init efi_reserve_boot_services(void)
 		return;
 
 	for_each_efi_memory_desc(md) {
-		u64 start = md->phys_addr;
-		u64 size = md->num_pages << EFI_PAGE_SHIFT;
+		u64 start = max(md->phys_addr, SZ_1M);
+		u64 end = md->phys_addr + (md->num_pages << EFI_PAGE_SHIFT);
+		u64 size = end - start;
 		bool already_reserved;
 
+		if (end <= start)
+			continue;
+
 		if (md->type != EFI_BOOT_SERVICES_CODE &&
 		    md->type != EFI_BOOT_SERVICES_DATA)
 			continue;
@@ -340,11 +344,6 @@ void __init efi_reserve_boot_services(void)
 		 * efi_free_boot_services(), we must be extremely
 		 * careful not to reserve, and subsequently free, critical
 		 * regions of memory that somebody else has already reserved.
-		 *
-		 * A good example of a critical region that must not be
-		 * freed is page zero (first 4Kb of memory), which may
-		 * contain boot services code/data but is marked
-		 * E820_TYPE_RESERVED by trim_bios_range().
 		 */
 		if (!already_reserved) {
 			memblock_reserve(start, size);
@@ -427,7 +426,6 @@ void __init efi_unmap_boot_services(void)
 	for_each_efi_memory_desc(md) {
 		unsigned long long start = md->phys_addr;
 		unsigned long long size = md->num_pages << EFI_PAGE_SHIFT;
-		size_t rm_size;
 
 		if (md->type != EFI_BOOT_SERVICES_CODE &&
 		    md->type != EFI_BOOT_SERVICES_DATA) {
@@ -448,26 +446,6 @@ void __init efi_unmap_boot_services(void)
 		 */
 		efi_unmap_pages(md);
 
-		/*
-		 * Nasty quirk: if all sub-1MB memory is used for boot
-		 * services, we can get here without having allocated the
-		 * real mode trampoline.  It's too late to hand boot services
-		 * memory back to the memblock allocator, so instead
-		 * try to manually allocate the trampoline if needed.
-		 *
-		 * I've seen this on a Dell XPS 13 9350 with firmware
-		 * 1.4.4 with SGX enabled booting Linux via Fedora 24's
-		 * grub2-efi on a hard disk.  (And no, I don't know why
-		 * this happened, but Linux should still try to boot rather
-		 * panicking early.)
-		 */
-		rm_size = real_mode_size_needed();
-		if (rm_size && (start + rm_size) < (1<<20) && size >= rm_size) {
-			set_real_mode_mem(start);
-			start += rm_size;
-			size -= rm_size;
-		}
-
 		/*
 		 * With CONFIG_DEFERRED_STRUCT_PAGE_INIT parts of the memory
 		 * map are still not initialized and we can't reliably free
-- 
2.54.0.rc2.544.gc7ae2d5bb8-goog



* [PATCH v3 06/17] x86/efi: Unmap kernel-reserved boot regions from EFI page tables
  2026-04-23 15:20 [PATCH v3 00/17] efi/x86: Avoid the need to mangle the EFI memory map Ard Biesheuvel
                   ` (4 preceding siblings ...)
  2026-04-23 15:20 ` [PATCH v3 05/17] x86/efi: Simplify real mode trampoline allocation quirk Ard Biesheuvel
@ 2026-04-23 15:20 ` Ard Biesheuvel
  2026-04-23 15:20 ` [PATCH v3 07/17] x86/efi: Drop EFI_MEMORY_RUNTIME check from __ioremap_check_other() Ard Biesheuvel
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Ard Biesheuvel @ 2026-04-23 15:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-efi, x86, Ard Biesheuvel, Mike Rapoport (Microsoft),
	Benjamin Herrenschmidt, Dave Young, Gregory Price

From: Ard Biesheuvel <ardb@kernel.org>

Currently, the logic that unmaps boot services code and data regions
that were mapped temporarily to work around firmware bugs disregards
regions that have been marked as EFI_MEMORY_RUNTIME. However, such
regions only have significance to the OS, and there is no reason to
retain the mapping in the EFI page tables, given that the runtime
firmware must never touch those regions.

So pull the unmap forward.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/platform/efi/quirks.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index e79fb94c1bf6..1d10277796b7 100644
--- a/arch/x86/platform/efi/quirks.c
+++ b/arch/x86/platform/efi/quirks.c
@@ -433,12 +433,6 @@ void __init efi_unmap_boot_services(void)
 			continue;
 		}
 
-		/* Do not free, someone else owns it: */
-		if (md->attribute & EFI_MEMORY_RUNTIME) {
-			num_entries++;
-			continue;
-		}
-
 		/*
 		 * Before calling set_virtual_address_map(), EFI boot services
 		 * code/data regions were mapped as a quirk for buggy firmware.
@@ -446,6 +440,12 @@ void __init efi_unmap_boot_services(void)
 		 */
 		efi_unmap_pages(md);
 
+		/* Do not free, someone else owns it: */
+		if (md->attribute & EFI_MEMORY_RUNTIME) {
+			num_entries++;
+			continue;
+		}
+
 		/*
 		 * With CONFIG_DEFERRED_STRUCT_PAGE_INIT parts of the memory
 		 * map are still not initialized and we can't reliably free
-- 
2.54.0.rc2.544.gc7ae2d5bb8-goog



* [PATCH v3 07/17] x86/efi: Drop EFI_MEMORY_RUNTIME check from __ioremap_check_other()
  2026-04-23 15:20 [PATCH v3 00/17] efi/x86: Avoid the need to mangle the EFI memory map Ard Biesheuvel
                   ` (5 preceding siblings ...)
  2026-04-23 15:20 ` [PATCH v3 06/17] x86/efi: Unmap kernel-reserved boot regions from EFI page tables Ard Biesheuvel
@ 2026-04-23 15:20 ` Ard Biesheuvel
  2026-04-23 15:20 ` [PATCH v3 08/17] x86/efi: Allow ranges_to_free array to grow beyond initial size Ard Biesheuvel
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Ard Biesheuvel @ 2026-04-23 15:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-efi, x86, Ard Biesheuvel, Mike Rapoport (Microsoft),
	Benjamin Herrenschmidt, Dave Young, Gregory Price, Tom Lendacky

From: Ard Biesheuvel <ardb@kernel.org>

__ioremap_check_other() is called when memremap() is used on memory that
turns out to be reserved. This may be the case for ESRT or MOK tables
that are reserved via efi_mem_reserve(), in which case they will be
covered by EfiBootServicesData entries in the EFI memory map.

Such entries are created with the EFI_MEMORY_RUNTIME attribute set, to
distinguish them from EfiBootServicesData entries that were reserved
only temporarily, in order to work around firmware bugs.

However, given that
a) __ioremap_check_other() is only called for memory that could not be
   mapped using try_ram_remap(),
b) on x86, the EFI memory map only retains EfiBootServicesData entries that
   cover a permanent reservation,
the EFI_MEMORY_RUNTIME check is redundant, and can be dropped.

This removes the need to set this attribute in the first place, which is
desirable as it results in considerable complexity in managing the EFI
memory map on x86. This will be addressed in follow-up work.

While at it, use switch() rather than if() to avoid multiple calls to
efi_mem_type(), which is backed by a hypervisor call in some cases.

Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/mm/ioremap.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 12c8180ca1ba..50377423d422 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -124,10 +124,12 @@ static void __ioremap_check_other(resource_size_t addr, struct ioremap_desc *des
 	if (!IS_ENABLED(CONFIG_EFI))
 		return;
 
-	if (efi_mem_type(addr) == EFI_RUNTIME_SERVICES_DATA ||
-	    (efi_mem_type(addr) == EFI_BOOT_SERVICES_DATA &&
-	     efi_mem_attributes(addr) & EFI_MEMORY_RUNTIME))
+	switch (efi_mem_type(addr)) {
+	case EFI_RUNTIME_SERVICES_DATA:
+	case EFI_BOOT_SERVICES_DATA:
 		desc->flags |= IORES_MAP_ENCRYPTED;
+		break;
+	}
 }
 
 static int __ioremap_collect_map_flags(struct resource *res, void *arg)
-- 
2.54.0.rc2.544.gc7ae2d5bb8-goog



* [PATCH v3 08/17] x86/efi: Allow ranges_to_free array to grow beyond initial size
  2026-04-23 15:20 [PATCH v3 00/17] efi/x86: Avoid the need to mangle the EFI memory map Ard Biesheuvel
                   ` (6 preceding siblings ...)
  2026-04-23 15:20 ` [PATCH v3 07/17] x86/efi: Drop EFI_MEMORY_RUNTIME check from __ioremap_check_other() Ard Biesheuvel
@ 2026-04-23 15:20 ` Ard Biesheuvel
  2026-04-23 15:20 ` [PATCH v3 09/17] x86/efi: Intersect ranges_to_free with MEMBLOCK_RSRV_KERN regions Ard Biesheuvel
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Ard Biesheuvel @ 2026-04-23 15:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-efi, x86, Ard Biesheuvel, Mike Rapoport (Microsoft),
	Benjamin Herrenschmidt, Dave Young, Gregory Price

From: Ard Biesheuvel <ardb@kernel.org>

In order to avoid the need to mangle the EFI memory map, which is being
done to keep track of which boot services data regions are permanently
reserved, and which ones are only reserved temporarily, this information
needs to be recorded in a different manner.

The temporary ranges_to_free array is a suitable candidate, as it is
specifically intended to capture which boot services data regions should
be handed back to the page allocator once deferred struct page
initialization is done.

This requires that boot services data regions are intersected with the
memblock reserved list, and this may result in more ranges_to_free
elements than the current upper bound of the number of EFI memory map
entries.

So reallocate the array when running out of slots.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/platform/efi/quirks.c | 40 ++++++++++++++++----
 1 file changed, 32 insertions(+), 8 deletions(-)

diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index 1d10277796b7..ce452e5c2f0a 100644
--- a/arch/x86/platform/efi/quirks.c
+++ b/arch/x86/platform/efi/quirks.c
@@ -401,23 +401,46 @@ struct efi_freeable_range {
 	u64 end;
 };
 
-static struct efi_freeable_range *ranges_to_free;
+static struct efi_freeable_range *ranges_to_free __initdata;
+static int num_to_free __initdata;
+
+static int __init efi_add_range_to_free(u64 range_start, u64 range_end)
+{
+	static int idx __initdata;
+
+	ranges_to_free[idx].start = range_start;
+	ranges_to_free[idx].end = range_end;
+
+	if (++idx >= num_to_free) {
+		num_to_free *= 2;
+		ranges_to_free = krealloc_array(ranges_to_free,
+						num_to_free,
+						sizeof(ranges_to_free[0]),
+						GFP_KERNEL);
+		if (!ranges_to_free)
+			return -ENOMEM;
+	}
+
+	/* add a terminating entry at the end */
+	ranges_to_free[idx].start = ranges_to_free[idx].end = 0;
+
+	return 0;
+}
 
 void __init efi_unmap_boot_services(void)
 {
 	struct efi_memory_map_data data = { 0 };
 	efi_memory_desc_t *md;
 	int num_entries = 0;
-	int idx = 0;
-	size_t sz;
 	void *new, *new_md;
 
 	/* Keep all regions for /sys/kernel/debug/efi */
 	if (efi_enabled(EFI_DBG))
 		return;
 
-	sz = sizeof(*ranges_to_free) * (efi.memmap.nr_map + 1);
-	ranges_to_free = kzalloc(sz, GFP_KERNEL);
+	num_to_free = efi.memmap.nr_map;
+	ranges_to_free = kmalloc_array(num_to_free, sizeof(ranges_to_free[0]),
+				       GFP_KERNEL);
 	if (!ranges_to_free) {
 		pr_err("Failed to allocate storage for freeable EFI regions\n");
 		return;
@@ -452,9 +475,10 @@ void __init efi_unmap_boot_services(void)
 		 * memory here.
 		 * Queue the ranges to free at a later point.
 		 */
-		ranges_to_free[idx].start = start;
-		ranges_to_free[idx].end = start + size;
-		idx++;
+		if (efi_add_range_to_free(start, start + size)) {
+			pr_err("Failed to reallocate storage for freeable EFI regions\n");
+			return;
+		}
 	}
 
 	if (!num_entries)
-- 
2.54.0.rc2.544.gc7ae2d5bb8-goog
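
For illustration only (not part of the patch): a userspace sketch of a
growable, zero-terminated range array in the spirit of
efi_add_range_to_free(). The names range_list, range_list_init and
range_list_add are hypothetical, and calloc()/realloc() stand in for
kmalloc_array()/krealloc_array(). The sketch assumes that one slot is
always kept spare for the terminating entry.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct range {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical container; the patch uses file-scope variables instead. */
struct range_list {
	struct range *items;
	size_t used;	/* entries in use, excluding the terminator */
	size_t cap;	/* allocated slots, including the terminator */
};

static int range_list_init(struct range_list *l, size_t cap)
{
	if (cap == 0)
		cap = 1;	/* always leave room for the terminator */

	/* calloc() zeroes the array, so an empty list is already terminated. */
	l->items = calloc(cap, sizeof(l->items[0]));
	if (!l->items)
		return -1;
	l->used = 0;
	l->cap = cap;
	return 0;
}

static int range_list_add(struct range_list *l, uint64_t start, uint64_t end)
{
	/* Grow first when the new entry plus terminator would not fit. */
	if (l->used + 1 >= l->cap) {
		size_t new_cap = l->cap * 2;
		struct range *tmp = realloc(l->items, new_cap * sizeof(*tmp));

		if (!tmp)
			return -1;	/* the old array is still valid */
		l->items = tmp;
		l->cap = new_cap;
	}

	l->items[l->used].start = start;
	l->items[l->used].end = end;
	l->used++;

	/* Re-establish the terminating entry after the last valid one. */
	l->items[l->used].start = 0;
	l->items[l->used].end = 0;
	return 0;
}
```

One design choice here differs from the patch: the result of realloc() is
held in a temporary, so the existing array stays reachable if the
allocation fails. Doubling the capacity keeps the number of reallocations
logarithmic in the number of entries.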



* [PATCH v3 09/17] x86/efi: Intersect ranges_to_free with MEMBLOCK_RSRV_KERN regions
  2026-04-23 15:20 [PATCH v3 00/17] efi/x86: Avoid the need to mangle the EFI memory map Ard Biesheuvel
                   ` (7 preceding siblings ...)
  2026-04-23 15:20 ` [PATCH v3 08/17] x86/efi: Allow ranges_to_free array to grow beyond initial size Ard Biesheuvel
@ 2026-04-23 15:20 ` Ard Biesheuvel
  2026-04-23 15:20 ` [PATCH v3 10/17] x86/efi: Do not rely on EFI_MEMORY_RUNTIME bit and avoid entry splitting Ard Biesheuvel
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Ard Biesheuvel @ 2026-04-23 15:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-efi, x86, Ard Biesheuvel, Mike Rapoport (Microsoft),
	Benjamin Herrenschmidt, Dave Young, Gregory Price

From: Ard Biesheuvel <ardb@kernel.org>

efi_mem_reserve() will mark reservations in memblock with the
MEMBLOCK_RSRV_KERN attribute, so take this into account when building
the ranges_to_free array. This removes the need to split EFI memory map
entries and tag them with the EFI_MEMORY_RUNTIME attribute.
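
For illustration only, the intersection step can be sketched in plain
userspace C. Everything below is an assumption made for the example
rather than code from this patch: struct region, struct range, RSRV_KERN
and collect_freeable() are hypothetical stand-ins for the memblock
region iterator, the MEMBLOCK_RSRV_KERN flag and efi_add_range_to_free(),
and a fixed-size output array replaces the growable ranges_to_free one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096ULL
#define RSRV_KERN 0x1u	/* hypothetical stand-in for MEMBLOCK_RSRV_KERN */

struct region { uint64_t base, size; unsigned int flags; };
struct range { uint64_t start, end; };

static uint64_t page_align_up(uint64_t x)
{
	return (x + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
}

static uint64_t page_align_down(uint64_t x)
{
	return x & ~(PAGE_SIZE - 1);
}

/*
 * Collect the page-aligned parts of [range_start, range_end) that are
 * covered by reserved regions lacking RSRV_KERN. 'regions' must be
 * sorted by base address. Returns the number of ranges stored in 'out',
 * which is never more than 'max'.
 */
static size_t collect_freeable(const struct region *regions, size_t nr,
			       uint64_t range_start, uint64_t range_end,
			       struct range *out, size_t max)
{
	size_t n = 0;

	assert(range_start <= range_end);

	for (size_t i = 0; i < nr && n < max; i++) {
		uint64_t region_end = regions[i].base + regions[i].size;
		uint64_t start, end;

		/* sorted input: no later region can overlap the range */
		if (regions[i].base >= range_end)
			break;

		if (region_end < range_start)
			continue;

		/* permanent reservation: never queued for freeing */
		if (regions[i].flags & RSRV_KERN)
			continue;

		/* shrink inwards so that only whole pages are queued */
		start = page_align_up(regions[i].base > range_start ?
				      regions[i].base : range_start);
		end = page_align_down(region_end < range_end ?
				      region_end : range_end);
		if (start >= end)
			continue;

		out[n].start = start;
		out[n].end = end;
		n++;
	}
	return n;
}
```

Rounding the start up and the end down means a partially covered page is
never queued, which errs on the side of keeping memory reserved.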

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/platform/efi/quirks.c | 44 +++++++++++++++-----
 1 file changed, 33 insertions(+), 11 deletions(-)

diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index ce452e5c2f0a..ce06a388fc11 100644
--- a/arch/x86/platform/efi/quirks.c
+++ b/arch/x86/platform/efi/quirks.c
@@ -406,19 +406,41 @@ static int num_to_free __initdata;
 
 static int __init efi_add_range_to_free(u64 range_start, u64 range_end)
 {
+	struct memblock_region *region;
 	static int idx __initdata;
 
-	ranges_to_free[idx].start = range_start;
-	ranges_to_free[idx].end = range_end;
-
-	if (++idx >= num_to_free) {
-		num_to_free *= 2;
-		ranges_to_free = krealloc_array(ranges_to_free,
-						num_to_free,
-						sizeof(ranges_to_free[0]),
-						GFP_KERNEL);
-		if (!ranges_to_free)
-			return -ENOMEM;
+	for_each_reserved_mem_region(region) {
+		u64 region_end = region->base + region->size;
+		u64 start, end;
+
+		/* memblock tables are sorted so no need to carry on */
+		if (region->base >= range_end)
+			break;
+
+		if (region_end < range_start)
+			continue;
+
+		if (region->flags & MEMBLOCK_RSRV_KERN)
+			continue;
+
+		start = PAGE_ALIGN(max(range_start, region->base));
+		end = PAGE_ALIGN_DOWN(min(range_end, region_end));
+
+		if (start >= end)
+			continue;
+
+		ranges_to_free[idx].start = start;
+		ranges_to_free[idx].end = end;
+
+		if (++idx >= num_to_free) {
+			num_to_free *= 2;
+			ranges_to_free = krealloc_array(ranges_to_free,
+							num_to_free,
+							sizeof(ranges_to_free[0]),
+							GFP_KERNEL);
+			if (!ranges_to_free)
+				return -ENOMEM;
+		}
 	}
 
 	/* add a terminating entry at the end */
-- 
2.54.0.rc2.544.gc7ae2d5bb8-goog


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 10/17] x86/efi: Do not rely on EFI_MEMORY_RUNTIME bit and avoid entry splitting
  2026-04-23 15:20 [PATCH v3 00/17] efi/x86: Avoid the need to mangle the EFI memory map Ard Biesheuvel
                   ` (8 preceding siblings ...)
  2026-04-23 15:20 ` [PATCH v3 09/17] x86/efi: Intersect ranges_to_free with MEMBLOCK_RSRV_KERN regions Ard Biesheuvel
@ 2026-04-23 15:20 ` Ard Biesheuvel
  2026-04-23 15:20 ` [PATCH v3 11/17] efi: Use nr_map not map_end to find the last valid memory map entry Ard Biesheuvel
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Ard Biesheuvel @ 2026-04-23 15:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-efi, x86, Ard Biesheuvel, Mike Rapoport (Microsoft),
	Benjamin Herrenschmidt, Dave Young, Gregory Price

From: Ard Biesheuvel <ardb@kernel.org>

Now that efi_mem_reserve() relies on MEMBLOCK_RSRV_KERN memblock
reservations, there is no longer any need to mark memblock-reserved
boot services regions as EFI_MEMORY_RUNTIME. This in turn means that
existing entries in the EFI memory map no longer need to be split, which
removes the need to re-allocate/copy/remap the entire EFI memory map on
every call to efi_mem_reserve().

So drop this functionality.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/include/asm/efi.h     |   4 -
 arch/x86/platform/efi/memmap.c | 138 --------------------
 arch/x86/platform/efi/quirks.c |  77 +++--------
 3 files changed, 17 insertions(+), 202 deletions(-)

diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index f291845b403a..7d8f627805df 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -402,10 +402,6 @@ extern int __init efi_memmap_alloc(unsigned int num_entries,
 				   struct efi_memory_map_data *data);
 
 extern int __init efi_memmap_install(struct efi_memory_map_data *data);
-extern int __init efi_memmap_split_count(efi_memory_desc_t *md,
-					 struct range *range);
-extern void __init efi_memmap_insert(struct efi_memory_map *old_memmap,
-				     void *buf, struct efi_mem_range *mem);
 
 enum efi_secureboot_mode __x86_efi_boot_mode(void);
 
diff --git a/arch/x86/platform/efi/memmap.c b/arch/x86/platform/efi/memmap.c
index 697a9a26a005..da7483a33d68 100644
--- a/arch/x86/platform/efi/memmap.c
+++ b/arch/x86/platform/efi/memmap.c
@@ -107,141 +107,3 @@ int __init efi_memmap_install(struct efi_memory_map_data *data)
 	__efi_memmap_free(phys, size, flags);
 	return 0;
 }
-
-/**
- * efi_memmap_split_count - Count number of additional EFI memmap entries
- * @md: EFI memory descriptor to split
- * @range: Address range (start, end) to split around
- *
- * Returns the number of additional EFI memmap entries required to
- * accommodate @range.
- */
-int __init efi_memmap_split_count(efi_memory_desc_t *md, struct range *range)
-{
-	u64 m_start, m_end;
-	u64 start, end;
-	int count = 0;
-
-	start = md->phys_addr;
-	end = start + (md->num_pages << EFI_PAGE_SHIFT) - 1;
-
-	/* modifying range */
-	m_start = range->start;
-	m_end = range->end;
-
-	if (m_start <= start) {
-		/* split into 2 parts */
-		if (start < m_end && m_end < end)
-			count++;
-	}
-
-	if (start < m_start && m_start < end) {
-		/* split into 3 parts */
-		if (m_end < end)
-			count += 2;
-		/* split into 2 parts */
-		if (end <= m_end)
-			count++;
-	}
-
-	return count;
-}
-
-/**
- * efi_memmap_insert - Insert a memory region in an EFI memmap
- * @old_memmap: The existing EFI memory map structure
- * @buf: Address of buffer to store new map
- * @mem: Memory map entry to insert
- *
- * It is suggested that you call efi_memmap_split_count() first
- * to see how large @buf needs to be.
- */
-void __init efi_memmap_insert(struct efi_memory_map *old_memmap, void *buf,
-			      struct efi_mem_range *mem)
-{
-	u64 m_start, m_end, m_attr;
-	efi_memory_desc_t *md;
-	u64 start, end;
-	void *old, *new;
-
-	/* modifying range */
-	m_start = mem->range.start;
-	m_end = mem->range.end;
-	m_attr = mem->attribute;
-
-	/*
-	 * The EFI memory map deals with regions in EFI_PAGE_SIZE
-	 * units. Ensure that the region described by 'mem' is aligned
-	 * correctly.
-	 */
-	if (!IS_ALIGNED(m_start, EFI_PAGE_SIZE) ||
-	    !IS_ALIGNED(m_end + 1, EFI_PAGE_SIZE)) {
-		WARN_ON(1);
-		return;
-	}
-
-	for (old = old_memmap->map, new = buf;
-	     old < old_memmap->map_end;
-	     old += old_memmap->desc_size, new += old_memmap->desc_size) {
-
-		/* copy original EFI memory descriptor */
-		memcpy(new, old, old_memmap->desc_size);
-		md = new;
-		start = md->phys_addr;
-		end = md->phys_addr + (md->num_pages << EFI_PAGE_SHIFT) - 1;
-
-		if (m_start <= start && end <= m_end)
-			md->attribute |= m_attr;
-
-		if (m_start <= start &&
-		    (start < m_end && m_end < end)) {
-			/* first part */
-			md->attribute |= m_attr;
-			md->num_pages = (m_end - md->phys_addr + 1) >>
-				EFI_PAGE_SHIFT;
-			/* latter part */
-			new += old_memmap->desc_size;
-			memcpy(new, old, old_memmap->desc_size);
-			md = new;
-			md->phys_addr = m_end + 1;
-			md->num_pages = (end - md->phys_addr + 1) >>
-				EFI_PAGE_SHIFT;
-		}
-
-		if ((start < m_start && m_start < end) && m_end < end) {
-			/* first part */
-			md->num_pages = (m_start - md->phys_addr) >>
-				EFI_PAGE_SHIFT;
-			/* middle part */
-			new += old_memmap->desc_size;
-			memcpy(new, old, old_memmap->desc_size);
-			md = new;
-			md->attribute |= m_attr;
-			md->phys_addr = m_start;
-			md->num_pages = (m_end - m_start + 1) >>
-				EFI_PAGE_SHIFT;
-			/* last part */
-			new += old_memmap->desc_size;
-			memcpy(new, old, old_memmap->desc_size);
-			md = new;
-			md->phys_addr = m_end + 1;
-			md->num_pages = (end - m_end) >>
-				EFI_PAGE_SHIFT;
-		}
-
-		if ((start < m_start && m_start < end) &&
-		    (end <= m_end)) {
-			/* first part */
-			md->num_pages = (m_start - md->phys_addr) >>
-				EFI_PAGE_SHIFT;
-			/* latter part */
-			new += old_memmap->desc_size;
-			memcpy(new, old, old_memmap->desc_size);
-			md = new;
-			md->phys_addr = m_start;
-			md->num_pages = (end - md->phys_addr + 1) >>
-				EFI_PAGE_SHIFT;
-			md->attribute |= m_attr;
-		}
-	}
-}
diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index ce06a388fc11..efb828b7e2ab 100644
--- a/arch/x86/platform/efi/quirks.c
+++ b/arch/x86/platform/efi/quirks.c
@@ -239,63 +239,9 @@ EXPORT_SYMBOL_GPL(efi_query_variable_store);
  * buggy implementations we reserve boot services region during EFI
  * init and make sure it stays executable. Then, after
  * SetVirtualAddressMap(), it is discarded.
- *
- * However, some boot services regions contain data that is required
- * by drivers, so we need to track which memory ranges can never be
- * freed. This is done by tagging those regions with the
- * EFI_MEMORY_RUNTIME attribute.
- *
- * Any driver that wants to mark a region as reserved must use
- * efi_mem_reserve() which will insert a new EFI memory descriptor
- * into efi.memmap (splitting existing regions if necessary) and tag
- * it with EFI_MEMORY_RUNTIME.
  */
 void __init efi_arch_mem_reserve(phys_addr_t addr, u64 size)
 {
-	struct efi_memory_map_data data = { 0 };
-	struct efi_mem_range mr;
-	efi_memory_desc_t md;
-	int num_entries;
-	void *new;
-
-	if (efi_mem_desc_lookup(addr, &md) ||
-	    md.type != EFI_BOOT_SERVICES_DATA) {
-		pr_err("Failed to lookup EFI memory descriptor for %pa\n", &addr);
-		return;
-	}
-
-	if (addr + size > md.phys_addr + (md.num_pages << EFI_PAGE_SHIFT)) {
-		pr_err("Region spans EFI memory descriptors, %pa\n", &addr);
-		return;
-	}
-
-	size += addr % EFI_PAGE_SIZE;
-	size = round_up(size, EFI_PAGE_SIZE);
-	addr = round_down(addr, EFI_PAGE_SIZE);
-
-	mr.range.start = addr;
-	mr.range.end = addr + size - 1;
-	mr.attribute = md.attribute | EFI_MEMORY_RUNTIME;
-
-	num_entries = efi_memmap_split_count(&md, &mr.range);
-	num_entries += efi.memmap.nr_map;
-
-	if (efi_memmap_alloc(num_entries, &data) != 0) {
-		pr_err("Could not allocate boot services memmap\n");
-		return;
-	}
-
-	new = early_memremap_prot(data.phys_map, data.size,
-				  pgprot_val(pgprot_encrypted(FIXMAP_PAGE_NORMAL)));
-	if (!new) {
-		pr_err("Failed to map new boot services memmap\n");
-		return;
-	}
-
-	efi_memmap_insert(&efi.memmap, new, &mr);
-	early_memunmap(new, data.size);
-
-	efi_memmap_install(&data);
 	e820__range_update(addr, size, E820_TYPE_RAM, E820_TYPE_RESERVED);
 	e820__update_table(e820_table);
 }
@@ -404,7 +350,8 @@ struct efi_freeable_range {
 static struct efi_freeable_range *ranges_to_free __initdata;
 static int num_to_free __initdata;
 
-static int __init efi_add_range_to_free(u64 range_start, u64 range_end)
+static int __init efi_add_range_to_free(u64 range_start, u64 range_end,
+					bool *has_reservations)
 {
 	struct memblock_region *region;
 	static int idx __initdata;
@@ -420,15 +367,18 @@ static int __init efi_add_range_to_free(u64 range_start, u64 range_end)
 		if (region_end < range_start)
 			continue;
 
-		if (region->flags & MEMBLOCK_RSRV_KERN)
-			continue;
-
 		start = PAGE_ALIGN(max(range_start, region->base));
 		end = PAGE_ALIGN_DOWN(min(range_end, region_end));
 
 		if (start >= end)
 			continue;
 
+		if ((region->flags & MEMBLOCK_RSRV_KERN) ||
+		    !can_free_region(start, end - start)) {
+			*has_reservations = true;
+			continue;
+		}
+
 		ranges_to_free[idx].start = start;
 		ranges_to_free[idx].end = end;
 
@@ -471,6 +421,7 @@ void __init efi_unmap_boot_services(void)
 	for_each_efi_memory_desc(md) {
 		unsigned long long start = md->phys_addr;
 		unsigned long long size = md->num_pages << EFI_PAGE_SHIFT;
+		bool has_reservations = false;
 
 		if (md->type != EFI_BOOT_SERVICES_CODE &&
 		    md->type != EFI_BOOT_SERVICES_DATA) {
@@ -497,10 +448,13 @@ void __init efi_unmap_boot_services(void)
 		 * memory here.
 		 * Queue the ranges to free at a later point.
 		 */
-		if (efi_add_range_to_free(start, start + size)) {
+		if (efi_add_range_to_free(start, start + size, &has_reservations)) {
 			pr_err("Failed to reallocate storage for freeable EFI regions\n");
 			return;
 		}
+
+		if (has_reservations)
+			num_entries++;
 	}
 
 	if (!num_entries)
@@ -526,8 +480,11 @@ void __init efi_unmap_boot_services(void)
 	for_each_efi_memory_desc(md) {
 		if (!(md->attribute & EFI_MEMORY_RUNTIME) &&
 		    (md->type == EFI_BOOT_SERVICES_CODE ||
-		     md->type == EFI_BOOT_SERVICES_DATA))
+		     md->type == EFI_BOOT_SERVICES_DATA) &&
+		    can_free_region(md->phys_addr,
+				    md->num_pages << EFI_PAGE_SHIFT)) {
 			continue;
+		}
 
 		memcpy(new_md, md, efi.memmap.desc_size);
 		new_md += efi.memmap.desc_size;
-- 
2.54.0.rc2.544.gc7ae2d5bb8-goog


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 11/17] efi: Use nr_map not map_end to find the last valid memory map entry
  2026-04-23 15:20 [PATCH v3 00/17] efi/x86: Avoid the need to mangle the EFI memory map Ard Biesheuvel
                   ` (9 preceding siblings ...)
  2026-04-23 15:20 ` [PATCH v3 10/17] x86/efi: Do not rely on EFI_MEMORY_RUNTIME bit and avoid entry splitting Ard Biesheuvel
@ 2026-04-23 15:20 ` Ard Biesheuvel
  2026-04-23 15:20 ` [PATCH v3 12/17] x86/efi: Clean the memory map using iterator and filter API Ard Biesheuvel
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Ard Biesheuvel @ 2026-04-23 15:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-efi, x86, Ard Biesheuvel, Mike Rapoport (Microsoft),
	Benjamin Herrenschmidt, Dave Young, Gregory Price

From: Ard Biesheuvel <ardb@kernel.org>

Currently, the efi.memmap struct keeps track of the start and the end of
the EFI memory map in memory, as well as the number of entries.

Let's rename the nr_map field to num_valid_entries, so that it denotes
the number of *valid* entries, and update all the iterators and other
memory map traversal routines accordingly.

This allows pruning of invalid or unneeded entries by moving the
remaining entries to the start of the map, without the need for
freeing/reallocating or unmapping and remapping. Now that entries are
never added, but only removed, it is possible to retain the same
allocation throughout the boot process, and free the part that is no
longer in use afterwards.
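
As a rough sketch of what count-based traversal looks like, consider
the userspace C below. The names struct desc, struct memmap,
memdesc_ptr(), total_pages() and last_valid() are hypothetical and exist
only for this example; they loosely mirror efi_memory_desc_t,
struct efi_memory_map and efi_memdesc_ptr() but are not the kernel
definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* hypothetical, simplified descriptor */
struct desc {
	uint32_t type;
	uint64_t phys_addr;
	uint64_t num_pages;
};

struct memmap {
	void *map;		/* start of the descriptor buffer */
	size_t desc_size;	/* stride, may exceed sizeof(struct desc) */
	int num_valid_entries;	/* entries past this index are ignored */
};

/* Locate entry 'idx' using the stride, not sizeof(struct desc). */
static struct desc *memdesc_ptr(const struct memmap *m, int idx)
{
	assert(idx >= 0);
	return (struct desc *)((char *)m->map + (size_t)idx * m->desc_size);
}

/* Walk only the valid entries; the end of the buffer is never consulted. */
static uint64_t total_pages(const struct memmap *m)
{
	uint64_t sum = 0;

	for (int i = 0; i < m->num_valid_entries; i++)
		sum += memdesc_ptr(m, i)->num_pages;
	return sum;
}

/* The last valid entry is derived from the count as well. */
static struct desc *last_valid(const struct memmap *m)
{
	if (m->num_valid_entries <= 0)
		return NULL;
	return memdesc_ptr(m, m->num_valid_entries - 1);
}
```

Because the bound is a count rather than an end pointer, shrinking the
map is a matter of lowering num_valid_entries; the allocation itself is
left alone.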

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/platform/efi/efi.c          | 18 ++++++++++++------
 arch/x86/platform/efi/memmap.c       |  2 +-
 arch/x86/platform/efi/quirks.c       |  2 +-
 arch/x86/platform/efi/runtime-map.c  |  4 ++--
 drivers/firmware/efi/arm-runtime.c   |  2 +-
 drivers/firmware/efi/memmap.c        |  8 +++-----
 drivers/firmware/efi/riscv-runtime.c |  2 +-
 include/linux/efi.h                  |  8 ++++----
 8 files changed, 25 insertions(+), 21 deletions(-)

diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index c0195b5eab21..edbf6efe3947 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -222,7 +222,7 @@ int __init efi_memblock_x86_reserve_range(void)
 	     "Unexpected EFI_MEMORY_DESCRIPTOR version %ld",
 	     efi.memmap.desc_version);
 
-	memblock_reserve(pmap, efi.memmap.nr_map * efi.memmap.desc_size);
+	memblock_reserve(pmap, efi.memmap.num_valid_entries * efi.memmap.desc_size);
 	set_bit(EFI_PRESERVE_BS_REGIONS, &efi.flags);
 
 	return 0;
@@ -289,7 +289,7 @@ static void __init efi_clean_memmap(void)
 			.phys_map	= efi.memmap.phys_map,
 			.desc_version	= efi.memmap.desc_version,
 			.desc_size	= efi.memmap.desc_size,
-			.size		= efi.memmap.desc_size * (efi.memmap.nr_map - n_removal),
+			.size		= efi.memmap.desc_size * (efi.memmap.num_valid_entries - n_removal),
 			.flags		= 0,
 		};
 
@@ -533,7 +533,8 @@ static inline void *efi_map_next_entry_reverse(void *entry)
 {
 	/* Initial call */
 	if (!entry)
-		return efi.memmap.map_end - efi.memmap.desc_size;
+		return efi_memdesc_ptr(efi.memmap.map, efi.memmap.desc_size,
+				       efi.memmap.num_valid_entries - 1);
 
 	entry -= efi.memmap.desc_size;
 	if (entry < efi.memmap.map)
@@ -555,6 +556,9 @@ static inline void *efi_map_next_entry_reverse(void *entry)
  */
 static void *efi_map_next_entry(void *entry)
 {
+	if (!efi.memmap.num_valid_entries)
+		return NULL;
+
 	if (efi_enabled(EFI_64BIT)) {
 		/*
 		 * Starting in UEFI v2.5 the EFI_PROPERTIES_TABLE
@@ -581,7 +585,9 @@ static void *efi_map_next_entry(void *entry)
 		return efi.memmap.map;
 
 	entry += efi.memmap.desc_size;
-	if (entry >= efi.memmap.map_end)
+	if (entry >= (void *)efi_memdesc_ptr(efi.memmap.map,
+					     efi.memmap.desc_size,
+					     efi.memmap.num_valid_entries))
 		return NULL;
 
 	return entry;
@@ -712,13 +718,13 @@ static void __init kexec_enter_virtual_mode(void)
 	efi_memmap_unmap();
 
 	if (efi_memmap_init_late(efi.memmap.phys_map,
-				 efi.memmap.desc_size * efi.memmap.nr_map)) {
+				 efi.memmap.desc_size * efi.memmap.num_valid_entries)) {
 		pr_err("Failed to remap late EFI memory map\n");
 		clear_bit(EFI_RUNTIME_SERVICES, &efi.flags);
 		return;
 	}
 
-	num_pages = ALIGN(efi.memmap.nr_map * efi.memmap.desc_size, PAGE_SIZE);
+	num_pages = ALIGN(efi.memmap.num_valid_entries * efi.memmap.desc_size, PAGE_SIZE);
 	num_pages >>= PAGE_SHIFT;
 
 	if (efi_setup_page_tables(efi.memmap.phys_map, num_pages)) {
diff --git a/arch/x86/platform/efi/memmap.c b/arch/x86/platform/efi/memmap.c
index da7483a33d68..524e9f2ef276 100644
--- a/arch/x86/platform/efi/memmap.c
+++ b/arch/x86/platform/efi/memmap.c
@@ -90,7 +90,7 @@ int __init efi_memmap_alloc(unsigned int num_entries,
  */
 int __init efi_memmap_install(struct efi_memory_map_data *data)
 {
-	unsigned long size = efi.memmap.desc_size * efi.memmap.nr_map;
+	unsigned long size = efi.memmap.map_end - efi.memmap.map;
 	unsigned long flags = efi.memmap.flags;
 	u64 phys = efi.memmap.phys_map;
 	int ret;
diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index efb828b7e2ab..eb00130bcb66 100644
--- a/arch/x86/platform/efi/quirks.c
+++ b/arch/x86/platform/efi/quirks.c
@@ -410,7 +410,7 @@ void __init efi_unmap_boot_services(void)
 	if (efi_enabled(EFI_DBG))
 		return;
 
-	num_to_free = efi.memmap.nr_map;
+	num_to_free = efi.memmap.num_valid_entries;
 	ranges_to_free = kmalloc_array(num_to_free, sizeof(ranges_to_free[0]),
 				       GFP_KERNEL);
 	if (!ranges_to_free) {
diff --git a/arch/x86/platform/efi/runtime-map.c b/arch/x86/platform/efi/runtime-map.c
index 053ff161eb9a..fc8ca1974730 100644
--- a/arch/x86/platform/efi/runtime-map.c
+++ b/arch/x86/platform/efi/runtime-map.c
@@ -138,7 +138,7 @@ add_sysfs_runtime_map_entry(struct kobject *kobj, int nr,
 
 int efi_get_runtime_map_size(void)
 {
-	return efi.memmap.nr_map * efi.memmap.desc_size;
+	return efi.memmap.num_valid_entries * efi.memmap.desc_size;
 }
 
 int efi_get_runtime_map_desc_size(void)
@@ -166,7 +166,7 @@ static int __init efi_runtime_map_init(void)
 	if (!efi_enabled(EFI_MEMMAP) || !efi_kobj)
 		return 0;
 
-	map_entries = kzalloc_objs(entry, efi.memmap.nr_map);
+	map_entries = kzalloc_objs(entry, efi.memmap.num_valid_entries);
 	if (!map_entries) {
 		ret = -ENOMEM;
 		goto out;
diff --git a/drivers/firmware/efi/arm-runtime.c b/drivers/firmware/efi/arm-runtime.c
index 3167cab62014..e19997c09175 100644
--- a/drivers/firmware/efi/arm-runtime.c
+++ b/drivers/firmware/efi/arm-runtime.c
@@ -96,7 +96,7 @@ static int __init arm_enable_runtime_services(void)
 
 	efi_memmap_unmap();
 
-	mapsize = efi.memmap.desc_size * efi.memmap.nr_map;
+	mapsize = efi.memmap.desc_size * efi.memmap.num_valid_entries;
 
 	if (efi_memmap_init_late(efi.memmap.phys_map, mapsize)) {
 		pr_err("Failed to remap EFI memory map\n");
diff --git a/drivers/firmware/efi/memmap.c b/drivers/firmware/efi/memmap.c
index f1c04d7cfd71..035089791c93 100644
--- a/drivers/firmware/efi/memmap.c
+++ b/drivers/firmware/efi/memmap.c
@@ -49,7 +49,7 @@ int __init __efi_memmap_init(struct efi_memory_map_data *data)
 	}
 
 	map.phys_map = data->phys_map;
-	map.nr_map = data->size / data->desc_size;
+	map.num_valid_entries = data->size / data->desc_size;
 	map.map_end = map.map + data->size;
 
 	map.desc_version = data->desc_version;
@@ -87,10 +87,8 @@ void __init efi_memmap_unmap(void)
 		return;
 
 	if (!(efi.memmap.flags & EFI_MEMMAP_LATE)) {
-		unsigned long size;
-
-		size = efi.memmap.desc_size * efi.memmap.nr_map;
-		early_memunmap(efi.memmap.map, size);
+		early_memunmap(efi.memmap.map,
+			       efi.memmap.map_end - efi.memmap.map);
 	} else {
 		memunmap(efi.memmap.map);
 	}
diff --git a/drivers/firmware/efi/riscv-runtime.c b/drivers/firmware/efi/riscv-runtime.c
index 60cdf7bf141f..087a7f8a74e6 100644
--- a/drivers/firmware/efi/riscv-runtime.c
+++ b/drivers/firmware/efi/riscv-runtime.c
@@ -66,7 +66,7 @@ static int __init riscv_enable_runtime_services(void)
 
 	efi_memmap_unmap();
 
-	mapsize = efi.memmap.desc_size * efi.memmap.nr_map;
+	mapsize = efi.memmap.desc_size * efi.memmap.num_valid_entries;
 
 	if (efi_memmap_init_late(efi.memmap.phys_map, mapsize)) {
 		pr_err("Failed to remap EFI memory map\n");
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 72e76ec54641..a8406ca92332 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -568,7 +568,7 @@ struct efi_memory_map {
 	phys_addr_t phys_map;
 	void *map;
 	void *map_end;
-	int nr_map;
+	int num_valid_entries;
 	unsigned long desc_version;
 	unsigned long desc_size;
 #define EFI_MEMMAP_LATE (1UL << 0)
@@ -803,9 +803,9 @@ extern int efi_memattr_apply_permissions(struct mm_struct *mm,
 
 /* Iterate through an efi_memory_map */
 #define for_each_efi_memory_desc_in_map(m, md)				   \
-	for ((md) = (m)->map;						   \
-	     (md) && ((void *)(md) + (m)->desc_size) <= (m)->map_end;	   \
-	     (md) = (void *)(md) + (m)->desc_size)
+	for (int __idx = 0;						   \
+	     (md) = efi_memdesc_ptr((m)->map, (m)->desc_size, __idx),	   \
+	     __idx < (m)->num_valid_entries; ++__idx)
 
 /**
  * for_each_efi_memory_desc - iterate over descriptors in efi.memmap
-- 
2.54.0.rc2.544.gc7ae2d5bb8-goog


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 12/17] x86/efi: Clean the memory map using iterator and filter API
  2026-04-23 15:20 [PATCH v3 00/17] efi/x86: Avoid the need to mangle the EFI memory map Ard Biesheuvel
                   ` (10 preceding siblings ...)
  2026-04-23 15:20 ` [PATCH v3 11/17] efi: Use nr_map not map_end to find the last valid memory map entry Ard Biesheuvel
@ 2026-04-23 15:20 ` Ard Biesheuvel
  2026-04-23 15:20 ` [PATCH v3 13/17] x86/efi: Update the runtime map in place Ard Biesheuvel
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Ard Biesheuvel @ 2026-04-23 15:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-efi, x86, Ard Biesheuvel, Mike Rapoport (Microsoft),
	Benjamin Herrenschmidt, Dave Young, Gregory Price

From: Ard Biesheuvel <ardb@kernel.org>

Instead of open coding the iteration logic, use the existing iterator
API to iterate over all valid entries in the EFI memory map.

In addition, break out the logic that iterates over and conditionally
suppresses memory map entries so it can be reused later, as something
similar is happening two more times during boot.

Note that actually reinstalling the EFI memory map, which involves
unmapping and remapping it, is no longer needed, given that the number
of valid entries can only go down. So omit efi_memmap_install() and just
update the number of valid entries.
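
The in-place filtering can be illustrated with a small userspace C
sketch. It is written under assumptions: struct desc, struct memmap,
memmap_filter() and keep_nonempty() are hypothetical names for this
example only, and the predicate is an arbitrary one rather than the
validity check used by efi_clean_memmap().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* hypothetical, simplified descriptor */
struct desc {
	unsigned int type;
	unsigned long long phys_addr;
	unsigned long long num_pages;
};

struct memmap {
	void *map;
	size_t desc_size;
	int num_valid_entries;
};

/*
 * Keep the entries for which 'keep' returns true, moving them towards
 * the start of the buffer, and lower the valid count. Returns the
 * number of entries that were dropped.
 */
static int memmap_filter(struct memmap *m,
			 bool (*keep)(const struct desc *, int))
{
	char *out = m->map;
	const char *in = m->map;
	int filtered = 0;

	assert(m->desc_size >= sizeof(struct desc));

	for (int i = 0; i < m->num_valid_entries; i++, in += m->desc_size) {
		if (keep((const struct desc *)in, i)) {
			/* 'out' trails 'in' by whole entries: no overlap */
			if (out != in)
				memcpy(out, in, m->desc_size);
			out += m->desc_size;
		} else {
			filtered++;
		}
	}
	m->num_valid_entries -= filtered;
	return filtered;
}

/* Example predicate: drop entries that describe zero pages. */
static bool keep_nonempty(const struct desc *md, int idx)
{
	(void)idx;
	return md->num_pages != 0;
}
```

Since the write position can never overtake the read position, no
second buffer is needed and the relative order of the surviving entries
is preserved.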

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/platform/efi/efi.c | 30 +++++++++-----------
 1 file changed, 13 insertions(+), 17 deletions(-)

diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index edbf6efe3947..459b5c13167a 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -266,36 +266,32 @@ static bool __init efi_memmap_entry_valid(const efi_memory_desc_t *md, int i)
 	return false;
 }
 
-static void __init efi_clean_memmap(void)
+static int __init
+efi_memmap_filter_entries(bool (*callback)(const efi_memory_desc_t *, int))
 {
 	efi_memory_desc_t *out = efi.memmap.map;
 	const efi_memory_desc_t *in = out;
-	const efi_memory_desc_t *end = efi.memmap.map_end;
-	int i, n_removal;
+	int i = 0, filtered = 0;
 
-	for (i = n_removal = 0; in < end; i++) {
-		if (efi_memmap_entry_valid(in, i)) {
+	for_each_efi_memory_desc(in) {
+		if (callback(in, i++)) {
 			if (out != in)
 				memcpy(out, in, efi.memmap.desc_size);
 			out = (void *)out + efi.memmap.desc_size;
 		} else {
-			n_removal++;
+			filtered++;
 		}
-		in = (void *)in + efi.memmap.desc_size;
 	}
+	efi.memmap.num_valid_entries -= filtered;
+	return filtered;
+}
 
-	if (n_removal > 0) {
-		struct efi_memory_map_data data = {
-			.phys_map	= efi.memmap.phys_map,
-			.desc_version	= efi.memmap.desc_version,
-			.desc_size	= efi.memmap.desc_size,
-			.size		= efi.memmap.desc_size * (efi.memmap.num_valid_entries - n_removal),
-			.flags		= 0,
-		};
+static void __init efi_clean_memmap(void)
+{
+	int n_removal = efi_memmap_filter_entries(efi_memmap_entry_valid);
 
+	if (n_removal > 0)
 		pr_warn("Removing %d invalid memory map entries.\n", n_removal);
-		efi_memmap_install(&data);
-	}
 }
 
 /*
-- 
2.54.0.rc2.544.gc7ae2d5bb8-goog


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 13/17] x86/efi: Update the runtime map in place
  2026-04-23 15:20 [PATCH v3 00/17] efi/x86: Avoid the need to mangle the EFI memory map Ard Biesheuvel
                   ` (11 preceding siblings ...)
  2026-04-23 15:20 ` [PATCH v3 12/17] x86/efi: Clean the memory map using iterator and filter API Ard Biesheuvel
@ 2026-04-23 15:20 ` Ard Biesheuvel
  2026-04-23 15:20 ` [PATCH v3 14/17] x86/efi: Reuse memory map instead of reallocating it Ard Biesheuvel
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Ard Biesheuvel @ 2026-04-23 15:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-efi, x86, Ard Biesheuvel, Mike Rapoport (Microsoft),
	Benjamin Herrenschmidt, Dave Young, Gregory Price

From: Ard Biesheuvel <ardb@kernel.org>

When creating the EFI runtime map, a copy is created containing only the
entries that will be mapped on behalf of the firmware, but the
assignment of the virtual address field is applied to both copies.

Subsequently, the copy is installed as the new EFI memory map, and the
old one is just leaked.

This means that there is no reason whatsoever to allocate and install
the copy, and it is much easier to just update the existing memory map in
place to set the virtual addresses and suppress unused entries.

So reuse the filter function used by efi_clean_memmap() to drop all
entries that are irrelevant, and then apply the existing logic to assign
the virtual addresses and create the mappings in the EFI page tables.

Note that x86_64 and i386 traverse the memory map in opposite order, so
this part remains a separate pass as before. This logic will be further
simplified in a subsequent patch.
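
To make the direction-dependent second pass concrete, here is a minimal
userspace C sketch of a cursor that walks the valid entries either
forwards or backwards. The names struct desc, struct memmap and
next_entry(), as well as the 'reverse' parameter, are hypothetical and
chosen for this example; the kernel selects the direction internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* hypothetical, simplified descriptor */
struct desc {
	unsigned int type;
	unsigned long long phys_addr;
	unsigned long long num_pages;
};

struct memmap {
	void *map;
	size_t desc_size;
	int num_valid_entries;
};

/*
 * Return the entry after 'entry' in the chosen direction, or the first
 * one in that direction when 'entry' is NULL. Returns NULL once the
 * valid entries are exhausted.
 */
static void *next_entry(const struct memmap *m, void *entry, bool reverse)
{
	char *first = m->map;
	char *last;

	assert(m->desc_size > 0);

	if (m->num_valid_entries <= 0)
		return NULL;

	last = first + (size_t)(m->num_valid_entries - 1) * m->desc_size;

	if (!entry)
		return reverse ? last : first;

	if (reverse)
		return (char *)entry == first ? NULL
					      : (char *)entry - m->desc_size;

	return (char *)entry == last ? NULL : (char *)entry + m->desc_size;
}
```

A caller would loop with something like
"while ((p = next_entry(&m, p, reverse)))", so filtering the map first
and then walking it in either direction only ever touches the entries
that survived the filter.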

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/platform/efi/efi.c | 82 ++++----------------
 1 file changed, 16 insertions(+), 66 deletions(-)

diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 459b5c13167a..f67ea6d4cad0 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -498,27 +498,6 @@ void __init efi_init(void)
 		efi_print_memmap();
 }
 
-static void *realloc_pages(void *old_memmap, int old_shift)
-{
-	void *ret;
-
-	ret = (void *)__get_free_pages(GFP_KERNEL, old_shift + 1);
-	if (!ret)
-		goto out;
-
-	/*
-	 * A first-time allocation doesn't have anything to copy.
-	 */
-	if (!old_memmap)
-		return ret;
-
-	memcpy(ret, old_memmap, PAGE_SIZE << old_shift);
-
-out:
-	free_pages((unsigned long)old_memmap, old_shift);
-	return ret;
-}
-
 /*
  * Iterate the EFI memory map in reverse order because the regions
  * will be mapped top-down. The end result is the same as if we had
@@ -589,7 +568,7 @@ static void *efi_map_next_entry(void *entry)
 	return entry;
 }
 
-static bool should_map_region(efi_memory_desc_t *md)
+static bool should_map_region(const efi_memory_desc_t *md, int unused)
 {
 	/*
 	 * Runtime regions always require runtime mappings (obviously).
@@ -642,40 +621,14 @@ static bool should_map_region(efi_memory_desc_t *md)
  * Map the efi memory ranges of the runtime services and update new_mmap with
  * virtual addresses.
  */
-static void * __init efi_map_regions(int *count, int *pg_shift)
+static void __init efi_map_regions(void)
 {
-	void *p, *new_memmap = NULL;
-	unsigned long left = 0;
-	unsigned long desc_size;
-	efi_memory_desc_t *md;
-
-	desc_size = efi.memmap.desc_size;
-
-	p = NULL;
-	while ((p = efi_map_next_entry(p))) {
-		md = p;
+	efi_memory_desc_t *md = NULL;
 
-		if (!should_map_region(md))
-			continue;
+	efi_memmap_filter_entries(should_map_region);
 
+	while ((md = efi_map_next_entry(md)))
 		efi_map_region(md);
-
-		if (left < desc_size) {
-			new_memmap = realloc_pages(new_memmap, *pg_shift);
-			if (!new_memmap)
-				return NULL;
-
-			left += PAGE_SIZE << *pg_shift;
-			(*pg_shift)++;
-		}
-
-		memcpy(new_memmap + (*count * desc_size), md, desc_size);
-
-		left -= desc_size;
-		(*count)++;
-	}
-
-	return new_memmap;
 }
 
 static void __init kexec_enter_virtual_mode(void)
@@ -752,9 +705,8 @@ static void __init kexec_enter_virtual_mode(void)
  */
 static void __init __efi_enter_virtual_mode(void)
 {
-	int count = 0, pg_shift = 0;
-	void *new_memmap = NULL;
 	efi_status_t status;
+	unsigned long size;
 	unsigned long pa;
 
 	if (efi_alloc_page_tables()) {
@@ -762,15 +714,6 @@ static void __init __efi_enter_virtual_mode(void)
 		goto err;
 	}
 
-	efi_merge_regions();
-	new_memmap = efi_map_regions(&count, &pg_shift);
-	if (!new_memmap) {
-		pr_err("Error reallocating memory, EFI runtime non-functional!\n");
-		goto err;
-	}
-
-	pa = __pa(new_memmap);
-
 	/*
 	 * Unregister the early EFI memmap from efi_init() and install
 	 * the new EFI memory map that we are about to pass to the
@@ -778,22 +721,29 @@ static void __init __efi_enter_virtual_mode(void)
 	 */
 	efi_memmap_unmap();
 
-	if (efi_memmap_init_late(pa, efi.memmap.desc_size * count)) {
+	size = efi.memmap.desc_size * efi.memmap.num_valid_entries;
+	if (efi_memmap_init_late(efi.memmap.phys_map, size)) {
 		pr_err("Failed to remap late EFI memory map\n");
 		goto err;
 	}
 
+	efi_merge_regions();
+	efi_map_regions();
+
 	if (efi_enabled(EFI_DBG)) {
 		pr_info("EFI runtime memory map:\n");
 		efi_print_memmap();
 	}
 
-	if (efi_setup_page_tables(pa, 1 << pg_shift))
+	size = efi.memmap.desc_size * efi.memmap.num_valid_entries;
+	if (efi_setup_page_tables(efi.memmap.phys_map,
+				  DIV_ROUND_UP(size, PAGE_SIZE)))
 		goto err;
 
 	efi_sync_low_kernel_mappings();
 
-	status = efi_set_virtual_address_map(efi.memmap.desc_size * count,
+	pa = efi.memmap.phys_map;
+	status = efi_set_virtual_address_map(size,
 					     efi.memmap.desc_size,
 					     efi.memmap.desc_version,
 					     (efi_memory_desc_t *)pa,
-- 
2.54.0.rc2.544.gc7ae2d5bb8-goog


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 14/17] x86/efi: Reuse memory map instead of reallocating it
  2026-04-23 15:20 [PATCH v3 00/17] efi/x86: Avoid the need to mangle the EFI memory map Ard Biesheuvel
                   ` (12 preceding siblings ...)
  2026-04-23 15:20 ` [PATCH v3 13/17] x86/efi: Update the runtime map in place Ard Biesheuvel
@ 2026-04-23 15:20 ` Ard Biesheuvel
  2026-04-23 15:20 ` [PATCH v3 15/17] x86/efi: Merge two traversals of the memory map when freeing boot regions Ard Biesheuvel
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Ard Biesheuvel @ 2026-04-23 15:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-efi, x86, Ard Biesheuvel, Mike Rapoport (Microsoft),
	Benjamin Herrenschmidt, Dave Young, Gregory Price

From: Ard Biesheuvel <ardb@kernel.org>

The EFI memory map consists of tens to hundreds of entries of around 40 bytes
each. The initial version is allocated and populated by the EFI stub,
but later on, after freeing the boot services data regions and pruning
the associated entries, a new memory map is allocated with room for only
the remaining entries, which are typically much fewer in number.

Given that the original allocation is never freed, this does not
actually save any memory currently, and it is much simpler to just move
the entries that need to be preserved to the beginning of the map, and
truncate it. That way, a lot of the complicated memory map allocation
and freeing code can simply be dropped.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
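For reviewers, the in-place compaction described above can be sketched in
isolation as follows. This is a minimal userspace illustration, not the kernel
code: struct desc, compact_map() and keep_entry() are hypothetical names, and
the predicate is simplified (the real loop in efi_unmap_boot_services() also
consults can_free_region()). The type and attribute constants are the UEFI
values for EfiBootServicesCode, EfiBootServicesData and EFI_MEMORY_RUNTIME.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for efi_memory_desc_t; only the fields used here. */
struct desc {
	uint32_t type;
	uint64_t phys_addr;
	uint64_t num_pages;
	uint64_t attribute;
};

#define BOOT_SERVICES_CODE	3u
#define BOOT_SERVICES_DATA	4u
#define MEMORY_RUNTIME		0x8000000000000000ULL

/* Simplified predicate: drop boot services regions not tagged as runtime. */
static int keep_entry(const struct desc *md)
{
	if (md->type != BOOT_SERVICES_CODE && md->type != BOOT_SERVICES_DATA)
		return 1;
	return (md->attribute & MEMORY_RUNTIME) != 0;
}

/*
 * Move every entry accepted by keep() to the front of the map, preserving
 * order, and return the number of entries kept. desc_size is the stride
 * between entries, which may be larger than sizeof(struct desc).
 */
static size_t compact_map(void *map, size_t nr_entries, size_t desc_size,
			  int (*keep)(const struct desc *))
{
	char *src = map;
	char *dst = map;

	for (size_t i = 0; i < nr_entries; i++, src += desc_size) {
		if (!keep((const struct desc *)src))
			continue;
		/* dst trails src by whole entries, so the copy never overlaps */
		if (dst != src)
			memcpy(dst, src, desc_size);
		dst += desc_size;
	}
	return (size_t)(dst - (char *)map) / desc_size;
}
```

Because the write cursor never overtakes the read cursor, no second buffer is
needed, and truncation amounts to recording the returned count, which is what
the assignment to efi.memmap.num_valid_entries does in the patch below.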
 arch/x86/include/asm/efi.h     |   5 -
 arch/x86/platform/efi/Makefile |   2 +-
 arch/x86/platform/efi/memmap.c | 109 --------------------
 arch/x86/platform/efi/quirks.c |  35 +------
 include/linux/efi.h            |   5 +-
 5 files changed, 7 insertions(+), 149 deletions(-)

diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 7d8f627805df..0de92193764b 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -398,11 +398,6 @@ static inline void efi_reserve_boot_services(void)
 }
 #endif /* CONFIG_EFI */
 
-extern int __init efi_memmap_alloc(unsigned int num_entries,
-				   struct efi_memory_map_data *data);
-
-extern int __init efi_memmap_install(struct efi_memory_map_data *data);
-
 enum efi_secureboot_mode __x86_efi_boot_mode(void);
 
 #define arch_efi_boot_mode __x86_efi_boot_mode()
diff --git a/arch/x86/platform/efi/Makefile b/arch/x86/platform/efi/Makefile
index 500cab4a7f7c..28772e046a1b 100644
--- a/arch/x86/platform/efi/Makefile
+++ b/arch/x86/platform/efi/Makefile
@@ -2,7 +2,7 @@
 KASAN_SANITIZE := n
 GCOV_PROFILE := n
 
-obj-$(CONFIG_EFI) 		+= memmap.o quirks.o efi.o efi_$(BITS).o \
+obj-$(CONFIG_EFI) 		+= quirks.o efi.o efi_$(BITS).o \
 				   efi_stub_$(BITS).o
 obj-$(CONFIG_EFI_MIXED)		+= efi_thunk_$(BITS).o
 obj-$(CONFIG_EFI_RUNTIME_MAP)	+= runtime-map.o
diff --git a/arch/x86/platform/efi/memmap.c b/arch/x86/platform/efi/memmap.c
deleted file mode 100644
index 524e9f2ef276..000000000000
--- a/arch/x86/platform/efi/memmap.c
+++ /dev/null
@@ -1,109 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Common EFI memory map functions.
- */
-
-#define pr_fmt(fmt) "efi: " fmt
-
-#include <linux/init.h>
-#include <linux/kernel.h>
-#include <linux/efi.h>
-#include <linux/io.h>
-#include <asm/early_ioremap.h>
-#include <asm/efi.h>
-#include <linux/memblock.h>
-#include <linux/slab.h>
-
-static phys_addr_t __init __efi_memmap_alloc_early(unsigned long size)
-{
-	return memblock_phys_alloc(size, SMP_CACHE_BYTES);
-}
-
-static phys_addr_t __init __efi_memmap_alloc_late(unsigned long size)
-{
-	unsigned int order = get_order(size);
-	struct page *p = alloc_pages(GFP_KERNEL, order);
-
-	if (!p)
-		return 0;
-
-	return PFN_PHYS(page_to_pfn(p));
-}
-
-static
-void __init __efi_memmap_free(u64 phys, unsigned long size, unsigned long flags)
-{
-	if (flags & EFI_MEMMAP_MEMBLOCK) {
-		memblock_phys_free(phys, size);
-	} else if (flags & EFI_MEMMAP_SLAB) {
-		struct page *p = pfn_to_page(PHYS_PFN(phys));
-		unsigned int order = get_order(size);
-
-		__free_pages(p, order);
-	}
-}
-
-/**
- * efi_memmap_alloc - Allocate memory for the EFI memory map
- * @num_entries: Number of entries in the allocated map.
- * @data: efi memmap installation parameters
- *
- * Depending on whether mm_init() has already been invoked or not,
- * either memblock or "normal" page allocation is used.
- *
- * Returns zero on success, a negative error code on failure.
- */
-int __init efi_memmap_alloc(unsigned int num_entries,
-		struct efi_memory_map_data *data)
-{
-	/* Expect allocation parameters are zero initialized */
-	WARN_ON(data->phys_map || data->size);
-
-	data->size = num_entries * efi.memmap.desc_size;
-	data->desc_version = efi.memmap.desc_version;
-	data->desc_size = efi.memmap.desc_size;
-	data->flags &= ~(EFI_MEMMAP_SLAB | EFI_MEMMAP_MEMBLOCK);
-	data->flags |= efi.memmap.flags & EFI_MEMMAP_LATE;
-
-	if (slab_is_available()) {
-		data->flags |= EFI_MEMMAP_SLAB;
-		data->phys_map = __efi_memmap_alloc_late(data->size);
-	} else {
-		data->flags |= EFI_MEMMAP_MEMBLOCK;
-		data->phys_map = __efi_memmap_alloc_early(data->size);
-	}
-
-	if (!data->phys_map)
-		return -ENOMEM;
-	return 0;
-}
-
-/**
- * efi_memmap_install - Install a new EFI memory map in efi.memmap
- * @data: efi memmap installation parameters
- *
- * Unlike efi_memmap_init_*(), this function does not allow the caller
- * to switch from early to late mappings. It simply uses the existing
- * mapping function and installs the new memmap.
- *
- * Returns zero on success, a negative error code on failure.
- */
-int __init efi_memmap_install(struct efi_memory_map_data *data)
-{
-	unsigned long size = efi.memmap.map_end - efi.memmap.map;
-	unsigned long flags = efi.memmap.flags;
-	u64 phys = efi.memmap.phys_map;
-	int ret;
-
-	efi_memmap_unmap();
-
-	if (efi_enabled(EFI_PARAVIRT))
-		return 0;
-
-	ret = __efi_memmap_init(data);
-	if (ret)
-		return ret;
-
-	__efi_memmap_free(phys, size, flags);
-	return 0;
-}
diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index eb00130bcb66..98fdc286eb40 100644
--- a/arch/x86/platform/efi/quirks.c
+++ b/arch/x86/platform/efi/quirks.c
@@ -401,10 +401,8 @@ static int __init efi_add_range_to_free(u64 range_start, u64 range_end,
 
 void __init efi_unmap_boot_services(void)
 {
-	struct efi_memory_map_data data = { 0 };
 	efi_memory_desc_t *md;
-	int num_entries = 0;
-	void *new, *new_md;
+	void *new_md;
 
 	/* Keep all regions for /sys/kernel/debug/efi */
 	if (efi_enabled(EFI_DBG))
@@ -425,7 +423,6 @@ void __init efi_unmap_boot_services(void)
 
 		if (md->type != EFI_BOOT_SERVICES_CODE &&
 		    md->type != EFI_BOOT_SERVICES_DATA) {
-			num_entries++;
 			continue;
 		}
 
@@ -438,7 +435,6 @@ void __init efi_unmap_boot_services(void)
 
 		/* Do not free, someone else owns it: */
 		if (md->attribute & EFI_MEMORY_RUNTIME) {
-			num_entries++;
 			continue;
 		}
 
@@ -452,23 +448,6 @@ void __init efi_unmap_boot_services(void)
 			pr_err("Failed to reallocate storage for freeable EFI regions\n");
 			return;
 		}
-
-		if (has_reservations)
-			num_entries++;
-	}
-
-	if (!num_entries)
-		return;
-
-	if (efi_memmap_alloc(num_entries, &data) != 0) {
-		pr_err("Failed to allocate new EFI memmap\n");
-		return;
-	}
-
-	new = memremap(data.phys_map, data.size, MEMREMAP_WB);
-	if (!new) {
-		pr_err("Failed to map new EFI memmap\n");
-		return;
 	}
 
 	/*
@@ -476,7 +455,7 @@ void __init efi_unmap_boot_services(void)
 	 * regions that are not tagged EFI_MEMORY_RUNTIME, since those
 	 * regions have now been freed.
 	 */
-	new_md = new;
+	new_md = efi.memmap.map;
 	for_each_efi_memory_desc(md) {
 		if (!(md->attribute & EFI_MEMORY_RUNTIME) &&
 		    (md->type == EFI_BOOT_SERVICES_CODE ||
@@ -486,16 +465,12 @@ void __init efi_unmap_boot_services(void)
 			continue;
 		}
 
-		memcpy(new_md, md, efi.memmap.desc_size);
+		if (new_md != md)
+			memcpy(new_md, md, efi.memmap.desc_size);
 		new_md += efi.memmap.desc_size;
 	}
 
-	memunmap(new);
-
-	if (efi_memmap_install(&data) != 0) {
-		pr_err("Could not install new EFI memmap\n");
-		return;
-	}
+	efi.memmap.num_valid_entries = (new_md - efi.memmap.map) / efi.memmap.desc_size;
 }
 
 static int __init efi_free_boot_services(void)
diff --git a/include/linux/efi.h b/include/linux/efi.h
index a8406ca92332..5986e565a249 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -553,8 +553,7 @@ struct efi_unaccepted_memory {
 
 /*
  * Architecture independent structure for describing a memory map for the
- * benefit of efi_memmap_init_early(), and for passing context between
- * efi_memmap_alloc() and efi_memmap_install().
+ * benefit of efi_memmap_init_early().
  */
 struct efi_memory_map_data {
 	phys_addr_t phys_map;
@@ -572,8 +571,6 @@ struct efi_memory_map {
 	unsigned long desc_version;
 	unsigned long desc_size;
 #define EFI_MEMMAP_LATE (1UL << 0)
-#define EFI_MEMMAP_MEMBLOCK (1UL << 1)
-#define EFI_MEMMAP_SLAB (1UL << 2)
 	unsigned long flags;
 };
 
-- 
2.54.0.rc2.544.gc7ae2d5bb8-goog



* [PATCH v3 15/17] x86/efi: Merge two traversals of the memory map when freeing boot regions
  2026-04-23 15:20 [PATCH v3 00/17] efi/x86: Avoid the need to mangle the EFI memory map Ard Biesheuvel
                   ` (13 preceding siblings ...)
  2026-04-23 15:20 ` [PATCH v3 14/17] x86/efi: Reuse memory map instead of reallocating it Ard Biesheuvel
@ 2026-04-23 15:20 ` Ard Biesheuvel
  2026-04-23 15:20 ` [PATCH v3 16/17] x86/efi: Avoid EFI_MEMORY_RUNTIME for early EFI boot memory reservations Ard Biesheuvel
  2026-04-23 15:20 ` [PATCH v3 17/17] x86/efi: Drop kexec quirk for the EFI memory attributes table Ard Biesheuvel
  16 siblings, 0 replies; 18+ messages in thread
From: Ard Biesheuvel @ 2026-04-23 15:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-efi, x86, Ard Biesheuvel, Mike Rapoport (Microsoft),
	Benjamin Herrenschmidt, Dave Young, Gregory Price

From: Ard Biesheuvel <ardb@kernel.org>

Combine two traversals of the EFI memory map, one that instantiates
ranges_to_free array elements, and one that prunes EFI memory map
entries that do not need to be preserved.

This makes it easier to determine whether or not an EFI boot services
region was freed in its entirety, which in turn informs the decision
whether an entry needs to be preserved in the EFI runtime map.

This will allow the distinction between early reservations of EFI boot
services memory (marked with the EFI_MEMORY_RUNTIME attribute) and late
ones (marked using the MEMBLOCK_RSRV_KERN attribute) to be dropped in a
subsequent patch.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
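The shape of the merged loop can be sketched in isolation as follows. This is a
simplified userspace model under stated assumptions, not the kernel code:
struct region and prune_and_queue() are hypothetical, and has_reservations is
modelled as an input field, whereas in the patch it is an output of
efi_add_range_to_free().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one memory map entry. */
struct region {
	uint64_t start;
	uint64_t size;
	bool boot_services;	/* EFI boot services code or data */
	bool runtime;		/* tagged EFI_MEMORY_RUNTIME */
	bool has_reservations;	/* overlaps a permanent reservation */
};

/*
 * Single pass: queue freeable boot services regions in to_free[] and compact
 * the map in place. to_free[] must have room for n entries. Returns the
 * number of entries preserved in map[].
 */
static size_t prune_and_queue(struct region *map, size_t n,
			      struct region *to_free, size_t *nr_free)
{
	struct region *new_md = map;

	*nr_free = 0;
	for (struct region *md = map; md < map + n; md++) {
		/* Copy first; new_md only advances if the entry is kept. */
		if (new_md != md)
			*new_md = *md;

		if (!md->boot_services) {
			new_md++;
			continue;
		}

		if (!md->runtime) {
			to_free[(*nr_free)++] = *md;
			/* Freed in its entirety: omit it from the map. */
			if (!md->has_reservations)
				continue;
		}
		new_md++;
	}
	return (size_t)(new_md - map);
}
```

Copying each entry to the write cursor up front is safe because the cursor
never runs ahead of the entry being read, so md still holds its original
contents when the later checks inspect it.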
 arch/x86/platform/efi/quirks.c | 58 +++++++++-----------
 1 file changed, 27 insertions(+), 31 deletions(-)

diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index 98fdc286eb40..b7c8337d8f88 100644
--- a/arch/x86/platform/efi/quirks.c
+++ b/arch/x86/platform/efi/quirks.c
@@ -416,13 +416,23 @@ void __init efi_unmap_boot_services(void)
 		return;
 	}
 
+	new_md = efi.memmap.map;
 	for_each_efi_memory_desc(md) {
 		unsigned long long start = md->phys_addr;
 		unsigned long long size = md->num_pages << EFI_PAGE_SHIFT;
 		bool has_reservations = false;
 
+		/*
+		 * Build a new EFI memmap that excludes any boot services data
+		 * regions that do not cover any reserved areas, since those
+		 * regions are being freed.
+		 */
+		if (new_md != md)
+			memcpy(new_md, md, efi.memmap.desc_size);
+
 		if (md->type != EFI_BOOT_SERVICES_CODE &&
 		    md->type != EFI_BOOT_SERVICES_DATA) {
+			new_md += efi.memmap.desc_size;
 			continue;
 		}
 
@@ -433,40 +443,26 @@ void __init efi_unmap_boot_services(void)
 		 */
 		efi_unmap_pages(md);
 
-		/* Do not free, someone else owns it: */
-		if (md->attribute & EFI_MEMORY_RUNTIME) {
-			continue;
-		}
-
-		/*
-		 * With CONFIG_DEFERRED_STRUCT_PAGE_INIT parts of the memory
-		 * map are still not initialized and we can't reliably free
-		 * memory here.
-		 * Queue the ranges to free at a later point.
-		 */
-		if (efi_add_range_to_free(start, start + size, &has_reservations)) {
-			pr_err("Failed to reallocate storage for freeable EFI regions\n");
-			return;
-		}
-	}
+		if (!(md->attribute & EFI_MEMORY_RUNTIME)) {
+			/*
+			 * With CONFIG_DEFERRED_STRUCT_PAGE_INIT parts of the memory
+			 * map are still not initialized and we can't reliably free
+			 * memory here.
+			 * Queue the ranges to free at a later point.
+			 */
+			if (efi_add_range_to_free(start, start + size, &has_reservations)) {
+				pr_err("Failed to reallocate storage for freeable EFI regions\n");
+				clear_bit(EFI_MEMMAP, &efi.flags);
+				return;
+			}
+
+			/* Continue without advancing new_md so this region is omitted */
+			if (!has_reservations)
+				continue;
 
-	/*
-	 * Build a new EFI memmap that excludes any boot services
-	 * regions that are not tagged EFI_MEMORY_RUNTIME, since those
-	 * regions have now been freed.
-	 */
-	new_md = efi.memmap.map;
-	for_each_efi_memory_desc(md) {
-		if (!(md->attribute & EFI_MEMORY_RUNTIME) &&
-		    (md->type == EFI_BOOT_SERVICES_CODE ||
-		     md->type == EFI_BOOT_SERVICES_DATA) &&
-		    can_free_region(md->phys_addr,
-				    md->num_pages << EFI_PAGE_SHIFT)) {
-			continue;
 		}
 
-		if (new_md != md)
-			memcpy(new_md, md, efi.memmap.desc_size);
+		/* Advance new_md so this region is preserved in the EFI memory map */
 		new_md += efi.memmap.desc_size;
 	}
 
-- 
2.54.0.rc2.544.gc7ae2d5bb8-goog



* [PATCH v3 16/17] x86/efi: Avoid EFI_MEMORY_RUNTIME for early EFI boot memory reservations
  2026-04-23 15:20 [PATCH v3 00/17] efi/x86: Avoid the need to mangle the EFI memory map Ard Biesheuvel
                   ` (14 preceding siblings ...)
  2026-04-23 15:20 ` [PATCH v3 15/17] x86/efi: Merge two traversals of the memory map when freeing boot regions Ard Biesheuvel
@ 2026-04-23 15:20 ` Ard Biesheuvel
  2026-04-23 15:20 ` [PATCH v3 17/17] x86/efi: Drop kexec quirk for the EFI memory attributes table Ard Biesheuvel
  16 siblings, 0 replies; 18+ messages in thread
From: Ard Biesheuvel @ 2026-04-23 15:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-efi, x86, Ard Biesheuvel, Mike Rapoport (Microsoft),
	Benjamin Herrenschmidt, Dave Young, Gregory Price

From: Ard Biesheuvel <ardb@kernel.org>

Currently, memblock reservations of EFI boot services memory made before
all EFI boot services memory is temporarily reserved are upgraded by
marking the entry with the EFI_MEMORY_RUNTIME bit. This results in the
entire region remaining reserved permanently, regardless of the size of
the original memblock reservation that triggered this.

This is a hack, and may be quite inefficient in cases where the firmware
does a good job of merging memory map entries.

So instead, rely on the MEMBLOCK_RSRV_KERN flag, by marking existing
memblock reservations with this flag before creating the new, temporary
ones with the flag cleared. This unifies the treatment of early vs late
memblock reservations inside EFI boot services memory, and avoids
clobbering the EFI memory map.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/platform/efi/quirks.c | 66 ++++++--------------
 1 file changed, 18 insertions(+), 48 deletions(-)

diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index b7c8337d8f88..fc6a15c2ace6 100644
--- a/arch/x86/platform/efi/quirks.c
+++ b/arch/x86/platform/efi/quirks.c
@@ -273,7 +273,6 @@ void __init efi_reserve_boot_services(void)
 		u64 start = max(md->phys_addr, SZ_1M);
 		u64 end = md->phys_addr + (md->num_pages << EFI_PAGE_SHIFT);
 		u64 size = end - start;
-		bool already_reserved;
 
 		if (end <= start)
 			continue;
@@ -282,37 +281,11 @@ void __init efi_reserve_boot_services(void)
 		    md->type != EFI_BOOT_SERVICES_DATA)
 			continue;
 
-		already_reserved = memblock_is_region_reserved(start, size);
+		/* upgrade existing reservations to MEMBLOCK_RSRV_KERN */
+		if (memblock_is_region_reserved(start, size))
+			memblock_reserved_mark_kern(start, size);
 
-		/*
-		 * Because the following memblock_reserve() is paired
-		 * with free_reserved_area() for this region in
-		 * efi_free_boot_services(), we must be extremely
-		 * careful not to reserve, and subsequently free, critical
-		 * regions of memory that somebody else has already reserved.
-		 */
-		if (!already_reserved) {
-			memblock_reserve(start, size);
-
-			/*
-			 * If we are the first to reserve the region, no
-			 * one else cares about it. We own it and can
-			 * free it later.
-			 */
-			if (can_free_region(start, size))
-				continue;
-		}
-
-		/*
-		 * We don't own the region. We must not free it.
-		 *
-		 * Setting this bit for a boot services region really
-		 * doesn't make sense as far as the firmware is
-		 * concerned, but it does provide us with a way to tag
-		 * those regions that must not be paired with
-		 * memblock_phys_free().
-		 */
-		md->attribute |= EFI_MEMORY_RUNTIME;
+		memblock_reserve(start, size);
 	}
 }
 
@@ -443,25 +416,22 @@ void __init efi_unmap_boot_services(void)
 		 */
 		efi_unmap_pages(md);
 
-		if (!(md->attribute & EFI_MEMORY_RUNTIME)) {
-			/*
-			 * With CONFIG_DEFERRED_STRUCT_PAGE_INIT parts of the memory
-			 * map are still not initialized and we can't reliably free
-			 * memory here.
-			 * Queue the ranges to free at a later point.
-			 */
-			if (efi_add_range_to_free(start, start + size, &has_reservations)) {
-				pr_err("Failed to reallocate storage for freeable EFI regions\n");
-				clear_bit(EFI_MEMMAP, &efi.flags);
-				return;
-			}
-
-			/* Continue without advancing new_md so this region is omitted */
-			if (!has_reservations)
-				continue;
-
+		/*
+		 * With CONFIG_DEFERRED_STRUCT_PAGE_INIT parts of the memory
+		 * map are still not initialized and we can't reliably free
+		 * memory here.
+		 * Queue the ranges to free at a later point.
+		 */
+		if (efi_add_range_to_free(start, start + size, &has_reservations)) {
+			pr_err("Failed to reallocate storage for freeable EFI regions\n");
+			clear_bit(EFI_MEMMAP, &efi.flags);
+			return;
 		}
 
+		/* Continue without advancing new_md so this region is omitted */
+		if (!has_reservations)
+			continue;
+
 		/* Advance new_md so this region is preserved in the EFI memory map */
 		new_md += efi.memmap.desc_size;
 	}
-- 
2.54.0.rc2.544.gc7ae2d5bb8-goog



* [PATCH v3 17/17] x86/efi: Drop kexec quirk for the EFI memory attributes table
  2026-04-23 15:20 [PATCH v3 00/17] efi/x86: Avoid the need to mangle the EFI memory map Ard Biesheuvel
                   ` (15 preceding siblings ...)
  2026-04-23 15:20 ` [PATCH v3 16/17] x86/efi: Avoid EFI_MEMORY_RUNTIME for early EFI boot memory reservations Ard Biesheuvel
@ 2026-04-23 15:20 ` Ard Biesheuvel
  16 siblings, 0 replies; 18+ messages in thread
From: Ard Biesheuvel @ 2026-04-23 15:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-efi, x86, Ard Biesheuvel, Mike Rapoport (Microsoft),
	Benjamin Herrenschmidt, Dave Young, Gregory Price

From: Ard Biesheuvel <ardb@kernel.org>

Now that the EFI memory attributes table is preserved properly, and the
quirk to detect corrupted tables has been updated not to result in false
positives when the number of EFI memory map entries is low compared to
the number of EFI memory attributes table entries, there is no longer a
need to ignore the latter when doing a kexec boot. So drop the
workaround.

This reverts commit

  64b45dd46e15 ("x86/efi: skip memattr table on kexec boot")

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/platform/efi/quirks.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index fc6a15c2ace6..92d37f2a5cbe 100644
--- a/arch/x86/platform/efi/quirks.c
+++ b/arch/x86/platform/efi/quirks.c
@@ -522,10 +522,6 @@ int __init efi_reuse_config(u64 tables, int nr_tables)
 		if (!efi_guidcmp(guid, SMBIOS_TABLE_GUID))
 			((efi_config_table_64_t *)p)->table = data->smbios;
 
-		/* Do not bother to play with mem attr table across kexec */
-		if (!efi_guidcmp(guid, EFI_MEMORY_ATTRIBUTES_TABLE_GUID))
-			((efi_config_table_64_t *)p)->table = EFI_INVALID_TABLE_ADDR;
-
 		p += sz;
 	}
 	early_memunmap(tablep, nr_tables * sz);
-- 
2.54.0.rc2.544.gc7ae2d5bb8-goog



end of thread, other threads:[~2026-04-23 15:21 UTC | newest]

Thread overview: 18+ messages
2026-04-23 15:20 [PATCH v3 00/17] efi/x86: Avoid the need to mangle the EFI memory map Ard Biesheuvel
2026-04-23 15:20 ` [PATCH v3 01/17] x86/efi: Omit redundant kernel image overlap check Ard Biesheuvel
2026-04-23 15:20 ` [PATCH v3 02/17] x86/efi: Drop redundant EFI_PARAVIRT check Ard Biesheuvel
2026-04-23 15:20 ` [PATCH v3 03/17] x86/efi: Only merge EFI memory map entries on 32-bit systems Ard Biesheuvel
2026-04-23 15:20 ` [PATCH v3 04/17] x86/efi: Defer sub-1M check from unmap to free stage Ard Biesheuvel
2026-04-23 15:20 ` [PATCH v3 05/17] x86/efi: Simplify real mode trampoline allocation quirk Ard Biesheuvel
2026-04-23 15:20 ` [PATCH v3 06/17] x86/efi: Unmap kernel-reserved boot regions from EFI page tables Ard Biesheuvel
2026-04-23 15:20 ` [PATCH v3 07/17] x86/efi: Drop EFI_MEMORY_RUNTIME check from __ioremap_check_other() Ard Biesheuvel
2026-04-23 15:20 ` [PATCH v3 08/17] x86/efi: Allow ranges_to_free array to grow beyond initial size Ard Biesheuvel
2026-04-23 15:20 ` [PATCH v3 09/17] x86/efi: Intersect ranges_to_free with MEMBLOCK_RSRV_KERN regions Ard Biesheuvel
2026-04-23 15:20 ` [PATCH v3 10/17] x86/efi: Do not rely on EFI_MEMORY_RUNTIME bit and avoid entry splitting Ard Biesheuvel
2026-04-23 15:20 ` [PATCH v3 11/17] efi: Use nr_map not map_end to find the last valid memory map entry Ard Biesheuvel
2026-04-23 15:20 ` [PATCH v3 12/17] x86/efi: Clean the memory map using iterator and filter API Ard Biesheuvel
2026-04-23 15:20 ` [PATCH v3 13/17] x86/efi: Update the runtime map in place Ard Biesheuvel
2026-04-23 15:20 ` [PATCH v3 14/17] x86/efi: Reuse memory map instead of reallocating it Ard Biesheuvel
2026-04-23 15:20 ` [PATCH v3 15/17] x86/efi: Merge two traversals of the memory map when freeing boot regions Ard Biesheuvel
2026-04-23 15:20 ` [PATCH v3 16/17] x86/efi: Avoid EFI_MEMORY_RUNTIME for early EFI boot memory reservations Ard Biesheuvel
2026-04-23 15:20 ` [PATCH v3 17/17] x86/efi: Drop kexec quirk for the EFI memory attributes table Ard Biesheuvel
