All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: linux-kernel@vger.kernel.org
Cc: Ingo Molnar <mingo@kernel.org>, Andy Shevchenko <andy@kernel.org>,
	Arnd Bergmann <arnd@kernel.org>, Borislav Petkov <bp@alien8.de>,
	Juergen Gross <jgross@suse.com>,
	"H . Peter Anvin" <hpa@zytor.com>,
	Kees Cook <keescook@chromium.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Mike Rapoport <rppt@kernel.org>,
	Paul Menzel <pmenzel@molgen.mpg.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	David Woodhouse <dwmw@amazon.co.uk>
Subject: [PATCH 12/32] x86/boot/e820: Clean up confusing and self-contradictory verbiage around E820 related resource allocations
Date: Thu, 15 May 2025 14:05:28 +0200	[thread overview]
Message-ID: <20250515120549.2820541-13-mingo@kernel.org> (raw)
In-Reply-To: <20250515120549.2820541-1-mingo@kernel.org>

So the E820 code has a rather confusing area of code at around
e820__reserve_resources(), which is, by its plain reading,
rather self-contradictory. For example, the comment explaining
e820__reserve_resources() claims:

 - '* Mark E820 reserved areas as busy for the resource manager'

By 'E820 reserved areas' one can naively conclude that it's
talking about E820_TYPE_RESERVED areas - while those areas
are treated in exactly the opposite fashion by do_mark_busy():

        switch (type) {
        case E820_TYPE_RESERVED:
        case E820_TYPE_SOFT_RESERVED:
        case E820_TYPE_PRAM:
        case E820_TYPE_PMEM:
                return false;

Ie. E820_TYPE_RESERVED areas are *not* marked busy for the
resource manager, because E820_TYPE_RESERVED areas are
device regions that might eventually be claimed by a device driver.

This type of confusion permeates this whole area of code,
making it exceedingly difficult to read (for me at least).

So untangle it bit by bit:

 - Instead of talking about ambiguous 'reserved areas',
   talk about 'E820 device address regions' instead,
   and 'register'/'lock' them.

 - The do_mark_busy() function is a misnomer as well, because despite
   its name it 'does' nothing - it only determines what type
   of resource handling an E820 type should receive from the
   kernel. Rename it to e820_device_region() and negate its
   meaning, to avoid the 'busy/reserved' confusion. Because
   that's what this code is really about: filtering out
   device regions such as E820_TYPE_RESERVED, E820_TYPE_PRAM,
   E820_TYPE_PMEM, etc., and allowing them to be claimed
   by device drivers later on.

 - All other E820 regions (system regions) are registered and
   locked early on, before the PCI resource manager does its
   search for device BAR addresses, etc.

Also fix this somewhat misleading comment:

	/*
	 * Try to bump up RAM regions to reasonable boundaries, to
	 * avoid stolen RAM:
	 */

and explain that here we register artificial 'gap' resources
at the end of suspiciously sized RAM regions, as heuristics
to try to avoid buggy firmware with undeclared 'stolen RAM' regions:

	/*
	 * Create additional 'gaps' at the end of RAM regions,
	 * rounding them up to 64k/1MB/64MB boundaries, should
	 * they be weirdly sized, and register extra, locked
	 * resource regions for them, to make sure drivers
	 * won't claim those addresses.
	 *
	 * These are basically blind guesses and heuristics to
	 * avoid resource conflicts with broken firmware that
	 * doesn't properly list 'stolen RAM' as a system region
	 * in the E820 map.
	 */

Also improve the printout of this extra resource a bit: make the
message more unambiguous, and upgrade it from pr_debug() (where
very few people will see it), to pr_info() (where it will make
it into the syslog on default distro configs).

Also fix spelling and improve comment placement.

No change in functionality intended.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
 arch/x86/kernel/e820.c | 55 +++++++++++++++++++++++++++++++++-----------------
 1 file changed, 37 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 5eb0849b492f..c9bb808c4888 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -1106,37 +1106,44 @@ static unsigned long __init e820_type_to_iores_desc(struct e820_entry *entry)
 	}
 }
 
-static bool __init do_mark_busy(enum e820_type type, struct resource *res)
+/*
+ * We assign one resource entry for each E820 map entry:
+ */
+static struct resource __initdata *e820_res;
+
+/*
+ * Is this a device address region that should not be marked busy?
+ * (Versus system address regions that we register & lock early.)
+ */
+static bool __init e820_device_region(enum e820_type type, struct resource *res)
 {
-	/* this is the legacy bios/dos rom-shadow + mmio region */
+	/* This is the legacy BIOS/DOS ROM-shadow + MMIO region: */
 	if (res->start < (1ULL<<20))
-		return true;
+		return false;
 
 	/*
 	 * Treat persistent memory and other special memory ranges like
-	 * device memory, i.e. reserve it for exclusive use of a driver
+	 * device memory, i.e. keep it available for exclusive use of a
+	 * driver:
 	 */
 	switch (type) {
 	case E820_TYPE_RESERVED:
 	case E820_TYPE_SOFT_RESERVED:
 	case E820_TYPE_PRAM:
 	case E820_TYPE_PMEM:
-		return false;
+		return true;
 	case E820_TYPE_RAM:
 	case E820_TYPE_ACPI:
 	case E820_TYPE_NVS:
 	case E820_TYPE_UNUSABLE:
 	default:
-		return true;
+		return false;
 	}
 }
 
 /*
- * Mark E820 reserved areas as busy for the resource manager:
+ * Mark E820 system regions as busy for the resource manager:
  */
-
-static struct resource __initdata *e820_res;
-
 void __init e820__reserve_resources(void)
 {
 	int i;
@@ -1162,18 +1169,18 @@ void __init e820__reserve_resources(void)
 		res->desc  = e820_type_to_iores_desc(entry);
 
 		/*
-		 * Don't register the region that could be conflicted with
-		 * PCI device BAR resources and insert them later in
-		 * pcibios_resource_survey():
+		 * Skip and don't register device regions that could be conflicted
+		 * with PCI device BAR resources. They get inserted later in
+		 * pcibios_resource_survey() -> e820__reserve_resources_late():
 		 */
-		if (do_mark_busy(entry->type, res)) {
+		if (!e820_device_region(entry->type, res)) {
 			res->flags |= IORESOURCE_BUSY;
 			insert_resource(&iomem_resource, res);
 		}
 		res++;
 	}
 
-	/* Expose the kexec e820 table to the sysfs. */
+	/* Expose the kexec e820 table to sysfs: */
 	for (i = 0; i < e820_table_kexec->nr_entries; i++) {
 		struct e820_entry *entry = e820_table_kexec->entries + i;
 
@@ -1207,6 +1214,10 @@ void __init e820__reserve_resources_late(void)
 	int i;
 	struct resource *res;
 
+	/*
+	 * Register device address regions listed in the E820 map,
+	 * these can be claimed by device drivers later on:
+	 */
 	res = e820_res;
 	for (i = 0; i < e820_table->nr_entries; i++) {
 		if (!res->parent && res->end)
@@ -1215,8 +1226,16 @@ void __init e820__reserve_resources_late(void)
 	}
 
 	/*
-	 * Try to bump up RAM regions to reasonable boundaries, to
-	 * avoid stolen RAM:
+	 * Create additional 'gaps' at the end of RAM regions,
+	 * rounding them up to 64k/1MB/64MB boundaries, should
+	 * they be weirdly sized, and register extra, locked
+	 * resource regions for them, to make sure drivers
+	 * won't claim those addresses.
+	 *
+	 * These are basically blind guesses and heuristics to
+	 * avoid resource conflicts with broken firmware that
+	 * doesn't properly list 'stolen RAM' as a system region
+	 * in the E820 map.
 	 */
 	for (i = 0; i < e820_table->nr_entries; i++) {
 		struct e820_entry *entry = &e820_table->entries[i];
@@ -1232,7 +1251,7 @@ void __init e820__reserve_resources_late(void)
 		if (start >= end)
 			continue;
 
-		printk(KERN_DEBUG "e820: reserve RAM buffer [mem %#010llx-%#010llx]\n", start, end);
+		pr_info("e820: register RAM buffer resource [mem %#010llx-%#010llx]\n", start, end);
 		reserve_region_with_split(&iomem_resource, start, end, "RAM buffer");
 	}
 }
-- 
2.45.2


  parent reply	other threads:[~2025-05-15 12:06 UTC|newest]

Thread overview: 77+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-15 12:05 [PATCH -v2 00/32] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
2025-05-15 12:05 ` [PATCH 01/32] x86/boot/e820: Remove inverted boolean logic from the e820_nomerge() function name, rename it to e820_type_mergeable() Ingo Molnar
2025-06-02 11:21   ` Nikolay Borisov
2025-12-14  8:37   ` [tip: x86/boot] " tip-bot2 for Ingo Molnar
2025-05-15 12:05 ` [PATCH 02/32] x86/boot/e820: Simplify e820__print_table() a bit Ingo Molnar
2025-12-14  8:37   ` [tip: x86/boot] " tip-bot2 for Ingo Molnar
2025-05-15 12:05 ` [PATCH 03/32] x86/boot/e820: Simplify the PPro Erratum #50 workaround Ingo Molnar
2025-12-14  8:37   ` [tip: x86/boot] " tip-bot2 for Ingo Molnar
2025-05-15 12:05 ` [PATCH 04/32] x86/boot/e820: Mark e820__print_table() static Ingo Molnar
2025-12-14  8:37   ` [tip: x86/boot] " tip-bot2 for Ingo Molnar
2025-05-15 12:05 ` [PATCH 05/32] x86/boot/e820: Print gaps in the E820 table Ingo Molnar
2025-12-14  8:37   ` [tip: x86/boot] " tip-bot2 for Ingo Molnar
2025-05-15 12:05 ` [PATCH 06/32] x86/boot/e820: Make the field separator space character part of e820_print_type() Ingo Molnar
2025-12-14  8:37   ` [tip: x86/boot] " tip-bot2 for Ingo Molnar
2025-05-15 12:05 ` [PATCH 07/32] x86/boot/e820: Print out sizes of E820 memory ranges Ingo Molnar
2025-05-19 12:26   ` Andy Shevchenko
2025-05-31 18:21     ` Ingo Molnar
2025-05-15 12:05 ` [PATCH 08/32] x86/boot/e820: Print E820_TYPE_RAM entries as ... RAM entries Ingo Molnar
2025-12-14  8:37   ` [tip: x86/boot] " tip-bot2 for Ingo Molnar
2025-05-15 12:05 ` [PATCH 09/32] x86/boot/e820: Call the PCI gap a 'gap' in the boot log printout Ingo Molnar
2025-05-19 12:27   ` Andy Shevchenko
2025-05-31 18:09     ` Ingo Molnar
2025-12-14  8:37   ` [tip: x86/boot] " tip-bot2 for Ingo Molnar
2025-05-15 12:05 ` [PATCH 10/32] x86/boot/e820: Use 'u64' consistently instead of 'unsigned long long' Ingo Molnar
2025-12-14  8:37   ` [tip: x86/boot] " tip-bot2 for Ingo Molnar
2025-05-15 12:05 ` [PATCH 11/32] x86/boot/e820: Remove pointless early_panic() indirection Ingo Molnar
2025-12-14  8:37   ` [tip: x86/boot] " tip-bot2 for Ingo Molnar
2025-05-15 12:05 ` Ingo Molnar [this message]
2025-06-02 11:45   ` [PATCH 12/32] x86/boot/e820: Clean up confusing and self-contradictory verbiage around E820 related resource allocations Nikolay Borisov
2025-12-14  8:26     ` [PATCH 33/32] x86/boot/e820: Use <linux/sizes.h> symbols for literals Ingo Molnar
2025-12-14  8:37     ` [tip: x86/boot] " tip-bot2 for Ingo Molnar
2025-12-14  8:37   ` [tip: x86/boot] x86/boot/e820: Clean up confusing and self-contradictory verbiage around E820 related resource allocations tip-bot2 for Ingo Molnar
2025-05-15 12:05 ` [PATCH 13/32] x86/boot/e820: Improve e820_print_type() messages Ingo Molnar
2025-12-14  8:37   ` [tip: x86/boot] " tip-bot2 for Ingo Molnar
2025-05-15 12:05 ` [PATCH 14/32] x86/boot/e820: Clean up __e820__range_add() a bit Ingo Molnar
2025-12-14  8:37   ` [tip: x86/boot] " tip-bot2 for Ingo Molnar
2025-05-15 12:05 ` [PATCH 15/32] x86/boot/e820: Clean up __refdata use " Ingo Molnar
2025-12-14  8:37   ` [tip: x86/boot] " tip-bot2 for Ingo Molnar
2025-05-15 12:05 ` [PATCH 16/32] x86/boot/e820: Remove unnecessary header inclusions Ingo Molnar
2025-12-14  8:37   ` [tip: x86/boot] " tip-bot2 for Ingo Molnar
2025-05-15 12:05 ` [PATCH 17/32] x86/boot/e820: Standardize e820 table index variable names under 'idx' Ingo Molnar
2025-06-02 12:37   ` Nikolay Borisov
2025-12-14  8:37   ` [tip: x86/boot] " tip-bot2 for Ingo Molnar
2025-05-15 12:05 ` [PATCH 18/32] x86/boot/e820: Standardize e820 table index variable types under 'u32' Ingo Molnar
2025-12-14  8:37   ` [tip: x86/boot] " tip-bot2 for Ingo Molnar
2025-05-15 12:05 ` [PATCH 19/32] x86/boot/e820: Change struct e820_table::nr_entries type from __u32 to u32 Ingo Molnar
2025-12-14  8:37   ` [tip: x86/boot] " tip-bot2 for Ingo Molnar
2025-05-15 12:05 ` [PATCH 20/32] x86/boot/e820: Clean up e820__setup_pci_gap()/e820_search_gap() a bit Ingo Molnar
2025-06-02 12:41   ` Nikolay Borisov
2025-12-14  8:37   ` [tip: x86/boot] " tip-bot2 for Ingo Molnar
2025-05-15 12:05 ` [PATCH 21/32] x86/boot/e820: Change e820_search_gap() to search for the highest-address PCI gap Ingo Molnar
2025-12-14  8:37   ` [tip: x86/boot] " tip-bot2 for Ingo Molnar
2025-05-15 12:05 ` [PATCH 22/32] x86/boot/e820: Rename gap_start/gap_size to max_gap_start/max_gap_start in e820_search_gap() et al Ingo Molnar
2025-12-14  8:37   ` [tip: x86/boot] " tip-bot2 for Ingo Molnar
2025-05-15 12:05 ` [PATCH 23/32] x86/boot/e820: Simplify & clarify __e820__range_add() a bit Ingo Molnar
2025-12-14  8:37   ` [tip: x86/boot] " tip-bot2 for Ingo Molnar
2025-05-15 12:05 ` [PATCH 24/32] x86/boot/e820: Standardize __init/__initdata tag placement Ingo Molnar
2025-12-14  8:37   ` [tip: x86/boot] " tip-bot2 for Ingo Molnar
2025-05-15 12:05 ` [PATCH 25/32] x86/boot/e820: Simplify append_e820_table() and remove restriction on single-entry tables Ingo Molnar
2025-06-02 12:50   ` Nikolay Borisov
2025-06-02 12:57     ` Andy Shevchenko
2025-12-14  8:37   ` [tip: x86/boot] " tip-bot2 for Ingo Molnar
2025-05-15 12:05 ` [PATCH 26/32] x86/boot/e820: Remove e820__range_remove()'s unused return parameter Ingo Molnar
2025-12-14  8:37   ` [tip: x86/boot] " tip-bot2 for Ingo Molnar
2025-05-15 12:05 ` [PATCH 27/32] x86/boot/e820: Simplify the e820__range_remove() API Ingo Molnar
2025-12-14  8:37   ` [tip: x86/boot] " tip-bot2 for Ingo Molnar
2025-05-15 12:05 ` [PATCH 28/32] x86/boot/e820: Make sure e820_search_gap() finds all gaps Ingo Molnar
2025-06-02 14:49   ` Nikolay Borisov
2025-12-14  8:37   ` [tip: x86/boot] " tip-bot2 for Ingo Molnar
2025-05-15 12:05 ` [PATCH 29/32] x86/boot/e820: Introduce E820_TYPE_13 and treat it as a device region Ingo Molnar
2025-05-15 12:05 ` [PATCH 30/32] x86/boot/e820: Change e820_type_to_string() to take a 'type' parameter Ingo Molnar
2025-05-15 12:05 ` [PATCH 31/32] x86/boot/e820: Unify e820_print_type() and e820_type_to_string() Ingo Molnar
2025-05-15 12:05 ` [PATCH 32/32] x86/boot/e820: Move index increments outside accessors in e820__update_table() Ingo Molnar
2025-05-16 18:17   ` H. Peter Anvin
2025-05-17 13:13     ` Ingo Molnar
2025-06-02 14:57   ` Nikolay Borisov
2025-12-14  8:26     ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250515120549.2820541-13-mingo@kernel.org \
    --to=mingo@kernel.org \
    --cc=andy@kernel.org \
    --cc=arnd@kernel.org \
    --cc=bp@alien8.de \
    --cc=dwmw@amazon.co.uk \
    --cc=hpa@zytor.com \
    --cc=jgross@suse.com \
    --cc=keescook@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=pmenzel@molgen.mpg.de \
    --cc=rppt@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.